In this assignment, you will:
- Perform a sensitivity analysis on a decision tree model.
- Define and calculate success/non-success in probabilistic terms.
- Use those probabilities to estimate future success/non-success in a Markov model.
- Put that new knowledge to work in a decision analysis model.
- Use real data to inform decision analysis.
Using SensIt and your completed decision tree from Problem Set 6 (provided as Problem Set 7 Tree.xls in the DAT 520 Data Files folder in Course Information), perform a two-way sensitivity analysis on the Research/No Research arms by varying the marginal probability p(YV) (a.k.a. Viable), which is in cell D9.
Follow the steps below to complete the assignment. The version of Excel you are using in this course has the TreePlan Add In installed.
Use Excel to Perform Sensitivity Analysis:
- Open Problem Set 7 Tree.
- On the Add In tab, select Sensitivity Analysis > one input, one output from the Tools menu.
- For Input Variable Value, use cell D9 (which is p(YV)) and give it a label like Research.
- For Output Variable Value, use cell J11 (the EV for No Research) and give it a label like No Research.
- Start at 0.01, step by 0.01 and end at 1 (this will give you 1% to 100%).
- Click OK. This generates another tab with some information at the top, two columns of values and a chart that you will not use.
- Go back to the tree and do it all again, except do it for the Research arm of the tree.
- When you click OK, you will get another tab with the column of results for Research.
- Copy the Research column into the tab with the No Research column (or vice versa; it does not matter).
- Now you will have three columns: 1% to 100%, Research and No Research. Select the headers and values in the Research and No Research columns and make a line graph of that data.
- Now you should see two crossing lines, like the graphic at the beginning of the assignment. The example lines are red and blue. Enlarge the graph. If all went well, do you see the region where the (red) research line is above the no research line? This is the zone where the EV of research is better than the EV of not having Dustin do his research, based on varying the values for the marginal probability p(YV). You can SEE when one situation is better or worse, depending on p(YV)!
- Recall the value for p(YV) used in Problem Set 6. How was this value calculated? It was your X%, which we defined as the answer to “How many properties are worth at least $150,000 and have a market index of at least 1.1 in year 1 and then, of those, how many go on to be worth more than $200,000 with a market index of at least 1.2 in year 5?”
- With all of that in mind, look at your graph. At what value (%) does doing the research first become better than not doing the research? Where does it again become worse than not doing the research?
- How do you explain the usefulness of the information in this sensitivity analysis?
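The sweep described in the steps above can also be sketched outside Excel. The Python below is only an illustration under stated assumptions: the payoffs, research cost, and research reliability are hypothetical placeholders, not the values in Problem Set 7 Tree.xls, and the function names are invented for this sketch. It varies p(YV) from 0.01 to 1.00 in steps of 0.01, computes the EV of each arm, and reports the range where Research beats No Research.

```python
# All numbers below are HYPOTHETICAL; they are not taken from the assignment's tree.
PAYOFF_VIABLE = 100.0       # payoff if we invest and the property is viable
PAYOFF_NOT_VIABLE = -100.0  # payoff if we invest and it is not viable
RESEARCH_COST = 4.5         # assumed cost of doing the research first
SENSITIVITY = 0.9           # assumed P(research says viable | viable)
SPECIFICITY = 0.9           # assumed P(research says not viable | not viable)


def ev_best_action(p_viable):
    """EV of the better of two actions: invest, or walk away for 0."""
    invest = p_viable * PAYOFF_VIABLE + (1 - p_viable) * PAYOFF_NOT_VIABLE
    return max(invest, 0.0)


def ev_no_research(p):
    """No Research arm: decide using the marginal probability p(YV) alone."""
    return ev_best_action(p)


def ev_research(p):
    """Research arm: pay the cost, update p(YV) with Bayes' rule, then decide."""
    p_pos = p * SENSITIVITY + (1 - p) * (1 - SPECIFICITY)
    p_neg = 1 - p_pos
    post_pos = p * SENSITIVITY / p_pos        # P(viable | positive result)
    post_neg = p * (1 - SENSITIVITY) / p_neg  # P(viable | negative result)
    return (p_pos * ev_best_action(post_pos)
            + p_neg * ev_best_action(post_neg)
            - RESEARCH_COST)


# Same grid as the SensIt settings: start 0.01, step 0.01, end 1.
grid = [i / 100 for i in range(1, 101)]
rows = [(p, ev_research(p), ev_no_research(p)) for p in grid]

# Values of p(YV) where the Research line sits above the No Research line.
research_better = [p for p, ev_r, ev_n in rows if ev_r > ev_n]
print("Research is better from", research_better[0], "to", research_better[-1])
```

With your own tree, the crossing points will differ; the point of the sketch is that the Research arm depends on p(YV) through the updated probabilities, which is why the two lines can cross more than once.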
Use R and Rattle to Perform Bottom-up Tree Model Diagnostics:
- Using Rattle in R, load the Mopps_with_commas.csv file and make sure the partition box is checked before you execute.
- Once it is loaded, in the “data” tab, set tot_success as the target and ignore everything except:
- Execute a tree model with the defaults. To know that you loaded everything right, you should see a Root node error of 74/140 = 0.52857.
- Using the complexity parameter and the other tuning parameters you learned about in section 11.5 of Data Mining with Rattle and R, explain which combination of settings gives you the lowest overall cross-validation error. This value is the last entry in the “xerror” column.
- Using the Draw button, copy your ideal Rattle decision tree from the Mopps data set into your assignment document, and interpret what it means.
- If your tree looks like the image below, it may need to be simplified! What would you suggest doing to help this tree?
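When reading the complexity table, it can help to work through the arithmetic by hand. The sketch below is in Python rather than R, and the table rows are hypothetical placeholders, not output from the Mopps data; only the root node error of 74/140 comes from the assignment. It assumes the usual rpart convention that "xerror" is scaled relative to the root node error, and it applies the common one-standard-error rule of thumb as one possible way to pick a simpler tree.

```python
ROOT_NODE_ERROR = 74 / 140  # from the assignment: about 0.52857

# HYPOTHETICAL complexity table rows: (CP, nsplit, rel_error, xerror, xstd),
# listed from the smallest tree to the largest.
cp_table = [
    (0.40, 0, 1.00, 1.00, 0.080),
    (0.10, 1, 0.60, 0.68, 0.075),
    (0.03, 2, 0.50, 0.62, 0.072),
    (0.01, 5, 0.41, 0.66, 0.073),
]

# Row with the lowest cross-validation error.
best = min(cp_table, key=lambda row: row[3])

# One-standard-error rule: the smallest tree whose xerror is within
# one xstd of the lowest xerror.
threshold = best[3] + best[4]
one_se = next(row for row in cp_table if row[3] <= threshold)

# Convert the relative xerror into an absolute misclassification rate.
abs_cv_error = best[3] * ROOT_NODE_ERROR

print("Lowest xerror at nsplit =", best[1])
print("One-SE choice at nsplit =", one_se[1])
print("Absolute CV error: %.4f" % abs_cv_error)
```

In this made-up table the tree with one split is within one standard error of the best tree, which is the kind of evidence you could cite when arguing that a busy tree should be pruned.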