9.3 Multitree Model Browsing - Tree View


[Picture]
Figure 9.3: Once the random tree generation has completed, this Multitree Model window opens to allow further analysis of the trees.

The Multitree Model dialog has options that prepare data for tree displays. The top of the dialog window has two viewers: the Tree List tab, which is shown when the dialog opens, and the Variable List tab. The bottom of the window has a tree viewer.

9.3.1 Multitree Model – Tree List

The Tree List shows the parameters for each of the random trees generated. As you click on each row, that generated tree appears in the tree view at the bottom of the window.

Here is what each Tree List column of data means:


Column Header | What Value Means
Tree # | Runs from 1 to the number of trees in the file.
Orig. RMS Error | The root mean squared prediction error on the set of observations that were used to build the trees.
# Leaves | The number of leaves at the bottom of the tree, which gives an indication of tree complexity. If two trees have nearly equal RMS error, one might apply Occam’s razor and choose the simpler tree as a model over the more complex tree.
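
The Occam's razor rule described for the # Leaves column can be sketched in a few lines of Python. This is only an illustration, not something the program does for you: the TreeRow type, the pick_simplest_tree function, the 5% tolerance used to decide what "nearly equal" means, and the sample values are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class TreeRow:
    """One row of the Tree List (hypothetical representation)."""
    tree_number: int
    rms_error: float   # Orig. RMS Error column
    leaves: int        # # Leaves column


def pick_simplest_tree(rows, tolerance=0.05):
    """Return the tree with the fewest leaves among those whose RMS error
    is within `tolerance` (relative) of the lowest RMS error.

    The tolerance is an assumed definition of "nearly equal".
    """
    best = min(r.rms_error for r in rows)
    candidates = [r for r in rows if r.rms_error <= best * (1 + tolerance)]
    # Fewest leaves wins; ties are broken by the lower RMS error.
    return min(candidates, key=lambda r: (r.leaves, r.rms_error))


# Illustrative values only, not taken from a real run.
rows = [
    TreeRow(1, 2.10, 14),
    TreeRow(2, 2.05, 23),
    TreeRow(3, 2.12, 9),
    TreeRow(4, 2.90, 5),
]

chosen = pick_simplest_tree(rows)
print(chosen.tree_number)
```

Tree 4 has the fewest leaves overall, but its error is well outside the tolerance, so it is never considered; the choice is made only among trees 1, 2 and 3.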

This dialog also has several icons that call up a tree viewer for analyzing the tree and several variations of the distance matrix. This feature is detailed further in Section 12, Viewing Observation Distance Matrix.

9.3.2 Multitree Model – Variable List

The Variable List shows the variables in the data set. The Variables menu options are available from this tab view.

The four columns are:


Column Name | Describes
Variable Name | The name of the variable.
Column | What column it is in the original data.
# times used | The number of times the variable is used in the random trees.
Data Type | The data type of the variable.

9.3.2.1 Sorting

You can manipulate the order of the variables by clicking on any column header button to sort in ascending or descending order.

9.3.2.2 Subset

Click one variable and then Shift+Click another to pick a contiguous range of variables. This subset of variables will appear in the Correlation/Interaction View, which is covered in Chapter 13.

9.3.2.3 Variables->View Variable Usage

The menu selection Variables->View Variable Usage brings up a Variable Usage Data Set spreadsheet of the data with columns showing the Tree, Variable, rawP, adjP and Depth.
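
If each row of the Variable Usage Data Set records one use of a variable in one tree (an assumption; the manual does not spell out the row layout), then tallying the Variable column gives a count comparable to the # times used column of the Variable List. The sketch below uses the column names listed above; the variable names and values are made up for illustration.

```python
from collections import Counter

# Illustrative rows only; the real data set comes from the
# Variables->View Variable Usage spreadsheet.
usage_rows = [
    {"Tree": 1, "Variable": "age",    "rawP": 0.001, "adjP": 0.010, "Depth": 1},
    {"Tree": 1, "Variable": "income", "rawP": 0.004, "adjP": 0.030, "Depth": 2},
    {"Tree": 2, "Variable": "age",    "rawP": 0.002, "adjP": 0.015, "Depth": 1},
    {"Tree": 3, "Variable": "height", "rawP": 0.010, "adjP": 0.048, "Depth": 1},
    {"Tree": 3, "Variable": "age",    "rawP": 0.003, "adjP": 0.020, "Depth": 2},
]


def times_used(rows):
    """Count how many rows mention each variable (assumes one row per use)."""
    return Counter(row["Variable"] for row in rows)


counts = times_used(usage_rows)
for name, n in counts.most_common():
    print(name, n)
```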

9.3.2.4 Variables->View Variable Frequency

The menu selection Variables->View Variable Frequency brings up the data shown in the Variable List window in a spreadsheet form.

9.3.2.5 Viewing Variable Correlations

After creating a data set or subset by Shift+Click selecting (highlighting) a contiguous range of variables, or by holding Ctrl and clicking on several non-contiguous variables, you can view the interactions of the variables as a table (Viewing Variable Correlation Table) or as a plotted matrix (View Correlation Interaction Plot). This matrix is discussed in detail in Chapter 13, The Correlation Interaction View.

9.3.3 File->Analyze Current Tree Tools

From the Tree List view, click on the tree you want to analyze. Once a tree has been selected, the File->Analyze Current Tree menu item becomes active. Clicking on the menu item brings up a tree view that can then be analyzed with the Tree->Search Tree submenu items shown in Fig. 7.53.

The Search Tree menu items are covered in detail in Chapter 7. These tools work similarly with the multitree views.

9.3.4 Predictions (InSample)->View Average Tree In-Sample Predictions

This menu item brings up a spreadsheet view of the observations of the various trees showing the Actual and Predicted data.

9.3.5 Predictions (InSample)->Save All Tree In-Sample Predictions to CSV File


[Picture]
Figure 9.4: Excel view of the exported CSV file

The menu Predictions (InSample)->Save All Tree In-Sample Predictions to CSV File saves the predictions to an external comma-separated (.csv) file. The observations for which the predictions are generated are the ones that you selected in the spreadsheet before doing interactive tree analysis. Every observation is dropped down the trees created in the tree model. If the trees were created with a different set of observations than the ones under consideration, you can get a prediction for every (possibly unknown) observation.

The file, shown above in a spreadsheet view, includes the tree number, the actual value, the predicted value, the residual, and the name of the node where the observation fell in that tree.
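
Once exported, the file can be post-processed outside the program. As a minimal sketch, the Python below reads such a file and computes a root mean squared error per tree from the residual column. The header names (Tree, Actual, Predicted, Residual, Node) are assumptions, since the exact headers in the exported file are not listed here, and the sample rows are made up; adjust both to match your file.

```python
import csv
import io
import math
from collections import defaultdict

# Made-up sample standing in for the exported file. Header names are assumed.
SAMPLE = """\
Tree,Actual,Predicted,Residual,Node
1,10.0,9.0,1.0,N3
1,12.0,13.0,-1.0,N4
2,10.0,8.0,2.0,N2
2,12.0,12.0,0.0,N5
"""


def rms_by_tree(csv_file):
    """Return {tree number: RMS of the residuals for that tree}."""
    squared_sums = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        tree = int(row["Tree"])
        residual = float(row["Residual"])
        squared_sums[tree] += residual * residual
        counts[tree] += 1
    return {t: math.sqrt(squared_sums[t] / counts[t]) for t in squared_sums}


result = rms_by_tree(io.StringIO(SAMPLE))
for tree, rms in sorted(result.items()):
    print(tree, round(rms, 4))
```

To run this against a real export, replace io.StringIO(SAMPLE) with an opened file, for example open(path, newline="").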

9.3.6 Save “C” Prediction Program

This outputs the tree prediction structure as standard C/C++ language source code. The program includes functions that model the prediction behavior of the tree or trees, a structure to contain data in a format the prediction functions can use, and a main() function with an example of how to call the prediction functions and handle the results.

The multitree output is similar to the unitree version available from the File->Output C File menu item in the tree window. Each tree is output as its own prediction function, and the main() function contains an example of how to call the multiple functions and calculate an average prediction result.
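
The overall shape of such a program (one prediction function per tree, plus a main() that averages their results) can be sketched as follows. This sketch is written in Python for brevity and is not the generated output, which is C/C++. The function names, the split variables x1 and x2, the thresholds, and the leaf values are all hypothetical.

```python
def predict_tree_1(obs):
    """Hypothetical tree 1: a single split on x1."""
    return 4.0 if obs["x1"] < 2.5 else 7.0


def predict_tree_2(obs):
    """Hypothetical tree 2: a split on x2, then a split on x1."""
    if obs["x2"] < 10.0:
        return 5.0
    return 6.0 if obs["x1"] < 4.0 else 9.0


# One entry per tree in the model.
TREES = (predict_tree_1, predict_tree_2)


def predict_average(obs):
    """Drop the observation down every tree and average the predictions."""
    predictions = [tree(obs) for tree in TREES]
    return sum(predictions) / len(predictions)


def main():
    # The observation is passed in a structure the prediction functions understand.
    obs = {"x1": 3.0, "x2": 12.0}
    print(predict_average(obs))


if __name__ == "__main__":
    main()
```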

9.3.7 Close

Closes the Multitree Model dialog window.

9.3.8 Help

Opens a simple help screen on this dialog.