7.4 Manually Splitting Nodes
Manual splitting lets you control which variables to split on and adjust the split point.
Once the best split for each independent variable is computed, the independent variables are displayed in the dialog window shown in Fig. 7.23, ranked by their associated p-values. Clicking a row in the list redraws the tree in the tree view to show the split denoted by that row.
Here is what the column headers mean.
| Column Header | Meaning |
|---|---|
| Var # | This is the column number of the predictor within the spreadsheet. |
| P | This is the raw P-value computed through whatever statistic is appropriate for the split or regression. |
| aP | Adjusted P-value. A multiplicity adjustment applies for continuous and categorical splits where multiple possible cut points or categories are searched through for an optimal split. |
| FDR (aP) | The False Discovery Rate, based on the adjusted P-values. This is the FDR for ALL splits whose adjusted P-value is less than or equal to the adjusted P-value of this row’s split. See 17.15. |
| bP | Bonferroni-adjusted P-value. P-values are Bonferroni-adjusted by the number of descriptors scanned through and that actually could be split. The number of descriptors includes regression splitters in addition to the variable upon which they are based. A descriptor may not be a valid splitter because all values are identical, and hence will not contribute toward the Bonferroni correction. |
| Splitter | The name of the variable that splits the data. |
| Split Type | The type of split/regression that the variable performs. This can be one of binary, monotone, categorical or regression. |
| Split Rule | The split rule for the given predictor, in short text form. |
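The relationship between the P-value columns can be sketched in a few lines of Python. This is an illustrative sketch, not SVS code: the function names are hypothetical, the Bonferroni step simply multiplies each p-value by the number of valid splitters, and the FDR step assumes the standard Benjamini-Hochberg form. The method SVS actually uses for the FDR column is the one described in 17.15.

```python
def bonferroni(p_values, n_valid_splitters):
    """Bonferroni-adjust p-values by the number of descriptors that could be split.

    Descriptors whose values are all identical are assumed to be excluded
    from n_valid_splitters before this function is called.
    """
    return [min(1.0, p * n_valid_splitters) for p in p_values]


def fdr_by_row(adjusted_p_values):
    """Estimate the FDR for each row (assumed Benjamini-Hochberg form).

    For a row with rank r among m adjusted p-values, the estimate is
    m * p / r, made monotone so that a row never gets a larger FDR than
    a row with a larger adjusted p-value.
    """
    m = len(adjusted_p_values)
    order = sorted(range(m), key=lambda i: adjusted_p_values[i])
    fdr = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, adjusted_p_values[i] * m / rank)
        fdr[i] = running_min
    return fdr


if __name__ == "__main__":
    ap = [0.01, 0.04, 0.03, 0.5]  # made-up adjusted p-values
    print(bonferroni(ap, n_valid_splitters=len(ap)))
    print(fdr_by_row(ap))
```

Results are returned in the same order as the input, so they can be lined up against the rows of the list regardless of how it is sorted.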
7.4.1 P Value Plots
A p-value plot may be initiated from the manual split window according to either of two different sort methods.
The Plot P Values by Var # button initiates a set of p-value plots sorted by variable number (that is, in spreadsheet order). When the p-value plots are sorted this way, Simes’ method may be applied to the raw and adjusted p-values.
The Plot FDR (by aP) button initiates a set of p-value plots which are sorted by the value of the adjusted P-value (aP). As implied by this button, when the p-value plots are sorted this way, the False Discovery Rate (FDR), as applied to the aP values, is offered as one of the plots in the p-value plot window.
Please see 14 for more information on p-value plots.
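For reference, Simes’ method combines a set of p-values into a single global p-value. The sketch below assumes the standard Simes formula (the minimum over ranks of m times the ranked p-value divided by its rank); the function name is hypothetical and this is not SVS's implementation.

```python
def simes_p_value(p_values):
    """Combine p-values into one global p-value using the standard Simes formula."""
    m = len(p_values)
    if m == 0:
        raise ValueError("at least one p-value is required")
    ranked = sorted(p_values)
    # For the r-th smallest p-value, compute m * p / r and keep the minimum.
    combined = min(m * p / rank for rank, p in enumerate(ranked, start=1))
    return min(1.0, combined)


if __name__ == "__main__":
    print(simes_p_value([0.01, 0.04, 0.03, 0.5]))  # made-up p-values
```

Because the formula ranks the values internally, the result does not depend on the order in which the p-values are supplied.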
7.4.2 Define Split
The Define Split button allows you to explicitly drive the splitting process. There are two types of user-defined splits: splits for continuous or discrete predictors, and splits using categorical predictors. Defining splits is covered in Section 7.5.
7.4.3 Don’t Split
Closes the Manual Split window and returns you to the tree window, where the original tree node is again shown as a leaf node.
7.4.4 Using the Tree and Manual Split Window Together
It is possible to go back and forth between the tree window and the manual split window. The menu options usually available via the node pop-up menu are disabled until the split has been finalized.
The various buttons on the Manual Split Window have the following functions:
| Button | Function |
|---|---|
| OK | This finalizes any chosen split in the Manual Split Window. |
| Select All | This button selects all variables. This is useful in conjunction with P Value Plot. |
| P Value Plot | Plots the p-values for the subset of predictors that the user has highlighted, in the order in which the predictors are listed. Thus, if you want to see a plot of p-values in order of size, sort by p-value rank and then highlight the subset of predictors of interest. You can either click and drag to select a range, or click and then shift-click to highlight a subrange of predictors. |
| Define Split | Allows the user to explicitly drive the splitting process. There are two types of user-defined splits: splits for continuous or discrete predictors, and splits using categorical predictors. Defining splits is covered in the next topic, 7.5. |
| Cancel | Closes the window and returns you to the tree window. |