7.7 The Tree Menu
This is what the submenus of the Tree menu do:
| Menu Choice | Does This |
|---|---|
| Options | Opens a two-tabbed dialog that allows you to change calculation and display parameters. |
| Subset Spreadsheet | Creates a new spreadsheet view of any node highlighted with Shift+Click or Search Tree. |
| Subset Tree | Creates a new tree view of any node highlighted with Shift+Click or Search Tree. |
| Cherry Pick Compounds | Opens a dialog that allows you to cherry-pick compounds. |
| Extend Current Tree Randomly | Opens a dialog that allows you to create multiple trees using random values. |
| Search Tree | Opens a submenu with various methods of highlighting nodes of the tree. |
The Tree menu allows you to create subsets of your current tree and to generate groups of random trees.
7.7.1 Tree->Options
The menu choices under Tree->Options are covered in Chapter 3.5.3.1.
7.7.2 Tree->Subset Spreadsheet
You can select any node by using Shift+Click or the Search Tree submenu. A selected node is indicated by a pink border. Once a node is selected, the menu choice Tree->Subset Spreadsheet opens a new spreadsheet filled with the records in that node.
This spreadsheet will hold only the compounds indicated by a previously selected node.
7.7.3 Tree->Subset Tree
Using the same selected node, you can choose Tree->Subset Tree to create a new tree view with the selected node as the parent node. You can then analyze this subset of the original data set.
You can also select multiple interesting nodes in a tree by Shift+Clicking them and then perform a tree analysis on only the merged subset of observations that belong to the selected nodes. This menu selection displays a new tree analysis window whose root node corresponds to the observations from all previously selected (magenta-bordered) nodes.
In Fig. 7.46 we see that the center node has a higher mean than the two side nodes. We can Shift+Click each of the two side nodes and then select Tree->Subset Tree.
This gives a new tree view that allows us to explore the details of these lower mean nodes. Because these two nodes have an almost equal mean, it might make sense to combine them to obtain a larger sample size, rather than analyzing them separately.
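The effect of pooling two nodes can be illustrated outside ChemTree with a few lines of Python. This is only a sketch: the response values in `left` and `right` are made-up numbers standing in for the observations of the two side nodes, and the code does not use any ChemTree function.

```python
from statistics import mean

# Hypothetical responses for the two low-mean side nodes (invented values).
left = [4.1, 3.9, 4.3]
right = [4.0, 4.2, 4.4, 3.8]

# Merging the nodes simply pools their observations into one root node.
pooled = left + right

summary = {
    "n_left": len(left),
    "n_right": len(right),
    "n_pooled": len(pooled),
    "mean_pooled": mean(pooled),
}
print(summary)
```

The pooled node keeps roughly the same mean as its two sources but has more observations, which is what makes further splitting more reliable.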
7.7.4 Tree->Cherry Pick Compounds Using Current Tree
This is the “cherry-picking” mechanism, a feature that is enabled only with the cherry-picking module. It allows the in-silico prediction of compound activities. ChemTree allows predictions using either ChemTree-computed descriptors or user-supplied descriptors. Cherry-picking can be invoked from either a single-tree model or the multitree model. Using a single tree is useful if you hand-built a tree and are interested in predicting from that single tree. Note, however, that the best cherry-picking performance usually comes from averaging over multiple trees.
In either case it will drop the compounds in the specified file down every tree in the specified tree model, and predict the activity per tree based on the mean potency at the leaf node where the compound falls. It will also print the mean and optionally the median prediction of the forest of trees. It is optional whether to print the prediction for every tree, or just the final averaged/median prediction for all trees.
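The prediction scheme described above can be sketched in Python as follows. This is an illustration under stated assumptions, not ChemTree's implementation: each "tree" is a hypothetical function (`tree_a`, `tree_b`, `tree_c`) that returns the mean potency of the leaf a compound falls into, and the descriptor names `logp` and `mw` are invented for the example.

```python
from statistics import mean, median

# Hypothetical stand-ins for trees: each maps a compound's descriptors
# to the mean potency of the leaf node where the compound lands.
def tree_a(d):
    return 6.2 if d["logp"] > 3.0 else 4.1

def tree_b(d):
    return 5.8 if d["mw"] < 400 else 4.5

def tree_c(d):
    return 7.0 if d["logp"] > 3.0 and d["mw"] < 400 else 4.0

def predict(forest, descriptors):
    """Drop one compound down every tree and summarize the predictions."""
    per_tree = [tree(descriptors) for tree in forest]
    return {
        "per_tree": per_tree,        # optional per-tree output
        "mean": mean(per_tree),      # averaged prediction
        "median": median(per_tree),  # optional median prediction
    }

result = predict([tree_a, tree_b, tree_c], {"logp": 3.5, "mw": 450})
print(result)
```

With a single tree the list `per_tree` has one entry, so the mean and median coincide with that tree's leaf mean.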
The cherry-pick menu selection starts the cherry-picking wizard shown below.
On this wizard screen you specify the Compound Structure Format to be cherry-picked and whether you have a separate descriptor file. Note that you can cherry-pick from a descriptor file only. To do this, uncheck the Use a Compound Structure file check box and check the Use a separate descriptor file check box. A descriptor file contains a set of user-defined descriptors to be used for cherry-picking the compounds. You must specify at least one input file, either a descriptor file or a compound structure file. The compound structure file is used by ChemTree for computing atom-pair descriptors. It is possible to specify both a compound structure file and a descriptor file to combine user-defined and ChemTree-generated descriptors. After making your selections, click the Next button.
This wizard screen is used for selecting the source of compound structures. It appears only if the Use a Compound Structure file check box was checked in the first wizard screen. The screen for the SMILES format is shown here; the SD format is the same except that there is no ODBC option. To select the data source, first click the radio button that indicates the type of data source, then click the Browse button. In the case of ODBC, an ODBC browser is started for selecting a database; in the case of a text file, a standard file browser is started. If you are selecting data using ODBC, you have an additional checkbox, "Select data using SQL query". When this checkbox is not checked, an additional wizard screen (not shown) appears when you click the Next button. This additional screen is used for selecting the database table you want to cherry-pick, and in this mode all rows from the selected table are cherry-picked. If the checkbox is checked and you have entered a valid SQL select statement in the box below it, the select statement determines which table is used and which rows from the table are cherry-picked. After selecting your data source, click Next and the following screen appears.
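To show how a select statement determines both the table and the rows, here is a small sketch that uses Python's built-in `sqlite3` module as a stand-in for an ODBC data source. The table name `compounds` and the columns `name`, `smiles`, and `vendor` are hypothetical; your own database will have different names.

```python
import sqlite3

# In-memory database standing in for an ODBC source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE compounds (name TEXT, smiles TEXT, vendor TEXT)")
conn.executemany(
    "INSERT INTO compounds VALUES (?, ?, ?)",
    [
        ("cmpd-1", "CCO", "A"),
        ("cmpd-2", "c1ccccc1", "B"),
        ("cmpd-3", "CC(=O)O", "A"),
    ],
)

# FROM names the table; WHERE restricts which rows are returned.
query = "SELECT name, smiles FROM compounds WHERE vendor = 'A' ORDER BY name"
rows = conn.execute(query).fetchall()
conn.close()

for name, smiles in rows:
    print(name, smiles)
```

A statement of the same shape, without the WHERE clause, would return every row of the table, which corresponds to leaving the checkbox unchecked and choosing the table on the additional wizard screen.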
This screen has two versions. The SMILES version is shown here. It has one list for identifying the data field that represents the SMILES string and one list for identifying the field that represents the compound name. Just click one field name in each list to set the SMILES string and compound name fields used for analysis. If there is no compound name field in the file then do not select any fields in the second list. At the bottom of the screen there is a set of radio buttons that indicate how to deal with any other fields that may be in the file. The SD version is exactly the same except it has only one list for identifying the compound name.
After completing this screen you will see a screen for selecting an additional descriptor file if you checked the "Use a separate descriptor file" checkbox on the first screen. This screen is shown below and works exactly like the screen for selecting a data source described above, with the addition of a drop-down list to indicate how the fields are separated. Fields are separated by either commas or spaces.
After completing this screen and clicking Next you will see the final wizard screen. Here you set the parameters for how compounds will be cherry picked and what outputs you want. You can have all compounds selected or you can specify parameters of mean or median predictions to use when selecting compounds. This is done by selecting the appropriate radio button. Selecting by prediction allows the filtering of compounds according to a given activity range. You can specify that the mean or median predicted activity is above or below a given numeric threshold. The median activity only applies to multiple tree predictions.
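Selecting by prediction amounts to a simple filter, sketched below in Python. The function `select_by_prediction`, its parameters, and the compound names are hypothetical, and the use of strict comparisons (`>` and `<`) is an assumption; the manual does not say whether values equal to the threshold are included.

```python
def select_by_prediction(predictions, statistic="mean", direction="above", threshold=0.0):
    """Keep compounds whose predicted activity is above or below a threshold.

    predictions maps a compound name to {"mean": float, "median": float}.
    """
    if direction not in ("above", "below"):
        raise ValueError("direction must be 'above' or 'below'")
    picked = []
    for name, stats in predictions.items():
        value = stats[statistic]
        if direction == "above" and value > threshold:
            picked.append(name)
        elif direction == "below" and value < threshold:
            picked.append(name)
    return picked

# Invented multi-tree predictions for three compounds.
predictions = {
    "cmpd-1": {"mean": 6.4, "median": 6.5},
    "cmpd-2": {"mean": 4.9, "median": 4.5},
    "cmpd-3": {"mean": 5.6, "median": 6.1},
}

active = select_by_prediction(predictions, "median", "above", 6.0)
print(active)
```

Note that `cmpd-3` passes the median filter but would fail a mean filter at the same threshold, which is why the choice between mean and median matters for multiple-tree predictions.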
The lower part of the wizard, shown above, is for setting output options. The first checkbox, Output predicted activity, causes either a new spreadsheet to be added to the project or a CSV file to be created showing the activity predictions. Use the radio buttons under the checkbox to identify where the output will go. If you choose to output to CSV then you must also click the Browse button and select where you want your output to go. Further options are available to print median prediction of all trees and to print prediction for each tree. The latter is useful if you want to do your own multiple tree analysis using other metrics than mean or median. If you turn on print prediction for each tree you also have the option to print standard deviation for each predictor node and to print predictor node names. The latter option may be useful for power users who want to look at how compounds cluster into nodes.
The Output cherry picked compounds check box is only active when you use the “Select compounds by prediction of” option and have specified an input compound file. This option saves the subset of highly active compounds to a new compound file. If selected, it causes an additional wizard screen (not shown) to appear for specifying where your new compound file will be created.
7.7.5 Tree->Extend Current Tree Randomly
Creating random trees (sometimes referred to as multiple trees) is covered in detail in Chapter 9, Random Tree Generation.
7.7.6 Tree->Search Tree
If you want a more dynamic way to highlight nodes for use with Tree->Subset Tree and Tree->Subset Spreadsheet, or if you simply want to find a single node in a large tree, the Search Tree submenu can help you.
The Search Tree submenu highlights nodes in the same fashion as Shift+Click, allowing them to be used by the Subset Spreadsheet and Subset Tree submenus of the Tree menu. See Sections 7.7.2 and 7.7.3 for more information. This is what the five items of the Search Tree menu do:
| Menu Choice | Does This |
|---|---|
| Find Observation | Opens a dialog with options for searching through a tree for observation names. |
| Find Node | Opens a dialog with options for searching for the labels assigned to tree nodes, e.g. N112. |
| Select Node by Threshold | Opens a dialog with options for searching for nodes with a mean above or below a threshold. |
| Highlight All Nodes | Highlights all the nodes in a tree. |
| UnHighlight All Nodes | Removes the highlight from all the nodes in a tree. |
The menus Tree->Subset Tree and Tree->Subset Spreadsheet perform operations only on the nodes highlighted by Shift-Clicking or using the search methods of this sub-menu. You can now search large trees for nodes of significance with Select Node by Threshold or find where in a tree a compound lies with Find Observation.
7.7.6.1 Tree->SearchTree->Find Observation
You can use SearchTree->Find Observation to find where a compound lies in a tree and highlight either that node or the entire path to it. You can also search for all observations whose names contain a specific string.
There are three options you can change:
| Value Labeled | Explanation |
|---|---|
| Highlight | When Leaves Only is selected, the search highlights only the leaf node or nodes matching the search criteria. When Entire Path is selected, every node that contains a matching observation is highlighted, that is, the entire path to each matching leaf. |
| Search Method | Specifies whether the search will find only exact matches of the search string or will include all observations with labels that contain the search string. |
| Select Method | The Add to Lookup method will leave the current highlighted nodes on the tree, while the Replace Lookup will unhighlight the entire tree before highlighting any found nodes. |
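
How the three options interact can be sketched in Python. This is an illustration only: the tree in `PARENT`, the observation names in `LEAF_OBS`, and the function `find_observation` with its parameters are all hypothetical and are not part of ChemTree.

```python
# Hypothetical tree: node label -> parent label (None for the root).
PARENT = {"N1": None, "N11": "N1", "N12": "N1", "N111": "N11", "N112": "N11"}

# Observations held by each leaf node (invented names).
LEAF_OBS = {
    "N111": ["aspirin", "ibuprofen"],
    "N112": ["naproxen"],
    "N12": ["aspirin-analog"],
}

def find_observation(query, exact=True, entire_path=False,
                     highlighted=None, replace=True):
    """Return the set of highlighted node labels after a search."""
    # Select Method: Replace Lookup clears old highlights, Add to Lookup keeps them.
    result = set() if replace or highlighted is None else set(highlighted)
    for leaf, names in LEAF_OBS.items():
        # Search Method: exact match versus substring match.
        hit = any((query == n) if exact else (query in n) for n in names)
        if not hit:
            continue
        # Highlight: the leaf only, or every node on the path up to the root.
        node = leaf
        while node is not None:
            result.add(node)
            if not entire_path:
                break
            node = PARENT[node]
    return result

print(sorted(find_observation("aspirin")))
print(sorted(find_observation("aspirin", exact=False)))
print(sorted(find_observation("naproxen", entire_path=True)))
print(sorted(find_observation("naproxen", highlighted={"N12"}, replace=False)))
```

The last call shows the Add to Lookup behavior: node N12 stays highlighted and the newly found leaf is added to it.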
7.7.6.2 Tree->SearchTree->Find Node
If you would like to highlight a node with a specific node name, or highlight a node and its subtree, you can use the SearchTree->Find Node dialog.
There are two options you can change:
| Value Labeled | Explanation |
|---|---|
| Search Method | Specifies whether the search will find only exact matches of the search string or will include all nodes with labels that contain the search string. |
| Select Method | The Add to Lookup method will leave the current highlighted nodes on the tree, while the Replace Lookup will unhighlight the entire tree before highlighting any found nodes. |
7.7.6.3 Tree->SearchTree->Select Node by Threshold
If you would like to highlight only the nodes with a high or low mean response, you can use the SearchTree->Select Node by Threshold dialog.
There are two options you can change:
| Value Labeled | Explanation |
|---|---|
| Search | When Leaves Only is selected, the search highlights only the leaf node or nodes matching the search criteria. When All Nodes is selected, any node matching the search is highlighted. |
| Select Method | The Add to Lookup method will leave the current highlighted nodes on the tree, while the Replace Lookup will unhighlight the entire tree before highlighting any found nodes. |
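
A threshold search of this kind can be sketched in Python as follows. The node labels and means in `NODES` and the function `select_by_threshold` are hypothetical, and the strict comparisons are an assumption about how values equal to the threshold are treated.

```python
# Hypothetical nodes: label -> (mean response, is_leaf).
NODES = {
    "N1": (5.0, False),
    "N11": (6.1, False),
    "N12": (3.9, True),
    "N111": (6.8, True),
    "N112": (5.2, True),
}

def select_by_threshold(threshold, above=True, leaves_only=True):
    """Return labels of nodes whose mean is above (or below) the threshold."""
    selected = set()
    for label, (node_mean, is_leaf) in NODES.items():
        # Search option: skip internal nodes when Leaves Only is chosen.
        if leaves_only and not is_leaf:
            continue
        matches = (node_mean > threshold) if above else (node_mean < threshold)
        if matches:
            selected.add(label)
    return selected

print(sorted(select_by_threshold(6.0)))
print(sorted(select_by_threshold(6.0, leaves_only=False)))
print(sorted(select_by_threshold(4.0, above=False)))
```

The resulting set plays the same role as a set of Shift+Clicked nodes, so it can be passed on to Subset Tree or Subset Spreadsheet.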
7.7.6.4 Tree->SearchTree->Highlight All Nodes
This menu item provides a quick way to highlight all the nodes in the tree.
7.7.6.5 Tree->SearchTree->UnHighlight All Nodes
This menu item provides a quick way to remove the highlight from all the nodes in the tree.