6.3 Navigating the Spreadsheet Menus
6.3.1 File Menu
The File menu has the following menu items: Save As, Save As Comma-Delimited Text File, Import a Legacy Tree Model, and Close.
6.3.1.1 Save As - Exporting Data
The menu choice File->Save As opens an Export Data file dialog. The blank areas cannot be filled in; they are placeholders that show the selected file and format once chosen. Click the Browse button. This opens a Save As dialog window, entitled Choose a file and format to export to, in which the Save as type setting defaults to the CSV file type. (For more information on CSV files see the topic Data Types.) The Save as type drop-down menu allows the file to be saved in several popular data formats. When the file has been named and the file format designated, click the Save button. This returns you to the Export Data window.
6.3.1.2 Save As Comma-Delimited Text File
This menu choice saves the spreadsheet as an ASCII, comma-delimited (CSV) file. It opens a Save As dialog window. Use this dialog to navigate to the destination folder and name the new CSV file. This “.csv” file may then be imported by other spreadsheet programs.
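As an illustration of the file format only, and not of how ChemTree writes its files, the following minimal sketch produces comma-delimited text with Python's standard csv module. The column names and values are made up for the example.

```python
import csv
import io

def rows_to_csv_text(header, rows):
    """Return the header and rows as comma-delimited text, one record per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)   # first line: column names
    writer.writerows(rows)    # one line per record
    return buffer.getvalue()

# Hypothetical spreadsheet content (not taken from ChemTree).
header = ["Compound", "Scented", "Activity"]
rows = [["C001", "yes", 0.82], ["C002", "no", 0.15]]

csv_text = rows_to_csv_text(header, rows)
print(csv_text)
```

Each record becomes one line of text with the values separated by commas, which is why other spreadsheet programs can read the file.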
6.3.1.3 Import a Legacy Tree Model
This option opens a file browser where you can select a previously created .ght file. The tree file you are importing must have the same number of columns, the same data type for each column, and the same dependent variable as the current spreadsheet. If, after importing the tree, the values displayed do not match in all three columns, the original data does not exactly match the current spreadsheet.
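The three import conditions can be read as a simple compatibility check. The sketch below is hypothetical: the dictionary layout and the names `types` and `dependent` are assumptions made for illustration and do not describe the .ght file format or ChemTree's own checks.

```python
def is_compatible(tree_schema, sheet_schema):
    """Check the three import conditions: column count, column types, dependent variable.

    Each schema is a dict with two hypothetical keys:
      "types":     list of column data types, in column order
      "dependent": name of the dependent variable
    """
    same_count = len(tree_schema["types"]) == len(sheet_schema["types"])
    same_types = tree_schema["types"] == sheet_schema["types"]
    same_dependent = tree_schema["dependent"] == sheet_schema["dependent"]
    return same_count and same_types and same_dependent

# Hypothetical schemas for a saved tree and the current spreadsheet.
tree = {"types": ["label", "numeric", "numeric"], "dependent": "Activity"}
sheet_ok = {"types": ["label", "numeric", "numeric"], "dependent": "Activity"}
sheet_bad = {"types": ["label", "numeric"], "dependent": "Activity"}

print(is_compatible(tree, sheet_ok))   # True
print(is_compatible(tree, sheet_bad))  # False: column count differs
```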
6.3.1.4 Closing the File
The File->Close menu selection closes the data file and its associated spreadsheet. A dialog window prompts for the saving of unsaved data.
6.3.2 Edit Menu
Using the Edit menu we can find content, create subsets within the current spreadsheet and spin off new spreadsheets.
6.3.2.1 Select Row Subset
The menu choice Edit->Select Row Subset gives you several ways to automatically pick a subset of records (rows). We can choose:
Random fraction to specify what fraction of the records to use (default = 0.5);
Random selection size to specify the number of records (default = all of them);
First N items to specify the first N records to use from the spreadsheet (default = all); and
Reset random seed to change the random seed, although the default of 1 will do in most cases.
Fig. 6.9 shows a choice of randomly selecting 50% of the records.
Resetting the random seed to 1 enables you to pick the same random subset as would have occurred if you just started the program up and no random number generation had taken place.
500 Random Records
The result of creating a Random fraction = 0.5 subset on a dataset with 1000 records is a subset of 500 randomly selected records. These 500 records can be analyzed with less concern about bias in their selection.
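The effect of a fixed random seed can be sketched in Python. This is an illustration under assumptions, not ChemTree's implementation: the function name is hypothetical and Python's random number generator is not the one ChemTree uses, so the specific rows chosen will differ from the program's.

```python
import random

def random_fraction_subset(n_rows, fraction=0.5, seed=1):
    """Return sorted row indices for a random subset holding the given fraction of rows."""
    rng = random.Random(seed)      # private generator; a fixed seed makes the result repeatable
    k = round(n_rows * fraction)   # number of rows to keep
    return sorted(rng.sample(range(n_rows), k))

subset_a = random_fraction_subset(1000)
subset_b = random_fraction_subset(1000)

print(len(subset_a))          # 500
print(subset_a == subset_b)   # True: same seed, same subset
```

Starting from the same seed always yields the same subset, which is what makes a random selection reproducible from one session to the next.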
6.3.2.2 Activate All Rows
The easiest way to activate all rows is to use this command.
6.3.2.3 Inverting the Records (Rows) Selected
There may be times when you want to run two mutually exclusive groups of data from the same data set. The menu choice Edit->Invert row selection flips the selected and de-selected records: all of the de-selected records become selected and all of the selected records become de-selected. Select Row Subset and Invert Row Selection are often used together to build a training set and then invert to the holdout or test set to validate the model.
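The training/holdout idea can be sketched as follows. The list of flags is a hypothetical stand-in for the spreadsheet's row selection, used only to show that inversion produces two groups with no rows in common.

```python
def invert_selection(selected):
    """Flip a list of booleans: selected rows become de-selected and vice versa."""
    return [not flag for flag in selected]

# Hypothetical selection over six rows: True marks a selected (training) row.
training = [True, False, True, True, False, False]
holdout = invert_selection(training)

# Every row belongs to exactly one of the two groups.
assert all(t != h for t, h in zip(training, holdout))

print(holdout)  # [False, True, False, False, True, True]
```

Inverting twice returns the original selection, so you can switch back and forth between the training set and the holdout set.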
6.3.2.4 Inverting the Columns Selected
For convenience, there is also a way to invert the selected columns. Selecting Edit->Invert Column Selection will activate all inactive columns, and inactivate all active columns. Using this menu item will also clear dependent column status.
6.3.2.5 Row Subset Spreadsheet
You can create a new spreadsheet from the selected records (rows) of another. If you look at the illustration and the compound numbers in the second label column at left you can see this spreadsheet contains some, but not all, of the compounds in the original spreadsheet.
This (row) subset spreadsheet displays in a separate viewer, allowing you to close the viewer of the original spreadsheet. However, if you delete the navigator node of the original spreadsheet, this subset spreadsheet and its navigator node will also be deleted.
You can use subset spreadsheets to split a spreadsheet’s data set into several smaller data sets. After selecting the records you wish to place into a subset spreadsheet, create that subset spreadsheet and then use File->Save As to create a new file (for example, a .txt or .csv file) for importing into another project. Repeat these steps for each subset.
6.3.2.6 Find Column Search Tool
The menu choice Edit->Find column allows you to search for a matching column name. The found column is placed as the first column (left) in the spreadsheet. Type in the name of the column you are looking for and click the OK button. ChemTree makes the best possible match ignoring case and will make partial matches. For example, searching on “s” would find s, S, scented, or Scented, in whichever column a match is first found. Searching on “sc” would find scented or Scented.
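The matching behavior in the example above can be sketched in Python. The sketch assumes that a partial match means the column name starts with the search text; the manual does not say exactly how ChemTree ranks candidate matches, so treat this as one plausible reading. The function and column names are hypothetical.

```python
def find_column(columns, query):
    """Move the first column whose name starts with `query` (ignoring case) to the front.

    Returns a new list; if nothing matches, the column order is unchanged.
    """
    needle = query.lower()
    for i, name in enumerate(columns):
        if name.lower().startswith(needle):
            return [name] + columns[:i] + columns[i + 1:]
    return list(columns)

# Hypothetical column headings.
columns = ["ID", "Weight", "Scented", "Solubility"]

print(find_column(columns, "sc"))  # ['Scented', 'ID', 'Weight', 'Solubility']
print(find_column(columns, "s"))   # ['Scented', 'ID', 'Weight', 'Solubility']
```

Searching on “s” stops at the first column that matches, which is why Scented is found before Solubility here.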
6.3.2.7 Inactivate All Columns/Activate All Columns
If you have many columns, it is easier to deactivate (inactivate) all of them and then just activate the few you wish to analyze. You will, of course, have to select a column for the dependent variable and activate one or more independent variables.
Similarly, use Activate all columns when a data set has various columns set to inactive and you wish to have all or most of them activated.
These two commands have the same effect as clicking and shift-clicking across all of the column headings.
6.3.3 Analysis Menu
The Analysis menu is where we do our serious work in ChemTree. Up to this point in this documentation we have been acquiring, converting and manipulating data to get it ready for recursive partitioning analysis.
6.3.3.1 Interactive Tree Analysis
The menu choice Analysis->Interactive Tree Analysis opens up the tree view. This is covered in depth in Chapter 7, Interactive Tree Analysis.
6.3.3.2 Create a Multiple-Tree Model
This menu choice will create a forest of random trees for analysis. This is covered in depth in Chapter 9, Random Tree Generation.
6.3.3.3 Apply a Tree Model
This menu choice allows you to make predictions using a tree model. Using tree models for predictions is demonstrated in Chapter 8, Prediction Recipes.
6.3.4 Help Menu
The menu choice Help->Spreadsheet Help opens the on-line manual for help on using the menus on the spreadsheet.