5.6 Scripting Reference
5.6.1 Project Related Commands
5.6.1.1 Creating a New Project
To create a new project that can later be viewed in GUI mode, use the following command. Once you create a project, all new ChemTree objects will be added to the project as if you were performing the same operations in GUI mode.
EXAMPLE
ghi.newProject('Discovery', '/projects')
SYNTAX
ghi.newProject('projectName', 'projectLocation')
Note that the projectLocation must be an existing folder on your file system and projectName will be a new folder in the projectLocation folder.
5.6.1.2 Creating a Temporary Project
To create a temporary project, use the following command. Projects created with this command will not be saved to disk, and will not be available once the script has completed. This command cannot be used in GUI mode.
Note: Projects created with this command cannot be saved using the saveProject command.
EXAMPLE
ghi.newTempProject()
SYNTAX
ghi.newTempProject()
5.6.1.3 Open an Existing Project
To open a project previously created in either GUI mode or script mode, use this command.
EXAMPLE
ghi.openProject('/projects/Discovery/Discovery.ghp')
SYNTAX
ghi.openProject('projectPath/projectName.ghp')
5.6.1.4 Saving a Project
When you are at a point in your workflow where you want to save the state of the current project use this command.
EXAMPLE
ghi.saveProject()
SYNTAX
ghi.saveProject()
5.6.1.5 Closing a Project
The following command will close the current project without saving the state of the project first. If you want to save the project state first use the saveProject command.
EXAMPLE
ghi.closeProject()
SYNTAX
ghi.closeProject()
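Because closeProject() does not save, a script usually wants to save first and still close the project if something fails. The following is a minimal sketch of that pattern, not part of the ChemTree API: run_in_project and the RecordingGhi stand-in are hypothetical names introduced here so the example can run outside ChemTree. Inside ChemTree you would pass the real ghi object instead.

```python
class RecordingGhi:
    """Hypothetical stand-in for the ghi object; it only records calls.

    It is NOT the real ChemTree object. It exists so this sketch can run
    outside ChemTree.
    """

    def __init__(self):
        self.calls = []

    def openProject(self, path):
        self.calls.append(("openProject", path))

    def saveProject(self):
        self.calls.append(("saveProject",))

    def closeProject(self):
        self.calls.append(("closeProject",))


def run_in_project(ghi, project_file, work):
    """Open a project, run work(), save on success, and always close."""
    ghi.openProject(project_file)
    try:
        result = work()
        # closeProject() does not save, so save explicitly first.
        ghi.saveProject()
        return result
    finally:
        # Runs whether work() succeeded or raised.
        ghi.closeProject()


demo_ghi = RecordingGhi()
outcome = run_in_project(demo_ghi, "/projects/Discovery/Discovery.ghp", lambda: "done")
print(outcome, demo_ghi.calls)
```

If work() raises, the project is closed without being saved, which matches the documented behavior of closeProject().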
5.6.2 General GHI Commands
5.6.2.1 Allowing Viewers to Display
There may be times when you are running a script and do not want to see viewers, such as progress dialogs, while the script commands run. The following command either suppresses or allows the display of GUI viewers while executing scripts. Note that you can turn viewers on and off at any time while running a script, and that this command only affects scripts that are run from the Scripts menu of a viewer.
EXAMPLE
ghi.enableNewViewers(1)
SYNTAX
ghi.enableNewViewers(0 = false, 1 = true)
5.6.2.2 Allowing Log Messages to Be Created
There might be times when you do not want to have logging take place during the execution of script commands, but other times when you do want logging. The following command will either suppress or allow the logging of actions while executing scripts. Note that you can turn logging on and off at any time while running a script.
EXAMPLE
ghi.enableLogging(1)
SYNTAX
ghi.enableLogging(0 = false, 1 = true)
5.6.2.3 Display a GUI Message
Sometimes you may want to pop up a GUI based message to report status or other information. This command will take the text parameter and display it in a standard message dialog window.
EXAMPLE
ghi.message("my important message")
SYNTAX
ghi.message("some message")
5.6.2.4 Display a GUI Error Message
When you create a script that uses try/except syntax you can put this command in the except clause and any exception message will be displayed in a GUI error dialog.
EXAMPLE
ghi.error()
SYNTAX
ghi.error()
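The try/except usage described above can be sketched as follows. This is an illustration only: StubGhi is a hypothetical stand-in that records what the dialogs would show, and its error() method assumes that the real ghi.error() reports the exception currently being handled. safe_step and failing_step are names invented for this example.

```python
import sys


class StubGhi:
    """Hypothetical stand-in for ghi; records what the dialogs would show."""

    def __init__(self):
        self.shown = []

    def message(self, text):
        self.shown.append(("message", text))

    def error(self):
        # Assumption: the real ghi.error() reports the exception that is
        # currently being handled. The stub reads it from sys.exc_info().
        exc = sys.exc_info()[1]
        self.shown.append(("error", str(exc)))


def safe_step(ghi, step):
    """Run step(); report success with message() or failure with error()."""
    try:
        step()
    except Exception:
        ghi.error()  # called inside the except clause, as described above
        return False
    ghi.message("Step finished")
    return True


def failing_step():
    raise ValueError("column 'activity' not found")


stub = StubGhi()
ok = safe_step(stub, failing_step)
print(ok, stub.shown)
```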
5.6.2.5 Getting a Specific Navigator Node
When you know a Navigator Node display name or a Navigator Node ID you can retrieve an object representing that Navigator Node. The following command is an overloaded command that takes either an integer for the Node ID or a String for the Node name. When asking for a Navigator Node by ID a single object is returned. When asking for a Navigator Node by name a list of objects is returned because names are not guaranteed to be unique.
EXAMPLE
ghi.getObject('name')
SYNTAX
objectList = ghi.getObject('navigator node display name')
EXAMPLE
ghi.getObject(ID)
SYNTAX
objectVariable = ghi.getObject(navigator node ID)
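Because a lookup by name returns a list while a lookup by ID returns a single object, scripts often need a small guard around the name form. The sketch below shows one way to do that. StubNode, StubGhi, and get_unique_node are hypothetical names used only so the example runs outside ChemTree, and the stub's behavior for an unknown ID is an assumption.

```python
class StubNode:
    """Hypothetical stand-in for a Navigator Node object."""

    def __init__(self, node_id, name):
        self._id = node_id
        self._name = name

    def getID(self):
        return self._id

    def getName(self):
        return self._name


class StubGhi:
    """Hypothetical stand-in for ghi that mimics the two getObject() forms."""

    def __init__(self, nodes):
        self._nodes = nodes

    def getObject(self, key):
        if isinstance(key, int):
            # Lookup by ID: a single object comes back.
            for node in self._nodes:
                if node.getID() == key:
                    return node
            # Assumption: the stub raises when the ID is unknown.
            raise LookupError("no node with ID %d" % key)
        # Lookup by name: a list comes back, because names may repeat.
        return [node for node in self._nodes if node.getName() == key]


def get_unique_node(ghi, name):
    """Return the only node called name; raise if none or several match."""
    matches = ghi.getObject(name)
    if len(matches) != 1:
        raise LookupError("expected 1 node named %r, found %d" % (name, len(matches)))
    return matches[0]


demo = StubGhi([StubNode(1, "Discovery"), StubNode(2, "Tree"), StubNode(3, "Tree")])
print(get_unique_node(demo, "Discovery").getID())
```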
5.6.2.6 Getting the Current Navigator Node
Another way to get access to Navigator Nodes is to ask for the currently highlighted Node. If no node is highlighted, an error will be displayed; otherwise, an object representing the current Node will be returned.
EXAMPLE
myVar = ghi.getCurrentObject()
SYNTAX
objectName = ghi.getCurrentObject()
5.6.2.7 Choosing a File
This method will display a dialog window for browsing and selecting one or more files. If one or more files are selected, then a tuple with the complete path to each file is returned. If the dialog is canceled, then an empty tuple is returned. There are two required parameters. The first defines a file extension mask. For example, if you pass in "*.txt", the dialog will only display files that have the .txt extension. The second is the title to be displayed in the dialog’s title bar. The third argument is optional. If you put 1 for the third argument, the file chooser will allow multiple files to be selected. If the third argument is omitted or set to anything other than 1, the chooser will default to selecting only a single file.
EXAMPLE
myFilePath = ghi.chooseFile("*.txt", "Choose A File Please", 1)
SYNTAX
file path = ghi.chooseFile(file extension mask, dialog title, allow multiple selection)
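Since chooseFile() returns a tuple in both the selected and the cancelled case, a script should test for the empty tuple before using the result. A minimal sketch follows. StubGhi is a hypothetical stand-in that returns a canned reply instead of showing a dialog, and pick_files is a helper name invented for this example.

```python
class StubGhi:
    """Hypothetical stand-in for ghi; chooseFile() returns a canned reply."""

    def __init__(self, reply):
        self._reply = reply

    def chooseFile(self, mask, title, allow_multiple=0):
        return self._reply


def pick_files(ghi, mask, title, allow_multiple=0):
    """Return the chosen paths as a list; an empty list means 'cancelled'."""
    chosen = ghi.chooseFile(mask, title, allow_multiple)
    if not chosen:
        # An empty tuple signals that the dialog was cancelled.
        return []
    return list(chosen)


picked = pick_files(StubGhi(("/data/a.txt", "/data/b.txt")), "*.txt", "Choose Files", 1)
cancelled = pick_files(StubGhi(()), "*.txt", "Choose Files")
print(picked, cancelled)
```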
5.6.2.8 Choosing a directory
The following method can be used to create a file browser for browsing to, and selecting directories. This method has one required parameter, and takes one optional parameter. The required parameter is the title to be displayed in the dialog’s title bar. The optional parameter specifies the initial working directory of the browser. If this parameter is omitted, then the ChemTree application directory will be used as the initial working directory. If the dialog is cancelled, an empty string will be returned. Otherwise, the path of the selected directory will be returned.
EXAMPLE
myDirectoryPath = ghi.chooseDirectory("Choose A Directory Please", "C:/ChemTree/example")
SYNTAX
directory path = ghi.chooseDirectory(dialog title, [initial working directory])
5.6.2.9 Creating a Progress Bar
This method will create a progress bar which can display the progress of a certain task, and be used to signal the cancellation of processes. There are two required arguments for this method. The first argument specifies the text to be displayed on the progress bar. The second argument defines the number of progress increments for the progress bar.
EXAMPLE
myProgressBar = ghi.progressBar("Please Wait", 100)
SYNTAX
progress bar = ghi.progressBar("dialog caption", total number of progress increments)
5.6.2.10 Setting the Progress Bar’s Progress
The following method allows you to set the progress displayed by the progress bar to the value passed in. This value will be displayed on the progress bar as a percentage based on the proportion of the specified progress to the total number of progress increments. For example, if a progress bar is defined as having 50 progress increments, setting the progress to 10 will cause the progress bar to display 20 percent completion.
EXAMPLE
myProgressBar.setProgress(10)
SYNTAX
progressBar.setProgress(progressValue)
5.6.2.11 Checking Whether the Progress Bar Has Been Cancelled
The following function allows you to check whether a user has pressed the cancel button on the progress bar. This information may prove useful when trying to determine whether to stop a process prior to its completion. If the progress bar has been cancelled, the method returns 1. Otherwise, the method returns 0.
EXAMPLE
variable = myProgressBar.wasCancelled()
SYNTAX
integerVariable = myProgressBar.wasCancelled()
5.6.2.12 Disposing of a Progress Bar When Done
It is good practice to make sure that a progress bar is disposed of when the task is complete. After this method is called, the Progress Bar will no longer show itself and calling methods on the script object will have no effect.
EXAMPLE
myProgressBar.finish()
SYNTAX
myProgressBar.finish()
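The four progress bar commands are typically used together in a loop. The sketch below shows one possible arrangement: check wasCancelled() before each item, call setProgress() after it, and call finish() in a finally clause so the bar is always disposed of. StubProgressBar and StubGhi are hypothetical stand-ins (the stub simulates a user pressing Cancel once a given progress value is reached), and percent_complete simply restates the percentage arithmetic described above.

```python
class StubProgressBar:
    """Hypothetical stand-in for the object returned by ghi.progressBar()."""

    def __init__(self, caption, total, cancel_at=None):
        self.caption = caption
        self.total = total
        self.progress = 0
        self.finished = False
        self._cancel_at = cancel_at  # simulate Cancel at this progress value

    def setProgress(self, value):
        self.progress = value

    def wasCancelled(self):
        cancelled = self._cancel_at is not None and self.progress >= self._cancel_at
        return 1 if cancelled else 0

    def finish(self):
        self.finished = True


class StubGhi:
    """Hypothetical stand-in for ghi that hands out stub progress bars."""

    def __init__(self, cancel_at=None):
        self._cancel_at = cancel_at
        self.bar = None

    def progressBar(self, caption, total):
        self.bar = StubProgressBar(caption, total, self._cancel_at)
        return self.bar


def percent_complete(progress, total_increments):
    """Percentage shown for a given progress value."""
    return 100.0 * progress / total_increments


def process_items(ghi, items, handle):
    """Handle each item, updating the bar and stopping early on Cancel."""
    bar = ghi.progressBar("Please Wait", len(items))
    done = 0
    try:
        for item in items:
            if bar.wasCancelled():  # 1 means the user pressed Cancel
                break
            handle(item)
            done += 1
            bar.setProgress(done)
    finally:
        bar.finish()  # always dispose of the bar
    return done


handled = []
demo_ghi = StubGhi(cancel_at=3)
count = process_items(demo_ghi, [1, 2, 3, 4, 5], handled.append)
print(count, percent_complete(10, 50))
```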
5.6.2.13 Creating a Status Dialog
This method will create a status dialog which can display messages for a task that cannot incrementally update a Progress Bar. This is also useful for brief tasks that do not require the full weight of a Progress Bar. There is one argument for this method: the message to be displayed by the Status Dialog.
EXAMPLE
myStatusDialog = ghi.statusDialog("Doing something brief")
SYNTAX
status dialog = ghi.statusDialog("dialog message")
5.6.2.14 Setting the Status Dialog’s Message
The following method allows you to change the message displayed by the status dialog. This may be useful when you have a series of tasks and you would like to inform the user which task is currently being worked on. The only argument is the message to update the dialog with.
EXAMPLE
myStatusDialog.setMessage("Now working on a very hard problem.")
SYNTAX
statusDialog.setMessage(newMessage)
5.6.2.15 Closing the Status Dialog When Done
To close the status dialog, simply call this method. You should always remember to finish the status dialog that you start and only use one at a time.
EXAMPLE
myStatusDialog.finish()
SYNTAX
myStatusDialog.finish()
5.6.3 Commands Common to All Objects
Some commands are available for all the ChemTree objects that you can access from the Python shell. These commands allow you to control GUI aspects of objects you create in scripting.
5.6.3.1 Change a Navigator Node Name
During the course of a script you could be creating Navigator Node objects that will appear in the Navigator Window the next time you open the project in GUI mode. If the generic names assigned to new Navigator Nodes are not what you want, you can change the name with this command.
EXAMPLE
myVariable.setName('myname')
SYNTAX
objectName.setName(new navigator node name)
5.6.3.2 Getting a Navigator Node Name
If you need to know the name of a Navigator Node use this command with any Python object that corresponds to a Navigator Node.
EXAMPLE
myVariable = someObject.getName()
SYNTAX
variableName = objectName.getName()
5.6.3.3 Getting a Navigator Node Type
If needed, you can get the Navigator Node type from an object with this command. The command returns a string displaying the object’s type.
EXAMPLE
myVariable.getType()
SYNTAX
objectName.getType()
5.6.3.4 Getting a Navigator Node ID
If needed, you can get the Navigator Node ID from an object with this command. The command returns an integer representing a node’s ID.
EXAMPLE
myVariable.getID()
SYNTAX
objectName.getID()
5.6.3.5 Deleting a Navigator Node
To delete a Navigator Node enter this command in the Python Shell window. If a node can not be deleted, such as the project node or a node that is used to create another node, then a message will be displayed and the node will not be deleted. After entering this command in the Python Shell the variable that represented the node will no longer be valid and any attempt to use it will display a message saying it is no longer valid.
EXAMPLE
myVariable.deleteObject()
SYNTAX
objectName.deleteObject()
5.6.3.6 Closing a Navigator Viewer
To cause the viewer for a Navigator Node to be shut down you can enter this command in the Python Shell window.
EXAMPLE
myNodeVariable.close()
SYNTAX
objectName.close()
5.6.3.7 Showing a Navigator Viewer
To cause the viewer for a Navigator Node to be displayed you can enter this command in the Python Shell window.
EXAMPLE
myNodeVariable.show()
SYNTAX
objectName.show()
5.6.3.8 Finding a Node’s Parent
To get an object that represents a node’s parent enter this command in the Python Shell window and it will return an object representing the parent node. You can use the getType() command to test what type of object is returned.
EXAMPLE
newNodeVariable = myNodeVariable.getParent()
SYNTAX
newObjectName = objectName.getParent()
5.6.3.9 Finding a Node’s Secondary Parent
This command returns an object representing the secondary parent of a node. A secondary parent is another node that was used in combination with the current node’s parent to create the current node. If there is no secondary parent then nothing is returned. You can check the type of secondary parent returned by using the getType() command.
EXAMPLE
newNodeVariable = myNodeVariable.getParentSecondary()
SYNTAX
newObjectName = objectName.getParentSecondary()
5.6.3.10 Getting a Node’s Annotations
This command will return a string with the current contents of the annotations window.
EXAMPLE
stringVariable = myNodeVariable.getAnnotations()
SYNTAX
newStringName = objectName.getAnnotations()
5.6.3.11 Appending to a Node’s Annotations
This command will append a string to the end of the current contents of the annotations window.
EXAMPLE
myNodeVariable.appendAnnotations("some text")
SYNTAX
objectName.appendAnnotations("some text")
5.6.4 Importing and Loading Data
The following commands allow you to import datasets into your open project.
5.6.4.1 Importing GHD-format DataSets
To import a (“Legacy”) GHD-format dataset, use
EXAMPLE
ss = ghi.importGHD('/home/mydata.ghd')
SYNTAX
spreadsheet = ghi.importGHD(path and filename of GHD file)
The resulting spreadsheet is returned, and may be assigned to a variable.
5.6.4.2 Importing HTS Files
This method imports compound data either from an SD file or a combination of an SD file and a CSV file.
The parameters are as follows:
- The input SD file.
- The SD field name for the compound name, if needed. Leave blank if the compound name is to be found in the "normal place", that is, in the header line of the MOLFILE-format block at the beginning of the compound information. (This field is not used if you are in GUI mode and viewers are enabled.)
- Slash-separated field(s) desired to be imported from the SD file. Leave blank if none are needed. (This field is not used if you are in GUI mode and viewers are enabled.)
- A user descriptor file. Leave blank if no user descriptor file is to be used.
- 'C' for comma-separated or 'S' for space-separated user descriptor file. (Leave blank if no descriptor file is being used.)
- The minimum number of times the path-length descriptor must appear. Use zero to not use auto-generated path-length descriptors. Omit to use the default value of 5.
- "A" to ask for augmented atoms (available with MultiVariate only). Otherwise, omit or leave blank.
- Maximum path length to be used, if path lengths are to be converted to binary format (available with MultiVariate only). Omit or use zero otherwise.
EXAMPLE
ss = ghi.importHTS('c:/ChemTree/example/external.SD', '', '', 'c:/ChemTree/example/external.dat', 'S', 5)
EXAMPLE
ss = ghi.importHTS('c:/ChemTree/mydata/cancersubset.SD', '', '', 'c:/ChemTree/mydata/cancersubset.csv', 'C', 5, 'A', 50)
SYNTAX
spreadsheet = ghi.importHTS(path and file name of SD file, SD field name, SD fields, path and file name of descriptor file, 'C' or 'S', minimum occurrences of path length descriptor, optional: 'A' for augmented atoms, optional: maximum path length for binary format)
5.6.5 Creating a New DataSet With Scripting
As you manipulate data in scripting there may be times when you would like to add a new dataset and its corresponding spreadsheet to a project. The following set of commands allows you to construct a dataset from Python lists and add the dataset to a project. The first step is to get a datasetbuilder object using the following command.
5.6.5.1 Getting a Datasetbuilder Object
This command returns a pyspreadsheetbuilder object for use in building new datasets. The first parameter is the display name for the dataset when it is added to the Navigator window. The next two parameters are the number of rows and columns, respectively. The last parameter indicates whether you want to add a column of row labels. Note that if you want a column of row labels, you must use the addRowLabels command before adding any of your data columns.
EXAMPLE
myBuilder = ghi.startSpreadsheetBuilder("datasetName",10,10,1)
SYNTAX
builderObject = ghi.startSpreadsheetBuilder(dataset name, number of rows, number of columns, add a row label column 1=yes 0=no)
5.6.5.2 Adding Row Labels
If you specified that your dataset will have row labels, you must use the following command to add the row label column before you add any data columns. There are two parameters for this command: the first is the column header and the second is a Py::List of strings that are the row labels.
EXAMPLE
ssbuilderObject.addRowLabels("myLabels", column data list)
SYNTAX
ssbuilderObject.addRowLabels(column header, Py::List of strings)
5.6.5.3 Adding a Column of Boolean Values
The following command adds a column of boolean values. Note that the Py::List of values should contain only the ints 0 and 1.
EXAMPLE
ssbuilderObject.addBoolColumn("myBools", column data list)
SYNTAX
ssbuilderObject.addBoolColumn(column header, Py::List of 0s and 1s)
5.6.5.4 Adding a Column of Integer Values
The following command adds a column of integer values.
EXAMPLE
ssbuilderObject.addIntColumn("myInts", column data list)
SYNTAX
ssbuilderObject.addIntColumn(column header, Py::List of integers)
5.6.5.5 Adding a Column of Double Values
The following command adds a column of double values.
EXAMPLE
ssbuilderObject.addDoubleColumn("myDoubles", column data list)
SYNTAX
ssbuilderObject.addDoubleColumn(column header, Py::List of doubles)
5.6.5.6 Adding a Column of Nominal Values
The following command adds a column of nominal values.
EXAMPLE
ssbuilderObject.addNominalColumn("myNominals", column data list)
SYNTAX
ssbuilderObject.addNominalColumn(column header, Py::List of strings)
5.6.5.7 Creating the DataSet
After you have added all the columns you want to the pyssbuilder object, you are ready to add the dataset to the current project. This command will add the dataset as a child of the node ID you pass in as a parameter and return a spreadsheet object representing the new dataset. Note that if no parameter is passed in, the builder will default to placing the dataset under the project root node.
EXAMPLE
newSpreadsheetObject = ssbuilderObject.finishSpreadsheet(5)
SYNTAX
newSpreadsheetObject = ssbuilderObject.finishSpreadsheet(nodeID)
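The builder commands in this section are meant to be called in a fixed order: row labels first, then the data columns, then finishSpreadsheet(). The sketch below walks through that order. StubBuilder is a hypothetical stand-in for the object returned by ghi.startSpreadsheetBuilder(), so the example runs outside ChemTree. Its length check and the dictionary it returns are illustrative assumptions, not documented behavior, and fill_builder is a helper name invented here.

```python
class StubBuilder:
    """Hypothetical stand-in for the builder object; it only records columns."""

    def __init__(self, name, num_rows):
        self.name = name
        self.num_rows = num_rows
        self.columns = []  # (kind, header, values) in the order added

    def _add(self, kind, header, values):
        # Assumption: every column must supply one value per row.
        if len(values) != self.num_rows:
            raise ValueError("column %r needs %d values" % (header, self.num_rows))
        self.columns.append((kind, header, list(values)))

    def addRowLabels(self, header, labels):
        if self.columns:
            raise RuntimeError("row labels must be added before data columns")
        self._add("labels", header, labels)

    def addBoolColumn(self, header, values):
        self._add("bool", header, values)

    def addIntColumn(self, header, values):
        self._add("int", header, values)

    def addDoubleColumn(self, header, values):
        self._add("double", header, values)

    def addNominalColumn(self, header, values):
        self._add("nominal", header, values)

    def finishSpreadsheet(self, node_id=None):
        return {"name": self.name, "parent": node_id, "columns": self.columns}


def fill_builder(builder, label_header, labels, data_columns):
    """Add the row labels first, then each (kind, header, values) column."""
    adders = {
        "bool": builder.addBoolColumn,
        "int": builder.addIntColumn,
        "double": builder.addDoubleColumn,
        "nominal": builder.addNominalColumn,
    }
    builder.addRowLabels(label_header, labels)
    for kind, header, values in data_columns:
        adders[kind](header, values)
    # No node ID is passed, so the dataset goes under the project root.
    return builder.finishSpreadsheet()


sheet = fill_builder(
    StubBuilder("assay", 3),
    "Compound",
    ["cpd1", "cpd2", "cpd3"],
    [("bool", "active", [1, 0, 1]), ("double", "logP", [1.2, 3.4, 0.5])],
)
print([header for _, header, _ in sheet["columns"]])
```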
5.6.6 Spreadsheet Access and Manipulation
Once you have created a scripting spreadsheet object you can use the following commands to manipulate the spreadsheet. For all the commands listed here the scripting spreadsheet is called ss.
5.6.6.1 Getting the Spreadsheet as a Dictionary
Returns the entire spreadsheet as a dictionary of key-value pairs, where the key is a string containing the spreadsheet column label, and the value is a list containing the contents of the spreadsheet column. If there is a label column, it is also incorporated as a dictionary entry with its associated column label as its key.
EXAMPLE
dict = ss.asDict()
SYNTAX
dict = ss.asDict()
5.6.6.2 Getting the Spreadsheet as a List of Lists
Returns the entire spreadsheet as a list of column lists. The columns will be listed in spreadsheet column number order. If there is a label column it will be the first column.
EXAMPLE
list = ss.asList()
SYNTAX
list = ss.asList()
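Both asDict() and asList() return column-oriented data, so a script that wants to work row by row has to transpose the result. The following sketch uses only plain Python. The sample data is made up and merely has the shape described above; columns_to_rows and dict_to_records are helper names invented for this example.

```python
def columns_to_rows(column_lists):
    """Transpose a list of column lists (the asList() shape) into row tuples."""
    return list(zip(*column_lists))


def dict_to_records(column_dict):
    """Turn {column label: column values} (the asDict() shape) into one dict per row."""
    labels = list(column_dict)
    columns = [column_dict[label] for label in labels]
    return [dict(zip(labels, row)) for row in zip(*columns)]


# Made-up sample data with the shapes described above.
as_list = [["cpd1", "cpd2"], [1.5, 2.5], [0, 1]]
as_dict = {"Name": ["cpd1", "cpd2"], "logP": [1.5, 2.5]}

print(columns_to_rows(as_list))
print(dict_to_records(as_dict))
```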
5.6.6.3 Getting a Spreadsheet Cell
Returns the spreadsheet entry from row i, column j. Row 0 holds the column headers and column 0 holds the row labels (if they exist). An invalid row or column index throws a RuntimeError exception.
EXAMPLE
object = ss.cell(rowNum, colNum)
SYNTAX
object = ss.cell(Int i, Int j)
5.6.6.4 Getting a Spreadsheet Column by Column Number
Returns the spreadsheet column values for column i. Column 0 is the row label column (if it exists). An invalid column index throws an exception.
EXAMPLE
list = ss.col(3)
SYNTAX
list = ss.col(colNum)
5.6.6.5 Getting a Spreadsheet Column by Column Name
Returns the spreadsheet column values for the column with a header == name. An invalid name throws an exception.
EXAMPLE
list = ss.col("name")
SYNTAX
list = ss.col("columnName")
5.6.6.6 Determining if a Spreadsheet is a Marker Map
Returns 1 if a spreadsheet is a marker map spreadsheet or 0 otherwise.
EXAMPLE
val = ss.isMarkerMap()
SYNTAX
val = ss.isMarkerMap()
5.6.6.7 Get a Spreadsheet Column Type
Returns the column type as one of the following values.
- 0 is Binary
- 1 is Integer
- 2 is Double
- 3 is Categorical
EXAMPLE
type = ss.getColType(3)
SYNTAX
type = ss.getColType(column number)
5.6.6.8 Get a Spreadsheet Column State
Returns the column state as one of the following values.
- 0 is inactive
- 1 is independent
- 2 is dependent
EXAMPLE
state = ss.getColState(4)
SYNTAX
state = ss.getColState(column number)
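The integer codes returned by getColType() and getColState() are easier to read in script output when mapped back to their names. A small sketch using the code tables listed above follows. The dictionary and function names are invented for this example and are not part of the scripting API.

```python
# Code tables copied from the lists above.
COLUMN_TYPES = {0: "Binary", 1: "Integer", 2: "Double", 3: "Categorical"}
COLUMN_STATES = {0: "inactive", 1: "independent", 2: "dependent"}


def describe_column(type_code, state_code):
    """Turn the two integer codes into a readable description."""
    return "%s column, %s" % (COLUMN_TYPES[type_code], COLUMN_STATES[state_code])


# In ChemTree the codes would come from ss.getColType(n) and ss.getColState(n).
print(describe_column(2, 1))
```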
5.6.6.9 Export a Spreadsheet to CSV File
Writes the entire contents of the spreadsheet out to the comma-separated file specified in “fileName”. If an empty string is passed in, then the user is prompted for a file. If an error occurs in writing to the file in GUI mode, ChemTree shows an error message to the user.
EXAMPLE
ss.exportCSV("results.csv")
SYNTAX
ss.exportCSV("fileName.csv")
5.6.6.10 Finding a Column by Name
Searches for a column in the spreadsheet whose column label is equal to “colLabel”. It returns the index of that column, or throws an exception if no such column is found.
EXAMPLE
colNum = ss.findCol("name")
SYNTAX
colNum = ss.findCol("columnName")
5.6.6.11 Finding a Row by Name
Searches for a row in the spreadsheet whose row label is equal to “rowLabel”. It returns the index of that row, or throws an exception if no such row is found. The spreadsheet must have row labels otherwise this routine will throw an exception.
EXAMPLE
rowNum = ss.findRow("name")
SYNTAX
rowNum = ss.findRow("rowName")
5.6.6.12 Invert Row States
The state of all rows is inverted. That is, rows that were formerly active are made inactive, and rows that were formerly inactive are made active. This routine is useful in creating training and test sets.
EXAMPLE
ss.invertRowState()
SYNTAX
ss.invertRowState()
5.6.6.13 Getting the Number of Spreadsheet Columns
Returns the number of columns in the spreadsheet (not including the label column).
EXAMPLE
num = ss.numCols()
SYNTAX
numCols = ss.numCols()
5.6.6.14 Get the Number of Columns in a State
Returns the number of columns in the given state. There are three states: 0=Inactive, 1=Independent, 2=Dependent.
EXAMPLE
numIndependent = ss.numColsState(1)
SYNTAX
num = ss.numColsState(state)
5.6.6.15 Get the Number of Spreadsheet Rows
Returns the number of rows in the spreadsheet (not including the column header row).
EXAMPLE
numRows = ss.numRows()
SYNTAX
numRows = ss.numRows()
5.6.6.16 Get the Number of Rows in a State
Returns the number of rows in the given state. There are two states: 0=Inactive, 1=Active.
EXAMPLE
numActive = ss.numRowsState(1)
SYNTAX
num = ss.numRowsState(state)
5.6.6.17 Randomly Shuffle Rows
Randomly permutes the rows in the spreadsheet by modifying the sort order at random. Subsequent calls to permuteRows will give new permutations, based on the current random seed.
EXAMPLE
ss.permuteRows()
SYNTAX
ss.permuteRows()
5.6.6.18 Getting a Row of Data
Returns a list of elements of row number i. Row 0 is the header row. Rows 1 and up contain the data elements of the spreadsheet. Note that row access is generally slower than column access. An exception is thrown if an invalid row number is specified.
EXAMPLE
list = ss.row(3)
SYNTAX
list = ss.row(rowNum)
5.6.6.19 Change the State of a Single Column
Sets column i to the specified state. There are three states: 0=Inactive, 1=Independent, 2=Dependent. Other column states remain unchanged.
EXAMPLE
ss.setColState(1,2)
SYNTAX
ss.setColState(colNumber, state)
5.6.6.20 Change the State of a Range of Columns
Sets column low through column high inclusively to the specified state. There are three states: 0=Inactive, 1=Independent, 2=Dependent. Other column states remain unchanged.
EXAMPLE
ss.setColState(1,50,1)
SYNTAX
ss.setColState(lowNum, highNum, state)
5.6.6.21 Setting the State of a Single Row
Sets row i to the specified state. There are two states: 0=Inactive, 1=Active. Other row states remain unchanged.
EXAMPLE
ss.setRowState(3,0)
SYNTAX
ss.setRowState(rowNum,state)
5.6.6.22 Getting the State of a Single Row
Returns the state of row i. There are two states: 0=Inactive, 1=Active.
EXAMPLE
ss.getRowState(3)
SYNTAX
ss.getRowState(rowNum)
5.6.6.23 Setting the State of a Range of Rows
Sets row low through row high inclusively to the specified state. There are two states: 0=Inactive, 1=Active. Other row states remain unchanged.
EXAMPLE
ss.setRowState(1,50,0)
SYNTAX
ss.setRowState(lowNum, highNum, state)
5.6.6.24 Randomly set a Number of Rows to a State
Sets a randomly chosen number of rows to the specified state. To inactivate numRandomRows rows, set state=0; to activate numRandomRows rows, set state=1. The other rows will be set to the opposite state.
EXAMPLE
ss.setRowStateRandom(25,0)
SYNTAX
ss.setRowStateRandom(numRows,state)
5.6.6.25 Randomly Set a Percentage of Rows to a State
Sets a randomly chosen fraction of the rows to the specified state. This is useful for selecting a certain percentage of the data regardless of its size. To inactivate rows, set state=0; to activate rows, set state=1. The other rows will be set to the opposite state.
EXAMPLE
ss.setRowStateRandom(.5,0)
SYNTAX
ss.setRowStateRandom(percent, state)
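Combined with invertRowState(), the random row-state commands give a simple way to form training and test sets: activate a random fraction of the rows for training, then invert the states to activate the held-out rows. The sketch below illustrates the idea. StubSpreadsheet is a hypothetical stand-in that tracks only row states so the example runs outside ChemTree, and the way it rounds the fraction to a whole number of rows is an assumption. split_counts is a helper name invented for this example.

```python
import random


class StubSpreadsheet:
    """Hypothetical stand-in that tracks only row states (1 = active, 0 = inactive)."""

    def __init__(self, num_rows, seed=0):
        self._states = [1] * num_rows
        self._rng = random.Random(seed)

    def numRows(self):
        return len(self._states)

    def numRowsState(self, state):
        return self._states.count(state)

    def setRowStateRandom(self, fraction, state):
        # Assumption: the stub rounds fraction * rows to the nearest whole row.
        total = len(self._states)
        count = int(round(fraction * total))
        chosen = set(self._rng.sample(range(total), count))
        # Chosen rows get the requested state; all others get the opposite.
        self._states = [state if i in chosen else 1 - state for i in range(total)]

    def invertRowState(self):
        self._states = [1 - s for s in self._states]


def split_counts(ss, train_fraction):
    """Activate a random training subset, then flip to the held-out rows."""
    ss.setRowStateRandom(train_fraction, 1)  # training rows active
    n_train = ss.numRowsState(1)
    # ... a model would be built on the active rows here ...
    ss.invertRowState()                      # held-out rows are now active
    n_test = ss.numRowsState(1)
    return n_train, n_test


print(split_counts(StubSpreadsheet(10), 0.7))
```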
5.6.6.26 Sort a Column in Ascending Order
Sorts the spreadsheet by column i in ascending order.
EXAMPLE
ss.sortByColAscending(3)
SYNTAX
ss.sortByColAscending(colNum)
5.6.6.27 Sort a Column in Descending Order
Sorts the spreadsheet by column i in descending order.
EXAMPLE
ss.sortByColDescending(3)
SYNTAX
ss.sortByColDescending(colNum)
5.6.6.28 Remembering a Spreadsheet Page
When you start to make a change to a spreadsheet which has another viewer dependent on it, such as a tree model, the spreadsheet will actually be copied first. The change will then be made on the copy, rather than the original. For convenience’s sake, your spreadsheet variable will always catch up with the spreadsheet change. However, there are times when, after making such a change, you will want to reference the original spreadsheet page.
To make this easier, use the following command before making the spreadsheet change (on “ss” in this example):
EXAMPLE
origSS = ss.thisPage()
SYNTAX
newSheetVariable = current spreadsheet object.thisPage()
Alternatively, you may wish to make the changes using the new spreadsheet variable (“newSS” in the following example) after executing this command the following way:
EXAMPLE
newSS = ss.thisPage()
5.6.7 Using the P-Value plot
Once you have created a P-Value plot object, there are a number of functions which can be run with that object.
5.6.7.1 Getting P-Values
Use this command to get a Python dictionary containing P, aP, and bP values. The dictionary contains the following keys:
- P
- aP
- bP
EXAMPLE
dictionary = myPVPlot.pValue(9)
SYNTAX
dictionary = current P-Value object.pValue(column number)
5.6.7.2 Getting Simes P-Values
Use this command to get a Python dictionary containing Simes P and Simes aP values from a P-Value plot which is ordered by variable number. The dictionary contains the following keys:
- Simes P
- Simes aP
EXAMPLE
dictionary = myPVPlot.simesValue(9)
SYNTAX
dictionary = current P-Value object.simesValue(column number)
5.6.7.3 Setting the Simes Window
The following command can be used to change the window size used in calculating Simes P-Values.
EXAMPLE
myPVPlot.setSimes(3)
SYNTAX
current P-Value object.setSimes(new window size, must be > 0, odd, and <= number of plot columns)
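The three constraints on the window size (greater than 0, odd, and no larger than the number of plot columns) can be checked in a script before calling setSimes(). The following is a minimal sketch in plain Python. is_valid_simes_window is a helper name invented for this example, and how the number of plot columns is obtained is left to the caller.

```python
def is_valid_simes_window(size, num_plot_columns):
    """Check the documented constraints: > 0, odd, and <= number of plot columns."""
    return size > 0 and size % 2 == 1 and size <= num_plot_columns


for candidate in (3, 4, 0, 11):
    print(candidate, is_valid_simes_window(candidate, 10))
```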
5.6.7.4 Getting FDR (aP)
To find the false discovery rate for a specific column in a P-Value plot ordered by aP, use the following command.
EXAMPLE
myvariable = myPVPlot.FDRValue(9)
SYNTAX
myvariable = current P-Value object.FDRValue(column number)
5.6.7.5 Getting all P-Values as a spreadsheet
Use this command to get a spreadsheet object which contains P, aP, bP, Simes P, and Simes aP for all columns represented in the current P-Value plot.
EXAMPLE
myspreadsheet = myPVPlot.pvalueSpreadsheet()
SYNTAX
newSpreadsheet = current P-Value object.pvalueSpreadsheet()
5.6.8 Getting and Setting Tree Options
In order to set parameters that affect how trees are built and what values are shown in GUI mode, you must first get a tree options object. This object works like a Python dictionary. Each setting is accessed using subscript notation, where the name of the setting is put inside the subscript brackets. Each setting is described below with an example showing the subscript notation. To get a tree options object, use this command.
EXAMPLE
options = ghi.getTreeOptions()
SYNTAX
optionsVariable = ghi.getTreeOptions()
Once you have an options object you can change and view the parameters using the following commands.
5.6.8.1 Setting the Minimum Elements for Splitting
EXAMPLE
optionsVariable['minelements'] = 2
SYNTAX
optionsVariable['minelements'] = desired split size, must be >= 1
5.6.8.2 Viewing the Minimum Elements Setting
EXAMPLE
optionsVariable['minelements']
SYNTAX
myVariable = optionsVariable['minelements']
5.6.8.3 Setting the Number of Threads
EXAMPLE
optionsVariable['numthreads'] = 2
SYNTAX
optionsVariable['numthreads'] = desired number of threads, must be >= 1
5.6.8.4 Viewing the Number of Threads Setting
EXAMPLE
optionsVariable['numthreads']
SYNTAX
myVariable = optionsVariable['numthreads']
5.6.8.5 Setting the P Threshold
EXAMPLE
optionsVariable['pthreshold'] = 0.01
SYNTAX
optionsVariable['pthreshold'] = desired threshold, must be >= 0
5.6.8.6 Viewing the P Threshold Setting
EXAMPLE
optionsVariable['pthreshold']
SYNTAX
myVariable = optionsVariable['pthreshold']
5.6.8.7 Setting the Pairwise Threshold
EXAMPLE
optionsVariable['pairwisepthreshold'] = 0.01
SYNTAX
optionsVariable['pairwisepthreshold'] = desired pairwise threshold, must be >= 0
5.6.8.8 Viewing the Pairwise Threshold Setting
EXAMPLE
optionsVariable['pairwisepthreshold']
SYNTAX
myVariable = optionsVariable['pairwisepthreshold']
5.6.8.9 Setting the P Threshold Type
EXAMPLE
optionsVariable['pthresholdtype'] = 2
SYNTAX
optionsVariable['pthresholdtype'] = desired threshold type, 0 = raw P, 1 = adjusted P, 2 = Bonferroni adjusted P
5.6.8.10 Viewing the P Threshold Type Setting
EXAMPLE
optionsVariable['pthresholdtype']
SYNTAX
myVariable = optionsVariable['pthresholdtype']
5.6.8.11 Setting the Segmenting Algorithm
EXAMPLE
optionsVariable['segalgorithm'] = 0
SYNTAX
optionsVariable['segalgorithm'] = desired algorithm, 0 = exact, 1 = approximate
5.6.8.12 Viewing the Segmenting Algorithm
EXAMPLE
optionsVariable['segalgorithm']
SYNTAX
myVariable = optionsVariable['segalgorithm']
5.6.8.13 Setting the Maximum Segments
EXAMPLE
optionsVariable['maxsegments'] = 3
SYNTAX
optionsVariable['maxsegments'] = desired setting, must be >= 2
5.6.8.14 Viewing the Maximum Segments Setting
EXAMPLE
optionsVariable['maxsegments']
SYNTAX
myVariable = optionsVariable['maxsegments']
5.6.8.15 Setting Linear Regression
EXAMPLE
optionsVariable['linearregression'] = 0
SYNTAX
optionsVariable['linearregression'] = desired setting, 0 = off, 1 = on
5.6.8.16 Viewing Linear Regression Setting
EXAMPLE
optionsVariable['linearregression']
SYNTAX
myVariable = optionsVariable['linearregression']
5.6.8.17 Setting Use Missing Values Option
EXAMPLE
optionsVariable['usemissing'] = 0
SYNTAX
optionsVariable['usemissing'] = desired setting, 0 = off, 1 = on
5.6.8.18 Viewing Use Missing Values Option
EXAMPLE
optionsVariable['usemissing']
SYNTAX
myVariable = optionsVariable['usemissing']
5.6.8.19 Setting Resample Iterations
EXAMPLE
optionsVariable['resample_iterations'] = 0
SYNTAX
optionsVariable['resample_iterations'] = desired setting
5.6.8.20 Viewing Resample Iterations Setting
EXAMPLE
optionsVariable['resample_iterations']
SYNTAX
myVariable = optionsVariable['resample_iterations']
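Because the options object is accessed with dictionary-style subscripts, several settings can be applied in one pass and checked against the constraints listed in this section beforehand. The sketch below is one possible way to do that and is not part of the ChemTree API. It uses a plain Python dict as a stand-in for the object returned by ghi.getTreeOptions(); RULES and apply_tree_options are names invented for this example. The resample_iterations setting has no documented constraint, so it is passed through unchecked.

```python
# Constraints copied from the SYNTAX lines in this section.
RULES = {
    "minelements": lambda v: v >= 1,
    "numthreads": lambda v: v >= 1,
    "pthreshold": lambda v: v >= 0,
    "pairwisepthreshold": lambda v: v >= 0,
    "pthresholdtype": lambda v: v in (0, 1, 2),
    "segalgorithm": lambda v: v in (0, 1),
    "maxsegments": lambda v: v >= 2,
    "linearregression": lambda v: v in (0, 1),
    "usemissing": lambda v: v in (0, 1),
}


def apply_tree_options(options, settings):
    """Validate every setting first, then assign them with subscript notation."""
    for name, value in settings.items():
        check = RULES.get(name)
        if check is not None and not check(value):
            raise ValueError("invalid value %r for option %r" % (value, name))
    # Nothing is assigned unless all values passed validation.
    for name, value in settings.items():
        options[name] = value
    return options


options = {}  # plain dict standing in for ghi.getTreeOptions()
apply_tree_options(options, {"minelements": 2, "pthreshold": 0.01, "maxsegments": 3})
print(options)
```

Validating everything before assigning anything means a bad value leaves the options object unchanged rather than half-updated.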
5.6.9 Creating a Tree Model
To build tree model you must first set the options you want by using the getTreeOptions() command to get an options object then setting the options to desired values. See(5.6.8). This options object is then passed as the first parameter to the buildTreeModel() command. In addition to the options object there are three optional parameters you can specify in any combination.
- numtrees=somevalue 100 is default.
- randseed=somevalue 12345678 is default.
- numsplitters=somevalue 10 is default.
The buildTreeModel command will return a tree model object.
EXAMPLE
myTreeModel = mySS.buildTreeModel(treeOptions, numtrees=50, randseed=7839743, numsplitters=6)
SYNTAX
newTreeModel = mySpreadsheetObject.buildTreeModel(treeOptions, numtrees=val, randseed=val, numsplitters=val)
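Since randseed is an ordinary keyword parameter, a script can build several models that differ only in their seed. The sketch below illustrates this under stated assumptions: buildModelsForSeeds is a hypothetical helper name, and the only call it relies on is buildTreeModel with the keyword parameters documented above. Each call presumably adds its own tree model to the project.

```python
# Hypothetical helper (not part of the ghi API): builds one tree model per
# random seed, keeping the other buildTreeModel parameters fixed.
def buildModelsForSeeds(spreadsheet, treeOptions, seeds,
                        numtrees=100, numsplitters=10):
    """Return a list of tree models, one for each value in `seeds`."""
    models = []
    for seed in seeds:
        model = spreadsheet.buildTreeModel(treeOptions,
                                           numtrees=numtrees,
                                           randseed=seed,
                                           numsplitters=numsplitters)
        models.append(model)
    return models

# Usage inside a script:
# models = buildModelsForSeeds(mySS, treeOptions, [11, 22, 33], numtrees=50)
```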
5.6.10 Importing a Legacy Tree Model
You can import an existing tree model using the following command. The tree model will be imported into the project as a child node of the current spreadsheet.
EXAMPLE
myTreeModel = mySS.importLegacyTreeModel("C:/HelixTree/myProject/myTree.ght")
SYNTAX
newTreeModel = spreadsheetObject.importLegacyTreeModel(treeFile)
5.6.11 Tree Model Commands
5.6.11.1 Get Variable Frequencies
This command creates a new project spreadsheet with the variable frequencies of a multi-tree model. The new spreadsheet is added to the project as a child of the tree-model and it is also returned as a spreadsheet in the Python shell.
EXAMPLE
spreadsheetVariable = myTreeModel.variableFrequencies()
SYNTAX
myVariable = myTreeModel.variableFrequencies()
5.6.11.2 Get Tree Predictions
This command creates a new project spreadsheet with the tree predictions. The spreadsheet is added to the project as a child of the tree-model and it is returned as a spreadsheet in the Python shell.
EXAMPLE
spreadsheetVariable = myTreeModel.averageTreePredictions()
SYNTAX
myVariable = myTreeModel.averageTreePredictions()
5.6.11.3 Get Tree Variables
This command returns a list of variables used as splitters in creating the trees in a multi-tree model.
EXAMPLE
myVariable = myTreeModel.getTreeVariables()
SYNTAX
myVariable = myTreeModel.getTreeVariables()
5.6.11.4 Get Correlation Table
This command builds a spreadsheet of correlation interaction for a given tree model’s variables. The spreadsheet is added to the navigator project as a child of the tree model and is returned as a spreadsheet to the Python shell.
EXAMPLE
myVariable = myTreeModel.correlationTable()
SYNTAX
myVariable = myTreeModel.correlationTable()
5.6.11.5 Get Correlation Plot
This command creates a new correlation interaction plot using the variables of a given tree model. The plot is added to the navigator project as a child of the tree model and a correlation object is returned in the Python shell.
EXAMPLE
myVariable = myTreeModel.correlationPlot()
SYNTAX
myVariable = myTreeModel.correlationPlot()
5.6.11.6 Cherry Picking Compounds
The cherryPick(...) method performs cherry-picking based on an SD file, a CSV file, or both. The parameters are as follows.
- The input SD or SMILES file. Leave blank if no SD or SMILES input file is to be used.
- The SD field name for the compound name, if needed. Leave blank if the compound name is to be found in the "normal place", that is, in the header line of the MOLFILE-format block at the beginning of the compound information. (Also leave blank if no SD file is to be used.) For SMILES: if applicable, the field number of the compound name, quoted as a string. The default is "2".
- The input user descriptor file. Leave blank if no user descriptor file is to be used.
- ’C’ for comma-separated or ’S’ for space-separated user descriptor file. (Leave blank if no descriptor file is being used.)
- The selection expression. To select compounds for multiple trees in univariate mode, use an expression containing "Mean" or "Median", followed by ">=" or "<=" and a number. (This number represents the prediction threshold value to be used.) In multivariate mode, use one or more expressions, each containing the variable name, "Mean" or "Median", ">=" or "<=", and a number representing the prediction threshold value. Separate these expressions with "&&" symbols. Leave this parameter blank to select all compounds.
- The prediction output file, or use "SPREAD" or "SPREADSHEET" to instead create a prediction spreadsheet, which will be returned from this method. Leave blank if no prediction output of either form is to be created.
- 'C' for comma-separated or 'S' for space-separated prediction output. (Leave blank if no prediction output will be created as a file.)
- The prediction output options. Include or omit each of the following letters: 'M' to print the Median prediction for all trees; 'E' to print the prediction for Each tree; 'S' to print the Standard deviation for each predictor node; 'N' to print predictor Node Names. (Leave blank if no prediction output is to be created.)
- Output cherry-picks to this SD or SMILES file. Leave blank or omit if no cherry-pick output file is to be written.
- To use SMILES for the input and output compound files, enter "SM" or "SM##". The optional number indicates the field number of the SMILES string. The default number is 1. Leave this whole field blank or omit this whole field to use the (default) SD format for compound files.
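With ten parameters, it is easy to lose track of which blank belongs to which option. The sketch below shows one way to keep the call readable. It rests on several assumptions that this section does not confirm: that the parameters are passed positionally in the order listed above, that "blank" means an empty string, and that whitespace is accepted between the tokens of the selection expression. cherryPickArgs and its keyword names are hypothetical, as is the file path in the usage comment.

```python
# Hypothetical helper (not part of the ghi API): assembles the cherryPick
# parameters in the order listed above, using '' for every blank value.
def cherryPickArgs(inputFile='', nameField='', descriptorFile='',
                   descriptorFormat='', selection='', predictionOutput='',
                   predictionFormat='', predictionOptions='',
                   cherryPickOutput='', smilesFlag=''):
    """Return the ten cherryPick parameters as a tuple, in listed order."""
    return (inputFile, nameField, descriptorFile, descriptorFormat,
            selection, predictionOutput, predictionFormat,
            predictionOptions, cherryPickOutput, smilesFlag)

# Usage inside a script (assumed call form):
# args = cherryPickArgs(inputFile='/data/compounds.sdf',
#                       selection='Mean >= 0.5',
#                       predictionOutput='SPREAD')
# predictions = myTreeModel.cherryPick(*args)
```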
5.6.11.7 Get Observation Distance Matrix Unsorted
This command creates an unsorted observation distance matrix plot using the tree model. The plot is added to the navigator window as a child of the tree model and a distance matrix object is returned in the Python shell.
EXAMPLE
myVariable = myTreeModel.distMatrixUnsorted()
SYNTAX
myVariable = myTreeModel.distMatrixUnsorted()
5.6.11.8 Get Observation Distance Matrix Sorted by First Principal Component
This command creates an observation distance matrix plot, sorted by the first principal component, using the tree model. The plot is added to the navigator window as a child of the tree model and a distance matrix object is returned in the Python shell.
EXAMPLE
myVariable = myTreeModel.distMatrixSorted()
SYNTAX
myVariable = myTreeModel.distMatrixSorted()
5.6.11.9 Get Observation Distance Sorted by Similarity to One Observation
This command creates an observation distance matrix plot of those compounds most similar to a selected compound, sorted by the distance to that compound, using the tree model. The plot is added to the navigator window as a child of the tree model and a distance matrix object is returned in the Python shell.
EXAMPLE
myVariable = myTreeModel.distMatrixSimSorted(37, 5)
SYNTAX
myVariable = myTreeModel.distMatrixSimSorted(number of similar compounds, row (compound) number)
5.6.11.10 Using the Distance Matrix Object
Once you have created a distance matrix object, there are a number of functions available which use that object.
The following functions translate between the ranking, or position, within the distance matrix plot, and the spreadsheet row name or number.
NOTE: For distance matrices sorted by first principal component, the earliest ranking (number one) corresponds to the eigenvector component with the greatest magnitude, the second ranking corresponds with the eigenvector component with the second-greatest magnitude, and so forth. For unsorted distance matrices, the ranking will be the same as the row number. For distance matrices sorted by similarity to a given observation, that observation will be number one, the closest observation to that one will be number two, and so forth.
EXAMPLE
lab = myDM.getObsLabel(1)
SYNTAX
label string = current distance matrix object.getObsLabel(rank index)
EXAMPLE
n = myDM.getObsNumber(2)
SYNTAX
row number = current distance matrix object.getObsNumber(rank index)
EXAMPLE
r = myDM.getRankIndex(5)
SYNTAX
rank index = current distance matrix object.getRankIndex(row number)
The following functions return the distance between observations, which are specified either by spreadsheet row number (distance) or by ranking index (distanceByRank). If two parameters are specified, a single distance value is returned. If two ranges are specified, a matrix of distance values is returned.
EXAMPLE
d = myDM.distance(5,7)
SYNTAX
computed distance = current distance matrix object.distance(first row number, second row number)
EXAMPLE
d = myDM.distance(5,7,8,9)
SYNTAX
matrix of computed distances = current distance matrix object.distance(first range start row, first range end row, second range start row, second range end row)
EXAMPLE
d = myDM.distanceByRank(1,3)
SYNTAX
computed distance = current distance matrix object.distanceByRank(rank index of first observation, rank index of second observation)
EXAMPLE
d = myDM.distanceByRank(1,3,4,6)
SYNTAX
matrix of computed distances = current distance matrix object.distanceByRank(first range start rank index, first range end rank index, second range start rank index, second range end rank index)
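The ranking functions and distanceByRank can be combined. For a distance matrix sorted by similarity to one observation, rank 1 is the reference observation, so the nearest neighbours are ranks 2, 3, and so on. The following sketch lists them with their distances. listNeighbours is a hypothetical helper name; it uses only getObsLabel and the two-parameter form of distanceByRank described above, and it assumes the matrix contains at least count + 1 observations.

```python
# Hypothetical helper (not part of the ghi API): for a similarity-sorted
# distance matrix, pairs each neighbour's label with its distance to the
# reference observation (rank 1).
def listNeighbours(dm, count):
    """Return (label, distance) pairs for ranks 2 through count + 1."""
    result = []
    for rank in range(2, count + 2):
        label = dm.getObsLabel(rank)
        dist = dm.distanceByRank(1, rank)
        result.append((label, dist))
    return result

# Usage inside a script:
# myDM = myTreeModel.distMatrixSimSorted(37, 5)
# for label, dist in listNeighbours(myDM, 10):
#     print("%s %s" % (label, dist))
```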
5.6.12 Applying a Tree Model
To apply a tree model to a spreadsheet, first get the spreadsheet object, then use this command, passing in a previously built tree model.
EXAMPLE
myAppliedTreeModel = mySS.applyTreeModel(treeModel)
SYNTAX
newAppliedTreeModel = mySpreadsheetObject.applyTreeModel(treeModel)
5.6.13 Performing Regression
There are two scripting commands which allow you to perform various regressions. These commands are:
- performRegression(...) - to perform linear or logistic regression.
- performStepwiseRegression(...) - to perform stepwise linear or logistic regression
In order to be able to use these commands, the spreadsheet object must have exactly one non-categorical column set as dependent.
These commands will return a tuple of result objects which will vary based on the options used. The results tuple will contain:
- A text viewer object at the first index, and ’None’ at the second index if the parameter for whether or not to create a residual spreadsheet is set to 0.
- A text viewer at the first index, and a residual spreadsheet at the second index if the parameter for whether or not to create a residual spreadsheet is set to 1.
The performRegression command has no required parameters, but does require that either covariates or first order interactions be specified.
The performStepwiseRegression command requires 1 parameter: a p-value cutoff for the stepwise procedure.
Additionally, these commands take a number of keyword arguments, each of which may be required or prohibited based on the other options used. These keywords, and the parameters they represent are as follows:
- numPermutations - the number of permutations to perform for permutation tests. To omit permutation testing, set this value to 0. Default value = 0.
- covariates - the covariates to include in the regression. This parameter must be specified as a Python list of spreadsheet column numbers; for example, covariates = [3, 4, 6] would include columns 3, 4 and 6 in the regression. All column numbers must represent active columns.
- firstOrderInteractions - the first-order interactions between covariates to use in the regression. These interactions must be specified as a Python list of tuples containing spreadsheet column numbers; for example, firstOrderInteractions = [(3, 4), (4, 4), (2, 3)]. All column numbers must represent active columns.
- createResidualSpreadsheet - if set to 1, a residual spreadsheet will be created. Default value = 0.
The performRegression command should be used in the following manner:
EXAMPLE
myTuple = mySpreadsheet.performRegression( covariates=[3,5,6], firstOrderInteractions=[(3, 6),(5, 5)])
SYNTAX
tuple = spreadsheet object.performRegression( keyword1=value1, keyword2=value2...)
The performStepwiseRegression command can be used as follows:
EXAMPLE
myTuple = mySpreadsheet.performStepwiseRegression( 0.01, covariates=[3,5,6], firstOrderInteractions=[(3, 6),(5, 5)])
SYNTAX
tuple = spreadsheet object.performStepwiseRegression( pvalueCutoff, keyword1=value1, keyword2=value2...)
For further description of the parameters used in these commands please see 16.2.
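Because the shape of the results tuple depends on createResidualSpreadsheet, a small wrapper can make the two cases explicit. This is a sketch under the assumptions stated in this section: the first tuple element is a text viewer that supports getText() (see 5.6.16.1), and the second is either a residual spreadsheet or None. regressionReport is a hypothetical helper name.

```python
# Hypothetical helper (not part of the ghi API): runs performRegression and
# returns the report text together with the residual spreadsheet (or None).
def regressionReport(spreadsheet, covariates, interactions=None,
                     residuals=False):
    """Return (report text, residual spreadsheet or None)."""
    kwargs = {'covariates': covariates,
              'createResidualSpreadsheet': 1 if residuals else 0}
    if interactions:
        # Only pass interactions when some were actually requested.
        kwargs['firstOrderInteractions'] = interactions
    results = spreadsheet.performRegression(**kwargs)
    viewer, residualSheet = results[0], results[1]
    return viewer.getText(), residualSheet

# Usage inside a script:
# text, resid = regressionReport(mySpreadsheet, [3, 5, 6],
#                                interactions=[(3, 6)], residuals=True)
```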
5.6.14 Output a C File
To create a C program with the prediction rules of the tree model use this command.
EXAMPLE
myTreeModel.outputCFile('/tmp/mycfile')
SYNTAX
myTreeModel.outputCFile(absolute path including file name)
5.6.15 Prompting the User for Input
In interactive scripts, it is often useful to prompt the user for data or parameters for the analysis. The main ghi object provides a convenient Python function for this purpose: it creates a simple modal dialog that prompts the user for a number of values, performs error checking on the input, and returns a list of the user inputs.
EXAMPLE
p = ghi.promptUser([{"label":"Enter string:", "type":"string", "tooltip":"Any string will do"},
{"label":"Enter integer:", "type":"integer","min":0, "max":100},
{"label":"Enter double:", "type":"double","min":-1},
{"label":"Select method:", "type":"combobox",
"list":["method 1", "method 2", "method 3"]}])
This would construct the following widget:
[Figure: the dialog constructed by the promptUser example]
SYNTAX
listName = ghi.promptUser([Dictionary1, Dictionary2, ..., DictionaryN ])
The promptUser() function takes a list of Python dictionary objects, with each object defining one data entry sub-widget. The sub-widgets are included in list order from top to bottom within the parent widget. When the user hits the OK button, if the entries are valid, they are returned in a list.
There are a number of data entry methods available. Each is defined using a dictionary. Every sub-widget must have a "label" attribute and a "type" attribute. Every sub-widget may have an optional tooltip attribute, which is a message that appears when the user hovers the mouse over the sub-widget. Labels are listed at the left of the sub-widget. The "type" attribute defines which type of sub-widget is to be constructed. The available types are as follows:
- integer: prompts the user for an integer. Checks that a valid integer has been entered.
  Optional attributes:
  - "min":<integer> specifies that the integer must be greater than or equal to the specified minimum value.
  - "max":<integer> specifies that the integer must be less than or equal to the specified maximum value.
- double: prompts the user for a double precision (64-bit) number. Checks that a valid double has been entered.
  Optional attributes:
  - "min":<double> specifies that the user-entered double must be greater than or equal to the specified minimum value.
  - "max":<double> specifies that the user-entered double must be less than or equal to the specified maximum value.
- float: same as double, provided for convenience.
- real: same as double, provided for convenience.
- string: prompts user for a string. Does error checking to see that the string entered is not blank.
- combobox: takes an additional required attribute, "list", which contains a list of strings that form the choices presented to the user. For example, the dictionary entry "list":["item1", "item2"] specifies a combobox with two possible values to choose from, "item1" and "item2". The first item is selected by default.
If the user cancels, a python exception is thrown. If there is an error in syntax, a python exception is thrown with a description of the error. If the user hits OK, and there is any error in the input, the user is told by the dialog what the problem is, which may be remedied from within the dialog. If there are no errors, a list of the user inputs is returned in the same order as the Dictionaries that are passed in.
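Since the inputs come back in the same order as the dictionaries, they can be paired with their labels for easier lookup. The following is a minimal sketch: promptAsDict is a hypothetical helper name, and it relies only on the promptUser behaviour described above. The exception type raised on cancel is not documented here, so the usage comment catches the general Exception class, which also catches errors in the field definitions.

```python
# Hypothetical helper (not part of the ghi API): calls promptUser and
# returns the answers keyed by each field's "label" attribute.
def promptAsDict(ghi, fields):
    """Return {label: value} for the values entered in the dialog."""
    values = ghi.promptUser(fields)
    labels = [field['label'] for field in fields]
    return dict(zip(labels, values))

# Usage inside a script:
# fields = [{"label": "Enter integer:", "type": "integer", "min": 0},
#           {"label": "Select method:", "type": "combobox",
#            "list": ["method 1", "method 2"]}]
# try:
#     answers = promptAsDict(ghi, fields)
# except Exception:  # raised on cancel or on a malformed field definition
#     answers = None
```

Labels must be unique for this to work, because repeated labels would overwrite one another in the dictionary.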
5.6.16 Text Viewer
5.6.16.1 Getting the text
This function returns a Python string with the contents of the text viewer.
EXAMPLE
myString = myTextViewer.getText()
SYNTAX
String = Text Viewer Object.getText()
5.6.16.2 Saving text to a file
This function will save the contents of the Text Viewer to a .txt file. “.txt” will be appended to the end of the file name if it is not already there.
EXAMPLE
myTextViewer.saveToFile("filename.txt")
SYNTAX
None = Text Viewer Object.saveToFile( file name string )
5.6.17 Regression Results
5.6.17.1 Getting the text
This function returns a Python string with the contents of the regression results.
EXAMPLE
myString = myRegressionResults.getText()
SYNTAX
String = Regression Results Object.getText()
5.6.17.2 Saving text to a file
This function will save the contents of the Regression Results to a .txt file. “.txt” will be appended to the end of the file name if it is not already there.
EXAMPLE
myRegressionResults.saveToFile("filename.txt")
SYNTAX
None = Regression Results Object.saveToFile( file name string )
5.6.17.3 Getting the covariates
This function will return the covariates which were used in the regression as a python list of spreadsheet column numbers.
EXAMPLE
myRegressionResults.covariates()
SYNTAX
List = Regression Results Object.covariates()
5.6.17.4 Getting the interactions
This function will return the interactions which were used in the regression as a python list of tuples, which contain a column number for each of the regression terms.
EXAMPLE
myRegressionResults.interactionTerms()
SYNTAX
List = Regression Results Object.interactionTerms()
5.6.18 Navigator Object Selection
5.6.18.1 Selecting a Spreadsheet
EXAMPLE
objectid = ghi.promptSpreadsheet()
SYNTAX
Int = ghi.promptSpreadsheet()
Constructs a widget that lists all of the navigator nodes, with all the spreadsheet nodes highlighted in white. The navigator node id is returned for the spreadsheet that the user selects. Nothing is returned if the user cancels.
5.6.18.2 Selecting a Tree model
EXAMPLE
objectid = ghi.promptTree()
SYNTAX
Int = ghi.promptTree()
Constructs a widget that lists all of the navigator nodes, with all the tree model nodes highlighted in white. The navigator node id is returned for the tree model that the user selects. Nothing is returned if the user cancels.