1.2 Release Notes

Optimus RP is a software system for disentangling complex relationships between many predictor variables and one or more outcome variables. In clinical applications, it can be used to understand how various environmental and clinical parameters segment patients into different classes of responders. In marketing applications, it can be used to understand the variables that define different market segments. In finance, it can be used to understand and predict credit risk or expected returns given various demographic predictor variables. Optimus RP employs novel recursive partitioning algorithms to understand relationships among variables using statistical hypothesis testing.

FIRM stands for Formal Inference Recursive Modeling, and has its roots in work done in the 1970s and 1980s by Dr. Douglas Hawkins (http://www.douglashawkins.com). Early recursive partitioning approaches such as AID suffered from a lack of statistical rigor, and Hawkins introduced statistical hypothesis testing as a means of better characterizing the statistical validity of the models generated. FIRM was released in the early 1980s as a non-GUI package, and has been in use to this day.

Optimus RP has taken the statistical foundations of FIRM, augmented them with faster and more exact segmenting algorithms, and extended the methods to include multivariate responses. We are grateful for the continued assistance of Dr. Hawkins in devising and improving many of the statistical and algorithmic methods underlying Optimus RP.

1.2.1 Improvements in version 4.2

  • An optimized version of the column subset feature has been added. This version will preserve spreadsheet attributes such as being mapped and being a pedigree or family-indexed spreadsheet.
  • Spreadsheets are now able to retain a “custom” sort order. If you attempt to re-sort a spreadsheet which is already custom-sorted, a new tab will be created, thus preserving the original custom-sorted tab.

    This spreadsheet feature is utilized by the following Optimus RP features:

    • Doing a left-click-and-drag on the distance-matrix plot to obtain a spreadsheet (or a tree).
    • Permuting spreadsheet rows from scripting.
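
As a rough illustration of what a custom row order amounts to, the following Python sketch treats it as a permutation of row indices that is kept alongside the data so the original order can be recovered. It uses plain Python lists rather than the Optimus RP scripting interface; the function names permute_rows and restore_rows and the sample rows are hypothetical and exist only for this example.

```python
import random


def permute_rows(rows, seed=None):
    """Return (order, permuted_rows).

    order[i] is the original index of the row now at position i.
    Hypothetical helper, not part of the Optimus RP scripting API.
    """
    rng = random.Random(seed)
    order = list(range(len(rows)))
    rng.shuffle(order)
    return order, [rows[i] for i in order]


def restore_rows(order, permuted_rows):
    """Invert the permutation to recover the original row order."""
    original = [None] * len(permuted_rows)
    for new_pos, old_pos in enumerate(order):
        original[old_pos] = permuted_rows[new_pos]
    return original


# Made-up sample rows: (sample id, measurement)
rows = [["s1", 0.2], ["s2", 1.5], ["s3", 0.9], ["s4", 2.1]]
order, custom = permute_rows(rows, seed=1)

# The original table is untouched, and the custom order is reversible.
assert restore_rows(order, custom) == rows
```

Returning a new list instead of shuffling in place mirrors the behavior described above, where the original tab is preserved and the reordered data appears separately.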

1.2.2 Bug Fixes

  • Made the ASCII file import dialog “modal”; that is, the dialog must be completed before any other operation can be performed on the project.
  • Improved progress indication for recursive splits.
  • Made it possible to cancel out of resampling a subtree or the entire tree. An informative message is shown if you do.
  • Enabled external Python “C” modules in the Linux implementation of Optimus RP.
  • Made the spreadsheet column sort arrow, if there is one, always appear under Windows.
  • Viewing or saving predictions from a tree model now respects the spreadsheet’s sort order.

1.2.3 Known Bugs

  • When using the toolbar Feedback feature from behind a firewall, the email is sometimes prevented from going out. The current workaround is to copy the TO and Subject fields generated by Optimus RP and put them, along with your feedback, in your normal email program.
  • Multithreading is disabled for now because it has caused problems, perhaps related to the third-party library we use to handle threading and strings.
  • For some data formats, one of our third-party libraries for importing data from different file formats will not let you specify a column header row other than the library’s default.
  • When importing ASCII delimited data using the Import Wizard and specifying comma as the allele delimiter, the format of the file must be tab delimited. Auto-detection of other field (column) delimiters will not work.
  • Exceeding 4 GB of memory use will crash the program because it is a 32-bit application.
  • To improve performance for manual splitting, once the variables have been scanned, a question box will pop up if there are too many splits for the manual split window to appear quickly. You may choose to list only the most important splits. P-Value plots will still show data for all potential splitters.
  • When using the Import Wizard to import Microsoft Excel data with multiple worksheets, the correct worksheet may not be imported.
  • Under some circumstances, the iteration procedure for the logistic regression will be unstable and the regression may fail, even when the matrix has sufficient rank and significant regressors are included. (See 17.15.) At this time, the best workaround is to filter out the data that causes such instabilities.
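
For the ASCII import limitation noted above (comma as the allele delimiter requires tab-delimited fields), one possible workaround is to convert the file to tab-delimited form before running the Import Wizard. The sketch below uses Python's standard csv module and runs outside Optimus RP. The function name to_tab_delimited, the semicolon source delimiter, and the sample data are assumptions made for illustration only.

```python
import csv
import io


def to_tab_delimited(src, dst, delimiter=";"):
    """Copy delimited text from src to dst using tabs between fields.

    src and dst are text file objects. For real files, open both with
    newline="" as the csv module documentation recommends.
    Hypothetical helper, not part of Optimus RP.
    """
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    for row in reader:
        writer.writerow(row)


# Made-up example: semicolon-separated fields, comma-separated alleles.
source = io.StringIO("id;marker1;marker2\ns1;A,G;C,C\ns2;G,G;C,T\n")
target = io.StringIO()
to_tab_delimited(source, target)

# Commas inside genotype fields are left alone; only field separators change.
assert target.getvalue() == "id\tmarker1\tmarker2\ns1\tA,G\tC,C\ns2\tG,G\tC,T\n"
```

If the source file itself uses commas as the field delimiter, the fields and alleles cannot be told apart unless the genotype values are quoted, so check a few converted rows by hand before importing.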