
4.3 Importing Data

The File->Import Data submenu presents four menu items for importing different types of data:

  1. Import Wizard
  2. Import ASCII File
  3. Import Legacy GHD File
  4. Import DSF File

These methods for importing data are described in the following sections.

4.3.1 The Import Wizard


[Picture]
Figure 4.2: The First Panel of the Import Wizard Dialog.

To open the Import Wizard, select the File->Import Data->Import Wizard menu item from the Optimus RP main menu. The Import Wizard is shown in Fig. 4.2. It allows you to specify the source of your data, which can be either one of the file types listed in the following table or an ODBC data source. Note that some of the file formats may not be available on your operating system. Unless otherwise noted, the Import Wizard can import all versions of the specified file format.


File Type | Version Information | File extension(s)
1-2-3 | All versions of 1-2-3 can be read. | *.wk*, *.wr*
Access | | *.mdb
ASCII - Delimited | | *.txt, *.csv
ASCII - Fixed Format | | *.sts
dBASE, Foxpro, Clipper, and Alpha Four and compatible formats | All versions can be read. | *.dbf
Epi Info | All versions can be read. | *.rec
Excel | All versions of Microsoft Excel files through Excel 2000 can be read. | *.xls
Gauss | | *.dat
JMP | | *.jmp
LIMDEP | | *.lpj
Matlab Matrices | | *.mat
Mineset | | *.schema, *.sch
Minitab | All versions through version twelve can be read. | *.mtw
ODBC | Requires ODBC driver and Data Source Name (DSN). |
OSIRIS | | *.dct, *.dict
Paradox | All versions through version seven can be read. | *.db
Quattro Pro | | *.wq?, *.wb?
SAS Data Files | Versions 6 through 9 can be read. | *.sd2, *.ssd01, *.ssd04, *.sd7, *.sas7bdat, *.sav, *.por
SAS Transport Files | | *.xpt, *.tpt
S-PLUS | All versions of S-Plus matrices, lists and dataframes can be read. | *.
SPSS Data | | *.sav
SPSS Portable | | *.por
Stata | | *.dta
Statistica | All versions through version five. | *.sta
SYSTAT | | *.sys
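
The wildcard patterns in the extension column follow the usual conventions: * matches any run of characters and ? matches exactly one character. As a rough illustration only, and not a description of how the Import Wizard itself selects a format, the Python sketch below checks a file name against a few of the patterns from the table. The EXTENSION_PATTERNS mapping and the guess_file_types helper are hypothetical names introduced for this example.

```python
from fnmatch import fnmatch

# A few rows copied from the table above; this is not the complete list.
EXTENSION_PATTERNS = {
    "1-2-3": ["*.wk*", "*.wr*"],
    "ASCII - Delimited": ["*.txt", "*.csv"],
    "Excel": ["*.xls"],
    "Quattro Pro": ["*.wq?", "*.wb?"],
    "SPSS Data": ["*.sav"],
    "Stata": ["*.dta"],
}


def guess_file_types(filename):
    """Return every file type whose extension pattern matches the file name."""
    name = filename.lower()  # compare case-insensitively on every platform
    return [
        file_type
        for file_type, patterns in EXTENSION_PATTERNS.items()
        if any(fnmatch(name, pattern) for pattern in patterns)
    ]


print(guess_file_types("budget.WQ1"))
print(guess_file_types("survey.csv"))
```

The helper returns a list rather than a single name because an extension can appear in more than one row of the table, which is why the Files of type control still has to be set by hand.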

4.3.1.1 Importing Files

To specify a file for importing, left click the File button. This displays the Open dialog, which allows you to navigate through your file system and select the file to be imported. The Open dialog has a drop-down control labeled Files of type. Use this control to select the type of file that you are importing. Once you have selected the file and specified its type, left click the Open button to return to the Import Wizard.

The Name and Format controls displayed on the first panel of the Import Wizard are read-only, so you cannot type information about the file you want to import directly into them. Once you have used the Open dialog to select a file for importing, the Name and Format controls are filled in with the correct information.


[Picture]
Figure 4.3: The Second Panel of the Import Wizard Dialog for choosing the label column on an imported file.

Left click the Next button to view the second panel of the Import Wizard, which is shown in Fig. 4.3. This panel allows you to identify the label column. The label column contains labels that identify each row of data. The label column is not used during analysis, but the labels are used in several graphs produced by Optimus RP.

By default, the Import Wizard assumes that there is no label column. However, if your data set does contain a label column, left click on the Select label column radio button and highlight the name of the column that contains the label data. When you have completed your selection, left click on the Finish button to dismiss the Import Wizard and return to the Optimus RP main window. A new dataset is added to the Navigator Window and a new spreadsheet is opened for viewing the data.

4.3.1.2 Importing ODBC Data


[Picture]
Figure 4.4: The ODBC Data Sources Dialog.

If the data that you want to import is accessible via an ODBC data source name (DSN), you must left click the ODBC Driver button shown on the first panel of the Import Wizard (see Fig. 4.2). This will result in the display of the ODBC Data Sources dialog which displays all the known DSNs (see Fig. 4.4). Select the DSN that you would like to import and click the Use Data Source button. You will be returned to the first panel of the Import Wizard where you will see the Source and Driver text controls filled in with the name of the data source and driver you have chosen.

What happens after this depends entirely upon the type of DSN you have selected. For our purposes there are two categories of DSNs: those that identify resources native to Microsoft Windows, such as Microsoft Access databases, and those that identify non-native resources, such as Oracle or MySQL databases. Even within these two categories, what happens next varies with the specific resource.

If your DSN represents a resource native to Microsoft Windows such as a Microsoft Access database, you will be shown the second panel of the Import Wizard. This panel will allow you to choose the specific resource that you want to import. For instance, you could identify a particular Microsoft Access database. Once you have made your selection, click the Next button to view the third panel of the Import Wizard. You will then be asked to identify the specific table or file to be imported. Once you have made your selection, click the Next button to view the fourth panel of the Import Wizard. At this point you will be asked to identify a label column if there is one. Clicking the Finish button will cause the data source to be imported, the Import Wizard to be closed and a Spread Sheet Viewer to be displayed with the imported data.

If your DSN represents a resource not native to Microsoft Windows such as an Oracle or MySQL database, you will be shown a dialog box external to the Import Wizard that was provided by the vendor of the ODBC driver. The vendor dialog will ask you to provide the host, server name or IP address of the computer hosting the data source, the port number of the database server and the user ID and password. All of this information is used by the ODBC driver to establish a connection to the DSN. Once the vendor dialog box is dismissed, a connection will be established and you will be returned to the Import Wizard. You will then navigate through the Import Wizard panels identifying the specific table and label column as was described above for native resources.
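
For readers who have not worked with ODBC before, the information collected by the vendor dialog is typically passed to the driver as a connection string of KEY=value pairs separated by semicolons. The Python sketch below only assembles such a string and does not open a connection. DSN, UID and PWD are standard ODBC keywords; SERVER and PORT are shown as examples of driver-specific keywords and may be named differently by your driver. The build_connection_string helper and all of the values are hypothetical.

```python
def build_connection_string(dsn, user, password, **driver_options):
    """Assemble KEY=value pairs in the ODBC connection-string style.

    Values containing semicolons or braces would need extra quoting,
    which this sketch does not attempt.
    """
    parts = {"DSN": dsn, "UID": user, "PWD": password}
    parts.update(driver_options)  # driver-specific keywords, e.g. SERVER, PORT
    return ";".join(f"{key}={value}" for key, value in parts.items())


# Hypothetical data source name, credentials, host and port.
print(build_connection_string("clinical_db", "analyst", "secret",
                              SERVER="db.example.org", PORT=3306))
```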

4.3.2 Importing ASCII Data Files

Although the Import Wizard can import ASCII files, Optimus RP also provides a separate utility for converting these files. From an Optimus RP project's main screen, select the menu choice File->Import Data->Import ASCII File.


[Picture]
Figure 4.5: The File submenu highlighting the Import ASCII file.


[Picture]
Figure 4.6: Clicking Choose opens this browser window.

In the dialog that follows, you need to identify an ASCII source file, select how the file is delimited, and indicate which column, if any, holds row labels.

First click on the Choose... button to open a typical file-find dialog. Navigate to the Optimus RP\example folder, which holds the sample files included with Optimus RP. We use a CSV extension to indicate comma-delimited files and DAT to indicate space-delimited files. After selecting a file, click Open to continue.


[Picture]
Figure 4.7: The dialog window of Fig. 4.6 with all parameters filled in.

If you are getting files from other sources things may not be so neat. Many applications that export these structured files give them a TXT extension. See the section File Formats for a description of how these files are structured. For clarity you may want to rename your file’s extension to conform to this protocol.
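
If you are unsure how a TXT file from another application is delimited, you can inspect it before importing. The sketch below is one way to do that outside Optimus RP, using the Sniffer class from Python's standard csv module; it is not part of Optimus RP. The detect_delimiter helper and the sample text are hypothetical, and the candidate delimiters (comma, space, tab) are an assumption about what the file is likely to use.

```python
import csv


def detect_delimiter(sample_text, candidates=", \t"):
    """Guess the field delimiter from a sample of the file's text."""
    dialect = csv.Sniffer().sniff(sample_text, delimiters=candidates)
    return dialect.delimiter


# Hypothetical samples standing in for the first lines of a file.
comma_sample = "PATID,AGE,WEIGHT\nP001,34,70.5\nP002,51,82.1\n"
space_sample = "PATID AGE WEIGHT\nP001 34 70.5\nP002 51 82.1\n"

print(repr(detect_delimiter(comma_sample)))
print(repr(detect_delimiter(space_sample)))
```

For a real file, read the first few kilobytes and pass that text as the sample. Sniffer raises csv.Error when it cannot settle on a delimiter, in which case the file needs to be examined by hand.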


[Picture]
Figure 4.8: ASCII convert dialog.

Since we picked a CSV file we need to change the File format: pull-down to indicate the file is Comma delimited.

If the data has a single field that identifies each record, you can indicate this in the row Label column number: text box. For example, the data set above has a patient id and it is the first field in the left-to-right sequence of fields. The label column is for identification only and is not used for data analysis.
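
The role of the label column can be sketched outside Optimus RP as follows. The Python example separates one column of identifiers from the fields that would go on to analysis. It assumes that the first row of the file holds column names and that the label column number counts from 1; neither assumption is confirmed by this section. The split_label_column helper and the sample values are hypothetical.

```python
import csv
import io


def split_label_column(text, delimiter=",", label_column=1):
    """Separate the label column (assumed 1-based) from the other fields.

    Returns (header, labels, rows), with the label column removed from
    both header and rows.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    records = [row for row in reader if row]  # skip blank lines
    index = label_column - 1
    header = records[0][:index] + records[0][index + 1:]
    labels = [row[index] for row in records[1:]]
    rows = [row[:index] + row[index + 1:] for row in records[1:]]
    return header, labels, rows


# Hypothetical comma-delimited data with a patient id in the first field.
sample = "PATID,AGE,WEIGHT\nP001,34,70.5\nP002,51,82.1\n"
header, labels, rows = split_label_column(sample, delimiter=",", label_column=1)
print(labels)
print(header)
print(rows)
```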


[Picture]
Figure 4.9: Click OK to begin processing.

On a very large data set the importation process may take some time. There is a progress indicator (not shown) that helps estimate your progress.

Once the data has been converted you will see a summary of the processing. This should match your expectations of data types and number of records. Dismiss by clicking the OK button.
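
To have something to compare the summary against, you can count the records and classify the columns yourself before importing. The sketch below is a simple heuristic written for this purpose: a column is called numeric if every value parses as a number, and text otherwise. The rules Optimus RP applies when assigning data types are not described in this section, so treat the result as a sanity check only. The summarize helper and the sample values are hypothetical, and the first row is assumed to hold column names.

```python
import csv
import io


def is_number(value):
    """Return True if the text can be read as a floating-point number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def summarize(text, delimiter=","):
    """Return (record_count, {column_name: 'numeric' or 'text'})."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    records = [row for row in reader if row]  # skip blank lines
    header, rows = records[0], records[1:]
    column_types = {
        name: "numeric" if all(is_number(row[i]) for row in rows) else "text"
        for i, name in enumerate(header)
    }
    return len(rows), column_types


# Hypothetical comma-delimited data.
sample = "PATID,AGE,WEIGHT\nP001,34,70.5\nP002,51,82.1\n"
count, types = summarize(sample)
print(count)
print(types)
```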

When finished you are returned to the main navigation screen of Optimus RP where you will see a new icon for the imported data and a new icon for a spreadsheet containing the imported data. The spreadsheet is automatically opened and ready for further analysis.


[Picture]
Figure 4.10: The spreadsheet view ready for further analysis.

4.3.3 Importing Legacy GHD Files

Users of Optimus RP prior to release 3.0 may have data files already in the GHD format. In fact, some of the files included in the examples directory are stored in this legacy format. To remain usable, these legacy files need to be imported into a project. This is done by selecting the File->Import Data->Import Legacy GHD File menu item. An Open dialog is displayed which allows you to navigate to the folder where the GHD file is located. Selecting the GHD file and clicking on the Open button will cause the file to be imported, the Open dialog to be closed and control to be returned to the Optimus RP main window. The imported GHD file is automatically added to the current project and displayed in a Spread Sheet Viewer. Fig. 4.11 shows the legacy GHD data file, CSIM.ghd, being selected for import. As you can see, CSIM.ghd is located in the example folder. It contains data with one label column, Patient ID (PATID).


[Picture]
Figure 4.11: Importing the legacy GHD data file CSIM.ghd.

4.3.4 Importing DSF Files

Many users have noticed that Optimus RP stores data in a very efficient and compact format. The Dataset Storage Format (DSF) is designed to let Optimus RP users share data sets and collaborate on them. The DSF format is also open to third parties, who can develop the ability to create DSF files from their own products or data sources and thus integrate more easily with Optimus RP. To import a DSF file generated by an external source or exported from Optimus RP's Spreadsheet Menu (see 6.3.1.3), select the File->Import Data->Import DSF File menu item.

An Open dialog is displayed which allows you to navigate to the folder where the DSF file is located. Selecting the DSF file and clicking on the Open button will cause the file to be imported, the Open dialog to be closed and control to be returned to the Optimus RP main window. The imported data is automatically added to the current project and displayed in a Spread Sheet Viewer.