Principal components analysis on a spectrum data set (BioNumerics 7)


In this video I will perform a PCA, or principal components analysis, on the MALDI data set in my database. As you can see in the Database design panel, three levels are present in my database. My data set consists of spectra from three species. For each species, several isolates have been analyzed, and for each isolate there are multiple technical replicates. The spectrum type MALDI is present in the Experiment types panel.

Since a PCA is performed in the Comparison window, I first need to make a selection in the Main window. I will work on the data in the lowest level, the raw data level. To select all 80 entries in this level, I press the keyboard shortcut CTRL+A. I create a new comparison and click on the MALDI name in the Experiments panel to display the spectra in the Experiment data panel. For ease of interpretation, I will create comparison groups based on the species names. I right-click on the ‘Species’ column name, choose the last option and press ‘OK’. Three groups are created, corresponding to the three species in my comparison.

Since spectral data can only be analyzed by PCA if a peak matching has been calculated, I select ‘Spectra>Do peak matching’. The peak matching for our MALDI experiment is displayed in the Experiment data panel. I make sure ‘MALDI’ is selected and choose this button. By default ‘Use quantitative values’ is checked. If this option is unchecked, the binary information will be used, that is, only the presence or absence of each peak class. ‘Subtraction of the averages’ over the characters results in a PCA plot arranged around the origin. Therefore it is recommended to check this option for general purposes. I press ‘OK’ to start the calculations.

The Entry coordinates panel shows the entries plotted against the first two components. By looking at the colors in the Entry coordinates panel we can see that the spectra from the different species are clearly separated. The Character coordinates panel shows the plotted peak classes. A peak class that appears near the edge of the plot is a strong discriminator, while a peak class near the center is a weak discriminator. Furthermore, a peak class that appears near the position of an entry is an indicator for that entry. If you look at the entries for Species A, for example, they are all found in this quadrant. This group of peaks is located in the same quadrant, meaning that they are likely linked specifically to Species A. The same can be seen for Species B.

To obtain a three-dimensional view of the PCA, you can press this button. You can zoom in and out of the plot with the zoom sliders or with the Page Up and Page Down keys. The image can be rotated by clicking and dragging. In the 3D view, the same can be seen: the three species form distinct groups.

This ends this video. Please tune in to our other movies where more functionality is explained.
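
For viewers who want to see what the calculation does outside the software, here is a minimal sketch in Python with NumPy. It is not the BioNumerics implementation. The function name `pca`, its parameters `quantitative` and `subtract_averages` (named after the two dialog options), and the toy peak table are all assumptions made for illustration. Rows stand for spectra (entries), columns for peak classes (characters).

```python
import numpy as np


def pca(matrix, quantitative=True, subtract_averages=True, n_components=2):
    """Illustrative PCA of an entries-by-peak-classes table via SVD."""
    x = np.asarray(matrix, dtype=float)
    if not quantitative:
        # Binary mode: keep only presence (1) or absence (0) of each peak class.
        x = (x > 0).astype(float)
    if subtract_averages:
        # Subtract the average of each character (column), so the
        # entries end up arranged around the origin.
        x = x - x.mean(axis=0)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    entry_coords = u[:, :n_components] * s[:n_components]  # one row per entry
    character_coords = vt[:n_components].T                 # one row per peak class
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]   # share of sum of squares
    return entry_coords, character_coords, explained


# Hypothetical toy data: 6 spectra x 4 peak classes (intensity, 0 = no peak).
peaks = np.array([
    [9.0, 8.0, 0.0, 1.0],
    [8.5, 7.5, 0.0, 0.5],
    [9.5, 8.2, 0.5, 0.0],
    [0.0, 1.0, 7.0, 8.0],
    [0.5, 0.0, 7.5, 8.5],
    [0.0, 0.5, 8.0, 7.8],
])
species = ["A", "A", "A", "B", "B", "B"]

entry_coords, character_coords, explained = pca(peaks)
for name, (pc1, pc2) in zip(species, entry_coords):
    print(f"Species {name}: PC1={pc1:+.2f}  PC2={pc2:+.2f}")
```

In this toy table the first two peak classes are strong only in the Species A rows, so they fall on the same side of the first component as the Species A entries. This is the same reading as in the Character coordinates panel: a peak class located near a group of entries is an indicator for that group. The sign of each component is arbitrary, so only the relative positions matter.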
