Institute: NIMH
Lab or Branch: Laboratory of Neurotoxicology

Title: DBParser: A Perl Program for Proteomic Data Analysis

Authors: X. Yang, V. Dondetti, R. Dezube, D.M. Maynard, S.P. Markey, L. Geer, J. Epstein and J.A. Kowalak

Abstract:
LC-MS/MS peptide analyses routinely generate
large data sets leading to long lists of identified proteins
using automated interpretation programs such as Sequest
or Mascot. The Mascot search engine generates a flat file
that contains relevant experimental information, e.g.,
search parameters, summary of each result, MS/MS ion masses,
peptide and protein assignments, etc. A Mascot HTML report
is generated from this flat file. However, the subsequent
sorting, collation, and comparison of these results become
laborious when multiple files are analyzed, so the flat
files must be parsed before results can be interpreted and
protein lists compared. We have developed DBParser, a Perl program
that takes the output from Mascot flat files and stores
the data in a MySQL database. DBParser generates user-friendly
HTML reports of the sorted and compared protein
lists, which can be used for subsequent analysis. DBParser
is run from a command prompt window. We have also developed
a CGI-based graphical user interface to allow execution
of DBParser over a network using a web browser.
DBParser consists of several Perl scripts and
one central Perl module. Individual scripts permit database
creation, parsing, and report generation. The web interface
consists of several HTML files, CGI scripts and one
stylesheet file. It was developed and tested using Apache
Server and Internet Explorer v. 6.0. The database creator
script creates a relational database with 10 tables
in MySQL. A parser script imports data from Mascot flat
files into the database. The report generators produce
HTML-format reports of the significant proteins from
a single search or a group of searches. There are two
report generator scripts: one produces a single data set report (proteins.pl),
and the other a pair-wise comparison of data sets (quickcompare.pl).
The output of the latter script organizes the data to
present the proteins unique to each data set as well as proteins
that are common to both data sets. The HTML report links
to the public online databases GO and SwissProt, making
report analyses easier for end-users. DBParser has been
tested separately on Windows 2000 and Linux systems.
DBParser is freeware distributed under a Mozilla Open
Source license. Interested parties should contact J.A.
Kowalak (jeffrey.kowalak@nih.gov).
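
The pair-wise comparison described above (proteins unique to each data set and proteins common to both) can be illustrated with simple set operations. The following is a minimal sketch in Python, not the Perl code of quickcompare.pl; the function name, the dictionary keys, and the accession strings are hypothetical placeholders, and it assumes each search has already been reduced to a list of protein accessions.

```python
def compare_protein_lists(search_a, search_b):
    """Split two protein lists into unique-to-A, common, and unique-to-B.

    Illustrative only: the inputs are assumed to be iterables of protein
    accession strings, one list per search.
    """
    a, b = set(search_a), set(search_b)
    return {
        "unique_to_a": sorted(a - b),   # found only in the first search
        "common": sorted(a & b),        # found in both searches
        "unique_to_b": sorted(b - a),   # found only in the second search
    }


# Placeholder accessions, not real search results.
search_1 = ["ACC001", "ACC002", "ACC003"]
search_2 = ["ACC002", "ACC004", "ACC001"]

result = compare_protein_lists(search_1, search_2)
for group, accessions in result.items():
    print(group, accessions)
```

In a database-backed tool the same three groups could equally be produced by queries against the stored protein assignments; the set-based version is used here only because it is self-contained.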