|  | 
               
                |  |   
                |  |   
                | 
                     
                      | Institute: | Lab or Branch |   
                      | NIMH | Laboratory of Neurotoxicology |  |   
                | 
                     
                      | Title: |   
                      | DBParser: A Perl Program for Proteomic 
                        Data Analysis |  |   
                | 
                    
                      | Authors: |   
                      | X. Yang, V. Dondetti, R. Dezube, D. M. 
                        Maynard, S.P. Markey, L. Geer, J. Epstein and J.A. Kowalak |  |   
                | 
                     
                      | Abstract: |   
                      | LC-MS/MS peptide analyses routinely generate 
                        large data sets leading to long lists of identified proteins 
                        using automated interpretation programs such as Sequest 
                        or Mascot. The Mascot search engine generates a flat file 
                        that contains relevant experimental information, e.g., 
                        search parameters, summary of each result, ms/ms ion masses, 
                        peptide and protein assignments, etc. A Mascot HTML report 
                        is generated from this flat file. However, the subsequent 
                        sorting, collation, and comparison of these results pose 
                        significant tasks when analyzing multiple files. Therefore, 
                        parsing is necessary for data interpretation and protein 
                        comparison. We have developed DBParser, a Perl program 
                        that takes the output from Mascot flat files and stores 
                        the data in a MySQL database. DBParser generates user-friendly 
                        HTML reports of the sorted and compared protein 
                        lists, which can be used for subsequent analysis. DBParser 
                        is run from a command prompt window. We have also developed 
                        a CGI-based graphical user interface to allow execution 
                        of DBParser over a network using a web browser.

                          DBParser consists of several Perl scripts and 
                          one central Perl module. Individual scripts permit database 
                          creation, parsing, and report generation. The web interface 
                          consists of several HTML files, CGI scripts and one 
                          stylesheet file. It was developed and tested using Apache 
                          Server and Internet Explorer v. 6.0. The database creator 
                          script creates a relational database with 10 tables 
                          in MySQL. A parser script imports data from Mascot flat 
                          files into the database. The report generators produce 
                          HTML format reports of the significant proteins from 
                          a single search or a group of searches. There are two 
                          report generator scripts: proteins.pl produces a report 
                          for a single data set, and quickcompare.pl produces a 
                          pair-wise comparison of two data sets. The output of the 
                          latter script presents the proteins unique to each data 
                          set as well as the proteins common to both. The HTML report links 
                          to the public online databases GO and SwissProt, making 
                          report analysis easier for end users. DBParser has been 
                          tested separately on Windows 2000 and Linux systems. 
                          DBParser is freeware distributed under a Mozilla Open 
                          Source license. Interested parties should contact J.A. 
                          Kowalak (jeffrey.kowalak@nih.gov). |  |   
                |  |  |
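
The pair-wise comparison described in the abstract (proteins unique to each data set and proteins common to both) can be illustrated with a minimal sketch. This is not DBParser's code, which is written in Perl and reads from MySQL. The function name `compare_protein_lists` and the accession identifiers below are hypothetical, chosen only for illustration.

```python
def compare_protein_lists(set_a, set_b):
    """Split two protein lists into unique-to-each and common groups.

    Hypothetical helper: it shows the set logic behind a pair-wise
    comparison, not the actual quickcompare.pl implementation.
    """
    a, b = set(set_a), set(set_b)
    return {
        "unique_to_a": sorted(a - b),  # found only in the first data set
        "unique_to_b": sorted(b - a),  # found only in the second data set
        "common": sorted(a & b),       # found in both data sets
    }


# Made-up accession identifiers, for illustration only.
search_1 = ["P00001", "P00002", "P00003"]
search_2 = ["P00002", "P00003", "P00004"]

result = compare_protein_lists(search_1, search_2)
for label, accessions in result.items():
    print(label, accessions)
```

In a database-backed tool, the same three groups could instead be produced by queries over the stored protein assignments. The in-memory sets here only keep the example self-contained.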