Help
Input
OMSSA can take one of the following input formats:
- a single dta file
- a set of dta files merged into a single file, separated by blank
lines.
- a set of dta files merged into a single file, separated by XML-like
tags that allow tracking of spectra by file name and by number.
A Perl script to do the merge, dta_merge_OMSSA.pl, is included in the
distribution and is described in the README file.
- an mgf file
- a pkl file
The maximum number of spectra per file is currently 2000.
(Remark: this applies only to the NCBI web interface. We do not have this limitation locally.)
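To illustrate the blank-line merged format listed above, here is a minimal Python sketch that joins the text of several dta files with one blank line between spectra. The function name `merge_dta_texts` and the `max_spectra` parameter are hypothetical and chosen for illustration; this is not the bundled dta_merge_OMSSA.pl script, and it does not produce the XML-like tagged variant.

```python
def merge_dta_texts(dta_texts, max_spectra=None):
    """Join the contents of several dta files into one string.

    dta_texts   -- list of strings, each the full text of one dta file
    max_spectra -- optional limit (hypothetical parameter), e.g. 2000
    """
    # Drop empty entries and surrounding whitespace so that exactly one
    # blank line ends up between consecutive spectra.
    cleaned = [text.strip() for text in dta_texts if text.strip()]
    if max_spectra is not None and len(cleaned) > max_spectra:
        raise ValueError(
            f"{len(cleaned)} spectra exceed the limit of {max_spectra}"
        )
    return "\n\n".join(cleaned) + "\n"
```

Passing `max_spectra=2000` would mimic the web interface limit; leaving it as `None` corresponds to a local installation without that limit.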
Sequence library to search. RefSeq
is a comprehensive, consistent, and stable library of sequences meant
as a reference standard. Nr is a non-redundant set of most protein
sequences in NCBI databases.
The enzyme used to theoretically cleave the sequence library.
If you cannot find the enzyme you want to use in the list, please
contact Lewis Geer.
Limits your search to a particular organism or organisms. Multiple
organisms can be selected by holding down the "Ctrl" key
while clicking on the organism names. Public search service
users must select the organisms to search. If you want to search
all organisms at once, you must download OMSSA and install it on
your own computer.
If the organism you wish to search is not listed, please contact
Lewis Geer.
Both options affect the number of hits kept in memory during the
search and returned to the user. The hitlist max length is the maximum
number of hits retained per spectrum per precursor charge. The E-value
cutoff is the maximum E-value allowed in the hit list.
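The combined effect of the two options can be sketched in Python as follows. This is an illustration only, not OMSSA's internal code: the function name `filter_hits` and the `(peptide, evalue)` tuple layout are hypothetical, and ranking hits by ascending E-value is an assumption.

```python
def filter_hits(hits, hitlist_max_length, evalue_cutoff):
    """Trim the hit list for one spectrum and one precursor charge.

    hits -- list of (peptide, evalue) tuples (hypothetical layout)
    """
    # Keep only hits whose E-value does not exceed the cutoff.
    kept = [hit for hit in hits if hit[1] <= evalue_cutoff]
    # Assumption: the best hits are those with the lowest E-value.
    kept.sort(key=lambda hit: hit[1])
    # Retain at most hitlist_max_length hits.
    return kept[:hitlist_max_length]
```

For example, with a hitlist max length of 2 and an E-value cutoff of 1.0, a spectrum with four candidate hits would return at most the two lowest-scoring hits that are at or below 1.0.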
Enter post-translational modifications to be searched. Fixed modifications
are modifications that always occur. Variable modifications are
modifications that may occur -- OMSSA will search both the modified
and unmodified versions of peptides. Multiple modifications can
be selected by holding down the "Ctrl" key and clicking.
If a modification you wish to search is not listed, please contact
Lewis Geer.
To reduce the combinatorial expansion that results when specifying
multiple variable modifications, it is possible to put an upper
bound on the number of mass ladders generated per peptide using
this option. The ladders are generated in order from the
fewest modifications to the most, so that modifications are
applied sparingly at first. If you set this number too low, you
will miss highly modified peptides. If you set it too high, it
will make the E-values less significant.
As an example, assume that the hard limit is 11, that the
theoretical peptide is STYY, and that you have selected
phosphorylation of S, T, and Y as variable mods. The combinations
that OMSSA will test are:
STYY
----
0000
1000
0100
0010
0001
1100
1010
1001
0110
0101
0011
where 0 represents no modification and 1 represents a modification
at the site indicated by the column the digit is in. OMSSA tries
the combinations with the fewest variable modifications first
and then adds modifications until the upper bound is reached.
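The enumeration in the STYY example can be reproduced with a short Python sketch. It is a hedged illustration rather than OMSSA's actual implementation: the function name `modification_ladders` and its parameters are hypothetical, and the ordering within a group of equal modification count is an assumption chosen to match the listing above.

```python
from itertools import combinations


def modification_ladders(peptide, modifiable_residues, limit):
    """List modification patterns, fewest modifications first.

    Each pattern is a string of 0/1 digits, one per residue, where 1
    marks a modified site. Enumeration stops once `limit` patterns
    have been produced.
    """
    # Positions in the peptide that can carry a variable modification.
    sites = [i for i, aa in enumerate(peptide) if aa in modifiable_residues]
    ladders = []
    for n_mods in range(len(sites) + 1):
        for chosen in combinations(sites, n_mods):
            if len(ladders) >= limit:
                return ladders
            ladders.append(
                "".join("1" if i in chosen else "0" for i in range(len(peptide)))
            )
    return ladders


for pattern in modification_ladders("STYY", "STY", 11):
    print(pattern)
```

With a limit of 11 the loop stops after all singly and doubly modified patterns, so the triply and fully phosphorylated forms of STYY are never tested. This is the "too low" case described above.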
The precursor ion is the ion before fragmentation and the product
ions are the ions generated after fragmentation. The mass tolerance
is the instrumental error in determining the mass of these ions.
These values are specified in Daltons +/- the measured value, e.g.
a value of 2.0 means +/- 2.0 Daltons of the measured value.
Allows you to specify how the mass tolerance scales with the charge
of the precursor. For example, you may search a precursor assuming
that it has charge states of 2+ and 3+. If you set the parameter
to 1, then the mass tolerance for the 2+ charge state will be 2
times the precursor mass tolerance, and for the 3+ charge state
it will be 3 times the precursor mass tolerance. If you set the
parameter to 0, the mass tolerance will always be equal to the precursor
mass tolerance, irrespective of charge state.
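The tolerance window and the two scaling settings described above can be written out as a small Python sketch. The function names `precursor_tolerance` and `mass_window` are hypothetical; only the settings 0 and 1 are described in this help text, so the sketch rejects any other value instead of guessing at its meaning.

```python
def precursor_tolerance(base_tolerance, charge, scaling):
    """Effective precursor mass tolerance in Daltons for a charge state."""
    if scaling == 0:
        # No scaling: the same tolerance for every charge state.
        return base_tolerance
    if scaling == 1:
        # Linear scaling: 2x for 2+, 3x for 3+, and so on.
        return base_tolerance * charge
    raise ValueError("only scaling values 0 and 1 are covered by this sketch")


def mass_window(measured_mass, tolerance):
    """Accepted mass range: measured value +/- tolerance."""
    return (measured_mass - tolerance, measured_mass + tolerance)


# A 2.0 Da tolerance searched at 3+ with scaling set to 1 gives +/- 6.0 Da.
print(mass_window(1000.0, precursor_tolerance(2.0, 3, 1)))
```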
These settings specify the use of average or monoisotopic amino
acid masses when calculating theoretical precursor and product ion
masses, respectively.
Determination of precursor charge and product ion charges.
Presently, OMSSA estimates which precursors are 1+ using the fraction
of peaks below the precursor as an indicator. If the number of peaks
is below the specified fraction, then the spectrum is searched as
1+. All other spectra are searched with precursor charge
ranging from the minimum to maximum specified.
OMSSA ignores charges specified in the input file, except when
using it to calculate the precursor m/z. For example, if you are
searching using a dta file, OMSSA will search over the minimum and
maximum precursor charge you specified, not the charge specified
in the dta file. In the case of dta files, this means that you do
not have to search multiple versions of the same file, where each
is calculated using a different precursor charge.
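The following Python sketch shows why the charge in a dta file is still needed once. It assumes the usual dta convention that the first line holds the singly protonated mass (M+H)+ and a charge, so the file charge is used only to recover the measured m/z; the neutral mass is then recomputed for every searched charge. The function names and the proton mass constant are assumptions for illustration, not taken from OMSSA's source.

```python
PROTON_MASS = 1.007276  # Da, approximate mass of a proton


def dta_precursor_mz(mh_plus, file_charge):
    """Recover the measured m/z from the (M+H)+ value and charge in a dta file."""
    return (mh_plus + (file_charge - 1) * PROTON_MASS) / file_charge


def candidate_neutral_masses(mz, min_charge, max_charge):
    """Neutral precursor mass implied by the same m/z at each searched charge."""
    return {
        z: mz * z - z * PROTON_MASS
        for z in range(min_charge, max_charge + 1)
    }


mz = dta_precursor_mz(1001.007276, 2)
for charge, mass in candidate_neutral_masses(mz, 2, 3).items():
    print(charge, round(mass, 4))
```

One dta file therefore yields one m/z value, and the minimum and maximum precursor charge settings decide which masses are actually searched.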
The "Charge at which to start considering multiply charged
products" setting is the lowest precursor charge at which OMSSA
should start considering multiply charged product ions.
Preprocessing is the process of eliminating noise from a spectrum.
Normally, you do not need to adjust options associated with preprocessing
as OMSSA automatically adjusts its preprocessing for best results.
The "Peak intensity cutoff" eliminates all peaks whose
intensity is less than a fraction of the most intense peak.
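A minimal Python sketch of the peak intensity cutoff follows. The function name `apply_intensity_cutoff` and the `(mz, intensity)` tuple layout are hypothetical, and OMSSA's own preprocessing involves more than this single step.

```python
def apply_intensity_cutoff(peaks, cutoff_fraction):
    """Remove peaks weaker than cutoff_fraction times the strongest peak.

    peaks -- list of (mz, intensity) tuples
    """
    if not peaks:
        return []
    threshold = cutoff_fraction * max(intensity for _, intensity in peaks)
    # Peaks below the threshold are eliminated; all others are kept.
    return [(mz, intensity) for mz, intensity in peaks if intensity >= threshold]
```

With a most intense peak of 1000 and a cutoff of 0.025, every peak with an intensity below 25 is discarded.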
The "Number of top intensity peaks in first pass" is
the number of top intensity peaks from which OMSSA must find at
least one match between a peak m/z value and the m/z values
calculated from the protein sequence library peptide.
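The first-pass check can be pictured with the Python sketch below. It is an approximation for illustration: the function name `passes_first_pass` is hypothetical, and treating a match as "within a given m/z tolerance" is an assumption about how peaks are compared.

```python
def passes_first_pass(peaks, theoretical_mzs, top_n, tolerance):
    """Return True if any of the top_n most intense peaks matches.

    peaks           -- list of (mz, intensity) tuples from the spectrum
    theoretical_mzs -- m/z values calculated for one library peptide
    tolerance       -- allowed m/z difference for a match (assumed)
    """
    # Select the top_n peaks by intensity.
    top_peaks = sorted(peaks, key=lambda peak: peak[1], reverse=True)[:top_n]
    # At least one of them must lie within tolerance of a calculated value.
    return any(
        abs(mz - theoretical) <= tolerance
        for mz, _ in top_peaks
        for theoretical in theoretical_mzs
    )
```

A peptide that fails this check would be skipped without further scoring, so a larger number of peaks makes the first pass more permissive.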
OMSSA searches two ion series, both of which can be specified.
Normally one of the specified ion series is a forward ion series
and the other is a reverse ion series.
OMSSA can create output in CSV (Excel), XML, and ASN.1 using the
form on the results page. The CSV format is a convenient summary of the
detailed results in the XML and ASN.1 format files. The XML and
ASN.1 format files are logically equivalent.
A sample Perl parser for the XML output is included (see the README
file). To use the sample parser, type "perl readOMSSA.pl
test.xml" where test.xml is a file containing XML output from
OMSSA.