Extracts portions of the data from an mzML, featureXML or consensusXML file.
Potential predecessor tools: any tool yielding output in mzML, featureXML or consensusXML format
→ FileFilter →
Potential successor tools: any tool that benefits from reduced input
With this tool it is possible to extract m/z, retention time and intensity ranges from an input file and to write all data that lies within the given ranges to an output file.
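As a rough illustration of the range extraction described above, the following sketch restricts a file to an m/z, retention time and intensity window. The file names and numeric bounds are placeholders, the `run` and `extract_window` helpers are hypothetical, and the `min:max` notation with an open upper bound is assumed to be accepted by `-mz`, `-rt` and `-int`. By default the helper only prints the command; set `DRY_RUN=0` to execute it.

```shell
#!/bin/sh
# Sketch only: file names and range values are placeholders, not recommendations.
# With DRY_RUN=1 (the default here) the command is printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: keep data with m/z 400-1200, RT 1000-2000 and intensity >= 500.
# Assumption: ranges are written as min:max and an empty side leaves that bound open.
# $1 = input file, $2 = output file
extract_window() {
  run FileFilter -in "$1" -out "$2" -mz 400:1200 -rt 1000:2000 -int 500:
}

extract_window input.mzML window.mzML
```

The same three options should apply to featureXML and consensusXML input, since the range filters are described as independent of the file type.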
Depending on the input file type, additional specific operations are possible:
mzML
extract spectra of a certain MS level
filter by signal-to-noise estimation
filter by scan mode of the spectra
filter by scan polarity of the spectra
remove MS2 scans whose precursor matches identifications (from an idXML file in 'id:blacklist')
featureXML
filter by feature charge
filter by feature size (number of subordinate features)
filter by overall feature quality
consensusXML
filter by size (number of elements in consensus features)
filter by consensus feature charge
filter by map (extracts specified maps and re-evaluates consensus centroid)
e.g. FileFilter -map 2 3 5 -in file1.consensusXML -out file2.consensusXML
If a single map is specified, the feature itself can be extracted.
e.g. FileFilter -map 5 -in file1.consensusXML -out file2.featureXML
featureXML / consensusXML:
remove items with a certain meta value annotation, allowing for >, < and = comparisons. List types are compared by length, not content. Integer, Double and String values are compared using their built-in operators.
filter sequences, e.g. "LYSNLVER" or the modification "(Phospho)"
e.g. FileFilter -id:sequences_whitelist Phospho -in file1.consensusXML -out file2.consensusXML
filter accessions, e.g. "sp|P02662|CASA1_BOVIN"
remove features with annotations
remove features without annotations
remove unassigned peptide identifications
keep only the best-scoring identification for features with multiple peptide identifications
e.g. FileFilter -id:remove_unannotated_features -id:remove_unassigned_ids -id:keep_best_score_id -in file1.featureXML -out file2.featureXML
remove features with id clashes (different sequences mapped to one feature)
The priority of the id-flags is (decreasing order): remove_annotated_features / remove_unannotated_features -> remove_clashes -> keep_best_score_id -> sequences_whitelist / accessions_whitelist
MS2 and higher-level spectra can be filtered according to precursor m/z (see 'peak_options:pc_mz_range'). This option can be combined with the 'rt' range to filter precursors by both RT and m/z. If you want to extract an MS1 region with untouched MS2 spectra included, you need to split the dataset by MS level, then use the 'mz' option for the MS1 data and 'peak_options:pc_mz_range' for the MS2 data, and afterwards merge the two files again. RT can be filtered at any step.
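The split, filter and merge procedure above might be scripted roughly as follows. This is a sketch under assumptions: the intermediate file names are placeholders, the `run` and `split_filter_merge` helpers are hypothetical, `-peak_options:level` is used for the split by MS level, and FileMerger is assumed as the merging tool because the text does not name one. By default the commands are only printed; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch only: intermediate file names are placeholders.
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper.
# $1 = input mzML, $2 = output mzML, $3 = RT range,
# $4 = m/z range for MS1 peaks, $5 = precursor m/z range for MS2 spectra
split_filter_merge() {
  in=$1; out=$2; rt=$3; mz=$4; pc=$5

  # 1. Split by MS level. RT can be filtered at any step; here it is done first.
  run FileFilter -in "$in" -out ms1.mzML -peak_options:level 1 -rt "$rt"
  run FileFilter -in "$in" -out ms2.mzML -peak_options:level 2 -rt "$rt"

  # 2. Restrict MS1 peaks by m/z; select MS2 spectra by precursor m/z,
  #    which leaves the fragment peaks of the kept spectra untouched.
  run FileFilter -in ms1.mzML -out ms1_filtered.mzML -mz "$mz"
  run FileFilter -in ms2.mzML -out ms2_filtered.mzML -peak_options:pc_mz_range "$pc"

  # 3. Merge the two parts again (FileMerger is an assumption, see lead-in).
  run FileMerger -in ms1_filtered.mzML ms2_filtered.mzML -out "$out"
}

split_filter_merge input.mzML region.mzML 1000:2000 400:600 400:600
```

Using the same range for the MS1 'mz' filter and the precursor filter keeps the MS2 spectra whose precursors fall inside the extracted MS1 region.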
Note
For filtering peptide/protein identification data, see the IDFilter tool.
Currently mzIdentML (mzid) is not directly supported as an input/output format of this tool. Convert mzid files to/from idXML using IDFileConverter if necessary.
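For instance, identifications stored as mzid could be converted first and then used as a blacklist for MS2 removal. This is a minimal sketch: the file names are placeholders, the `run` and `blacklist_from_mzid` helpers are hypothetical, and it assumes IDFileConverter infers the formats from the file extensions. By default the commands are only printed; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch only: file names are placeholders.
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper.
# $1 = input mzML, $2 = identifications in mzid format, $3 = output mzML
blacklist_from_mzid() {
  # Convert mzid to idXML, the format expected by 'id:blacklist'.
  run IDFileConverter -in "$2" -out ids.idXML
  # Remove MS2 scans whose precursor matches one of the identifications.
  run FileFilter -in "$1" -out "$3" -id:blacklist ids.idXML
}

blacklist_from_mzid input.mzML ids.mzid unidentified.mzML
```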
The command line parameters of this tool are:
INI file documentation of this tool:
For the parameters of the S/N algorithm section see the class documentation there:
peak_options:sn
Generated on Fri Apr 12 2024 05:42:36 for OpenMS by Doxygen 1.9.8