imspy.timstof.dbsearch package¶

Submodules¶

imspy.timstof.dbsearch.imspy_dda module¶

imspy.timstof.dbsearch.imspy_dda.create_database(fasta, static, variab, enzyme_builder, generate_decoys, bucket_size, shuffle_decoys=True, keep_ends=True)¶

imspy.timstof.dbsearch.imspy_dda.load_config(config_path)¶

imspy.timstof.dbsearch.imspy_dda.main()¶

imspy.timstof.dbsearch.imspy_rescore_sage module¶

imspy.timstof.dbsearch.imspy_rescore_sage.main()¶

imspy.timstof.dbsearch.sage_output_utility module¶

class imspy.timstof.dbsearch.sage_output_utility.PatternReplacer(replacements, pattern='\\\\[.*?\\\\]')¶

Bases: object

apply(string)¶

Return type: str

imspy.timstof.dbsearch.sage_output_utility.break_into_equal_size_sets(sequence_set, k=10)¶

Breaks a set of objects into k sets of equal size at random.

Parameters:

sequence_set – Set of sequences to be divided
k (int) – Number of sets to divide the objects into

Returns:

A list containing k sets, each with equal number of randomly chosen sequences
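
As a rough illustration of the behaviour described above, the following is a minimal standalone sketch, not the library's implementation. The name `split_set_randomly` and the `seed` argument are hypothetical, and dealing the shuffled items round-robin is an assumed strategy; with it, set sizes differ by at most one when the input size is not divisible by k.

```python
import random


def split_set_randomly(items, k=10, seed=None):
    """Deal shuffled items round-robin into k sets; sizes differ by at most one."""
    # Sorting first makes the result reproducible for a given seed
    # (assumes the items are mutually comparable, e.g. strings).
    shuffled = sorted(items)
    random.Random(seed).shuffle(shuffled)
    return [set(shuffled[i::k]) for i in range(k)]


sequences = {"PEPTIDEK", "ELVISK", "LIVESK", "SAMPLER", "TESTK", "ANOTHERK"}
parts = split_set_randomly(sequences, k=3, seed=0)
```

Here six sequences are divided into three disjoint sets of two sequences each.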

imspy.timstof.dbsearch.sage_output_utility.cosim_from_dict(observed, predicted)¶

imspy.timstof.dbsearch.sage_output_utility.fragments_to_dict(fragments)¶

imspy.timstof.dbsearch.sage_output_utility.generate_training_data(psms, method='psm', q_max=0.01, balance=True)¶

Generate training data.

Parameters:

psms (DataFrame) – List of PeptideSpectrumMatch objects
method (str) – Method to use for training data generation
q_max (float) – Maximum q-value allowed for positive examples
balance (bool) – Whether to balance the dataset

Returns: X_train and Y_train
Return type: Tuple[NDArray, NDArray]

imspy.timstof.dbsearch.sage_output_utility.plot_summary(TARGET, DECOY, save_path, dpi=300, file_format='png')¶

imspy.timstof.dbsearch.sage_output_utility.re_score_psms(psms, num_splits=10, verbose=True, balance=True, score='hyperscore', positive_example_q_max=0.01)¶

Re-score PSMs using LDA.

Parameters:

psms (DataFrame) – List of PeptideSpectrumMatch objects
num_splits (int) – Number of splits
verbose (bool) – Whether to print progress
balance (bool) – Whether to balance the dataset
score (str) – Score to use for re-scoring
positive_example_q_max (float) – Maximum q-value allowed for positive examples

Returns: List of PeptideSpectrumMatch objects
Return type: List[PeptideSpectrumMatch]

imspy.timstof.dbsearch.sage_output_utility.remove_substrings(input_string)¶

Return type: str

imspy.timstof.dbsearch.sage_output_utility.row_to_fragment(r)¶

imspy.timstof.dbsearch.sage_output_utility.split_dataframe_randomly(df, n)¶

Return type: list

imspy.timstof.dbsearch.utility module¶

imspy.timstof.dbsearch.utility.check_memory(limit_in_gb=16, msg='⚠️ Warning: System has only {total_ram_gb:.2f}GB of RAM, which is below the recommended {limit_in_gb}GB.')¶

imspy.timstof.dbsearch.utility.extract_timstof_dda_data(path, in_memory=False, use_bruker_sdk=False, isolation_window_lower=-3.0, isolation_window_upper=3.0, take_top_n=100, num_threads=16)¶

Extract timsTOF DDA data from a Bruker timsTOF TDF file.

Parameters:

path (str) – Path to timsTOF DDA data
in_memory (bool) – Whether to load data in memory
use_bruker_sdk (bool) – Whether to use the Bruker SDK for data extraction
isolation_window_lower (float) – Lower bound for isolation window (Da)
isolation_window_upper (float) – Upper bound for isolation window (Da)
take_top_n (int) – Number of top peaks to take
num_threads (int) – Number of threads to use

Returns: DataFrame containing timsTOF DDA data
Return type: pd.DataFrame

imspy.timstof.dbsearch.utility.generate_balanced_im_dataset(psms)¶

Return type: List[Psm]

imspy.timstof.dbsearch.utility.generate_balanced_rt_dataset(psms)¶

Return type: List[Psm]

imspy.timstof.dbsearch.utility.generate_training_data(psms, method='psm', q_max=0.01, balance=True)¶

Generate training data.

Parameters:

psms (List[Psm]) – List of PeptideSpectrumMatch objects
method (str) – Method to use for training data generation
q_max (float) – Maximum q-value allowed for positive examples
balance (bool) – Whether to balance the dataset

Returns: X_train and Y_train
Return type: Tuple[NDArray, NDArray]

imspy.timstof.dbsearch.utility.get_searchable_spec(precursor, raw_fragment_data, spec_processor, time, spec_id, file_id=0, ms_level=2)¶

Get SAGE searchable spectrum from raw data.

Parameters:

precursor (Precursor) – Precursor object
raw_fragment_data (TimsFrame) – TimsFrame object
time (float) – float
spec_processor (SpectrumProcessor) – SpectrumProcessor object
spec_id (str) – str
file_id (int) – int
ms_level (int) – int

Returns: ProcessedSpectrum object
Return type: ProcessedSpectrum

imspy.timstof.dbsearch.utility.linear_map(value, old_min, old_max, new_min=0.0, new_max=60.0)¶

imspy.timstof.dbsearch.utility.list_to_semicolon_string(value)¶

Converts a list of proteins into a semicolon-separated string.

imspy.timstof.dbsearch.utility.map_to_domain(data, gradient_length=120.0)¶

Maps the input data linearly into the domain [0, l], where l is given by gradient_length.

Parameters:

data – list or numpy array of numerical values
gradient_length (float) – the upper limit l of the target domain [0, l]

Returns:

A list of values mapped into the domain [0, l]
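
One plausible reading of "maps linearly into [0, l]" is min-max scaling, sketched below. This is an assumption rather than the library's confirmed behaviour; the function name `map_to_domain_sketch` and the parameter name `upper` are hypothetical.

```python
def map_to_domain_sketch(data, upper=120.0):
    """Min-max scale values so the smallest maps to 0 and the largest to `upper`."""
    values = [float(v) for v in data]
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate input: every value is identical, so map everything to 0.
        return [0.0 for _ in values]
    scale = upper / (hi - lo)
    return [(v - lo) * scale for v in values]


mapped = map_to_domain_sketch([10, 20, 30], upper=120.0)
```

With this input the three values land at the start, midpoint, and end of the target domain.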

imspy.timstof.dbsearch.utility.merge_dicts_with_merge_dict(dicts)¶

imspy.timstof.dbsearch.utility.parse_string_list(input_str)¶

Takes a string representation of a list and converts it into an actual list of strings.

Parameters: input_str (str) – A string containing a list representation.
Returns: A list of strings parsed from the input string.
Return type: list
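
For illustration only, a conversion of this kind can be sketched with the standard library's `ast.literal_eval`. Whether the library parses the string this way is not stated here; `parse_string_list_sketch` is a hypothetical name and the protein identifiers are made-up placeholders.

```python
import ast


def parse_string_list_sketch(input_str):
    """Safely evaluate a list literal held in a string and return its items as strings."""
    parsed = ast.literal_eval(input_str)
    if not isinstance(parsed, (list, tuple)):
        raise ValueError("input does not represent a list")
    return [str(item) for item in parsed]


proteins = parse_string_list_sketch("['P1', 'P2']")
```

`ast.literal_eval` only accepts Python literals, so arbitrary code in the input string is rejected rather than executed.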

imspy.timstof.dbsearch.utility.parse_to_tims2rescore(TDC, from_mgf=False, file_name=None)¶

imspy.timstof.dbsearch.utility.peptide_length(peptide)¶

Takes a peptide sequence as a string and returns its length, excluding [UNIMOD:X] modifications.

Parameters: peptide (str) – A peptide sequence with possible UNIMOD modifications.
Returns: The length of the peptide without modifications.
Return type: int
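
The idea can be sketched with a regular expression that strips the bracketed tags before counting residues. This is a standalone illustration, not the library's code; `peptide_length_sketch` is a hypothetical name, and the pattern assumes the X in [UNIMOD:X] is a numeric identifier.

```python
import re

# Assumption: modification tags look like [UNIMOD:4], i.e. a numeric id in brackets.
UNIMOD_TAG = re.compile(r"\[UNIMOD:\d+\]")


def peptide_length_sketch(peptide):
    """Count residues after removing every [UNIMOD:X] tag from the sequence."""
    return len(UNIMOD_TAG.sub("", peptide))


n_residues = peptide_length_sketch("PEPTC[UNIMOD:4]IDEK")
```

The tag is removed first, so only the nine residue letters of PEPTCIDEK are counted.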

imspy.timstof.dbsearch.utility.sanitize_charge(charge)¶

Sanitize charge value.

Parameters: charge (Optional[float]) – Charge value as float.
Returns: Charge value as int.
Return type: int

imspy.timstof.dbsearch.utility.sanitize_mz(mz, mz_highest)¶

Sanitize mz value.

Parameters:

mz (Optional[float]) – Mz value as float.
mz_highest (float) – Highest mz value.

Returns: Mz value as float.
Return type: float

imspy.timstof.dbsearch.utility.split_fasta(fasta, num_splits=16, randomize=True)¶

Split a fasta file into multiple fasta files.

Parameters:

fasta (str) – Fasta file as string.
num_splits (int) – Number of splits fasta file should be split into.
randomize (bool) – Whether to randomize the order of sequences before splitting.

Returns: List of fasta files as strings, will contain num_splits fasta files with equal number of sequences.
Return type: List[str]
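
A minimal sketch of such a split is shown below, assuming records begin on lines starting with ">" and are distributed round-robin across the outputs. It is not the library's implementation; `split_fasta_sketch` and its `seed` argument are hypothetical, and the tiny FASTA string is a made-up example.

```python
import random


def split_fasta_sketch(fasta, num_splits=16, randomize=True, seed=None):
    """Split FASTA text into num_splits FASTA strings with near-equal record counts."""
    records = []
    for line in fasta.splitlines():
        if line.startswith(">"):
            records.append([line])        # a header line opens a new record
        elif line.strip() and records:
            records[-1].append(line)      # sequence line of the current record
    if randomize:
        random.Random(seed).shuffle(records)
    chunks = [records[i::num_splits] for i in range(num_splits)]
    return [
        "".join("\n".join(rec) + "\n" for rec in chunk)
        for chunk in chunks
    ]


example = ">p1\nAAAK\n>p2\nCCCR\n>p3\nDDDK\n>p4\nEEER\n"
fasta_parts = split_fasta_sketch(example, num_splits=2, randomize=False)
```

With randomization disabled, records are dealt in input order, so the first output holds p1 and p3 and the second holds p2 and p4.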

imspy.timstof.dbsearch.utility.split_psms(psms, num_splits=10)¶

Split PSMs into multiple splits.

Parameters:

psms (List[Psm]) – List of PeptideSpectrumMatch objects
num_splits (int) – Number of splits

Returns:

List of splits

Return type:

List[List[PeptideSpectrumMatch]]

imspy.timstof.dbsearch.utility.transform_psm_to_pin(psm_df)¶

imspy.timstof.dbsearch.utility.write_psms_binary(byte_array, folder_path, file_name, total=False)¶

Write PSMs to binary file.

Parameters:

byte_array – Byte array
folder_path (str) – Folder path
file_name (str) – File name
total (bool) – Whether to write to total folder
