imspy.simulation.timsim.jobs package
Submodules
imspy.simulation.timsim.jobs.add_noise_from_real_data module
- imspy.simulation.timsim.jobs.add_noise_from_real_data.add_real_data_noise_to_frames(acquisition_builder, frames, intensity_max_precursor=30, intensity_max_fragment=30, precursor_sample_fraction=0.2, fragment_sample_fraction=0.2, num_precursor_frames=5, num_fragment_frames=5)
Add noise sampled from real data to frames.
- Parameters:
acquisition_builder (TimsTofAcquisitionBuilderDIA) – Acquisition builder.
frames (List[TimsFrame]) – Frames.
intensity_max_precursor (float) – Maximum intensity for precursor.
intensity_max_fragment (float) – Maximum intensity for fragment.
precursor_sample_fraction (float) – Sample fraction for precursor.
fragment_sample_fraction (float) – Sample fraction for fragment.
num_precursor_frames (int) – Number of precursor frames.
num_fragment_frames (int) – Number of fragment frames.
- Returns:
Frames.
- Return type:
List[TimsFrame]
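The intensity cap and sample fraction parameters can be pictured with a small stand-alone sketch. It assumes that noise peaks are the real-data peaks at or below the intensity cap and that a random fraction of them is kept; the actual imspy implementation may differ. The helper `sample_noise_peaks` and the `(mz, intensity)` tuple layout are hypothetical and not part of imspy.

```python
import random

def sample_noise_peaks(real_peaks, intensity_max, sample_fraction, seed=0):
    """Keep peaks at or below intensity_max, then draw a random fraction of them.

    Assumption: this mirrors the role of intensity_max_* and *_sample_fraction;
    it is not taken from the imspy source.
    """
    rng = random.Random(seed)
    candidates = [peak for peak in real_peaks if peak[1] <= intensity_max]
    k = round(len(candidates) * sample_fraction)
    return rng.sample(candidates, k)

# Toy "real" frame: 100 peaks as (mz, intensity) with intensities 0..99
real_peaks = [(400.0 + i, float(i)) for i in range(100)]
noise = sample_noise_peaks(real_peaks, intensity_max=30, sample_fraction=0.2)
```

With the defaults listed above (cap 30, fraction 0.2), 31 of the toy peaks qualify and 6 are sampled.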
imspy.simulation.timsim.jobs.assemble_frames module
- imspy.simulation.timsim.jobs.assemble_frames.assemble_frames(acquisition_builder, frames, batch_size, verbose=False, mz_noise_precursor=False, mz_noise_uniform=False, precursor_noise_ppm=5.0, mz_noise_fragment=False, fragment_noise_ppm=5.0, num_threads=4, add_real_data_noise=False, intensity_max_precursor=150, intensity_max_fragment=75, precursor_sample_fraction=0.01, fragment_sample_fraction=0.05, num_precursor_frames=10, num_fragment_frames=10, fragment=True)
Assemble frames from frame ids and write them to the database.
- Parameters:
acquisition_builder (TimsTofAcquisitionBuilderDIA) – Acquisition builder object.
frames (DataFrame) – DataFrame containing frame ids.
batch_size (int) – Batch size for frame assembly, i.e. how many frames are assembled at once.
verbose (bool) – Verbosity.
mz_noise_precursor (bool) – Add noise to precursor m/z values.
mz_noise_uniform (bool) – Add uniform noise to m/z values.
precursor_noise_ppm (float) – PPM value for precursor noise.
mz_noise_fragment (bool) – Add noise to fragment m/z values.
fragment_noise_ppm (float) – PPM value for fragment noise.
num_threads (int) – Number of threads for frame assembly.
add_real_data_noise (bool) – Add real data noise to the frames.
intensity_max_precursor (float) – Maximum intensity for precursor noise.
intensity_max_fragment (float) – Maximum intensity for fragment noise.
precursor_sample_fraction (float) – Sample fraction for precursor noise.
fragment_sample_fraction (float) – Sample fraction for fragment noise.
num_precursor_frames (int) – Number of precursor frames.
num_fragment_frames (int) – Number of fragment frames.
fragment (bool) – If False, quadrupole isolation will still be used, but no fragmentation will be performed.
- Return type:
None
- Returns:
None, writes frames to disk and metadata to database.
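To make the ppm noise parameters concrete, the following sketch converts a ppm tolerance into an absolute m/z offset and perturbs a few m/z values. The helpers `ppm_to_da` and `add_mz_noise` are hypothetical, and treating the ppm value as three standard deviations in the Gaussian branch is an assumption made for illustration only; the document does not say how imspy scales the noise.

```python
import random

def ppm_to_da(mz, ppm):
    """Convert a relative ppm tolerance into an absolute m/z offset."""
    return mz * ppm / 1e6

def add_mz_noise(mz_values, ppm, uniform=False, seed=0):
    """Shift each m/z value by random noise scaled to the given ppm."""
    rng = random.Random(seed)
    noisy = []
    for mz in mz_values:
        tol = ppm_to_da(mz, ppm)
        if uniform:
            # Uniform noise: every shift lies within +/- tol
            shift = rng.uniform(-tol, tol)
        else:
            # Assumption: treat the ppm value as three standard deviations
            shift = rng.gauss(0.0, tol / 3.0)
        noisy.append(mz + shift)
    return noisy

precursor_mz = [500.0, 1000.0, 1500.0]
noisy_uniform = add_mz_noise(precursor_mz, ppm=5.0, uniform=True)
noisy_gauss = add_mz_noise(precursor_mz, ppm=5.0)
```

At the default of 5 ppm, an ion at m/z 1000 moves by at most 0.005 in the uniform case.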
imspy.simulation.timsim.jobs.build_acquisition module
- imspy.simulation.timsim.jobs.build_acquisition.build_acquisition(path, reference_path, exp_name, acquisition_type='dia', verbose=False, gradient_length=None, use_reference_ds_layout=True, reference_in_memory=True, round_collision_energy=True, collision_energy_decimals=0, use_bruker_sdk=True)
Build acquisition object from reference path.
- Parameters:
path (str) – Path where the acquisition will be saved.
reference_path (str) – Path to the reference dataset.
exp_name (str) – Experiment name.
acquisition_type (str) – Acquisition type, must be 'dia', 'midia', 'slice' or 'synchro'.
verbose (bool) – Verbosity.
gradient_length (float) – Gradient length.
use_reference_ds_layout (bool) – Use reference dataset layout for synthetic dataset.
reference_in_memory (bool) – Load reference dataset into memory.
round_collision_energy (bool) – Round collision energy.
collision_energy_decimals (int) – Number of decimals for collision energy (controls coarseness).
use_bruker_sdk (bool) – Use Bruker SDK for reading reference dataset.
- Returns:
Acquisition object.
- Return type:
imspy.simulation.timsim.jobs.digest_fasta module
- imspy.simulation.timsim.jobs.digest_fasta.digest_fasta(fasta_file_path, missed_cleavages=2, min_len=6, max_len=30, cleave_at='KR', restrict=None, decoys=False, verbose=False, job_name='digest_fasta', static_mods={'C': '[UNIMOD:4]'}, variable_mods={'M': ['[UNIMOD:35]'], '[': ['[UNIMOD:1]']}, exclude_accumulated_gradient_start=True, min_rt_percent=2.0, gradient_length=3600)
Digest a fasta file.
- Parameters:
fasta_file_path (str) – Path to the fasta file.
missed_cleavages (int) – Number of missed cleavages.
min_len (int) – Minimum peptide length.
max_len (int) – Maximum peptide length.
cleave_at (str) – Cleavage sites.
restrict (str) – Restrict to specific proteins.
decoys (bool) – Generate decoys.
verbose (bool) – Verbosity.
job_name (str) – Job name.
static_mods (dict[str, str]) – Static modifications.
variable_mods (dict[str, list[str]]) – Variable modifications.
exclude_accumulated_gradient_start (bool) – Exclude low retention times.
min_rt_percent (float) – Minimum retention time in percent.
gradient_length (float) – Gradient length in seconds.
- Returns:
Peptide digest object.
- Return type:
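How missed_cleavages, min_len, max_len and cleave_at interact in digest_fasta can be illustrated with a minimal in-silico digestion. This is a simplified sketch, not the routine imspy uses: the function `digest` is hypothetical, it cuts after every residue in cleave_at, and it ignores restrict, decoys and modifications.

```python
def digest(protein, missed_cleavages=2, min_len=6, max_len=30, cleave_at="KR"):
    """Enumerate peptides obtained by cutting after each residue in cleave_at."""
    # Cut positions directly after each cleavage residue (a cut at the end is implicit)
    sites = [i + 1 for i, aa in enumerate(protein) if aa in cleave_at]
    bounds = [0] + [s for s in sites if s < len(protein)] + [len(protein)]
    peptides = []
    for i in range(len(bounds) - 1):
        # Join up to missed_cleavages + 1 consecutive fragments
        for j in range(i + 1, min(i + 2 + missed_cleavages, len(bounds))):
            peptide = protein[bounds[i]:bounds[j]]
            if min_len <= len(peptide) <= max_len:
                peptides.append(peptide)
    return peptides

# Toy sequence with one K and one R cleavage site
fully_cleaved = digest("AAAKGGGRCCC", missed_cleavages=0, min_len=1)
one_missed = digest("AAAKGGGRCCC", missed_cleavages=1, min_len=1)
```

Allowing one missed cleavage adds the joined peptides AAAKGGGR and GGGRCCC to the three fully cleaved ones.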
imspy.simulation.timsim.jobs.simulate_charge_states module
- imspy.simulation.timsim.jobs.simulate_charge_states.simulate_charge_states(peptides, mz_lower, mz_upper, p_charge=0.5, max_charge=4, charge_state_one_probability=0.0, min_charge_contrib=0.15, use_binomial=False)
Simulate charge states for peptides.
- Parameters:
peptides (DataFrame) – Peptides DataFrame.
mz_lower (float) – Lower m/z value.
mz_upper (float) – Upper m/z value.
p_charge (float) – Probability of charge.
max_charge (int) – Maximum charge that will be simulated (should default to 4 since IMS simulations are limited to 4).
charge_state_one_probability (float) – Probability of charge state one.
min_charge_contrib (float) – Minimum charge contribution.
use_binomial (bool) – Use binomial distribution model, otherwise use deep learning model.
- Returns:
Ions DataFrame.
- Return type:
pd.DataFrame
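The sketch below shows one way a binomial charge model could combine with the m/z window and the minimum contribution filter. It is a simplified stand-in: modelling the charge as one plus a Binomial(max_charge - 1, p_charge) draw is an assumption, not the imspy model, and both helper functions are hypothetical. The m/z formula itself is the standard (M + z * m_proton) / z.

```python
from math import comb

PROTON_MASS = 1.007276466  # Da

def binomial_charge_probs(max_charge=4, p_charge=0.5):
    """Assumed model: charge = 1 + Binomial(max_charge - 1, p_charge)."""
    n = max_charge - 1
    return {
        k + 1: comb(n, k) * p_charge**k * (1 - p_charge) ** (n - k)
        for k in range(n + 1)
    }

def charge_states(mass, mz_lower, mz_upper, p_charge=0.5, max_charge=4,
                  min_charge_contrib=0.15):
    """Return (charge, mz, probability) for charge states that pass both filters."""
    ions = []
    for z, prob in binomial_charge_probs(max_charge, p_charge).items():
        mz = (mass + z * PROTON_MASS) / z
        if prob >= min_charge_contrib and mz_lower <= mz <= mz_upper:
            ions.append((z, mz, prob))
    return ions

# Peptide with a monoisotopic mass of 1500 Da in a 100-1700 m/z window
ions = charge_states(1500.0, mz_lower=100.0, mz_upper=1700.0)
```

Under these assumptions, charges 1 and 4 each receive probability 0.125 and fall below the 0.15 threshold, so only charges 2 and 3 remain.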
imspy.simulation.timsim.jobs.simulate_fragment_intensities module
- imspy.simulation.timsim.jobs.simulate_fragment_intensities.simulate_fragment_intensities(path, name, acquisition_builder, batch_size, verbose, num_threads, down_sample_factor=0.5)
Simulate fragment ion intensity distributions.
- Parameters:
path (str) – Path to the synthetic data.
name (str) – Name of the synthetic data.
acquisition_builder (TimsTofAcquisitionBuilderDIA) – Acquisition builder object.
batch_size (int) – Batch size for frame assembly, i.e. how many frames are assembled at once.
verbose (bool) – Verbosity.
num_threads (int) – Number of threads for frame assembly.
down_sample_factor (int) – Down sample factor for fragment ion intensity distributions.
- Return type:
None
- Returns:
None, writes frames to disk and metadata to database.
imspy.simulation.timsim.jobs.simulate_frame_distributions module
- imspy.simulation.timsim.jobs.simulate_frame_distributions.simulate_frame_distributions(peptides, frames, z_score, std_rt, rt_cycle_length, verbose=False, add_noise=False, normalize=False)
Simulate frame distributions for peptides.
- Parameters:
peptides (DataFrame) – Peptide DataFrame.
frames (DataFrame) – Frame DataFrame.
z_score (float) – Z-score.
std_rt (float) – Standard deviation of retention time.
rt_cycle_length (float) – Retention time cycle length in seconds.
verbose (bool) – Verbosity.
add_noise (bool) – Add noise.
normalize (bool) – Normalize frame abundance.
- Returns:
Peptide DataFrame with frame distributions.
- Return type:
pd.DataFrame
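A Gaussian elution profile spread over frames can be sketched as follows. The sketch assumes that z_score limits the frames to those within z_score standard deviations of the peptide's retention time, and that each frame receives the probability mass of one acquisition cycle starting at its time stamp. Both points are assumptions about the model, and `frame_abundances` is a hypothetical helper.

```python
from statistics import NormalDist

def frame_abundances(rt_mean, std_rt, z_score, frame_times, rt_cycle_length,
                     normalize=False):
    """Map frame id -> share of a Gaussian elution peak falling into that frame."""
    dist = NormalDist(mu=rt_mean, sigma=std_rt)
    lower = rt_mean - z_score * std_rt
    upper = rt_mean + z_score * std_rt
    abundances = {}
    for frame_id, time in frame_times.items():
        if lower <= time <= upper:
            # Probability mass between this frame's start and the next cycle
            abundances[frame_id] = dist.cdf(time + rt_cycle_length) - dist.cdf(time)
    if normalize and abundances:
        total = sum(abundances.values())
        abundances = {f: a / total for f, a in abundances.items()}
    return abundances

# Frames recorded once per second; peptide elutes at 100 s with a 2 s std
frame_times = {i: float(i) for i in range(1, 201)}
abund = frame_abundances(100.0, 2.0, 3.0, frame_times, rt_cycle_length=1.0)
abund_norm = frame_abundances(100.0, 2.0, 3.0, frame_times, 1.0, normalize=True)
```

With a z-score of 3 the peptide occupies frames 94 to 106, which together capture a little under the full peak area unless normalize is set.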
imspy.simulation.timsim.jobs.simulate_frame_distributions_emg module
- imspy.simulation.timsim.jobs.simulate_frame_distributions_emg.sample_parameters_rejection(sigma_mean, sigma_variance, lambda_mean, lambda_variance, n)
- imspy.simulation.timsim.jobs.simulate_frame_distributions_emg.simulate_frame_distributions_emg(peptides, frames, mean_std_rt, variance_std_rt, mean_scewness, variance_scewness, target_p, step_size, rt_cycle_length, verbose=False, add_noise=False, n_steps=1000, num_threads=4, from_existing=False, sigmas=None, lambdas=None)
Simulate frame distributions for peptides.
- Parameters:
peptides (DataFrame) – Peptide DataFrame.
frames (DataFrame) – Frame DataFrame.
mean_std_rt (float) – Mean retention time.
variance_std_rt (float) – Variance of retention time.
mean_scewness (float) – Mean skewness.
variance_scewness (float) – Variance of skewness.
target_p (float) – Target p.
step_size (float) – Step size.
rt_cycle_length (float) – Retention time cycle length in seconds.
verbose (bool) – Verbosity.
add_noise (bool) – Add noise.
normalize – Normalize frame abundance.
n_steps (int) – Number of steps.
num_threads (int) – Number of threads.
from_existing (bool) – Use existing parameters.
sigmas (ndarray) – Sigmas.
lambdas (ndarray) – Lambdas.
- Returns:
Peptide DataFrame with frame distributions.
- Return type:
pd.DataFrame
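The EMG variant can be pictured with two small pieces: rejection sampling of positive peak parameters and the exponentially modified Gaussian density. The sketch assumes that sigma and lambda are drawn from normal distributions and that non-positive draws are rejected, which is only a guess at what sample_parameters_rejection does. Both helpers are hypothetical; the density is the standard EMG formula with rate parameter lambda.

```python
import math
import random

def sample_positive_normal(mean, variance, n, rng):
    """Draw n normal samples, rejecting non-positive values (assumed behaviour)."""
    if variance == 0 and mean <= 0:
        raise ValueError("no positive sample can ever be drawn")
    std = math.sqrt(variance)
    samples = []
    while len(samples) < n:
        x = rng.gauss(mean, std)
        if x > 0:  # reject non-physical widths or rates
            samples.append(x)
    return samples

def emg_pdf(x, mu, sigma, lam):
    """Exponentially modified Gaussian density with rate lam."""
    arg = (mu + lam * sigma**2 - x) / (math.sqrt(2) * sigma)
    return 0.5 * lam * math.exp(0.5 * lam * (2 * mu + lam * sigma**2 - 2 * x)) * math.erfc(arg)

rng = random.Random(0)
sigmas = sample_positive_normal(1.5, 0.25, 50, rng)
lambdas = sample_positive_normal(1.0, 0.1, 50, rng)

# Riemann sum over [-10, 30] as a sanity check that the density integrates to ~1
step = 0.01
area = sum(emg_pdf(-10 + step * i, 0.0, 1.0, 1.0) * step for i in range(4001))
```

The exponential factor can overflow for x far below mu, so a production implementation would typically use a numerically stabilised form.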
imspy.simulation.timsim.jobs.simulate_ion_mobilities module
- imspy.simulation.timsim.jobs.simulate_ion_mobilities.simulate_ion_mobilities(ions, im_lower, im_upper, verbose=False)
Simulate ion mobilities.
- Parameters:
ions (DataFrame) – Ions DataFrame.
im_lower (float) – Lower ion mobility.
im_upper (float) – Upper ion mobility.
verbose (bool) – Verbosity.
- Returns:
Ions DataFrame.
- Return type:
pd.DataFrame
imspy.simulation.timsim.jobs.simulate_occurrences module
- imspy.simulation.timsim.jobs.simulate_occurrences.simulate_peptide_occurrences(peptides, intensity_mean, intensity_min, intensity_max, verbose=False, sample_occurrences=True, intensity_value=1000000.0, mixture_contribution=1.0)
Simulate peptide occurrences.
- Parameters:
peptides (DataFrame) – Peptides DataFrame.
intensity_mean (float) – Intensity mean.
intensity_min (float) – Intensity minimum.
intensity_max (float) – Intensity maximum.
verbose (bool) – Verbosity.
sample_occurrences (bool) – Sample occurrences.
intensity_value (float) – Intensity value.
mixture_contribution (float) – Mixture contribution.
- Returns:
Peptides DataFrame.
- Return type:
pd.DataFrame
imspy.simulation.timsim.jobs.simulate_precursor_spectra module
imspy.simulation.timsim.jobs.simulate_retention_time module
- imspy.simulation.timsim.jobs.simulate_retention_time.simulate_retention_times(peptides, verbose=False, gradient_length=3600)
Simulate retention times.
- Parameters:
peptides (DataFrame) – Peptides DataFrame.
verbose (bool) – Verbosity.
gradient_length (float) – Gradient length in seconds.
- Returns:
Peptides DataFrame.
- Return type:
pd.DataFrame
imspy.simulation.timsim.jobs.simulate_scan_distributions module
- imspy.simulation.timsim.jobs.simulate_scan_distributions.simulate_scan_distributions(ions, scans, z_score, mean_std_im=0.01, variance_std_im=0.0, verbose=False, add_noise=False, normalize=False, from_existing=False, std_means=None)
Simulate scan distributions for ions.
- Parameters:
ions (DataFrame) – Ions DataFrame.
scans (DataFrame) – Scan DataFrame.
z_score (float) – Z-score.
mean_std_im (float) – Standard deviation of ion mobility.
variance_std_im (float) – Variance of standard deviation of ion mobility.
verbose (bool) – Verbosity.
add_noise (bool) – Add noise.
normalize (bool) – Normalize scan abundance.
from_existing (bool) – Use existing parameters.
std_means (ndarray[Any, dtype[TypeVar(_ScalarType_co, bound=generic, covariant=True)]]) – Standard deviations.
- Returns:
Ions DataFrame with scan distributions.
- Return type:
pd.DataFrame
imspy.simulation.timsim.jobs.utility module
- imspy.simulation.timsim.jobs.utility.check_path(p)
- Return type:
str
- imspy.simulation.timsim.jobs.utility.phosphorylation_sizes(sequence)
Checks if a sequence contains potential phosphorylation sites (S, T, or Y), and returns the count of sites and their indices.
- Parameters:
sequence (str) – The input sequence string, e.g., “IC[UNIMOD:4]RQHTK”.
- Returns:
- A tuple containing:
int: The number of phosphorylation sites.
list: A list of indices where the sites are found.
- Return type:
tuple
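The behaviour described for phosphorylation_sizes can be approximated by a short stand-alone function. This is a sketch rather than the imspy implementation: the name `count_sty_sites` is hypothetical, and reporting indices relative to the raw string (including any bracketed modification tags) is an assumption, since the description does not state the index convention.

```python
def count_sty_sites(sequence):
    """Count S, T and Y residues outside bracketed tags and return their indices."""
    indices = []
    depth = 0
    for i, ch in enumerate(sequence):
        if ch == "[":
            depth += 1  # entering a modification tag such as [UNIMOD:4]
        elif ch == "]":
            depth -= 1
        elif depth == 0 and ch in "STY":
            indices.append(i)  # index in the raw string (assumption)
    return len(indices), indices

result = count_sty_sites("IC[UNIMOD:4]RQHTK")  # → (1, [15])
```

For the example sequence from the description, the only candidate site is the threonine, which sits at position 15 of the raw string.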