proteometer.ptm

Functions

get_ptm_pos_in_pept(→ list[int])

Get the positions of PTM labels in a peptide.

get_yst(→ list[tuple[int, str]])

Get YST positions in a peptide.

get_phosphositeplus_pos(→ list[int])

Extracts numeric positions from a string of modified residues.

combine_multi_ptms(→ pandas.DataFrame)

Combines multiple proteomics dataframes into a single dataframe.

Module Contents

proteometer.ptm.get_ptm_pos_in_pept(peptide: str, ptm_label: str = '*', special_chars: str = '.]+-=@_!#$%^&*()<>?/\\|}{~:[') → list[int]

Get the positions of PTM labels in a peptide.

This function processes a peptide string to find the positions of post-translational modification (PTM) labels. It accounts for special characters and returns a list of positions adjusted to the stripped peptide sequence. Positions are 0-indexed from the start of the peptide.

Parameters:
  • peptide (str) – The peptide string potentially containing PTM labels.

  • ptm_label (str, optional) – The label representing PTM. Defaults to ‘*’.

  • special_chars (str, optional) – A string of special characters that might need escaping in regex operations. Defaults to common special characters.

Returns:

A sorted list of integer positions at which the PTM labels occur, expressed relative to the stripped peptide sequence.

Return type:

list[int]
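The position adjustment described above can be illustrated with a minimal sketch. This is not the library's implementation: `sketch_ptm_positions` is a hypothetical helper, and it assumes that each label immediately follows the residue it modifies and that the reported position is the 0-based index of that residue in the peptide with all labels removed.

```python
import re


def sketch_ptm_positions(peptide: str, ptm_label: str = "*") -> list[int]:
    """Hypothetical illustration; assumes each label follows its modified residue."""
    # re.escape lets a label such as '*' be matched literally in a regex.
    starts = [m.start() for m in re.finditer(re.escape(ptm_label), peptide)]
    # Every earlier label shifts later indices by len(ptm_label); remove that
    # shift, then step back one character to land on the modified residue.
    return sorted(start - i * len(ptm_label) - 1 for i, start in enumerate(starts))


print(sketch_ptm_positions("AS*KT*R"))  # stripped sequence is "ASKTR"
```

Under these assumptions the two labels map to the S and T residues of the stripped sequence "ASKTR". Escaping the label is what makes regex metacharacters such as '*' safe to search for.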

proteometer.ptm.get_yst(strip_pept: str, ptm_aa: str = 'YSTyst') → list[tuple[int, str]]

Get YST positions in a peptide.

This function takes a stripped peptide sequence and finds the positions of Y, S, and T residues.

Parameters:
  • strip_pept (str) – The stripped peptide sequence.

  • ptm_aa (str, optional) – The residue letters to search for. Defaults to ‘YSTyst’.

Returns:

A list of tuples where the first element is the position of the residue in the stripped peptide and the second element is the matching residue letter.

Return type:

list[tuple[int, str]]
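As a rough illustration of the return shape, the following sketch scans a stripped peptide for the letters in `ptm_aa`. The helper name `sketch_get_yst` is hypothetical, and 0-based positions are an assumption; the documentation above does not state the indexing convention for this function.

```python
def sketch_get_yst(strip_pept: str, ptm_aa: str = "YSTyst") -> list[tuple[int, str]]:
    """Hypothetical illustration; positions are assumed to be 0-based."""
    # Keep each (index, letter) pair whose letter is one of the target residues.
    return [(i, aa) for i, aa in enumerate(strip_pept) if aa in ptm_aa]


print(sketch_get_yst("AYSTK"))
```

Because the default `ptm_aa` includes both upper- and lower-case letters, a lower-case `s`, `t`, or `y` in the input would also be reported, with its case preserved.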

proteometer.ptm.get_phosphositeplus_pos(mod_rsd: str) → list[int]

Extracts numeric positions from a string of modified residues.

Parameters:

mod_rsd (str) – A string of modified residues.

Returns:

A list of numeric positions extracted from the input string.

Return type:

list[int]
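One plausible way to extract numeric positions is to collect every run of digits in the input. The sketch below is an assumption, not the library's code: `sketch_site_positions` is a hypothetical helper, and the example strings (a residue letter, a position, and a modification suffix, separated by `;`) are an assumed input format.

```python
import re


def sketch_site_positions(mod_rsd: str) -> list[int]:
    """Hypothetical illustration; treats every digit run as a position."""
    # \d+ matches each maximal run of digits; convert each run to int.
    return [int(num) for num in re.findall(r"\d+", mod_rsd)]


print(sketch_site_positions("S15-p;T308-p"))
```

This approach returns positions in the order they appear and would also pick up digits that are not site positions, so it only suits inputs where the positions are the sole numbers present.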

proteometer.ptm.combine_multi_ptms(multi_proteomics: dict[str, pandas.DataFrame], par: proteometer.params.Params) → pandas.DataFrame

Combines multiple proteomics dataframes into a single dataframe.

This function processes and combines different types of proteomics data into a unified dataframe. It distinguishes between global proteomics data and post-translational modifications (PTM) data, assigning specific labels and counting site numbers accordingly.

Parameters:
  • multi_proteomics (dict[str, pd.DataFrame]) – Dictionary of proteomics dataframes with keys indicating the type of proteomics (‘global’ or PTM types).

  • par (Params) – Configuration parameters containing column names and PTM details.

Returns:

A combined dataframe containing all the input proteomics data, labeled and processed as per the specified parameters.

Return type:

pd.DataFrame