proteometer.rollup

Attributes

Functions

rollup_to_site(...) → pandas.DataFrame

Roll up peptide-level data to site-level data.

Module Contents

proteometer.rollup.AggDictFloat
proteometer.rollup.rollup_to_site(df_ori: pandas.DataFrame, int_cols: list[str], uniprot_col: str, peptide_col: str, residue_col: str, residue_sep: str = ';', id_col: str = 'id', id_separator: str = '@', site_col: str = 'Site', multiply_rollup_counts: bool = True, ignore_NA: bool = True, rollup_func: Literal['median', 'mean', 'sum'] = 'sum') → pandas.DataFrame

Roll up peptide-level data to site-level data.

Parameters:
  • df_ori (pd.DataFrame) – Original DataFrame containing peptide data.

  • int_cols (list[str]) – List of column names with intensity values to roll up.

  • uniprot_col (str) – Column name for UniProt identifiers.

  • peptide_col (str) – Column name for peptides.

  • residue_col (str) – Column name for residues.

  • residue_sep (str, optional) – Separator for residues in the residue column. Defaults to “;”.

  • id_col (str, optional) – Column name for generated IDs. Defaults to “id”.

  • id_separator (str, optional) – Separator for ID components. Defaults to “@”.

  • site_col (str, optional) – Column name for site information. Defaults to “Site”.

  • multiply_rollup_counts (bool, optional) – Whether to multiply rollup counts by the number of observations. Defaults to True.

  • ignore_NA (bool, optional) – Whether to ignore NA values during rollup. Defaults to True.

  • rollup_func (Literal["median", "mean", "sum"], optional) – Aggregation function to use. Defaults to “sum”.

Returns:

DataFrame with rolled-up site-level data.

Return type:

pd.DataFrame
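
Example:

The general idea of a peptide-to-site rollup can be sketched with plain pandas. This is an illustrative sketch only, not the implementation of rollup_to_site: the helper name rollup_sketch, the sample column names and values, and the exact ID layout (UniProt accession, separator, residue) are assumptions made for the example, and the multiply_rollup_counts and ignore_NA options are not modelled.

```python
import pandas as pd


def rollup_sketch(
    df: pd.DataFrame,
    int_cols: list[str],
    uniprot_col: str,
    residue_col: str,
    residue_sep: str = ";",
    id_col: str = "id",
    id_separator: str = "@",
    site_col: str = "Site",
    rollup_func: str = "sum",
) -> pd.DataFrame:
    """Hypothetical helper: aggregate peptide rows into one row per site."""
    # Split multi-residue entries so that each row refers to a single site.
    exploded = df.assign(
        **{site_col: df[residue_col].str.split(residue_sep)}
    ).explode(site_col, ignore_index=True)
    # Assumed ID layout: <UniProt accession><separator><residue>.
    exploded[id_col] = exploded[uniprot_col] + id_separator + exploded[site_col]
    # Aggregate the intensity columns for all peptides that cover the same site.
    return exploded.groupby(
        [id_col, uniprot_col, site_col], as_index=False
    )[int_cols].agg(rollup_func)


# Made-up peptide-level data: two peptides of P1 share residue K3.
peptides = pd.DataFrame(
    {
        "uniprot": ["P1", "P1", "P2"],
        "peptide": ["AAK", "AAKR", "CCK"],
        "residue": ["K3", "K3;R4", "K3"],
        "s1": [1.0, 2.0, 5.0],
        "s2": [10.0, 20.0, 50.0],
    }
)

site_df = rollup_sketch(peptides, ["s1", "s2"], "uniprot", "residue")
print(site_df)
```

In this sketch the two P1 peptides that cover K3 are summed into a single P1@K3 row, while P1@R4 and P2@K3 each come from one peptide. The documented rollup_to_site takes the same column-name arguments plus peptide_col, but its output layout and count handling may differ from this sketch.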