proteometer.rollup
Attributes
Functions
| rollup_to_site | Roll up peptide-level data to site-level data. |
Module Contents
- proteometer.rollup.rollup_to_site(df_ori: pandas.DataFrame, int_cols: list[str], uniprot_col: str, peptide_col: str, residue_col: str, residue_sep: str = ';', id_col: str = 'id', id_separator: str = '@', site_col: str = 'Site', multiply_rollup_counts: bool = True, ignore_NA: bool = True, rollup_func: Literal['median', 'mean', 'sum'] = 'sum') -> pandas.DataFrame
Roll up peptide-level data to site-level data.
- Parameters:
df_ori (pd.DataFrame) – Original DataFrame containing peptide data.
int_cols (list[str]) – List of column names with intensity values to roll up.
uniprot_col (str) – Column name for UniProt identifiers.
peptide_col (str) – Column name for peptides.
residue_col (str) – Column name for residues.
residue_sep (str, optional) – Separator for residues in the residue column. Defaults to “;”.
id_col (str, optional) – Column name for generated IDs. Defaults to “id”.
id_separator (str, optional) – Separator for ID components. Defaults to “@”.
site_col (str, optional) – Column name for site information. Defaults to “Site”.
multiply_rollup_counts (bool, optional) – Whether to multiply rollup counts by the number of observations. Defaults to True.
ignore_NA (bool, optional) – Whether to ignore NA values during rollup. Defaults to True.
rollup_func (Literal["median", "mean", "sum"], optional) – Aggregation function to use. Defaults to “sum”.
- Returns:
DataFrame with rolled-up site-level data.
- Return type:
pd.DataFrame
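
To make the idea of a site-level rollup concrete, the sketch below reproduces the general pattern with plain pandas. It is not the implementation of `rollup_to_site` and does not call it. The table, its column names (`uniprot`, `peptide`, `residues`, `sample_1`, `sample_2`), the ID layout `<uniprot>@<residue>`, and the choice to count a multi-site peptide toward each of its sites are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical peptide-level table; names and values are invented for illustration.
peptides = pd.DataFrame(
    {
        "uniprot": ["P12345", "P12345", "P12345", "Q67890"],
        "peptide": ["AAS*PK", "S*PKLT*R", "LT*R", "GGY*K"],
        "residues": ["S15", "S15;T19", "T19", "Y42"],
        "sample_1": [10.0, 4.0, 6.0, 3.0],
        "sample_2": [20.0, None, 8.0, 5.0],
    }
)
int_cols = ["sample_1", "sample_2"]

# Split the residue column on the separator (";" here, mirroring residue_sep)
# so that each row describes one (peptide, site) pair.
sites = peptides.assign(
    residues=peptides["residues"].str.split(";")
).explode("residues", ignore_index=True)

# Assumed ID layout: UniProt accession and residue joined by "@"
# (mirroring the default id_separator).
sites["id"] = sites["uniprot"] + "@" + sites["residues"]

# Aggregate intensities per site with a sum (mirroring rollup_func="sum").
# pandas skips NA values in sum() by default.
site_level = sites.groupby("id")[int_cols].sum()
print(site_level)
```

With the library itself, the equivalent call would presumably pass the same table as `df_ori`, the intensity columns as `int_cols`, and the three column names as `uniprot_col`, `peptide_col`, and `residue_col`. The exact output columns and the effect of `multiply_rollup_counts` should be checked against the source.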