proteometer.abundance
Functions
| get_prot_abund_scalars | Return a dictionary of protein abundance scalars for the given pairwise t-test. |
| prot_abund_correction | Perform protein abundance correction based on the provided parameters. |
| prot_abund_correction_sig_only | Adjusts peptide abundance values based on protein abundance scalars for significant pairwise t-test groups. |
| prot_abund_correction_matched | Correct the peptide abundance data using the protein abundance values. |
| global_prot_normalization_and_stats | Perform global protein normalization and statistical analysis. |
Module Contents
- proteometer.abundance.get_prot_abund_scalars(prot: pandas.DataFrame, pairwise_ttest_name: str, sig_type: str = 'pval', sig_thr: float = 0.05) → dict[str, float]
Return a dictionary of protein abundance scalars for the given pairwise t-test.
- Parameters:
prot (pd.DataFrame) – DataFrame containing protein-level data.
pairwise_ttest_name (str) – Name of the pairwise t-test.
sig_type (str, optional) – Type of significance metric to use for filtering. Defaults to “pval”.
sig_thr (float, optional) – Threshold for significance filtering. Defaults to 0.05.
- Returns:
Dictionary of protein abundance scalars.
- Return type:
dict[str, float]
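The significance filtering controlled by sig_type and sig_thr can be pictured with a minimal sketch. This is not the library's implementation: the row layout, the field names uniprot, log2fc, and pval, the example identifiers, and the choice to use a log2 fold change as the scalar (with 0.0 for non-significant proteins) are all assumptions made for illustration.

```python
def sketch_abund_scalars(rows, sig_type="pval", sig_thr=0.05):
    """Hypothetical stand-in: map each protein ID to an abundance scalar.

    Proteins whose significance value is below the threshold keep their
    (assumed) log2 fold change; all others get 0.0, i.e. no correction.
    """
    scalars = {}
    for row in rows:
        significant = row[sig_type] < sig_thr
        scalars[row["uniprot"]] = row["log2fc"] if significant else 0.0
    return scalars


# Made-up protein-level rows for one pairwise comparison.
rows = [
    {"uniprot": "P12345", "log2fc": 1.5, "pval": 0.01},
    {"uniprot": "Q67890", "log2fc": -0.8, "pval": 0.20},
]
scalars = sketch_abund_scalars(rows)
```

In this sketch only the first protein passes the default threshold of 0.05, so only its scalar would be applied downstream.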
- proteometer.abundance.prot_abund_correction(pept: pandas.DataFrame, prot: pandas.DataFrame, par: proteometer.params.Params, columns_to_correct: collections.abc.Iterable[str] | None = None, pairwise_ttest_groups: collections.abc.Iterable[proteometer.stats.TTestGroup] | None = None, non_tt_cols: collections.abc.Iterable[str] | None = None) → pandas.DataFrame
Perform protein abundance correction based on the provided parameters.
This function applies either paired or unpaired sample abundance correction, depending on the abundance_correction_paired_samples attribute of the par parameter.
- Parameters:
pept (pd.DataFrame) – A DataFrame containing peptide-level data.
prot (pd.DataFrame) – A DataFrame containing protein-level data.
par (Params) – A parameter object containing configuration for abundance correction.
columns_to_correct (Iterable[str] | None, optional) – Columns to correct for paired sample abundance correction. Required if par.abundance_correction_paired_samples is True.
pairwise_ttest_groups (Iterable[stats.TTestGroup] | None, optional) – Groups for pairwise t-tests in unpaired sample abundance correction. Required if par.abundance_correction_paired_samples is False.
non_tt_cols (Iterable[str] | None, optional) – Columns that should not be included in the t-test correction.
- Returns:
A DataFrame with corrected protein abundances.
- Return type:
pd.DataFrame
- Raises:
ValueError – If columns_to_correct is not provided for paired sample correction.
ValueError – If pairwise_ttest_groups is not provided for unpaired sample correction.
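The paired/unpaired dispatch and the two ValueError conditions documented above can be sketched as follows. SketchParams and choose_correction are hypothetical names introduced only for this illustration. The real Params class has a different definition, and the real function returns a corrected DataFrame rather than a label.

```python
from dataclasses import dataclass


@dataclass
class SketchParams:
    """Hypothetical stand-in for proteometer.params.Params."""
    abundance_correction_paired_samples: bool


def choose_correction(par, columns_to_correct=None, pairwise_ttest_groups=None):
    """Return which correction path would run, or raise ValueError.

    Mirrors only the documented argument checks, not the correction itself.
    """
    if par.abundance_correction_paired_samples:
        if columns_to_correct is None:
            raise ValueError("columns_to_correct is required for paired correction")
        return "paired"
    if pairwise_ttest_groups is None:
        raise ValueError("pairwise_ttest_groups is required for unpaired correction")
    return "unpaired"


# Paired mode needs the columns; unpaired mode needs the t-test groups.
paired_mode = choose_correction(SketchParams(True), columns_to_correct=["s1", "s2"])
unpaired_mode = choose_correction(SketchParams(False), pairwise_ttest_groups=["grp"])
```

Calling the sketch in paired mode without columns_to_correct, or in unpaired mode without pairwise_ttest_groups, raises ValueError, matching the documented behavior.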
- proteometer.abundance.prot_abund_correction_sig_only(pept: pandas.DataFrame, prot: pandas.DataFrame, pairwise_ttest_groups: collections.abc.Iterable[proteometer.stats.TTestGroup], uniprot_col: str, sig_type: str = 'pval', sig_thr: float = 0.05) → pandas.DataFrame
Adjusts peptide abundance values based on protein abundance scalars for significant pairwise t-test groups.
This function iterates over a collection of pairwise t-test groups, computes or retrieves protein abundance scalars, and applies these scalars to adjust the peptide abundance values for the specified treatment samples.
- Parameters:
pept (pd.DataFrame) – DataFrame containing peptide-level data. Must include a column corresponding to uniprot_col for mapping protein identifiers.
prot (pd.DataFrame) – DataFrame containing protein-level data. Must include columns for protein abundance scalars or data required to compute them.
pairwise_ttest_groups (Iterable[stats.TTestGroup]) – An iterable of TTestGroup objects, each representing a pairwise t-test group with associated metadata (e.g., labels and treatment samples).
uniprot_col (str) – Column name in pept that contains UniProt identifiers for mapping to protein abundance data.
sig_type (str, optional) – Type of significance metric to use for filtering (e.g., “pval” for p-value or “adj-p” for adjusted p-value). Defaults to “pval”.
sig_thr (float, optional) – Threshold for significance filtering. Only proteins meeting this threshold will have their abundance scalars applied. Defaults to 0.05.
- Returns:
Updated pept DataFrame with adjusted abundance values for treatment samples and additional columns for protein abundance scalars.
- Return type:
pd.DataFrame
- proteometer.abundance.prot_abund_correction_matched(pept: pandas.DataFrame, prot: pandas.DataFrame, columns_to_correct: collections.abc.Iterable[str], uniprot_col: str, non_tt_cols: collections.abc.Iterable[str] | None = None) → pandas.DataFrame
Correct the peptide abundance data using the protein abundance values.
This function takes the peptide data and corrects the intensity values for each peptide using the protein abundance values from the protein data. The correction is only applied to the treatment samples.
- Parameters:
pept (pd.DataFrame) – A DataFrame containing peptide-level data.
prot (pd.DataFrame) – A DataFrame containing protein-level data.
columns_to_correct (Iterable[str]) – Columns to correct for protein abundance changes. Must be shared by pept and prot.
uniprot_col (str) – Column name for the UniProt ID in both pept and prot.
non_tt_cols (Iterable[str] | None, optional) – Columns that should not be included in the abundance correction. Must be shared by pept and prot.
- Returns:
Updated pept DataFrame with adjusted abundance values for treatment samples and additional columns for protein abundance scalars.
- Return type:
pd.DataFrame
- proteometer.abundance.global_prot_normalization_and_stats(global_prot: pandas.DataFrame, int_cols: list[str], anova_cols: list[str], pairwise_ttest_groups: collections.abc.Iterable[proteometer.stats.TTestGroup], metadata: pandas.DataFrame, par: proteometer.params.Params) → pandas.DataFrame
Perform global protein normalization and statistical analysis.
This function applies normalization and statistical tests to global proteomics data. It handles both median normalization and batch correction, depending on the parameters provided in the par object. It also performs ANOVA and pairwise t-tests.
- Parameters:
global_prot (pd.DataFrame) – DataFrame containing global protein-level data.
int_cols (list[str]) – List of column names representing intensity data to normalize.
anova_cols (list[str]) – List of column names for ANOVA analysis.
pairwise_ttest_groups (Iterable[stats.TTestGroup]) – Iterable of TTestGroup objects for performing pairwise t-tests (each defines a control-treatment pair).
metadata (pd.DataFrame) – DataFrame containing metadata for batch correction and ANOVA analysis.
par (Params) – Parameter object containing configuration for normalization and statistical analysis.
- Returns:
The normalized and statistically analyzed global protein data.
- Return type:
pd.DataFrame
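As a rough illustration of the median normalization step mentioned above, the sketch below shifts each sample column so that all columns share a common median. It assumes log-transformed intensities and uses plain Python lists in place of DataFrame columns. The function name median_normalize and the sample names are hypothetical, and the library's own normalization may differ in detail.

```python
from statistics import median


def median_normalize(columns):
    """Shift each sample so every column median equals the overall target.

    columns: dict mapping sample name -> list of (assumed log-scale) intensities.
    The target is the median of the per-column medians.
    """
    col_medians = {name: median(vals) for name, vals in columns.items()}
    target = median(col_medians.values())
    return {
        name: [v - col_medians[name] + target for v in vals]
        for name, vals in columns.items()
    }


# Three made-up samples whose medians are 12, 13, and 14.
columns = {
    "s1": [10.0, 12.0, 14.0],
    "s2": [11.0, 13.0, 15.0],
    "s3": [12.0, 14.0, 16.0],
}
normalized = median_normalize(columns)
```

After normalization every sample in this example has the same median, so remaining differences between columns reflect per-protein changes rather than overall loading differences.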