proteometer.abundance
=====================

.. py:module:: proteometer.abundance


Functions
---------

.. autoapisummary::

   proteometer.abundance.get_prot_abund_scalars
   proteometer.abundance.prot_abund_correction
   proteometer.abundance.prot_abund_correction_sig_only
   proteometer.abundance.prot_abund_correction_matched
   proteometer.abundance.global_prot_normalization_and_stats


Module Contents
---------------

.. py:function:: get_prot_abund_scalars(prot: pandas.DataFrame, pairwise_ttest_name: str, sig_type: str = 'pval', sig_thr: float = 0.05) -> dict[str, float]

   Return a dictionary of protein abundance scalars for the given pairwise t-test.

   :param prot: DataFrame containing protein-level data.
   :type prot: pd.DataFrame
   :param pairwise_ttest_name: Name of the pairwise t-test.
   :type pairwise_ttest_name: str
   :param sig_type: Type of significance metric to use for filtering.
                    Defaults to "pval".
   :type sig_type: str, optional
   :param sig_thr: Threshold for significance filtering. Defaults to 0.05.
   :type sig_thr: float, optional

   :returns: Dictionary of protein abundance scalars.
   :rtype: dict[str, float]


.. py:function:: prot_abund_correction(pept: pandas.DataFrame, prot: pandas.DataFrame, par: proteometer.params.Params, columns_to_correct: collections.abc.Iterable[str] | None = None, pairwise_ttest_groups: collections.abc.Iterable[proteometer.stats.TTestGroup] | None = None, non_tt_cols: collections.abc.Iterable[str] | None = None) -> pandas.DataFrame

   Perform protein abundance correction based on the provided parameters.

   This function applies either paired or unpaired sample abundance correction
   depending on the `abundance_correction_paired_samples` attribute of the `par` parameter.

   :param pept: A DataFrame containing peptide-level data.
   :type pept: pd.DataFrame
   :param prot: A DataFrame containing protein-level data.
   :type prot: pd.DataFrame
   :param par: A parameter object containing configuration for abundance correction.
   :type par: Params
   :param columns_to_correct: Columns to correct for paired sample abundance correction.
                              Required if `par.abundance_correction_paired_samples` is True.
   :type columns_to_correct: Iterable[str] | None, optional
   :param pairwise_ttest_groups: Groups for pairwise t-tests in unpaired sample abundance correction.
                                 Required if `par.abundance_correction_paired_samples` is False.
   :type pairwise_ttest_groups: Iterable[stats.TTestGroup] | None, optional
   :param non_tt_cols: Columns that should not be included in the t-test correction.
   :type non_tt_cols: Iterable[str] | None, optional

   :returns: A DataFrame with corrected protein abundances.
   :rtype: pd.DataFrame

   :raises ValueError: If `columns_to_correct` is not provided for paired sample correction.
   :raises ValueError: If `pairwise_ttest_groups` is not provided for unpaired sample correction.


.. py:function:: prot_abund_correction_sig_only(pept: pandas.DataFrame, prot: pandas.DataFrame, pairwise_ttest_groups: collections.abc.Iterable[proteometer.stats.TTestGroup], uniprot_col: str, sig_type: str = 'pval', sig_thr: float = 0.05) -> pandas.DataFrame

   Adjusts peptide abundance values based on protein abundance scalars for significant pairwise t-test groups.

   This function iterates over a collection of pairwise t-test groups, computes or retrieves
   protein abundance scalars, and applies these scalars to adjust the peptide abundance values
   for the specified treatment samples.

   :param pept: DataFrame containing peptide-level data. Must include a column
                corresponding to `uniprot_col` for mapping protein identifiers.
   :type pept: pd.DataFrame
   :param prot: DataFrame containing protein-level data. Must include columns
                for protein abundance scalars or data required to compute them.
   :type prot: pd.DataFrame
   :param pairwise_ttest_groups: An iterable of TTestGroup objects,
                                 each representing a pairwise t-test group with associated metadata
                                 (e.g., labels and treatment samples).
   :type pairwise_ttest_groups: Iterable[stats.TTestGroup]
   :param uniprot_col: Column name in `pept` that contains UniProt identifiers
                       for mapping to protein abundance data.
   :type uniprot_col: str
   :param sig_type: Type of significance metric to use for filtering
                    (e.g., "pval" for p-value or "adj-p" for adjusted p-value). Defaults to "pval".
   :type sig_type: str, optional
   :param sig_thr: Threshold for significance filtering. Only proteins meeting
                   this threshold will have their abundance scalars applied. Defaults to 0.05.
   :type sig_thr: float, optional

   :returns: Updated `pept` DataFrame with adjusted abundance values for treatment samples
             and additional columns for protein abundance scalars.
   :rtype: pd.DataFrame


.. py:function:: prot_abund_correction_matched(pept: pandas.DataFrame, prot: pandas.DataFrame, columns_to_correct: collections.abc.Iterable[str], uniprot_col: str, non_tt_cols: collections.abc.Iterable[str] | None = None) -> pandas.DataFrame

   Correct the peptide abundance data using the protein abundance values.

   This function takes the peptide data and corrects the intensity values for
   each peptide using the protein abundance values from the protein data. The
   correction is only applied to the treatment samples.

   :param pept: A DataFrame containing peptide-level data.
   :type pept: pd.DataFrame
   :param prot: A DataFrame containing protein-level data.
   :type prot: pd.DataFrame
   :param columns_to_correct: Columns to correct for protein abundance changes.
                              Must be shared by `pept` and `prot`.
   :type columns_to_correct: Iterable[str]
   :param uniprot_col: Column name for the Uniprot ID in both `pept` and `prot`.
   :type uniprot_col: str
   :param non_tt_cols: Columns that should not be included in the abundance correction.
                       Must be shared by `pept` and `prot`.
   :type non_tt_cols: Iterable[str] | None, optional

   :returns: Updated `pept` DataFrame with adjusted abundance values for treatment samples
             and additional columns for protein abundance scalars.
   :rtype: pd.DataFrame


.. py:function:: global_prot_normalization_and_stats(global_prot: pandas.DataFrame, int_cols: list[str], anova_cols: list[str], pairwise_ttest_groups: collections.abc.Iterable[proteometer.stats.TTestGroup], metadata: pandas.DataFrame, par: proteometer.params.Params) -> pandas.DataFrame

   Perform global protein normalization and statistical analysis.

   This function applies normalization and statistical tests to global proteomics data.
   It handles both median normalization and batch correction, depending on the parameters
   provided in the `par` object. It also performs ANOVA and pairwise t-tests.

   :param global_prot: DataFrame containing global protein-level data.
   :type global_prot: pd.DataFrame
   :param int_cols: List of column names representing intensity data to normalize.
   :type int_cols: list[str]
   :param anova_cols: List of column names for ANOVA analysis.
   :type anova_cols: list[str]
   :param pairwise_ttest_groups: Iterable of TTestGroup objects for performing
                                 pairwise t-tests (each defines a control-treatment pair).
   :type pairwise_ttest_groups: Iterable[stats.TTestGroup]
   :param metadata: DataFrame containing metadata for batch correction and ANOVA analysis.
   :type metadata: pd.DataFrame
   :param par: Parameter object containing configuration for normalization and statistical analysis.
   :type par: Params

   :returns: The normalized and statistically analyzed global protein data.
   :rtype: pd.DataFrame
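
The idea behind ``get_prot_abund_scalars`` and ``prot_abund_correction_sig_only`` (per-protein scalars, filtered by significance and applied only to treatment samples) can be illustrated with a minimal sketch. It is not proteometer's implementation. It assumes log2-scale intensities, a scalar equal to the protein's log2 fold change, a strict ``<`` comparison against the threshold, and correction by subtraction. The helpers ``sketch_scalars`` and ``sketch_correct`` and every column name are hypothetical.

```python
import pandas as pd


def sketch_scalars(prot, id_col, fc_col, sig_col, sig_thr=0.05):
    """Map protein ID -> scalar (hypothetical helper, not the library API).

    Assumption: the scalar is the protein's log2 fold change when the
    significance value is below sig_thr, and 0.0 (no correction) otherwise.
    """
    scalars = {}
    for _, row in prot.iterrows():
        significant = row[sig_col] < sig_thr
        scalars[row[id_col]] = float(row[fc_col]) if significant else 0.0
    return scalars


def sketch_correct(pept, scalars, id_col, treat_cols):
    """Subtract each peptide's parent-protein scalar from treatment columns.

    Assumption: values are on a log2 scale, so correction is a subtraction.
    Peptides whose protein has no scalar are left unchanged.
    """
    out = pept.copy()  # leave the caller's DataFrame untouched
    shift = out[id_col].map(scalars).fillna(0.0)
    for col in treat_cols:
        out[col] = out[col] - shift
    return out


# Toy protein-level results for one control-vs-treatment comparison.
prot = pd.DataFrame({
    "uniprot": ["P1", "P2"],
    "log2fc": [1.0, 2.0],
    "pval": [0.01, 0.50],  # only P1 passes the 0.05 threshold
})

# Toy peptide-level intensities; P3 has no protein-level entry.
pept = pd.DataFrame({
    "uniprot": ["P1", "P1", "P2", "P3"],
    "ctrl_1": [10.0, 11.0, 12.0, 13.0],
    "treat_1": [12.0, 11.5, 14.0, 13.5],
})

scalars = sketch_scalars(prot, "uniprot", "log2fc", "pval")
corrected = sketch_correct(pept, scalars, "uniprot", ["treat_1"])
print(corrected)
```

In this toy data only the two P1 peptides shift in ``treat_1``. P2 fails the significance filter and P3 has no scalar, so both keep their original values, and the control column is never modified.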