Statistical Utilities
The statistical utilities module provides functions for statistical analyses and computations commonly used in spectral normative modeling. It complements the metrics utilities with functions for computing log-likelihoods and centile scores, which are needed to evaluate the fit of normative models and to interpret the results of empirical data analyses.
spectranorm.utils.stats
utils/stats.py
Statistical utility functions for the Spectranorm package.
compute_censored_log_likelihood(observations: npt.NDArray[np.floating[Any]], predicted_mus: npt.NDArray[np.floating[Any]], predicted_sigmas: npt.NDArray[np.floating[Any]], censored_quantile: float = 0.01) -> npt.NDArray[np.floating[Any]]
Compute censored log likelihood, replacing extreme low likelihoods with a censoring threshold.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `observations` | `NDArray[floating[Any]]` | Observed data points, shape (N,). | required |
| `predicted_mus` | `NDArray[floating[Any]]` | Predicted means for each observation, shape (N,). | required |
| `predicted_sigmas` | `NDArray[floating[Any]]` | Predicted standard deviations for each observation, shape (N,). | required |
| `censored_quantile` | `float` | Quantile below which log-likelihoods are censored. | `0.01` |
Returns:
| Type | Description |
|---|---|
| `NDArray[floating[Any]]` | The censored log likelihood of all observations. |
Source code in src/spectranorm/utils/stats.py
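The censoring step described above can be sketched in plain Python. This is an illustration under stated assumptions, not the package's implementation: it assumes a Gaussian likelihood, a linearly interpolated quantile, and a per-observation result, and the helper names (`gaussian_log_likelihood`, `linear_quantile`, `censored_log_likelihood_sketch`) are hypothetical. It uses lists and the standard library so the arithmetic stays visible, whereas the documented function takes NumPy arrays.

```python
import math


def gaussian_log_likelihood(x, mu, sigma):
    """Log density of x under N(mu, sigma**2) (Gaussian form is an assumption)."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def linear_quantile(values, q):
    """Quantile with linear interpolation between sorted neighbours."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def censored_log_likelihood_sketch(observations, mus, sigmas, censored_quantile=0.01):
    """Hypothetical helper: floor each log likelihood at the chosen quantile."""
    log_liks = [
        gaussian_log_likelihood(x, m, s)
        for x, m, s in zip(observations, mus, sigmas)
    ]
    threshold = linear_quantile(log_liks, censored_quantile)
    # Values below the threshold are replaced by the threshold itself.
    return [max(ll, threshold) for ll in log_liks]


observations = [0.1, -0.3, 0.2, 8.0]  # the last value is an extreme outlier
mus = [0.0, 0.0, 0.0, 0.0]
sigmas = [1.0, 1.0, 1.0, 1.0]
censored = censored_log_likelihood_sketch(
    observations, mus, sigmas, censored_quantile=0.25
)
```

In this toy example only the outlier's log likelihood is raised to the threshold, so a single extreme observation cannot dominate a summed or averaged fit statistic.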
compute_centiles_from_z_scores(z_scores: npt.NDArray[np.floating[Any]]) -> npt.NDArray[np.floating[Any]]
Convert z-scores to percentiles.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `z_scores` | `NDArray[floating[Any]]` | Array of z-scores. | required |
Returns:
| Type | Description |
|---|---|
| `NDArray[floating[Any]]` | Array of percentiles corresponding to the z-scores. |
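As a rough illustration of the conversion, a z-score can be mapped to a percentile through the standard normal cumulative distribution function. The sketch below assumes the result is reported on a 0 to 100 scale, which this page does not state, and the name `centiles_from_z_scores_sketch` is hypothetical. It uses the standard library's `statistics.NormalDist` on a plain list rather than a NumPy array.

```python
from statistics import NormalDist


def centiles_from_z_scores_sketch(z_scores):
    """Hypothetical helper: percentile = 100 * Phi(z), with Phi the standard normal CDF."""
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    return [100.0 * standard_normal.cdf(z) for z in z_scores]


centiles = centiles_from_z_scores_sketch([-1.96, 0.0, 1.0])
```

Under this assumption a z-score of 0 sits at the 50th percentile, and a z-score of -1.96 falls at roughly the 2.5th percentile.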
compute_correlation_significance(correlation_matrix: npt.NDArray[np.floating[Any]], n_samples: int, correlation_threshold: float = 0.0, correction_method: str = 'fdr_bh') -> npt.NDArray[np.floating[Any]]
Compute the significance of correlations between variables in the data matrix, thresholded by a specified correlation value, using Fisher's z-transformation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `correlation_matrix` | `NDArray[floating[Any]]` | A 2D array of pairwise correlation values. | required |
| `n_samples` | `int` | The number of samples used to compute correlation. | required |
| `correlation_threshold` | `float` | The correlation threshold above which correlations are to be considered significant. | `0.0` |
| `correction_method` | `str` | Method for multiple testing correction. Options include 'bonferroni', 'holm', 'fdr_bh', etc. See statsmodels.stats.multitest.multipletests for more details. | `'fdr_bh'` |
Returns:
| Type | Description |
|---|---|
| `NDArray[floating[Any]]` | A matrix of p-values indicating the significance of each correlation. (Testing if the correlation is significantly greater than the threshold.) |
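The Fisher z-transformation test can be sketched as follows. This is a minimal illustration, not the package's code: it assumes a one-sided test of whether each correlation exceeds the threshold, uses the textbook standard error `1 / sqrt(n_samples - 3)`, leaves the diagonal as NaN, and omits the multiple testing correction step. The helper names `fisher_z_p_value` and `correlation_p_values_sketch` are hypothetical, and nested lists stand in for NumPy arrays.

```python
import math
from statistics import NormalDist


def fisher_z_p_value(r, n_samples, correlation_threshold=0.0):
    """One-sided p-value for H1: true correlation > correlation_threshold."""
    standard_error = 1.0 / math.sqrt(n_samples - 3)
    # atanh is the Fisher z-transformation of a correlation coefficient.
    z_stat = (math.atanh(r) - math.atanh(correlation_threshold)) / standard_error
    return 1.0 - NormalDist().cdf(z_stat)


def correlation_p_values_sketch(correlation_matrix, n_samples, correlation_threshold=0.0):
    """Hypothetical helper: uncorrected p-values for the off-diagonal entries."""
    size = len(correlation_matrix)
    p_values = [[float("nan")] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if i != j:  # atanh(1.0) is undefined, so the diagonal is skipped
                p_values[i][j] = fisher_z_p_value(
                    correlation_matrix[i][j], n_samples, correlation_threshold
                )
    return p_values


correlations = [
    [1.0, 0.6, 0.05],
    [0.6, 1.0, -0.2],
    [0.05, -0.2, 1.0],
]
p_values = correlation_p_values_sketch(correlations, n_samples=50)
```

With 50 samples, the correlation of 0.6 yields a very small p-value, while 0.05 and -0.2 do not come close to significance. In practice the resulting p-values would still need the correction selected by `correction_method`.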
compute_log_likelihood(observations: npt.NDArray[np.floating[Any]], predicted_mus: npt.NDArray[np.floating[Any]], predicted_sigmas: npt.NDArray[np.floating[Any]]) -> npt.NDArray[np.floating[Any]]
Compute the log likelihood of observations given predicted means and standard deviations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `observations` | `NDArray[floating[Any]]` | Observed data points. | required |
| `predicted_mus` | `NDArray[floating[Any]]` | Predicted means for the observations. | required |
| `predicted_sigmas` | `NDArray[floating[Any]]` | Predicted standard deviations for the observations. | required |
Returns:
| Type | Description |
|---|---|
| `NDArray[floating[Any]]` | Log likelihood of each observation. |
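For reference, a per-observation log likelihood from a predicted mean and standard deviation can be computed in closed form. The sketch below assumes a Gaussian likelihood, which this page implies but does not state. The name `log_likelihood_sketch` is hypothetical, and the closed form is cross-checked against the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def log_likelihood_sketch(observations, predicted_mus, predicted_sigmas):
    """Hypothetical helper: Gaussian log density of each observation."""
    log_liks = []
    for x, mu, sigma in zip(observations, predicted_mus, predicted_sigmas):
        z = (x - mu) / sigma
        # log N(x | mu, sigma^2) = -z^2 / 2 - log(sigma) - log(2 * pi) / 2
        log_liks.append(-0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi))
    return log_liks


observations = [1.2, 0.4, 3.0]
predicted_mus = [1.0, 0.5, 2.0]
predicted_sigmas = [0.5, 1.0, 2.0]
log_liks = log_likelihood_sketch(observations, predicted_mus, predicted_sigmas)

# Cross-check against the log of the density from the standard library.
reference = [
    math.log(NormalDist(mu, sigma).pdf(x))
    for x, mu, sigma in zip(observations, predicted_mus, predicted_sigmas)
]
```

Working with the closed form directly avoids taking the logarithm of a density that can underflow to zero for observations far from the predicted mean.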