Correlation
Numra’s numra-stats crate provides two correlation coefficients — Pearson
and Spearman — along with the covariance matrix for multivariate data. Both
correlation functions return the coefficient and its associated p-value for
testing statistical significance.
Pearson Correlation Coefficient
Section titled “Pearson Correlation Coefficient”The Pearson correlation coefficient measures the strength and direction of the linear relationship between two variables:
The value lies in :
| Interpretation | |
|---|---|
| Perfect positive linear relationship | |
| No linear relationship | |
| Perfect negative linear relationship |
Basic Usage
Section titled “Basic Usage”The pearson_r function returns a tuple (r, p_value):
use numra::stats::pearson_r;
let x = vec![1.0_f64, 2.0, 3.0, 4.0, 5.0];let y = vec![2.0, 4.0, 6.0, 8.0, 10.0]; // y = 2x (perfect linear)
let (r, p) = pearson_r(&x, &y).unwrap();assert!((r - 1.0).abs() < 1e-12); // perfect positive correlationassert!(p < 0.01); // highly significantPerfect Negative Correlation
Section titled “Perfect Negative Correlation”use numra::stats::pearson_r;
let x = vec![1.0_f64, 2.0, 3.0, 4.0, 5.0];let y = vec![10.0, 8.0, 6.0, 4.0, 2.0]; // inversely related
let (r, p) = pearson_r(&x, &y).unwrap();assert!((r + 1.0).abs() < 1e-12); // perfect negative correlationassert!(p < 0.01);p-Value Interpretation
Section titled “p-Value Interpretation”The p-value tests (no linear correlation) against using the t-statistic:
with degrees of freedom.
use numra::stats::pearson_r;
let x = vec![1.0_f64, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0];let y = vec![2.3, 4.1, 5.8, 8.2, 9.7, 12.1, 13.8, 16.0, 18.2, 19.9];
let (r, p) = pearson_r(&x, &y).unwrap();println!("Pearson r = {:.6}", r);println!("p-value = {:.2e}", p);println!("Significant at alpha=0.05? {}", if p < 0.05 { "Yes" } else { "No" });Strength Guidelines
Section titled “Strength Guidelines”| Strength | |
|---|---|
| Very weak | |
| Weak | |
| Moderate | |
| Strong | |
| Very strong |
Spearman Rank Correlation
Section titled “Spearman Rank Correlation”The Spearman correlation coefficient measures the strength of the monotonic relationship between two variables. It is computed as the Pearson correlation of the rank-transformed data:
Spearman correlation is more robust than Pearson because:
- It does not assume linearity — any monotonic relationship gives
- It is less sensitive to outliers
- It works with ordinal data
Monotonic But Nonlinear
Section titled “Monotonic But Nonlinear”use numra::stats::spearman_r;
let x = vec![1.0_f64, 2.0, 3.0, 4.0, 5.0];let y = vec![1.0, 4.0, 9.0, 16.0, 25.0]; // y = x^2 (monotone, not linear)
let (rho, p) = spearman_r(&x, &y).unwrap();assert!((rho - 1.0).abs() < 1e-12); // perfect monotonic relationshipprintln!("Spearman rho = {:.6}, p = {:.2e}", rho, p);Note that is perfectly monotonic for positive , so Spearman gives , while Pearson would give because the relationship is nonlinear.
Handling Ties
Section titled “Handling Ties”When data contains tied values, Spearman uses average ranks:
use numra::stats::spearman_r;
let x = vec![1.0_f64, 2.0, 2.0, 3.0, 4.0];let y = vec![1.0, 2.0, 2.0, 3.0, 4.0];
let (rho, _) = spearman_r(&x, &y).unwrap();assert!((rho - 1.0).abs() < 1e-12); // still perfect agreementThe ranking algorithm assigns average ranks to tied values. For example, if values at positions 2 and 3 are equal, both receive rank 2.5.
Spearman vs Pearson
Section titled “Spearman vs Pearson”use numra::stats::{pearson_r, spearman_r};
// Exponential relationship: monotonic but not linearlet x: Vec<f64> = (1..=20).map(|i| i as f64).collect();let y: Vec<f64> = x.iter().map(|&xi| (0.1 * xi).exp()).collect();
let (r, _) = pearson_r(&x, &y).unwrap();let (rho, _) = spearman_r(&x, &y).unwrap();
println!("Pearson r = {:.4}", r);println!("Spearman rho = {:.4}", rho);// rho will be 1.0 (perfect monotone), r < 1.0 (not linear)Pearson vs Spearman Comparison
Section titled “Pearson vs Spearman Comparison”| Property | Pearson | Spearman |
|---|---|---|
| Measures | Linear association | Monotonic association |
| Data type | Continuous | Continuous or ordinal |
| Sensitivity to outliers | High | Low |
| Distribution assumption | Approximately normal | None |
| Perfect score when | , monotonic | |
| Function | pearson_r | spearman_r |
Covariance Matrix
Section titled “Covariance Matrix”For multivariate analysis, the covariance_matrix function computes the
sample covariance matrix for variables:
The diagonal elements are the variances, and the matrix is symmetric.
use numra::stats::{covariance_matrix, variance};
let x = vec![1.0_f64, 2.0, 3.0, 4.0, 5.0];let y = vec![5.0, 4.0, 3.0, 2.0, 1.0]; // inversely correlated
let cov = covariance_matrix(&[x.clone(), y.clone()]).unwrap();assert_eq!(cov.len(), 4); // 2x2 matrix in row-major order
// Diagonal = variancesassert!((cov[0] - variance(&x).unwrap()).abs() < 1e-12); // Var(x)assert!((cov[3] - variance(&y).unwrap()).abs() < 1e-12); // Var(y)
// Off-diagonal: negative covariance (inverse relationship)assert!(cov[1] < 0.0);
// Symmetric: Cov(x,y) = Cov(y,x)assert!((cov[1] - cov[2]).abs() < 1e-12);Three Variables
Section titled “Three Variables”use numra::stats::covariance_matrix;
let a = vec![1.0_f64, 2.0, 3.0, 4.0, 5.0];let b = vec![2.0, 4.0, 6.0, 8.0, 10.0]; // b = 2a (perfectly correlated)let c = vec![5.0, 3.0, 4.0, 2.0, 1.0]; // roughly inversely related to a
let cov = covariance_matrix(&[a, b, c]).unwrap();assert_eq!(cov.len(), 9); // 3x3 matrix, row-major
// Print the covariance matrixprintln!("Covariance matrix:");for i in 0..3 { println!(" [{:8.4} {:8.4} {:8.4}]", cov[i * 3], cov[i * 3 + 1], cov[i * 3 + 2]);}From Covariance to Correlation
Section titled “From Covariance to Correlation”The correlation matrix can be derived from the covariance matrix by normalizing each entry by the product of the corresponding standard deviations:
use numra::stats::covariance_matrix;
let x = vec![1.0_f64, 2.0, 3.0, 4.0, 5.0];let y = vec![2.0, 4.0, 6.0, 8.0, 10.0];
let cov = covariance_matrix(&[x, y]).unwrap();
// Compute correlation from covariancelet r = cov[1] / (cov[0].sqrt() * cov[3].sqrt());assert!((r - 1.0).abs() < 1e-12); // perfect correlationprintln!("Correlation: {:.6}", r);When to Use Which
Section titled “When to Use Which”| Situation | Recommended Method |
|---|---|
| Two continuous variables, linear relationship | pearson_r |
| Nonlinear but monotonic relationship | spearman_r |
| Ordinal data (rankings, Likert scales) | spearman_r |
| Data contains outliers | spearman_r |
| Multivariate covariance structure | covariance_matrix |
| Testing if correlation is significant | Check p-value from pearson_r or spearman_r |
Error Handling
Section titled “Error Handling”Both correlation functions require at least 3 data points and equal-length inputs:
use numra::stats::{pearson_r, spearman_r, covariance_matrix};
// Need at least 3 data pointsassert!(pearson_r::<f64>(&[1.0, 2.0], &[1.0, 2.0]).is_err());assert!(spearman_r::<f64>(&[1.0, 2.0], &[1.0, 2.0]).is_err());
// Inputs must have equal lengthassert!(pearson_r(&[1.0, 2.0, 3.0], &[1.0, 2.0]).is_err());
// Covariance matrix needs at least 2 data points per variableassert!(covariance_matrix::<f64>(&[vec![1.0]]).is_err());Complete Example: Analyzing Variable Relationships
Section titled “Complete Example: Analyzing Variable Relationships”use numra::stats::{pearson_r, spearman_r, covariance_matrix, mean, std_dev};
fn main() { // Dataset: temperature, ice cream sales, sunscreen sales let temp = vec![60.0_f64, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0, 95.0]; let icecream = vec![100.0, 125.0, 160.0, 185.0, 220.0, 260.0, 300.0, 340.0]; let sunscreen= vec![50.0, 60.0, 80.0, 90.0, 110.0, 130.0, 155.0, 180.0];
// Pairwise Pearson correlations println!("=== Pearson Correlations ==="); let (r1, p1) = pearson_r(&temp, &icecream).unwrap(); let (r2, p2) = pearson_r(&temp, &sunscreen).unwrap(); let (r3, p3) = pearson_r(&icecream, &sunscreen).unwrap();
println!(" Temp vs Ice Cream: r = {:.4}, p = {:.2e}", r1, p1); println!(" Temp vs Sunscreen: r = {:.4}, p = {:.2e}", r2, p2); println!(" Ice Cream vs Sunscreen: r = {:.4}, p = {:.2e}", r3, p3);
// Spearman correlations println!("\n=== Spearman Correlations ==="); let (rho1, _) = spearman_r(&temp, &icecream).unwrap(); let (rho2, _) = spearman_r(&temp, &sunscreen).unwrap(); println!(" Temp vs Ice Cream: rho = {:.4}", rho1); println!(" Temp vs Sunscreen: rho = {:.4}", rho2);
// Full covariance matrix println!("\n=== Covariance Matrix (3x3) ==="); let cov = covariance_matrix(&[ temp.clone(), icecream.clone(), sunscreen.clone() ]).unwrap();
let names = ["Temp", "IceCream", "Sunscreen"]; print!("{:>12}", ""); for name in &names { print!("{:>12}", name); } println!(); for i in 0..3 { print!("{:>12}", names[i]); for j in 0..3 { print!("{:>12.2}", cov[i * 3 + j]); } println!(); }
// Derive correlation matrix from covariance matrix println!("\n=== Correlation Matrix ==="); print!("{:>12}", ""); for name in &names { print!("{:>12}", name); } println!(); for i in 0..3 { print!("{:>12}", names[i]); for j in 0..3 { let r = cov[i * 3 + j] / (cov[i * 3 + i].sqrt() * cov[j * 3 + j].sqrt()); print!("{:>12.4}", r); } println!(); }}Function Reference
Section titled “Function Reference”| Function | Signature | Description |
|---|---|---|
pearson_r | fn pearson_r<S>(x, y) -> Result<(S, S)> | Pearson correlation and p-value |
spearman_r | fn spearman_r<S>(x, y) -> Result<(S, S)> | Spearman rank correlation and p-value |
covariance | fn covariance<S>(x, y) -> Result<S> | Sample covariance between two variables |
covariance_matrix | fn covariance_matrix<S>(vars) -> Result<Vec<S>> | covariance matrix (row-major) |