Saral Shiksha Yojna

Behavioral Research: Statistical Methods

CG3.402
Vinoo Alluri · Monsoon 2025-26 · 4 credits

Formulas & Diagrams

High-ROI section — formulas improve marks, diagrams improve recall.

Formulas

z-score
Standardise a value: how many SDs above/below the population mean.
One-sample t
Test sample mean against a hypothesised value when σ is unknown.
Standard error of the mean
How precisely the sample mean estimates μ: SEM = s/√n, so it shrinks in proportion to 1/√n.
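The three cards above (z-score, one-sample t, SEM) can be sketched in a few lines. This is a minimal illustration using only the standard library; the function names and the `scores` data are made up for the example.

```python
import math
import statistics

def z_score(x, mu, sigma):
    """Number of population SDs that x lies above (+) or below (-) mu."""
    return (x - mu) / sigma

def sem(sample):
    """Standard error of the mean: sample SD (n - 1 denominator) over sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

def one_sample_t(sample, mu0):
    """t statistic for H0: population mean == mu0, with df = n - 1."""
    return (statistics.mean(sample) - mu0) / sem(sample)

scores = [52, 55, 49, 61, 58, 54, 57, 50]  # hypothetical data, n = 8

print(z_score(130, mu=100, sigma=15))  # 2.0
print(sem(scores))
print(one_sample_t(scores, mu0=50))
```

The t statistic still has to be compared against a t distribution with n − 1 degrees of freedom to get a p-value.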
Cohen's d
Standardised mean difference. 0.2 / 0.5 / 0.8 = small / medium / large.
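A sketch of Cohen's d for two independent groups, assuming the pooled-SD version (the card does not say which denominator the course uses). The function name and data are illustrative.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # n - 1 denominator
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Both groups have variance 4, so pooled SD = 2 and d = 1 / 2.
print(cohens_d([2, 4, 6], [1, 3, 5]))  # 0.5, a "medium" effect
```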
Pearson χ²
Goodness-of-fit / independence on categorical data. df=(r−1)(c−1) for independence.
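For the independence case, the statistic sums (observed − expected)² / expected over all cells, with expected counts taken from the row and column totals. A minimal sketch; the table values are invented.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and df for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

chi2, df = chi_square_independence([[10, 20, 30],
                                    [20, 20, 20]])
print(chi2, df)  # chi2 is 16/3 (about 5.33), df = 2
```

The statistic is then compared with the χ² distribution on `df` degrees of freedom.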
F (ANOVA)
Ratio of between-group to within-group variance. F ≈ 1 is what H₀ predicts (no group effect); F ≫ 1 suggests an effect.
ANOVA SS partition
Total variability splits into group differences + within-group residual.
Eta-squared (η²)
Effect size for ANOVA. 0.01 / 0.06 / 0.14 = small / medium / large.
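The F, SS partition, and η² cards fit together in one calculation. The sketch below works through a one-way ANOVA by hand on invented data; the function name and the returned dictionary keys are illustrative.

```python
import statistics

def one_way_anova(groups):
    """SS partition, F ratio, and eta-squared for a one-way design."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)

    # Between-group: each group mean's squared distance from the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group: each score's squared distance from its own group mean.
    ss_within = sum(
        (x - statistics.mean(g)) ** 2 for g in groups for x in g
    )
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_total,  # equals ss_between + ss_within
        "F": f_stat,
        "eta_squared": ss_between / ss_total,
    }

result = one_way_anova([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
print(result)  # SS: 42 + 6 = 48, F = 21, eta-squared = 0.875
```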
R²
Proportion of variance in Y explained by the model. Never decreases as predictors are added, so use adjusted R² for honest comparison.
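The adjustment penalises R² for the number of predictors. A small sketch using the standard formula 1 − (1 − R²)(n − 1)/(n − p − 1); the function and argument names are illustrative.

```python
def adjusted_r_squared(r_squared, n, p):
    """Adjusted R^2 for n observations and p predictors (intercept excluded)."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - p - 1)

# Same R^2, more predictors: the adjusted value drops.
print(adjusted_r_squared(0.50, n=21, p=2))   # about 0.444
print(adjusted_r_squared(0.50, n=21, p=10))  # exactly 0.0
```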
Pearson r
Strength of linear association in [−1, 1]. Sensitive to outliers; assumes linearity.
Variance Inflation Factor
Severity of multicollinearity for predictor j. VIF > 5–10 is problematic.
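VIF for predictor j is 1 / (1 − R²ⱼ), where R²ⱼ comes from regressing predictor j on all the other predictors. The sketch below takes R²ⱼ as given (fitting that auxiliary regression is outside the example) and shows where the 5 and 10 cut-offs come from.

```python
def vif(r_squared_j):
    """Variance inflation factor from the auxiliary R^2 of predictor j."""
    return 1.0 / (1.0 - r_squared_j)

for r2 in (0.0, 0.5, 0.8, 0.9):
    print(r2, round(vif(r2), 2))
# R^2_j = 0.8 gives VIF of about 5; R^2_j = 0.9 gives about 10.
```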
Bayes' rule
Updates prior P(H) to posterior P(H|D) using likelihood and evidence.
Bayes Factor
Continuous evidence ratio. 3–10 moderate · 10–30 strong · >30 very strong evidence for H₁.
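Bayes' rule and the Bayes factor are linked: posterior odds = Bayes factor × prior odds. The sketch assumes two simple point hypotheses, so the Bayes factor reduces to a likelihood ratio. All probabilities are invented for the example.

```python
def posterior_probability(prior, p_data_given_h, p_data_given_not_h):
    """P(H | D) = likelihood * prior / evidence."""
    evidence = p_data_given_h * prior + p_data_given_not_h * (1 - prior)
    return p_data_given_h * prior / evidence

def bayes_factor(p_data_given_h1, p_data_given_h0):
    """BF10 for two point hypotheses: ratio of the likelihoods."""
    return p_data_given_h1 / p_data_given_h0

prior = 0.01                      # hypothetical P(H1)
bf10 = bayes_factor(0.90, 0.05)   # about 18: "strong" on the scale above

# Route 1: Bayes' rule directly.
direct = posterior_probability(prior, 0.90, 0.05)

# Route 2: update the prior odds with the Bayes factor.
posterior_odds = bf10 * prior / (1 - prior)
via_odds = posterior_odds / (1 + posterior_odds)

print(direct, via_odds)  # both about 0.154
```

Even a strong Bayes factor leaves the posterior low here because the prior was only 1%.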
PSNR
(Not BRSM-core but shared with image quality contexts.)
Binomial PMF
Probability of k successes in n independent Bernoulli(p) trials.
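The PMF is C(n, k) · pᵏ · (1 − p)ⁿ⁻ᵏ. A minimal sketch using `math.comb` (Python 3.8+); the function name is illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

print(binomial_pmf(3, n=10, p=0.5))  # 0.1171875 (120 / 1024)

# The probabilities over k = 0..n sum to 1.
print(sum(binomial_pmf(k, 10, 0.3) for k in range(11)))
```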
Logistic regression (logit)
Link function: the model is linear in log-odds, which keeps p within (0, 1).
Odds ratio
Multiplicative change in odds per unit increase in predictor. Exam essential.
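The logit and odds-ratio cards can be checked numerically: with log-odds η = β₀ + β₁x, raising x by one unit multiplies the odds by exp(β₁), whatever the starting x. The coefficients below are hypothetical, not fitted values.

```python
import math

def predicted_probability(x, b0, b1):
    """Inverse logit: p = 1 / (1 + exp(-(b0 + b1 * x)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def odds(p):
    return p / (1.0 - p)

b0, b1 = -1.0, 0.7  # hypothetical coefficients
odds_ratio = math.exp(b1)

for x in (0, 2, 5):
    ratio = odds(predicted_probability(x + 1, b0, b1)) / odds(
        predicted_probability(x, b0, b1)
    )
    print(x, round(ratio, 6), round(odds_ratio, 6))  # the two columns agree
```

The change in probability per unit of x is not constant, only the change in odds is.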
Bonferroni per-test α
Strict FWER control. With m=20, α=.05 → per-test α = 0.0025.
Benjamini-Hochberg threshold
Rank p-values; significant if below the BH line. Q is the target FDR (e.g., 0.05).
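The two corrections above can be sketched side by side. Bonferroni divides α by the number of tests; Benjamini-Hochberg finds the largest rank k whose sorted p-value is at most (k/m)·Q and rejects every hypothesis up to that rank. Function names and p-values are illustrative.

```python
def bonferroni_alpha(alpha, m):
    """Per-test alpha that keeps the family-wise error rate at alpha."""
    return alpha / m

def benjamini_hochberg(p_values, q=0.05):
    """Return a reject/keep flag per p-value (step-up BH procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank whose p-value sits on or below the BH line (rank / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank

    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected

print(bonferroni_alpha(0.05, 20))  # per-test alpha of 0.0025

# Step-up behaviour: 0.03 and 0.04 miss their own thresholds, but the
# largest p-value (0.05) meets its threshold, so all four are rejected.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.05], q=0.05))
```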
95% CI for mean
Sample mean ± t-critical times SEM. Frequentist: procedure has 95% long-run coverage.
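A sketch of the interval calculation. The standard library has no t quantile function, so the critical value is passed in by the caller; 2.262 is the tabled two-sided 95% value for df = 9. The data are invented.

```python
import math
import statistics

def ci_mean(sample, t_crit):
    """Confidence interval for the mean: mean +/- t_crit * SEM."""
    mean = statistics.mean(sample)
    margin = t_crit * statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - margin, mean + margin

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # n = 10, so df = 9
lower, upper = ci_mean(sample, t_crit=2.262)
print(lower, upper)  # roughly 3.33 to 7.67, centred on 5.5
```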
χ² independence df
Contingency table degrees of freedom. 2×3 table → df = 2.
χ² goodness-of-fit df
Goodness-of-fit on k categories: df = k − 1.
Phi coefficient (2×2)
Effect size for 2×2 χ². Larger tables use Cramér's V.
Sphericity violation correction
Greenhouse-Geisser (ε estimated) or Huynh-Feldt adjust df when sphericity is violated in RM-ANOVA.
Wilcoxon signed-rank W
Nonparametric paired test. Ranks the absolute differences, then sums the ranks of the positive differences.
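The ranking step in the Wilcoxon card can be sketched as follows. Two conventions are assumed here because the card does not state them: zero differences are dropped, and tied absolute differences share their average rank. The data are made up.

```python
def wilcoxon_w_plus(before, after):
    """Sum of ranks of positive paired differences (after - before)."""
    # Assumption: pairs with zero difference are discarded.
    diffs = [a - b for a, b in zip(after, before) if a != b]
    ranked = sorted(diffs, key=abs)

    # Assign ranks 1..n on |difference|, averaging over ties.
    ranks = [0.0] * len(ranked)
    i = 0
    while i < len(ranked):
        j = i
        while j + 1 < len(ranked) and abs(ranked[j + 1]) == abs(ranked[i]):
            j += 1
        average_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    return sum(rank for d, rank in zip(ranked, ranks) if d > 0)

before = [10, 12, 9, 14, 11]
after = [12, 11, 13, 14, 14]
# Differences: 2, -1, 4, 0, 3 -> ranks of |d| (zero dropped): 2, 1, 4, 3
print(wilcoxon_w_plus(before, after))  # 9.0 (ranks 2 + 4 + 3)
```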

Diagrams

Which test do I use?
Decision flow on IV scale × DV scale × #groups × independent/paired. Categorical DV → χ²; 2 groups continuous → t; 3+ groups continuous → ANOVA; continuous-continuous → r/regression.
[ diagram placeholder ]
NOIR scales of measurement
Four-row table: Nominal / Ordinal / Interval / Ratio with order, equal intervals, true zero, examples, allowable statistics.
[ diagram placeholder ]
Type I / Type II error 2×2
Rows: reject vs fail-to-reject H₀. Cols: H₀ true vs false. Cells, reading row by row: Type I error (α) / correct rejection (power, 1 − β) / correct retention (1 − α) / Type II error (β).
[ diagram placeholder ]
FWER vs FDR
Side-by-side: FWER controls P(any false positive) — conservative — Bonferroni / Holm. FDR controls expected proportion of FPs among rejections — Benjamini-Hochberg.
[ diagram placeholder ]
ANOVA SS partition
Total variability split into between-group + within-group. F = MS_between / MS_within.
[ diagram placeholder ]
95% CI long-run coverage
Many simulated samples, each producing a CI. Roughly 95% of intervals contain μ. Coverage is a property of the procedure.
[ diagram placeholder ]
Prior → Posterior update
Likelihood × Prior / Evidence → Posterior. Sequential updating across studies.
[ diagram placeholder ]
Regression diagnostic plots
Residuals vs fitted (linearity, heteroscedasticity), Q-Q (normality), scale-location, residuals vs leverage.
[ diagram placeholder ]
Scree plot + parallel analysis
Eigenvalues vs factor #. Retain factors before the elbow; parallel analysis adds a random-data baseline, and factors whose eigenvalues exceed it are retained.
[ diagram placeholder ]
Two-way ANOVA interaction plot
Cell means with one IV on x-axis, other as colored lines. Non-parallel lines = interaction.
[ diagram placeholder ]
Anscombe's quartet
Four datasets sharing mean / SD / r / regression line — wildly different scatter plots. Lesson: always plot your data.
[ diagram placeholder ]
Boxplot + 1.5×IQR outlier rule
Box from Q1 to Q3, median line, whiskers to the most extreme points within 1.5×IQR of the box; points beyond the fences are flagged as outliers.
[ diagram placeholder ]
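The 1.5×IQR rule can be sketched as below. Quartile definitions differ between textbooks and software; this example assumes the "inclusive" method of `statistics.quantiles` (Python 3.8+), and the data are invented.

```python
import statistics

def iqr_fences_and_outliers(data):
    """Return (lower fence, upper fence, outliers) under the 1.5 x IQR rule."""
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return lower_fence, upper_fence, outliers

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
print(iqr_fences_and_outliers(data))  # (-3.5, 14.5, [50])
```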
Logistic S-curve
p = 1/(1 + e^(−η)) where η = β₀ + β·x. Saturates to 0/1 at extremes.
[ diagram placeholder ]
Sampling distribution of the mean
Repeated samples from population → distribution of sample means. CLT → approximately Normal for large n, with mean μ and SD σ/√n.
[ diagram placeholder ]
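A quick simulation illustrates the sampling-distribution card. It draws repeated samples from a skewed population, here an exponential with μ = 1 and σ = 1 (an assumption chosen for the example), and checks that the sample means centre on μ with spread close to σ/√n.

```python
import math
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Means of `reps` samples of size n from an Exponential(1) population."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

n, reps = 30, 5000
means = simulate_sample_means(n, reps)

print("mean of sample means:", statistics.fmean(means))  # close to mu = 1
print("SD of sample means:  ", statistics.stdev(means))  # close to sigma / sqrt(n)
print("sigma / sqrt(n):     ", 1 / math.sqrt(n))
```

A histogram of `means` would look close to Normal even though the population itself is strongly right-skewed.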