Behavioral Research: Statistical Methods
CG3.402 • Vinoo Alluri • Monsoon 2025-26 • 4 credits
Cheatsheet
Ultra-condensed. Revise a chapter in minutes.
Unit 1 — Why Do Statistics? (Biases & Base Rates)
The Case for Statistics — Biases, Base Rates, Bayes
One-liners
- Statistics is the corrective for human probabilistic illusion.
- Posterior = Likelihood × Prior / Evidence.
- Mammogram: sensitivity 90%, prevalence 1% (and roughly a 9% false-positive rate) → P(cancer | +) ≈ 9%.
- Simpson: subgroup trends can reverse aggregate trends.
- Statistically significant ≠ practically meaningful.
Formulas
- Bayes: P(H | D) = P(D | H)·P(H) / P(D); evidence P(D) = Σᵢ P(D | Hᵢ)·P(Hᵢ).
Definitions
- Belief bias = judging by conclusion plausibility.
- Confirmation bias = ignoring falsifiers.
- Simpson's paradox = trend reversal on aggregation.
Algorithms
- Bayes update: identify prior, likelihood, compute evidence by total probability, divide.
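Minimal R sketch of the mammogram update above (the 9% false-positive rate is an assumed value chosen to match the one-liner):
  prev <- 0.01                                  # prior: prevalence
  sens <- 0.90                                  # sensitivity, P(+ | cancer)
  fpr  <- 0.09                                  # assumed false-positive rate
  evidence <- sens * prev + fpr * (1 - prev)    # total probability, P(+)
  sens * prev / evidence                        # posterior P(cancer | +) ~ 0.09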
Comparisons
- Sensitivity vs PPV: Sensitivity = P(+ | disease); PPV = P(disease | +). Bayes-related but not equal — PPV depends on base rate.
- p-value vs P(H₀ true): p = P(data | H₀); NOT P(H₀ | data). Confusing them is the prosecutor's fallacy.
Keywords
belief bias • confirmation bias • Simpson's paradox • base rate • Bayes • PPV • Wason
Unit 2 — Research Design & Measurement
Scales, Reliability, Validity
One-liners
- NOIR scales: Nominal → Ordinal → Interval → Ratio.
- Reliability = consistency; Validity = accuracy.
- Cannot be valid without being reliable; can be reliable without being valid.
- Internal validity: causal inference. External validity: generalisability.
Formulas
- Cronbach's α = (k/(k−1))·(1 − Σ s²ᵢ / s²_total), for k items.
Definitions
- Operational definition = measurable spec of abstract construct.
- Confound = third variable threatening internal validity.
Algorithms
- Reliability check pipeline: test-retest → inter-rater → parallel forms → internal consistency (Cronbach's α).
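Minimal base-R sketch of Cronbach's α from the standard formula (toy item matrix, not course data):
  set.seed(1)
  X <- matrix(rnorm(100 * 4), ncol = 4)           # hypothetical 4-item scale
  X[, 2:4] <- X[, 2:4] + X[, 1]                   # make the items correlate
  k <- ncol(X)
  (k / (k - 1)) * (1 - sum(apply(X, 2, var)) / var(rowSums(X)))   # Cronbach's alpha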
Comparisons
- Interval vs Ratio: Interval has no true zero (°C, calendar year). Ratio has true zero (RT, weight). Ratios only meaningful on ratio scale.
- Reliability vs Validity: Reliability = repeatability. Validity = on-target. Stopped clock: reliable, invalid.
Keywords
NOIR • Likert • Cronbach • Cohen κ • convergent • discriminant • ecological • confound • double-blind
Unit 3 — Probability & Distributions
Probability, Distributions, and the CLT
One-liners
- PDF for continuous: P(X = exact) = 0; only intervals have probability.
- Binomial: mean np, variance np(1−p).
- t has heavier tails than the Normal; t → Normal as df → ∞.
- CLT: for large n, x̄ ≈ N(μ, σ²/n) regardless of population shape.
- R distribution prefixes: d (density) / p (CDF) / q (quantile) / r (random).
Formulas
- Binomial: E[X] = np, Var(X) = np(1−p).
- CLT: x̄ ≈ N(μ, σ²/n); SEM = σ/√n.
Definitions
- CLT = sampling distribution → Normal.
- LLN = x̄ → μ as n→∞.
Algorithms
- pbinom(k,n,p) for P(X ≤ k); qbinom for inverse.
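Sketch of the four prefixes on a toy Binomial:
  n <- 10; p <- 0.3
  dbinom(3, n, p)            # P(X = 3)   (d = density/mass)
  pbinom(3, n, p)            # P(X <= 3)  (p = CDF)
  qbinom(0.95, n, p)         # smallest k with P(X <= k) >= 0.95  (q = quantile)
  rbinom(5, n, p)            # five random draws  (r = random)
  c(n * p, n * p * (1 - p))  # mean np, variance np(1-p)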
Comparisons
- Normal vs t: t has heavier tails; depends on df; → Normal as df→∞. Use t when σ unknown.
- LLN vs CLT: LLN: x̄ → μ. CLT: shape of variability around μ becomes Normal.
Keywords
Bernoulli • Binomial • Normal • t • χ² • F • CLT • LLN • PDF • CDF • SEM
Unit 4 — Data Visualization
Plots, Matching, and Common Pitfalls
One-liners
- Always plot the raw data first.
- Match the plot to the scale.
- Tukey outlier rule: flag points beyond Q1 − 1.5×IQR or Q3 + 1.5×IQR.
- Avoid 3D pies, dual y-axes, rainbow colormaps, and red+green encodings (colour-blind unsafe).
Formulas
- Tukey fences: [Q1 − 1.5×IQR, Q3 + 1.5×IQR], where IQR = Q3 − Q1.
Definitions
- Anscombe's quartet = same stats, different plots.
- Skew: positive = right tail.
Algorithms
- Boxplot construction: Q1, median, Q3, IQR; whiskers to the most extreme points within 1.5×IQR of the quartiles; flag points beyond as outliers.
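Base-R sketch of Tukey's rule (toy data with one planted outlier):
  x <- c(rnorm(50), 8)
  q <- quantile(x, c(0.25, 0.75)); iqr <- q[2] - q[1]
  fences <- c(q[1] - 1.5 * iqr, q[2] + 1.5 * iqr)
  x[x < fences[1] | x > fences[2]]              # points flagged by Tukey's rule
  boxplot(x)                                    # same rule, drawn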
Comparisons
- Bar chart vs Pie chart: Bars use position (perceptually accurate); pies use angle (poor). Avoid pies for >3 slices.
- Histogram vs Boxplot: Histogram shows full distribution shape; boxplot summarises with outliers compactly.
Keywords
Anscombe • boxplot • violin • mosaic • heatmap • skew • IQR • Tukey
Unit 5 — Descriptive Statistics
Centre, Spread, Standardisation
One-liners
- Mean (sensitive) vs Median (robust) vs Mode (nominal).
- Skew: mean > median > mode = right tail.
- Bessel's correction: /(n−1).
- z = (x − μ)/σ; |z| ≤ 1.96 ≈ 95% under Normal.
- Robust spread: IQR, MAD.
Formulas
- z = (x − μ)/σ; s² = Σ(x − x̄)²/(n − 1); MAD = median |x − median(x)|.
Definitions
- Bessel: /(n−1) for unbiased s².
- MAD = median |x − median|.
Algorithms
- z-score: subtract the mean, divide by the SD; look up tail areas in the Normal table (or pnorm).
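Sketch (toy data):
  x <- c(10, 12, 9, 15, 11)
  (x - mean(x)) / sd(x)           # z-scores; sd() already uses Bessel's /(n-1)
  pnorm(1.96) - pnorm(-1.96)      # ~0.95 of Normal mass within |z| <= 1.96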
Comparisons
- Mean vs Median: Mean uses all data, sensitive to outliers. Median is the 50th percentile, robust.
- SD vs MAD: SD = √mean squared deviation (Bessel). MAD = median absolute deviation; robust.
Keywords
mean • median • mode • IQR • MAD • Bessel • z-score • skew
Unit 6 — Correlation & Reliability Quantified
Pearson, Spearman, Partial, Reliability Metrics
One-liners
- Pearson r: linear; Spearman ρ: monotone (ranks); Kendall τ: pair concordance.
- r² = shared variance.
- r = 0 ≠ independence (no LINEAR association).
- Partial: strip Z from both. Semi-partial: from one side.
- Correlation ≠ causation.
Formulas
- r = cov(x, y)/(sₓ·s_y); r² = proportion of shared variance.
Definitions
- Partial corr = controls for confounder.
- Cronbach's α = internal consistency.
Algorithms
- Compute Pearson r in R: cor(x, y).
- Spearman: cor(x, y, method='spearman').
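Sketch with toy x, y, z; correlating residuals is one standard base-R way to get the partial correlation:
  set.seed(1)
  z <- rnorm(100); x <- z + rnorm(100); y <- z + rnorm(100)
  cor(x, y)                                    # Pearson (linear)
  cor(x, y, method = "spearman")               # Spearman (ranks)
  cor(x, y, method = "kendall")                # Kendall tau
  cor(resid(lm(x ~ z)), resid(lm(y ~ z)))      # partial r, controlling for z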
Comparisons
- Pearson vs Spearman: Pearson: linear, sensitive to outliers. Spearman: ranks, captures monotone, robust.
- Partial vs Semi-partial: Partial residualises both X and Y; semi-partial only one side. Different denominators.
Keywords
Pearson • Spearman • Kendall • partial • semi-partial • Cohen κ • Cronbach α • spurious
Unit 7 — Hypothesis Testing & NHST
p-values, Errors, Power, t-tests
One-liners
- p = P(data | H₀), not P(H₀ | data).
- Power = 1 − β; need ≥ 0.80.
- Cohen's d: 0.2 small / 0.5 medium / 0.8 large.
- Welch's t = independent t without equal-variance assumption.
- Never 'accept' H₀.
Formulas
- Cohen's d = (x̄₁ − x̄₂)/s_pooled; Power = 1 − β.
Definitions
- Type I = false positive (α); Type II = false negative (β).
- Power = 1 − β.
Algorithms
- NHST: H → α → test → assumptions → statistic → p → decide → effect size + CI.
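Sketch (toy groups; note power.t.test assumes the pooled-variance t, so treat it as approximate for Welch):
  set.seed(1)
  g1 <- rnorm(30); g2 <- rnorm(30, mean = 0.5)
  t.test(g1, g2)                                   # Welch by default (var.equal = FALSE)
  sp <- sqrt((29 * var(g1) + 29 * var(g2)) / 58)   # pooled SD
  (mean(g2) - mean(g1)) / sp                       # Cohen's d
  power.t.test(n = 30, delta = 0.5, sd = 1)        # power of this design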
Comparisons
- Independent t vs Welch t: Welch drops equal-variance assumption; adjusts df. Default when in doubt.
- Independent t vs Paired t: Paired removes between-subject variability → much more power on same n.
Keywords
NHST • p • α • β • power • Cohen's d • t-test • Welch
Unit 8 — Multiple Comparisons (FWER, FDR)
FWER vs FDR; Bonferroni, Holm, BH
One-liners
- Multiple tests inflate Type I: P(≥1 FP) = 1 − (1−α)^m.
- Bonferroni: per-test α = α/m. Conservative.
- Holm: stepwise FWER; uniformly more powerful than Bonferroni.
- BH: stepwise FDR; less conservative; for exploratory analyses.
- Pre-registration is the main antidote to forking paths.
Formulas
- P(≥1 FP over m tests) = 1 − (1−α)^m; Bonferroni: α/m; BH cutoff: p₍ᵢ₎ ≤ (i/m)·Q.
Definitions
- FWER = P(any FP). FDR = E[FP / rejections].
Algorithms
- BH: sort p's; find largest i with p_(i) ≤ (i/m)·Q; reject that and all smaller.
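p.adjust covers all three procedures; the last two lines re-derive the BH cutoff by hand (toy p-values):
  p <- c(0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.300)
  p.adjust(p, method = "bonferroni")
  p.adjust(p, method = "holm")
  p.adjust(p, method = "BH")                 # reject where adjusted p <= Q
  m <- length(p); Q <- 0.05                  # manual BH:
  max(which(sort(p) <= seq_len(m) / m * Q))  # largest i passing (i/m)*Q -> 2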
Comparisons
- FWER vs FDR: FWER conservative — controls P(any FP). FDR less conservative — controls E[FP/rejections].
- Bonferroni vs Holm: Bonferroni: divide α by m uniformly. Holm: stepwise; more powerful at the same FWER level.
Keywords
FWER • FDR • Bonferroni • Holm • BH • permutation • p-hacking
Unit 9 — Non-parametric & Categorical Tests
Categorical & Rank-Based Tests
One-liners
- χ² = Σ(O−E)²/E. df = (r−1)(c−1) indep; k−1 GoF.
- Mann-Whitney ↔ independent t; Wilcoxon signed-rank ↔ paired t.
- Kruskal-Wallis ↔ one-way ANOVA; Friedman ↔ RM-ANOVA.
- McNemar = paired χ² for binary outcome.
- Effect size: φ (2×2 tables), Cramér's V (larger tables).
Formulas
- χ² = Σ(O−E)²/E; Eᵢⱼ = (row i total)·(col j total)/n; φ = √(χ²/n); Cramér's V = √(χ²/(n·(min(r,c)−1))).
Definitions
- Nonparametric = no Normality assumption.
- Use when ordinal, heavily skewed, small n, or unfixable outliers.
Algorithms
- χ² independence: compute E_ij = r_i·c_j/n; sum (O−E)²/E; compare to χ²((r−1)(c−1)).
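Sketch on a toy 2×2 table:
  tab <- matrix(c(20, 10, 5, 15), nrow = 2)   # toy 2x2 contingency table
  chisq.test(tab)                             # df = (2-1)(2-1) = 1
  chisq.test(tab)$expected                    # check expected counts >= 5
  fisher.test(tab)                            # exact test for small E
  wilcox.test(rnorm(20), rnorm(20, 1))        # Mann-Whitney (unpaired ranks)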
Comparisons
- Independent t vs Mann-Whitney: Mann-Whitney drops Normality; rank-based; tests stochastic dominance.
- Paired t vs Wilcoxon signed-rank: Wilcoxon drops Normality on differences; signed-rank-based.
- χ² test vs Fisher's exact: Fisher used at small expected counts (E < 5); exact rather than approximate.
Keywords
χ² • Mann-Whitney • Wilcoxon • Kruskal-Wallis • Friedman • Spearman • McNemar • Fisher's exact
Unit 10 — Multicollinearity, PCA & Factor Analysis
VIF, PCA, EFA/CFA, Scree Plot
One-liners
- VIF > 5–10 → severe multicollinearity.
- PCA = variance maximisation; FA = latent-variable model.
- EFA discovers; CFA tests.
- Choose # factors: parallel analysis > Kaiser > scree.
- Rotation: varimax (orthogonal) vs oblimin (oblique).
Formulas
- VIF_j = 1/(1 − R²_j), with R²_j from regressing predictor j on the other predictors.
Definitions
- Multicollinearity = correlated predictors → unstable β.
- Scree plot = eigenvalues; retain before the elbow.
Algorithms
- FA pipeline: KMO + Bartlett → choose # factors (parallel) → extract → rotate → interpret loadings → CFA on held-out sample.
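Base-R sketch of the PCA/scree part on toy data; the EFA steps need an add-on (e.g., the psych package):
  set.seed(1)
  X <- scale(matrix(rnorm(200 * 6), ncol = 6))   # toy standardised items
  pc <- prcomp(X)
  pc$sdev^2                                      # eigenvalues (Kaiser: keep > 1)
  screeplot(pc, type = "lines")                  # look for the elbow
  # KMO, Bartlett, parallel analysis, fa(): see the psych package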
Comparisons
- PCA vs FA: PCA: all variance. FA: shared variance only with latent model and error.
- EFA vs CFA: EFA: data-driven, no prior structure. CFA: hypothesis-driven, tests pre-specified structure.
Keywords
VIF • multicollinearity • PCA • FA • EFA • CFA • scree • parallel analysis • varimax
Unit 11 — ANOVA (one-way, RM, two-way)
Partition, F-test, Sphericity, Post-hoc
One-liners
- SS_total = SS_between + SS_within.
- F = MS_between / MS_within.
- df_between = k−1; df_within = N−k.
- Tukey HSD = standard post-hoc after significant F.
- Sphericity violated → Greenhouse-Geisser.
- Significant interaction qualifies main effects.
Formulas
- SS_total = SS_between + SS_within; F = MS_between/MS_within; η² = SS_between/SS_total.
Definitions
- Sphericity = equal variances of all pairwise condition-difference scores.
- Mauchly = sphericity test.
Algorithms
- ANOVA: state H, check assumptions, partition SS, compute F, p, η², post-hoc Tukey if significant.
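Sketch (toy data):
  set.seed(1)
  d <- data.frame(y = rnorm(60), g = gl(3, 20, labels = c("a", "b", "c")))
  fit <- aov(y ~ g, data = d)
  summary(fit)                      # F = MS_between / MS_within, df 2 and 57
  TukeyHSD(fit)                     # all pairwise comparisons, FWER-controlled
  ss <- summary(fit)[[1]][["Sum Sq"]]
  ss[1] / sum(ss)                   # eta^2 = SS_between / SS_total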
Comparisons
- One-way ANOVA vs RM-ANOVA: Between-subjects vs same-subjects in all conditions. RM has more power; needs sphericity assumption.
- Tukey HSD vs Bonferroni: Tukey optimised for all pairwise comparisons with FWER control. Bonferroni more general but conservative.
Keywords
ANOVA • F • SS • η² • Tukey • sphericity • Mauchly • Greenhouse-Geisser • interaction
Unit 12 — Regression (Linear, Multiple)
OLS, Diagnostics, Multiple Regression
One-liners
- OLS minimises Σ(residual)².
- R² never decreases when predictors are added; use adjusted R².
- LINeM = Linearity, Independence, Normality, Equal variance, no Multicollinearity.
- For simple regression: R² = r².
- Categorical predictor with k levels → k−1 dummies.
- Cook's d > 4/n flags influential outliers.
Formulas
- ŷ = b₀ + b₁x; R² = 1 − SS_res/SS_total; adj R² = 1 − (1−R²)(n−1)/(n−k−1).
Definitions
- OLS = minimise squared residuals.
- Adjusted R² penalises for # predictors.
Algorithms
- Regression workflow: fit → R², adj R², F → coefficients with SEs and p → residual diagnostics → effect size + CIs.
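Sketch (toy data):
  set.seed(1)
  d <- data.frame(x = rnorm(100)); d$y <- 2 + 0.5 * d$x + rnorm(100)
  fit <- lm(y ~ x, data = d)
  summary(fit)                               # coefficients + p, R^2, adj R^2, F
  plot(fit)                                  # residual diagnostics (LINeM checks)
  which(cooks.distance(fit) > 4 / nrow(d))   # influential points by the 4/n rule
  confint(fit)                               # CIs for the coefficients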
Comparisons
- R² vs Adjusted R²: R² always ↑ with predictors; Adj R² penalises by k — honest for model comparison.
- OLS vs Ridge: OLS unbiased but high variance under multicollinearity; ridge accepts a little bias for much lower variance.
Keywords
OLS • R² • adjusted R² • LINeM • residual diagnostics • heteroscedasticity • dummy • Cook
Unit 13 — Bayesian Statistics
Priors, Posteriors, Bayes Factors
One-liners
- Posterior ∝ Prior × Likelihood.
- BF₁₀ = P(D|H₁)/P(D|H₀); continuous evidence.
- BF₀₁ > 10 = strong evidence for the null (p-values cannot do this).
- Posterior odds = prior odds × BF.
- Bayesian robust to optional stopping.
Formulas
- Posterior ∝ Prior × Likelihood; BF₁₀ = P(D | H₁)/P(D | H₀); posterior odds = prior odds × BF₁₀.
Definitions
- Prior + Likelihood → Posterior.
- BF = continuous evidence ratio.
Algorithms
- Bayes update: identify prior, likelihood; compute evidence; divide.
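Discrete two-hypothesis sketch (toy likelihoods):
  prior <- c(H1 = 0.5, H0 = 0.5)         # prior probabilities
  like  <- c(H1 = 0.20, H0 = 0.05)       # toy P(D | H)
  prior * like / sum(prior * like)       # posterior: 0.8, 0.2
  unname(like["H1"] / like["H0"])        # BF10 = 4 = posterior odds at even priors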
Comparisons
- Frequentist vs Bayesian: Frequentist: long-run frequency, no priors. Bayesian: degree of belief, explicit priors, can evidence the null.
- p-value vs Bayes Factor: p = P(data | H₀); BF = ratio of likelihoods; BF can favour H₀, p cannot.
Keywords
Bayes • prior • posterior • likelihood • BF • credible interval • conjugate
Unit 14 — GLMs & Logistic Regression
Logistic Regression and the GLM Framework
One-liners
- OLS fails on 0/1 outcomes (escapes [0,1], non-Normal residuals, heteroscedastic).
- Logistic regression: log(p/(1−p)) = linear predictor.
- OR = exp(β); β > 0 → OR > 1.
- MLE, not OLS, for fitting.
- GLM = random + systematic + link.
- No normality / homoscedasticity needed.
Formulas
- logit(p) = log(p/(1−p)) = β₀ + β₁x₁ + …; OR = e^β; p = 1/(1 + e^(−η)).
Definitions
- GLM = random + systematic + link.
- Logit = log(p/(1−p)).
Algorithms
- Fit by MLE → coefficients with z-tests → odds ratios → confusion matrix on held-out → AUC.
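Sketch (toy data; accuracy here is in-sample, unlike the held-out step above):
  set.seed(1)
  d <- data.frame(x = rnorm(200))
  d$y <- rbinom(200, 1, plogis(-1 + 0.8 * d$x))      # true model on the logit scale
  fit <- glm(y ~ x, family = binomial, data = d)     # fitted by MLE
  summary(fit)                                       # Wald z-tests on coefficients
  exp(coef(fit))                                     # odds ratios
  mean((predict(fit, type = "response") > 0.5) == d$y)   # crude accuracy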
Comparisons
- OLS vs Logistic: OLS: continuous Y, Normal residuals, least-squares fit. Logistic: binary Y, logit link, MLE.
- Linear vs GLM: Linear is a special case of GLM with identity link + Gaussian. GLMs generalise to binomial/Poisson/etc.
Keywords
logit • OR • GLM • link function • MLE • AUC • deviance
Unit 15 — Rapid Revision & Exam Strategy
Decision Tree, Confusions, Report Checklist
One-liners
- Decision: DV scale × IV scale × #groups × indep/paired → test.
- Always report effect size + CI.
- Stat significance ≠ practical significance.
- Frequentist non-significance ≠ no effect.
- Pre-register direction; correct multiple comparisons.
Formulas
- Power = 1 − β; P(≥1 FP over m tests) = 1 − (1−α)^m.
Definitions
- Power = 1 − β.
- BF interpretation: 3–10 moderate, 10–30 strong, > 30 very strong.
Algorithms
- 5-step checklist: question → test → assumptions → effect size + CI → interpret.
Comparisons
- PCA vs FA vs Reliability vs Validity: PCA: variance reduction; FA: latent variables. Reliability: consistency; Validity: accuracy.
Keywords
decision tree • checklist • effect size • CI • report • exam strategy