
Behavioral Research: Statistical Methods

CG3.402
Vinoo Alluri · Monsoon 2025-26 · 4 credits

Cheatsheet

Ultra-condensed. Revise a chapter in minutes.

Unit 1 — Why Do Statistics? (Biases & Base Rates)

The Case for Statistics — Biases, Base Rates, Bayes
One-liners
  • Statistics is the corrective for human probabilistic illusion.
  • Posterior = Likelihood × Prior / Evidence.
  • Mammogram: sensitivity 90%, prevalence 1%, false-positive rate ≈ 9% → P(cancer | +) ≈ 9%.
  • Simpson: subgroup trends can reverse aggregate trends.
  • Statistically significant ≠ practically meaningful.
Definitions
  • Belief bias = judging by conclusion plausibility.
  • Confirmation bias = ignoring falsifiers.
  • Simpson's paradox = trend reversal on aggregation.
Algorithms
  • Bayes update: identify prior, likelihood, compute evidence by total probability, divide.
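A minimal R sketch of the update, using the mammogram numbers from the one-liner above (the ≈ 9% false-positive rate is an assumption needed to make the arithmetic match):
    prior <- 0.01                                  # prevalence: P(cancer)
    sens  <- 0.90                                  # sensitivity: P(+ | cancer)
    fpr   <- 0.09                                  # assumed false-positive rate: P(+ | no cancer)
    evidence <- sens * prior + fpr * (1 - prior)   # total probability of a positive
    sens * prior / evidence                        # posterior P(cancer | +) ≈ 0.09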
Comparisons
  • Sensitivity vs PPV: Sensitivity = P(+ | disease); PPV = P(disease | +). Bayes-related but not equal — PPV depends on base rate.
  • p-value vs P(H₀ true): p = P(data | H₀); NOT P(H₀ | data). Confusing them is the prosecutor's fallacy.
Keywords
belief bias · confirmation bias · Simpson's paradox · base rate · Bayes · PPV · Wason

Unit 2 — Research Design & Measurement

Scales, Reliability, Validity
One-liners
  • NOIR scales: Nominal → Ordinal → Interval → Ratio.
  • Reliability = consistency; Validity = accuracy.
  • Cannot be valid without being reliable; can be reliable without being valid.
  • Internal validity: causal inference. External validity: generalisability.
Definitions
  • Operational definition = measurable spec of abstract construct.
  • Confound = third variable threatening internal validity.
Algorithms
  • Reliability check pipeline: test-retest → inter-rater → parallel forms → internal consistency (Cronbach's α).
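A hedged R sketch of the last pipeline step, computing Cronbach's α by hand (`items` is a hypothetical respondents × items data frame of scores):
    k <- ncol(items)                          # number of items
    item_vars <- sum(apply(items, 2, var))    # sum of per-item variances
    total_var <- var(rowSums(items))          # variance of the sum score
    alpha <- (k / (k - 1)) * (1 - item_vars / total_var)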
Comparisons
  • Interval vs Ratio: Interval has no true zero (°C, calendar year). Ratio has true zero (RT, weight). Ratios only meaningful on ratio scale.
  • Reliability vs Validity: Reliability = repeatability. Validity = on-target. Stopped clock: reliable, invalid.
Keywords
NOIR · Likert · Cronbach · Cohen κ · convergent · discriminant · ecological · confound · double-blind

Unit 3 — Probability & Distributions

Probability, Distributions, and the CLT
One-liners
  • PDF for continuous: P(X = exact) = 0; only intervals have probability.
  • Binomial: mean np, variance np(1−p).
  • t has heavier tails than the Normal; t → Normal as df → ∞.
  • CLT: for large n, x̄ ~ approx. N(μ, σ²/n) regardless of population shape.
  • R distribution prefixes: d (density), p (CDF), q (quantile), r (random).
Definitions
  • CLT = sampling distribution → Normal.
  • LLN = x̄ → μ as n→∞.
Algorithms
  • pbinom(k,n,p) for P(X ≤ k); qbinom for inverse.
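The four R prefixes on the Binomial, as a quick sketch:
    dbinom(3, size = 10, prob = 0.5)     # density: P(X = 3)
    pbinom(3, size = 10, prob = 0.5)     # CDF: P(X ≤ 3)
    qbinom(0.95, size = 10, prob = 0.5)  # quantile: smallest k with P(X ≤ k) ≥ 0.95
    rbinom(5, size = 10, prob = 0.5)     # five random draws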
Comparisons
  • Normal vs t: t has heavier tails; depends on df; → Normal as df→∞. Use t when σ unknown.
  • LLN vs CLT: LLN: x̄ → μ. CLT: shape of variability around μ becomes Normal.
Keywords
Bernoulli · Binomial · Normal · t · χ² · F · CLT · LLN · PDF · CDF · SEM

Unit 4 — Data Visualization

Plots, Matching, and Common Pitfalls
One-liners
  • Always plot the raw data first.
  • Match the plot to the scale.
  • Tukey outlier rule: flag points beyond Q1 − 1.5×IQR or Q3 + 1.5×IQR.
  • Avoid 3D pies, dual y-axes, rainbow colour maps, and red+green encodings.
Definitions
  • Anscombe's quartet = same stats, different plots.
  • Skew: positive = right tail.
Algorithms
  • Boxplot construction: Q1, median, Q3, IQR; whiskers to the most extreme points within 1.5×IQR of the quartiles; flag points beyond.
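The same rule in base R, on toy data with one extreme point:
    x <- c(2, 3, 4, 5, 6, 7, 30)
    q <- quantile(x, c(0.25, 0.75))
    fences <- c(q[1] - 1.5 * IQR(x), q[2] + 1.5 * IQR(x))
    x[x < fences[1] | x > fences[2]]   # flags 30 as an outlier
    boxplot(x)                         # base R applies the same 1.5×IQR rule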
Comparisons
  • Bar chart vs Pie chart: Bars use position (perceptually accurate); pies use angle (poor). Avoid pies for >3 slices.
  • Histogram vs Boxplot: Histogram shows full distribution shape; boxplot summarises with outliers compactly.
Keywords
Anscombe · boxplot · violin · mosaic · heatmap · skew · IQR · Tukey

Unit 5 — Descriptive Statistics

Centre, Spread, Standardisation
One-liners
  • Mean (sensitive) vs Median (robust) vs Mode (nominal).
  • Skew: mean > median > mode = right tail.
  • Bessel's correction: /(n−1).
  • z = (x − μ)/σ; |z| ≤ 1.96 ≈ 95% under Normal.
  • Robust spread: IQR, MAD.
Definitions
  • Bessel: /(n−1) for unbiased s².
  • MAD = median |x − median|.
Algorithms
  • z-score: subtract mean, divide by SD; lookup Normal table.
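In R, on toy data (note that mad() rescales by 1.4826 by default so it matches the SD under Normality):
    x <- c(10, 12, 9, 15, 14)
    (x - mean(x)) / sd(x)     # manual z-scores; sd() uses Bessel's n−1
    scale(x)                  # same result via the built-in
    mad(x, constant = 1)      # raw median absolute deviation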
Comparisons
  • Mean vs Median: Mean uses all data, sensitive to outliers. Median is the 50th percentile, robust.
  • SD vs MAD: SD = √(Σ(x − x̄)² / (n − 1)) with Bessel's correction. MAD = median absolute deviation; robust.
Keywords
mean · median · mode · IQR · MAD · Bessel · z-score · skew

Unit 6 — Correlation & Reliability Quantified

Pearson, Spearman, Partial, Reliability Metrics
One-liners
  • Pearson r: linear; Spearman ρ: monotone (ranks); Kendall τ: pair concordance.
  • r² = shared variance.
  • r = 0 ≠ independence (no LINEAR association).
  • Partial: strip Z from both. Semi-partial: from one side.
  • Correlation ≠ causation.
Definitions
  • Partial corr = controls for confounder.
  • Cronbach's α = internal consistency.
Algorithms
  • Compute Pearson r in R: cor(x, y).
  • Spearman: cor(x, y, method='spearman').
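A base-R sketch of partial vs semi-partial by residualisation (`x`, `y`, `z` are hypothetical vectors):
    rx <- resid(lm(x ~ z))    # X with Z stripped out
    ry <- resid(lm(y ~ z))    # Y with Z stripped out
    cor(rx, ry)               # partial correlation: Z removed from both sides
    cor(rx, y)                # semi-partial: Z removed from X only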
Comparisons
  • Pearson vs Spearman: Pearson: linear, sensitive to outliers. Spearman: ranks, captures monotone, robust.
  • Partial vs Semi-partial: Partial residualises both X and Y; semi-partial only one side. Different denominators.
Keywords
Pearson · Spearman · Kendall · partial · semi-partial · Cohen κ · Cronbach α · spurious

Unit 7 — Hypothesis Testing & NHST

p-values, Errors, Power, t-tests
One-liners
  • p = P(data | H₀), not P(H₀ | data).
  • Power = 1 − β; need ≥ 0.80.
  • Cohen's d: 0.2 small / 0.5 medium / 0.8 large.
  • Welch's t = independent t without equal-variance assumption.
  • Never 'accept' H₀.
Definitions
  • Type I = false positive (α); Type II = false negative (β).
  • Power = 1 − β.
Algorithms
  • NHST: H → α → test → assumptions → statistic → p → decide → effect size + CI.
Comparisons
  • Independent t vs Welch t: Welch drops equal-variance assumption; adjusts df. Default when in doubt.
  • Independent t vs Paired t: Paired removes between-subject variability → much more power on same n.
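The three t-test variants plus a power calculation, sketched in base R (`df`, `y`, `group`, `pre`, `post` are hypothetical names):
    t.test(y ~ group, data = df)                    # Welch: R's default (var.equal = FALSE)
    t.test(y ~ group, data = df, var.equal = TRUE)  # classic Student independent t
    t.test(pre, post, paired = TRUE)                # paired t
    power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80)  # n per group for d = 0.5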
Keywords
NHST · p · α · β · power · Cohen's d · t-test · Welch

Unit 8 — Multiple Comparisons (FWER, FDR)

FWER vs FDR; Bonferroni, Holm, BH
One-liners
  • Multiple tests inflate Type I: P(≥1 FP) = 1 − (1−α)^m for m independent tests.
  • Bonferroni: per-test α = α/m. Conservative.
  • Holm: stepwise FWER; uniformly more powerful than Bonferroni.
  • BH: stepwise FDR; less conservative; for exploratory analyses.
  • Pre-registration is the main antidote to forking paths.
Definitions
  • FWER = P(any FP). FDR = E[FP / rejections].
Algorithms
  • BH: sort p's; find largest i with p_(i) ≤ (i/m)·Q; reject that and all smaller.
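All three corrections are one call in base R:
    p <- c(0.001, 0.008, 0.039, 0.041, 0.20)  # toy p-values
    p.adjust(p, method = "bonferroni")        # FWER, uniform
    p.adjust(p, method = "holm")              # FWER, stepwise
    p.adjust(p, method = "BH")                # FDR; reject where adjusted p ≤ Q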
Comparisons
  • FWER vs FDR: FWER conservative — controls P(any FP). FDR less conservative — controls E[FP/rejections].
  • Bonferroni vs Holm: Bonferroni: divide α by m uniformly. Holm: stepwise; more powerful at the same FWER level.
Keywords
FWER · FDR · Bonferroni · Holm · BH · permutation · p-hacking

Unit 9 — Non-parametric & Categorical Tests

Categorical & Rank-Based Tests
One-liners
  • χ² = Σ(O−E)²/E. df = (r−1)(c−1) indep; k−1 GoF.
  • Mann-Whitney ↔ independent t; Wilcoxon signed-rank ↔ paired t.
  • Kruskal-Wallis ↔ one-way ANOVA; Friedman ↔ RM-ANOVA.
  • McNemar = paired χ² for binary outcome.
  • Effect size: φ (2×2 tables), Cramér's V (larger tables).
Definitions
  • Nonparametric = no Normality assumption.
  • Use when ordinal, heavily skewed, small n, or unfixable outliers.
Algorithms
  • χ² independence: compute E_ij = r_i·c_j/n; sum (O−E)²/E; compare to χ²((r−1)(c−1)).
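The matching base-R calls, as a sketch (`tab` is a toy 2×2 table; `df`, `pre`, `post` hypothetical):
    tab <- matrix(c(20, 10, 5, 15), nrow = 2)
    chisq.test(tab)                        # Pearson χ² (Yates correction by default on 2×2)
    fisher.test(tab)                       # exact; use when expected counts are small
    wilcox.test(y ~ group, data = df)      # Mann-Whitney
    wilcox.test(pre, post, paired = TRUE)  # Wilcoxon signed-rank
    mcnemar.test(tab)                      # paired χ² for a binary outcome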
Comparisons
  • Independent t vs Mann-Whitney: Mann-Whitney drops Normality; rank-based; tests stochastic dominance.
  • Paired t vs Wilcoxon signed-rank: Wilcoxon drops Normality on differences; signed-rank-based.
  • χ² test vs Fisher's exact: Fisher used at small expected counts (E < 5); exact rather than approximate.
Keywords
χ² · Mann-Whitney · Wilcoxon · Kruskal-Wallis · Friedman · Spearman · McNemar · Fisher's exact

Unit 10 — Multicollinearity, PCA & Factor Analysis

VIF, PCA, EFA/CFA, Scree Plot
One-liners
  • VIF > 5–10 → severe multicollinearity.
  • PCA = variance maximisation; FA = latent-variable model.
  • EFA discovers; CFA tests.
  • Choose # factors: parallel analysis > Kaiser > scree.
  • Rotation: varimax (orthogonal) vs oblimin (oblique).
Definitions
  • Multicollinearity = correlated predictors → unstable β.
  • Scree plot = eigenvalues; retain before the elbow.
Algorithms
  • FA pipeline: KMO + Bartlett → choose # factors (parallel) → extract → rotate → interpret loadings → CFA on held-out sample.
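A sketch of the pipeline assuming the psych package (`df` is a hypothetical numeric data frame; oblimin rotation also needs GPArotation installed):
    library(psych)
    KMO(df)                                  # sampling adequacy
    cortest.bartlett(cor(df), n = nrow(df))  # Bartlett's test of sphericity
    fa.parallel(df, fa = "fa")               # parallel analysis for # factors
    fit <- fa(df, nfactors = 2, rotate = "oblimin")
    print(fit$loadings, cutoff = 0.3)        # suppress small loadings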
Comparisons
  • PCA vs FA: PCA models all variance; FA models only the shared variance, via latent factors plus error.
  • EFA vs CFA: EFA: data-driven, no prior structure. CFA: hypothesis-driven, tests pre-specified structure.
Keywords
VIF · multicollinearity · PCA · FA · EFA · CFA · scree · parallel analysis · varimax

Unit 11 — ANOVA (one-way, RM, two-way)

Partition, F-test, Sphericity, Post-hoc
One-liners
  • SS_total = SS_between + SS_within.
  • F = MS_between / MS_within.
  • df_between = k−1; df_within = N−k.
  • Tukey HSD = standard post-hoc after significant F.
  • Sphericity violated → Greenhouse-Geisser.
  • Significant interaction qualifies main effects.
Definitions
  • Sphericity = equal variances of all pairwise condition differences.
  • Mauchly = sphericity test.
Algorithms
  • ANOVA: state H, check assumptions, partition SS, compute F, p, η², post-hoc Tukey if significant.
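In base R (`df` with outcome `y` and factor `group`, hypothetical):
    fit <- aov(y ~ group, data = df)
    summary(fit)                         # SS partition, F, p
    ss <- summary(fit)[[1]][["Sum Sq"]]
    ss[1] / sum(ss)                      # η² = SS_between / SS_total
    TukeyHSD(fit)                        # post-hoc pairwise, FWER-controlled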
Comparisons
  • One-way ANOVA vs RM-ANOVA: Between-subjects vs same-subjects in all conditions. RM has more power; needs sphericity assumption.
  • Tukey HSD vs Bonferroni: Tukey optimised for all pairwise comparisons with FWER control. Bonferroni more general but conservative.
Keywords
ANOVA · F · SS · η² · Tukey · sphericity · Mauchly · Greenhouse-Geisser · interaction

Unit 12 — Regression (Linear, Multiple)

OLS, Diagnostics, Multiple Regression
One-liners
  • OLS minimises Σ(residual)².
  • R² never decreases when adding predictors; use adjusted R² for model comparison.
  • LINeM = Linearity, Independence, Normality, Equal variance, no Multicollinearity.
  • For simple regression: R² = r².
  • Categorical predictor with k levels → k−1 dummies.
  • Cook's d > 4/n flags influential outliers.
Definitions
  • OLS = minimise squared residuals.
  • Adjusted R² penalises for # predictors.
Algorithms
  • Regression workflow: fit → R², adj R², F → coefficients with SEs and p → residual diagnostics → effect size + CIs.
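The workflow in base R (`df`, `x1`, `x2` hypothetical):
    fit <- lm(y ~ x1 + x2, data = df)
    summary(fit)                               # coefficients with SEs and p, R², adj R², F
    confint(fit)                               # CIs on the coefficients
    plot(fit)                                  # residual diagnostic plots
    which(cooks.distance(fit) > 4 / nrow(df))  # flag influential cases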
Comparisons
  • R² vs Adjusted R²: R² never decreases with more predictors; adjusted R² penalises by k, honest for model comparison.
  • OLS vs Ridge: OLS unbiased but high variance under multicollinearity; ridge biased but stable; trades bias for variance.
Keywords
OLS · adjusted R² · LINeM · residual diagnostics · heteroscedasticity · dummy · Cook

Unit 13 — Bayesian Statistics

Priors, Posteriors, Bayes Factors
One-liners
  • Posterior ∝ Prior × Likelihood.
  • BF₁₀ = P(D|H₁)/P(D|H₀); continuous evidence.
  • BF₀₁ > 10 = strong evidence for the null (p-values cannot do this).
  • Posterior odds = prior odds × BF.
  • Bayesian robust to optional stopping.
Definitions
  • Prior + Likelihood → Posterior.
  • BF = continuous evidence ratio.
Algorithms
  • Bayes update: identify prior, likelihood; compute evidence; divide.
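The odds form of the update, as a worked sketch with an assumed Bayes factor:
    prior_odds <- 0.5 / 0.5          # equal prior odds on H1 vs H0
    BF10 <- 6                        # hypothetical Bayes factor from the data
    post_odds <- prior_odds * BF10   # posterior odds = prior odds × BF
    post_odds / (1 + post_odds)      # posterior P(H1) = 6/7 ≈ 0.86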
Comparisons
  • Frequentist vs Bayesian: Frequentist: long-run frequency, no priors. Bayesian: degree of belief, explicit priors, can evidence the null.
  • p-value vs Bayes Factor: p = P(data | H₀); BF = ratio of likelihoods; BF can favour H₀, p cannot.
Keywords
Bayes · prior · posterior · likelihood · BF · credible interval · conjugate

Unit 14 — GLMs & Logistic Regression

Logistic Regression and the GLM Framework
One-liners
  • OLS fails on 0/1 outcomes (escapes [0,1], non-Normal residuals, heteroscedastic).
  • Logistic regression: log(p/(1−p)) = linear predictor.
  • OR = exp(β); β > 0 → OR > 1.
  • MLE, not OLS, for fitting.
  • GLM = random + systematic + link.
  • No normality / homoscedasticity needed.
Definitions
  • GLM = random + systematic + link.
  • Logit = log(p/(1−p)).
Algorithms
  • Fit by MLE → coefficients with z-tests → odds ratios → confusion matrix on held-out → AUC.
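In base R (`df` with binary `y`; `test` is a hypothetical held-out set):
    fit <- glm(y ~ x1 + x2, family = binomial, data = df)  # fit by MLE
    summary(fit)                                     # Wald z-tests on log-odds coefficients
    exp(coef(fit))                                   # odds ratios
    predict(fit, newdata = test, type = "response")  # predicted P(y = 1)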
Comparisons
  • OLS vs Logistic: OLS: continuous Y, Normal residuals, least-squares fitting. Logistic: binary Y, logit link, MLE.
  • Linear vs GLM: Linear is a special case of GLM with identity link + Gaussian. GLMs generalise to binomial/Poisson/etc.
Keywords
logit · OR · GLM · link function · MLE · AUC · deviance

Unit 15 — Rapid Revision & Exam Strategy

Decision Tree, Confusions, Report Checklist
One-liners
  • Decision: DV scale × IV scale × #groups × indep/paired → test.
  • Always report effect size + CI.
  • Stat significance ≠ practical significance.
  • Frequentist non-significance ≠ no effect.
  • Pre-register direction; correct multiple comparisons.
Definitions
  • Power = 1 − β.
  • BF interpretation: 3–10 moderate, 10–30 strong, > 30 very strong.
Algorithms
  • 5-step checklist: question → test → assumptions → effect size + CI → interpret.
Comparisons
  • PCA vs FA vs Reliability vs Validity: PCA: variance reduction; FA: latent variables. Reliability: consistency; Validity: accuracy.
Keywords
decision tree · checklist · effect size · CI · report · exam strategy