Behavioral Research: Statistical Methods
CG3.402 • Vinoo Alluri • Monsoon 2025-26 • 4 credits
Cheatsheet
Ultra-condensed. Revise a chapter in minutes.
Unit 1 — Why Do Statistics? (Biases & Base Rates)
The Case for Statistics — Biases, Base Rates, Bayes
One-liners
- Statistics is the corrective for human probabilistic illusion.
- Posterior = Likelihood × Prior / Evidence.
- Mammogram: sensitivity 90%, prevalence 1% (and roughly a 9% false-positive rate) → P(cancer | +) ≈ 9%.
- Simpson: subgroup trends can reverse aggregate trends.
- Statistically significant ≠ practically meaningful.
Formulas
- Bayes: P(H | D) = P(D | H)·P(H) / P(D); evidence P(D) = Σᵢ P(D | Hᵢ)·P(Hᵢ).
Definitions
- Belief bias = judging by conclusion plausibility.
- Confirmation bias = ignoring falsifiers.
- Simpson's paradox = trend reversal on aggregation.
Algorithms
- Bayes update: identify prior, likelihood, compute evidence by total probability, divide.
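Minimal R sketch of the mammogram update above (the 9% false-positive rate is an assumed value chosen to match the one-liner):
  prev <- 0.01                                  # prior: prevalence
  sens <- 0.90                                  # sensitivity, P(+ | cancer)
  fpr  <- 0.09                                  # assumed false-positive rate
  evidence <- sens * prev + fpr * (1 - prev)    # total probability, P(+)
  sens * prev / evidence                        # posterior P(cancer | +) ~ 0.09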
Comparisons
- Sensitivity vs PPV: Sensitivity = P(+ | disease); PPV = P(disease | +). Bayes-related but not equal — PPV depends on base rate.
- p-value vs P(H₀ true): p = P(data | H₀); NOT P(H₀ | data). Confusing them is the prosecutor's fallacy.
Keywords
belief bias • confirmation bias • Simpson's paradox • base rate • Bayes • PPV • Wason
Unit 2 — Research Design & Measurement
Scales, Reliability, Validity
One-liners
- NOIR scales: Nominal → Ordinal → Interval → Ratio.
- Reliability = consistency; Validity = accuracy.
- Cannot be valid without being reliable; can be reliable without being valid.
- Internal validity: causal inference. External validity: generalisability.
Formulas
- Cronbach's α = (k/(k−1))·(1 − Σ s²ᵢ / s²_total), for k items.
Definitions
- Operational definition = measurable spec of abstract construct.
- Confound = third variable threatening internal validity.
Algorithms
- Reliability check pipeline: test-retest → inter-rater → parallel forms → internal consistency (Cronbach's α).
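Minimal base-R sketch of Cronbach's α from the standard formula (toy item matrix, not course data):
  set.seed(1)
  X <- matrix(rnorm(100 * 4), ncol = 4)           # hypothetical 4-item scale
  X[, 2:4] <- X[, 2:4] + X[, 1]                   # make the items correlate
  k <- ncol(X)
  (k / (k - 1)) * (1 - sum(apply(X, 2, var)) / var(rowSums(X)))   # Cronbach's alpha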
Comparisons
- Interval vs Ratio: Interval has no true zero (°C, calendar year). Ratio has true zero (RT, weight). Ratios only meaningful on ratio scale.
- Reliability vs Validity: Reliability = repeatability. Validity = on-target. Stopped clock: reliable, invalid.
Keywords
NOIR • Likert • Cronbach • Cohen κ • convergent • discriminant • ecological • confound • double-blind
Unit 3 — Probability & Distributions
Probability, Distributions, and the CLT
One-liners
- PDF for continuous: P(X = exact) = 0; only intervals have probability.
- Binomial: mean np, variance np(1−p).
- t has heavier tails than the Normal; t → Normal as df → ∞.
- CLT: for large n, x̄ ≈ N(μ, σ²/n) regardless of population shape.
- R distribution prefixes: d (density) / p (CDF) / q (quantile) / r (random).
Formulas
- Binomial: E[X] = np, Var(X) = np(1−p).
- CLT: x̄ ≈ N(μ, σ²/n); SEM = σ/√n.
Definitions
- CLT = sampling distribution → Normal.
- LLN = x̄ → μ as n→∞.
Algorithms
- pbinom(k,n,p) for P(X ≤ k); qbinom for inverse.
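Sketch of the four prefixes on a toy Binomial:
  n <- 10; p <- 0.3
  dbinom(3, n, p)            # P(X = 3)   (d = density/mass)
  pbinom(3, n, p)            # P(X <= 3)  (p = CDF)
  qbinom(0.95, n, p)         # smallest k with P(X <= k) >= 0.95  (q = quantile)
  rbinom(5, n, p)            # five random draws  (r = random)
  c(n * p, n * p * (1 - p))  # mean np, variance np(1-p)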
Comparisons
- Normal vs t: t has heavier tails; depends on df; → Normal as df→∞. Use t when σ unknown.
- LLN vs CLT: LLN: x̄ → μ. CLT: shape of variability around μ becomes Normal.
Keywords
Bernoulli • Binomial • Normal • t • χ² • F • CLT • LLN • PDF • CDF • SEM
Unit 4 — Data Visualization
Plots, Matching, and Common Pitfalls
One-liners
- Always plot the raw data first.
- Match the plot to the scale.
- Tukey outlier rule: flag points beyond Q1 − 1.5×IQR or Q3 + 1.5×IQR.
- Avoid 3D pies, dual y-axes, rainbow colormaps, and red+green encodings (colour-blind unsafe).
Formulas
- Tukey fences: [Q1 − 1.5×IQR, Q3 + 1.5×IQR], where IQR = Q3 − Q1.
Definitions
- Anscombe's quartet = same stats, different plots.
- Skew: positive = right tail.
Algorithms
- Boxplot construction: Q1, median, Q3, IQR; whiskers to the most extreme points within 1.5×IQR of the quartiles; flag points beyond as outliers.
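Base-R sketch of Tukey's rule (toy data with one planted outlier):
  x <- c(rnorm(50), 8)
  q <- quantile(x, c(0.25, 0.75)); iqr <- q[2] - q[1]
  fences <- c(q[1] - 1.5 * iqr, q[2] + 1.5 * iqr)
  x[x < fences[1] | x > fences[2]]              # points flagged by Tukey's rule
  boxplot(x)                                    # same rule, drawn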
Comparisons
- Bar chart vs Pie chart: Bars use position (perceptually accurate); pies use angle (poor). Avoid pies for >3 slices.
- Histogram vs Boxplot: Histogram shows full distribution shape; boxplot summarises with outliers compactly.
Keywords
Anscombe • boxplot • violin • mosaic • heatmap • skew • IQR • Tukey
Unit 5 — Descriptive Statistics
Centre, Spread, Standardisation
One-liners
- Mean (sensitive) vs Median (robust) vs Mode (nominal).
- Skew: mean > median > mode = right tail.
- Bessel's correction: /(n−1).
- z = (x − μ)/σ; |z| ≤ 1.96 ≈ 95% under Normal.
- Robust spread: IQR, MAD.
Formulas
- z = (x − μ)/σ; s² = Σ(x − x̄)²/(n − 1); MAD = median |x − median(x)|.
Definitions
- Bessel: /(n−1) for unbiased s².
- MAD = median |x − median|.
Algorithms
- z-score: subtract the mean, divide by the SD; look up tail areas in the Normal table (or pnorm).
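Sketch (toy data):
  x <- c(10, 12, 9, 15, 11)
  (x - mean(x)) / sd(x)           # z-scores; sd() already uses Bessel's /(n-1)
  pnorm(1.96) - pnorm(-1.96)      # ~0.95 of Normal mass within |z| <= 1.96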
Comparisons
- Mean vs Median: Mean uses all data, sensitive to outliers. Median is the 50th percentile, robust.
- SD vs MAD: SD = √mean squared deviation (Bessel). MAD = median absolute deviation; robust.
Keywords
mean • median • mode • IQR • MAD • Bessel • z-score • skew
Unit 6 — Correlation & Reliability Quantified
Pearson, Spearman, Partial, Reliability Metrics
One-liners
- Pearson r: linear; Spearman ρ: monotone (ranks); Kendall τ: pair concordance.
- r² = shared variance.
- r = 0 ≠ independence (no LINEAR association).
- Partial: strip Z from both. Semi-partial: from one side.
- Correlation ≠ causation.
Formulas
- r = cov(x, y)/(sₓ·s_y); r² = proportion of shared variance.
Definitions
- Partial corr = controls for confounder.
- Cronbach's α = internal consistency.
Algorithms
- Compute Pearson r in R: cor(x, y).
- Spearman: cor(x, y, method='spearman').
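Sketch with toy x, y, z; correlating residuals is one standard base-R way to get the partial correlation:
  set.seed(1)
  z <- rnorm(100); x <- z + rnorm(100); y <- z + rnorm(100)
  cor(x, y)                                    # Pearson (linear)
  cor(x, y, method = "spearman")               # Spearman (ranks)
  cor(x, y, method = "kendall")                # Kendall tau
  cor(resid(lm(x ~ z)), resid(lm(y ~ z)))      # partial r, controlling for z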
Comparisons
- Pearson vs Spearman: Pearson: linear, sensitive to outliers. Spearman: ranks, captures monotone, robust.
- Partial vs Semi-partial: Partial residualises both X and Y; semi-partial only one side. Different denominators.
Keywords
Pearson • Spearman • Kendall • partial • semi-partial • Cohen κ • Cronbach α • spurious
Unit 7 — Hypothesis Testing & NHST
p-values, Errors, Power, t-tests
One-liners
- p = P(data | H₀), not P(H₀ | data).
- Power = 1 − β; need ≥ 0.80.
- Cohen's d: 0.2 small / 0.5 medium / 0.8 large.
- Welch's t = independent t without equal-variance assumption.
- Never 'accept' H₀.
Formulas
- Cohen's d = (x̄₁ − x̄₂)/s_pooled; Power = 1 − β.
Definitions
- Type I = false positive (α); Type II = false negative (β).
- Power = 1 − β.
Algorithms
- NHST: H → α → test → assumptions → statistic → p → decide → effect size + CI.
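Sketch (toy groups; note power.t.test assumes the pooled-variance t, so treat it as approximate for Welch):
  set.seed(1)
  g1 <- rnorm(30); g2 <- rnorm(30, mean = 0.5)
  t.test(g1, g2)                                   # Welch by default (var.equal = FALSE)
  sp <- sqrt((29 * var(g1) + 29 * var(g2)) / 58)   # pooled SD
  (mean(g2) - mean(g1)) / sp                       # Cohen's d
  power.t.test(n = 30, delta = 0.5, sd = 1)        # power of this design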
Comparisons
- Independent t vs Welch t: Welch drops equal-variance assumption; adjusts df. Default when in doubt.
- Independent t vs Paired t: Paired removes between-subject variability → much more power on same n.
Keywords
NHST • p • α • β • power • Cohen's d • t-test • Welch
Unit 8 — Multiple Comparisons (FWER, FDR)
FWER vs FDR; Bonferroni, Holm, BH
One-liners
- Multiple tests inflate Type I: P(≥1 FP) = 1 − (1−α)^m.
- Bonferroni: per-test α = α/m. Conservative.
- Holm: stepwise FWER; uniformly more powerful than Bonferroni.
- BH: stepwise FDR; less conservative; for exploratory analyses.
- Pre-registration is the main antidote to forking paths.
Formulas
- P(≥1 FP over m tests) = 1 − (1−α)^m; Bonferroni: α/m; BH cutoff: p₍ᵢ₎ ≤ (i/m)·Q.
Definitions
- FWER = P(any FP). FDR = E[FP / rejections].
Algorithms
- BH: sort p's; find largest i with p_(i) ≤ (i/m)·Q; reject that and all smaller.
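p.adjust covers all three procedures; the last two lines re-derive the BH cutoff by hand (toy p-values):
  p <- c(0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.300)
  p.adjust(p, method = "bonferroni")
  p.adjust(p, method = "holm")
  p.adjust(p, method = "BH")                 # reject where adjusted p <= Q
  m <- length(p); Q <- 0.05                  # manual BH:
  max(which(sort(p) <= seq_len(m) / m * Q))  # largest i passing (i/m)*Q -> 2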
Comparisons
- FWER vs FDR: FWER conservative — controls P(any FP). FDR less conservative — controls E[FP/rejections].
- Bonferroni vs Holm: Bonferroni: divide α by m uniformly. Holm: stepwise; more powerful at the same FWER level.
Keywords
FWER • FDR • Bonferroni • Holm • BH • permutation • p-hacking
Unit 9 — Non-parametric & Categorical Tests
Categorical & Rank-Based Tests
One-liners
- χ² = Σ(O−E)²/E. df = (r−1)(c−1) indep; k−1 GoF.
- Mann-Whitney ↔ independent t; Wilcoxon signed-rank ↔ paired t.
- Kruskal-Wallis ↔ one-way ANOVA; Friedman ↔ RM-ANOVA.
- McNemar = paired χ² for binary outcome.
- Effect size: φ (2×2 tables), Cramér's V (larger tables).
Formulas
- χ² = Σ(O−E)²/E; Eᵢⱼ = (row i total)·(col j total)/n; φ = √(χ²/n); Cramér's V = √(χ²/(n·(min(r,c)−1))).
Definitions
- Nonparametric = no Normality assumption.
- Use when ordinal, heavily skewed, small n, or unfixable outliers.
Algorithms
- χ² independence: compute E_ij = r_i·c_j/n; sum (O−E)²/E; compare to χ²((r−1)(c−1)).
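Sketch on a toy 2×2 table:
  tab <- matrix(c(20, 10, 5, 15), nrow = 2)   # toy 2x2 contingency table
  chisq.test(tab)                             # df = (2-1)(2-1) = 1
  chisq.test(tab)$expected                    # check expected counts >= 5
  fisher.test(tab)                            # exact test for small E
  wilcox.test(rnorm(20), rnorm(20, 1))        # Mann-Whitney (unpaired ranks)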
Comparisons
- Independent t vs Mann-Whitney: Mann-Whitney drops Normality; rank-based; tests stochastic dominance.
- Paired t vs Wilcoxon signed-rank: Wilcoxon drops Normality on differences; signed-rank-based.
- χ² test vs Fisher's exact: Fisher used at small expected counts (E < 5); exact rather than approximate.
Keywords
χ² • Mann-Whitney • Wilcoxon • Kruskal-Wallis • Friedman • Spearman • McNemar • Fisher's exact
Unit 10 — Multicollinearity, PCA & Factor Analysis
VIF, PCA, EFA/CFA, Scree Plot
One-liners
- VIF > 5–10 → severe multicollinearity.
- PCA = variance maximisation; FA = latent-variable model.
- EFA discovers; CFA tests.
- Choose # factors: parallel analysis > Kaiser > scree.
- Rotation: varimax (orthogonal) vs oblimin (oblique).
Formulas
- VIF_j = 1/(1 − R²_j), with R²_j from regressing predictor j on the other predictors.
Definitions
- Multicollinearity = correlated predictors → unstable β.
- Scree plot = eigenvalues; retain before the elbow.
Algorithms
- FA pipeline: KMO + Bartlett → choose # factors (parallel) → extract → rotate → interpret loadings → CFA on held-out sample.
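Base-R sketch of the PCA/scree part on toy data; the EFA steps need an add-on (e.g., the psych package):
  set.seed(1)
  X <- scale(matrix(rnorm(200 * 6), ncol = 6))   # toy standardised items
  pc <- prcomp(X)
  pc$sdev^2                                      # eigenvalues (Kaiser: keep > 1)
  screeplot(pc, type = "lines")                  # look for the elbow
  # KMO, Bartlett, parallel analysis, fa(): see the psych package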
Comparisons
- PCA vs FA: PCA: all variance. FA: shared variance only with latent model and error.
- EFA vs CFA: EFA: data-driven, no prior structure. CFA: hypothesis-driven, tests pre-specified structure.
Keywords
VIF • multicollinearity • PCA • FA • EFA • CFA • scree • parallel analysis • varimax
Unit 11 — ANOVA (one-way, RM, two-way)
Partition, F-test, Sphericity, Post-hoc
One-liners
- SS_total = SS_between + SS_within.
- F = MS_between / MS_within.
- df_between = k−1; df_within = N−k.
- Tukey HSD = standard post-hoc after significant F.
- Sphericity violated → Greenhouse-Geisser.
- Significant interaction qualifies main effects.
Formulas
- SS_total = SS_between + SS_within; F = MS_between/MS_within; η² = SS_between/SS_total.
Definitions
- Sphericity = equal variances of all pairwise condition-difference scores.
- Mauchly = sphericity test.
Algorithms
- ANOVA: state H, check assumptions, partition SS, compute F, p, η², post-hoc Tukey if significant.
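Sketch (toy data):
  set.seed(1)
  d <- data.frame(y = rnorm(60), g = gl(3, 20, labels = c("a", "b", "c")))
  fit <- aov(y ~ g, data = d)
  summary(fit)                      # F = MS_between / MS_within, df 2 and 57
  TukeyHSD(fit)                     # all pairwise comparisons, FWER-controlled
  ss <- summary(fit)[[1]][["Sum Sq"]]
  ss[1] / sum(ss)                   # eta^2 = SS_between / SS_total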
Comparisons
- One-way ANOVA vs RM-ANOVA: Between-subjects vs same-subjects in all conditions. RM has more power; needs sphericity assumption.
- Tukey HSD vs Bonferroni: Tukey optimised for all pairwise comparisons with FWER control. Bonferroni more general but conservative.
Keywords
ANOVA • F • SS • η² • Tukey • sphericity • Mauchly • Greenhouse-Geisser • interaction
Unit 12 — Regression (Linear, Multiple)
OLS, Diagnostics, Multiple Regression
One-liners
- OLS minimises Σ(residual)².
- R² never decreases when predictors are added; use adjusted R².
- LINeM = Linearity, Independence, Normality, Equal variance, no Multicollinearity.
- For simple regression: R² = r².
- Categorical predictor with k levels → k−1 dummies.
- Cook's d > 4/n flags influential outliers.
Formulas
- ŷ = b₀ + b₁x; R² = 1 − SS_res/SS_total; adj R² = 1 − (1−R²)(n−1)/(n−k−1).
Definitions
- OLS = minimise squared residuals.
- Adjusted R² penalises for # predictors.
Algorithms
- Regression workflow: fit → R², adj R², F → coefficients with SEs and p → residual diagnostics → effect size + CIs.
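Sketch (toy data):
  set.seed(1)
  d <- data.frame(x = rnorm(100)); d$y <- 2 + 0.5 * d$x + rnorm(100)
  fit <- lm(y ~ x, data = d)
  summary(fit)                               # coefficients + p, R^2, adj R^2, F
  plot(fit)                                  # residual diagnostics (LINeM checks)
  which(cooks.distance(fit) > 4 / nrow(d))   # influential points by the 4/n rule
  confint(fit)                               # CIs for the coefficients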
Comparisons
- R² vs Adjusted R²: R² always ↑ with predictors; Adj R² penalises by k — honest for model comparison.
- OLS vs Ridge: OLS unbiased but high variance under multicollinearity; ridge accepts a little bias for much lower variance.
Keywords
OLS • R² • adjusted R² • LINeM • residual diagnostics • heteroscedasticity • dummy • Cook
Unit 13 — Bayesian Statistics
Priors, Posteriors, Bayes Factors
One-liners
- Posterior ∝ Prior × Likelihood.
- BF₁₀ = P(D|H₁)/P(D|H₀); continuous evidence.
- BF₀₁ > 10 = strong evidence for the null (p-values cannot do this).
- Posterior odds = prior odds × BF.
- Bayesian robust to optional stopping.
Formulas
- Posterior ∝ Prior × Likelihood; BF₁₀ = P(D | H₁)/P(D | H₀); posterior odds = prior odds × BF₁₀.
Definitions
- Prior + Likelihood → Posterior.
- BF = continuous evidence ratio.
Algorithms
- Bayes update: identify prior, likelihood; compute evidence; divide.
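Discrete two-hypothesis sketch (toy likelihoods):
  prior <- c(H1 = 0.5, H0 = 0.5)         # prior probabilities
  like  <- c(H1 = 0.20, H0 = 0.05)       # toy P(D | H)
  prior * like / sum(prior * like)       # posterior: 0.8, 0.2
  unname(like["H1"] / like["H0"])        # BF10 = 4 = posterior odds at even priors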
Comparisons
- Frequentist vs Bayesian: Frequentist: long-run frequency, no priors. Bayesian: degree of belief, explicit priors, can evidence the null.
- p-value vs Bayes Factor: p = P(data | H₀); BF = ratio of likelihoods; BF can favour H₀, p cannot.
Keywords
Bayes • prior • posterior • likelihood • BF • credible interval • conjugate
Unit 14 — GLMs & Logistic Regression
Logistic Regression and the GLM Framework
One-liners
- OLS fails on 0/1 outcomes (escapes [0,1], non-Normal residuals, heteroscedastic).
- Logistic regression: log(p/(1−p)) = linear predictor.
- OR = exp(β); β > 0 → OR > 1.
- MLE, not OLS, for fitting.
- GLM = random + systematic + link.
- No normality / homoscedasticity needed.
Formulas
- logit(p) = log(p/(1−p)) = β₀ + β₁x₁ + …; OR = e^β; p = 1/(1 + e^(−η)).
Definitions
- GLM = random + systematic + link.
- Logit = log(p/(1−p)).
Algorithms
- Fit by MLE → coefficients with z-tests → odds ratios → confusion matrix on held-out → AUC.
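Sketch (toy data; accuracy here is in-sample, unlike the held-out step above):
  set.seed(1)
  d <- data.frame(x = rnorm(200))
  d$y <- rbinom(200, 1, plogis(-1 + 0.8 * d$x))      # true model on the logit scale
  fit <- glm(y ~ x, family = binomial, data = d)     # fitted by MLE
  summary(fit)                                       # Wald z-tests on coefficients
  exp(coef(fit))                                     # odds ratios
  mean((predict(fit, type = "response") > 0.5) == d$y)   # crude accuracy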
Comparisons
- OLS vs Logistic: OLS: continuous Y, Normal residuals, least-squares fit. Logistic: binary Y, logit link, MLE.
- Linear vs GLM: Linear is a special case of GLM with identity link + Gaussian. GLMs generalise to binomial/Poisson/etc.
Keywords
logit • OR • GLM • link function • MLE • AUC • deviance
Unit 15 — Rapid Revision & Exam Strategy
Decision Tree, Confusions, Report Checklist
One-liners
- Decision: DV scale × IV scale × #groups × indep/paired → test.
- Always report effect size + CI.
- Stat significance ≠ practical significance.
- Frequentist non-significance ≠ no effect.
- Pre-register direction; correct multiple comparisons.
Formulas
- Power = 1 − β; P(≥1 FP over m tests) = 1 − (1−α)^m.
Definitions
- Power = 1 − β.
- BF interpretation: 3–10 moderate, 10–30 strong, > 30 very strong.
Algorithms
- 5-step checklist: question → test → assumptions → effect size + CI → interpret.
Comparisons
- PCA vs FA vs Reliability vs Validity: PCA: variance reduction; FA: latent variables. Reliability: consistency; Validity: accuracy.
Keywords
decision tree • checklist • effect size • CI • report • exam strategy