Behavioral Research: Statistical Methods
CG3.402 • Vinoo Alluri • Monsoon 2025-26 • 4 credits
200-mark mock paper (Set 3) · Paper THREE
Duration: 180 min • Max marks: 200
Section A — 0.5 mark MCQs (20 × 0.5 = 10 marks)
- 1. A z-score of +2.5 indicates the value is: (a) 2.5 units above the mean (b) 2.5 SDs above the mean (c) 2.5% above the mean (d) The 25th percentile [0.5 m]
- 2. A survey reaches 1,000 people; 200 respond. The response rate is: (a) 100% (b) 80% (c) 20% (d) Cannot determine [0.5 m]
- 3. In R, `qt(0.975, df=20)` returns approximately: (a) 1.96 (b) 2.086 (c) 1.645 (d) 0.025 [0.5 m]
- 4. Kruskal-Wallis is the nonparametric counterpart of: (a) Independent t-test (b) Paired t-test (c) One-way ANOVA (d) RM ANOVA [0.5 m]
- 5. The **t-distribution** approaches the **standard normal** as: (a) Sample size shrinks (b) Degrees of freedom increase (c) Effect size grows (d) α decreases [0.5 m]
- 6. Reporting Cohen's d **alongside its 95% CI** matters because: (a) p-values are sufficient (b) The CI shows precision of the effect estimate (c) Effect size is irrelevant (d) Required by law [0.5 m]
- 7. **k-fold cross-validation** is used to: (a) Compute population SD (b) Estimate out-of-sample predictive performance (c) Test homogeneity of variance (d) Bonferroni-correct p-values [0.5 m]
- 8. A boxplot's **whisker** in Tukey's convention extends to: (a) Minimum/maximum values (b) ±1 SD from mean (c) At most 1.5 × IQR beyond the quartiles, ending at the most extreme non-outlier data point (d) 5th/95th percentiles [0.5 m]
- 9. IQR for data {3, 7, 9, 12, 15, 18, 22, 28, 35} is approximately: (a) 32 (b) 18 (c) 15 (d) 9 [0.5 m]
- 10. A **standardized regression coefficient (β)** is interpreted as: (a) Change in DV per unit IV (b) Change in DV in SDs per SD of IV (c) p-value (d) R² [0.5 m]
- 11. Daksh has two groups, unequal variances, normal data, n₁ = 25, n₂ = 30. Best test: (a) Mann-Whitney U (b) Student's t-test (c) Welch's t-test (d) Paired t-test [0.5 m]
- 12. Levene's test with p > .05 means: (a) Homogeneity of variances assumption appears to be met (b) Variances are heterogeneous (c) Normality is met (d) Reject H₀ [0.5 m]
- 13. A 95% **Bayesian credible interval** of [3.2, 5.8] for a parameter means: (a) The procedure has 95% coverage in repeated samples (b) Given the data and prior, there is 95% posterior probability that the parameter lies in [3.2, 5.8] (c) The point estimate is 4.5 (d) Equivalent to a frequentist CI [0.5 m]
- 14. A **directed acyclic graph (DAG)** in causal inference is used to: (a) Visualize a clustering algorithm (b) Encode causal assumptions and identify confounders/colliders (c) Run regressions (d) Estimate Cohen's d [0.5 m]
- 15. In a 2 × 3 mixed ANOVA, a significant **interaction** but non-significant main effects implies: (a) The IVs are uncorrelated (b) The effect of one IV depends on the level of the other; main effects can still be hidden by the interaction (c) Data are non-normal (d) Sample size is too small [0.5 m]
- 16. For a one-tailed test at α = .05 (upper tail), the critical z is: (a) 1.96 (b) 1.645 (c) 2.33 (d) 1.28 [0.5 m]
- 17. Trisha standardizes her variables before regression. The interpretation of coefficients: (a) Doesn't change (b) Now expressed in SD units (c) Tests for normality (d) Eliminates multicollinearity [0.5 m]
- 18. The **inter-rater reliability** statistic Cohen's kappa κ ≈ 0.85 indicates: (a) Excellent agreement beyond chance (b) Slight agreement (c) Random agreement (d) Disagreement [0.5 m]
- 19. Among Bonferroni, Holm, and Benjamini-Hochberg, the **most conservative** (strictest control of the family-wise error rate) is: (a) Bonferroni (b) Holm (c) Benjamini-Hochberg (d) All equivalent [0.5 m]
- 20. **Likelihood × Prior = ___ × Posterior** (in Bayes' rule normalization): (a) Evidence (b) Likelihood (c) Posterior odds (d) Marginal [0.5 m]
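For checking the numeric items above after an attempt, the critical z in item 16 and the quartiles in item 9 can be reproduced with Python's standard library. This is a minimal sketch and not part of the paper: the helper name `iqr` is an illustrative assumption. The two quartile conventions offered by `statistics.quantiles` give different IQRs for the item 9 data, so the "approximate" IQR depends on which convention a course or package uses.

```python
from statistics import NormalDist, quantiles

# Critical values from the standard normal distribution (mean 0, SD 1).
std_normal = NormalDist()
z_upper_tail = std_normal.inv_cdf(1 - 0.05)      # one-tailed, alpha = .05
z_two_tailed = std_normal.inv_cdf(1 - 0.05 / 2)  # two-tailed, alpha = .05

# Data from item 9.
data = [3, 7, 9, 12, 15, 18, 22, 28, 35]


def iqr(values, method):
    """Return Q3 - Q1 under the given quartile convention.

    method is "exclusive" (interpolate at position (n + 1) * p) or
    "inclusive" (interpolate at position (n - 1) * p).
    """
    q1, _median, q3 = quantiles(values, n=4, method=method)
    return q3 - q1


iqr_exclusive = iqr(data, "exclusive")  # Q1 = 8, Q3 = 25
iqr_inclusive = iqr(data, "inclusive")  # Q1 = 9, Q3 = 22

print(f"one-tailed critical z: {z_upper_tail:.3f}")
print(f"two-tailed critical z: {z_two_tailed:.3f}")
print(f"IQR (exclusive): {iqr_exclusive}")
print(f"IQR (inclusive): {iqr_inclusive}")
```

The `inclusive` convention interpolates at position (n − 1)p, which matches the default of R's `quantile()`, so an R user would most likely see the second IQR value.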
Section B — 1 mark MCQs (20 × 1 = 20 marks)
- 1. Karan compares **3 different teaching modalities** (Live, Recorded, Hybrid) on a continuous engagement score. **Different students** in each modality, normal data, equal variances. Best test: (a) Paired t-test (b) One-way ANOVA (c) Mixed ANOVA (d) Chi-square [1 m]
- 2. Anaira fits a **Cox proportional hazards model** for customer churn. The hazard ratio for `monthly_charge` is 1.03. Interpretation: (a) Each ₹1 increase in monthly charge multiplies the instantaneous churn risk by 1.03 (b) Customers churn 3% more often (c) p-value is 0.03 (d) Model fit is 1.03 [1 m]
- 3. ANCOVA combines: (a) Two ANOVAs (b) ANOVA with a continuous covariate (c) Two regressions (d) ANOVA with a categorical covariate [1 m]
- 4. Indrani fits a **hierarchical linear model** with students nested in schools. School-level random intercepts capture: (a) Different school baselines (averages) (b) Different student-level slopes (c) Outliers (d) Multicollinearity [1 m]
- 5. A 3 × 3 contingency table has expected counts of 4, 4, 6, 8, 10, 12, 5, 7, 9. Chi-square's assumption is: (a) Met: all counts > 0 (b) Violated: multiple cells have expected < 5 (c) Met: most cells > 5 (d) Need a different test entirely [1 m]
- 6. Yadu uses **bootstrapping** (1000 resamples) to estimate the SE of the median. He is: (a) Computing population variance (b) Estimating sampling distribution by resampling with replacement (c) Testing normality (d) Adjusting for multiple comparisons [1 m]
- 7. Propensity score matching pairs treated and untreated participants on: (a) The outcome (b) Predicted probability of treatment based on observed covariates (c) Age only (d) Random number [1 m]
- 8. Daksh uses an **instrumental variable** to estimate causal effects. The IV should: (a) Correlate with the outcome directly (b) Affect the outcome ONLY through the treatment (exclusion restriction) and correlate with the treatment (c) Be uncorrelated with the treatment (d) Be categorical only [1 m]
- 9. **Regression discontinuity** exploits: (a) Random assignment (b) A sharp cutoff in treatment assignment based on a continuous score (c) Time-series autocorrelation (d) Categorical predictors [1 m]
- 10. **Difference-in-differences** estimates causal effects by comparing: (a) Pre/post change in treatment group only (b) Treatment vs control at one time only (c) Pre/post change in treatment group MINUS pre/post change in control group (d) Random samples [1 m]
- 11. Nirav fits a moderated mediation model: X → M → Y, with the X → M path moderated by W. To test, he uses: (a) Standard mediation analysis (b) Hayes' PROCESS macro (or equivalent), Model 7 or similar (c) Chi-square (d) ANOVA [1 m]
- 12. A **cluster-randomized trial** randomizes: (a) Individuals (b) Intact groups (schools, villages, clinics) (c) Outcomes (d) Time points [1 m]
- 13. A **Sobel test** in mediation: (a) Tests the indirect effect assuming normal sampling distribution (b) Computes Cohen's d (c) Tests homogeneity (d) Adjusts p-values [1 m]
- 14. A BF₁₀ = 100 represents: (a) Anecdotal evidence (b) Strong evidence for H₁ (c) Very strong / decisive evidence for H₁ (d) Inconclusive [1 m]
- 15. For an independent t-test with d = 0.3, n per group required for 90% power at α = .05: (a) ~30 (b) ~100 (c) ~234 (d) ~500 [1 m]
- 16. Rashi extracts factors using **Maximum Likelihood**. The advantage over PAF is: (a) Doesn't require normality (b) Provides model-fit statistics (χ², CFI, RMSEA) (c) Faster computation (d) Doesn't need rotation [1 m]
- 17. **Item discrimination** in a scale refers to: (a) Whether items distinguish high vs low scorers on the latent trait (b) Discrimination based on demographics (c) Item difficulty (d) Time to complete [1 m]
- 18. **Test-retest** reliability r = 0.45 for a "depression" measure suggests: (a) Excellent reliability (b) Poor reliability: scores fluctuate too much for a stable trait (c) Construct validity (d) High sensitivity [1 m]
- 19. **Composite reliability** (CR) is preferred over Cronbach's α when: (a) Items have equal loadings (b) Items load unequally on the factor; CR weights items by their loadings (c) The scale is unidimensional (d) Sample size is large [1 m]
- 20. **Multiple imputation** for missing data is preferred over single imputation because: (a) It's faster (b) It accounts for uncertainty in the imputed values, producing valid SEs (c) Always more accurate (d) Required by law [1 m]
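Sample-size items such as item 15 can be checked after an attempt with the usual normal-approximation formula for a two-sided, two-group comparison, n per group ≈ 2 · ((z₁₋α/₂ + z_power) / d)². The sketch below assumes that approximation and equal group sizes; the function name `n_per_group` is illustrative. An exact calculation based on the noncentral t distribution would give a slightly larger n.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.90):
    """Approximate n per group for a two-sided independent-samples test.

    Uses the normal approximation n = 2 * ((z_alpha + z_power) / d) ** 2,
    where d is Cohen's d. Assumes equal group sizes and equal variances.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = z.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


n_required = n_per_group(0.3)  # values from item 15
print(n_required)
```

Because n scales with 1/d², halving the effect size roughly quadruples the required sample, which is why small effects dominate power planning.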
Section C — 2 mark short answers (15 × 2 = 30 marks)
- 1. State three **threats to internal validity** that are NOT controlled by random assignment. [2 m]
- 2. Define **ecological validity** and contrast with internal validity. [2 m]
- 3. When would you use **cluster sampling** rather than stratified random sampling? [2 m]
- 4. Differentiate **fixed effects** and **random effects** in multilevel modeling. [2 m]
- 5. Define **construct underrepresentation** with one example. [2 m]
- 6. Karan studies whether **type of programming environment** (Cloud-based vs Local IDE) affects **bug count per 1,000 lines** in code submitted by IIIT-H students. Identify IV, DV, scales, design, and analysis. [2 m]
- 7. Define the **base-rate fallacy** with an example. [2 m]
- 8. State two key features of a **registered report**. [2 m]
- 9. State the purpose of the **Box-Cox transformation**. [2 m]
- 10. Define **omitted variable bias** with an example. [2 m]
- 11. Why might **R² be a misleading measure** for time-series regressions? [2 m]
- 12. Define **effect modification** (synonym: interaction in epidemiology). [2 m]
- 13. State the assumption checks for the **two-sample z-test for proportions**. [2 m]
- 14. State the difference between **estimation** and **prediction** in regression. [2 m]
- 15. Briefly explain why a **Bonferroni-corrected non-significant result** doesn't mean "no effect." [2 m]
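One way to build intuition for item 15 is to see how much power a two-sided z-test loses when its α is divided by the number of tests. The sketch below is illustrative only: the standardized effect `delta` (true effect divided by its standard error) and the number of tests `m` are hypothetical values, not taken from the paper, and the function name `two_sided_power` is an assumption.

```python
from statistics import NormalDist


def two_sided_power(delta, alpha):
    """Power of a two-sided z-test when the test statistic has mean delta.

    delta is the true effect divided by its standard error.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    # Probability of landing in either rejection region.
    return z.cdf(delta - crit) + z.cdf(-delta - crit)


delta = 2.8  # hypothetical: a real effect, roughly 80% power at alpha = .05
m = 10       # hypothetical number of tests being corrected for

power_uncorrected = two_sided_power(delta, 0.05)
power_bonferroni = two_sided_power(delta, 0.05 / m)

print(f"power at alpha = .05:     {power_uncorrected:.2f}")
print(f"power at alpha = .05/{m}: {power_bonferroni:.2f}")
```

Under these assumed values the same true effect is detected far less often after correction, so a non-significant corrected result is compatible with a real effect that the test was underpowered to find.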
Section D — 5 mark questions (12 × 5 = 60 marks)
Section E — 10 mark long descriptive (8 × 10 = 80 marks)