Computer Vision
CSE471 • Prof. Makarand Tapaswi + Prof. Charu Sharma • Spring 2025-26 • 4 credits
Mock Paper 8 — Derivations, Proofs, Mathematical Reasoning
Duration: 180 min • Max marks: 100
Section A — Quick Derivations (2 marks each, 20 marks)
1. Derive the variance of q·k, where each component q_i, k_i ~ N(0, 1) iid and the vectors have dimension d (a numerical sanity check appears after this list). (2 marks)
2. Show that a 1×1 convolution is equivalent to a fully connected layer applied independently at each spatial position. (2 marks)
3. Derive the receptive field of three stacked 3×3 convolutions with stride 1. (2 marks)
4. Derive ∂y/∂x for y = ReLU(x). (2 marks)
5. Show that softmax(x + c) = softmax(x) for any scalar c. (2 marks)
6. Derive Bayes' theorem and relate it to image classification. (2 marks)
7. Show that L2 regularisation is equivalent to placing a Gaussian prior on the weights. (2 marks)
8. Derive the backward pass for y = Wx + b given the upstream gradient ∂L/∂y. (2 marks)
9. A 5-layer network has weight magnitudes w = 0.9 throughout. What happens to the gradient reaching layer 1 from the loss at layer 5? (2 marks)
10. Show that the centering operation in DINO converges to a useful equilibrium. (2 marks)
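For Question A.1, a minimal numerical sanity check can be useful when revising (a sketch only, not a model answer; the variable names and NumPy setup are illustrative). Since the products q_i·k_i are iid with zero mean and unit variance, Var(q·k) = d:

```python
import numpy as np

# Sketch: empirically confirm Var(q . k) = d for q_i, k_i ~ N(0, 1) iid.
# (Illustrative check, not an official solution.)
rng = np.random.default_rng(0)
for d in (16, 64, 256):
    q = rng.standard_normal((100_000, d))
    k = rng.standard_normal((100_000, d))
    dots = np.einsum("nd,nd->n", q, k)  # 100k independent dot products
    print(f"d={d:4d}  empirical var={dots.var():9.1f}  theory={d}")
```

The growth of this variance with d is exactly what motivates the 1/√d scaling of attention scores.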
Section B — Conceptual Derivations (4-6 marks each, 40 marks)
1. Derive the bias-variance decomposition of squared error: E[(y − f̂(x))²] = Bias² + Variance + irreducible noise. (6 marks)
2. Derive that PCA's principal components are the eigenvectors of the data covariance matrix (a quick numerical check appears after this list). (5 marks)
3. Derive the EM algorithm for Gaussian Mixture Models. (5 marks)
4. Derive the closed-form gradient of SimCLR's contrastive loss with respect to the embeddings, assuming they are L2-normalised. (5 marks)
5. Derive why the composition Conv + ReLU + max pool is translation equivariant but a fully connected layer is not. (6 marks)
6. Derive the Inception Score (IS) and show what it measures. (4 marks)
7. Derive why a two-layer Transformer (a previous-token head in the first layer composing with an attention head in the second) can implement induction heads (recognise 'A B … A → predict B'), while a single attention layer cannot. (5 marks)
8. Derive the Information Bottleneck view of supervised learning. (4 marks)
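For Question B.2, a quick numerical check of the claim (a sketch under illustrative assumptions; the toy data and names are made up for the demo) compares PCA computed via the SVD of centred data with the eigendecomposition of the sample covariance matrix:

```python
import numpy as np

# Sketch: PCA directions from the SVD of centred data should match the
# eigenvectors of the sample covariance matrix (up to sign and ordering).
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 3)) @ np.diag([3.0, 1.0, 0.3])  # anisotropic toy data
Xc = X - X.mean(axis=0)                                       # centre the data
cov = Xc.T @ Xc / (len(Xc) - 1)                               # sample covariance

eigvals, eigvecs = np.linalg.eigh(cov)             # ascending eigenvalues
_, S, Vt = np.linalg.svd(Xc, full_matrices=False)  # descending singular values

print(np.allclose(S**2 / (len(Xc) - 1), eigvals[::-1]))     # eigenvalue match
print(np.allclose(np.abs(Vt), np.abs(eigvecs[:, ::-1].T)))  # direction match (up to sign)
```

Both prints should show True, which is the computational counterpart of the derivation B.2 asks for.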
Section C — Long Derivations (10 marks each, 40 marks)
1. Derive the full backward pass for a 3-layer MLP: x → h_1 = ReLU(W_1·x + b_1) → h_2 = ReLU(W_2·h_1 + b_2) → y = softmax(W_3·h_2 + b_3), with loss L = −log(y_t). Derive ∂L/∂W_l and ∂L/∂b_l for l = 1, 2, 3 (a finite-difference gradient check appears after this list). (10 marks)
2. Derive the closed-form optimal homography given 4+ point correspondences via the Direct Linear Transform (DLT). (10 marks)
3. Derive the Adam optimiser update from first principles, and show why it combines momentum and adaptive learning rates effectively. (10 marks)
4. Derive in full how Gaussian Splatting renders an image differentiably: start from the 3D Gaussians and end with the gradients that flow back to update the Gaussian parameters. (10 marks)
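Before writing out the full backprop asked for in C.1, a finite-difference gradient check is a useful sanity test (a sketch; the tiny example and names are illustrative, and the same pattern extends to ∂L/∂W_l and ∂L/∂b_l):

```python
import numpy as np

# Sketch: verify the analytic gradient dL/dz = softmax(z) - onehot(t)
# for L = -log(softmax(z)_t) against central finite differences.
def softmax(z):
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(2)
z, t = rng.standard_normal(5), 2
loss = lambda v: -np.log(softmax(v)[t])

analytic = softmax(z)
analytic[t] -= 1.0  # softmax minus one-hot target

eps = 1e-6
numeric = np.zeros_like(z)
for i in range(len(z)):
    zp, zm = z.copy(), z.copy()
    zp[i] += eps
    zm[i] -= eps
    numeric[i] = (loss(zp) - loss(zm)) / (2 * eps)

print(np.max(np.abs(analytic - numeric)))  # should be tiny (around 1e-10)
```

Chaining this local gradient back through the two ReLU layers yields the ∂L/∂W_l and ∂L/∂b_l expressions the question asks for.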