Computer Vision

CSE471

Prof. Makarand Tapaswi + Prof. Charu Sharma • Spring 2025-26 • 4 credits

Mock Paper 9 — Debugging, Failure Analysis, Critical Thinking

Duration: 180 min • Max marks: 100

20 marks

1. ImageNet CNN: 95% training accuracy, 30% validation accuracy. Diagnose and prescribe. (2 m)
2. ViT trained from scratch on a small dataset: training and validation loss are both stuck at 2.3 after 100 epochs. Diagnose. (2 m)
3. Sobel applied to a noisy image produces many false edges. Diagnose and prescribe. (2 m)
4. Mask R-CNN: boxes are accurate but masks have jagged edges. Diagnose. (2 m)
5. YOLO reaches 95% mAP on standard objects but fails on small (< 32 px) objects. Diagnose. (2 m)
6. DINO: the teacher has higher accuracy than the student throughout training. Is this normal? (2 m)
7. SimCLR collapses: all images map to the same representation. Diagnose. (2 m)
8. U-Net for medical segmentation: the loss decreases but the predicted masks are entirely black. Diagnose. (2 m)
9. NMS at an IoU threshold of 0.5 suppresses many overlapping true positives. Diagnose. (2 m)
10. GAN: the discriminator (D) loss → 0 while the generator (G) loss explodes. Diagnose. (2 m)

40 marks

1. Object-classification CNN: 100% training accuracy after 1 epoch, 10% validation accuracy (chance). What are the possible causes, and how would you distinguish between them? (5 m)
2. Mask R-CNN: 90% box mAP, but the masks are perfect rectangles filling each box. What is happening? (5 m)
3. CLIP: retrieval is perfect on standard pairs but fails with synonyms ('car' works, 'automobile' fails). Why? (5 m)
4. CycleGAN image-to-image translation: output is correct globally but shows artifacts along region boundaries. Why? (4 m)
5. Face recognition: 99% on the benchmark, 70% in deployment. Why, and how would you debug it? (5 m)
6. Video action recognition: 90% on UCF-101, 30% on a similar internal dataset. Investigate. (4 m)
7. A model performs perfectly on the training and validation sets. After 6 months in deployment, accuracy drops from 95% to 60%. Diagnose. (4 m)
8. A self-driving pedestrian detector misses dark-skinned pedestrians at night more often than light-skinned ones. Investigate this fairness failure and propose mitigations. (5 m)

40 marks

1. ViT-Base achieves 76% on ImageNet (the paper reports 81%). Investigate every possible cause and write a complete debugging plan. (10 m)
2. A Mask R-CNN deployed for defect detection drops from 95% to 70% accuracy after 6 months. Give a structured debugging plan. (10 m)
3. A research team trains DINOv2 from scratch with the same architecture and hyperparameters as the official release, but the model performs significantly worse. Investigate. (10 m)
4. A ClipCap captioning system gets progressively worse over time: more factual errors, hallucinated objects, and repetitive phrasing. Diagnose. (10 m)
