Computer Vision
CSE471
Fill in the Blanks
Formulas, keywords, theorem statements.
GIoU adds a penalty term proportional to the area of ____ that lies outside (A ∪ B).
Focal loss multiplies cross-entropy by (1 − p_t)^γ with γ ≈ ____.
Dice can be rewritten in terms of IoU as Dice = ____.
PCKh@0.5 normalises the distance threshold by ____.
PointNet's aggregation function is ____, chosen because it is symmetric (permutation invariant).
In 3DGS, the covariance is parameterised as Σ = ____.
Scaled dot-product attention divides QKᵀ by ____ to keep softmax in its non-saturated regime.
ViT-B/16 has approximately ____ million parameters.
BYOL prevents collapse via ____ on the target network plus a predictor head on the online network.
MAE masks ____% of patches, far more than BERT's 15%, because images are spatially redundant.
RoPE rotates the (q_{2i}, q_{2i+1}) pair by an angle proportional to ____.
In PaliGemma's Prefix-LM masking, image and prompt tokens use ____ attention while the answer suffix uses causal attention.
I3D inflates a 2D K×K filter into a 3D K×K×K filter by replicating along time and dividing by ____.
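
Self-check sketch for the Dice/IoU item above: the snippet below computes both metrics directly from their set definitions, IoU = |A ∩ B| / |A ∪ B| and Dice = 2|A ∩ B| / (|A| + |B|), so a candidate expression for the blank can be tested numerically. The function name `iou_and_dice` and the toy masks are hypothetical, chosen only for illustration; masks are assumed to be given as sets of pixel indices.

```python
def iou_and_dice(a, b):
    """Return (IoU, Dice) for two masks given as collections of pixel indices."""
    a, b = set(a), set(b)
    inter = len(a & b)   # |A ∩ B|
    union = len(a | b)   # |A ∪ B|
    if union == 0:
        raise ValueError("both masks are empty, so the metrics are undefined")
    iou = inter / union
    dice = 2 * inter / (len(a) + len(b))
    return iou, dice


# Hypothetical toy masks (illustration only).
mask_a = {0, 1, 2, 3}
mask_b = {2, 3, 4, 5, 6, 7}

iou, dice = iou_and_dice(mask_a, mask_b)
print(f"IoU = {iou:.3f}, Dice = {dice:.3f}")
```

To check an answer, evaluate the candidate formula at the printed IoU and compare it with the printed Dice; repeating this with a few different masks guards against a coincidental match.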