Recognizing masked perpetrators in real-world surveillance scenarios is challenging because of facial occlusion and degraded image quality. This study investigated the effects of contextual congruency on matching surveillance videos to suspects’ photos. Participants (N = 229) completed a face-matching task involving four masked or unmasked video targets paired with either full-face or masked photos. Matching accuracy was significantly higher for unmasked than for masked faces, and contextual congruency offered only a modest benefit (OR = 1.40). Participants were more confident in accurate than in inaccurate matching decisions, and the confidence-accuracy relationship was stronger in unmasked than in masked conditions. Contextual congruency did not affect confidence, and it undermined the confidence-accuracy relationship only when the videos displayed a masked target. These findings highlight the limits of human performance in identifying masked individuals under degraded conditions, as well as the limits of potential strategies for improving face recognition in forensic and surveillance contexts.