This work proposes and evaluates strategies based on Stacked Supervised Auto-Encoders (SSAE) for face representation in video surveillance applications. The study focuses on the identification task with a single sample per person (SSPP) in the gallery. Variations in pose, facial expression, illumination, and occlusion are addressed in two ways. First, the SSAE extracts features from face images that are robust to such variations. Second, we propose methods to exploit the multiple samples per person probes (MSPPP) that can be extracted from video sequences. Three variants of the proposed method are compared on the HONDA/UCSD and VIDTIMIT video datasets. The experimental results demonstrate that strategies combining SSAE and MSPPP outperform other SSPP methods, such as local binary patterns, in face recognition from video.