Face recognition from video is an important theme in biometric research today. We survey current approaches to recognition in video, including published techniques for addressing problems such as pose and illumination variation. The problem has been attacked with PCA-based approaches, exploitation of the temporal continuity of frames, and construction of 3-D face models to handle pose changes. These approaches have drawbacks, such as requiring extensive preprocessing or prior knowledge of the datasets; some require the face to be manually cropped from the background or the eyes to be marked. In this work, we develop strategies for selecting a set of frames to represent a subject in the gallery and probe sets, improving recognition performance over that of a single frame when the data is acquired from video. We exploit the fact that many such frames are available in a few seconds of video data, and we incorporate both quality and diversity in deciding which images will represent each subject in our dataset. We demonstrate our approaches on the Notre Dame dataset, which uses three different sensors and is the largest known research video dataset. We also compare our best approach to using a single high-quality image in the gallery set, and compare the performance of FaceIt and Viisage on this set. The Honda/UCSD dataset is used to show that our approach performs comparably to an existing approach.
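The quality-and-diversity frame selection described above can be sketched as a greedy procedure; the quality scores, the cosine-similarity redundancy term, and the `alpha` trade-off weight below are illustrative assumptions, not the paper's exact method:

```python
import numpy as np

def select_frames(features, quality, k=3, alpha=0.5):
    """Greedily pick k frames that score high on quality while staying
    dissimilar (diverse) from the frames already chosen.

    features: (n, d) array-like of per-frame feature vectors
    quality:  (n,) array-like of per-frame quality scores
    """
    feats = np.asarray(features, dtype=float)
    # Normalize rows so dot products become cosine similarities.
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    quality = np.asarray(quality, dtype=float)
    chosen = [int(np.argmax(quality))]           # seed with the best frame
    while len(chosen) < min(k, len(feats)):
        sim_to_chosen = feats @ feats[chosen].T  # (n, |chosen|) similarities
        redundancy = sim_to_chosen.max(axis=1)   # closeness to the chosen set
        score = quality - alpha * redundancy     # quality minus redundancy
        score[chosen] = -np.inf                  # never re-pick a frame
        chosen.append(int(np.argmax(score)))
    return chosen
```

With two near-duplicate high-quality frames and one distinct lower-quality frame, this criterion picks the best frame first and then prefers the distinct frame over the near-duplicate, which is the intended trade-off between quality and diversity.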