Human-machine teaming is the idea that humans and machines can provide complementary information when solving a given task, such that their combination achieves better performance than either alone. We implement human-machine teaming by introducing the concept of human-AI supervision. This refers to the hypothesis that humans can provide input to train better models by guiding them toward salient information, and that these human-aided models can in turn help future human examiners solve the task more effectively. This dissertation addresses the question of how to accomplish this in the context of human visual perception and deep convolutional neural networks.

This document presents a series of works that outline how to effectively guide deep learning models toward human-defined regions of saliency and thus help the models learn more generalizable features (part 1 of human-AI supervision). I then show that these human-guided models can aid future humans in solving the task by supplying useful information about each sample (part 2 of human-AI supervision). By guiding models toward human-defined saliency, we help them avoid learning spurious features, i.e., incidental features in the training data that do not generalize to the entire domain. Results show that training on one sample annotated with human saliency can be equivalent to training on multiple samples without such annotations. While this idea could in principle be applied to any domain in which humans can provide meaningful input, this dissertation focuses on computer vision applications: post-mortem iris recognition, fake iris detection, synthetic face detection, and detection of physiological abnormalities in chest X-ray scans.
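To make the first part of human-AI supervision concrete, one common way to guide a model toward human-defined saliency is to add a penalty term comparing the model's attention map against a human-annotated saliency map. The sketch below is a hypothetical, simplified illustration of that idea, not the dissertation's exact formulation; the function names, the use of mean squared error as the alignment penalty, and the weighting parameter `alpha` are all assumptions made for clarity.

```python
# Hypothetical sketch: a training loss that blends standard classification
# loss with a penalty for disagreement between the model's class activation
# map (CAM) and a human-annotated saliency map. Simplified illustration,
# not the dissertation's exact method.
import numpy as np

def normalize_map(m):
    """Scale a 2-D map to [0, 1]; a flat map becomes all zeros."""
    m = m - m.min()
    peak = m.max()
    return m / peak if peak > 0 else m

def saliency_guided_loss(cross_entropy, model_cam, human_saliency, alpha=0.5):
    """Blend classification loss with a saliency-alignment penalty.

    Both maps are normalized first, so the penalty measures *where* the
    model looks rather than raw activation magnitude. alpha (assumed
    hyperparameter) weights the alignment term.
    """
    cam = normalize_map(np.asarray(model_cam, dtype=float))
    human = normalize_map(np.asarray(human_saliency, dtype=float))
    alignment = float(np.mean((cam - human) ** 2))  # MSE between maps
    return (1 - alpha) * cross_entropy + alpha * alignment

# Toy example: a model attending to the human-marked region is rewarded,
# while one attending everywhere else is penalized.
human = np.zeros((4, 4)); human[1:3, 1:3] = 1.0
cam_good = human.copy()
cam_bad = 1.0 - human  # attends only where the human did not
loss_good = saliency_guided_loss(0.3, cam_good, human)  # alignment term is 0
loss_bad = saliency_guided_loss(0.3, cam_bad, human)
assert loss_good < loss_bad  # misplaced attention raises the loss
```

In a real training loop this penalty would be differentiated through the CAM so that gradient descent steers the network's attention toward the human-marked regions, which is the mechanism that discourages reliance on spurious features.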