Out of Mind, Out of Sight: Unexpected Scene Elements Frequently Go Unnoticed Until Primed


Out of Mind, Out of Sight: Unexpected Scene Elements
Frequently Go Unnoticed Until Primed

George M. Slavich & Philip G. Zimbardo

Published online: 18 September 2013
# Springer Science+Business Media New York 2013

Abstract The human visual system employs a sophisticated set of strategies for
scanning the environment and directing attention to stimuli that can be expected
given the context and a person’s past experience. Although these strategies enable us
to navigate a very complex physical and social environment, they can also cause
highly salient, but unexpected stimuli to go completely unnoticed. To examine the
generality of this phenomenon, we conducted eight studies that included 15 different
experimental conditions and 1,577 participants in all. These studies revealed that a
large majority of participants do not report having seen a woman in the center of an
urban scene who was photographed in midair as she was committing suicide. Despite
seeing the scene repeatedly, 46 % of all participants failed to report seeing a central
figure and only 4.8 % reported seeing a falling person. Frequency of noticing the
suicidal woman was highest for participants who read a narrative priming story that
increased the extent to which she was schematically congruent with the scene. In
contrast to this robust effect of inattentional blindness, a majority of participants
reported seeing other peripheral objects in the visual scene that were equally difficult
to detect, yet more consistent with the scene. Follow-up qualitative analyses revealed
that participants reported seeing many elements that were not actually present, but
which could have been expected given the overall context of the scene. Together,
these findings demonstrate the robustness of inattentional blindness and highlight the
specificity with which different visual primes may increase noticing behavior.

Keywords Visual attention . Selective attention . Perception . Inattentional blindness .

Figure-ground perception . Aschematic blindness . Priming . Expectations .

Transformational teaching

Curr Psychol (2013) 32:301–317
DOI 10.1007/s12144-013-9184-3

G. M. Slavich (*)
Cousins Center for Psychoneuroimmunology and Department of Psychiatry and Biobehavioral
Sciences, University of California, Los Angeles, UCLA Medical Plaza 300, Room 3156, Los Angeles,
CA 90095-7076, USA
e-mail: gslavich@mednet.ucla.edu

P. G. Zimbardo
Department of Psychology, Stanford University, Stanford, CA 94305-2130, USA


Philosophers and psychologists have long questioned whether humans have access to an
objective reality. One possibility is that we are capable of seeing the physical and social
world as it actually exists. This perspective has been termed naïve realism, or direct
realism (Ross and Ward 1996), and it posits that our eyes function like cameras or
sensors that collect information and then send it back to our brains, where it is
represented accurately and objectively. Underlying this approach to seeing the empirical
world “as it is,” in terms of its database of information, is bottom-up processing. In this
mode of processing, visual attention is deployed based largely on the inherent saliency
of different objects in a visual scene (Itti 2005; Itti et al. 1998). Objects that are highly
salient are readily noticed and perceived by observers, while less salient objects are not.

Another possibility is that our experience of the physical and social world is
inevitably influenced by subjective mental processes. This perspective was described
in Bruner’s (1957) classic treatise on perceptual readiness. In that landmark paper,
Bruner argued that some people are readier to expect, and therefore quicker to
perceive, the least desirable event among an array of expected events; for others, in
contrast, it is the most desirable event that stands out. From this perspective, for some
people at least some of the time, we see the world “as we think it is,” in terms of our
expectations, values, goals, and personal history. Underlying the process of seeing the
world in this way, as influenced by cognition, is top-down processing. In this mode of
processing, detection and perception of visual information is inherently shaped by
observers’ differing histories, motivations, and tendencies, leading people to perceive
different aspects of the visual scenes they encounter.

A large body of research has recently attempted to evaluate the respective contri-
butions of bottom-up and top-down processing to visual search, and this work has
renewed the debate about how real-world scenes are perceived and remembered (e.g.,
Chun 2003; Henderson and Hollingworth 1999). Arising from this research is
evidence that bottom-up processes affect attention at a localized level, especially in
the absence of top-down or contextual cues (Chen and Zelinsky 2006). When
contextual information is present, however, it appears to heavily guide visual search
behavior (Hsieh et al. 2011). This means that under most natural viewing conditions,
top-down processes likely play a prominent role in guiding visual search, insofar as
these processes provide information about the likely presence and expected location
of various elements within a given scene (Torralba et al. 2006; Wolfe et al. 2003).

Central to top-down processing are conceptual frameworks, or schemas, that encode
complex generalizations about one’s experience of the structure of the surrounding
environment. We have schemas for virtually all aspects of our experience, including
objects, people, and situations. Ultimately, it is through these schemas that we perceive,
attend to, evaluate, understand, and remember experiences of the physical and social
world (Cantor and Michel 1979; Levy et al. 1999). Schemas simplify the extremely
complex task of identifying and classifying the innumerable things in our visual world
by grouping instances into more readily-accessible wholes, or gestalts. By doing so,
schemas greatly enhance the speed and accuracy of processing the enormous amount of
information to which we are exposed everyday (Stryker and Burke 2000).

The obvious utility of schematic perception is based on the swift processing of the
familiar. However, what about the novel event, or unfamiliar or unexpected stimulus
(Beiderman et al. 1982)? Do perceptual objects of this sort require greater time or mental
energy to process? Or, are such objects simply not perceived if they are sufficiently

302 Curr Psychol (2013) 32:301–317


incongruent, or aschematic, with our personal expectations, or with their general context
or background (Wolfe et al. 2005)?

Early answers to these questions were offered by social philosopher Henry David
Thoreau (1860/1906), who wrote:

A man receives only what he is ready to receive, whether physically or intellectually
or morally, as animals conceive at certain seasons their own kind only. We hear and
apprehend only what we already half know… The phenomenon or fact that cannot
in any wise be linked with the rest which he has observed, he does not observe
(pp. 77–78).

Thoreau argued that perception is guided in part by previously-established schemas
that provide categorical assignments for new stimuli based on their inherent meaning.
This process, in turn, influences our expectations for the probable appearance of different
stimuli within different contexts. According to this view, stimuli or changes that violate
our expectations, or that rarely occur, should be less likely to be seen than those that are
schematic or expected given their context or background. A large body of research has
documented effects that are consistent with this view, and this phenomenon has generally
been referred to as inattentional blindness (Becklen and Cervone 1983; Mack 2003;
Mack and Rock 1998; Most et al. 2001; Neisser and Becklen 1975; Simons 2000).

Updating Thoreau takes us to a social-constructionist view of perception. This
perspective posits that we see only what we have learned to see and that we are
psychologically prepared to see. This perceptual preparation comes as a consequence
of a complex personal history of perceiving events, actions, and people in certain
ways, within certain contexts that are socially and culturally appropriate. Each
individual’s personal history creates a set of powerful perceptual expectations about
the way the world is and ought to be. As a result of these expectations, we most
readily see what we expect to see and interpret incoming visual information in ways
that are highly congruent with preexisting schemas. This notion that perceptions of
the external world are shaped in this way have long been recognized by cognitive and
gestalt theorists (e.g., Beck 1967; Perls 1969), and have been elaborated on in many
ways over the years (e.g., Attneave 1954; Axelrod 1973; Becker 1973; Greenberg
et al. 1986; Henderson and Hollingworth 1999).

The present study was designed to examine this general perceptual phenomenon
by testing how frequently individuals report seeing contextually expected and unex-
pected objects in a visual scene. The study idea originated from an experience of one
the authors (PGZ). Specifically, while looking through photos in the book, LIFE: The
First 50 Years, 1936–1986, PGZ expressed surprise and distress at a photo that
depicted a woman in midair, who had been photographed seconds after having leapt
from a hotel window (See Fig. 1). A companion who was also looking at the photo
over his shoulder noticed nothing unusual. Indeed, his recognition of the reality of the
shocking scene came only after seeing the photo caption, which read: “The camera
caught the 1942 plunge toward the Buffalo pavement of despondent divorcee in the
last dreadful split second before death” (Kunhardt 1986, p. 64).

To examine whether this failure to perceive the suicidal woman was a unique
experience or an instance of a more common phenomenon, we conducted a series of
eight studies in which more than 1,550 students viewed the photograph in a controlled
classroom setting under fifteen different experimental conditions. Each study was

Curr Psychol (2013) 32:301–317 303


conducted under the auspices of an experiential activity on “visual perception,” which is
consistent with a transformational teaching approach to classroom instruction (Slavich
2005, 2006, 2009; Slavich and Zimbardo 2012; see also Slavich and Toussaint 2013).
First, we examined the generality of the phenomenon of failing to see an object that is
unexpected, but which should otherwise be readily apparent given its dynamic centrality
in a static stimulus array. Based on previous research demonstrating the failure of
individuals to detect rare objects (Wolfe et al. 2005) and unlikely or unexpected events
(Simons and Chabris 1999), we hypothesized that most viewers would not report having
seen the suicidal woman. Next, to extend the existing literature on this topic, we examined
the extent to which a series of very specific priming and structural manipulations of
the photograph enhanced participants’ detection of the woman and other peripheral
scene elements. Across these experimental manipulations, we predicted that reporting of
particular scene elements (including the woman) would increase greatest for participants

Fig. 1 Original suicidal woman stimulus photograph, published in LIFE: The First 50 Years, 1936–1986

304 Curr Psychol (2013) 32:301–317


who received primes aimed at making these elements more schematic with the scene,
and thus more expected given the visual context.

Method

Overview

The suicidal woman photograph previously described was shown to students in
introductory psychology courses over eight years. The photograph was presented
on a projection screen for multiple exposures (2, 3, or 4 exposures) of increasing
duration (2–10 seconds per exposure). Before each viewing, participants were
instructed to extract as much information as possible from the photograph. After
each viewing, participants were asked to report on a standardized response form
every scene element that they could remember from the photograph.

As described in Table 1, eight replications utilizing this basic study design were
conducted in all. Each replication occurred in a classroom format with 106–305
students per class (M=197.1, SD=73.6). The default study instructions prompted
participants to “notice all that you can in the photograph” or to “extract as many
details as possible from the scene.” In some replications, participants were randomly
assigned to different groups and then primed with a specific visual search instruction
(e.g., notice as many animate, inanimate, or unusual aspects as possible). In other
replications, the photograph was modified in some way in sequential presentations of
the test scene (e.g., the suicidal woman was deleted but then added to the scene, or
vice versa; see Table 1). Given that the methods, stimuli, and procedures were
essentially the same across the eight different studies (i.e., with the exception of
structural changes and focus primes in some studies), we report the results of the
studies together below.

Participants

A total of 1,577 students at Stanford University, the majority (74.9 %) of who were
freshman, participated in the eight replications that comprised this overall program of
research. Participants were of college age (M=19.2 years old; SD=1.03) and approx-
imately equal in sex representation (56 % were female). Participants received re-
search credit for their time. All participants provided written informed consent prior
to participation, and all study procedures were approved in advance by the Institu-
tional Review Board at Stanford University.

Stimulus Materials

The central element of the test scene is a woman in the act of committing suicide, who
was photographed just seconds after she jumped from a top-story window in the
Genesee Hotel in Buffalo, New York, in 1942 (See Fig. 1). The photograph was
published in the oversized LIFE Magazine book called LIFE: The First 50 Years,
1936–1986 (Kunhardt 1986). The woman is located approximately in the center of
the scene.

Curr Psychol (2013) 32:301–317 305


T
a
b
le

1
O
v
er
v
ie
w

o
f
ch
ar
ac
te
ri
st
ic
s
o
f
ea
ch

st
u
d
y

S
tu
d
y

T
ri
al
s

S
ti
m
u
lu
s
d
u
ra
ti
o
n

IS
Ia

E
x
p
er
im

en
ta
l
co
n
d
it
io
n

n

1
2

6
s/
6
s

2
m
in

N
o
ti
ce

al
l
th
at

y
o
u
ca
n

2
4
3

2
4

2
s/
4
s/
6
s/
8
s

2
m
in

E
x
tr
ac
t
as

m
an
y
d
et
ai
ls
as

p
o
ss
ib
le

1
0
6

3
3

2
s/
4
s/
6
s

2
m
in

D
et
ec
t
n
am

e
o
f
b
u
si
n
es
se
s
an
d
an
y
th
in
g
u
n
u
su
al

3
0
5

4
4

3
s/
4
s/
7
s/
1
0
s

2
m
in

D
et
ec
t
n
am

e
o
f
b
u
si
n
es
se
s
an
d
an
y
th
in
g
u
n
u
su
al

11
7

5
2

2
s/
6
s

2
m
in

N
o
ti
ce

al
l
th
at

y
o
u
ca
n

1
0
9

N
o
ti
ce

as
m
an
y
an
im

at
e
o
b
je
ct
s
as

p
o
ss
ib
le
,
su
ch

as
p
eo
p
le

4
5

N
o
ti
ce

as
m
an
y
in
an
im

at
e
o
b
je
ct
s
as

p
o
ss
ib
le
,
su
ch

as
st
o
re
s
o
r
si
g
n
s

5
0

N
o
ti
ce

th
in
g
s
th
at

ar
e
u
n
u
su
al

4
2

6
2

2
s/
6
s

2
m
in

N
o
ti
ce

as
m
an
y
an
im

at
e
o
b
je
ct
s
as

p
o
ss
ib
le
,
su
ch

as
p
eo
p
le

8
3

N
o
ti
ce

as
m
an
y
in
an
im

at
e
o
b
je
ct
s
as

p
o
ss
ib
le
,
su
ch

as
st
o
re
s
o
r
si
g
n
s

8
4

N
o
ti
ce

as
m
an
y
th
in
g
s
th
at

ar
e
u
n
u
su
al

6
4

7
2

2
s/
6
s

2
m
in

S
u
ic
id
al
w
o
m
an

st
o
ry

p
ri
m
e

1
2
1

8
2

2
s/
6
s

2
m
in

W
o
m
an

p
re
se
n
t→

W
o
m
an

ab
se
n
t

6
7

W
o
m
an

ab
se
n
t→

W
o
m
an

p
re
se
n
t

6
8

B
ar
b
er

p
o
le

p
re
se
n
t→

B
ar
b
er

p
o
le
ab
se
n
t

7
3

N
=
1
,5
7
7

a
T
h
e
IS
I,
o
r
in
te
rs
ti
m
u
lu
s
in
te
rv
al
,
is

th
e
ti
m
e
b
et
w
ee
n
o
ff
se
t
o
f
o
n
e
st
im

u
lu
s
an
d
o
n
se
t
o
f
th
e
n
ex
t.
D
u
ri
n
g
th
is

ti
m
e,

p
ar
ti
ci
p
an
ts

w
er
e
as
k
ed

to
li
st

w
h
at

th
ey

sa
w

in
th
e

p
h
o
to
g
ra
p
h
.

306 Curr Psychol (2013) 32:301–317


Priming Manipulations

The stimulus photograph was presented after a classroom lecture that discussed perceptual
violations of expectations. This overall procedure thus constituted a “meta prime” for
noticing scene elements that do not fit one’s schema for such scenes. Schematic perception
and violations of expectation had also been defined and discussed prior to presenting the
photograph. As described above, we primed participants in a number of different ways
across the different studies. We also altered the structure of the test scene in several ways to
examine the effects that these experimental manipulations had on participants’ reporting
of the different scene elements. The number of trials, stimulus duration, interstimulus
interval, and sample size for each study is summarized in Table 1, and a brief description
of each of the experimental manipulations is provided below.

Focus Primes In two separate replications, participants were given more specific
instructions on what to focus on in the scene (i.e., “focus primes”). These instructions
were printed on the standardized response forms, which were randomly assigned to
students in the classroom. As summarized in Table 1, the focus prime in Study 3 and
Study 4 was to “Detect the name of businesses in the scene and anything else that is
unusual.” In Study 5 and Study 6, participants were asked to “Try to notice…” (a) “as
many Animate Objects as possible, such as people,” or (b) “as many Inanimate
Objects as possible, such as stores or signs,” or (c) “as many things that are Unusual.”

Despondent Woman Story Prime The most direct schema-congruent prime with respect
to the suicidal woman was a brief vignette that participants in this priming condition read
immediately before viewing the test scene. This vignette was presented to students in
Study 7. The vignette, introduced as “background information that might be useful to
you,” read as follows:

Joan was usually a happy person until she suffered a series of setbacks
following graduation. Her mother died in a car crash, and soon after she lost
her job that she really liked because the company was downsizing. She has had
difficulty getting another job. Her boyfriend recently told her he was breaking
off their engagement because he found a woman better suited to his values. Joan
was despondent enough to seek help, but her depression worsened anyway. She
felt her future was bleak when she checked into that downtown hotel…

Structural Manipulations

Finally, in one replication of the study (Study 8), we systematically altered the structure of
the scene to examine how this manipulation influenced participants’ reporting of the
different scene elements. In two experimental conditions in this study, the suicidal woman
was airbrushed out of the scene to compare perceptions of the photograph on sequential
viewings in which the central figure’s presence was manipulated as either present or absent
in counterbalanced presentations (i.e., woman present then absent, or woman absent then
present; See Fig. 2). In a third experimental condition in Study 8, the barber pole was added
to and removed from the scene in counterbalanced presentations to test for differences in
noticing scene elements as a function of altering a peripheral detail of the scene.

Curr Psychol (2013) 32:301–317 307


Procedure

In each replication of the study, the test scene was projected onto a large screen using
a Kodak CAROUSEL slide projector. The room was lit with low, ambient lighting.
Participants were told that they were about to see a scene presented multiple times, for
a few seconds each time. They were instructed to scan the scene as quickly and
carefully as possible, and to then report every scene element that they could recall.
When reporting on subsequent presentations of the scene, they were told to list
elements that they had not noticed previously, as well as any modifications of
previously reported items. Participants were given two minutes to record their
responses after each viewing.

Prior to this testing procedure, students recorded their age, sex, college year,
quality of corrected vision (excellent, good, fair, poor), and viewing/sitting position
in the classroom (center or side, and front, middle, or rear) on their standardized

Fig. 2 Altered suicidal woman stimulus photograph, with woman airbrushed out

308 Curr Psychol (2013) 32:301–317


response form. This enabled a comparison of the reports of participants with the best
vision in ideal viewing positions (front, center) with the reports of those in other
viewing conditions. Participants were also asked to write their names on the response
form to increase their motivation to perform the task well, given that one of the
experimenters was also their course instructor.1

Results

Effects of Visual Acuity and Viewing Position on Reporting of Scene Elements

Analyses were conducted on data from an early replication of the study (Study 1,
N=243) to test whether quality of vision (excellent, good, fair, poor) or viewing
position (center-front, center-rear, sides) were related to the frequency of
reporting elements in the scene. Viewing position was unrelated to reporting
frequency, and quality of vision was only associated with reporting of the small
“Coffee Shop” sign, χ2(6, N=243)=23.61, p=.001. Among those with
excellent/good vision, 85 % of participants saw “Coffee Shop” compared to
52 % of those with fair/poor vision. Because noticing the central figure was
unrelated to quality of vision and viewing position, we collapsed the data across
these categories for all subsequent analyses.

Perceiving the Central Figure

A summary of seeing the central figure and several peripheral scene elements across
all studies is presented in Table 2. Across all priming and structural manipulations,
and averaged across repeated exposures (of 2, 3, or 4 presentations) and total
exposure times (of 8, 12, and 24 seconds), only 2.2 % of participants reported seeing
the target stimulus as a suicidal woman and only 3.9 % of participants reported seeing
a suicidal person in the central position of the scene. Nearly half of all viewers (46 %)
failed to report seeing anything or anyone in or near the position of the woman.
Additional viewing opportunities and increased total exposure time did not significantly
increase reporting of the target stimulus.

Perceiving Peripheral Elements of the Scene

These data may be contrasted with what participants did report. For example, a
majority of participants (65.7 %) reported seeing “Garage,” “Hotel” (83.0 %), and
“Coffee Shop” (94.0 %). Perhaps even more remarkable, 24.2 % and 26.1 % of
participants accurately reported seeing the small signs announcing “Sandwiches 10¢”
and “Milk Shakes,” respectively, in the hotel window.

1 Asking students to identify themselves could have led to less complete reporting of elements about which
students were uncertain. However, given that students were explicitly instructed to list as many elements
from the scene as possible, reporting elements about which students were uncertain was associated with
doing better, not worse. This, we believe, decreased the likelihood of omitting uncertain scene elements.

Curr Psychol (2013) 32:301–317 309


Priming Effects

Focus Primes In Study 5 (N=246) and Study 6 (N=231), participants were primed by
having been asked to notice animate, inanimate, or unusual elements of the scene
immediately prior to viewing the photograph. These manipulations did not significantly
affect the likelihood of seeing the woman as a central figure in the scene. As summarized
in Table 3, however, these primes did influence seeing peripheral elements in the scene in
predictable ways. For example, participants who received the Inanimate focus prime
were more likely than those in the Unusual and Animate conditions to report seeing
“Hotel,” χ2(2, N=231)=7.07, p=.03, and “Coffee Shop,” χ2(2, N=231)=7.07, p=.03.
Participants who received the Unusual focus prime were more likely than those in the

Table 3 Percentage of participants who reported seeing scene elements after two seconds of exposure by
type of prime

Type of prime

Scene element Neutral Animate Inanimate Unusual Story

Central element

Saw suicidal woman 0.0 % 0.0 % 0.0 % 0.0 % 12.4 %

Saw suicidal person 0.5 % 0.0 % 0.0 % 0.0 % 1.7 %

Saw falling person 0.6 % 15.7 % 9.5 % 9.4 % 2.5 %

Saw target stimulus 11.1 % 39.7 % 27.2 % 31.4 % 25.0 %

Peripheral elements

Saw people in doorway 3.4 % 19.3 % 6.0 % 3.1 % 10.1 %

Saw “1.00 Up” 14.0 % 15.7 % 23.8 % 40.6 % 26.7 %

Saw “Garage” 18.2 % 13.3 % 20.2 % 20.3 % 33.1 %

Saw “Hotel” 53.1 % 47.0 % 66.7 % 51.6 % 91.6 %

Saw “Coffee Shop” 77.0 % 45.8 % 70.2 % 53.1 % 83.3 %

Table 2 Percentage of participants
who reported seeing the target
stimulus and comparison peripheral
stimuli across all studies, trials, and
experimental conditions

Scene element N % of participants

Central element

Saw suicidal woman 34/1,577 2.2 %

Saw suicidal person 62/1,577 3.9 %

Saw falling person 76/1,577 4.8 %

Saw target stimulus 852/1,577 54.0 %

Peripheral elements

Saw “Sandwiches 10¢” 382/1,577 24.2 %

Saw “Milk Shakes” 411/1,577 26.1 %

Saw “Garage” 1,036/1,577 65.7 %

Saw “Fountain” 1,145/1,577 72.6 %

Saw “Hotel” 1,309/1,577 83.0 %

Saw “Coffee Shop” 1,482/1,577 94.0 %

310 Curr Psychol (2013) 32:301–317


Inanimate and Animate conditions to report seeing “1.00 Up,” χ2(2, N=231)=12.05,
p=.002. Finally, participants who received the Animate focus prime were more likely
than those in the Inanimate and Unusual conditions to report seeing people in the
doorway of the hotel, χ2(2, N=231)=12.88, p=.002.

Despondent Woman Story Prime We hypothesized that seeing the suicidal woman
would be enhanced for participants who read an account of a despondent woman with
reasons to commit suicide immediately prior to viewing the stimulus photograph. This
prediction was validated in Study 7. Of the 121 participants who received the despondent
woman story prime immediately prior to viewing the scene, 15 individuals (12.4 %)
reported seeing a suicidal woman, compared to 2.2 % of individuals across all samplings
and manipulations in this research program, χ2(1, N=1577)=36.34, p<.001.

Structural Manipulation Effects

Woman Present or Absent In Study 8 (N=135), we altered the structure of the scene
by either adding or removing the woman from the photograph in the second of two
consecutive presentations of the stimulus. Participants in this study were asked to
detect anything that changed in the scene between the first and second viewing.
Participants in the condition where the woman was present and then removed differed
significantly from those in the condition where the woman was absent and then
inserted into the scene with respect to whether they noticed a change on the second
viewing, χ2(1, N=135)=55.66, p<.001. Specifically, when the woman was absent and
then added to the scene, no one noticed that something was added. However, when
she was present and then removed from the scene in the second viewing of the
photograph, 58.2 % of participants reported that something had been deleted.

Barber Pole Present or Absent To test for differences in noticing central scene
elements (i.e., those involving the woman) as a function of altering the scene’s
peripheral details, the barber pole was added to and removed from the scene in
counterbalanced presentations with a sample of 73 participants (Study 8). However,
addition and removal of the barber pole did not affect the frequency of reporting any
central scene elements.

Qualitative Aspects of Scene Perception

A content analysis was subsequently conducted on the written perceptual accounts to
qualitatively examine how participants processed the photograph. When noticed, the
central figure was described in 42 different ways – for example, as a woman, man,
person, object, and figure. The animate individual was seen as flying, hanging,
leaping, diving, and committing suicide. The static object was a statue, mannequin,
sculpture, gargoyle, and fixture. When perceived as a figure, it was an angel, cherub,
fairy, nymph, and mermaid. Remarkably, the hotel was frequently (and accurately)
recalled as the “Genesee Hotel,” but it was also called the “Genesee Motel” and
“Genesee Drugs.” The garage sign was described in 18 ways – for example, as free

Curr Psychol (2013) 32:301–317 311


garage and free parking. The people in the window and the doorway were described
as a bellboy, doorman, crossing guard, customer, waiter, and policeman (the person in
the doorway was actually a policewoman who was responding to the scene).

Participants also frequently reported seeing things in the scene that were not
actually present. For example, some participants reported seeing a marine kissing a
girl, two people dancing, a fire hydrant, car, bench, tree, alley, overcast sky, and foggy
weather. Some of the remembrances involved assimilation to the familiar, extensions
of seen objects to related ones, and the mention of elements that were not present, but
which would be schematic with the scene. Examples of this included seeing a “car
parked out front” immediately after reporting “garage.” Given the setting, fire
hydrants and colored neon signs could be expected and were reported, although none
are actually present in the scene. In three instances a “red and white barber pole” was
reported as being present in this black and white photograph. Finally, several participants
dealt with the uncertain, perhaps distressing sight of the suicidal woman by transforming
her into a static object, such as a “thing on the roof,” even though the roof is not visible in
the photograph.

Discussion

Human visual recognition processes are usually quite robust and effective even when
viewing conditions are degraded (Cox et al. 2004). When information about an
element or figure in the visual field is insufficient or incomplete, though, we rely
heavily on contextual cues to determine how to best understand and interpret the
stimulus (Bar and Ullman 1996; Beiderman et al. 1982; Palmer 1975). In such
instances, the personal expectations that we bring to a given perceptual task – for
example, as a result of our histories, motivations, and disposition – dramatically
influence what we see and remember. This is the essence of top-down processing:
Humans see what they are psychologically prepared to see and are generally blind to
what they are not prepared to see in a particular visual context, given their preexisting
schemas and associated expectations for that context.

Data from the present study provide robust evidence in support of this phenomenon.
Across eight studies, 15 different experimental conditions, and more than 1,550 highly
motivated, educated viewers, a large majority of participants failed to see the most
significant event in a scene: a woman in the act of committing suicide. This occurred
despite giving participants specific instructions to detect as many elements as possible
from the scene, which was presented repeatedly with increasing duration times.
Frequency of detecting the suicidal woman did not improve with additional exposure to
the photograph and, in fact, was increased (from 2.2 % to 12.4 %) only among
participants who read a paragraph-long priming story just prior to viewing the test
scene. The story told of a despondent woman entering a hotel and was intended to
increase the woman’s schematic congruency with the scene. Although some participants
reasonably interpreted the central figure to be something other than a suicidal woman,
nearly half of all participants (46 %) across all study replications and experimental
manipulations failed to see anything or anyone in the general location of the woman.

In contrast, participants exhibited remarkably accurate and detailed recognition of
many other elements in the scene. Across all replications and experimental conditions,

312 Curr Psychol (2013) 32:301–317


for example, 65.7 % of participants saw “Garage,” 83 % of participants saw “Hotel,” and
94 % of participants saw “Coffee Shop.” These scene elements are arguably easier to
perceive and recall than the suicidal woman. Importantly, though, they are also more
schematically congruent with the scene. Coffee shops are frequently seen on street
corners and garage signs are often posted near hotel entrances. More remarkable is that
24.2 % and 26.1 % of participants reported seeing “Sandwiches 10¢” and “Milk
Shakes,” respectively. These two elements are incomplete and remain difficult to
decipher even after prolonged exposure to the photograph. They are similar to the
woman in this way, but differ in that they are schematically congruent with major
elements of the scene, like “Coffee Shop” and “Fountain,” which were reported by more
than 70 % of all participants.

These two sets of results permit one of the more interesting comparisons of the
study: Approximately one-fourth of all viewers made complete sense out of two
highly degraded window signs, presumably by using their expectations to fill in the
visual information they needed but which was not present; in the same amount of
time, only 2.2 % of participants were able to make sense out of the woman given the
visual information provided. Accurate detection of the window signs and not the
suicidal woman is made more remarkable by the fact that the signs are approximately
one-tenth the size of the woman.

These results can be interpreted in several ways. One interpretation is that given
the limited amount of time participants had to view the photograph, they simply
reported the scene elements that had the greatest bottom-up saliency, such as “Hotel,”
“Garage,” and “Coffee Shop.” These elements could arguably require less time to
interpret than the suicidal woman. Although the present data do not speak to this
interpretation, if tenable, we believe more participants would have reported basic
stimulus elements of the central figure as well, such as the woman’s arms or legs.
However, frequency of describing the woman’s basic features was relatively low.
Indeed, half of all participants across all eight studies did not perform a bottom-up
analysis of the suicidal woman.

Taken together, these findings demonstrate that while visual schemas can enhance
the detection of expected stimuli, they can also significantly hamper the perception of
aschematic stimuli. We hypothesized that several different priming and structural
manipulations would improve participants’ detection of the central figure and various
peripheral elements, and this prediction was validated. Compared to a neutral condition,
for example, participants primed to search for something animate were more likely to see
a falling person (15.1 % more) and some central figure (28.6 % more), and they were less
likely to report seeing “Hotel” (6.1 % less) and “Coffee Shop” (31.2 % less). Although
neither the Inanimate nor the Unusual primes impacted reporting of a suicidal person,
both increased the likelihood of seeing some central figure. These findings are novel
insofar as they demonstrate the high degree of specificity with which a variety of priming
manipulations may influence what individuals see in a complex visual scene.

As expected, the largest priming effect was the significant increase in seeing a
suicidal woman among participants who had just read a brief vignette about a
despondent woman entering a hotel. In this condition, 12.4 % of participants reported
having seen the suicidal woman, compared to 2.2 % across all conditions and
replications. Participants who had just read the priming story were also much more
likely to report seeing “Hotel” (91.6 % in the story condition, compared to 83 %

Curr Psychol (2013) 32:301–317 313


across all conditions and replications). We interpret these effects based on the formula-
tion that the despondent woman story made the suicidal woman schematic with the
photograph, at least temporarily. Interestingly, removing the woman from the scene in
the second of two sequential presentations caused 58.2 % of participants to notice a
change in the scene, but it did not increase reports of having seen the suicidal woman.

Without these primes, a large majority of viewers failed to notice the woman in
any form, suicidal or not. At the same time, participants saw much that was not really
there. Indeed, although schematic processing of the scene interfered with participants’
accurate recognition of the aschematic figure, it also contributed to their tendency to
invent a series of very reasonable elements to accompany the composite. For example,
the barber pole becomes colored, neon signs flash, cars are parked on the street, and fire
hydrants, trees, benches, and chairs are all introduced in schematically-congruent
fashion. The perceived overcast sky (which is not even visible in the photograph) is
enlivened by noticing a Marine kissing a girl, a man pulling a woman by her hair, and
two people dancing in the street in front of the hotel. The central figure, when seen, takes
on more than forty different forms across various perceptual accounts that all make sense
given the context of the scene. Such is the stuff of vivid imaginations at odds with a
simpler, more accurate visual account of the scene.

A reasonable question is whether participants’ inability to detect the suicidal
woman is simply a failure of proper perceptual grouping (i.e., seeing an arm or leg
and the hotel sign, but not grouping the elements so as to perceive both the sign and
woman). This would be akin to the classic R. C. James (1973) photograph of a
Dalmatian, where viewers can stare at the photograph for minutes but not see the
Dalmatian… until it is labeled as such, at which point it becomes obvious. The label,
it seems, helps to organize the otherwise random dots into a meaningfully-grouped
representation of a dog. Although somewhat similar, the present findings differ from
this demonstration in at least two ways: first, participants viewing the suicidal woman
photograph were not required to decipher the general context of the scene (in fact,
they were told they were going to see an urban scene); and second, participants
viewing the suicidal woman photograph were not asked to code a seemingly random
visual array of elements (i.e., both the suicidal woman and the setting were “conceptually
whole” to begin with).

Relation to “Inattentional Blindness”

Considered more broadly, these effects are consistent with a growing body of
research on “inattentional blindness” (Becklen and Cervone 1983; Bredemeier and
Simons 2012; Hyman et al. 2010; Mack 2003; Mack and Rock 1998; Most et al.
2001; Neisser and Becklen 1975; Simons 2000; see also Varakin and Levin 2008).
This work has demonstrated, for example, that when an experimenter stops to ask a
pedestrian for directions and is interrupted by workmen walking between them, one
of whom then replaces the experimenter, only half of participants notice the change
(Simons and Levin 1998). And when a person in a gorilla suit or a woman with an
umbrella enters a visual scene, 46 % of participants fail to notice the event, despite its
prominent placement in the scene (Simons and Chabris 1999). Remarkably, these effects
occur even among “expert observers” who look directly at the target (e.g., the gorilla), as
confirmed by eye tracking (Drew et al. 2013). The present findings extend the results of

314 Curr Psychol (2013) 32:301–317


these studies by showing that many objects in complex scenes (e.g., “Fountain,”
“Hotel,” and “Coffee Shop” in the suicidal woman photograph) are actually easy to
see and readily reported by participants, assuming they can be reasonably expected
given the general context of the scene. This suggests that inattentional blindness cannot
simply be the result of poor visibility. In addition, the data from the eight studies reported
here show a far lower incidence of noticing (2.2 % for the suicidal woman) than in earlier
studies, in addition to high rates of misidentification and schema-congruent
replacements.

In addition, the present results are consistent with research on figure-ground
perception, including Davenport and Potter (2004), who found that objects and
their settings are processed interactively, with general knowledge about an object’s
appropriateness in a particular scene influencing the perception of that object (see also
Beiderman et al. 1982; Boyce and Pollatsek 1992; Boyce et al. 1989). Our findings
extend this work by highlighting the remarkable acuity that people exhibit when
deciphering objects that are schematically-congruent with their context or setting.
The present study also highlights the effects of a variety of priming and structural
manipulations on reporting accuracy. In many cases, these effects are highly specific,
improving recognition of prime-congruent elements while decreasing recognition of
prime-incongruent elements.

Limitations and Future Directions

Several limitations of these studies should be noted. First, our main comparisons
relied on frequencies of noticing the unexpected woman versus frequencies of
noticing other more schematic elements of the scene (i.e., a within-scene compari-
son). Future research would benefit from utilizing scenes in which it is possible to
systematically modify the extent to which elements are schematically congruent with
the scene by, for example, moving or altering the target object so as to make it either
congruent or incongruent with the context of the scene while keeping constant the
scene’s other characteristics. In addition, given the design of the present study, it was
not possible to determine why participants failed to report the woman. For example,
the present data do not speak to whether the unexpected visual information was
ignored or distorted at the sensory-perceptual level or at the encoding or retrieval
stages of remembering and reporting. This begs the question, were elements not
reported because they were not seen in the first place (i.e., “inattentional blindness”)
or rather because they were seen but not remembered (i.e., “inattentional amnesia”;
see Wolfe 1999)? Future work could employ a gaze-tracking device to identify
participants who look at the woman but who do not report seeing her. In addition,
researchers could monitor arousal levels in this context (e.g., using measures such as
the EEG P-300 “surprise wave,” heart rate measures, or fMRI indices) to identify
participants who respond emotionally, but who still do not report seeing the woman.

Concluding Remarks

In conclusion, the human visual system is remarkably good at helping us navigate a very
complex physical and social world that, more often than not, is highly predictable. Partly
because of the dynamics that underlie this efficiency, however, we are also surprisingly

Curr Psychol (2013) 32:301–317 315


bad at noticing rare or unexpected events, even when they are central elements of a scene
(Wolfe et al. 2005). When confronted with scene elements or situations for which we
have no established (or active) schema, such information appears to fall into a visual
“black hole” – a conceptual blind spot. This is the take-home message from the present
data. When perceptual expectations are violated, even our most basic visual recognition
processes may fail, as we look but do not see.

Acknowledgments Preparation of this report was supported in part by National Institutes of Health grant
R01 CA140933, by the Cousins Center for Psychoneuroimmunology, and by a Society in Science – Branco
Weiss Fellowship to George Slavich. We thank Lauren Anas, Waki Gamez, and Connie Turcott for assisting
with data management.

Declaration of Conflicting Interests The authors declare that they have no conflicts of interest with
respect to their authorship or the publication of this article.

References

Attneave, F. (1954). Some informational aspects of visual perception. Psychological Review, 61, 183–193.
Axelrod, R. (1973). Schema theory: an information processing model of perception and cognition. The

American Political Science Review, 67, 1248–1266.
Bar, M., & Ullman, S. (1996). Spatial context in recognition. Perception, 25, 343–352.
Beck, A. T. (1967). Depression: Clinical, experimental, and theoretical aspects. New York: Harper and

Row.
Becker, E. (1973). The denial of death. New York: Simon & Schuster.
Becklen, R., & Cervone, D. (1983). Selective looking and the noticing of unexpected events. Memory and

Cognition, 11, 601–608.
Beiderman, I., Mezzanotte, R. J., & Rabinowitz, J. C. (1982). Scene perception: detecting and judging

objects undergoing relational violations. Cognitive Psychology, 14, 143–177.
Boyce, S. J., & Pollatsek, A. (1992). Identification of objects in scenes: the role of scene background in

object naming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 531–543.
Boyce, S. J., Pollatsek, A., & Rayner, K. (1989). Effect of background information on object identification.

Journal of Experimental Psychology: Human Perception and Performance, 15, 556–566.
Bredemeier, K., & Simons, D. J. (2012). Working memory and inattentional blindness. Psychonomic

Bulletin and Review, 19, 239–244.
Bruner, J. S. (1957). On perceptual readiness. Psychological Review, 64, 123–152.
Cantor, N., & Michel, W. (1979). Traits as prototypes: effects on recognition memory. Journal of

Personality and Social Psychology, 35, 38–48.
Chen, X., & Zelinsky, G. J. (2006). Is visual search a top-down or bottom-up process? Journal of Vision, 6,

447.
Chun, M. M. (2003). Scene perception and memory. In D. Irwin & B. Ross (Eds.), Psychology of learning

and motivation: Advances in research and theory: Cognitive vision (Vol. 42, pp. 79–108). San Diego:
Academic Press.

Cox, D., Meyers, E., & Sinha, P. (2004). Contextually evoked object-specific responses in human visual
cortex. Science, 304, 115–117.

Davenport, J. L., & Potter, M. C. (2004). Scene consistency in object and background perception.
Psychological Science, 15, 559–564.

Drew, T., Võ, M.L., & Wolfe, J.M. (2013). The invisible gorilla strikes again: sustained inattentional
blindness in expert observers. Psychological Science, 24, 1848–1853.

Greenberg, J., Pyszczynski, T., & Solomon, S. (1986). The causes and consequences of the need for self-
esteem: A terror management theory. In R. F. Baumeister (Ed.), Public self and private self (pp. 189–
212). New York: Springer.

Henderson, J. M., & Hollingworth, A. (1999). High-level scene perception. Annual Review of Psychology,
50, 243–271.

Hsieh, P. J., Colas, J. T., & Kanwisher, N. (2011). Pop-out without awareness: unseen feature singletons
capture attention only when top-down attention is available. Psychological Science, 22, 1220–1226.

316 Curr Psychol (2013) 32:301–317


Hyman, I. E., Boss, S. M., Wise, B. M., McKenzie, K. E., & Caggiano, J. M. (2010). Did you see the unicycling
clown? Inattentional blindness while walking and talking on a cell phone. Applied Cognitive Psychology,
24, 597–607.

Itti, L. (2005). Models of bottom-up attention and saliency. In L. Itti, G. Rees, & J. K. Tsotsos (Eds.),
Neurobiology of attention (pp. 576–582). San Diego: Elsevier.

Itti, L., Koch, C., & Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 1254–1259.

James, R. C. (1973). Dalmatian photo. In R. L. Gregory (Ed.), The intelligent eye (p. 14). New York:
McGraw-Hill.

Kunhardt, P. B. (1986). LIFE: The first 50 years, 1936–1986. Boston: Little, Brown and Company.
Levy, S. R., Plaks, J. E., & Dweck, C. S. (1999). Modes of social thought: Implicit theories and social

understanding. In S. Chaiken & Y. Trope (Eds.), Dual-process theories in social psychology (pp. 179–
202). New York: Guilford Press.

Mack, A. (2003). Inattentional blindness: looking without seeing. Current Directions in Psychological
Science, 12, 180–184.

Mack, A., & Rock, I. (1998). Inattentional blindness. Cambridge: MIT Press.
Most, S. B., Simons, D. J., Scholl, B. J., Jimenez, R., Clifford, E., & Chablis, C. F. (2001). How not to be

seen: the contributions of similarity and selective ignoring to sustained inattentional blindness.
Psychological Science, 12, 9–17.

Neisser, U., & Becklen, R. (1975). Selective looking: attending to visually significant events. Cognitive
Psychology, 7, 480–494.

Palmer, S. E. (1975). The effects of contextual scenes on the identification of objects. Memory and
Cognition, 3, 519–526.

Perls, F. S. (1969). In J. O. Stevens (Ed.), Gestalt therapy verbatim. Lafayette: Real People Press.
Ross, L., & Ward, A. (1996). Naive realism in everyday life: Implications for social conflict and

misunderstanding. In T. Brown, E. Reed, & E. Turiel (Eds.), Values and knowledge (pp. 103–135).
Hillsdale: Lawrence Erlbaum Associates.

Simons, D. J. (2000). Attentional capture and inattentional blindness. Trends in Cognitive Sciences, 4, 147–155.
Simons, D. J., & Chabris, C. F. (1999). Gorillas in our midst: sustained inattentional blindness for dynamic

events. Perception, 28, 1059–1074.
Simons, D. J., & Levin, D. T. (1998). Failure to detect changes to people during a real-world interaction.

Psychonomic Bulletin and Review, 5, 644–649.
Slavich, G. M. (2005). Transformational teaching. E-xcellence in teaching, volume 5. Retrieved from

http://www.teachpsych.org/ebooks/eit2005/eit05-11.html.
Slavich, G. M. (2006). On becoming a teacher of psychology. In J. G. Irons, B. C. Beins, C. Burke, B.

Buskist, V. Hevern, & J. E. Williams (Eds.), The teaching of psychology in autobiography: Perspectives
from exemplary psychology teachers (pp. 92–99). Washington, DC: American Psychological Association.

Slavich, G. M. (2009). On 50 years of giving psychology away: an interview with Philip Zimbardo.
Teaching of Psychology, 36, 278–284.

Slavich, G.M., & Toussaint, L. (2013). Using the Stress and Adversity Inventory (STRAIN) as a teaching tool leads
to significant learning gains in two courses on stress and health. Stress and Health. doi:10.1002/smi.2523.

Slavich, G. M., & Zimbardo, P. G. (2012). Transformational teaching: theoretical underpinnings, basic
principles, and core methods. Educational Psychology Review, 24, 569–608.

Stryker, S., & Burke, P. J. (2000). The past, present, and future of an identity theory. Social Psychology
Quarterly, 63, 284–297.

Thoreau, H. D. (1906). The journal of Henry David Thoreau. In B. Torrey (Ed.), The writings of Henry
David Thoreau: XIII (pp. 77–78). Boston: Houghton Mifflin.

Torralba, A., Oliva, A., Castelhano, M., & Henderson, J. M. (2006). Contextual guidance of eye movements
and attention in real-world scenes: the role of global features in object search. Psychological Review, 113,
766–786.

Varakin, D. A., & Levin, D. T. (2008). Scene structure enhances change detection. Quarterly Journal of
Experimental Psychology, 61, 543–551.

Wolfe, J. M. (1999). Inattentional amnesia. In V. Coltheart (Ed.), Fleeting memories: Cognition of brief
visual stimuli (pp. 71–94). Cambridge: MIT Press.

Wolfe, J. M., Butcher, S. J., Lee, C., & Hyle, M. (2003). Changing your mind: on the contributions of top-
down and bottom-up guidance in visual search for feature singletons. Journal of Experimental
Psychology: Human Perception and Performance, 29, 483–502.

Wolfe, J. M., Horowitz, T. S., & Kenner, N. (2005). Rare items often missed in visual searches. Nature,
435, 439–440.

Curr Psychol (2013) 32:301–317 317

http://www.teachpsych.org/ebooks/eit2005/eit05-11.html
http://dx.doi.org/10.1002/smi.2523

	Out of Mind, Out of Sight: Unexpected Scene Elements Frequently Go Unnoticed Until Primed
	Abstract
	Method
	Overview
	Participants
	Stimulus Materials
	Priming Manipulations
	Structural Manipulations
	Procedure

	Results
	Effects of Visual Acuity and Viewing Position on Reporting of Scene Elements
	Perceiving the Central Figure
	Perceiving Peripheral Elements of the Scene
	Priming Effects
	Structural Manipulation Effects
	Qualitative Aspects of Scene Perception

	Discussion
	Relation to “Inattentional Blindness”
	Limitations and Future Directions
	Concluding Remarks

	References