

LETTER

Close but not proximate: The
significance of phonological
segments in speaking depends on
their functional engagement

Converging evidence points to a difference between European
and Chinese languages in the type of the initial units of
phonological encoding for speaking. The phonological access points,
or “proximate units” (1, 2), are segmental in Indo-European
languages but whole syllables in Chinese. Accordingly, Chinese
speakers, unlike English speakers, do not register the presence
of consistent initial consonants in several word production tasks.
Qu et al.’s (3) intriguing report both supports and challenges this
interpretation. In their experiment, Mandarin-speaking participants
produced picture descriptions comprising a color adjective
and a noun that shared or did not share initial segments
(e.g., green guitar vs. blue guitar in English). Consistent with
previous findings, there was no response time benefit of shared
initial phonemes. In seeming contrast, there was an early
differentiation between shared and different onset conditions in
event-related potentials (ERPs).

In combination, these findings can be interpreted as particularly
compelling evidence for the subordinate role of phonemes
in production of Chinese: Even though the electrophysiology
reflected the presence of shared phonemes, there was no
behavioral effect. However, how exactly are phonemes subordinated?
Qu et al. (3) proposed a complex account involving
override of phonological activation by a monitoring process.
This account, however, does not fully comport with the evidence.
Because object name retrieval is rapid but adjectives are prenominal,
it is plausible that adjectives and nouns are coactivated. What is
not clear is how the ERP signature of the resulting phonological
concord relates to production. The ERP patterns arose
in a 200- to 400-ms window, whereas speech was not initiated
until about 900 ms, fully 300 ms later than that typically
observed in single-word production (4). This suggests that the
ERPs may index phonological connectivity but not necessarily
functional engagement of segments in preparation for production.
Moreover, if the cancelling process account is correct,
one would expect facilitation, rather than a null effect, in
faster single-word production tasks for which monitoring is
not needed.

We also question Qu et al.’s equation of the proximate unit
account (1) with the view that phonemes are vestigial in
production of Chinese. We certainly do not endorse the idea
that “phonemes are artifacts resulting solely from experience
with an alphabetically organized orthographic system” (ref. 3,
p. 14266). In the report by O’Seaghdha et al. (figure 1A in
ref. 1), we spelled out a model for Mandarin Chinese in which
syllables are primary but in which phonemic specification
occurs for every selected syllable. Our statement that speakers
of Mandarin “intend to produce syllables, perhaps to the
exclusion of subsyllabic ingredients” (ref. 1, p. 285) thus refers to
an early intentional phase of production rather than to the
entire process.

These concerns aside, Qu et al.’s findings (3) bode well for
future, more complete accounts of word production across
languages. Their article promises that comparison of ERP
patterns for conditions sharing a variety of phonological
units (e.g., syllables and segments, among others) in Chinese,
European, and other languages will be very informative.

Padraig G. O’Seaghdha(a,1), Jenn-Yeu Chen(b), and Train-Min Chen(b)
(a) Department of Psychology and Cognitive Science Program, Lehigh
University, Bethlehem, PA 18015; and (b) Department of Chinese as
a Second Language, National Taiwan Normal University, Taipei 106,
Taiwan

1. O’Seaghdha PG, Chen J-Y, Chen T-M (2010) Proximate units in word production:
Phonological encoding begins with syllables in Mandarin Chinese but with segments in
English. Cognition 115(2):282–302.

2. O’Seaghdha PG, Chen J-Y (2009) Toward a language-general account of word production:
The proximate units principle. Proceedings of the 31st Annual Conference of the Cognitive
Science Society, eds Taatgen NA, van Rijn H (Cognitive Science Society, Austin, TX), pp 68–73.

3. Qu Q, Damian MF, Kazanina N (2012) Sound-sized segments are significant for Mandarin
speakers. Proc Natl Acad Sci USA 109(35):14265–14270.

4. Indefrey P, Levelt WJ (2004) The spatial and temporal signatures of word production
components. Cognition 92(1-2):101–144.

Author contributions: P.G.O., J.-Y.C., and T.-M.C. wrote the paper.

The authors declare no conflict of interest.
(1) To whom correspondence should be addressed. E-mail: pat.oseaghdha@lehigh.edu.

www.pnas.org/cgi/doi/10.1073/pnas.1217032110 PNAS | January 2, 2013 | vol. 110 | no. 1 | E3
