Is Spoken Language All-or-Nothing?
Implications for future speech-based
human-machine interaction
Roger K. Moore
Speech and Hearing Research Group,
Dept. Computer Science, University of Sheffield,
Regent Court, 211 Portobello, Sheffield, S1 4DP, UK
r.k.moore@sheffield.ac.uk
http://www.dcs.shef.ac.uk/~roger/
Abstract. Recent years have seen significant market penetration for
voice-based personal assistants such as Apple’s Siri. However, despite
this success, user take-up is frustratingly low. This position paper argues
that there is a habitability gap caused by the inevitable mismatch
between the capabilities and expectations of human users and the features
and benefits provided by contemporary technology. Suggestions are
made as to how such problems might be mitigated, but a more worrisome
question emerges: “is spoken language all-or-nothing”? The answer,
based on contemporary views on the special nature of (spoken) language,
is that there may indeed be a fundamental limit to the interaction that
can take place between mismatched interlocutors (such as humans and
machines). However, it is concluded that interactions between native and
non-native speakers, or between adults and children, or even between humans
and dogs, might provide critical inspiration for the design of future
speech-based human-machine interaction.
Keywords: spoken language; habitability gap; human-machine interaction
1
Introduction
The release in 2011 of Siri, Apple’s voice-based personal assistant for the iPhone,
signalled a step change in the public perception of spoken language technology.
For the ﬁrst time, a signiﬁcant number of everyday users were exposed to the
possibility of using their voice to enter information, navigate applications or
pose questions - all by speaking to their mobile device. Of course, voice dicta-
tion software had been publicly available since the release of Dragon Naturally
Speaking in 1997, but such technology only found success in niche market areas
for document creation (by users who could not or would not type). In contrast,
Siri appeared to oﬀer a more general-purpose interface that thrust the potential
beneﬁts of automated speech-based interaction into the forefront of the public’s
imagination. By combining automatic speech recognition and speech synthesis
arXiv:1607.05174v1  [cs.HC]  18 Jul 2016
with natural language processing and dialogue management, Siri promoted the
possibility of a more conversational interaction between users and smart devices.
As a result, competitors such as Google Now and Microsoft’s Cortana soon fol-
lowed1.
Of course, it is well established that, while voice-based personal assistants
such as Siri are now very familiar to the majority of mobile device users, their
practical value is still in doubt. This is evidenced by the preponderance of videos
on YouTubeTM that depict humorous rather than practical uses; it seems that
people give such systems a try, play around with them for a short while and
then go back to their more familiar ways of doing things. Indeed, this has been
conﬁrmed by a recent survey of users from around the world which showed that
only 13% of the respondents used a facility such as Siri every day, whereas 46%
had tried it once and then given up (citing inaccuracy and a lack of privacy as
key reasons for abandoning it) [2].
This lack of serious take-up of voice-based personal assistants could be seen as
the inevitable teething problems of a new(ish) technology, or it could be evidence
of something more deep-seated. This position paper addresses these issues, and
attempts to tease out some of the overlooked features of spoken language that
might have a bearing on the success or failure of voice-based human-machine
interaction. In particular, attention is drawn to the inevitable mismatch between
the capabilities and expectations of human users and the features and beneﬁts
provided by contemporary technical solutions. Some suggestions are made as to
how such problems might be mitigated, but a more worrisome question emerges:
“is spoken language all-or-nothing”?
2
The Nature of the Problem
There are many challenges facing the development of eﬀective voice-based human-
machine interaction [3,4]. As the technology has matured, so the applications
that are able to be supported have grown in depth and complexity (see Fig.1).
From the earliest military Command and Control Systems to contemporary com-
mercial Interactive Voice Response (IVR) Systems and the latest Voice-Enabled
Personal Assistants (such as Siri), the variety of human accents, competing
signals in the acoustic environment and the complexity of the application sce-
nario have always presented signiﬁcant barriers to practical usage. Considerable
progress has been made in all of the core technologies, particularly following the
emergence of the data-driven stochastic modelling paradigm [5] (now supple-
mented by deep learning [6]) as a key driver in pushing regularly benchmarked
performance in a positive direction. Yet, as we have seen, usage remains a serious
issue; not only does a speech interface compete with very eﬀective non-speech
GUIs [7], but people have a natural aversion to talking to machines in public
spaces [2]. As Nass & Brave stated in their seminal book Wired for Speech [8]:
1 See [1] for a comprehensive review of the history of speech technology R&D up to,
and including, the release of Siri.
“voice interfaces have become notorious for fostering frustration and failure”
(p.6).
Fig. 1. The evolution of spoken language technology applications from early military
Command and Control Systems to future Autonomous Social Agents (robots).
These problems become magniﬁed as the ﬁeld moves forward to develop-
ing voice-based interaction with Embodied Conversational Agents (ECAs) and
Autonomous Social Agents (robots). In these futuristic scenarios, it is assumed
that spoken language will provide a “natural” conversational interface between
human beings and so-called intelligent systems. However, there are many additional
challenges that need to be overcome in order to address such a requirement . . .
“We need to move from developing robots that simply talk and listen
to evolving intelligent communicative machines that are capable of truly
understanding human behaviour, and this means that we need to look
beyond speech, beyond words, beyond meaning, beyond communication,
beyond dialogue and beyond one-off interactions.” [9] (p.321)
Of these, a perennial problem seems to be how to evolve the complexity
of voice-based interfaces from simple structured dialogues to more ﬂexible con-
versational designs without confusing the user [10,11,12]. Indeed, it has been
known for some time that there appears to be a non-linear relationship between
ﬂexibility and usability [13] - see Fig.2. As ﬂexibility increases with advancing
technology, so usability increases until users no longer know what they can and
cannot say, at which point usability tumbles and interaction falls apart.
Fig. 2. Illustration of the consequences of increasing the ﬂexibility of spoken language
dialogue systems; increasing ﬂexibility can lead to a habitability gap where usability
drops catastrophically (reproduced, with permission, from Mike Phillips [13]). This
means that it is surprisingly diﬃcult to deliver a technology corresponding to the
point marked ‘??’. Siri corresponds to the point marked ‘Add NL/Dialog’.
2.1
The “Habitability Gap”
Progress is being made in this area: for example, by providing targeted help to
users [14,15,16] and by replacing the traditional notion of turn-taking with a
more ﬂuid interaction based on incremental processing [17,18]. Likewise, simple
slot-ﬁlling approaches to language understanding and generation are being re-
placed by sophisticated statistical methods for estimating dialogue states and
optimal next moves [19,20]. Nevertheless, it is still the case that there is a hab-
itability gap of the form illustrated in Fig.2.
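The statistical estimation of dialogue state mentioned above can be illustrated with a deliberately small sketch. It is not the POMDP machinery of [19,20]; it only shows the underlying idea of keeping a probability distribution over what the user wants and revising it after each noisy recognition result. The goal names, the function name update_belief and the error model (the recogniser is right with probability p_correct, otherwise it returns a uniformly chosen wrong goal) are all assumptions made for the illustration.

```python
def update_belief(belief, observation, p_correct=0.7):
    """One Bayesian update of a belief over user goals given a noisy ASR result.

    Assumption (hypothetical error model): the recogniser reports the true
    goal with probability p_correct, otherwise a uniformly chosen wrong goal.
    Requires at least two candidate goals.
    """
    goals = list(belief)
    p_wrong = (1.0 - p_correct) / (len(goals) - 1)
    # Prior times likelihood of the observation under each candidate goal.
    unnormalised = {
        g: belief[g] * (p_correct if g == observation else p_wrong)
        for g in goals
    }
    total = sum(unnormalised.values())
    return {g: p / total for g, p in unnormalised.items()}


# Start from a uniform prior over three illustrative goals.
belief = {"weather": 1 / 3, "timer": 1 / 3, "music": 1 / 3}
for heard in ["timer", "music", "timer"]:
    belief = update_belief(belief, heard)

best = max(belief, key=belief.get)
print(best, belief)
```

Unlike a slot-filling system, which would commit to whatever was recognised last, the belief here accumulates evidence across turns, so a single misrecognition does not overturn the hypothesis favoured by the other turns.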
In fact, the shape of the curve illustrated in Fig.2 is virtually identical to
the famous Uncanny Valley eﬀect [21] in which a near human-looking artefact
(such as a humanoid robot) can trigger feelings of eeriness and repulsion in an
observer; as human likeness increases, so aﬃnity increases until a point where
artefacts start to appear creepy and aﬃnity goes negative. A wide variety of
explanations have been suggested for this non-linear relationship but, to date,
there is only one quantitative model [22], and this is founded on the combined
eﬀect of categorical perception and mismatched perceptual cues giving rise to
a form of perceptual tension. The implication of this model is that uncanniness
(and hence a loss of habitability) can be avoided if care is taken to align how an
autonomous agent looks, sounds and behaves [23,9]. In other words, if a speech-
enabled agent is to converse successfully with a human being, it should make
clear its interactional aﬀordances [24,25].
This analysis leads to an important implication - since a spoken language
system consists of a number of diﬀerent components, each of which possesses
a certain level of technical capability, then in order to be coherent (and hence
usable), the design of the overall system needs to be aligned to the component
with the lowest level of performance. For example, giving an automated personal
assistant a natural human voice is a recipe for user confusion in the (normal)
situation where the other speech technology components are limited in their
abilities. In other words, in order to maximise the eﬀectiveness of the interaction,
a speech-enabled robot should have a robot voice. As Bruce Balentine
succinctly puts it [26]: “It’s better to be a good machine than a bad person”!
This is an unpopular result2, but there is evidence of its eﬀectiveness [27], and
it clearly has implications for contemporary voice-based personal assistants such
as Siri, Google Now and Cortana which employ very humanlike voices3.
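The alignment principle described above can be expressed as a simple check. The following is a minimal sketch, assuming that each component can be given a single capability score between 0 and 1; the component names, the scores and the tolerance are hypothetical values chosen for illustration, not measurements of any real system.

```python
def aligned_presentation_level(capabilities):
    """Return the weakest component and its capability level.

    Under the alignment principle, this level is the one the overall
    system should present to the user.
    """
    weakest = min(capabilities, key=capabilities.get)
    return weakest, capabilities[weakest]


def overstated_components(capabilities, tolerance=0.1):
    """List components that exceed the weakest level by more than tolerance."""
    _, floor = aligned_presentation_level(capabilities)
    return sorted(
        name for name, level in capabilities.items() if level - floor > tolerance
    )


# Hypothetical scores for an assistant with a very humanlike voice.
assistant = {
    "speech_synthesis": 0.9,
    "speech_recognition": 0.7,
    "language_understanding": 0.4,
    "dialogue_management": 0.45,
}

weakest, level = aligned_presentation_level(assistant)
print(weakest, level)
print(overstated_components(assistant))
```

In this example the humanlike synthesis is flagged as overstating what the system can do, which is the situation the "robot voice" recommendation is intended to avoid.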
Of course, some might claim that the habitability problem only manifests
itself in applications where task-completion is a critical measure of success. The
suggestion would be that the situation might be diﬀerent for applications in
domains such as social robots, education or games in which the emphasis would
be more on the spoken interaction itself. However, the argument presented in
this paper is not concerned with the nature of the interaction, rather it questions
whether such speech-based interaction can be sustained without access to the
notion of full language.
2.2
Half a Language?
So far, so good - as component technologies improve, so the ﬂexibility of the
overall system would increase, and as long as the capabilities of the individual
2 It is often argued that such an approach is unimportant as users will habituate.
However, habituation only occurs after sustained exposure, and a key issue here is
how to increase the eﬀectiveness of ﬁrst encounters (since that has a direct impact
on the likelihood of further usage).
3 Interestingly, these ideas do appear to be having some impact on the design of
contemporary autonomous social agents such as Jibo (which has a childlike and
mildly robotic voice) [28].
components are aligned, it should be possible to avoid falling into the habitability
gap.
However, sending mixed messages about the capabilities of a spoken language
system is only one part of the story; even if a speech-based autonomous social
agent looks, sounds and behaves in a coherent way, will users actually be able
to engage in conversational interaction if the overall capability is less than that
normally enjoyed by a human being? What does it mean for a language-based
system to be compromised in some way? How can users know what they may
and may not say [29,15], or even if this is the right question? Is there such a
thing as half a language and, if so, is it habitable? Indeed, what is language
anyway?
3
What is Language?
Unfortunately there is no space here to review the extensive and, at times,
controversial history of the scientiﬁc study of language, or of the richness and
variety of its spoken (and gestural) forms. Suffice to say that human beings have
evolved a prolific system of (primarily vocal) interactive behaviours that is vastly
superior to that enjoyed by any other animal [30,31,32,33,34]. As has been said
a number of times . . .
“Spoken language is the most sophisticated behaviour of the most com-
plex organism in the known universe.” [35].
The complexity and sophistication of (spoken) language tends to be masked
by the apparent ease with which we, as human beings, use it. As a consequence,
engineered solutions are often dominated by a somewhat naïve perspective involving
the coding and decoding of messages passing from one brain (the sender)
to another brain (the receiver). In reality, languaging is better viewed as an emer-
gent property of the dynamic coupling between cognitive unities that serves to
facilitate distributed sense-making through cooperative behaviours and, thereby,
social structure [36,37,38,39,40]. Furthermore, the contemporary view is that lan-
guage is based on the co-evolution of two key traits - ostensive-inferential com-
munication and recursive mind-reading (including ‘Theory-of-Mind’) [41,42,43] -
and that abstract (mental) meaning is grounded in the concrete (physical) world
through metaphor [44,45].
These modern perspectives on language not only place strong emphasis on
pragmatics [46], but they are also founded on an implicit assumption that in-
terlocutors are conspeciﬁcs4 and hence share signiﬁcant priors. Indeed, evidence
suggests that some animals draw on representations of their own abilities (ex-
pressed as predictive models [47]) in order to interpret the behaviours of others
[48,49]. For human beings, this is thought to be a key enabler for eﬃcient recur-
sive mind-reading and hence for language [50,51].
Several of these advanced concepts may be usefully expressed in pictographic
form [52] - see Fig.3.
4 Members of the same species.
Fig. 3. Pictographic representation of language-based coupling (dialogue) between two
human interlocutors [52]. One interlocutor (and its environment) is depicted using solid
lines and the other interlocutor (and its environment) is depicted using broken lines.
As can be seen, communicative interaction is founded on two-way ostensive recursive
mind-reading (including mutual Theory-of-Mind).
So now we arrive at an interesting position; if (spoken) language interaction
between human beings is grounded through shared experiences, representations
and priors, to what extent is it possible to construct a technology that is intended
to replace one of the participants? For example, if one of the interlocutors illus-
trated in Fig.3 is replaced by a cognitive robot (as in Fig.4), then there will be an
inevitable mismatch between the capabilities of the two partners, and coupled
ostensive recursive mind-reading (i.e. full language) cannot emerge.
Fig. 4. Pictographic representation of coupling between a human being (on the left)
and a cognitive robot (on the right). The robot lacks the capability of ostensive recursive
mind-reading (it has no Theory-of-Mind), so the interaction is inevitably constrained.
Could it be that there is a fundamental limit to the language-based inter-
action that can take place between unequal partners - between humans and
machines? Indeed, returning to the question posed in Section 2.2 “Is there such
a thing as half a language?”, the answer seems to be “no”; spoken language does
appear to be all-or-nothing . . .
“The assumption of continuity between a fully coded communication
system at one end, and language at the other, is simply not justiﬁed.”
[41] (p.46).
4
The Way Forward?
The story thus far provides a compelling explanation of the less-than-satisfactory
experiences enjoyed by existing users of speech-enabled systems and identiﬁes
the source of the habitability gap outlined in Section 2.1. It would appear that,
due to the gross mismatch between their respective priors, it might be impos-
sible to create an automated system that would be capable of a sustained and
productive language-based interaction with a human being (except in narrow
specialised domains involving experienced users). The vision of constructing a
general-purpose voice-enabled autonomous social agent may be fundamentally
ﬂawed - the equivalent of trying to build a vehicle that travels faster than light!
However, before we give up all hope, it is important to note that there are sit-
uations where voice-based interaction between mismatched partners is successful
- but these are very diﬀerent from the scenarios that are usually considered when
designing current speech-based systems. For example, human beings regularly
engage in vocal interaction with members of a diﬀerent cultural and/or linguis-
tic and/or generational background5. In such cases, all participants dynamically
adjust many aspects of their behaviour - the clarity of their pronunciation, their
choice of words and syntax, their style of delivery, etc. - all of which may be con-
trolled by the perceived eﬀectiveness of the interaction (that is, using feedback in
a coupled system). Indeed, a particularly good example of such accommodation
between mismatched interlocutors is the diﬀerent way in which caregivers talk to
young children (termed “parentese”) [53]. Maybe these same principles should be
applied to speech-based human-machine interaction? Indeed, perhaps we should
be explicitly studying the particular adaptations that human beings make when
attempting to converse with autonomous social agents - a new variety of spoken
language that could be appropriately termed “robotese”6.
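The idea of behaviour being controlled by the perceived effectiveness of the interaction can be sketched as a feedback loop. The following is an illustrative toy, not a model proposed in this paper: a single hypothetical "clarity" parameter (standing in for pronunciation effort, word choice and so on) is raised whenever the listener fails to understand and relaxed whenever the listener succeeds. The function name, the step size and the 0 to 1 scale are assumptions.

```python
def adapt_clarity(understood_flags, clarity=0.5, step=0.1):
    """Adjust a speaker's clarity from per-utterance listener feedback.

    understood_flags: iterable of booleans, True if the listener signalled
    understanding of that utterance.
    Returns the clarity level (rounded to 2 d.p.) after each utterance.
    """
    history = []
    for understood in understood_flags:
        # Relax effort after a success, increase it after a failure.
        clarity += -step if understood else step
        # Keep the level within the assumed 0..1 range.
        clarity = min(1.0, max(0.0, clarity))
        history.append(round(clarity, 2))
    return history


# Two misunderstandings followed by a success.
print(adapt_clarity([False, False, True]))
```

The point of the sketch is that the adjustment is driven by the listener's responses rather than fixed in advance, which is the property shared by talk directed at non-native listeners and at young children.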
Of course, these scenarios all involve spoken interaction between one human
being and another, hence in reality there is a huge overlap of priors in terms
of bodily morphology, environmental context and cognitive structure, as well as
learnt social and cultural norms. Arguably the largest mismatch arises between
an adult and a very young child, yet this is still interaction between members of
the same species. A more extreme mismatch exists between non-conspeciﬁcs; for
example, between humans and animals. However, it is interesting to note that our
nearest relatives - the apes - do not have language, and this seems to be because
they do not have the key precursor to language: ostensive communication (apes
do not seem to understand pointing gestures) [41].
Interestingly, one animal - the domestic dog - appears to excel in ostensive
communication and, as a consequence, dogs are able to engage in very pro-
ductive spoken language interaction with human partners (albeit one-sided and
somewhat limited in scope) [55,41]. Spoken human-dog interaction may thus
5 Interestingly, Nass & Brave [8] noted that people speak to poor automatic speech
recognition systems as if they were non-native listeners.
6 Unfortunately, this term has already been coined to refer to a robot’s natural lan-
guage abilities in robot-robot and robot-human communication [54].
be a potentially important example of a heavily mismatched yet highly eﬀec-
tive cooperative conﬁguration that might usefully inform spoken human-robot
interaction in hitherto unanticipated ways.
5
Final Remarks
This paper has argued that there is a fundamental habitability problem facing
contemporary spoken language systems, particularly as they penetrate the mass
market and attempt to provide a general-purpose voice-based interface between
human users and (so-called) intelligent systems. It has been suggested that the
source of the diﬃculty in conﬁguring genuinely usable systems is twofold: ﬁrst,
the need to align the visual, vocal and behavioural aﬀordances of the system, and
second, the need to overcome the huge mismatch between the capabilities and
expectations of a human being and the features and beneﬁts oﬀered by even the
most advanced autonomous social agent. This led to the preliminary conclusion
that spoken language may indeed be all-or-nothing.
Finally, and on a positive note, it was observed that there are situations
where successful spoken language interaction can take place between mismatched
interlocutors (such as between native and non-native speakers, or between an
adult and a child, or even between a human being and a dog). It is thus concluded
that these scenarios might provide critical inspiration for the design of future
speech-based human-machine interaction.
Acknowledgement
This work was supported by the European Commission [grant numbers EU-FP6-
507422, EU-FP6-034434, EU-FP7-231868 and EU-FP7-611971], and the UK En-
gineering and Physical Sciences Research Council [grant number EP/I013512/1].
References
1. Pieraccini, R. (2012). The Voice in the Machine. MIT Press, Cambridge, MA.
2. Liao, S.-H. (2015). Awareness and Usage of Speech Technology. Masters thesis,
Dept. Computer Science, University of Sheﬃeld.
3. Deng, L., & Huang, X. (2004). Challenges in adopting speech recognition. Com-
munications of the ACM, 47(1), 69-75.
4. Minker, W., Pittermann, J., Pittermann, A., Strauß, P.-M., & Bühler, D. (2007).
Challenges in speech-based human-computer interfaces. International Journal of
Speech Technology, 10(2-3), 109-119.
5. Gales, M., & Young, S. J. (2007). The application of hidden Markov models in speech
recognition. Foundations and Trends in Signal Processing, 1(3), 195-304.
6. Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A., Jaitly, N., Senior, A.,
Vanhoucke, V., Nguyen, P., Sainath, T. N., & Kingsbury, B. (2012). Deep neural
networks for acoustic modeling in speech recognition: the shared views of four
research groups. Signal Processing Magazine, IEEE.
7. Moore, R. K. (2004). Modelling data entry rates for ASR and alternative input
methods. In INTERSPEECH-ICSLP. Jeju, Korea.
8. Nass, C., & Brave, S. (2005). Wired for Speech: How Voice Activates and Advances
the Human-computer Relationship. Cambridge, MA: MIT Press.
9. Moore, R. K. (2015). From talking and listening robots to intelligent communica-
tive machines. In J. Markowitz (Ed.), Robots That Talk and Listen (pp. 317-335).
Boston, MA: De Gruyter.
10. Bernsen, N. O., Dybkjaer, H., & Dybkjaer, L. (1998). Designing Interactive Speech
Systems: From First Ideas to User Testing. London: Springer-Verlag.
11. McTear, M. F. (2004). Spoken Dialogue Technology: Towards the Conversational
User Interface. London: Springer-Verlag.
12. Lopez Cozar Delgado, R. (2005). Spoken, Multilingual and Multimodal Dialogue
Systems: Development and Assessment. Wiley.
13. Phillips, M. (2006). Applications of spoken language technology and systems. In M.
Gilbert & H. Ney (Eds.), IEEE/ACL Workshop on Spoken Language Technology
(SLT).
14. Tomko, S., Harris, T. K., Toth, A., Sanders, J., Rudnicky, A., & Rosenfeld, R.
(2005). Towards eﬃcient human machine speech communication. ACM Transac-
tions on Speech and Language Processing, 2(1), 1-27.
15. Tomko, S. L. (2006). Improving User Interaction with Spoken Dialog Systems via
Shaping. PhD Thesis, Carnegie Mellon University.
16. Komatani, K., Fukubayashi, Y., Ogata, T., & Okuno, H. G. (2007). Introducing
utterance veriﬁcation in spoken dialogue system to improve dynamic Help gener-
ation for novice users. In 8th SIGdial Workshop on Discourse and Dialogue (pp.
202-205).
17. Schlangen, D., & Skantze, G. (2009). A general, abstract model of incremental
dialogue processing. 12th Conference of the European Chapter of the Association
for Computational Linguistics (EACL-09). Athens, Greece.
18. Hastie, H., Lemon, O., & Dethlefs, N. (2012). Incremental spoken dialogue sys-
tems: Tools and data. In Proceedings of NAACL-HLT Workshop on Future Direc-
tions and Needs in the Spoken Dialog Community (pp. 15-16). Montreal, Canada.
19. Williams, J. D., & Young, S. J. (2007). Partially observable Markov decision processes
for spoken dialog systems. Computer Speech and Language, 21(2), 393-422.
20. Gasic, M., Breslin, C., Henderson, M., Kim, D., Szummer, M., Thomson, B.,
Tsiakoulis, P., & Young, S. J. (2013). POMDP-based dialogue manager adaptation
to extended domains. In SIGDIAL (pp. 214-222). Metz, France.
21. Mori, M. (1970). Bukimi no tani (the uncanny valley). Energy, 7, 33-35.
22. Moore, R. K. (2012). A Bayesian explanation of the “Uncanny Valley” eﬀect and
related psychological phenomena. Nature Scientiﬁc Reports, 2(864).
23. Moore, R. K., & Maier, V. (2012). Visual, vocal and behavioural aﬀordances: some
eﬀects of consistency. 5th International Conference on Cognitive Systems (CogSys
2012). Vienna.
24. Gibson, J. J. (1977). The theory of aﬀordances. In R. Shaw & J. Bransford (Eds.),
Perceiving, Acting, and Knowing: Toward an Ecological Psychology (pp. 67-82).
Hillsdale, NJ: Lawrence Erlbaum.
25. Worgan, S., & Moore, R. K. (2010). Speech as the perception of aﬀordances.
Ecological Psychology, 22(4), 327-343.
26. Balentine, B. (2007). It’s Better to Be a Good Machine Than a Bad Person: Speech
Recognition and Other Exotic User Interfaces at the Twilight of the Jetsonian Age.
Annapolis: ICMI Press.
27. Moore, R. K., & Morris, A. (1992). Experiences collecting genuine spoken enquiries
using WOZ techniques. 5th DARPA Workshop on Speech and Natural Language.
New York.
28. Jibo: The World’s First Social Robot for the Home, https://www.jibo.com
29. Jokinen, K., & Hurtig, T. (2006). User expectations and real experience on a
multimodal interactive system. In INTERSPEECH-ICSLP Ninth International
Conference on Spoken Language Processing. Pittsburgh, PA, USA.
30. Gardiner, A. H. (1932). The Theory of Speech and Language. Oxford, England:
Oxford Univ. Press.
31. Bickerton, D. (1995). Language and Human Behavior. Seattle, WA, US: University
of Washington Press.
32. Hauser, M. D. (1997). The Evolution of Communication. The MIT Press.
33. Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The faculty of language:
what is it, who has it, and how did it evolve? Science, 298, 1569-1579.
34. Everett, D. (2012). Language: The Cultural Tool. London: Proﬁle Books.
35. Moore, R. K. (2007). Spoken language processing: piecing together the puzzle.
Speech Communication, 49(5), 418-435.
36. Maturana, H. R., & Varela, F. J. (1987). The Tree of Knowledge: The Biological
Roots of Human Understanding. Boston, MA: New Science Library/Shambhala
Publications.
37. Cummins, F. (2014). Voice, (inter-)subjectivity, and real time recurrent interac-
tion. Frontiers in Psychology, 5, 760.
38. Bickhard, M. H. (2007). Language as an interaction system. New Ideas in Psy-
chology, 25(2), 171-187.
39. Cowley, S. J. (Ed.). (2011). Distributed Language. John Benjamins Publishing
Company.
40. Fusaroli, R., Rączaszek-Leonardi, J., & Tylén, K. (2014). Dialog as interpersonal
synergy. New Ideas in Psychology, 32, 147-157.
41. Scott-Phillips, T. (2015). Speaking Our Minds: Why human communication is
diﬀerent, and how language evolved to make it special. Palgrave MacMillan.
42. Baron-Cohen, S. (1999). Evolution of a theory of mind? In M. Corballis & S. Lea
(Eds.), The Descent of Mind: Psychological Perspectives on Hominid Evolution.
Oxford University Press.
43. Malle, B. F. (2002). The relation between language and theory of mind in development
and evolution. In T. Givón & B. F. Malle (Eds.), The Evolution of Language
out of Pre-Language (pp. 265-284). Amsterdam: Benjamins.
44. Lakoﬀ, G., & Johnson, M. (1980). Metaphors We Live By. Chicago: University of
Chicago Press.
45. Feldman, J. A. (2008). From Molecules to Metaphor: A Neural Theory of Language.
Bradford Books.
46. Levinson, S. C. (1983). Pragmatics. Cambridge: Cambridge University Press.
47. Friston, K., & Kiebel, S. (2009). Predictive coding under the free-energy principle.
Phil. Trans. R. Soc. B, 364(1521), 1211-1221.
48. Rizzolatti, G., & Craighero, L. (2004). The mirror-neuron system. Annual Review
of Neuroscience, 27, 169-192.
49. Wilson, M., & Knoblich, G. (2005). The case for motor involvement in perceiving
conspeciﬁcs. Psychological Bulletin, 131(3), 460-473.
50. Pickering, M. J., & Garrod, S. (2007). Do people use language production to make
predictions during comprehension? Trends in Cognitive Sciences, 11(3), 105-110.
51. Garrod, S., Gambi, C., & Pickering, M. J. (2013). Prediction at all levels: forward
model predictions can enhance comprehension. Language, Cognition and Neuro-
science, 29(1), 46-48.
52. Moore, R. K. (2016). Introducing a pictographic language for envisioning a rich
variety of enactive systems with diﬀerent degrees of complexity. Int. J. Advanced
Robotic Systems, 13(74).
53. Fernald, A. (1985). Four-month-old infants prefer to listen to Motherese. Infant
Behavior and Development, 8, 181-195.
54. Matson, E. T., Taylor, J., Raskin, V., Min, B.-C., & Wilson, E. C. (2011). A
natural language exchange model for enabling human, agent, robot and machine
interaction. In The 5th International Conference on Automation, Robotics and
Applications (pp. 340-345). IEEE.
55. Serpell, J. (1995). The Domestic Dog: Its Evolution, Behaviour and Interactions
with People. Cambridge University Press.