Exploring interactive capabilities for home robots
via medium ﬁdelity prototyping
Martin Cooney1,
Sepideh Pashami1,
Yuantao Fan1,
Anita Sant’Anna1,
Yinrong Ma1,
Tianyi Zhang1,
Yuwei Zhao1,
Wolfgang Hotze1,
Jeremy Heyne1,
Cristofer Englund2,
Achim J. Lilienthal3,
Tom Ziemke4
1 Halmstad University
2 RISE Viktoria
3 Örebro University
4 University of Skövde
Manuscript drafted: November 15, 2016
Abstract
In order for autonomous robots to be able to support people’s well-
being in homes and everyday environments, new interactive capabilities
will be required, as exempliﬁed by the soft design used for Disney’s recent
robot character Baymax in popular ﬁction. Home robots will be required
to be easy to interact with and intelligent–adaptive, fun, unobtrusive and
involving little eﬀort to power and maintain–and capable of carrying out
useful tasks both on an everyday level and during emergencies. The cur-
rent article adopts an exploratory medium ﬁdelity prototyping approach
for testing some new robotic capabilities in regard to recognizing people’s
activities and intentions and behaving in a way which is transparent to
people. Results are discussed with the aim of informing future designs.
1
Introduction
The current article reports on several new robotic prototypes built to explore
some potentially important challenges for bringing interactive and intelligent
robots into home environments.
Figure 1 describes some general motivation for the current work. With a
growing challenge of insuﬃcient resources to care for increasingly isolated elderly
populations in ﬁrst world countries, and evidence linking loneliness to health
problems, robots could help to support well-being of interacting persons by
interacting appropriately. To achieve this, people must first be willing to accept
robots into their daily lives, which will require robots to be easy to interact with
and useful. Ease of interaction extends to all major components of a robot,
both in reacting to humans and in behaving proactively; usefulness covers
many tasks, both everyday and during emergencies. The challenge is thus that
ideas are needed which could contribute toward improving quality of life for the
elderly, but the design space is vast and designing complete robotic solutions
can be costly in time and resources.
Contact: martin.daniel.cooney@gmail.com. We received support for the
current work from the Swedish Knowledge Foundation for the SIDUS AIR
project.
arXiv:1710.01541v2  [cs.RO]  9 Oct 2017
Our approach in the current article involved medium ﬁdelity prototyping to
quickly test new interactive capabilities. Medium ﬁdelity prototyping strikes a
balance between obtaining accurate insight into how a complete system would
perform (including in some cases how a system will be perceived by interacting
persons) and allowing results to be acquired quickly and practically [Engelberg
and Seffah 2002]. For simplicity in the current work we use this term in a general
sense to describe an approach between low and high fidelity prototyping; thus,
we do not mean only that all features of a prototype are mid-level in
completeness or lack detail, as in “horizontal” prototyping [Rudd et al. 1996];
we also include approaches for combining aspects of different levels of fidelity,
referred to as “vertical” or “mixed-fidelity” prototyping [McCurdy et al. 2006].
Some related concepts include the Wizard of Oz technique in Human-Robot
Interaction (HRI), in which teleoperation is used to perform tasks which would
be challenging for an autonomous robot [Riek 2012], and the slogan
“Fail often, fail fast, fail cheap”, which advocates trying many possibilities early
on in the development process [Lee et al. 2010].
Following such an approach, nine systems were built, as described in Figure 2
and Table 1, and evaluated. The main contribution of the current article is a
description of insights drawn from testing some new capabilities for a home
companion robot, intended to facilitate technological acceptance of robots into
homes.
Note: parts 1, 2, and 6 relating to energy harvesting, breath sensing, and
transparency, are new; the other results have been partially presented at con-
ferences/workshops (3, 4, 7) [Cooney et al. 2012; Cooney and Karlsson 2015;
Lundstrom et al. 2016], submitted to journals (3) [Cooney and Sant’Anna 2016],
or described as part of student projects at our university (5, 8, 9) [Ma 2016;
Hotze 2016; Zhang and Zhao 2016; Heyne 2015]. All parts relate to interactive
capabilities; parts 2, 5, 7, 8, and 9 also relate to intelligent capabilities.
[Figure 1 diagram. Labels: a home robot supports the well-being of a human,
who accepts it; acceptance requires ease of interaction and usefulness. The
robot is composed of energy, sensing, motions, appearance, and awareness
(input and output). Applications: everyday tasks, e.g., vacuuming and
patrolling (not addressed here), and emergencies. Some desired capabilities:
1 energy harvesting, 2 private, 3 fun, 4 adaptive, 5 self-fixing, 6 unobtrusive,
7 going to victim, 8 assessing, 9 finding help.]
Figure 1: A basic goal of home robots is to help people to feel well.
To achieve this, people must be willing to accept robots, which will
require robots to be easy to interact with and useful. Ease of interac-
tion extends to all major components of a robot, enabling proactive
and reactive behavior. Usefulness covers not only tasks such as vac-
uuming but also helping people in emergencies.
[Figure 2 images of prototypes (1) to (9). Image labels include: ears, eyes,
mouth, energy sources (optic, top; kinetic, back; thermal, side), breath sensor,
objects, reaching arms, mirror, arm/marker, target to fix, Kinect, transparent,
opaque, close person, far person, "Who to ask for help?", drone, robot, sensors,
"pose?", "vital signs?", mannequin.]
Figure 2: Prototypes built: (1) Energy: Energy harvesting, (2) Sens-
ing: Private breath sensing, (3) Motions: Fun reaching, (4) Motions:
Adaptive size-changing, (5) Appearance: Self-ﬁxing, (6) Appearance:
Unobtrusive transparency, (7) Emergency:
Going to a victim, (8)
Emergency: Health assessment, (9) Emergency: Finding help. Note:
larger images are provided in each section devoted to a prototype,
and a video is also provided with the article.
2
Prototype 1 Energy: Energy harvesting
Home robots require power. Recharging and replacing batteries requires contin-
ued attention and eﬀort from humans, robots with docking stations can become
stranded, and wireless power transmission is not cost efficient. It would be
helpful if robots could seek to secure some power themselves.
A wide range of approaches has been described for how a robot could itself
acquire energy, involving wind, water, light, pressure, heat, salinity, and radio
waves; some robots have even been designed to mimic animals by acquiring
energy from ingesting prey such as slugs and flies, via microbial fuel cells [Kelly
et al. 2000; Zivanovic et al. 2009]. We wondered if a social home robot could
use energy from a human to interact, but to our knowledge such a prototype
had not been designed.
Table 1: Prototypes described in relation to some previous work.
HRI Capability | Some past work | Novelty
1 Energy: Energy harvesting | Robots powered by various environmental energy sources [Kelly et al. 2000; Zivanovic et al. 2009] | Capability for a robot to provide interactive feedback powered by a person’s heat and kinetic energy
2 Sensing: Private breath sensing | Gas source detection by a mobile robot [Bennetts et al. 2014] | Reacting to a close-by person via breath sensing
3 Motions: Fun reaching | Some multiobjective motion generation studies exploring the communication of intentions [Dragan et al. 2014; Holladay et al. 2014] | Design for generating some motions which incorporate fun and functional components [Cooney and Sant’Anna 2016]
4 Motions: Adaptive size-changing | Impressions of short or tall robots [Walters 2009; Rae 2013]; and a design which can be large, tall or wide, but not small [Tachi 2012] | Some typical impressions of size changes using a design which can be large, tall, wide, or small [Cooney and Karlsson 2015]
5 Appearance: Self-fixing | Self-detection [Gold and Scassellati 2009]; anomaly-detection [Suzuki et al. 2011]; self-modification [Revzen et al. 2011] | “SAS”: combining these approaches [Ma 2016]
6 Appearance: Transparent | Some methods for achieving transparency such as active camouflaging [Tachi 2003], and colored liquids/soft components [Morin et al. 2014] | Some typical impressions of transparency using a design combining smart film and conductive plastic
7 Emergency: Going to a victim | Anomaly detection [Novak et al. 2013], robot navigation [Santos et al. 2013] | Combining these approaches [Lundstrom et al. 2016]
8 Emergency: Health Assessment | Tele-operated healthcare robots [Katz 2015; Martinic 2014] | Robotic localization of points of interest for first aid on a fallen person and measuring of some vital signs [Hotze 2016; Zhang and Zhao 2016]
9 Emergency: Finding help | Helpfulness in online reviews [Ghose and Ipeirotis 2011] | Approaching people estimated to be helpful for first aid [Heyne 2015]
We built a prototype intended to sit on an (elderly) person’s lap which
secures energy from (1) thermal (the person’s body temperature), (2) kinetic
(stroking), and (3) optical (light) sources, to provide simple visible interactive
feedback, as shown in Figure 3. (Light energy, while not directly obtained from
a human, can become available when a human turns on a light or takes a robot
out of storage to interact.)
 
 
 
[Figure 3 labels. Output: (a) ears, (b) eyes, (c) mouth. Input (energy):
(a) optic (top), (b) thermal (sides), (c) kinetic (back).]
Figure 3: Energy harvesting prototype.
2.1
Approach
Design was informed by a scenario-specific requirement (the capability to
leverage multiple sources of energy simultaneously from a typical holding grasp)
as well as general requirements which could facilitate close interaction:
human-likeness (to provide a familiar interface), light weight, and small size.
The palms of a user’s hands on the prototype’s sides provide thermal energy,
squeezes from the user’s fingertips on the prototype’s back provide kinetic
energy, and light (e.g., from the sun or a ceiling lamp) on the top of the
prototype provides optical energy, so that all of the robot’s feedback can be
seen at once. The prototype is easily held (360g; 26.5cm height x 12.3cm width
x 8.5cm depth). Because the face is a visual focal point in social interactions,
the harvested energy is used to power facial parts: Light Emitting Diodes (LEDs)
for the eyes and mouth, and two wagging ears.
2.2
Evaluation
To objectively compare the prototype in varying conditions, we used OpenCV
to measure the magnitude of visual reactions in video recordings of the proto-
type during interactions. For thermal energy, a video was made in which the
prototype was held until the LED eyes lit and released ﬁve times at a typical
illuminance of 500 lux; average time from initiating contact (no red pixels de-
tectable) to full light up (based on counting pixels when fully lit) was calculated
using basic image processing to pick colors and remove noise. For kinetic en-
ergy, a video was made while placing known weights on top of the prototype ten
times; the minimum pressure to light the mouth higher than a threshold was
calculated. For optical energy, distance travelled by a red marker on one ear
was calculated while increasing the amount of light on the energy generator in
100 lux intervals to ﬁnd an approximate minimum amount of light required to
produce a noticeable reaction; distance and frequency at a typical illuminance
were also calculated.
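The thermal timing measurement above can be illustrated with a minimal sketch. It assumes that red pixel counts have already been extracted for each video frame; the function names, the 95% "fully lit" fraction, the frame rate, and the counts are all hypothetical and not taken from our implementation.

```python
def time_to_full_light(red_counts, fps, full_count, contact_frame=0, fraction=0.95):
    """Seconds from contact until the red pixel count reaches the fully lit level.

    red_counts: red pixels detected in each video frame (hypothetical input).
    fraction: share of full_count treated as "fully lit" (assumed value).
    Returns None if the LEDs never reach that level.
    """
    target = fraction * full_count
    for index in range(contact_frame, len(red_counts)):
        if red_counts[index] >= target:
            return (index - contact_frame) / fps
    return None


def mean(values):
    return sum(values) / len(values)


# Made-up counts for two hold-and-release trials filmed at 2 frames per second.
trials = [
    [0, 0, 5, 40, 90, 100, 100],
    [0, 3, 30, 70, 98, 100],
]
times = [time_to_full_light(counts, fps=2, full_count=100) for counts in trials]
print(times, mean(times))
```

Averaging the per-trial times in this way corresponds to the average time from contact to full light-up described above.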
2.3
Results
For thermal energy, the average time for the LED eyes to light was 7.5 seconds.
For kinetic energy, LED mouth lighting was visible (t(4) = 3.5, p = .02,
two-tailed t-test) using a 600g weight with 45 cm2 area (5.9N, 1300Pa).
For optical energy, oscillation was observed with a displacement of 0.33cm and
frequency of 0.4Hz at 300 lux, and 0.78cm and 0.5Hz at 500 lux.
We feel these results indicate the feasibility of integrating energy harvesting
into a small social home robot because: for the thermal reaction, we expect
people to generally hold pets or children longer than a few seconds; the touch
required to light the mouth was similar to a pat on the back or light hand
squeeze [Spears 1985] and much less than the average grip force of a human
(approx. 1/34, at 200N [Edgren et al. 2004]); and human environments are
often brighter than 300 lux, which is appropriate for large visual tasks with
high contrast [IES 1993], and humans are capable of observing smaller radial
motion in such conditions [Lappin et al. 2009]. We noticed that this ability to
operate in typical conditions, along with its simple design, makes the platform
easy to demonstrate; we were able to hand out the platform to a visiting class
of undergraduate engineering students without worrying about where the demo
would take place and without extensive explanation.
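The force, pressure, and grip ratio quoted above can be checked with a short calculation. This is only a worked example; the value of standard gravity (9.81 m/s^2) is an assumption not stated in the text.

```python
G = 9.81  # standard gravity in m/s^2 (assumed value)

mass_kg = 0.600        # 600 g weight
area_m2 = 45e-4        # 45 cm^2 contact area
grip_force_n = 200.0   # average human grip force cited in the text

force_n = mass_kg * G                # weight of the test mass
pressure_pa = force_n / area_m2      # force spread over the contact area
grip_ratio = grip_force_n / force_n  # how much stronger a typical grip is

print(round(force_n, 1), round(pressure_pa), round(grip_ratio))
```

The results are consistent with the rounded figures of 5.9N, approximately 1300Pa, and approximately 1/34 of the average grip force.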
3
Prototype 2 Sensing: Privacy-preserving breath
sensing
In addition to leveraging nearby energy sources, home robots can recognize and
react contingently to their sensor input: for example, looking toward an inter-
acting person and tailoring their behavior such that people feel that they matter
to the robot. Cameras and microphones can be used for recognition, but such
sensors could also be misused to acquire personal data identifying individuals
and behaviors, especially in a close physical interaction. Alternatives such as
infrared or radar sensors could oﬀer more privacy, but can require placement
facing a person or have problems with occlusions and reﬂections. One simple
solution which could avoid such problems is inspired by the importance of smell
in the animal world. Gas sensors, previously used by a mobile robot to detect
fires and gas leaks [Bennetts et al. 2014; Lilienthal et al. 2006; Lilienthal and
Duckett 2003], present a promising alternative which to our knowledge had not
been explored yet for a social robot. We built a prototype which uses an MQ-135
gas sensor as a breath sensor to recognize an interacting person’s relative
location in order to react and provide some simple interactive feedback, as shown
in Figure 4.
 
 
 
 
[Figure 4 labels: nose (breath sensor), eyes, pan-tilt (2 DOFs).]
Figure 4: Breath sensing prototype.
3.1
Approach
Our design took into account scenario-specific requirements (breath sensing
capability and a mechanism for conveying an illusion of attention) as well as
general requirements which might facilitate interaction: human-likeness (as a
familiar interface), lightness, and small size. Breath sensing was afforded by a
typical inexpensive metal oxide sensor used for air quality control, which is
also suitable for detecting CO2.
Requirements for the algorithm were obtained from considering some simple
desired interactive scenarios: detecting a person’s presence or absence in
the robot’s vicinity, rough location (e.g., left or right), a change in a person’s
location, and the number of interacting persons (one or two). This suggested
that, functionally, the prototype should keep track of its state of belief
regarding whether there is a human in front of it and regarding orientation, and
detect changes. Non-functional requirements were reaction speed and correctness,
robustness to noise, and adaptiveness (fixed thresholds for finding anomalies,
based on sensor values recorded when no human is nearby, could not be used
because sensor values take a long time, approx. 1 minute, to revert to
“normal”). To balance reaction speed and correctness, the robot’s algorithm
combined a simple but fast approach to recognize sudden large changes and a
richer but slower approach to recognize slight changes over longer times. The
simple algorithm uses a state machine with two states (human close or far) and
two adaptive thresholds (upper and lower): when a human is close and the sensor
value is rising, or the human is far and the sensor value is falling, the simple
algorithm adapts its thresholds, sandwiching the current sensor value between
new upper and lower thresholds; otherwise, when a threshold is crossed, the
state flips. The long-term algorithm uses the reweighted norm minimization
version of the TREnd Filtering with EXponentials (rTREFEX) algorithm, which
models a window of gas sensor output as a series of exponentials; change points
are found between the exponentials [Pashami et al. 2014].
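The simple two-state algorithm described above might be sketched as follows. This is an illustrative reading of the description, not the implementation used on the prototype: the class name, the fixed margin around the current reading, the choice to re-center the thresholds after a flip, and the sensor values are all assumptions.

```python
class BreathStateMachine:
    """Two-state (far/close) detector with adaptive upper and lower thresholds."""

    def __init__(self, initial_value, margin=20.0):
        self.close = False            # assumed starting state: nobody nearby
        self.margin = margin          # hypothetical half-width of the threshold band
        self.previous = initial_value
        self._recenter(initial_value)

    def _recenter(self, value):
        # Sandwich the current reading between new lower and upper thresholds.
        self.lower = value - self.margin
        self.upper = value + self.margin

    def update(self, value):
        rising = value > self.previous
        falling = value < self.previous
        if (self.close and rising) or (not self.close and falling):
            # The trend agrees with the current state, so the thresholds follow it.
            self._recenter(value)
        elif not self.close and value > self.upper:
            self.close = True         # sharp rise: a person has come close
            self._recenter(value)     # assumption: re-center after a flip
        elif self.close and value < self.lower:
            self.close = False        # sharp drop: the person has moved away
            self._recenter(value)
        self.previous = value
        return self.close


readings = [100, 99, 98, 110, 125, 130, 128, 120, 105, 100]  # made-up sensor values
machine = BreathStateMachine(initial_value=readings[0])
states = [machine.update(r) for r in readings]
print(states)
```

Because the thresholds track the signal whenever it moves in the direction expected for the current state, no fixed baseline is needed, which matters given how slowly the sensor values revert to "normal".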
Additionally, to make use of the recognition capability in an interaction,
some simple robot behavior was built with the LEDs and servos. Attention
was indicated via two degrees of freedom in a pan-tilt configuration to allow
the prototype to look in various directions, with a range of approx. 180° side
to side and approx. 150° up and down. The prototype also featured
a face with LED “eyes” and a nose, and weighed approx. 500g, with a small
size of 0.38m height x 0.27m width x 0.27m depth. The resulting system was
simple but could accomplish tasks difficult for other sensors such as cameras:
e.g., detecting a person despite an occlusion, or from the back, or in visually
complex environments, or reacting to soft speaking in a noisy environment.
3.2
Evaluation
We felt a fundamental question was related to feasibility: could a breath sensor
provide value in theory (by preserving privacy) and in practice (by reacting
within a reasonable time, less than a minute, to changes in a person’s location)?
The answer was unknown because being “smelled” could be considered distasteful
and intrusive, and precise distance estimation using gas sensors is difficult in
general due to air currents and sensor limitations.
To investigate, data were obtained from seven participants at our university
(age: M = 30.1 years, SD = 2.5; 3 female, 4 male). The experimenter asked the
participants to read some simple instructions, then started the robot’s program
on a laptop computer and left the participants alone in a small room. Partici-
pants held the prototype on their laps, and ﬁve times brought their faces close,
spoke to the robot, and then distanced their faces from the robot while pressing
the spacebar of the computer to record times. When close, the robot looked
up and its eyes lit green; else, the robot looked down and its eyes turned dark.
Afterward, participants ﬁlled out a simple ﬁve-point Likert-style questionnaire
with three items asking how comfortable they would feel with the possibility
that other people might have access to diﬀerent kinds of sensors around them
and data obtained by these sensors: “I would feel privacy living in a home with a
robot equipped with a” (camera, microphone, breath sensor). Data acquisition
took around ﬁve minutes. Impressions of each sensor modality were compared,
and average times were calculated.
3.3
Results
Impressions of the breath sensor and other sensors (camera, microphone) dif-
fered in terms of the degree to which privacy would be perceived: χ2(1, N = 21)
= 6.90, p = .009. Participants strongly agreed that they would perceive privacy
in a home with a robot with a breath sensor, somewhat disagreed for a camera,
and were neutral about microphones (with camera: 1.9 ± 1.1, microphone: 3.1
± 1.2, breath: 4.6 ± 0.53). One participant who provided a relatively high score
for cameras explained that he does not feel comfortable in front of cameras but
that they are everywhere nowadays and he is accustomed to living with them.
In the same sense, we think that diﬀerences in perceived privacy might not be
observed if the persons with access to the sensor and data are completely trusted
and no possibility exists for anyone else to gain access. For timing, the robot
reacted to changes in a person’s location on average in 6.5 ± 7.7s (presence:
6.7 ± 8.7s, absence: 6.3 ± 6.7s). The positive impression suggested that breath
sensors could be useful for a home robot, and we feel the time was acceptable
for our context, as we expect people to hold pets for longer than a few seconds.
4
Prototype 3 Robot motions: Fun reaching
In addition to being reactive, home robots will be expected to perform useful
tasks in an enjoyable manner, but playful behavior can result in undesired im-
pressions such as that a robot is obnoxious, untrustworthy, dangerous, moving
in a meaningless fashion, or boring. Previous work described dynamic genera-
tion of reaching motions intended to appear legible or deceptive [Dragan et al.
2014; Holladay et al. 2014]. Our own work suggested how a similar approach
could be used to generate playful motions which avoid typical pitfalls, based
on integrating straight useful motions with a curved playful component. We
integrated our model into a motion planning framework to dynamically gener-
ate reaching trajectories, built a humanoid robot prototype to perform planned
motions, as shown in Figure 5, and generated abstracted “point light display”
videos from the robot’s reaching motions to explore how motions are perceived.
(Point light displays consisting of moving dots allow motions to be observed,
while hiding the exact details of other factors such as the form, color, and size
of a moving artifact [Johansson 1973; Veto et al. 2013].)
[Figure 5 labels: reaching arms (4 DOFs each), objects, face, Kakapo.]
Figure 5: Fun motions prototype.
4.1
Approach
We built a prototype based on a scenario-specific requirement (the capability to
dynamically generate motions which could be perceived as playful) as well as
general requirements which might facilitate interaction: human-likeness (as a
familiar interface) and the ability to reach objects. We predicted that failures
in playful motions could occur as a result of over-playfulness, hidden goals,
wildness, randomness, and lack of variety, and that, if so, they could be avoided
by planning motions to be helpful, autotelic (having no goal outside of
playfulness), safe, clear in purpose, and anomalous. To dynamically generate
motions we adapted a gradient descent framework called Covariant Hamiltonian
Optimization for Motion Planning (CHOMP) [Zucker et al. 2013], combining
straight reaching motions with a curved playful component. For human-likeness
the prototype was given a face, and the ability to reach objects was implemented
by giving the prototype two arms at approximately human arm height and a mobile
base.
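The idea of combining a straight reaching motion with a curved playful component can be illustrated geometrically. The sketch below is not CHOMP and does not perform any optimization; it simply adds a sinusoidal sideways offset to a straight line in 2D, with a hypothetical function name and amplitude chosen for illustration.

```python
import math


def playful_reach(start, goal, amplitude=0.1, steps=11):
    """Return 2D waypoints from start to goal along a straight line plus a curved offset.

    amplitude: peak sideways deviation of the playful component (assumed value).
    The offset is zero at both ends, so the hand still starts and finishes on target.
    """
    (x0, y0), (x1, y1) = start, goal
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("start and goal must differ")
    nx, ny = -dy / length, dx / length  # unit vector perpendicular to the reach
    path = []
    for i in range(steps):
        s = i / (steps - 1)                        # progress along the reach, 0 to 1
        bump = amplitude * math.sin(math.pi * s)   # curved playful component
        path.append((x0 + s * dx + bump * nx, y0 + s * dy + bump * ny))
    return path


path = playful_reach(start=(0.0, 0.0), goal=(1.0, 0.0))
print(path[0], path[5], path[-1])
```

Setting the amplitude to zero recovers the purely functional straight reach, which makes the trade-off between usefulness and playfulness explicit in a single parameter.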
4.2
Evaluation
To investigate how to design successful playful reaching motions, we compared
the proposed motions with a baseline (some naive playful motions which did not
take into account our guidelines), both for single motions and sequences. Four
videos were generated by having the robot reach for an object on a tabletop.
To obtain more general impressions LEDs were attached to the arm and objects
to create point-light displays. Videos 1 and 3, consisting of single motions and
sequences using the proposed model, were designed to provide support that
the model is perceived as playful. Videos 2 and 4, consisting of single naive
motions and a naive sequence surrounded by proposed sequences, were designed
to investigate eﬀects of failures. The videos were watched by 14 participants at
our university (ﬁve females and nine males; average age = 30.5 years, SD = 10.2
years) over approximately one hour. To obtain a wide range of feedback, both
qualitative and quantitative, we used the think-aloud method, questionnaires,
and continuous evaluations.
4.3
Results
The proposed curved motions were perceived as more playful than straight motions
(3.6 ± 0.43 vs. 2.3 ± 1.1 out of 5: t(13) = 4.5, p = .001), with scores from
both the questionnaires and continuous scoring also above neutral (5.5 ± 1.2
vs. 4.0: t(13) = 4.8, p < .0005 (< α = .02); 3.6 ± 0.43 vs. 3.0: t(13) =
4.9, p < .0005), and participants laughed during 20 playful motion sequences.
The model was also perceived as failing significantly less than the baseline
for questionnaire scores (t(69) = -5.8, p < .0005) and think-aloud comments
(χ2(1, N = 35) = 22.8, p < .001). In Video 4, mean goodness scores before and
after the baseline sequence differed significantly (with one participant’s data
removed because he did not rate the robot at all before the baseline motions):
3.2 ± .87 before vs. 2.3 ± 1.2 after, t(12) = 3.4, p = .006. Playfulness
was not regarded as a failure in the system, with mean perceived goodness before
the baseline motions (3.0 ± 1.0) higher than a score of 2.0 expressing slight
disagreement that the robot performs well, t(13) = 3.7, p = .003.
Thus, our results suggested that, in the simpliﬁed scenario investigated, the
prototype could generate some playful motions which are also perceived as good.
5
Prototype 4 Robot motions: Adaptive size-
changing
In addition to moving in an entertaining way, a robot could also seek to support
positive interactions by moving in such a way as to adapt to people’s preferences
and routines [Dautenhahn et al. 2004]. One way in which a robot could adapt
itself is by changing its size: for example, a robot could become large to attract
the attention of someone who is deep in thought, or small to be carried by
a person who changes location frequently.
Such size changes should not be
threatening or bothersome. How people perceive diﬀerent heights in a robot
has been previously investigated [Walters et al. 2009; Rae et al. 2013], but
not different widths or size changes. In addition, some mechanisms for size changes
have been proposed, including a folding structure which can become shorter or
thinner but not both at the same time [Tachi and Miura 2012]. We built a prototype
which can become taller or wider, or both at once, as shown in Figure 6, and
used it to gather typical impressions of how size changes are perceived.
[Figure 6 labels: hand, eye, handle, mouth, Suica.]
Figure 6: Size adapting prototype.
5.1
Approach
Our design was based on a scenario-specific requirement (the capability to
perform size changes in height and width as one cohesive whole) as well as
general requirements which might facilitate interaction: human-likeness (as a
familiar interface), light weight, and small size. Actuation in two dimensions
was realized by forming a frame from parallel groups of linear actuators; the
challenge was to create a structure inside the frame which could change size
independently along two different dimensions while appearing to be a complete
artifact, and to do so in a safe and power-efficient manner (which would be
difficult with a simple solution involving stretchable rubber). We created such
a structure by designing a folding pattern composed of a grid of square
components linked by chevron-shaped connectors. Additionally, we attached a face
and moving hands, a camera, and Bluetooth. The implemented prototype was light
(approx. 700 grams) and small, but capable of expanding up to eight times in
area and approximately three times along a principal axis (from 27.5cm to
77.5cm).
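The expansion figures above can be related with a short calculation. This worked example assumes that both axes expand by the same factor as the quoted principal axis, which is an assumption made here for illustration.

```python
min_side_cm = 27.5   # contracted length along a principal axis
max_side_cm = 77.5   # expanded length along the same axis

linear_ratio = max_side_cm / min_side_cm  # expansion along one axis
area_ratio = linear_ratio ** 2            # assumes both axes expand equally

print(round(linear_ratio, 2), round(area_ratio, 1))
```

A linear factor of about 2.8 squares to an area factor of about 7.9, in line with "approximately three times" along an axis and "up to eight times" in area.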
5.2
Evaluation
We did not know how size changes would be perceived. Expansions could appear
intimidating, contractions could show tension or cuteness, and repeated changes
could seem playful, but the literature suggested many other possibilities.
To identify typical impressions we asked eight participants (age: M = 33.5
years, SD = 9.6; 2 female, 6 male) to speak their thoughts aloud while watching
the prototype change size. Alone in a room with the experimenter, participants
read a short handout, watched seven size changes in random order (tall, short,
wide, thin, large, small, repeated), and provided feedback in short interviews,
over approximately thirty minutes.
Typical impressions expressed by more than one participant were extracted
from coded transcripts and analyzed with regard to valence (how positive im-
pressions appeared to be), consistency across similar conditions for expansions
and contractions (e.g., comparing impressions for tall and wide), as well as
agreement with our expectations.
5.3
Results
Impressions spanned a spectrum from positive to negative; signiﬁcant diﬀerences
were not observed in numbers of positive, neutral, and negative impressions:
χ2(2, N = 54) = 0.8, p = .7.
Impressions also exhibited some consistency between related categories: ex-
pansions were perceived by some as intimidating/angry (tall and large), or in-
credulous (tall and wide) due to the expanding body and eyes. Contractions
seemed fearful and attentive (all three), and attractive (thin and small). Two
unshared impressions for expansions were that widening made the prototype seem
happy and smiling due to the expansion of the mouth, and that
expansions in both width and height seemed unnatural due to dispersion of the
face. Unshared impressions for contractions were that the robot was sad or
angry (for short), which, like the fearful impression, might have been perceived
as a reason for shrinking away from interaction.
Thus, some impressions were similar to what we had expected but expressed
diﬀerently (e.g., the intimidating robot was angry, the cute robot was attractive
or nice, tension was described as fearfulness or attentiveness, and playfulness
was described as happy and excited). Unexpected impressions of incredulousness
and happiness also resulted from expansions of the facial components (eyes and
mouth). We feel this supported our approach of checking the overall kinds of
impressions which would result as our ﬁrst step, rather than starting with a
more speciﬁc kind of study (such as a forced-word test).
6
Prototype 5 Robot appearance: Self-ﬁxing
Alongside motions, a robot’s appearance is also important for safety, trust,
liking, and aesthetics (e.g., avoiding uncanny impressions) and should be main-
tained. For a robot, maintaining appearance is a basic problem because the
appearance can be easily accessed without requiring internals to be exposed;
furthermore, maintenance is an important problem for home robots because
physically embodied systems experience wear and tear and laypersons typically
lack the parts and knowledge to conduct repairs. It would be beneﬁcial if robots
could help to ﬁx themselves.
Passive ﬁxing with self-healing materials will be useful in the future, al-
though challenges exist in dealing with large, repeated damage [Blaiszik et al.
2010] or cases when material should be removed or aligned (e.g., hair or fur).
Active ﬁxing, in which a robot moves to ﬁx itself, can address such cases but
requires capability for self-detection [Gold and Scassellati 2009], anomaly dis-
covery [Suzuki et al. 2011], as well as self-modiﬁcation [Brodbeck et al. 2012;
Revzen et al. 2011], which to our knowledge have not been combined previously.
We built an active fixing prototype comprising a mobile base, a target for
self-fixing (a small sign with the word “Robot”), and an arm with a marker, as
shown in Figure 7; the prototype (assuming no task has been given by a human)
wanders through a home-like environment in search of a reflective surface such
as a mirror, checks the appearance of its sign for an anomaly (a missing letter),
and writes in the letter using the marker held in its arm.
[Figure 7 labels: mirror, arm/marker, target to fix: robot, Kinect.]
Figure 7: Self-ﬁxing prototype.
6.1
Approach
Inspired by how humans check their appearances in a mirror, our prototype was
designed to perform self-ﬁxing in three steps: self-detection, anomaly detection,
and self-modiﬁcation. For self-detection the robot navigated to ﬁnd a mirror
and detected itself visually. We used Robot Operating System (ROS) to lo-
calize the robot in a known map and calculate paths, and OpenCV’s template
matching functions along with known images of the robot and some empirically
determined thresholds for visual detection.
To recognize anomalies, a one-class classiﬁer was trained on features from
various regions on an extracted image of the robot’s sign. First the image was
divided into ﬁve equal regions based on our prior knowledge that there would
be ﬁve letters, then the number of SIFT features in each grid cell was passed to
a one-class Support Vector Machine (SVM) classiﬁer with an RBF kernel and
parameters nu and gamma set to 0.5 and 3.1e05, using LIBSVM.
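The feature extraction step described above can be sketched as follows. This is a minimal illustration rather than the prototype's code: the function name `cell_counts` is hypothetical, keypoints are assumed to be given as x-coordinates in pixels (for example from a SIFT detector), and the extracted sign is assumed to span the full image width. The resulting five-element count vector is what would be passed to the one-class classifier.

```python
def cell_counts(keypoint_xs, image_width, n_cells=5):
    """Count keypoints falling in each of n_cells equal-width vertical strips."""
    if image_width <= 0 or n_cells <= 0:
        raise ValueError("image_width and n_cells must be positive")
    counts = [0] * n_cells
    cell_width = image_width / n_cells
    for x in keypoint_xs:
        if not 0 <= x <= image_width:
            continue  # ignore keypoints outside the extracted sign image
        # Clamp so that x == image_width lands in the last cell.
        index = min(int(x // cell_width), n_cells - 1)
        counts[index] += 1
    return counts


# Hypothetical keypoints on a 100-pixel-wide sign image.
feature_vector = cell_counts([4, 12, 25, 33, 47, 52, 68, 71, 95], image_width=100)
print(feature_vector)
```

Using only per-cell counts keeps the feature vector short and fixed in length, at the cost of discarding where within a cell the features occur.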
For self-modification we conceived of a simplified model for depicting
alphanumerics with six lines or fewer, inspired by the seven-segment approach
(https://www.google.com/patents/US1126641), and recorded joint values; by
recording two points for each grid cell, such as the center and upper right-hand
corner, any alphanumeric could be drawn.
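One way to read the two-point idea is that the remaining corner and edge points of a grid cell follow by symmetry from its center and upper right-hand corner. The sketch below illustrates this under several assumptions: points are treated as 2D coordinates on the sign (the prototype recorded arm joint values instead), cells are symmetric about their center, and the anchor names and the `STROKES` table are hypothetical.

```python
def cell_anchors(center, upper_right):
    """Derive nine anchor points of a cell from its center and upper-right corner."""
    cx, cy = center
    dx = upper_right[0] - cx  # half-width of the cell
    dy = upper_right[1] - cy  # half-height of the cell
    return {
        "top_left": (cx - dx, cy + dy),
        "top_center": (cx, cy + dy),
        "top_right": (cx + dx, cy + dy),
        "mid_left": (cx - dx, cy),
        "center": (cx, cy),
        "mid_right": (cx + dx, cy),
        "bottom_left": (cx - dx, cy - dy),
        "bottom_center": (cx, cy - dy),
        "bottom_right": (cx + dx, cy - dy),
    }


# Hypothetical stroke table: each glyph is drawn with at most six lines.
STROKES = {
    "t": [("top_center", "bottom_center"), ("mid_left", "mid_right")],
    "o": [("mid_left", "mid_right"), ("mid_right", "bottom_right"),
          ("bottom_right", "bottom_left"), ("bottom_left", "mid_left")],
}


def strokes_for(glyph, center, upper_right):
    """Return the line segments (start, end) needed to draw a glyph in a cell."""
    lines = STROKES[glyph]
    if len(lines) > 6:
        raise ValueError("glyph exceeds the six-line budget")
    anchors = cell_anchors(center, upper_right)
    return [(anchors[a], anchors[b]) for a, b in lines]


print(strokes_for("t", center=(0, 0), upper_right=(1, 2)))
```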
6.2 Evaluation
Self-detection was assessed by commanding the prototype 20 times to look for
itself from an arbitrary starting point while wandering randomly in a 3m x
3m home-like environment, with a mirror placed in four diﬀerent areas: the
bathroom, bedroom, kitchen, and entrance; the robot halted if it detected a
positive (an image it thought represented itself) or visited each room once. The
accuracy of anomaly detection was measured using roughly equal numbers of
normal and anomalous images (70 and 61) obtained by placing the robot in
diﬀerent poses relative to the mirror (distance and angle), in diﬀerent light,
and with different degrees of anomaly occurring to the last letter (the “t” of
“Robot”). The robot’s sign was checked before and after fixing by counting the
number of SIFT points detected in the grid cell containing an anomaly.
6.3 Results
The prototype detected itself correctly in 76.9% of the cases in which it detected
a positive (10/13), and did not ﬁnd a mirror seven times; for anomaly detection,
accuracy was 71%; and the number of SIFT features in the anomalous grid cell of
the sign increased from almost none to a reasonable amount (3 to 37). Successful
self-detection took on average approx. 150 seconds, whereas anomaly detection
and self-modiﬁcation were quick, requiring only several seconds.
We feel the results are reasonable given the complexity of the challenge: the
self-detection module did not handle diﬀerences in scale or rotation (the robot’s
distance or angle relative to the mirror), the robot could miss itself when it is
turning (in between frames), and the complex environment had many objects
colored similar to the robot; for anomaly detection there was high variation in
pose and illumination; and for self-modiﬁcation the basic shape of the letter
could be seen and the number of SIFT points was similar to results for a normal
image, suggesting that ﬁxing resulted in some improvement. Moreover, if we
assume the mock-up home was ten times smaller than an average home (in
Sweden this is 89 m²), the average time taken would be approx. 25 minutes,
which would not be prohibitive if the robot occasionally has some down-time
(e.g., when humans are sleeping or not at home), locations of previous detections
can be remembered, and other tasks can be conducted simultaneously (e.g.,
looking for potential danger).
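The 25-minute estimate can be reproduced with a short calculation. The sketch below assumes that search time grows linearly with floor area, which is a simplification introduced here for illustration; the constant and function names are hypothetical.

```python
MOCKUP_AREA_M2 = 3 * 3          # 3m x 3m mock-up home
AVERAGE_HOME_AREA_M2 = 89       # average Swedish home, as stated in the text
MEAN_DETECTION_TIME_S = 150     # average time for successful self-detection


def scaled_search_time_min(time_s, small_area_m2, large_area_m2):
    """Scale a measured search time to a larger home, assuming linear growth with area."""
    return time_s * (large_area_m2 / small_area_m2) / 60


estimate_min = scaled_search_time_min(
    MEAN_DETECTION_TIME_S, MOCKUP_AREA_M2, AVERAGE_HOME_AREA_M2)
print(f"{estimate_min:.1f} min")  # prints "24.7 min"
```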
7 Prototype 6 Robot appearance: Unobtrusive transparency
In addition to reducing maintenance times, robots in human environments can
also seek to recognize people’s activities and intentions and reduce the degree to
which they obstruct or get in the way of humans. One desirable characteristic
could be to avoid blocking a human’s view, which a robot could realize by
turning transparent. However, it was unclear how transparency in a social robot
would be perceived. Various mechanisms are being explored for transparency, including
active camouﬂage which requires projectors or many LEDs [Tachi 2003]. Some
techniques have also become feasible at the nano-scale but not yet at the scale
of robots which could interact with humans [Valentine et al. 2009]. Transparent
organic LEDs (OLEDs) could be used, but white backgrounds can be problem-
atic [planar 2016]. Furthermore, liquids of varying opacity can be moved within
see-through materials, using various components such as pumps and reservoirs
[Morin et al. 2014]. We report on a simple, light design using smart ﬁlm and
conductive plastics, used to explore typical impressions of a prototype becoming
transparent (referred to here as “transpariﬁcation”), as shown in Figure 8.
Figure 8: Transparency-capable prototype (labels: transparent; opaque; smart film; eyes; conductive plastic; mouth; university logo).
7.1 Approach
We built a prototype based on scenario-speciﬁc requirements–capability to turn
transparent or opaque–as well as general requirements which might facilitate
interaction–human-likeness, lightness and small size. Transparification was
realized by using two electrochromic films containing polymer dispersed liquid
crystals (PDLCs), which align to let light pass when powered, placed behind the
main parts of the prototype; clear light emitting diodes (LEDs); and polyethylene
terephthalate (PET) plastic coated on one side with indium tin oxide (ITO) to
conduct electricity to the LEDs and act as touch sensors. The prototype was
also given humanoid characteristics (a head and actuated hand, attached above
an opaque base holding electronics), weighed approx. 700g and measured 0.35m
width x 0.175m height x 0.105m depth.
7.2 Evaluation
We did not know what kind of social impressions could result from transpariﬁ-
cation, because humans cannot turn transparent and many possibilities existed
(e.g., clarity and understanding, fear, shadiness, or embarrassment could be
attributed). We predicted that proactive transpariﬁcation would be perceived
positively, as an attempt to allow a person to see better, but that as a reaction
to a human behavior it would indicate a negative feeling toward interacting.
Furthermore, repeated transparification would indicate playfulness or desire to
be noticed in the proactive case, and acknowledgement of human behavior in
the reactive case.
To check, the prototype was shown to eight participants (age: M = 33.4
years, SD = 11.0; 2 female, 6 male), who were asked to describe the robot’s
behavior aloud (e.g., what is the robot doing, and why do you think the robot
did that?). Two factors were controlled: the robot behavior (transparification,
opacification, or changes repeated three times each at 1 Hz) and timing
(proactive or reactive). In the reactive case participants were asked to wave, say
hello to the robot, and touch its head before the robot’s behavior was trig-
gered (imagining that they were interacting with the robot). Transcripts were
coded and typical subjective comments common to two or more participants
were extracted.
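The extraction of typical comments can be illustrated with a small counting routine. This is a sketch under assumptions: the coded transcripts are represented as a mapping from a participant identifier to a list of codes, the function name `typical_impressions` is hypothetical, and each participant is counted at most once per code.

```python
from collections import Counter


def typical_impressions(codes_by_participant, min_participants=2):
    """Return codes mentioned by at least min_participants distinct participants."""
    counts = Counter()
    for codes in codes_by_participant.values():
        counts.update(set(codes))  # count each participant once per code
    return {code: n for code, n in counts.most_common() if n >= min_participants}


# Hypothetical coded comments for one condition.
example = {
    "p1": ["turned off", "fell asleep", "fell asleep"],
    "p2": ["fell asleep"],
    "p3": ["broken"],
}
print(typical_impressions(example))
```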
7.3 Results
As shown in Table 2, transparification was perceived by half of the participants
as indicating a change in arousal, with the robot turning off; a sleeping metaphor
was common, as were references to attentiveness, whereas other emotional
impressions related to valence or dominance were not perceived. Repeated changes
were described as “blinking”, and sometimes as the robot calling for attention
or seeking interaction. Proactive behavior was sometimes unclear, whereas the
reactive robot sometimes appeared to malfunction.
Reasons were derived from asking participants. Transparification was perceived
as turning off, like a screen turning dark; we thought this could be due
to a reduction of information transmitted from the robot. Anthropomorphic
impressions of sleeping or blinking were catalyzed by the prototype’s human-like
qualities. Repetition indicated desire; we thought this could be because
adaptors (repeated motions in humans) like finger- or foot-tapping can express
concern. Reactive behavior seemed clearer due to causality inferred from
temporal correlation of behaviors. Some impressions of malfunctioning were due to
perceived incongruity in the robot turning off after being greeted by the person.

Table 2: Typical impressions of transparent behavior.
Timing    | Transparent | Opaque | Repeated
Proactive | fell asleep (5), turned off (4), inactive (2) | turned on (4), woke up (2), waiting (2), attentive (2), unclear (2) | blinking (5), waiting (3), attentive (3), calling attention (2), unclear (2)
Reactive  | turned off (4), broken (3), responded (2), inactive (2), fell asleep (2) | responded (7), woke up (3) | blinking (4), responding (4), waiting (2), broken (2), seeking interaction (2)
Thus, the results suggest that transpariﬁcation can be incorporated into a
home robot’s capabilities, although care should be given in a reactive context
to avoid an impression of malfunctioning; and that one possible use could be to
indicate a robot is dormant, e.g., when a person is busy and not interacting.
8 Prototype 7 Useful application (fall emergency): Smart home integration
In addition to interacting in a nice way, home robots should be capable of
performing useful tasks, one of which will be helping people in emergencies.
One problem is that people might not normally want to be observed by cameras
and microphones on a robot (e.g., when in the bathroom or bedroom), and
also a robot might not become aware of an emergency happening in a diﬀerent
part of a home.
Here we propose that a robot can be combined with some
environmental sensors, which can be simple to preserve privacy, and placed
throughout a home to detect emergencies when a robot is not nearby.
The
interactive feature proposed is that, when trouble is suspected, the robot can
go to where a person is and ask if they are okay.
Verbal communication is
a common and expressive interactive modality which is also useful because a
robot’s positioning does not have to be perfect (conversation can take place
over distances without requiring line of sight). Some home robots have been
designed to go to speak with a person when a condition is met; e.g., one robot
urges elderly persons to drink if they have not had a drink for a while [Dragone
et al. 2015]. As well, some smart home systems have been designed to detect
anomalous behavior patterns [Novak et al. 2013; Kim et al. 2010]. Here we
combine these two approaches, building a prototype which, based on anomalies
detected using some simple environmental sensors, can move close to a victim
and ask if they are okay, as shown in Figure 9.
Figure 9: Smart home integrated prototype (labels in the mock-up home: bedroom; sensors; robot; hallway).
8.1 Approach
We built a prototype based on scenario-speciﬁc requirements–capability to pro-
cess data from simple environmental sensors in a central database–as well as gen-
eral requirements which might facilitate interaction–human-likeness and mobility–
consisting of simple sensors, a central database, and a mobile robot. Eleven
sensors (four pressure, three contact and four passive infrared) were placed in
four locations (a bathroom, bedroom, hallway, and kitchen) in a 3m x 3m small
single-ﬂoor apartment-like space. These sensors could be useful for example for
a person with dementia: pressure sensors can detect if a person falls and stays
in one spot for a long time; contact sensors can detect if a person is opening
drawers to cook even though they have already eaten; and infrared sensors can
detect someone leaving the house in the middle of the night. Data from the
sensors was gathered in a central database. A random forest classiﬁer [Ho 1995]
was trained to detect anomalous patterns, which triggered the robot to move
to the location of the anomaly to investigate.
The prototype was given the
capability to communicate in a human-like fashion via speech; it was designed
to verbally ask if it should call emergency medical services (EMS); a negative
response caused the robot to return to its initial position, and a positive answer
or timeout caused the robot to state that a call for help had been made. For the
robot’s hardware, a small differential drive mobile base with a Microsoft Kinect
sensor was used (0.35 x 0.35 x 0.42m, 6.3 kg). For software, Robot Operating
System (ROS) was used for navigation and visualization, and Festival and CMU
Pocket Sphinx for speech-based interactive capabilities.
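The dialogue logic described above (a negative response sends the robot back, while a positive response or a timeout leads to a call for help) can be sketched as a small decision function. The names `Action` and `next_action` are hypothetical, and treating any unrecognized utterance like a timeout is an assumption made here on the side of caution, not something stated in the text.

```python
from enum import Enum


class Action(Enum):
    RETURN_TO_START = "return to initial position"
    ANNOUNCE_CALL = "state that a call for help has been made"


def next_action(recognized_response):
    """Map a recognized reply ('yes', 'no', or None on timeout) to a robot action."""
    if recognized_response == "no":
        return Action.RETURN_TO_START
    # A positive answer, a timeout (None), or an unrecognized reply
    # (assumption) all result in help being called.
    return Action.ANNOUNCE_CALL


for reply in ("no", "yes", None):
    print(reply, "->", next_action(reply).value)
```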
8.2 Evaluation
To be feasible, a robot would have to be able to navigate to the scene of an
anomaly quickly and correctly determine if an emergency had occurred through
verbal interaction. Evaluation was conducted by having the experimenter trigger
the environmental sensors 20 times in an anomalous manner, sending the robot
to investigate anomalies ﬁve times each in four locations (a bedroom, hallway,
kitchen, and bathroom); the robot asked if it should call for help, to which the
experimenter answered aﬃrmatively half of the time and otherwise negatively.
The average time required for the robot to arrive at a location and average
accuracy in recognizing a human’s response correctly were computed.
8.3 Results
The prototype required an average of 13.8s (SD: 7.9) to go to the bedroom,
hallway, and kitchen. (The data for the bathroom were not used because the
robot’s initial position was near the bathroom and it only had to turn.) Verbal
responses from the experimenter at the anomaly location were correctly
recognized 76.9% of the time, with problems arising from the timing of when the
robot should attempt recognition.
Thus, results were mostly successful but indicated room for improvement.
Open areas in a single ﬂoor apartment could be reached in under a minute (al-
though real-world problems of stairs and blocked passages were not addressed)
and a robot could acquire other information in addition to a verbal response to
determine if a person is in danger (for example, visual detection of lip movement
could conﬁrm when a robot should seek to recognize, and pose detection could
indicate if a fall had occurred).
9 Prototype 8 Useful application (fall emergency): Health assessment
Asking if a human is okay when simple sensors detect an anomaly might not
always work, e.g., if sensor coverage is incomplete (e.g., in the presence of oc-
clusions) or a person is unresponsive. We propose that it would be helpful if
a home robot would also be able to autonomously estimate the health states
of detected people, possibly while patrolling. Some teleoperated robots have
been designed to facilitate ﬁrst aid–e.g., to bring a deﬁbrillator to a speciﬁed
location [Katz 2015], or to remotely observe and conduct surgery on a soldier
on a stretcher [Martinic 2014]–but medical staff might be far away or busy and
teleoperation can require eﬀort and skill. We built a prototype capable of au-
tonomously assessing health of fallen persons based on some ﬁrst aid guidelines,
as shown in Figure 10, by adding some simple sensors and software to an avail-
able platform, and focusing on three steps: detecting emergencies, localizing
body parts of interest for ﬁrst aid, and assessing some vital signs.
Figure 10: Health assessing prototype (labels on the mannequin: chin position, for airway/breathing; body pose/wounds, for bleeding; cyanosis, for circulation).
9.1 Approach
We wanted our design to leverage typical robotic qualities (using the robot’s
ability to sense with various sensors and move throughout a home) and produce
some human-understandable output. To infer if an emergency occurred, falls
and fallen persons were detected: falls were detected by comparing shoulder
height displacement of a frontally located person with a threshold, while also
noting fall direction, which we expected could be valuable for estimating injury
locations.
Fallen persons were detected as human-sized, human-temperature
anomalies, by comparing the size and temperatures of clustered laser scans
within a known map with thresholds during patrolling. The prototype estimated
the location of body parts of interest for ﬁrst aid (chest, hands, chin, mouth,
and nose) based on face and skin detection and a simpliﬁed prior model. To
detect faces, which might not be initially visible, the prototype scanned over the
anomaly with a camera attached to its arm, and navigated to the other side of
the anomaly, while rotating image data, as face angles cannot be known a priori.
The estimated pose was visualized over a map of the environment (additionally
a visual servoing algorithm was built for the robot to indicate points with a
laser pointer). Furthermore, the prototype checked for relative blueness in the
distal portion of the hands to assess circulatory state (peripheral cyanosis), chin
pose for airway, speed and normalcy of sound for breathing, and location and
rate of expansion of red color for bleeding.
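The two threshold-based emergency checks can be sketched as below. This is an illustration under assumptions, not the prototype's implementation: shoulder positions are taken as (x, y, z) in meters in a person-centered frame with x to the person's right, y up, and z forward, and all threshold values and function names are hypothetical.

```python
def detect_fall(prev_shoulder, curr_shoulder, drop_threshold_m=0.4):
    """Return the fall direction, or None if shoulder height dropped less than the threshold."""
    dx = curr_shoulder[0] - prev_shoulder[0]
    dy = curr_shoulder[1] - prev_shoulder[1]
    dz = curr_shoulder[2] - prev_shoulder[2]
    if -dy < drop_threshold_m:
        return None  # shoulders did not drop far enough to count as a fall
    # The larger horizontal displacement decides the reported direction.
    if abs(dz) >= abs(dx):
        return "forward" if dz > 0 else "backward"
    return "right" if dx > 0 else "left"


def is_fallen_person(cluster_length_m, cluster_temp_c,
                     length_range_m=(0.3, 2.0), temp_range_c=(28.0, 40.0)):
    """Check whether a laser-scan cluster is human-sized and human-temperature."""
    size_ok = length_range_m[0] <= cluster_length_m <= length_range_m[1]
    temp_ok = temp_range_c[0] <= cluster_temp_c <= temp_range_c[1]
    return size_ok and temp_ok


print(detect_fall((0.0, 1.4, 0.0), (0.1, 0.5, 0.6)))
print(is_fallen_person(1.7, 33.0))
```

Requiring both the size and the temperature check to pass is one simple way to reject warm small objects and cold human-sized ones.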
9.2 Evaluation
To assess health, all three steps must be achieved: vital sign estimation relies
on recognizing where to measure, which in turn relies on recognizing if there is
something to investigate. These steps were evaluated individually as the number
of test cases would be prohibitively large for a holistic evaluation (recognition for
ﬁrst aid is a highly complex task requiring numerous capabilities, each of which
must deal with various typical cases). For emergency detection, the prototype’s
ability to detect fall directions and fallen persons was assessed in 40 trials each
(20 each for detecting human-sized anomalies and human-temperatures); for
falls, a mannequin was pushed forward, backward, or to the sides. For fallen
persons, cases intended to be confusing for the classifier were used, with various
poses, sizes, and temperatures. For body part localization, face detection success
rate and error for chest, hands, chin, mouth, and nose were assessed via 20 and 5
trials, respectively. For vital signs, cyanosis, chin pose, breathing, and bleeding
were assessed: cyanosis with 40 samples (six images each with bluing in six
regions, and four images with no bluing); chin pose with 40 samples with chin up
or down and face oriented front, side, or downward; breathing with 40 samples
(ten for regular breathing and 30 for abnormal breathing which was fast, slow,
or agonal, i.e., resembling gasping sounds emitted near death); and bleeding
with 36 samples for location (six each for head, body, and each limb) and 18
samples for speed (massive, slight, or no bleeding).
9.3 Results
Average accuracy over all parts was 78%. Accuracy was 85% for emergency
detection: 80% for detecting fall direction, and 90% for detecting fallen persons
(85% for anomalies, and 95% for detecting human temperature). Faces could
be detected in 70% of test cases, for which average error was 0.015m. Average
accuracy for vital signs was 79%: 65% for cyanosis, 75% for chin pose, 85%
for breathing, and 91% for bleeding (97% location/85% speed).
We believe
that the results are promising for a ﬁrst prototype. Accuracies were imperfect
due to various challenging factors: erroneous estimation of the 3D position
of a person’s shoulders at close distance in narrow spaces for fall detection,
uncertainty of wall positions due to sensor noise for anomaly detection (objects
were near walls and anomaly size was small when the robot could only see the
width and not the length of the human body), a cooling hot object for detecting
human temperature, extremely angled faces for face and chin pose detection,
the simpliﬁed model for estimating chest and hand location, low resolution on
hand data for cyanosis, noise and high variance for breathing sounds, closeness
of joints for bleeding location detection, and ﬂow rates close to the threshold
for bleeding rate detection. If such problems can be mitigated, we expect that
robots in homes will make a useful contribution, not by replacing human experts
but by helping to detect problems and causes quickly.
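The aggregate accuracies reported in this section appear to be unweighted means of their components. The short check below assumes exactly that (the text does not state how the averages were formed) and reproduces the reported 85%, 79%, and 78%.

```python
from statistics import mean

# Emergency detection: fall direction, and fallen persons (size and temperature).
fallen_persons = mean([85, 95])
emergency = mean([80, fallen_persons])

# Vital signs: cyanosis, chin pose, breathing, and bleeding (location and speed).
bleeding = mean([97, 85])
vital_signs = mean([65, 75, 85, bleeding])

# Overall: emergency detection, face detection (70%), and vital signs.
overall = mean([emergency, 70, vital_signs])

print(emergency, vital_signs, overall)
```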
10 Prototype 9 Useful application (fall emergency): Finding help
In a health emergency in a home with more than one person, care facility or
public place, aside from calling medical services and checking if a person is
okay, a robot can also actively move to try to ﬁnd a nearby human capable of
helping (conducting ﬁrst aid or driving the victim to a hospital). The challenge
is that every second can count: the possibility of a victim being helped should
be maximized.
We formulated the problem as follows: nearby humans can
be regarded as “nodes” on a graph which the robot can visit; each node has a
travel cost and reward (expected helpfulness); a time limit is likely (e.g., battery
duration, or approx. 5 minutes for cardiac arrest); the robot cannot visit all
nodes (all humans); nodes move and can appear or disappear from view; and the
robot may pass through a node more than once and might not need to return
to its starting point. Thus, at a high level, the problem can be structured
as a variant of the Traveling Salesman Problem (TSP), e.g., with profits and
partial observability [Kataoka and Morito 2013]; at a low level, a path planning
algorithm such as an A* variant can be used for the robot to move to a node
[Hart 1968]. The unique problem was estimating helpfulness, which has been
conducted for online reviews [Ghose and Ipeirotis 2011], but not for people in
an emergency context. We programmed a ﬂying robot prototype to approach
potentially helpful people in its vicinity, as shown in Figure 11.
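To make the formulation concrete, the sketch below selects which people to visit under a time limit using a simple greedy rule (highest expected helpfulness per second of travel). It is a stand-in for a proper TSP-with-profits solver and is not the method used in the prototype; it also assumes static people, straight-line travel instead of A* paths, and hypothetical names and values throughout.

```python
import math


def greedy_route(start, people, speed_m_s, time_limit_s):
    """Greedily visit people by reward per travel time until the time limit is reached.

    people maps a name to ((x, y) location in meters, expected helpfulness).
    """
    position = start
    remaining = time_limit_s
    unvisited = dict(people)
    route = []
    while unvisited:
        best_name, best_ratio, best_time = None, 0.0, 0.0
        for name, (location, reward) in unvisited.items():
            travel_time = math.dist(position, location) / speed_m_s
            if travel_time > remaining:
                continue  # cannot be reached within the remaining time
            ratio = reward / travel_time if travel_time > 0 else math.inf
            if ratio > best_ratio:
                best_name, best_ratio, best_time = name, ratio, travel_time
        if best_name is None:
            break  # nobody worthwhile is reachable any more
        route.append(best_name)
        remaining -= best_time
        position = unvisited.pop(best_name)[0]
    return route


# Hypothetical scene: three people with different locations and helpfulness.
people = {"a": ((3, 0), 0.2), "b": ((0, 4), 0.9), "c": ((10, 0), 1.0)}
print(greedy_route((0, 0), people, speed_m_s=1.0, time_limit_s=12))
```

A greedy rule is easy to re-run whenever people move, appear, or disappear, but it offers no optimality guarantee.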
Figure 11: Help finding prototype (“Who to ask for help?”; labels: person, far; person, close; drone).
10.1 Approach
The design required capability for estimating helpfulness and simpliﬁed navi-
gation. Helpfulness could be estimated based on various factors, including age,
distance, profession, motion, and possession of useful resources. For example,
adults, especially medical workers and lifeguards, who are close by or approaching
and possess a cellular phone could be helpful. Conversely, young children
and elderly persons might not be physically capable of providing first aid, and valuable
time could be lost approaching people far away or moving quickly away from
the robot. Thus we designed our prototype to estimate helpfulness based on two
rules of thumb: estimated distance based on the size of detected faces, and the
height of detected faces (adults are expected to be more helpful and taller than
young children). To detect people the prototype used the Viola-Jones approach
[Viola and Jones 2001] to detect faces, and heads and shoulders.
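The two rules of thumb can be combined into a single score, as in the sketch below. The weighting, the normalization constant `near_face_width_px`, and the function names are assumptions made for illustration; the text only states that face size (as a proxy for distance) and face height were used.

```python
def helpfulness(face_width_px, face_center_y_px, image_height_px,
                near_face_width_px=200.0, w_distance=0.5, w_height=0.5):
    """Score a detected face: larger (closer) and higher in the image scores better."""
    closeness = min(face_width_px / near_face_width_px, 1.0)
    # Image y grows downward, so a small y means the face is high in the view.
    tallness = min(max(1.0 - face_center_y_px / image_height_px, 0.0), 1.0)
    return w_distance * closeness + w_height * tallness


def most_helpful(faces, image_height_px):
    """Pick the (width, center_y) detection with the highest score."""
    return max(faces, key=lambda f: helpfulness(f[0], f[1], image_height_px))


# Hypothetical detections as (face width, face center y) in a 480-pixel-high image.
print(most_helpful([(60, 240), (100, 240)], image_height_px=480))
```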
Simpliﬁed navigation was conducted via visual servoing; the face of a target
helpful person was used as a landmark, with the prototype approaching while
turning so as to keep the person’s face at the center of its view, until estimated
closeness exceeded a threshold. A flying robot was chosen based on the idea that
the robot could fly over obstacles, and that a small drone could be carried by
a person or installed in environments with elderly persons. For a first
investigation we did not consider other problems such as how the robot should
communicate, occlusions, difficult poses, or wind.
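The visual servoing loop described above can be sketched as a proportional controller on the horizontal position of the target face. The gains, the stopping width, the sign convention (positive yaw turns toward the right of the image), and the function name are hypothetical; a real drone would also need altitude and drift control.

```python
def servo_command(face_center_x_px, face_width_px, image_width_px,
                  stop_face_width_px=160.0, yaw_gain=1.0, forward_speed=0.2):
    """Return (yaw_rate, forward_speed) that keeps the face centered while approaching."""
    if face_width_px >= stop_face_width_px:
        return 0.0, 0.0  # estimated closeness exceeds the threshold: stop
    half_width = image_width_px / 2
    # Normalized horizontal error in [-1, 1]; positive means face is right of center.
    error = (face_center_x_px - half_width) / half_width
    return yaw_gain * error, forward_speed


print(servo_command(480, 80, 640))
```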
10.2 Evaluation
To evaluate if the prototype could approach persons estimated to be helpful,
we placed it in front of two approximately life-sized upper body photos in a
room with the door closed and sent a command to ﬁnd a person to go to for
help. Identical photos of one person were used for convenience, and to avoid any
confusion which might result due to personal diﬀerences in face width, as face
size was used to estimate distance. Two conditions were investigated, distance
and height, with 20 trials conducted in total.
In the distance condition (10
trials), the robot was placed 2m and 1.5m away from one photo (ﬁve trials
each), with the other photo placed farther at 3m for all cases; height for both
photos was equal (1.2m). For height, the robot was placed 2m away and the
photos were placed with a vertical distance of approx. 0.3m or approx. 0.15m
around 1.2m (ﬁve trials each). Positions were chosen such that both photos
were in camera view and could be positively detected and a diﬀerence could be
perceived in distance or height. During data acquisition successful trials were
counted. To be successful, the robot had to approach the closer and higher
photos.
10.3 Results
The robot approached the correct target 90% of the time (18/20 trials) but
crashed into the wall and a target one time each. The success rate was rea-
sonable, due to the simpliﬁed context.
The two failures resulted from some
drifting for which the drone’s in-built stabilization algorithm could not com-
pensate; this sometimes interfered with detection and we believe was aﬀected
by various factors including erratic air currents from the drone’s own propellers,
low ground contrast, and minor imbalances in propeller alignment or the pro-
tective outer hull. We think these problems will be less important in the future
as technologies improve.
11 Discussion
Nine medium ﬁdelity prototypes were built to explore novel capabilities related
to interactiveness and intelligence which might contribute to acceptance of home
robots. Some things we learned, some unexpected observations, and some next
steps are as follows:
*1 Energy harvesting
We learned that it is possible to leverage thermal,
kinetic, and optic energy harvesting for a small held social robot to provide
visible reactions in typical conditions (approx. 20 °C, >1300 Pa, >300 lux). A
nice but unexpected corollary to using freely balanced ears was that the pro-
totype reacted to being picked up or shaken (proprioceptive behavior). Future
work will involve combining low energy parts (microcontroller, sensors, and
wireless) to transfer important human vital signals such as body temperature,
pulse, respiratory rate, and oximetry to a computer for processing (for exam-
ple temperature could be measured at low current with a SMT172 sensor and
transmitted wirelessly with Silicon Labs EZR32).
*2 Private breath sensing
We learned that for a close interaction with a
small held social robot a breath sensor can be used to estimate human position
in an unobtrusive way with an average reaction time of six seconds. A nice but
unexpected insight was that the prototype would react not only when facing
away or behind an occlusion (which we had expected), but also to speech. Future
work will involve improving performance by fusing data from multiple sensors.
*3 Fun reaching motions
We learned that a good and playful impression
could be elicited over several minutes by blending a straight functional motion
with curved playful components and following some simple heuristics.
One
unexpected observation was an inverted U-shaped relation
between perceived playfulness and motion length, suggesting that such motions
should be neither too short nor too long. Future work will investigate playfulness
in other kinds of motion such as locomotion or with objects and how other
desirable characteristics can be evoked, such as cuteness or coolness.
*4 Adaptive size-changing motions
We learned that our participants typ-
ically perceived expansions as expressing threat or incredulousness, and contrac-
tions as expressing fearfulness, attentiveness, and attractiveness. Unexpected
was that expansions in both height and width seemed unnatural due to disper-
sion of the face, and widening seemed happy due to expansion of the mouth.
Future work will generate more stable structures which can expand in various
ways (due to drooping our prototype was placed on the ﬂoor in front of interact-
ing people), and test local expansions (e.g., skew motions, or only the robot’s
body and not the eyes or mouth).
*5 Self-ﬁxing appearance
We learned that a robot could seek to ﬁx a simple
ﬂaw in its own appearance by navigating to detect its own reﬂection with a
success rate of 76.9% in approx. 150 seconds, discovering anomalies with an accuracy of 71%, and
modifying itself. Unexpected for us were some of the false positives detected
by our prototype, which looked little like the robot to us (we believe this was
a result of our prototype not being able to deal well with varying illumination
and variations in distance and orientation to the mirror, and indicates that
improvement will be possible). Future work will include dealing with various
kinds of anomalies (e.g., additive, aligning) and accessing a larger area of the
robot for ﬁxing.
*6 Transparent appearance
We learned that our participants typically per-
ceived transpariﬁcation as the robot becoming dormant, and that care should
be given when used as a reaction to avoid an impression of malfunctioning. This
was unexpected to us, as we had predicted it would be perceived as a change in
valence, not arousal; furthermore, playfulness was not perceived (possibly also
because positive valence was not communicated). Future work will explore local
changes, and partial changes, and other mechanisms for transparency, also for
other components.
*7 Going to victim
We learned that a robot, based on processing data from
some simple environmental sensors, can reach various locations fairly quickly
(13s in a small space, which we estimate to be several minutes for a one-ﬂoor
Swedish home of average size), and interact via speech with a success rate of
76.9% to estimate if a human wanted the robot to call for help. Unexpected
was that problems occurred from the robot hearing itself speak or not hearing
a fast response from a human. Future work will involve using visual cues to
better tell when a human is speaking, and detecting confusion (e.g., via Glasgow
Coma Scale, when a human’s assessment of whether help is required may not
be correct, as in the case of some stroke victims).
*8 Health assessment
We learned that a prototype could be constructed to
assess a person’s health state in emergencies in a simpliﬁed context, based on
some ﬁrst aid guidelines and some simple sensors, with an accuracy of 78%. Un-
expected was that the remote temperature sensor was useful even when clothes
were worn and at distances of several meters.
Future work will involve im-
proving performance (e.g., by learning models for a speciﬁc person), and using
localized body parts and vital sign assessment to try to perform some simple
ﬁrst aid actions on a mannequin (ﬁrst aid on a human can result in injuries such
as broken ribs; therefore development should ﬁrst aim to demonstrate reliable,
safe performance on a mock-up).
*9 Finding help
We learned that a prototype could be constructed to make
a decision about people’s helpfulness and approach a nearby target with a suc-
cess rate of 90% in a simpliﬁed context. Unexpected was that even in a closed
room, drift was a large problem. Future work will involve trials in more complex
conditions (e.g., outdoors in the presence of wind), also by using TSP/A* ap-
proaches; better estimation of helpfulness by detecting other cues; and ﬁnding
a good strategy for communicating information about an emergency.
In general, we believe that there is some overlap in the knowledge obtained
from creating prototypes. For example, the reaching prototype was used to in-
vestigate how to generate fun behavior, but some insight can also be drawn from
other prototypes: the freely swinging ears on the energy harvesting prototype
were fun because they provided quick ﬁne feedback; breathing on a prototype
seemed enjoyable possibly because it was anomalous (we usually don’t breathe on
people or pets); and impressions of playfulness in the size-changing prototype,
but not in the transparent prototype, could suggest the importance of dynamic
motions for generating fun visual behavior.
It should be noted that the current results are limited by the highly ex-
ploratory nature of the work. Further studies with high-ﬁdelity prototypes and
more participants will be conducted. Also, in cases where our focus was primar-
ily on the robot functionality, such as energy harvesting, we will also evaluate
interactions. Furthermore, we should investigate if improved services will result
from combining capabilities into a single prototype or allowing prototypes to
communicate with each other (for example we have built a transparent struc-
ture which can become larger or smaller).
Despite the limitations, we think the current results could provide some
insights for other applications. One example is autonomous vehicles, which
we believe are closely related to robots and might also affect people in their
everyday lives. Like robots, vehicles can also make use of thermal, kinetic,
and solar energy; e.g., vibration and moving parts such as wheels, as well as
passengers’ body heat and motions on the steering wheel and gas pedal, could be
used to power a phone which could be used in emergencies even if the car battery
has a problem. Breath sensors can also be used to detect alcohol levels or
carbon monoxide. Enjoyable motions of actuators (such as the windshield wipers)
could entertain children. Adaptive size-changing motions can allow vehicles to
expand for stability, or to contract for compactness when parking, driving on
narrow roads, or passing under overpasses. Self-fixing is also a probable next
step as autonomous self-diagnosis becomes more advanced.
Transparency has already been incorporated in a simplified form into a concept
car via LEDs, and “heat transparency” has been incorporated into some tanks;
possible benefits include privacy, especially in unsafe areas, as well as
improved situation awareness (blind spots can be seen). In emergencies, speech
can be used to wake a driver who has fallen asleep or to keep a sleepy driver
alert, or a robot could be sent from a station to find a person and check that
they are all right. Speech could also help drivers who have partial blindness,
by describing driving situations when care is required. Health assessment could
be vital to determine if a driver is unconscious, in danger without knowing it,
or otherwise unable to drive (e.g., in the case of stroke or epilepsy); in such
cases the vehicle can safely stop or drive to a hospital. If hospitals are far
away, finding help could also be useful, although the constraints will differ
from those for a flying robot (an algorithm for a car should drive on roads).
Thus, in general, new interactive capabilities could provide customer value in
various ways, including enhancing mobility and equality in the transportation
system (people who are capable of driving should not be prohibited from driving
only because they have a medical condition which might require occasional
assistance).
In conclusion, to investigate some new interactive capabilities for home
robots which could facilitate acceptance, we used a mid-ﬁdelity prototyping
approach which allowed us to acquire some basic insights while avoiding large
costs in time and eﬀort. Throughout the current work, we observed high com-
plexity both in the problem area as a whole, where many possible goals presented
themselves, and in each sub-area, where for each different prototype various
assumptions had to be made. Using a mid-fidelity prototyping approach allowed us
to quickly redraw some of our expectations and assumptions, focus on speciﬁc
questions we had, and acquire a general lay of the land from various
perspectives. In this sense we feel that our results support a recent prescription for user
experience design in (social) HRI, which advocated the importance of iterative
design, properly deﬁned goals (due to the non-triviality of designing positive
experiences), and usage of a variety of evaluation methods to avoid bias in the
resulting knowledge [Alenljung et al., to appear]. We also feel that this type
of prototyping is important for addressing user experience issues early on and
in an integrated manner - rather than later on, when in many cases it might
be too late, or in an isolated fashion, which might be less informative. Thus,
we believe that the results of the current study, in addition to providing some
information for designers of home robots, also generally suggest the usefulness
of a mid-ﬁdelity prototyping approach for social HRI.
12
Acknowledgments
We thank the volunteers who participated in the experiment, and everyone else
who helped. This work used the Halmstad Intelligent Home.
References
[1] Beatrice Alenljung, Jessica Lindblom, Rebecca Andreasson, and Tom
Ziemke. User experience in social human-robot interaction. (To appear in:
International Journal of Ambient Computing and Intelligence (IGI).)
[2] Victor Hernandez Bennetts, Erik Schaﬀernicht, Victor Pomareda, Achim
J. Lilienthal, Santiago Marco, and Marco Trincavelli. 2014. Combining Non
Selective Gas Sensors on a Mobile Robot for Identiﬁcation and Mapping of
Multiple Chemical Compounds. Sensors, 14, 9, 17331 - 17352.
[3] Benjamin J. Blaiszik, Sharlotte L. B. Kramer, Solar C. Olugebefola, Jef-
frey S. Moore, Nancy R. Sottos, and Scott R. White. 2010. Self-Healing
Polymers and Composites. Annual Review of Materials Research, 40, 1,
179-211. DOI: 10.1146/annurev-matsci-070909-104532.
[4] Luzius Brodbeck and Fumiya Iida. 2012. Enhanced Robotic Body Extension
with Modular Units. In Proc. IROS, 1428-1433. DOI:
10.1109/IROS.2012.6385516
[5] Martin Cooney, Francesco Zanlungo, Shuichi Nishio, and Hiroshi Ishiguro.
2012. Designing a Flying Humanoid Robot (FHR): Eﬀects of Flight on
Interactive Communication. In: Proceedings of the 21st IEEE International
Symposium on Robot and Human Interactive Communication (RO-MAN),
364-371. DOI:10.1109/ROMAN.2012.6343780
[6] Martin Cooney and Stefan M. Karlsson. 2015. Impressions of Size-Changing
in a Companion Robot. In: Proceedings of the 2nd International Conference
on Physiological Computing Systems (PhyCS).
[7] Martin Cooney and Anita Sant’Anna. Avoiding Playfulness Gone Wrong:
Multi-objective Planar Reaching Motion Generation for a Social Robot
Arm in a Home. (Submitted: IJSR)
[8] Kerstin Dautenhahn. 2004. Robots We Like to Live With? A Developmen-
tal Perspective on a Personalized, Life-Long Robot Companion. In Proc.
RO-MAN, 17-22. DOI: 10.1109/ROMAN.2004.1374720
[9] Anca D. Dragan, Rachel Holladay, and Siddhartha S. Srinivasa. 2014. An
analysis of deceptive robot motion. In: Proceedings of Robotics: Science
and Systems (RSS).
[10] Mauro Dragone, Joe Saunders, and Kerstin Dautenhahn. 2015. On the
Integration of Adaptive and Interactive Robotic Smart Spaces. J. Behav.
Robot. 6, 165-179. DOI: 10.1515/pjbr-2015-0009
[11] Cally S. Edgren, Robert G. Radwin, and Curtis B. Irwin. 2004. Grip Force
Vectors for Varying Handle Diameters and Hand Sizes. Human Factors, 46,
2, 244-251.
[12] Daniel Engelberg and Ahmed Seffah. 2002. A Framework for Rapid
Mid-Fidelity Prototyping of Web Sites. In: J. Hammond, T. Gross, J. Wesson
(Eds), Usability: Gaining a Competitive Edge. IFIP World Computer
Congress.
[13] Anindya Ghose and Panagiotis G. Ipeirotis. 2011. Estimating the Helpful-
ness and Economic Impact of Product Reviews: Mining Text and Reviewer
Characteristics. IEEE Transactions on Knowledge and Data Engineering -
TKDE , 23, 10, 1498-1512.
[14] Kevin Gold and Brian Scassellati. 2009. Using probabilistic reasoning over
time to self-recognize. Robotics and Autonomous Systems, 57, 4, 384-392.
DOI: 10.1016/j.robot.2008.07.006
[15] Peter E. Hart, Nils J. Nilsson, and Bertram Raphael. (1968). A For-
mal Basis for the Heuristic Determination of Minimum Cost Paths. IEEE
Transactions on Systems Science and Cybernetics SSC4 4 (2): 100-107.
DOI:10.1109/TSSC.1968.300136.
[16] Tin Kam Ho. 1995. Random Decision Forests. Proceedings of the 3rd
International Conference on Document Analysis and Recognition, 278-282.
[17] Rachel M. Holladay, Anca D. Dragan, and Siddhartha S. Srinivasa.
2014. Legible Robot Pointing. In: Proceedings of International Sympo-
sium on Human and Robot Communication (Ro-Man). DOI: 10.1109/RO-
MAN.2014.6926256
[18] Wolfgang Hotze. 2016. Robotic First Aid: Using a mobile robot to localise
and visualise points of interest for ﬁrst aid. Master’s Thesis, Halmstad
University.
[19] Jeremy Heyne. 2015. Assistance-seeking strategy for a flying robot during a
healthcare emergency response, Internship Report, Halmstad University.
[20] Gunnar Johansson. 1973. Visual perception of biological motion and
a model for its analysis. Perception & Psychophysics 14, 2, 201-211.
doi:10.3758/BF03212378.
[21] Seiji Kataoka and Susumu Morito. 1988. An algorithm for single constraint
maximum collection problem. Journal of the Operations Research Society
of Japan, 31, 515-530.
[22] Alissa Katz. 2015. News: Drones: The (Possible) Future of Medicine.
Emergency Medicine News 37, 9, 1, 30-30. DOI:
10.1097/01.EEM.0000471518.70764.7d
[23] Ian Kelly, Owen Holland, and Chris Melhuish. 2000. SlugBot: a robotic
predator in the natural world. In: The 5th International Symposium on
Artificial Life and Robotics (AROB 5th ’00) for Human Welfare and
Artificial Liferobotics, 470-475.
[24] Eunju Kim, Sumi Helal, and Diane Cook. 2010. Human Activity Recog-
nition and Pattern Discovery. IEEE Pervasive Computing 9, 1, 48. DOI:
10.1109/MPRV.2010.7
[25] Joseph S. Lappin, Duje Tadin, Jeﬀrey B. Nyquist, and Anne L. Corn. 2009.
Spatial and temporal limits of motion perception across variations in speed,
eccentricity, and low vision. Journal of Vision, 9, 30. DOI: 10.1167/9.1.30
[26] Seung-Hyun Lee, Yasuhiro Yamakawa, Mike W. Peng, and Jay B. Barney.
2010. How do bankruptcy laws aﬀect entrepreneurship development around
the world? Journal of Business Venturing.
[27] Achim J. Lilienthal and Tom Duckett. 2003. Experimental Analysis of
Smelling Braitenberg Vehicles. In: Proceedings of the IEEE International
Conference on Advanced Robotics (ICAR), 375-380.
[28] Achim J. Lilienthal, Amy Loutﬁ, and Tom Duckett. 2006. Airborne Chem-
ical Sensing with Mobile Robots. Sensors, 6, 1616-1678.
[29] Jens Lundström, Wagner Ourique De Morais, and Martin Cooney. 2015. A
Holistic Smart Home Demonstrator for Anomaly Detection and Response.
In: 2015 IEEE International Conference on Pervasive Computing and Com-
munication Workshops (PerCom Workshops), 330 - 335.
[30] Yinrong Ma. 2016. TROLL: a regenerating robot. Master’s Thesis, Halm-
stad University.
[31] Gary Martinic. 2014. Glimpses of future battleﬁeld medicine-the prolifera-
tion of robotic surgeons and unmanned vehicles and technologies. Journal
of Military and Veterans’ Health, 22, 3.
[32] Michael McCurdy, Christopher Connors, Guy Pyrzak, Bob Kanefsky, and
Alonso Vera. Breaking the Fidelity Barrier: An Examination of our Cur-
rent Characterization of Prototypes and an Example of a Mixed-Fidelity
Success. In Proceedings CHI 2006, 1233-1242.
[33] Stephen A. Morin, Yanina Shevchenko, Joshua Lessing, Sen Wai Kwok,
Robert F. Shepherd, Adam A. Stokes, and George M. Whitesides. 2014.
Using Click-e-Bricks to Make 3D Elastomeric Structures. Advanced Materials
26, 34, 5991-5999. DOI: 10.1002/adma.201401642
[34] Marek Novák, František Jakab, and Luis Lain. 2013. Anomaly Detection in
User Daily Patterns in Smart-Home Environment. Cyber Journals: Multi-
disciplinary Journals in Science and Technology, Journal of Selected Areas
in Health Informatics (JSHI), June Edition, 3, 6.
[35] Sepideh Pashami, Achim J. Lilienthal, Erik Schaﬀernicht, and Marco
Trincavelli. 2014. rTREFEX: Reweighting Norms for Detecting Changes
in the Response of Metal Oxide Gas Sensors. Sensor Letters. DOI:
10.1166/sl.2014.3170
[36] Irene Rae, Leila Takayama, and Bilge Mutlu. 2013. The influence
of height in robot-mediated communication. In Proc. HRI, 1-8. DOI:
10.1109/HRI.2013.6483495
[37] Shai Revzen, Mohit Bhoite, Antonio Macasieb, and Mark Yim. 2011. Struc-
ture synthesis on-the-ﬂy in a modular robot. In Proc. IROS, 4797-4802.
DOI: 10.1109/IROS.2011.6094575
[38] Laurel D. Riek. 2012. Wizard of Oz Studies in HRI: A Systematic Review
and New Reporting Guidelines. Journal of Human-Robot Interaction.
[39] Jim Rudd, Ken Stern, and Scott Isensee. 1996. Low vs. high-ﬁdelity proto-
typing debate. Interactions 3.1, 76-85.
[40] Joao Machado Santos, David Portugal, and Rui P. Rocha. 2013. An Evalu-
ation of 2D SLAM Techniques Available in Robot Operating System. 11th
IEEE Int. Symp. on Safety, Security, and Rescue Robotics (SSRR 2013).
DOI: 10.1109/SSRR.2013.6719348
[41] Jacqueline D. Spears and Dean Zollman. 1985. Fascination of Physics II.
Ch. 6 Interaction and Force. Benjamin/Cummings Publishing Co.
[42] Takahiro Suzuki, Fumihiro Bessho, Tatsuya Harada, and Yasuo Ku-
niyoshi. 2011. Visual Anomaly Detection under Temporal and Spatial
Nonuniformity for News Finding Robot. In Proc. IROS, 4797-4802. DOI:
10.1109/IROS.2011.6094575
[43] Susumu Tachi. 2003. Telexistence and Retro-reﬂective Projection Technol-
ogy (RPT). Proceedings of the 5th Virtual Reality International Conference
(VRIC2003) 69, 1-9.
[44] Tomohiro Tachi and Koryo Miura. 2012. Rigid-Foldable Cylinders and
Cells. Journal of the International Association for Shell and Spatial Struc-
tures (IASS), 53, 4, 217-226.
[45] Jason Valentine, Jensen Li, Thomas Zentgraf, Guy Bartal, and Xiang
Zhang. 2009. An optical cloak made of dielectrics. Nature Materials 8, 568
- 571. DOI:10.1038/nmat2461
[46] Peter Veto, Serge Thill, and Paul Hemeren. 2013. Incidental and Non-
Incidental Processing of Biological Motion: Orientation, Attention and Life
Detection. In: Cooperative Minds: Social Interaction and Group Dynamics:
Proceedings of the 35th Annual Meeting of the Cognitive Science Society,
1528-1533.
[47] Paul Viola and Michael Jones. 2001. Rapid Object Detection using a
Boosted Cascade of Simple Features, in IEEE Computer Society Confer-
ence on Computer Vision and Pattern Recognition, CVPR.
[48] Michael L. Walters, Kheng Lee Koay, Dag Sverre Syrdal, Kerstin Dautenhahn,
and René te Boekhorst. 2009. Preferences and Perceptions of Robot
Appearance and Embodiment in Human-Robot Interaction Trials. In Proc.
New Frontiers in Human-Robot Interaction: Symposium at AISB09, 136-
143.
[49] Tianyi Zhang and Yuwei Zhao. 2016. Recognition for Robot First Aid:
Recognizing a Person’s Health State after a Fall in a Smart Environment
with a Robot. Master’s Thesis, Halmstad University.
[50] Aleksandar Zivanovic, James Auger, and Jimmy Loizeau. 2009. Carnivo-
rous domestic entertainment robots. In: Proceedings of the 3rd Interna-
tional Conference on Tangible and Embedded Interaction (TEI), 127-130.
ISBN 9781605584935. DOI: 10.1145/1517664.1517696
[51] Matt Zucker, Nathan Ratliff, Anca D. Dragan, Mihail Pivtoraiko, Matthew
Klingensmith, Christopher M. Dellin, J. Andrew Bagnell, and Siddhartha S.
Srinivasa. 2013. CHOMP: Covariant Hamiltonian Optimization for Motion
Planning. Int J Rob Res 32, 9-10, 1164-1193. DOI:
10.1177/0278364913488805
[52] IES Lighting Handbook, Application Volume, Illuminating Engineering So-
ciety, New York, 1993.
[53] Accessed 2016/6/7 at http://www.planar.com/innovations/transparent-oled/