‘The cracks are more visible online’: why virtual
interaction is more complex than it looks
Stephen Mugford and Pamela Kinnear
Since the start of the COVID-19 pandemic with its lockdowns and work-from-home
adaptations, pressure has risen to interact online. Transitioning traditional face-to-
face (FtF) methods to online conferencing and consultation was already underway, of
course, and COVID-19 has accelerated that trend.
Online interaction (OL) that worked as effectively as FtF could offer many advantages
to geographically dispersed organisations. Travelling to FtF events consumes time
and money. It can be hard to coordinate busy diaries and, once gathered, pressure
exists to ‘do everything now, while we’re all here’.
In contrast, OL is cheap, easy to organise and simpler to coordinate, allowing shorter,
more frequent and more focused exchanges. Yet OL meetings bring some downsides,
often seeming clunky, artificial and dissatisfying. For example, committee meetings,
which can be somewhat boring FtF, may become dreadful if transplanted to an OL
setting without suitable adaptation.[1]
[1] Not all effects of OL are negative, however. We use an FtF process where people tell a story to two others, then
turn their backs to listen in silence while the others discuss what they heard, before changing places and
repeating. This has considerable impact on the storyteller and the ‘analysts’. It seems likely that using breakout
rooms of three and replacing back-turning with muting the camera and microphone for the teller affords
greater listening and learning than FtF.
Previously, we at Kinnford prided ourselves on employing successful methods in FtF
contexts to promote energy and engagement. Facing the disruption of the
pandemic, we’ve searched for OL alternatives that might be as effective.
Combining our experiences, a burgeoning commentary by others and our social
science training, we’ve identified three crucial, inter-connected factors that underpin
interaction and which (a) explain why OL can become clunky and (b) may guide
improvements to it.
First, ‘being is embodied’. Andy Clark cites a lovely passage from a sci-fi story where
aliens discover humans and cannot fathom how meat, lacking any central processing
unit, can function.
We are, as Clark says, ‘thinking meat, feeling meat’.
The implications are profound. Cognitive neuroscience shows cognition and emotion
are not separate; they are singular (call it C/E). C/E is a cascading process of
prediction, perception and interpretation of events. Imagine a kayaker shooting a
rapid: reading the water, balancing herself and the kayak and adjusting her course
moment to moment. That is how C/E unfolds.
Moreover, C/E is not abstract, ethereal consciousness. Rather, it is embodied,
embedded in contexts and extended: we think ‘beyond the brain’ with other body
parts and things (imagine the potter thinking with hands, clay and wheel) and in
interaction with others (a lively party is co-created with intermingled C/E). Contrast
the C/E of a hungry man, wearing suit and tie, walking briskly towards a café, hot and
uncomfortable in the noon sun, with that same man in a comfortable chair and casual
clothes, caressed by a gentle breeze, brandy in hand, watching the sun sink into the
ocean after a fine dinner.
Finally, C/E is helped or hindered by features that afford or discourage responses.
‘Affordances’ (‘triggers’ is a rough synonym) are numerous and typically we set them
up routinely, knowing, without thinking deeply, what C/E we want to afford (or
trigger). A table setting, for example, affords many encoded possibilities (but largely
excludes others). This table is not set to play Scrabble, nor for a family dinner with
toddlers. You won’t be served beans on toast.
In the OL environment, many important perceptual data are limited: we see faces,
perhaps, but not postures, hands or crossed legs. Side conversations and meaningful
glances are absent. In contrast, ‘chat’ is slow, verbally more formal and ‘flat’. (Think
how ‘nice’ means myriad things dependent on tone and nuance.) OL, we do not share
the same ambience of temperature and smell. We cannot touch. (While touching is
rightly constrained in a world concerned about paedophiles and harassers, it remains
true that touch is essential to human communication and wellbeing.) We are less
clear about what affords good interaction OL and what does not.
Lack of embodiment, then, is a key issue, to which we will return.
Second, interaction is a ‘team sport’ which we learn to play from birth. In the 1960s
and 70s, sociologist Harold Garfinkel summarised some of the key features:
interaction, he said, is ‘more or less artfully accomplished’.
Garfinkel showed—and a raft of social science elaborates this argument—that the
boundaries around interaction, the sense-making within it and the perception of a
comfortable flow result from everyone playing along. Like a volleyball team,
everyone shares the effort of keeping the ball in the air.
The team effort is ‘artfully accomplished’ when the ball stays aloft and no one needs
to think about who should bunt it, how high, etc. Garfinkel saw this as an interactive
achievement, not just people following clear-cut rules like automata.
This game varies between cultures (and a foreign game can be confusing) but people
can play along pretty well, most of the time.
How do we interact in OL Land? There’s no established culture to draw upon, no
Lonely Planet Guide available to navigate the unfamiliar. Relying on FtF assumptions
may leave us ‘lost in translation’.
Third, we operate with two brain systems. System 1 consists of automatic rules of
thumb or ‘heuristics’ (which may include stereotypes) while System 2 is conscious
thought and problem solving. Social interaction is managed mostly by System 1
(biologically low cost, rapid and reliable) freeing up resources for the occasional
System 2 action (costly and slow), a weighting that conserves our energy and
This links with the previous point: ordinarily, we deploy many heuristics spiced with a
little System 2 thinking to get along. But our familiar FtF heuristics often don’t work
OL. Instead, we have to use System 2, which is slow and hard work and may be
What conclusions might follow? We don’t have a detailed answer but we have a few
Starting with C/E as embodied, embedded etc.: in OL events ‘we are all in the same
meeting but not the same room’. Jane, perhaps, is in her office, using the desktop
with a good camera and microphone, door closed, lighting adjusted. She feels and
looks professional and engaged. Meanwhile, John is at home in the spare bedroom,
balancing his iPad on his knees. The blinds are up and he’s harshly backlit. He’s
keeping one ear on what the kids are doing and the builder’s noise from next door is
loud. He does not look or feel professional, while distraction affects his engagement.
Furthermore, depending on familiarity and expertise with the technology and
software, the ‘extension’ of C/E may differ for Jane and John.
Resources may differ: John doesn’t have an office or desktop at home, and perhaps
he’s not had the same training opportunities. Also, most people know how to ‘turn up’ to
an FtF meeting, what the dress codes might be, how to present oneself, etc. The same
isn’t true OL and the varied log-in locations may have major effects on participation
and group dynamics.
Absent a normative code, it’s not always clear what ‘place settings’ to prepare and
not everyone ‘turns up’ the same way. This is a cultural work-in-progress which may
eventually settle into shared knowledge that organises embodiment, embedding and
extension of C/E OL.
Turning second to interaction as a team sport akin to volleyball, the same limitation
seems relevant: lacking the 200,000 years of human social evolution that grounds FtF,
we’ve not yet worked out what OL does and does not afford and have no reliable
rules-of-thumb for cooperating and collaborating. In FtF we signal, often
without realising it, with gestures, gaze and so on, but we do not have an OL code to
replace these. A useful strategy is to keep communicating and reflecting on this:
every time we re-examine FtF and ‘make the implicit explicit’ we increase the
chance that we can find a different way to achieve outcomes we mutually seek.
Finally, the heuristics we use FtF may or may not translate to the OL context. How we
interpret silences when FtF, for example, may not be a good guide to silence OL, let
alone to the simulations of silence: the muted camera and microphone. It will take
time and effort to constitute a reliable set of OL heuristics. Meanwhile, we
consume a lot of System 2 resources, which may explain why people find attention
span OL is shorter and the overall impact more exhausting.
OL is not going to go away—we need to keep working at understanding and
performing it better, combining trial-and-error experience on the one hand with the
most useful concepts on the other. This short piece has hopefully been a small step
in that direction.