Why aren’t remote interviews as useful as face-to-face interactions?

Skype and Zoom aren’t quite as good as meeting in the physical world. But why? Pioneer and Emergent Ventures are looking to fund research on exactly how and why video conferencing interactions are different. Apply at and mention this tweet…Given the rise of remote work, the economic impact of this research could be Nobel-worthy.

That is from Daniel Gross, and here you can apply for Emergent Ventures.


--but . . . but--

We were TOLD, we were PROMISED, we were ASSURED by our Cognitive Elites, our Applied Scientists, and our Applied Technologists all that our internet technologies could and would supply a virtual reality in no way inferior to real reality: paperless offices, teleconferences galore . . .

Why were we LIED to so strenuously and for so long?

Why trust our Lying Elites to tell the truth about ANYTHING?

I don’t recall our elites pushing this. It was the rank and file aspie engineers and their fellow travelers.

I don't recall our elites strictly disabusing anyone of such views. Our benefactors and philanthropists seem to've been pleased to coast for just as long as possible on such generous misapprehensions.

I can't tell if you are a millennial who isn't old enough to know how offices worked pre-internet, or a curmudgeon who doesn't have any experience in a modern office.

Regardless of his age, he needs to take his meds.

"I don't recall our elites strictly disabusing anyone of such views."

I thought that we didn't want elites telling everyone what to think...

Our elites remain at least as conspicuous for what they never say as for what they always say. (Astrologers, we're told, can't see the stars, but astronomers, we've learned, can't see the earth.)

You are pretty xoung it seems, since this would have been correct 50 years ago - 'that our video technologies could and would supply a virtual reality in no way inferior to real reality.'


Of course, back then, it was our telecommunication monopoly elites making that promise, an age that Prof. Cowen is apparently longing to return to.

Well, young.

And this series of combined ads should be amusing, as it predates what most people would call the Internet - https://www.youtube.com/watch?v=TZb0avfQme8 Admittedly, no one is likely to ever send a fax from the beach these days.

Terry Gross is the best interviewer in the business. And she conducts most of her interviews remotely, by choice. Why? She says that she gets better interviews that way: it's "a setup that provides a kind of faceless intimacy not unlike that of confession or psychoanalysis, where the patient and practitioner face away from each other, under the theory that the obscurity will allow thoughts and fantasies to flow more freely".

Cowen is one of the best in the business as interviewer or interviewee, and I prefer to "listen" to his interviews by reading the transcript. Why? Cowen is so smart and knows so much about so many subjects that he doesn't pause, the quick questions and especially the quick responses affecting the listener's reaction to him (as opposed to what he or the other person says). Compare Cowen to another celebrity economist, whose pauses will put one to sleep. If you don't know who I am referring to, you don't listen to interviews with economists. [To be clear, I am not criticizing Cowen or suggesting he adopt the pause, just making an observation.]

I think he edits out the pauses.

Tyler is very good at preparing for interviews and teeing up good, unusual questions, but he is so hell-bent on getting through his question list in a rat-a-tat-tat fashion that he doesn't listen to answers and pursue follow-ups/ tangents that are interesting. Sometimes, the interviewee is clearly surprised by the abrupt shifts. He should dial down the structure a bit and be prepared for conversations to follow a more haphazard route.

Terry Gross is the best interviewer in the business

No, she's an insipid creature who appeals to a certain demographic. The people who run NPR don't care much about any other demographic.

It is a great question. But the Cisco telepresence system pretty much duplicates for me the experience of meeting in person. The screens are high resolution, people on them are life sized and organised round a table in the same way as a regular meeting. Problem is that only a few companies were willing to pay the installation cost, so using them externally was impossible.

In the future, high-resolution screens will be cheap enough and bandwidth will be cheap enough that these will come standard in homes.

There will be one room where an entire wall is a screen, and you will interact with people across the world as if they were sitting in an adjacent room.

But, those wall size ads will give you a headache (and the unsubscribe button will still be 1mm x 1mm).

The wall-screen will handle the visuals, but you will wear nth-generation Airpods in your ears to hear the audio. These will have pinpoint location awareness, keeping track of where each of your ears is at any given moment.

Each room will also have a network of tiny microphones on the walls and ceiling to perfectly capture the acoustics.

The Airpods will then do sound processing on the fly. As a result, when you and your interlocutor move around in each of your respective rooms, no matter where you are at any given moment, the subtle echoes and acoustics will always make it sound exactly like the two rooms are in fact part of the same unified space. That will be a key part of the adjacent-rooms illusion.

The screen will have tiny nanolenses mixed in among the pixels, that work like an insect's segmented eyes. So no matter how close or how far you stand from the wall-screen, the image will always be true, without any off-kilter camera-angle issues.

Ray Bradbury anticipated this 70 years ago in Fahrenheit 451. That wasn't a piece of utopian fiction.

The Jetsons also had videophones.

Same with E.M. Forster's astoundingly prescient short story "The Machine Stops". Wall-size screens, and a bunch of people wondering why would they ever want to leave their room.

One thing to look for is interpersonal synchronization, both in overt behavior and in neural activity. There's a growing literature on this topic that goes back at least to William Condon in the 1970s, who took high speed film of people interacting. I've got a bunch of relevant posts on my blog: https://new-savanna.blogspot.com/search/label/synchrony

For software developer interviews, it's important to be able to view both the job candidate's screen and the face of the person. In some scenarios, someone does an online search to find the answer to a technical question (e.g. delays for tens of seconds then gives a perfect textbook response).

Why are you asking questions that are easily googleable? Are they not going to have access to the internet while on the job?

Looking up something on Google isn't the same as knowing what you are talking about. Short-answer responses provide an easy way of determining the candidate's experience and knowledge in an area. It also sets a standardized baseline for comparing interview candidates against each other.

Even a non-programmer can get the answer to 'what does an HTTP 200 response code mean?' but if the person has to use Google to get the answer then that person probably isn't a REST web service developer.

Likewise, someone can be in the remote room helping the person answer the questions so in that case too it's important to study the person's face.
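For reference, the status-code question in the comment above has a one-line answer. Here is a minimal Python sketch using the standard library's `http.HTTPStatus` enum; the helper name `describe_status` is hypothetical, introduced only for illustration, and is not something the commenter proposed.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return the numeric code plus its standard reason phrase."""
    # HTTPStatus raises ValueError for codes it does not define.
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


# 200 is the standard "request succeeded" response.
print(describe_status(200))  # prints "200 OK"
```

The lookup is trivial, which is the commenter's point: a candidate who needs tens of seconds to produce it probably has not worked with REST services day to day.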

Most useful information can be found on Google. Want to know the degradation products of PCE? Type it into Google! Want to know the fossils found in, say, the Silica Shale? Fire up Google.

The hard part is applying that information. There's a scene in the first Harry Potter book (changed in the movies) where Hermione knows 1) a certain plant can't stand light, and 2) how to make magical fire/light. She is incapable of putting these two facts together on her own, however. You see that in young professionals as well: they know the facts, they know the theories, but being able to apply them to messy real-world situations is simply beyond them.

You also sometimes need that information more quickly than Google can provide it. The classic example is bleeding out--you don't have time to Google it. A more common example, however, is a rapid change in context. In my line of work it's routine to be working on one type of task (say, overseeing the installation of grout in an abandoned well), then get a phone call about something completely different (say, river elevation and its effects on potential sample points). Simple things, really, but if you rely on Google for all your answers you can't efficiently switch from one to the other.

There really need to be more words for "know" in English. You can know a fact without understanding it or being able to apply it, and sometimes you can apply it without knowing it.

Phone interviews work fine if you want to focus on ideas, facts, useful knowledge etc. I'd rather not even see a snapshot of someone before I do the interview, just have a general idea of how old they are.

In-person might be better if the person is uncommunicative or a complete bubblehead and you need to pad out the article with colorful but irrelevant lifestyle details -- e.g. some personal stuff in her office or what he ordered for lunch. Burger, turkey wrap or kale smoothie? Also possibly getting to read (upside down) interesting stuff the interview subject has unwittingly left sitting out on the desk.

My old job required that I interact with attorneys. I had fun scribbling gibberish notes and watching them try to read the notes upside down.

In-person meetings also raise the issue of location -- who gets to be the (dominant) home team and who is the (submissive) visitor? Or do you negate this somewhat by meeting at a neutral site like a Starbucks.

Some media handlers advise clients to never let a writer into your home or office, always insist on a neutral site. You're sacrificing some dominance but preventing some inane little detail (The Art of the Deal on your bookshelf) from being taken out of context.

But surely you live in a McMansion where 90% of the rooms are unused.

Just have your media handlers pick one empty room to be your "office" and stage the setting with suitable props, books that some library threw out, and discovery-ready inane little out-of-context details.

Nice. Ever considered going into PR?

I think this is for the same reason Silicon Valley continues to thrive in spite of rising costs and why Amazon is locating an HQ2 in pretty high cost NoVa. Face-to-face contact remains very important for lots of interactions of much more business and economic importance than interviews. Maybe some people prefer remote access, but most do not. They like to be able to see and react to those signals one sees only face to face. And for really big deals, well, having the serious discussion over a meal really beats some remote access over a screen for a variety of reasons, all the way down to how one cannot shake hands over remote screens, much less drink a toast.

"Amazon is locating an HQ2 in pretty high cost NoVa. " is for regulatory capture. Amazon learned Microsoft's lesson in the 90s with the antitrust threats - that all went away when they set up on K street and started contributing.

Not that the rest of your comment is wrong - people like personal interaction.

Because you can't shake hands.

I have worked in virtual groups, distributed around the world, and in local groups, with everyone in the same building.

I'd say it is all motivation and conscientiousness.

In the distributed setting, I got some great work done, with passionate players overseas. In the local setting too .. but here's the rub: In the local setting if someone is slow or slacks off, a friendly visit can be a motivation (and if done properly, a positive reinforcement).

In the distributed setting, people just go missing.

So I'm not even sure it is a tools issue. If the tools can somehow motivate or create that must surely only be at the margin.

Either the player is bringing it, or not.

"motivate or create [passion] that must surely only be at the margin."

By the way, DG might have some unfortunate priors:

"Before delving into specific theories, I’d think about how to even quantify the difference. A biomarker for kinship between individuals."

If you're approaching us as a kinship issue, put down the keyboard and back away from the computer.

"Skype and Zoom aren’t quite as good as meeting in the physical world".......but they DO much better than a phone call. Meetings are irreplaceable, but the clunky old phone can be killed for good =)

I do engineering consulting; video calls save lots of money, which then becomes profit. The clients don't want to see me, they want to see the work results, so I activate the share screen feature.

You meet the client and subcontractors once or twice to dine or drink....I don't share the fetish of shaking hands. Then, it's all video calls. One of my clients is 5 hours of driving away. When we go there for a 2 hour meeting we invoice the sum of our 2 hours meeting + gas + car depreciation + another whole day of work which is lost driving. In total around $2500 for shaking hands. In another case I have a client in Paris and a subcontractor in Gothenburg (Sweden). A meeting in Paris means 2 flights, 2 hotel nights, meals, and about 4 days lost between me and the subcontractor, 5-6K?

But the greatest advantage is how little a video call disrupts your work week: you don't get up at 4:00 AM to get that damned early flight, no worries about the reimbursement of travel expenses, no stress of missing a flight/train. It's great to work the whole day without distractions and all you need to do is get a coffee 10 min before the call.

Once again, compare it to phone calls and the complaints are gone =)

I think I use Video less than I did 3 or 4 years ago. As you say, the client doesn't really want to see you. Voice has a place.

My rule is that it's very hard to disagree (sensibly) over email, it's easier on a phone call (with or without video), but it's vastly easier face to face. I think that explains a lot of this topic, noting that interviews might be a special case as you probably don't do so much disagreeing.

Yep. Screen sharing plus voice is great. Video feeds of faces staring at their screens are useless. I have many years of working as a part of distributed teams. I've usually met the other folks in person, but once that's done, travel for face-to-face meetings is a waste of time, energy, and money.

We are social mammals and we use many nonverbal forms of signaling — mostly unconsciously.

Check out the body of research on eye gaze and trust, for example. You can’t really meet someone’s eye over videoconferencing. This has a measurable impact (in games) on the ability to form bonds of trust.

Brandon nails it.

Why the Cisco Systems Telepresence product is so engaging. You can see the skin pores of your fellow participants.

Exactly, and very telling about the readership that it took this many comments to get to the obvious.

I know a thirty-something who was appointed to a Director level job in a large corporation. The final face-to-face interview was held in London: all the earlier interviews were held remotely, between London and the Ozzie outback. I imagine that the tricky bit was agreeing a good time for those remote interviews. Next time I see him I must ask whether his screen attracted lots of mozzies.

The question is worth asking/underrated. But I think most of the answer that emerges will be that body language and physical presence confer a wealth of tacit information. People vary in their sensitivity to it, but one's gait, microexpressions, and interactions with third parties can influence how trustworthy and observant a person appears.

Maybe all it takes is more time for people to adapt to and accept how crappy video conferencing is. They've managed to do it with the audio on mobile phones.

What evidence do you have that interviews are useful at all for your purposes? We've known for a long time that interviews are a terrible way to do personnel selection:


Google "replicated" this discovery several years ago:


Obviously, these results are for job interviews. But I'd argue that interviewing of the type you're doing is likely to have even more randomness. That's the conclusion we reached for investing in pre seed startups. We only talk to founders to extract information about the features of their product, business, etc. that we think are relevant. We make judgements on the founders based on features extracted from their bios or LinkedIn profiles.

I'd be very interested to know to what extent you've established the external validity of your interviewing practices.

I suspect it depends on the specific technology, but the ones I've used and listened to become unintelligible when two people are talking. This could be as simple as asking someone to repeat something. The flow of the conversation falls apart. Phone calls aren't like that, even with multiparty.

I can see some situations where it is better than nothing, and sharing visual information in some cases makes it better than in person. But casual conversation, the back and forth of in-person conversation, doesn't work very well.

I work for a Fortune 50 company in the US located in a city that is in flyover country. We have, on more than one occasion, been scammed by not requiring in-person interviews for candidates, oftentimes hired through recruiting agencies.

In general it ends up being the case that the candidate is totally unqualified for the role (most examples are IT/data science), so they're either having someone interview for them (if phone), or somehow being provided good live answers to technical questions if we are using a webcam.

The candidates are inevitably fired (no idea how in on it they are), but this still takes time and the candidate/contractor has collected at least 2-3 months of salary before we can fire them.

Wait, that's absurd. Is there actually a world full of George Costanzas, who will brazenly hold a desk and see how long it takes to get fired?

We expect people to look at least as good on screen as, say, the local newscasters.

But I was talking to a local TV news weatherman at a school event years ago and it struck me he was superbly groomed compared to the other dads (who tended to be affluent lawyers and doctors). Plus, when he is on TV, he is made-up and lit by professionals.

Naked from the waist down suits me fine.

It's the mullet of the video conference age - business on top; party on the bottom. Flash to the hacker scene from the movie Swordfish.

They ARE just as good as live. It's just we haven't learned to be comfortable with them yet. I run an entire successful business using WEBEX. I can tell everything I need to know about someone after an hour on Webex live teleconferencing. The world has experienced an insanely cool revolution but humans absorb the lessons slowly. Our 'hipster' urban centers are peaking now -- or have already peaked. The long predicted revolution of moving out into the countryside will begin shortly. Buy up that cheap rural land soon. I'm already out here -- and I work in one of the 'hippest, most fashion conscious boutique industries' in the world. It's super quiet out here. You can't hear your neighbor. You're surrounded by nature. It's absolutely incredible. And almost nobody is here yet. Buy as much land as you can. It's cheap!

Well, maybe this trend will cease when the new equilibrium is reached, but in my experience so far, a lot of remote workers and remote businesses who exploit the tech to live in the boonies end up having to either relocate to a city or travel a lot.

The workers change jobs or get promoted, or the company changes owners, and they are suddenly required to be at an office more.

And firms start to grow and find out the local labor pool is too small or they can't attract employees who like hog farms for neighbors or whatever.


May work for you, but not a shred of evidence the "hipster urban centers are peaking now," not yet anyway. And maybe it is fine for hiring and maybe you do not make deals with handshakes or drinks, but a lot of people do and will continue to do.

In a world of autistic or introverted misfits like the denizens of this blog, remote interviews would work but in a world of neurotypicals (aka real people), we need to be able to look someone in the eye, give a toothy grin, a firm shake, and a backslapping laugh to build human comradery. There are so few commenters here that know how to do that. Trump does. That's why you nerds all hate Trump.

Although the record shows that anybody signing a deal with Trump after he slaps their back has been a fool, with him stiffing over 10,000 subcontractors. Those folks were about as stupid as his backers like you, "Trump 2020." When are you going to figure it out that you have been taken for a long ride by a lying fraud?

A lot of people refer to a lack of non-verbal clues in video-conferencing as a detriment. Frankly, you can infer a lot more about a person's intelligence, competence, situational awareness, empathy (or lack thereof) by how well they operate videoconferencing software, how polished their audio is, how well crafted their presentation is, how aware of latency and possibility of background sounds they are, etc.
Personally I can extract way more clues from these aspects than from body language.

+1. As someone who has been the remote interviewee multiple times, I can say that it's often awkward and the flow of the conversation is choppy. I've never left one feeling as good as I do after in-person interviews.

In other words, they are a technical test for giving presentations in sub-optimal situations. Which is fine if you are interviewing for jobs that require those skills.

I use Skype to talk to my wife and kids while traveling for business. For someone who uses this service regularly, at least one answer is immediately obvious: Lag. Universal high-speed, uninterrupted internet access is still very much a dream waiting to be realized, and even short lags can destroy the flow of a conversation. The audio can freeze, the video can freeze, the whole thing can shut down, any of which make the entire experience extremely frustrating for everyone involved, and--more importantly--create a disconnect between the parties in the conversation. And no, these aren't limited to out-of-the-way backwaters, or to cheap set-ups. I work for one of the largest firms in my industry, one that routinely handles $100 million contracts, and I've seen these issues happen even in hub offices in major cities.

Simply put, you don't have these problems with in-person interviews.

>could be Nobel-worthy

Well, if you mean that it would state the blindingly obvious with far too many words, I'd say you are probably right.

You see, it turns out that tools can be good for one thing, and less good for other things. Your problem is that you want to label "all human interaction" as "one thing," so you are confused.

Skype is for sharing information, not for trying to replace the experience of meeting someone in person for the first time.

Send me the Nobel any time, my attic is huge.

A major problem has nothing to do with video; it's the audio. It's often very hard to hear, and when there are multiple speakers feeding into one conference room phone mic it's hopeless. Really, everyone on a call should have a personal lavalier mic.

A related issue: Some of the non-verbal cues used in communication are things like "When is it my turn to speak?" I've been in countless remote meetings where folks either wait for far too long to speak, or tumble over one another, because the medium doesn't provide the cues with which to judge who's going to speak when.

Judging by all of the competing ideas and theories from the smarter than average posters on this site, it looks like this is a great idea for further research.
