Live Coding Interviews

I’m not going to come out and say that live coding interviews are objectively bad in every way, but they have insurmountable shortcomings that create gaps in their ability to accurately assess potential team members.

At best, they feel like maybe they serve as a tolerable de-risking filter for a company that needs an assembly-line hiring process. Their weaknesses, however, will directly and indirectly filter out plenty of people who could potentially be great team members.

They likely do a great job filtering out people who are incapable of programming, but a genuine and deeply technical conversation with someone about their past experiences could provide that same level of insight. In my experience with live coding interviews, the companies and teams seem oblivious to the gaps created by using live coding interviews as one of the key signals.

Before getting into the details, let’s get personal because I’m certainly carrying some level of bias, and that’s worth being up front about. Ultimately, though, this isn’t specifically about me. It’s about shedding some light on the problematic aspects of live coding interviews, and anywhere I share personal anecdotes, it’s meant to humanize the impacts of these processes rather than serve as a pseudo-scientific data point.

After being laid off recently, I’ve gone through a handful of live coding interviews, and I’ve had a handful in the not-too-distant past as well. Despite a computer science degree and two decades of software developer experience, I’ve never even come close to a reasonably passable outcome in a live coding interview.

On the other hand, with take-home coding exercises, I’ve successfully made it into the top ten applicants for roles with approximately 1,000 applicants multiple times. While I didn’t eventually get those jobs, hopefully it provides some evidence that my personal failure in live coding interviews doesn’t stem purely from incompetence.

With the initial live coding interview failures, I brushed it off, but as they began to accumulate, the imposter syndrome set in. It set in hard, and it still lingers. But I also started to see patterns and talk to other developers about their experiences.

At this point, I’ve stopped actively looking for a full-time role simply because I believe it’s a waste of both a company’s time and my time if they require a live-coding interview—and they seemingly all require them. I’d still love to find a great role where I’d be a great fit, but I decided not to hold my breath since independent consulting is going well.

Based on my experiences, though, it feels like it’s worth discussing the gaps exposed by being over-reliant on assessing skills through live coding interviews. I still maintain some hope that a little more awareness or perspective may help some teams improve how they approach hiring—with benefits for both the companies and applicants.

Enough context, though. Let’s discuss the problems.

The underlying problem with live coding interviews stems from trying to create an objective evaluation of skills that aren’t easily quantifiable—and they attempt to do this in a wholly unnatural context with a set of loaded assumptions.

Many companies have tweaked their live coding interviews. They claim not to penalize applicants for looking things up. They encourage using whichever language you’re most comfortable with. They say that it’s not a binary pass/fail on whether you complete the exercise. They admit that they understand it’s an unnatural context.

Even with all of these adjustments or acknowledgements, the idea is built on such a shaky foundation that they don’t count for much.

Thinking about it a bit, it’s arguably worse. Saying “we recognize that this is a contrived process and less-than-ideal way to evaluate people, but we’re still going to use it” isn’t a great look. It’s pretending to be self-aware without actually being self-aware.

So when companies say that, I only hear “yeah this sucks, but we don’t care that much.”

Let’s start by examining the foundation because no matter how many concessions or accommodations a company makes in order to signal that they understand it’s an imperfect system, it doesn’t change the fact that they’re still using it.

Built on Contrived and Unnatural Context

If you were assembling a track team, you wouldn’t judge marathoners by their 100-meter dash. And if you did, it wouldn’t help to tell the marathoner that you know it’s not a perfect system but you don’t have time to wait for them to run a full marathon. More likely, you’d look at their history of marathon times.

In over two decades of professional software development, I’ve never live-coded while a stranger (let alone multiple strangers) quietly observes and judges outside of the context of an interview. Pair programming happens with a teammate where trust has been earned and reciprocated over time. Live coding in front of an evaluator is something else entirely.

We’ll dive into each of these points later. For now, all that matters is the framing and context that these shortcomings impose on the process.

The dynamic of being watched by a person with whom no level of trust has been established cannot provide a reliable proxy for how someone works. Add in the lopsided power dynamic of interviewer and interviewee, and it’s even worse.

It’s as if a scientist knows that something happens in nature and then over-confidently believes any experiments in a controlled lab scenario will replicate the scenario. It might work on occasion by sheer luck, but it won’t be reliable.

Then there’s the time factor. I’ve done plenty of work with a site offline or broken where money is on the line for every second it’s down. I’m never more focused and deliberate than I am in those circumstances where every second counts.

So in an interview, it isn’t purely about time-induced pressure. The artificial and vague “get as far as you can in this arbitrary period of time” immediately frames a problem in an awkward context that can’t be ignored. Many of the downstream factors stem from this problem.

The last fundamental flaw stems from the fact that most of the coding exercises are inherently divorced from reality. While challenges like Fizz Buzz have become less common, live coding interviews still tend to use made-up contexts that have nothing to do with the domain of the underlying business. In that context, focusing on such a narrowly-defined challenge ignores any potential for larger insights from experienced developers.

With the entire interview framed in this unnatural context, everything changes. Any belief that a live coding interview is a consistently reliable way to make an objective assessment represents willful ignorance at best. Almost all of the other flaws stem directly from being built on this shaky foundation.

Ignores the Observer Effect

If you don’t personally experience anxiety in these contexts, it may be difficult to understand or appreciate the impact they have on cognitive function. It can be awkward and uncomfortable no matter what steps the interviewer takes. Being completely unaware of this likely means that the observer will incorrectly assess any unusual behavior as some level of incompetence or inability by the applicant.

The only way I can put the experience into words is that the moment screen share starts, it feels as if half of my brain shuts off. Instead of just doing its job, it decides that half of it needs to be dedicated to navigating the contrived nature of the situation.

In the context of scientific experiments, this would be called “the observer effect” which can be described as “the disturbance of an observed system by the act of observation.” While not precisely the same as a physics experiment, it has the same implications. The very nature of the observer creates a context that changes the experiment.

In my experience, that’s precisely how it feels. I find myself uncontrollably second-guessing every click or keystroke, looking up APIs I know by heart under normal circumstances, and generally just tripping all over myself.

Or, I try to think out loud, but then I go into deep thought and forget to try and explain myself. So I stumble backwards to explain what I did and then lose my train of thought. So I would fully expect an outside observer to think I’m a hot mess.

Needless to say, my observed behavior in live coding interviews is not an accurate representation of my work or processes.

Of course, that’s just my experience and reaction. It likely affects countless other people in an endless variety of ways that can easily be misinterpreted. Talking to friends in the industry, this experience isn’t universal, but it’s also not uncommon.

One could argue that if a scenario like this creates anxiety, that’s a good signal that the individual can’t work efficiently in a high-pressure environment, but that would be a gross misunderstanding of the causes of the anxiety in this specific context.

For me, it’s not simply a factor of being a higher-pressure situation, as I generally find pressure helps me. When a site is offline, or a feature is broken in production, laser focus comes very easily to me, likely because that’s just part of the job.

The discomfort and distraction stem entirely from having a stranger judgmentally looking over my shoulder, and I’m not using “judgmentally” in a derogatory sense here. In the context of a live coding interview, it’s literally their job to judge. I can pair program with colleagues all day long, and I deeply enjoy it, but there’s trust involved in that scenario. It’s truly collaborative rather than partially adversarial.

You might believe your team strives to ensure that the process feels collaborative, but it’s simply not possible to remove the fact that one person is evaluating another. The power dynamic of observer and participant can never be equalized. Without earned trust or shared goals between two people, true collaboration can’t happen.

Even when an interviewer is helpful and provides hints, cues, or nudges, that moves the progress along, but it also immediately feels like the type of mistake they’re going to penalize you for. And when it’s something you’d know in normal circumstances, it’s easy to beat yourself up. And that doesn’t make the process any easier.

For companies that do believe it serves as a good signal to filter out low performers or people who might be overly sensitive to high-pressure situations, I would argue that in many cases, any outwardly-visible anxiety represents a false negative and filters out people who don’t experience anxiety in their actual job responsibilities.

Any anxiety created by a situational context that doesn’t exist in the day-to-day responsibilities of a job won’t be representative of what to expect of their performance in the actual job, and any assessment of that anxiety will be based on a deeply-flawed assumption.

We’re all wired a little differently, and someone who might underperform in the contrived context of a live coding interview could be underperforming in that context precisely because of their strengths in other areas. Humans tend to gravitate towards others with shared behaviors and similarities. So without an awareness of these problems, teams end up hiring more people who think and behave precisely the same ways.

The end result is homogeneity. And if a company desires innovation and fresh perspectives, unknowingly selecting for and steering towards more of the same won’t get there.

So now that we’ve explored how the framing and foundation factor in, we can explore some of the simpler problems.

Prioritizes a Single Workflow

Another unavoidable facet of live coding interviews stems from effectively requiring someone to just start typing. For plenty of people, that’s not a problem, but for people who tend to be more tactile, visual, or kinesthetic, just banging on a keyboard isn’t a natural way to begin problem solving.

In those contexts, it can be more natural and efficient to start by sketching or writing on paper away from a screen, but instead they’re herded into an approach that ties one hand behind their back and further complicates an already unnatural scenario right out of the gate. So if the goal of live coding is to evaluate someone’s abilities, all it does is ensure that you won’t see those abilities for many applicants.

Sometimes, when starting from scratch on a new problem, standing up and pacing in circles around my office while thinking out loud is the best way to absorb information and organize options before choosing one and moving forward. Can you imagine how that would go over in a live coding interview?

Technically, someone could begin with a non-typing-centric approach, but the nature of screen-sharing interviews discourages this kind of approach because drawing and sketching silently for 10 minutes and then maybe holding up a bunch of messy sketches for someone to see isn’t really practical.

So for those that are more visual or tactile, this creates a really rough starting point because you have to map the challenge in your head without your usual tools. Add that to any existing anxiety, and it becomes incredibly difficult to think through a problem.

This represents one of those flawed assumptions that everyone approaches problem solving with the same tactics. Anyone who might have a less rigid or structured approach ends up being judged by their ability to conform to a pre-determined approach. And it’s unlikely the observer recognizes that. It’s more likely that the observer only sees someone who is nervous and awkward.

Obfuscates with Irrelevant Domain Knowledge

Most companies seem to have moved on from FizzBuzz-like challenges, but they haven’t gone far because they still choose arbitrary and unrelated topics like creating a script to handle scores for bowling or something equally irrelevant.

Some of these challenges may be at least somewhat realistic (as opposed to completely abstract), but in most cases, they’re the exact type of problem that would likely be best handled by ChatGPT. So at that point, do they want to see you muddle through it, or would they rather see that you know to have ChatGPT run through the initial pass and then refactor?

The question is rhetorical because while one company might appreciate such a pragmatic approach, the next would penalize you and question your ability to actually understand and translate the domain model. Even if you’re capable of both approaches, you essentially have to flip a coin to determine your approach and hope that the interviewer appreciates your methodology.

If a company is evaluating engineers with questions that can be easily answered by AI in seconds, what are they really evaluating for? Perhaps they’d be better off hiring a chat bot.

Doesn’t Allow for Adaptability

Most companies would say that adaptability is a powerful trait to look for during an interview to know people can thrive under constantly-changing circumstances. Every team is different, and people are inherently adaptable. A team member who might work one way on one team could be very capable of working a different way for another team. In a live-coding interview without more context, there’s no opportunity to recognize that kind of adaptability.

One team may be looking for an employee to optimize for speed and short-term results, another might prefer someone who takes the time to create a good long-term and healthy solution. When you have an hour and an arbitrary problem, you can only guess which side of you they’d rather see.

And at any job, being able to know which of those approaches makes the most sense in a given context is an important skill in and of itself.

Of course, that’s assuming that seeing someone write code to an irrelevant and contrived domain in an unnatural context for an hour is a reliable proxy for their ability to contribute to a team to begin with.

In the day-to-day duties of a developer, there’s always a balance between up-front speed and long-term speed. Time-boxed coding challenges, however, don’t really allow for any kind of meaningful consideration of that balance—let alone the observation of that consideration.

While I can’t speak for all developers, there are times where I can and will create a script in a short time with no regard for long-term usability, but if something will obviously be around for a week or a month or longer, I’ll easily spend an hour up front ensuring that it won’t take me an extra hour each and every time I have to interact with it in the future.

Whether that’s researching available options, writing some documentation, or sketching out ideas, I’ve accumulated enough experience to anticipate both the times it will be worth it to spend some extra time up front and the times when it wouldn’t be worth it.

Prioritizes Speed Over All Else

Even if you tell someone that completing an exercise isn’t a pre-requisite, presenting a more-or-less finite problem to be worked through in a set amount of time strongly favors a preference towards speed rather than stability.

I’ll readily admit that I’ll never be the fastest developer out of the gate. In twenty-plus years, I’ve been bitten by quick-and-dirty more frequently than I’ve been helped by moving quickly. I can, and I will in some contexts, but I’m much more interested in building systems that my team and I won’t be cursing for the next five years.

In day-to-day work, we frequently face scenarios where a throwaway script is the right solution. Quickly creating something that gets the job done since it won’t be reused any time soon can be a useful skill. In my experience, however, it’s much more likely that code will linger and be reused for years to come.

Even one-off scripts accidentally evolve into zombie manual processes that stick around far beyond the initial plan. This is the kind of work that speed-oriented, time-boxed, live coding interviews try to evaluate. And while plenty of people can hack together scripts like this, the downstream costs of maintaining or working with the resulting code far outweigh any savings in initial speed.

What someone can accomplish in an hour vs. how quickly their code can be understood or improved a year or three from now are different things. I’ve had companies tell me about how they wish a large refactoring was the one project that haunted them the most, and then in the rejection email, they said they would have liked to see a more “pragmatic” approach.

Of course, without some level of pragmatism, plenty of companies don’t even survive to the point they can even begin thinking about refactoring the code, but pragmatism is a spectrum. So let’s just agree that either extreme of pragmatism is the wrong amount, and the more important skill revolves around knowing which way to lean from the middle of the pragmatism spectrum.

Speed and efficiency are undoubtedly valuable, but without any larger context, an interview process heavily weighted towards evaluating speed will naturally lead to developers who may be fast up front but rack up technical debt just as quickly.

For someone who might slow down to think and plan, there’s not really time for that in an hour. With projects of any significant consequence or timeline, someone’s ability to dig into existing technical debt and create a plan that accounts for that baggage and still provides a clear path forward is just one example of a skill that these kinds of interviews can’t possibly evaluate.

Myopic Focus Overlooks Big Picture

By implicitly focusing on low-level coding speed, these interviews fail to give serious consideration to the broader skill sets that a developer needs. As someone interviewing for senior and lead roles, I have yet to have anyone ask me about mentoring, see how I provide feedback in code reviews, or find out how I research and understand if a given problem even needs a code-based solution.

We’ve all heard the cliché, “if all you have is a hammer, everything looks like a nail.” The same is true of developers, especially when it comes to web applications. Sure, it’s possible that someone is incredibly skilled with the one language they use in a live coding exercise, but how are they with the other elements of the web stack? Or, to put it another way, what if someone is slightly less good at that live coding language precisely because they’ve spent much of their career also being good at front-end development?

If a company has been tuning their hiring process for deeply specialized developers, the chances are equally high that fewer developers on the team have a holistic view of the full web stack. For example, whenever I’ve questioned these companies about their commitments to accessibility, their answers are predictably weak, and I have yet to hear anyone say much more than “we do what we need to do to avoid lawsuits.”

Whether someone can create a script in their preferred language won’t tell you whether that individual can recognize that twenty lines of back-end code could actually be handled with an approach that only requires twenty characters of CSS. Or, for that matter, whether an entire module of code could be removed if the interface used a better set of labels around an interaction.

It may be that companies need some kind of low-level filter so that they can feel good about the consistency of the process, but in my experience, such a narrow focus in a coding challenge focuses more on trying to steer the hiring process towards something quantifiable rather than trying to evaluate someone’s holistic understanding of how everything fits together.

More often than not, though, it’s felt like there’s a strong correlation between companies with narrowly-focused live coding interviews and companies with super-siloed and poorly integrated systems and experiences. There’s a place for deep specialization, and it is highly valuable, but interviews for senior or lead engineers can’t yield ideal results by focusing on a single skill.

Tends Towards Confirmation Bias

Finally, an inherent problem of any interview process is that the people performing the interview have previously gone through that same process and thus believe that since the process selected them, it must be inherently accurate.

It’s human nature. We all fall into this trap of believing that the way we’ve traditionally done things is the way to do things until it leads to observable bad results that we can clearly connect back to the faulty assumption.

“I participated in this process. I made it through this process. I believe that I’m good at what I do. Therefore this process is good.”

Similarly, while filtering applicants out is the purpose, the more people that are filtered out the more the process appears to be working since the whole point of a filter is to filter. If a filter doesn’t remove anything, then it’s likely not a good filter.

Given that the nature of filters is to remove undesirables, it becomes more likely that the process will trend towards exclusivity rather than inclusivity.

That provides very little motivation for a team to question its hiring processes because any change that reduces the number of applicants deemed unworthy can be perceived as being too loose with its criteria. Moreover, there’s really no way to know if any of the people filtered out were done so incorrectly because they’re not around to evaluate.

Tuning a filter towards inclusivity requires a significant leap of faith up front in order to adjust the filters in a way that enables different types of applicants to go further in the process. That loosening of filters can initially make teams feel that they’ve lowered their standards rather than broadened their recognition capabilities.

That inherently requires a highly self-aware team to recognize and admit flaws in their approach, and take a leap of faith. And, quite frankly, that’s not something one generally sees of teams at larger scales where reducing risk is generally given priority over anything else. Instead of introspection or improvement, they focus on making the hiring process as binary, simple, and assembly-line-driven as they possibly can. And that de-risking nature of interview processes strongly encourages a bias for exclusion.

That also makes it much more natural to lament a pipeline problem or lack of available talented engineers rather than a failure to recognize abilities that don’t precisely match their own pre-conceived notions of what a software engineer should look like or how they should perform.

Unfortunately, the skill-sets that could help them see this more frequently belong to the people that the existing process “successfully” filters out and ensures they’re never in a position to provide the type of feedback a team needs in order to improve its hiring processes.

What are the alternatives?

I may personally be un-employable for other reasons, or I may not be as knowledgeable as I’d like to think I am. It’s impossible to know because most companies are more focused on avoiding litigation than providing useful feedback.

Setting those or any other possibilities aside, I feel like these weaknesses in live-coding interviews don’t leave much room for debate. At best, they obfuscate some applicants’ abilities. At worst, they completely sabotage the interview for a large subset of applicants.

Companies may deem these acceptable concessions relative to having a consistent and un-biased process. Or, the speed and efficiency may mean they feel any extra false negatives represent an acceptable cost in order to fully avoid letting any unqualified people potentially slip through occasionally.

I just can’t believe it’s an either/or proposition with live coding compared to any of the other desirable characteristics or outcomes of a company’s interview process. And to some degree, I can’t help but wonder if the refrain that companies can’t hire enough good people stems more from being bad at interviewing than from a lack of available talent. (More likely it’s a little of both.)

One could argue that live coding interviews are the worst types of interviews—except for all the other types. But “it’s not the absolute worst approach we could take” is a pretty low bar.

There isn’t a simple answer, and I’m not naïve enough to believe any answer I could come up with is the singular correct answer for every company, team, or role. The approach for entry-level roles, senior-level roles, management-oriented, or independent contributor roles will all need slightly different approaches, and complex questions rarely have simple solutions. Plus some companies truly should have different priorities. Nonetheless, the tech industry at large can do better, and the companies that truly understand these faults in live coding interviews can find ways to address them.

If that means switching from live-coding to take-home, then so be it. Or maybe a little of both. Have someone do a take home, and then have a live call to discuss their solution. Or better yet, let applicants do a take-home, and pay them for their time.

Just in case you’re bristling at the idea of paying them at this point, don’t forget that even if you extend an offer, there’s still no guarantee they’ll accept. So think of it as another chance to be selling them on joining your team over some other team.

Or maybe use multi-pronged technical interviews that also look at both giving and receiving feedback on pull requests.

Or present them with a complex technical problem that requires architecture-level decisions across various parts of a technology stack, and have them talk through how they would approach it.

Or maybe just give people the benefit of the doubt on low-level coding exercises. If they can hold an intelligent, informed, and reasoned conversation while discussing their opinions and ideas in a respectful manner, then it’s a pretty safe bet they have the ability to write the code.

I expect there’s plenty of companies where all of this represents a feature rather than a bug, but I also believe there’s plenty of other companies that would genuinely like to do better but are simply too close to the problem to see it clearly and objectively. So maybe—just maybe—this kind of external perspective is what they need to recognize the flaws so they can make things better for other folks.