How to Debug Python Errors

Share this article

How to Debug Python Errors

This article was originally published on Sentry. Thank you for supporting the partners who make SitePoint possible.

One of Sentry’s more powerful features comes with languages like Python. In Python, PHP, and the JVM, we’re able to introspect more deeply into the runtime and give you additional data about each frame in your call stack. At a high level, this lets you know things like calling arguments to your functions, allowing you to more easily reproduce and understand an error. Let’s dive into what this looks like, and how it works under the hood.

We’ll start with what a Python error might typically look like in your terminal, or in a standard logging system:

TypeError: expected string or buffer
  File "sentry/stacktraces.py", line 309, in process_single_stacktrace
    processable_frame, processing_task)
  File "sentry/lang/native/plugin.py", line 196, in process_frame
    in_app = (in_app and not self.sym.is_internal_function(raw_frame.get('function')))
  File "sentry/lang/native/symbolizer.py", line 278, in is_internal_function
    return _internal_function_re.search(function) is not None

While this gives us an idea of the type and location of the error, it unfortunately doesn’t help us understand what is truly causing it. It’s possible that it’s passing an integer or a NoneType, but, realistically, it could be any number of things. Guessing at it will only get us so far, and we really need to know what function actually is.

An easy and often very accessible option to do this would be to add some logging. There’s a few different entry points where we could put logging, which makes it easier to implement. It also lets us ensure we specifically get the answer we want. For example, we want to figure out what the type of function is:

import logging
# ...
logging.debug("function is of type %s", type(function))

Another benefit of logging like this is that it could carry over to production. The consequence is generally you’re not recording DEBUG level log statements in production as the volume can be significant and not very useful. It also often requires you to plan ahead to ensure you’re logging the various failure cases that could happen and the appropriate context with each.

For the sake of this tutorial, let’s assume we can’t do this in production, didn’t plan ahead for it, and, instead, are trying to debug and reproduce this in development.

The Python Debugger

The Python Debugger (PDB) is a tool that allows you to step through your callstack using breakpoints. The tool itself is inspired by GNU’s Debugger (GDB) and, while powerful, often can be overwhelming if you’re not familiar with it. It’s definitely an experience that will get easier with repetition, and we’re only going to go into a few high level concepts with it for our example.

So the first thing we’d want to do is instrument our code to add a breakpoint. In our example above, we could actually instrument symbolizer.py ourselves. That’s not always the case, as sometimes the exception happens in third party code. No matter where you instrument it, you’ll still be able to jump up and down the stack. Let’s start by changing that code:

def is_internal_function(self, function):
    # add a breakpoint for PDB
    try:
        return _internal_function_re.search(function) is not None
    except Exception:
        import pdb; pdb.set_trace()
        raise

We’re limiting this to the exception, as it’s common you’ll have code that runs successfully most of the time, sometimes in a loop, and you wouldn’t want to pause execution on every single iteration.

Once we’ve hit this breakpoint (which is what set_trace() is registering), we’ll be dropped into a special shell-like environment:

# ...
(Pdb)

This is the PDB console, and it works similar to the Python shell. In addition to being able to run most Python code, we’re also executing under a specific context within our call stack. That location is the entry point. Rather, it’s where you called set_trace(). In the above example, we’re right where we need to be, so we can easily grab the type of function:

(Pdb) type(function)
<type 'NoneType'>

Of course, we could also simply grab all locally scoped variables using one of Python’s builtins:

(Pdb) locals()
{..., 'function': None, ...}

In some cases we may have to navigate up and down the stack to get to the frame where the function is executing. For example, if our set_trace() instrumentation had dropped us higher up in the stack, potentially at the top frame, we would use down to jump into the inner frames until we hit a location that had the information needed:

(Pdb) down
-> in_app = (in_app and not self.sym.is_internal_function(raw_frame.get('function')))
(Pdb) down
-> return _internal_function_re.search(function) is not None
(Pdb) type(function)
<type 'NoneType'>

So we’ve identified the problem: function is a NoneType. While that doesn’t really tell us why it’s that way, it at least gives us valuable information to speed up our test cases.

Debugging in Production

So PDB works great in development, but what about production? Let’s look a bit more deeply at what Python gives us to answer that question.

The great thing about the CPython runtime — that’s the standard runtime most people use — is it allows easy access to the current call stack. While some other runtimes (like PyPy) will provide similar information, it’s not guaranteed. When you hit an exception the stack is exposed via sys.exc_info(). Let’s look at what that gives us for a typical exception:

>>> try:
...   1 / 0
... except:
...   import sys; sys.exc_info()
...
(<type 'exceptions.ZeroDivisionError'>,
    ZeroDivisionError('integer division or modulo by zero',),
    <traceback object at 0x105da1a28>)

We’re going to avoid going too deep into this, but we’ve got a tuple of three pieces of information: the class of the exception, the actual instance of the exception, and a traceback object. The bit we care about here is the traceback object. Of note, you can also grab this information outside of exceptions using the traceback module. There’s some documentation on using these structures, but let’s just dive in ourselves to try and understand them. Within the traceback object we’ve got a bunch of information available to us, though it’s going to take a bit of work — and magic — to access:

>>> exc_type, exc_value, tb = exc_info
>>> tb.tb_frame
<frame object at 0x105dc0e10>

Once we’ve got a frame, CPython exposes ways to get stack locals — that is all scoped variables to that executing frame. For example, look at the following code:

def foo(bar=None):
    foo = "bar"
    1 / 0

Let’s generate an exception with that code:

try:
    foo()
except:
    exc_type, exc_value, tb = sys.exc_info()

And finally, let’s access the locals via f_locals on the <frame> object:

>>> from pprint import pprint
>>> pprint(tb.tb_frame.f_locals)
{'__builtins__': <module '__builtin__' (built-in)>,
    '__doc__': None,
    '__name__': '__main__',
    '__package__': None,
    'exc_info': (<type 'exceptions.ZeroDivisionError'>,
                ZeroDivisionError('integer division or modulo by zero',),
                <traceback object at 0x105cd4fc8>),
    'foo': <function foo at 0x105cf50c8>,
    'history': '/Users/dcramer/.pythonhist',
    'os': <module 'os' from 'lib/python2.7/os.py'>,
    'pprint': <function pprint at 0x105cf52a8>,
    'print_function': _Feature((2, 6, 0, 'alpha', 2), (3, 0, 0, 'alpha', 0), 65536),
    'readline': <module 'readline' from 'lib/python2.7/lib-dynload/readline.so'>,
    'stack': [],
    'sys': <module 'sys' (built-in)>,
    'tb': <traceback object at 0x105da1a28>,
    'tb_next': None,
    'write_history': <function write_history at 0x105cf2c80>}

What we see above isn’t actually that useful. The reason being is that we’re one level up in our scope, so we’re seeing foo defined, but nothing actually relevant to the function call itself. This won’t always be true, but in our trivial example it is. To find the information we’re looking for, we’ll need to go one level deeper:

>>> inner_frame = tb.tb_next.tb_frame
>>> pprint(inner_frame.f_locals)
{'bar': None, 'foo': 'bar'}

You can quickly understand how this might be useful when we go back to our original TypeError. In that case, with the above introspection, we find out that function, which is expected to be a string, is actually set to a NoneType. We know that because Sentry has captured this error for us, and automatically extracts stack locals for every frame:

Exception

This was one of the first features we built in Sentry and, to this day, it remains as one of the most valuable debugging components we can provide. While we’re not able to always give you the details needed to reproduce an exception, in our experience it’s very rare that we actually need to manually instrument something to understand the context and, ultimately, resolve the issue.

If you’re curious about the full implementation, which also utilizes other various components form Python’s traceback structure, you can always take a look at our Python SDK’s source code on GitHub. In PHP and the JVM, the approach is slightly different due to the runtimes and, if you’re interested, you’ll also find those repositories on Sentry’s GitHub. If you’re using Sentry, we’ll generally automatically instrument things for you, though the JVM requires a little bit of configuration (docs coming soon).

Frequently Asked Questions (FAQs) about Debugging Python Errors

What are the common Python errors that I might encounter while coding?

Python errors can be broadly categorized into two types: syntax errors and exceptions. Syntax errors occur when the parser detects an incorrect statement. For instance, missing parentheses or incorrect indentation. Exceptions, on the other hand, occur even if the statement or expression is syntactically correct, but it causes an error when an attempt is made to execute it. Examples of exceptions include ZeroDivisionError, NameError, TypeError, and so on. Understanding these errors is the first step towards effective debugging.

How can I use the Python Debugger (pdb) for debugging?

The Python Debugger (pdb) is a built-in module that you can use for debugging Python programs. It provides functionalities such as setting breakpoints, stepping through the code, inspecting stack frames, and so on. To use pdb, you need to import the pdb module and then call pdb.set_trace() at the location where you want the debugger to start. Once the debugger starts, you can use commands like ‘n’ (next), ‘s’ (step), ‘c’ (continue), etc., to navigate through the code.

Can I use Visual Studio Code for Python debugging?

Yes, Visual Studio Code (VS Code) is a powerful editor that supports Python debugging. You can set breakpoints, step through your code, inspect variables, and more. To start debugging, you need to open the folder containing your Python script in VS Code, set breakpoints by clicking on the left side of the source code line, and then start the debugger.

What are some best practices for debugging Python code?

Debugging is an essential skill for any programmer. Some best practices for debugging Python code include understanding the error messages, using print statements or logging to display intermediate results, isolating the problem by commenting out code, and using a debugger tool like pdb or the debugger in an Integrated Development Environment (IDE) like PyCharm or VS Code.

How can I handle exceptions in Python?

In Python, you can handle exceptions using try/except blocks. The code that can potentially cause an exception is put in the try block, and the code that handles the exception is put in the except block. If an exception occurs in the try block, the execution immediately passes to the except block. You can also use the finally block to specify code that will be executed regardless of whether an exception occurs.

How can I debug a Python script that runs on a server?

Debugging a Python script that runs on a server can be more challenging than debugging a local script. However, you can use remote debugging tools like remote-pdb, which allows you to set a breakpoint in your code and then connect to the debugger over a network socket. Another option is to use logging to record what your script is doing and then analyze the log file to find the problem.

How can I debug a multi-threaded Python application?

Debugging a multi-threaded application can be tricky because multiple threads can run simultaneously. However, Python’s pdb debugger supports multi-threaded debugging. You can switch between threads using the ‘thread’ command followed by the thread id. You can also use the ‘break’ command to set breakpoints that are specific to a particular thread.

How can I debug a Python script that uses a third-party library?

If a Python script that uses a third-party library is causing problems, you can debug it in the same way as any other script. However, if the problem is in the third-party library itself, you might need to download the source code of the library and debug it. You can also use the ‘step into’ feature of the debugger to step into the functions of the third-party library.

How can I debug a Python script that takes a long time to run?

If a Python script takes a long time to run, it can be difficult to debug because you have to wait a long time to reproduce the problem. In such cases, you can use profiling tools like cProfile to find out which parts of the script are taking the most time. You can then focus your debugging efforts on those parts.

How can I improve my debugging skills in Python?

Improving your debugging skills in Python involves a combination of understanding Python’s error messages, getting familiar with debugging tools like pdb or the debugger in an IDE, and practicing debugging on real code. You can also read books or online resources on debugging, and learn from the experience of other programmers by reading their code and understanding how they handle errors and exceptions.

David CramerDavid Cramer
View Author

David Cramer is the CEO and founder of Sentry.

Share this article
Read Next
Get the freshest news and resources for developers, designers and digital creators in your inbox each week