Working with JSON in Python

November 02, 2018
Written by
Sam Agnew
Twilion


Developers often need to deal with data in a variety of formats, and JSON, short for JavaScript Object Notation, is one of the most popular formats used in web development. Its syntax is based on the way the JavaScript language denotes objects.

As a Python developer, you may notice that JSON looks eerily similar to a Python dictionary. There are several ways to work with JSON in Python, and more often than not the data ends up loaded into a dictionary.
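To make the resemblance concrete, here is a minimal sketch using the standard library's json module. The sample string is made up for illustration (it is not from the APOD API); it shows how each JSON type maps to a Python type when decoded.

```python
import json

# A small, made-up JSON document used only for illustration.
sample = (
    '{"title": "Orionids", "count": 24, "ratio": 0.5, '
    '"visible": true, "tags": ["meteor", "comet"], "note": null}'
)

data = json.loads(sample)

# JSON objects become dicts, arrays become lists, true/false become
# True/False, and null becomes None.
for key, value in data.items():
    print(key, type(value).__name__, value)
```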

For this post, we are going to use the following modified JSON data from NASA's Astronomy Picture of the Day API. Navigate to where you want to run the example code, create a file called apod.json and add the following to it:

{
    "copyright": "Yin Hao",
    "date": "2018-10-30",
    "explanation": "Meteors have been shooting out from the constellation of Orion. This was expected, as October is the time of year for the Orionids Meteor Shower. Pictured here, over two dozen meteors were caught in successively added exposures last October over Wulan Hada volcano in Inner Mongolia, China. The featured image shows multiple meteor streaks that can all be connected to a single small region on the sky called the radiant, here visible just above and to the left of the belt of Orion, The Orionids meteors started as sand sized bits expelled from Comet Halley during one of its trips to the inner Solar System. Comet Halley is actually responsible for two known meteor showers, the other known as the Eta Aquarids and visible every May. An Orionids image featured on APOD one year ago today from the same location shows the same car. Next month, the Leonids Meteor Shower from Comet Tempel-Tuttle should also result in some bright meteor streaks. Follow APOD on: Facebook, Instagram, Reddit, or Twitter",
    "hdurl": "https://apod.nasa.gov/apod/image/1810/Orionids_Hao_2324.jpg",
    "media_type": "image",
    "service_version": "v1",
    "title": "Orionids Meteors over Inner Mongolia",
    "url": "https://apod.nasa.gov/apod/image/1810/Orionids_Hao_960.jpg"
}

Using this example, let's examine how you would decode and encode this data with different Python libraries.

The Standard Library

Let's start with the obvious choice, the json module in the Python standard library. It handles encoding and decoding JSON in a straightforward way, and many other JSON libraries model their APIs on it and behave similarly.

Create a file called test.py and paste the following code into it to decode the JSON in our apod.json text file, store it in a Python dictionary, and then encode it back into a string:

import json

with open('apod.json', 'r') as f:
    json_text = f.read()

# Decode the JSON string into a Python dictionary.
apod_dict = json.loads(json_text)
print(apod_dict['explanation'])

# Encode the Python dictionary into a JSON string.
new_json_string = json.dumps(apod_dict, indent=4)
print(new_json_string)

Run your code with the following command:

python test.py

One upside of using the built-in json module is that you don't have to install any third-party libraries, which keeps your dependencies minimal.
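The same module can also read from and write to file objects directly with json.load() and json.dump(), which saves the intermediate string. The following is a minimal sketch; the record and the apod_copy.json file name are made up for illustration, and the file is written to a temporary directory so it does not touch your apod.json.

```python
import json
import os
import tempfile

# A made-up record standing in for the full APOD data.
record = {"title": "Orionids Meteors over Inner Mongolia", "media_type": "image"}

# Hypothetical output path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "apod_copy.json")

# json.dump() serializes straight to an open file object.
with open(path, "w", encoding="utf-8") as f:
    json.dump(record, f, indent=4)

# json.load() parses straight from an open file object.
with open(path, "r", encoding="utf-8") as f:
    loaded = json.load(f)

print(loaded["title"])
```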

simplejson

simplejson is a simple and fast JSON library that functions similarly to the built-in module. A cool thing about simplejson is that it is maintained outside the standard library and updated regularly, independently of Python releases.

You will have to install this module with pip. So in your terminal, run the following command (preferably in a virtual environment):

pip install simplejson==3.16.0

This library is designed to be very similar to the built in module, so you don't even have to change your code to get the same functionality! Just import the simplejson module, give it the name json, and the rest of the code from the previous example should just work.

Replace your previous code with the following if you want to use simplejson to encode and decode:

import simplejson as json

with open('apod.json', 'r') as f:
    json_text = f.read()

# Decode the JSON string into a Python dictionary.
apod_dict = json.loads(json_text)
print(apod_dict['explanation'])

# Encode the Python dictionary into a JSON string.
new_json_string = json.dumps(apod_dict, indent=4)
print(new_json_string)

Again, run this with the following command:

python test.py

Many Python developers would suggest using simplejson in place of the stock json library for most cases because it is well maintained.
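Because the two modules share an interface, one common pattern is to prefer simplejson when it is installed and fall back to the built-in module otherwise. Here is a minimal sketch, with a made-up one-key JSON string standing in for apod.json:

```python
try:
    import simplejson as json  # Used when simplejson is installed.
except ImportError:
    import json  # Otherwise fall back to the standard library module.

# Made-up sample data; the rest of the code is identical for both modules.
apod_dict = json.loads('{"title": "Orionids Meteors over Inner Mongolia"}')
print(json.dumps(apod_dict, indent=4))
```

Whichever module ends up bound to the name json, the calls to loads() and dumps() stay the same.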

UltraJSON

Like simplejson, ujson is another community-maintained JSON library. This one, however, is written in C and designed to be really fast. It lacks some of the more advanced features that the built-in json module has, but it delivers on its promise: at the time of writing, it seems to be unmatched in terms of speed.

Install ujson with the following command:

pip install ujson==1.35

As with simplejson, you don't have to change any of your code for it to work. In most cases, it works in the same way from the developer's point of view as the built in module. Replace your previous code with the following:

import ujson as json

with open('apod.json', 'r') as f:
    json_text = f.read()

# Decode the JSON string into a Python dictionary.
apod_dict = json.loads(json_text)
print(apod_dict['explanation'])

# Encode the Python dictionary into a JSON string.
new_json_string = json.dumps(apod_dict, indent=4)
print(new_json_string)

Run this with the following command:

python test.py

If you're dealing with really large datasets and JSON serialization is becoming an expensive task, then ujson is a great library to use.
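If you want to check whether the switch pays off for your own data, a rough timing comparison is easy to set up. The sketch below is only an assumption-laden starting point: the payload is made up, the time_dumps helper is hypothetical, and the ujson comparison is skipped when ujson is not installed. Real results depend heavily on the shape of your data.

```python
import json
import timeit

try:
    import ujson
except ImportError:
    ujson = None  # The ujson timing is skipped if it is not installed.

# Made-up payload: 1,000 small records, only meant to give the encoders work.
payload = [{"id": i, "title": "item %d" % i, "tags": ["a", "b"]} for i in range(1000)]


def time_dumps(module, repeats=20):
    """Return the total seconds taken to encode the payload `repeats` times."""
    return timeit.timeit(lambda: module.dumps(payload), number=repeats)


results = {"json": time_dumps(json)}
if ujson is not None:
    results["ujson"] = time_dumps(ujson)

for name, seconds in results.items():
    print("%s: %.4f seconds" % (name, seconds))
```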

The Requests library

These JSON serialization libraries are great, but often in the real world there is more context around why you have to deal with JSON data. One of the most common scenarios that requires decoding JSON would be when making HTTP requests to third party REST APIs.

The requests library is the most popular Python tool for making HTTP requests, and it has a pretty awesome built in json() method on the response object that is returned when your HTTP request is finished. It's great to have a built in solution so you don't have to import more libraries for a simple task.

Install requests with the following shell command:

pip install requests==2.20.0

In this example, we are actually going to make an HTTP request to the Astronomy Picture of the Day API rather than using the local hard coded .json file from the other examples.

Open a new file called apod.py and add the following code to it:

import requests

apod_url = 'https://api.nasa.gov/planetary/apod?api_key=DEMO_KEY'
apod_dict = requests.get(apod_url).json()

print(apod_dict['explanation'])

This code makes an HTTP GET request to NASA's API, parses the JSON data that it returns using this built in method, and prints out the explanation of the current Astronomy Picture of the Day.

Run your code with the following command:

python apod.py
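The example above assumes that the request succeeds and that the body is valid JSON. A slightly more defensive variant might look like the following sketch. The fetch_apod function, its get parameter, and the 10 second timeout are assumptions made for illustration; raise_for_status() and the timeout argument are part of requests.

```python
def fetch_apod(url, get=None, timeout=10):
    """Fetch a URL and return its decoded JSON body, or None if it is not JSON.

    `get` is a hypothetical hook for swapping in a stand-in during tests;
    by default it is requests.get.
    """
    if get is None:
        import requests  # Imported lazily so a stand-in needs no network.
        get = requests.get

    response = get(url, timeout=timeout)
    # Raise an exception for 4xx and 5xx status codes.
    response.raise_for_status()
    try:
        return response.json()
    except ValueError:
        # json() raises a ValueError subclass when the body is not valid JSON.
        return None

# Usage (makes a real HTTP request):
# apod_dict = fetch_apod('https://api.nasa.gov/planetary/apod?api_key=DEMO_KEY')
```

Returning None for an invalid body is just one choice; depending on your application, letting the exception propagate may be more appropriate.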

Responding to an HTTP request with JSON in Flask

Another common scenario is that you are building a route on a web application and want to respond to requests with JSON data. Flask, a popular lightweight web framework for Python, has a built in jsonify function to handle serializing your data for you.

Install Flask with pip:

pip install flask==1.0.2

And now create a new file called app.py, where the code for our example web app will live:

from flask import Flask, jsonify


app = Flask(__name__)


@app.route('/apod', methods=['GET'])
def apod():
    url = 'https://apod.nasa.gov/apod/image/1810/Orionids_Hao_960.jpg'
    title = 'Orionids Meteors over Inner Mongolia'

    return jsonify(url=url, title=title)


if __name__ == '__main__':
    app.run()

In this code, we have a route named /apod, and anytime a GET request is sent to that route, the apod() function is called. In this function, we are pretending to respond with the Astronomy Picture of the Day. In this example the data we're returning is just hard coded, but you can replace this with data from any other source.

Run the file with python app.py, and then visit http://localhost:5000/apod in your browser to see the JSON data.

Per the Flask docs, the jsonify function takes data in the form of:

  1. Single argument: Passed straight through to dumps().
  2. Multiple arguments: Converted to an array before being passed to dumps().
  3. Multiple keyword arguments: Converted to a dict before being passed to dumps().
  4. Both args and kwargs: Behavior undefined and will throw an exception.

This function wraps dumps() to add a few enhancements that make life easier. It turns the JSON output into a Response object with the application/json mimetype.
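To see how the four argument rules listed above play out, here is a rough stand-in built only on json.dumps(). The to_json_text helper is hypothetical and is not Flask's own code; it returns a plain string rather than a Response object, and the TypeError it raises is an assumption about what kind of exception to use.

```python
import json


def to_json_text(*args, **kwargs):
    """Hypothetical helper that mimics the argument rules described above."""
    if args and kwargs:
        raise TypeError("pass either positional or keyword arguments, not both")
    if len(args) == 1:
        data = args[0]       # Single argument: passed straight through.
    elif args:
        data = list(args)    # Multiple arguments: converted to an array.
    else:
        data = dict(kwargs)  # Keyword arguments: converted to a dict.
    return json.dumps(data)


print(to_json_text(url="https://apod.nasa.gov/", title="Orionids"))
print(to_json_text(1, 2, 3))
print(to_json_text({"media_type": "image"}))
```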

Conclusion

There are many ways to work with JSON in Python, and I've shown you just a few examples in this post. You can use whichever library suits your needs or, in the case of requests and Flask, you might not even have to import a specific JSON library.

Feel free to reach out with any questions or to show off what you build.