How To Create An API From A Dataset Using Python and Flask

Datasets are the building blocks of an API, the genome from which your development projects will grow and evolve. They are at the heart of some of the most advanced digital disciplines, from AI to computer vision to machine learning. Given how widely machine learning is now applied, from small business to science, the need for quality, reliable datasets is only going to grow.

Being able to convert a dataset into an API also makes it possible to create your own custom APIs, whether that be for in-house use or to share with end-users. It lets you interact with your raw data in a more hands-on manner.

We’re going to show you how to build a basic web API using Python, SQLite, and Flask, a popular web framework. As an example, we’ll build a basic book catalog that shows how a REST API built with Python can return granular results from a dataset and serve them from different API endpoints. In this instance, we’ll be serving the title, date of publication, and first sentence of a book via our API.

We’re going to start by building a basic Flask server and hosting a homepage. Then we’ll show you how to route your data to different API endpoints.

Install Flask

Flask is a web framework, meaning it’s specifically designed to make building web applications with Python simple and intuitive. We’re going to begin by installing Flask and importing it into our Python program.

Before you begin, you should create a new folder for your project in your programming directory. We’ve called ours “apitest” but you can name your folder whatever you like.

Once you’ve got your project folder created, navigate to that directory using Terminal or whatever IDE you prefer. Once you’re there, type:

pip install flask

Once that’s loaded, create an empty file using your text editor of choice. We’ll be using Notepad++ as it allows you to save files using whatever file extension you like. Title your empty file app.py and save it.

Now we’re going to create a barebones Flask server. In app.py, input the following:

import flask

app = flask.Flask(__name__)
app.config["DEBUG"] = True

@app.route('/', methods=['GET'])
def home():
    return "Distant Reading Archive: This site is a prototype API for distant reading of science fiction novels."

app.run()

This simply creates a Flask application and maps a function to the homepage. Save the file and then type the following into the command line:

python app.py

This serves the text “Distant Reading Archive: This site is a prototype API for distant reading of science fiction novels.” as a homepage at http://127.0.0.1:5000/.

That’s the basics of how Flask works. It should give you an indication of how simple and intuitive it is to create a web service using Flask.

What Flask Is Doing

Flask maps incoming web requests to Python functions. In this instance, we define a function called home and Flask routes requests for the / endpoint to it. This is what’s known as app routing.

The app routing part of our code is:

@app.route('/', methods=['GET'])

The / is the URL path the function is mapped to, and the methods argument lists which HTTP methods are accepted at that endpoint. We’ll just be using GET requests for the sake of this tutorial, but you could allow POST as well if you wanted clients to be able to send data to that endpoint.
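To make the idea of app routing more concrete, here is a toy sketch of a decorator that fills a route table. It is not Flask's implementation; the names routes, route, and dispatch are hypothetical and exist only for this illustration.

```python
# A toy route table: NOT how Flask is implemented, just the general idea.
routes = {}  # maps (path, HTTP method) -> view function


def route(path, methods=("GET",)):
    """Register the decorated function for each allowed method on `path`."""
    def decorator(func):
        for method in methods:
            routes[(path, method)] = func
        return func
    return decorator


@route("/", methods=["GET"])
def home():
    return "Distant Reading Archive"


def dispatch(path, method="GET"):
    """Find the view registered for this path and method, then call it."""
    view = routes.get((path, method))
    if view is None:
        return "no matching route"
    return view()


print(dispatch("/"))          # calls home()
print(dispatch("/", "POST"))  # no POST view was registered for "/"
```

A real framework does much more (URL parameters, error responses, request objects), but the core idea is the same: the decorator records which function answers which path and method.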

As for our preliminary code, the import flask statement brings Flask into our program. The app = flask.Flask(__name__) line creates a Flask application, which we’ll run at the end to deliver our results. Setting app.config["DEBUG"] = True runs your program in debug mode, so you won’t have to restart the Flask server every time you change a line of code. Finally, the app.run() call simply runs the application.

Creating The API

Now that our Flask server is up and running, it’s time to build our API. We’ll be structuring our data as a list of Python dictionaries. A dictionary pairs keys with values and is formatted as follows:

{
    'key1': 'value1',
    'key2': 'value2'
}
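As a quick illustration of how dictionaries behave, the sketch below builds one book entry, looks up values by key, and serializes a small list of entries with the standard json module. The variable names are our own, and the entries are shortened versions of the tutorial's test data.

```python
import json

# One book as a dictionary: each key maps to a value.
book = {
    "id": 0,
    "title": "A Fire Upon the Deep",
    "author": "Vernor Vinge",
}

print(book["title"])          # look up a value by its key
print(book.get("published"))  # .get() returns None for a missing key

# A catalog is simply a list of such dictionaries.
catalog = [book, {"id": 2, "title": "Dhalgren", "author": "Samuel R. Delany"}]
titles = [entry["title"] for entry in catalog]
print(titles)

# JSON has the same shape, so the json module can serialize the list.
as_json = json.dumps(catalog, indent=2)
print(as_json)
```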

Now let’s add a bit of test data to our application for an example of how this works. Delete the previous code in app.py and replace it with the following code:

import flask
from flask import request, jsonify

app = flask.Flask(__name__)
app.config["DEBUG"] = True

And this will create some test data for our catalog in the form of a list of dictionaries:

books = [
    {'id': 0,
     'title': 'A Fire Upon the Deep',
     'author': 'Vernor Vinge',
     'first_sentence': 'The coldsleep itself was dreamless.',
     'published': '1992'},
    {'id': 1,
     'title': 'The Ones Who Walk Away From Omelas',
     'author': 'Ursula K. Le Guin',
     'first_sentence': 'With a clamor of bells that set the swallows soaring, the Festival of Summer came to the city Omelas, bright-towered by the sea.',
     'published': '1973'},
    {'id': 2,
     'title': 'Dhalgren',
     'author': 'Samuel R. Delany',
     'first_sentence': 'to wound the autumnal city.',
     'published': '1975'}
]


@app.route('/', methods=['GET'])
def home():
    return "Distant Reading Archive: A prototype API for distant reading of science fiction novels."

A route to return all of the available entries in our catalog:

@app.route('/api/v1/resources/books/all', methods=['GET'])
def api_all():
    return jsonify(books)

app.run()

Save it and run the program. Navigate to http://127.0.0.1:5000/api/v1/resources/books/all to see the test data returned by the api/v1/resources/books/all endpoint.

You should see the three books from our test data returned with JSON formatting.
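If you would rather check the endpoint without opening a browser, Flask's built-in test client can call a route in-process. The following is a minimal sketch, assuming Flask is installed; the book list is a shortened stand-in for the test data above.

```python
import flask
from flask import jsonify

app = flask.Flask(__name__)

# Shortened stand-in for the tutorial's test data.
books = [
    {"id": 0, "title": "A Fire Upon the Deep", "author": "Vernor Vinge"},
    {"id": 2, "title": "Dhalgren", "author": "Samuel R. Delany"},
]


@app.route("/api/v1/resources/books/all", methods=["GET"])
def api_all():
    return jsonify(books)


# The test client issues requests in-process, so app.run() is not needed.
client = app.test_client()
response = client.get("/api/v1/resources/books/all")

print(response.status_code)  # HTTP status of the response
print(response.get_json())   # body parsed back into Python objects
```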

Filtering The Results

Now you should have an idea of app routing and working with dictionaries and JSON data in Python. So far, though, we’ve only made it so that the entire list is delivered to an endpoint. There are many circumstances where you’d want to segment the results to different API endpoints. Let’s learn how to do that.

import flask
from flask import request, jsonify

app = flask.Flask(__name__)
app.config["DEBUG"] = True

This will create some test data for our catalog in the form of a list of dictionaries:

books = [
    {'id': 0,
     'title': 'A Fire Upon the Deep',
     'author': 'Vernor Vinge',
     'first_sentence': 'The coldsleep itself was dreamless.',
     'published': '1992'},
    {'id': 1,
     'title': 'The Ones Who Walk Away From Omelas',
     'author': 'Ursula K. Le Guin',
     'first_sentence': 'With a clamor of bells that set the swallows soaring, the Festival of Summer came to the city Omelas, bright-towered by the sea.',
     'published': '1973'},
    {'id': 2,
     'title': 'Dhalgren',
     'author': 'Samuel R. Delany',
     'first_sentence': 'to wound the autumnal city.',
     'published': '1975'}
]

@app.route('/', methods=['GET'])
def home():
    return "Distant Reading Archive: A prototype API for distant reading of science fiction novels."

@app.route('/api/v1/resources/books/all', methods=['GET'])
def api_all():
    return jsonify(books)

@app.route('/api/v1/resources/books', methods=['GET'])
def api_id():
    # Check if an ID was provided as part of the URL.
    # If ID is provided, assign it to a variable.
    # If no ID is provided, display an error in the browser.
    if 'id' in request.args:
        id = int(request.args['id'])
    else:
        return "Error: No id field provided. Please specify an id."

    # Create an empty list for our results
    results = []

    # Loop through the data and match results that fit the requested ID.
    # IDs are unique, but other fields might return many results
    for book in books:
        if book['id'] == id:
            results.append(book)

    # Use the jsonify function from Flask to convert our list of
    # Python dictionaries to the JSON format.
    return jsonify(results)

app.run()

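Save the file and run the program again. Once your server is running, try requesting the new endpoint with an id query parameter, for example http://127.0.0.1:5000/api/v1/resources/books?id=0 or http://127.0.0.1:5000/api/v1/resources/books?id=1. Requesting http://127.0.0.1:5000/api/v1/resources/books without an id returns the error message.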

In this new code, we’ve created a new function called api_id and mapped it to the API endpoint api/v1/resources/books. Two things are happening inside this function. The first part reads a query parameter, which is the part of the URL that follows the ? character, such as ?id=0. If an id is supplied, it is converted to an integer; otherwise an error message is returned.

The second part loops through the data and appends every book whose id matches to the results list. That list is then returned to the user in JSON format.
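The two steps can also be sketched with the standard library alone, which makes it easier to see what happens to the query string. This is an illustrative variant, not the tutorial's Flask code: the lookup function and the 400 status codes are our own assumptions, added to show one way to handle a non-numeric id such as ?id=abc, which would make int() raise an error in the version above.

```python
from urllib.parse import urlparse, parse_qs

books = [
    {"id": 0, "title": "A Fire Upon the Deep"},
    {"id": 1, "title": "The Ones Who Walk Away From Omelas"},
    {"id": 2, "title": "Dhalgren"},
]


def lookup(url):
    """Return (results, status) for a URL such as .../books?id=1."""
    # Everything after "?" is the query string; parse_qs gives {"id": ["1"]}.
    args = parse_qs(urlparse(url).query)
    if "id" not in args:
        return "Error: No id field provided. Please specify an id.", 400
    try:
        wanted = int(args["id"][0])
    except ValueError:
        # int() raises on values like "abc"; report it instead of crashing.
        return "Error: id must be an integer.", 400
    # Keep every book whose id matches the requested one.
    return [book for book in books if book["id"] == wanted], 200


print(lookup("http://127.0.0.1:5000/api/v1/resources/books?id=2"))
print(lookup("http://127.0.0.1:5000/api/v1/resources/books?id=abc"))
print(lookup("http://127.0.0.1:5000/api/v1/resources/books"))
```

In a Flask view, the same checks would operate on request.args, and the status code could be returned alongside the body as a second value.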

Adding A Database To An API

You should now have a functional but limited API. An API isn’t going to be that useful if you have to input Python dictionaries by hand, however. One of the points of creating an API is to make data easier to parse and work with. Importing a database file into your program makes that easy to implement.

We’ll be using SQLite to connect to a database. SQLite works with .db files. We’ll be using this books database for this example. Download the file and move it into the root directory of your project.

Input the following code into your text editor:

import flask
from flask import request, jsonify
import sqlite3

app = flask.Flask(__name__)
app.config["DEBUG"] = True

def dict_factory(cursor, row):
    d = {}
    for idx, col in enumerate(cursor.description):
        d[col[0]] = row[idx]
    return d

@app.route('/', methods=['GET'])
def home():
    return '''Distant Reading Archive: A prototype API for distant reading of science fiction novels.'''

@app.route('/api/v1/resources/books/all', methods=['GET'])
def api_all():
    conn = sqlite3.connect('books.db')
    conn.row_factory = dict_factory
    cur = conn.cursor()
    all_books = cur.execute('SELECT * FROM books;').fetchall()

    return jsonify(all_books)

@app.errorhandler(404)
def page_not_found(e):
    return "404. The resource could not be found.", 404

@app.route('/api/v1/resources/books', methods=['GET'])
def api_filter():
    query_parameters = request.args

    id = query_parameters.get('id')
    published = query_parameters.get('published')
    author = query_parameters.get('author')

    query = "SELECT * FROM books WHERE"
    to_filter = []

    if id:
        query += ' id=? AND'
        to_filter.append(id)
    if published:
        query += ' published=? AND'
        to_filter.append(published)
    if author:
        query += ' author=? AND'
        to_filter.append(author)
    if not (id or published or author):
        return page_not_found(404)

    query = query[:-4] + ';'

    conn = sqlite3.connect('books.db')
    conn.row_factory = dict_factory
    cur = conn.cursor()

    results = cur.execute(query, to_filter).fetchall()

    return jsonify(results)

app.run()

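Save the result as api_final.py and run it. To see your new database and filtering function in action, navigate to http://127.0.0.1:5000/api/v1/resources/books/all for the full catalog, or add an id, published, or author query parameter to http://127.0.0.1:5000/api/v1/resources/books to filter the results.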

Let’s take a deeper look at what’s happening with both our database and the API we’ve created from it.

Understanding Databases and APIs

For this exercise, we’re using SQLite, although there are plenty of database formats you could choose from. In SQLite, data is stored in tables made up of columns and rows.

The database we’re using is a collection of all of the winners of the Hugo Awards. It has five columns: id, published, author, title, and first sentence. The first piece of code that lets our program interact with the database is:

def api_all():
    conn = sqlite3.connect('books.db')
    conn.row_factory = dict_factory
    cur = conn.cursor()
    all_books = cur.execute('SELECT * FROM books;').fetchall()

    return jsonify(all_books)

This creates a function called api_all. First, the function connects to the database using sqlite3.connect. The argument in the parentheses names the .db file to open, and the resulting connection is assigned to the conn variable.

The conn.row_factory line tells the connection to use the dict_factory function we defined, which returns rows from the database as dictionaries rather than tuples. The cur object is a cursor, which runs queries against the database and holds their results. Finally, the cur.execute method runs a SQL query that selects every column, *, from the books table, and fetchall retrieves all of the matching rows.
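To see what the row factory changes, the sketch below runs the same query before and after setting it. It uses an in-memory database as a hypothetical stand-in for books.db, with a reduced set of columns and assumed column types, so it runs without the downloaded file.

```python
import sqlite3


def dict_factory(cursor, row):
    # cursor.description holds one tuple per column; item 0 is its name.
    return {col[0]: row[idx] for idx, col in enumerate(cursor.description)}


# Hypothetical stand-in for books.db so the example runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (id INTEGER, published INTEGER, author TEXT, title TEXT)"
)
conn.execute(
    "INSERT INTO books VALUES (0, 1992, 'Vernor Vinge', 'A Fire Upon the Deep')"
)

# Default behaviour: each row comes back as a plain tuple.
as_tuples = conn.execute("SELECT * FROM books;").fetchall()

# With the row factory set, each row becomes a dictionary keyed by column.
conn.row_factory = dict_factory
as_dicts = conn.execute("SELECT * FROM books;").fetchall()

print(as_tuples)
print(as_dicts)
conn.close()
```

Dictionaries matter here because the column names become the keys of the JSON objects that the API returns; tuples would lose those names.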

The last bits of code handle errors and filtering. The page_not_found function returns a 404 error when a requested resource can’t be found. The api_filter function is a refined version of the api_id function we created earlier. It lets the end user filter by id, published, and author.

The function begins by defining the query parameters:

query_parameters = request.args

It then reads each supported parameter into its own variable. Parameters that weren’t supplied come back as None.

id = query_parameters.get('id')
published = query_parameters.get('published')
author = query_parameters.get('author')

Next, the function starts building the SQL query as a string and creates an empty list to hold the values to filter by.

query = "SELECT * FROM books WHERE"
to_filter = []

Then, for each of id, published, and author that was supplied, a condition is added to the query and the value is appended to the list.

if id:
    query += ' id=? AND'
    to_filter.append(id)
if published:
    query += ' published=? AND'
    to_filter.append(published)
if author:
    query += ' author=? AND'
    to_filter.append(author)

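If the user hasn’t supplied any of these parameters, they’ll receive the 404 response. Otherwise, the trailing AND is trimmed from the query and a semicolon is added to end the statement.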
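The query-building logic can be tried in isolation. The sketch below is an alternative formulation rather than the tutorial's exact code: build_query is a hypothetical helper that collects the conditions in a list and joins them with AND, which produces the same SQL without needing to trim the string afterwards.

```python
def build_query(id=None, published=None, author=None):
    """Return (sql, params), or None when no filter was supplied."""
    clauses = []
    params = []
    # Column names are fixed here; only the values come from the user.
    for column, value in (("id", id), ("published", published), ("author", author)):
        if value:
            clauses.append(column + "=?")
            params.append(value)
    if not clauses:
        return None  # the caller would answer with the 404 handler
    # Joining with " AND " avoids trimming a trailing "AND" afterwards.
    return "SELECT * FROM books WHERE " + " AND ".join(clauses) + ";", params


print(build_query(author="Vernor Vinge"))
print(build_query(id="0", published="1992"))
print(build_query())
```

Query parameters arrive as strings, so an id of "0" still counts as supplied.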

Then the query is run against the database and the matching rows are stored in results, same as before.

conn = sqlite3.connect('books.db')
conn.row_factory = dict_factory
cur = conn.cursor()

results = cur.execute(query, to_filter).fetchall()
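Passing to_filter as the second argument to execute is what fills in the ? placeholders. The sketch below, again using a hypothetical in-memory stand-in for books.db, suggests why this is preferable to pasting user input into the SQL string: the supplied value is compared as plain text and cannot alter the query itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # hypothetical stand-in for books.db
conn.execute("CREATE TABLE books (id INTEGER, author TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [
        (0, "Vernor Vinge", "A Fire Upon the Deep"),
        (2, "Samuel R. Delany", "Dhalgren"),
    ],
)

query = "SELECT title FROM books WHERE author=?;"

# A normal value matches the rows for that author.
normal = conn.execute(query, ["Vernor Vinge"]).fetchall()

# A hostile value is compared as plain text, so it matches nothing
# instead of rewriting the WHERE clause.
hostile = conn.execute(query, ["' OR '1'='1"]).fetchall()

print(normal)
print(hostile)
conn.close()
```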

Finally, those results are returned in JSON format thanks to the following code:

return jsonify(results)

Creating An API With Flask: Final Thoughts

We’ve shown you how to create a basic web-based API using Python, Flask, and SQLite. These concepts and the code we’ve shared are universal and flexible, so you can apply these principles toward making your own APIs. You can modify the style sheets to your heart’s content, making your web app read however you want. You can also customize the display, meaning you can make your datasets as appealing as you want for the intended end user.

There’s no end to the possibilities once you can actually design and build your own datasets and APIs. It could even be a source of revenue for your business or brand, once you’ve got the right data.