Datasette is a way to look through a sqlite3 database, and with sqlite-utils we can easily put a CSV into a database. This makes it easier, or at least easier, to see what there. Additionally, we can deploy the dataset on the web and make it accessable via JSON so we can quickly prototype or build something.

I watched the tutorial video on datasette homepage and got super excited.

Lets go through a real world dataset to see how it works.

Get some data

To go afdc.energy.gov and download the csv of alt fuel stations. I'm saving it as alt_fuel_stations.csv. We are going to explore what sort of public changers are out there for electric vehicles.

Install datasette and sqlite-utils

The tools are both in homebrew, and man does it take a long long while to install!

1
  brew install datasette sqlite-utils

Once that's up we can test it with

1
2
  datasette --version
  sqlite-utils --version
datasette, version 0.64.6
sqlite-utils, version 3.36

Lets install the datasette-cluster-map plug in so we can see where these chargers actually are.

1
2
3
  datasette install datasette-cluster-map \
            --app stations \
            --install datasette-cluster-map

Import the csv file

With the file that we downloaded above, lets smash it into a sqlite database.

1
  sqlite-utils insert stations.db stations alt_fuel_stations.csv --csv

And check to see if it's been created.

1
  ls -l stations.db *csv
-rw-r--r--@ 1 wschenk  staff  27355616 Mar  5 11:34 alt_fuel_stations.csv
-rw-r--r--  1 wschenk  staff  29429760 Mar  5 16:20 stations.db

Start datasette

1
  datasette serve stations.db

Click through to stations and start looking around. If we look at EV Connector Types as a facet, we can see what sort of weird values are in there!

Looks like the connector types are in some weird format. Lets put it in to the CLI.

1
2
3
  sqlite-utils query stations.db \
               "select distinct [EV Connector Types] from stations" \
               --csv
EV Connector Types
""
J1772
J1772 NEMA520
J1772 NEMA515
J1772 TESLA
CHADEMO J1772 J1772COMBO
CHADEMO J1772
CHADEMO J1772 NEMA515
J1772 J1772COMBO
CHADEMO
J1772 NEMA1450
CHADEMO J1772COMBO
NEMA520 TESLA
TESLA
J1772COMBO
CHADEMO J1772 J1772COMBO TESLA
J1772 NEMA1450 TESLA
CHADEMO J1772 NEMA520
J1772COMBO TESLA
CHADEMO J1772COMBO TESLA
CHADEMO J1772 J1772COMBO NEMA515

Lets figure out which columns we need to add. This is SQL plus bash to get out the individual fields. In the database itself

1
2
3
4
5
  sqlite-utils query stations.db \
               "select distinct [EV Connector Types] from stations" \
               --csv --no-headers | \
      sed 's/ /\n/g' | sed 's/[^0-9A-Za-z]//g' | \
      sort | uniq | tee charger_columns

CHADEMO
J1772
J1772COMBO
NEMA1450
NEMA515
NEMA520
TESLA

Lets extract those into their own columns

1
2
3
4
5
  for i in $(cat charger_columns);
  do
      echo Adding $i
      sqlite-utils add-column stations.db stations  $i int
  done
Adding CHADEMO
Adding J1772
Adding J1772COMBO
Adding NEMA1450
Adding NEMA515
Adding NEMA520
Adding TESLA

Now we can populate it

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
  require 'bundler/inline'

  gemfile do
    source 'https://rubygems.org'
    gem 'sqlite3'
  end

  db = SQLite3::Database.open 'stations.db'

  i = 0
  results = db.query( "select distinct [EV Connector Types]
          from stations where [EV Connector Types] != ''" )

  results.each do |r|
    fields = r[0].split(' ' ).collect { |x| "#{x} = 1 "}.join(" and ")
    cmd =  "update stations set #{fields} where [EV Connector Types] = '#{r[0]}'"
    puts cmd
    db.execute cmd
    
    i = i + 1
    if i % 1000 == 0
      puts "#{i} rows processed"
    end
  end

Speeding things up

We can pull out tables and add foriegn keys if you want to look at those directly

1
2
  sqlite-utils extract stations.db stations "Fuel Type Code" --table fuel_type_code
  sqlite-utils extract stations.db stations "EV Network" --table ev_network

It's also possible to quickly add indexes

1
2
  sqlite-utils create-index stations.db stations State
  sqlite-utils create-index stations.db stations  'Facility Type'

Publish to fly.io

This is fun to have locally, but it's interesting to share this with the team. We can use the datasette-publish-fly plugin to make this easier.

1
  pip install datasette-publish-fly

I've already have flyctl on my computer and am authenticated, so this is the one liner to publish it:

1
2
3
  datasette publish fly stations.db \
            --app="stations" \
            --install=datasette-cluster-map

It will create the app, build it, embed the sqlite file in the Docker container, and put it on the internet!

Previously

POSSE rss to mastodon keep it local and then share

2024-03-04

Next

Ruby crashes on fly.io more memory

2024-03-05

howto

Previously

POSSE rss to mastodon keep it local and then share

2024-03-04

Next

Making a web component by scratch progressive enhancement

2024-03-13