June 2, 2019

Manage your book inventory in Airtable by scanning barcodes and scraping Goodreads

If you use Airtable to keep an inventory of items, such as your personal media collection, you can save time by scanning barcodes and scraping product details from a website like Goodreads.

Airtable’s iPhone app can already scan barcodes through your phone’s camera, so this post shows how to use Zapier and Apify in the background to automatically scrape product details from Goodreads.

We use Apify to crawl Goodreads for book details, and we use Zapier to kick everything off when new barcodes are scanned. Here is an overview of the process:

  1. Prepare your Airtable base
  2. Create an Apify crawler for Goodreads
  3. Get Zapier to start a crawl when a new barcode is scanned in Airtable
  4. Get Zapier to update Airtable with the crawler’s results

Step 1: Prepare your Airtable base

The fields in your Airtable base will obviously depend on your use case, but at a minimum you’ll need a field with the Barcode type, and any other metadata fields for the products you’re tracking.

Here’s what I used for a simple Books table:

The field types of a basic Books table in Airtable

If you open this table on Airtable’s mobile app, editing the barcode field will pull up your camera for easy scanning!

Step 2: Create an Apify crawler for Goodreads

I’ve covered Apify in more detail in other posts, but in short it makes it very easy to scrape websites and make the results accessible via an API. You can also automate it with Zapier, which is what we’ll do later in the post.

Here’s how to create a crawler to extract book details from a Goodreads page like this:

  1. Create a new crawler in your Apify account.
  2. Make the Clickable elements field blank. (We don’t need the crawler to follow any links.)
  3. Set the Page function to the JavaScript code below:
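function pageFunction(context) {
    // jQuery instance that Apify provides on the context object
    var $ = context.jQuery;
    var result = {
        // trim() removes the whitespace surrounding the extracted text
        title: $("h1#bookTitle").text().trim(),
        author: $("#bookAuthors span[itemprop='name']").text(),
        coverImageUrl: $("img#coverImage").attr("src"),
        rating: parseFloat($("span[itemprop='ratingValue']").text().trim()),
        // The label set on the start URL, which holds the original barcode number
        originalBarcodeNumber: context.request.label
    };
    return result;
}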

That’s it! Apify will run this function on Goodreads pages we tell it to scrape, and the JSON object we return will be made available to Zapier.

I explain this step in more detail in a previous post, but all I did here was use Chrome DevTools to understand how Goodreads structures its HTML with CSS classes and IDs, and then use jQuery selectors to extract the relevant details.

It’s important we return context.request.label in the result object here. This variable holds the original barcode number for this crawl, and when we process the results later we’ll use this information to find the right Airtable record to update.
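The page function relies on jQuery’s text() output, which often carries stray whitespace, and a selector that matches nothing yields an empty string that parseFloat turns into NaN. Here is a minimal sketch of two helpers that could guard against both. The names cleanText and parseRating are hypothetical and are not part of Apify or jQuery:

```javascript
// Hypothetical helpers for tidying scraped values; not part of Apify or jQuery.

// Collapse runs of whitespace into single spaces and trim both ends.
function cleanText(raw) {
    return String(raw || "").replace(/\s+/g, " ").trim();
}

// Parse a rating string; return null when no number can be read from it.
function parseRating(raw) {
    var value = parseFloat(cleanText(raw));
    return isNaN(value) ? null : value;
}
```

Returning null instead of NaN for a missing rating should make it easier to skip the field later rather than write a bad value into Airtable.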

Step 3: Get Zapier to start a crawl when a new barcode is scanned in Airtable

After scanning the barcode we’re left with a book record with no details filled in

When we scan a barcode in the Airtable app, we’re left with a record with a barcode and no other details. We want to start our Apify crawler each time one of these records is added. We don’t want to start our crawler if the new row already has details filled in, which might be because we manually entered the details or imported them from somewhere else.

We’ll achieve this in two steps: first we’ll create an Airtable View that filters to books that need details from Goodreads, and then we’ll use a New Record in View trigger in Zapier to only trigger on these rows.

Create a new view in Airtable that filters to records where the title is empty but the barcode is not:

A view in Airtable that filters to books needing Goodreads details
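The view’s filter logic can be written down as a small predicate, which may help if you want to check records outside Airtable. This is only a sketch: the field names Title and Barcode are assumptions about your table, and it assumes a barcode value arrives as an object with a text property.

```javascript
// Assumed record shape: { fields: { Title: "...", Barcode: { text: "..." } } }.
// Returns true when a record has a barcode but no title yet.
function needsGoodreadsDetails(record) {
    var fields = record.fields || {};
    var title = (fields.Title || "").trim();
    var barcode = fields.Barcode && fields.Barcode.text
        ? fields.Barcode.text.trim()
        : "";
    return title === "" && barcode !== "";
}
```

A freshly scanned record passes the check, while a record with a manually entered title does not, which matches the behavior we want from the trigger.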

With that view set up, create a new Zapier Zap with this trigger and action:

Zapier Zap for triggering an Apify crawler when new records are added to an Airtable view

New Record in View (Trigger)

Make sure you have at least one sample row in your “Books to be scraped” view in Airtable, then choose the view when configuring this step in Zapier.

Run Crawler (Action)

After you’ve selected your Apify crawler, the magic happens when we specify the “Start URLs” in the Crawler Properties:

Setting the Start URLs of our Apify crawler in Zapier

When Zapier tells Apify to start our crawler, it will pass this JSON object with special settings. Here we specify two key details:

  1. We specify which Goodreads URL we want to crawl. We use the format https://www.goodreads.com/search?q=<BARCODE_NUMBER>, and we use Zapier’s templating functionality to populate the barcode number from the Airtable record that triggered the Zap. (Use the “+” button in the top right to access the “Barcode Text” variable.)
  2. We set the label of our Start URL to the barcode number by setting the key property. When we wrote our Apify crawler’s JavaScript function we included context.request.label in our result object, and now you can see how that variable will hold the barcode number.
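To make the two points above concrete, here is a sketch of how the Start URLs value could be built for one barcode. The key property comes from the description above; the value property name for the URL and the function name buildStartUrls are assumptions, so check them against what your crawler settings actually expect.

```javascript
// Hypothetical helper: builds the Start URLs entry for a single barcode.
// "key" carries the label (the barcode); "value" is assumed to hold the URL.
function buildStartUrls(barcode) {
    var code = String(barcode).trim();
    return [{
        key: code,
        value: "https://www.goodreads.com/search?q=" + encodeURIComponent(code)
    }];
}
```

In Zapier you would not run this code; the template in the Start URLs field plays the same role, with the Barcode Text variable substituted in both places.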

Save your Zap and turn it on!

Step 4: Get Zapier to update Airtable with the crawler’s results

Now we’ve got Apify automatically crawling Goodreads when new barcodes are added to Airtable, but we’re not actually doing anything with the results. To complete the cycle, we need a new Zapier Zap to run when the crawler is finished:

Zapier Zap for updating Airtable after our crawler finishes

Crawler Run Finished (Trigger)

Select your Apify crawler at this step. Make sure you’ve run your crawler manually at least once by this stage: Zapier will pull in the sample results, which makes setting up the remaining steps much easier.

Next, add an Airtable search step so Zapier can find the record to update. Choose your Base and Table, set Search by Field to your “Barcode” field, and set Search Values to the original barcode variable from the trigger step. It will be called “Results Original Barcode Number” in the variable list.

Setting the search criteria so Zapier can find the right Airtable record to update
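The search amounts to matching the barcode carried in the crawler result against the barcode stored on each record. A rough sketch of that lookup follows; the function name findRecordByBarcode, the Barcode field name, and the record shape are all assumptions made for illustration.

```javascript
// Hypothetical lookup: find the record whose barcode matches the crawl result.
// Assumed record shape: { id: "...", fields: { Barcode: { text: "..." } } }.
function findRecordByBarcode(records, result) {
    var wanted = String(result.originalBarcodeNumber);
    for (var i = 0; i < records.length; i++) {
        var barcode = (records[i].fields || {}).Barcode;
        if (barcode && barcode.text === wanted) {
            return records[i];
        }
    }
    return null; // no record carries this barcode
}
```

This is why the page function has to return the original barcode: without it there would be nothing to match the result against.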

Update Record (Action)

For the Record to update, choose “Use a Custom Value (advanced)” to update the record we searched for in the previous step.

All of the crawler results from Goodreads will be available as Zapier variables by this point, so now we just need to map the values to the right Airtable fields.

Mapping our Goodreads crawling results to Airtable fields
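The mapping itself is a simple rename from the keys of the crawler result to Airtable field names. Here is a sketch of that idea, which also skips empty values so a failed scrape does not blank out a field. The Airtable field names (Title, Author, Rating, Cover URL) are assumptions; substitute the ones from your own table.

```javascript
// Hypothetical mapping from crawler result keys to Airtable field names.
var FIELD_MAP = {
    title: "Title",
    author: "Author",
    rating: "Rating",
    coverImageUrl: "Cover URL"
};

// Build the fields to update, leaving out values that are missing or empty.
function mapResultToFields(result) {
    var fields = {};
    Object.keys(FIELD_MAP).forEach(function (key) {
        var value = result[key];
        if (value !== undefined && value !== null && value !== "") {
            fields[FIELD_MAP[key]] = value;
        }
    });
    return fields;
}
```

In the Zap, the same mapping is done by picking each crawler variable for the matching Airtable field in the Update Record form.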

Save your Zap, and you’re done!

Give it a whirl

At this point the whole cycle should be complete! When you scan a new barcode in the Airtable app, all of your automation steps should kick off magically in the cloud, and you should be able to watch the values appear in your Airtable base!

