Dimitris Zorbas

Code spells, smells and sourcery

Organising Book Highlights and Notes

I’ve used a variety of tools to organise my reading and notes. Given that I spend a significant amount of my time studying, depending on third parties gives me anxiety. Any of the tools I use, even the open-source, offline-first ones, can become unmaintained, riddled with security vulnerabilities or slow, or they may change in a way that makes me reluctant to use them.

To some extent, this post is a sequel to “knowledge mapping”.

Why care about my notes and highlights?

A friend asked me this question. She said: “Are you going to use this mid-conversation to correct someone? Do you want to be the ‘Well, Ackchyually..’ guy?”

Most definitely not. I’ve noticed that the rate at which I consume information keeps increasing out of proportion to the rate at which I digest it into knowledge. I repeat: it’s about building knowledge, not memorising.

This project, in particular, is about using tech for “good”, doing your thing, or escaping from servitude to the capitalistic delusion of perpetual exponential growth (too far eh? 😅).

Initially, I became frustrated at not being able to revisit my Kindle highlights. I read about half of my books on my Kindle device. I highlight sentences, and those highlights then appear on goodreads.com and read.amazon.co.uk.

I felt locked-in using Goodreads (owned by Amazon) and read.amazon. What if my highlights become unavailable due to some licensing issue?

Having all my highlights in one place opens up some interesting possibilities. Similar to how I fancy creating playlists, I might, for example, curate my favourite quotes by an author some day. I may also build a gadget for my desk to display a daily quote (Raspberry Pi + e-ink).

So.. I built a thing

I started off with a simple script which downloads all my Kindle highlights. It formats and builds a readable, pretty-printed JSON file which I sync in Git.
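
To illustrate why a pretty-printed file plays well with Git, here is a minimal Python sketch. It is not the actual script: the `dump_database` helper and the record fields are assumptions made for the example. The point is only that indented JSON puts one field per line, so a commit diff shows just the highlights that changed.

```python
import json


def dump_database(books):
    """Serialise the list of books as human-readable JSON (hypothetical helper)."""
    # indent=2 puts every field on its own line, which keeps Git diffs small.
    # ensure_ascii=False leaves non-ASCII quotes readable instead of escaping them.
    return json.dumps(books, indent=2, ensure_ascii=False) + "\n"


# Assumed record shape, for illustration only.
books = [{"title": "High Output Management", "author": "Andy Grove", "highlights": []}]
text = dump_database(books)
print(text)
```

Ending the file with a newline avoids the “No newline at end of file” noise in diffs.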

Then I enhanced it so that I can write my own notes and highlights from non-Kindle physical books.

Then I made it create and incrementally import highlights into a Notion database. Why? I was too lazy to write a frontend and I’m starting to like Notion. What’s great about this database is that it can be embedded as a view in any page.

Figure: the Notion database, with quick search from any page.

The Notion database can then be shared. Here’s mine.

Demo Time

The script can be found here and accepts the following commands:

sync_local

./notes sync_local

It downloads Kindle highlights and imports local notes into a books.json file.

The location of the flat-file “database” can be configured through a DB_FILEPATH env var, which can also be set in a .env file.

The “database” file is intended to be version-controlled with Git.
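
As a rough illustration of how such a setting could be resolved, the Python sketch below reads a value from the process environment and falls back to `.env` contents. The helper names, the default value and the precedence order (real environment first, then `.env`) are assumptions, not a description of the actual script.

```python
import os


def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_setting(name, dotenv_text="", default=None, environ=None):
    """Assumed precedence: process environment, then .env contents, then default."""
    environ = os.environ if environ is None else environ
    if name in environ:
        return environ[name]
    return parse_dotenv(dotenv_text).get(name, default)


# Hypothetical .env contents.
dotenv = "# local settings\nDB_FILEPATH=~/notes/books.json\n"
print(resolve_setting("DB_FILEPATH", dotenv, default="books.json", environ={}))
```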

Local Notes

Highlights and notes can also be imported into the “database” books.json file. Such notes are written as YAML, yes, YAML. I contemplated creating my own tiny markup language for this (ideally using nimble_parsec), but this non-feature ended up in the “backlog”.

The location of the notes file can be configured through the NOTES_FILEPATH env var, which can also be set in a .env file.

-
  # The asin key may also hold an ISBN
  asin: 0-679-76288-4
  title: High Output Management
  author: Andy Grove
  highlights:
    -
      location: 17
      text: >
        A genuinely effective indicator will cover the output of the work unit and not simply
        the activity involved.        
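
One way to picture the import step is as an additive merge. The Python sketch below takes notes that have already been parsed from YAML into plain dictionaries and merges them into a list of books. The layout of books.json is not shown in this post, so the record shape, the `merge_notes` name and the de-duplication key (location plus text) are all assumptions.

```python
def merge_notes(database, notes):
    """Add books and highlights from notes into database; never removes anything."""
    by_asin = {book["asin"]: book for book in database}
    for note in notes:
        book = by_asin.get(note["asin"])
        if book is None:
            # First time this book is seen: create an empty record for it.
            book = {"asin": note["asin"], "title": note["title"],
                    "author": note["author"], "highlights": []}
            database.append(book)
            by_asin[book["asin"]] = book
        # Skip highlights that are already stored, so re-running is harmless.
        seen = {(h["location"], h["text"]) for h in book["highlights"]}
        for highlight in note["highlights"]:
            key = (highlight["location"], highlight["text"])
            if key not in seen:
                book["highlights"].append(dict(highlight))
                seen.add(key)
    return database


notes = [{
    "asin": "0-679-76288-4",
    "title": "High Output Management",
    "author": "Andy Grove",
    "highlights": [{
        "location": 17,
        "text": "A genuinely effective indicator will cover the output of the "
                "work unit and not simply the activity involved.",
    }],
}]
database = []
merge_notes(database, notes)
merge_notes(database, notes)  # a second import adds nothing new
print(len(database), len(database[0]["highlights"]))
```

Because the merge only ever appends, it is consistent with never deleting highlights from the local database.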

search

./notes search <keyword>

Prints any highlights which match the given keyword.

Example:

./notes search work

Found 179 results for "work"

╔════════════════════════════════════════════════════════════════════╗
║  Book:   High Output Management                                    ║
║  Author: Andy Grove                                                ║
╚════════════════════════════════════════════════════════════════════╝

 A genuinely effective indicator will cover the output of the work unit
and not simply the activity involved.
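
A keyword search of this kind can be sketched in a few lines of Python with a case-insensitive regular expression. This is an illustration rather than the script's own code: the record shape and the `search` function are assumptions, and the keyword is escaped here so that it is matched literally.

```python
import re


def search(database, keyword):
    """Return (title, author, text) for every highlight containing the keyword."""
    # re.escape treats the keyword as plain text; IGNORECASE makes "Work" match "work".
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    results = []
    for book in database:
        for highlight in book["highlights"]:
            if pattern.search(highlight["text"]):
                results.append((book["title"], book["author"], highlight["text"]))
    return results


# Assumed record shape, for illustration only.
sample = [{
    "title": "High Output Management",
    "author": "Andy Grove",
    "highlights": [
        {"text": "A genuinely effective indicator will cover the output of the work unit."},
        {"text": "An unrelated sentence."},
    ],
}]
matches = search(sample, "WORK")
print(f'Found {len(matches)} results for "WORK"')
```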

random

./notes random

Returns a random highlight.

Example:

╔════════════════════════════════════════════════════════════════════╗
║  Book:   The Genealogy of Morals                                   ║
║  Author: Friedrich Nietzsche                                       ║
╚════════════════════════════════════════════════════════════════════╝

 All sick and diseased people strive instinctively after a herd-organisation,
out of a desire to shake off their sense of oppressive discomfort and weakness;
the ascetic priest divines this instinct and promotes it;

update_notion

./notes update_notion

Syncs the local “database” file into a Notion database.

It supports the --since <date> flag to sync only the database entries which have been updated since the given date (ISO-8601 formatted). This option is particularly useful since the Notion API is rate-limited and, for more than 1000 highlights, syncing can take a significant amount of time (more than 10 minutes).
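
The filtering behind a flag like --since can be sketched in Python as follows. The `updated_at` field name and the inclusive comparison are assumptions made for the example, since this post does not show how the database records track their last update.

```python
from datetime import date


def updated_since(entries, since):
    """Keep only the entries updated on or after the given ISO-8601 date."""
    cutoff = date.fromisoformat(since)
    return [
        entry for entry in entries
        if date.fromisoformat(entry["updated_at"]) >= cutoff
    ]


# Hypothetical entries with an assumed "updated_at" field.
entries = [
    {"title": "High Output Management", "updated_at": "2021-01-10"},
    {"title": "The Genealogy of Morals", "updated_at": "2021-03-02"},
]
recent = updated_since(entries, "2021-02-01")
print([entry["title"] for entry in recent])
```

Sending fewer entries means fewer API requests, which is what keeps an incremental sync short under a rate limit.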

Recap

Features

  • Offline-first - The “database” is a single JSON file
  • Version control - The JSON database file is prettified, making it easy to review and commit changes to Git
  • Both notes and the JSON database are readable and searchable using a text editor or tools like jq
  • Fast search - Both the CLI search and Notion’s search are fast

Non-Features

  • Removing highlights from the local database - I never delete highlights
  • Full-text search - A simple case-insensitive regex does the trick for now

Contributing

Feel free to fork either the bookworm script or my notes. As stated at the top of this post, this is a fun project for me, so expect no support. However, I’d be glad to discuss ideas in this domain.

Further Reading

Nah, stop reading. Start organising!


Cover image credits: @giamboscaro