This post is about gemini-cli, a new command-line tool I've recently built in Go, and how to use it for LLM-based data analysis with Google's Gemini models.

Background: I've been reading Simon Willison's posts about LLMs with interest, especially his work on tools that leverage LLMs and SQLite to create fun little analysis pipelines for local documents. Since I've recently done some Go work on Google's Gemini SDKs (also in langchaingo) and wrote a couple of blog posts about it, I was interested in creating a similar pipeline for myself using Go and Gemini models. This is how the idea for gemini-cli was born.

The tool

Like any Go command-line tool, gemini-cli is very easy to install:

$ go install github.com/eliben/gemini-cli@latest

And you're good to go! It will want a Gemini API key set in the GEMINI_API_KEY env var or passed with the --key flag. If you don't have an API key yet, you can get one quickly and for free from https://ai.google.dev/

The motivating task

For a while I've been interested in adding a "related posts" feature to my blog. It was clear that I'd want to use embeddings to convert my posts to vector space, and then use vector similarity to find related posts. Check out my earlier post on RAG for additional information on these techniques.

Before starting to write the code, however, I wanted to experiment with a command-line tool so I could prototype rapidly. Think of it as crafting a text processing pipeline from classical Unix command-line tools before implementing it in a programming language. gemini-cli excels at precisely this kind of prototyping.

What's next

For my task, I now have the basic information available to implement it, and all the infrastructure for running experiments; with gemini-cli in hand, this took less than 5 minutes. All I needed to do was write the tool :-)

I really enjoyed building gemini-cli; it's true to the spirit of simple, textual Unix CLIs that can be easily combined together through pipes. Using SQLite as the storage and retrieval format is also quite pleasant, and provides interoperability for free.
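As a sketch of that interoperability, here's how another Go program could round-trip an embedding vector through the kind of BLOB column such a database would use. Note that I'm assuming a packed little-endian float32 encoding here for illustration; check gemini-cli's README for the actual on-disk format it uses:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// encodeEmbedding packs a vector into a blob of little-endian
// float32 values, suitable for storing in an SQLite BLOB column.
func encodeEmbedding(vec []float32) []byte {
	var buf bytes.Buffer
	binary.Write(&buf, binary.LittleEndian, vec)
	return buf.Bytes()
}

// decodeEmbedding is the inverse: it converts a raw blob read from
// the database back into a Go slice of float32 values.
func decodeEmbedding(blob []byte) ([]float32, error) {
	vec := make([]float32, len(blob)/4)
	if err := binary.Read(bytes.NewReader(blob), binary.LittleEndian, vec); err != nil {
		return nil, err
	}
	return vec, nil
}

func main() {
	blob := encodeEmbedding([]float32{0.5, -1.25, 2.0})
	vec, err := decodeEmbedding(blob)
	if err != nil {
		panic(err)
	}
	fmt.Println(vec)
}
```

Because the format is just SQLite plus a simple binary encoding, any language with an SQLite driver can consume the data, which is exactly the interoperability win mentioned above.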

If you're a Go developer interested in building stuff with LLMs and getting started for free, I hope you find gemini-cli useful. I've only shown its embed * subcommands, but the CLI also lets you chat with an LLM through the terminal, query the API for various model details, and everything is configurable with extra flags.

It's open-source, of course; the README file rendered on GitHub has extensive documentation, and more is available by running gemini-cli help. Try it, ask questions, open issues!


[1]I like using pss, but feel free to use your favorite tools - git grep, ag or just a concoction of find and grep.
[2]A word of caution: LLMs have limited context window sizes; for embeddings, if the input is larger than the model's context window it may get truncated, so it's the user's responsibility to ensure that input documents are properly sized. gemini-cli will report the maximal number of input tokens for supported models when you invoke the gemini-cli models command.
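One way to stay under that limit is to split documents into chunks before embedding them. Here's a rough Go sketch that chunks by word count; word counts only approximate the model's token counts, so leave a generous safety margin below the reported limit. This is an illustration of the idea, not gemini-cli functionality:

```go
package main

import (
	"fmt"
	"strings"
)

// splitWords breaks a document into chunks of at most maxWords words.
// Words are only a rough proxy for model tokens, so pick maxWords
// well below the token limit reported by `gemini-cli models`.
func splitWords(text string, maxWords int) []string {
	words := strings.Fields(text)
	var chunks []string
	for len(words) > 0 {
		n := maxWords
		if len(words) < n {
			n = len(words)
		}
		chunks = append(chunks, strings.Join(words[:n], " "))
		words = words[n:]
	}
	return chunks
}

func main() {
	doc := strings.Repeat("word ", 10)
	for i, c := range splitWords(doc, 4) {
		fmt.Printf("chunk %d: %d words\n", i, len(strings.Fields(c)))
	}
}
```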

[3]We have to be careful with too much parallelism, because the Gemini API is rate-limited at the free tier.