Fine-tuning language models with Axon

Published November 02, 2023 by Toran Billups

In 2023 I spent a lot of time at the intersection of software engineering and machine learning hoping to uncover the next great opportunity for automation. My talk at ElixirConf US documents my nearly year-long effort to deconstruct deep learning for those who feel overwhelmed by the technical aspects.

I started with FizzBuzz because it was a problem most software engineers had some familiarity with, which allowed me to subtly bridge the gap to more complex concepts. This widely known interview question was the perfect primer because underneath its many solutions lies a well-understood classification problem, an ideal foundation for machine learning.
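To make the classification framing concrete, here is a minimal sketch in plain Elixir of what that labeling looks like. The module name and the specific class indices are illustrative assumptions, not the exact scheme from the talk:

```elixir
defmodule FizzBuzzLabels do
  @moduledoc """
  FizzBuzz as a four-class classification problem: every integer maps
  to exactly one label, which is what a network learns to predict.
  The class indices 0-3 are an illustrative assumption.
  """

  # 0 = the number itself, 1 = "fizz", 2 = "buzz", 3 = "fizzbuzz"
  def label(n) when rem(n, 15) == 0, do: 3
  def label(n) when rem(n, 3) == 0, do: 1
  def label(n) when rem(n, 5) == 0, do: 2
  def label(_n), do: 0
end
```

For example, `FizzBuzzLabels.label(15)` returns `3` and `FizzBuzzLabels.label(7)` returns `0` — the branching logic of every FizzBuzz solution collapses into a label the model can predict.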

After the introductory concepts, I discussed fine-tuning pre-trained models for text classification. I went into detail about creating a labeled dataset, tokenizing the text, transforming it into embeddings, and finally fine-tuning the model with Axon. I emphasized the critical role of data quality, the challenges that come with preparing and cleaning data, and the nuances of encoding text for machine learning models.
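As a rough sketch of the Axon side of that pipeline, a small classification head over pre-computed embeddings might look like the following. The embedding width, layer sizes, class count, optimizer, and the `train_data` stream are all assumptions for illustration, not the talk's exact configuration (it requires the `:axon` dependency):

```elixir
# Classification head over pre-computed text embeddings
# (assumed 768-dimensional) with four output classes.
model =
  Axon.input("embeddings", shape: {nil, 768})
  |> Axon.dense(128, activation: :relu)
  |> Axon.dropout(rate: 0.2)
  |> Axon.dense(4, activation: :softmax)

# Build a supervised training loop and run it over `train_data`,
# an assumed stream of {embedding_batch, one_hot_label_batch} tuples.
params =
  model
  |> Axon.Loop.trainer(:categorical_cross_entropy, :adam)
  |> Axon.Loop.run(train_data, %{}, epochs: 3)
```

The declarative pipeline is the appeal here: the model is just data built up with `|>`, and the training loop is assembled the same way.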

Toward the end of the talk I spoke about the trend toward off-the-shelf, open source models and the opportunities for developers to upskill. For those eager to delve deeper, I recommend Grokking Deep Learning as an invaluable resource. The book greatly simplified the topic, inspired me to learn about neural networks, and pushed me to share what I learned with others.

My goal was to make deep learning and model fine-tuning more approachable, especially for those in the Elixir community. While the talk was presented at ElixirConf, I think it's applicable to anyone who considers themselves unfamiliar with the basics of machine learning.

Looking back, this journey has been about more than just acquiring knowledge; it's about sharing it, making the path less intimidating, and most importantly, showing how software engineering and machine learning collide to make way for innovation.

The source code is up on GitHub for anyone who wants to see a few of those examples in more detail. The full transcript of the talk can be downloaded here.

I also shared a complete FizzBuzz implementation built with Nx for those who want to see the most basic numerical computing solution in Elixir. The Axon example shows a more declarative approach that ultimately solves the same problem.
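On the input side, one common trick in FizzBuzz-with-a-network write-ups (assumed here for illustration, not quoted from the repo) is to feed the network each integer as a fixed-width vector of binary digits rather than a raw number:

```elixir
defmodule FizzBuzzEncode do
  @moduledoc """
  Encodes an integer as a fixed-width list of binary digits, a common
  input representation for a FizzBuzz network. The 10-bit default is
  an illustrative assumption (enough for inputs up to 1023).
  """

  import Bitwise

  def binary_digits(n, bits \\ 10) do
    # Most significant bit first: shift right by i, mask the low bit.
    for i <- (bits - 1)..0//-1, do: band(bsr(n, i), 1)
  end
end
```

For example, `FizzBuzzEncode.binary_digits(7)` yields `[0, 0, 0, 0, 0, 0, 0, 1, 1, 1]`. A list like this converts directly into an Nx tensor, giving the network a richer signal than a single scalar.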


Buy Me a Coffee

Twitter / GitHub / Email