PyDev of the Week: Ines Montani

This week we welcome Ines Montani (@_inesmontani) as our PyDev of the Week! Ines is the founder of Explosion AI and a core developer of spaCy, a Python package for Natural Language Processing. If you would like to know more about Ines, you can check out her website or her GitHub profile. Let’s take a few moments to get to know her better!

Can you tell us a little about yourself (hobbies, education, etc.)?

Hi, I’m Ines! I pretty much grew up on the internet and started making websites when I was 11. I remember sitting in school and counting the hours until I could go back home and keep working on my websites. I still get that feeling sometimes when I’m working on something particularly exciting.

I wasn’t quite sure what to do with my life, so I ended up doing a combined degree of media science and linguistics and went on to work in the media industry for a few years, leading marketing and sales. But I always kept programming and building things on the side.

In 2016, I started Explosion, together with my co-founder Matt. We specialise in developer tools for Machine Learning, specifically Natural Language Processing – so basically, working with and extracting information from large volumes of text. Our open-source library spaCy is a popular package for building industrial-strength, production-ready NLP pipelines. We also develop Prodigy, an annotation tool for creating training data for machine learning models.

I’m based in Berlin, Germany, and if I’m not programming, I enjoy bouldering 🧗‍♀️, eating good food 🥘 and spending time with my pet rats 🐀.

Why did you start using Python?

It really just kinda… happened. I never sat down and said, hey, I want to learn Python. I’m actually pretty bad at just sitting down and learning things. I always need a project or a higher-level goal. When I started getting into Natural Language Processing, many of the tools I wanted to use and work on were written in Python. So I ended up learning Python along the way. It also appealed to me as a language, because it’s just very accessible and straightforward, and I like the syntax.

What other programming languages do you know and which is your favorite?

These days, I mostly work in Python and Cython. I’m also fluent in JavaScript, have recently started working more with TypeScript, and did a bit of PHP and Perl back in the day.

I don’t want to get hung up on the definition of a “programming language”, but in terms of *writing code*, I also really love building things for the web. CSS is quite elegant once you get to know it, and it’s actually one of my favourite things to write.

What projects are you working on now?

I recently finished a bunch of stuff that has been in the works for a long time! The other month, we finally released v2.1 of our open-source library spaCy. I also published a free interactive online course on Advanced NLP with spaCy, together with an open-source framework for building interactive online courses.

At the moment, we’re working on Prodigy Scale, which will be an extension product for our annotation tool Prodigy, specifically for larger teams who want to scale up their annotation projects. I’m especially excited about the data privacy-focused architecture via a self-hosted cluster, and the new features for reviewing data, measuring annotator agreement and building annotation and model training flows interactively.

Finally, I’m also working on organizing our first real-life (!) event here in Berlin for folks working with spaCy and NLP more generally. We have a really cool mix of speakers lined up so it’s going to be a lot of fun. It’s called “spaCy IRL” and takes place on July 6 – you can find more details about it here.

Which Python libraries are your favorite (core or 3rd party)?

  • IPython shell: I have to admit, I use plain Python interpreters *a lot* – mostly to try things out, quickly test and run some code, parse some text with spaCy, and so on. It was kind of a “mind blown!” moment when I discovered that IPython included an enhanced interactive shell with syntax highlighting, autocomplete and various other cool features.
  • black: We’ve been slowly adding the Black auto-formatter to our Python code bases and it’s been incredibly satisfying. Combined with flake8 and auto-formatting and linting in Visual Studio Code, it’s really changed the way I write Python code.
  • plac: We write a lot of command-line interfaces for our Python libraries — it’s an especially prominent feature in Prodigy. Plac provides a very small and concise way to define command-line interfaces with decorators. It works for quick scripts, but is also a good solution for long-lasting library code, as it helps you make sure your CLIs behave consistently and have decent documentation.
  • FastAPI: I’ve never liked the server-side templating style of web development you get with frameworks like Django or Rails, as I’ve always wanted to write more interactive single-page applications. FastAPI is a new step in REST-only frameworks that really takes advantage of new Python features like type hints and async/await. We’ve been switching all our services over to it, and the experience so far has been great.
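
To make the idea behind the plac and FastAPI entries more concrete, here is a rough, standard-library-only sketch of deriving a command-line interface from a function's signature and type hints. It is not plac's or FastAPI's actual API. The helpers `build_parser` and `run_cli` and the `train` command are hypothetical names made up for illustration, and the mapping (no default means positional argument, default means `--option`) is an assumption of this sketch.

```python
import argparse
import inspect


def build_parser(func):
    """Build an argparse parser from a function's signature (hypothetical helper)."""
    parser = argparse.ArgumentParser(description=func.__doc__)
    for name, param in inspect.signature(func).parameters.items():
        # Use the type hint as the converter; fall back to str if there is none.
        has_hint = param.annotation is not inspect.Parameter.empty
        arg_type = param.annotation if has_hint else str
        if param.default is inspect.Parameter.empty:
            # No default value: treat it as a required positional argument.
            parser.add_argument(name, type=arg_type)
        else:
            # Has a default value: expose it as an optional --flag.
            flag = "--" + name.replace("_", "-")
            parser.add_argument(flag, dest=name, type=arg_type, default=param.default)
    return parser


def run_cli(func, argv=None):
    """Parse argv (sys.argv[1:] when None) and call func with the parsed values."""
    args = build_parser(func).parse_args(argv)
    return func(**vars(args))


def train(lang: str, n_iter: int = 10):
    """Hypothetical training command, used only to demonstrate the helpers."""
    return f"Training '{lang}' for {n_iter} iterations"


# Passing an explicit argument list instead of reading the real command line.
print(run_cli(train, ["en", "--n-iter", "5"]))
```

Because the interface is generated from the function signature, the command-line options and the Python function cannot drift apart, which is roughly the consistency benefit described in the list above.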

Even though it’s not strictly a Python library, I’d also like to give a shoutout to Binder (and the related Jupyter ecosystem). Their components and building blocks have completely changed the way I think about executing Python code on the web, and have enabled me to build so many cool things, including spaCy’s interactive docs and my online course framework.

Can you tell us how your business, Explosion AI, came about?

My co-founder Matt left academia in 2014 to write spaCy. As the technologies he was working on became more and more viable, companies became increasingly interested in using his research code. What was really missing at the time was a library that took what works in research and made it available as a production-ready implementation. We met while Matt was in Berlin writing the first version of spaCy. We started working together soon after. The first thing I built was an interactive visualizer for the syntax of a text predicted by a statistical model.

In 2016, we founded Explosion, to focus on building developer tools for NLP and Machine Learning. For the first few months, we bootstrapped our company with consulting, before focusing full-time on our products. We’ve been pleased to avoid a lot of the distractions that come up for new software companies today, driven by the venture capital ecosystem. I actually gave a keynote about our take on running an open-source software company at EuroPython 2018, which still sums up my feelings very well.

Explosion is now very stable, based entirely on revenue from our first product, Prodigy. We’re also close to launching our second product, Prodigy Scale, and have been working with some great developers, including some new hires who’ll be joining our team soon. All this means spaCy will be well-funded going forward, with lots of cool new features to look forward to.

Thanks for doing the interview, Ines!