Reading and Writing CSV Files in Kotlin with Apache Commons

Introduction

In this article, we'll take a look at how to read and write CSV files in Kotlin using the Apache Commons CSV library.

Apache Commons Dependency

Since we're working with an external library, let's first add it to our Kotlin project. If you're using Maven, include the commons-csv dependency:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-csv</artifactId>
    <version>1.5</version>
</dependency>

Or, if you're using Gradle:

implementation 'org.apache.commons:commons-csv:1.5'
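
If your build happens to use the Gradle Kotlin DSL (a build.gradle.kts file, which is an assumption about your setup and not something this article requires), the same coordinates would be declared roughly like this:

```kotlin
// build.gradle.kts - assumes the Kotlin DSL; same coordinates and version as above
dependencies {
    implementation("org.apache.commons:commons-csv:1.5")
}
```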

Finally, with the library added to our project, let's define the CSV file we're going to read - students.csv:

101,John,Smith,90
203,Mary,Jane,88
309,John,Wayne,96

It'll be located under /resources/students.csv.

Also, since we'll be reading these records into custom objects, let's make a data class:

data class Student (
    val studentId: Int,
    val firstName: String,
    val lastName: String,
    val score: Int
)

Reading a CSV File in Kotlin

Let's first open this file with a BufferedReader. The Files.newBufferedReader() method accepts a Path to the resource we'd like to read:

val bufferedReader = Files.newBufferedReader(Paths.get("/resources/students.csv"))

Then, once the reader is open, we can use it to initialize a CSVParser instance:

val csvParser = CSVParser(bufferedReader, CSVFormat.DEFAULT)

Since CSV files vary widely in their delimiters, quoting and headers, you have to specify a CSVFormat when initializing the parser, which removes the guesswork. A parser initialized this way can only be used for that CSV format.

Since we're following the textbook example of the CSV format, and we're using the default separator, a comma (,) - we'll pass in CSVFormat.DEFAULT as the second argument.

Now, the CSVParser is an Iterable that contains CSVRecord instances. Each line is a CSV record. Naturally, we can then iterate over the csvParser instance and extract records from it:

for (csvRecord in csvParser) {
    // get() returns a String, so numeric cells are converted with toInt()
    val studentId = csvRecord.get(0).toInt()
    val studentName = csvRecord.get(1)
    val studentLastName = csvRecord.get(2)
    val studentScore = csvRecord.get(3).toInt()
    println(Student(studentId, studentName, studentLastName, studentScore))
}

For each CSVRecord, you can get its respective cells using the get() method, passing in the index of the cell, starting at 0. Since get() returns a String, we convert the ID and the score with toInt() before using the values in the constructor of our Student data class.

This code results in:

Student(studentId=101, firstName=John, lastName=Smith, score=90)
Student(studentId=203, firstName=Mary, lastName=Jane, score=88)
Student(studentId=309, firstName=John, lastName=Wayne, score=96)
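
Put together, the steps above can be sketched as a single function. This is a minimal sketch rather than the definitive way to do it: the readStudentsByIndex() helper is a hypothetical name, the CSV content is held in a string instead of students.csv so the example runs on its own, and Kotlin's use {} is swapped in to close the parser automatically.

```kotlin
import org.apache.commons.csv.CSVFormat
import org.apache.commons.csv.CSVParser
import java.io.Reader
import java.io.StringReader

data class Student(
    val studentId: Int,
    val firstName: String,
    val lastName: String,
    val score: Int
)

// Hypothetical helper: parses header-less rows by column index
fun readStudentsByIndex(reader: Reader): List<Student> =
    // use {} closes the parser (and with it the underlying reader) when done
    CSVParser(reader, CSVFormat.DEFAULT).use { parser ->
        parser.map { record ->
            Student(
                record.get(0).toInt(), // get() returns a String
                record.get(1),
                record.get(2),
                record.get(3).toInt()
            )
        }
    }

fun main() {
    // In-memory stand-in for students.csv (an assumption for this sketch)
    val csv = "101,John,Smith,90\n203,Mary,Jane,88\n309,John,Wayne,96\n"
    readStudentsByIndex(StringReader(csv)).forEach(::println)
}
```

To read the real file instead, you could pass Files.newBufferedReader(Paths.get("/resources/students.csv")) as the reader.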

Though, this approach isn't great. We need to know the order of the columns, as well as how many columns there are, to use the get() method, and changing anything in the CSV file's structure breaks our code.
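
To make that fragility concrete, here's a small sketch in which the first two columns swap places. The idFromFirstColumn() helper is hypothetical, and it uses toIntOrNull() so that the failure shows up as null. With a plain toInt(), the same input would throw a NumberFormatException instead.

```kotlin
import org.apache.commons.csv.CSVFormat
import org.apache.commons.csv.CSVParser
import java.io.StringReader

// Hypothetical helper: tries to read column 0 of the first record as an Int ID
fun idFromFirstColumn(csv: String): Int? =
    CSVParser(StringReader(csv), CSVFormat.DEFAULT).use { parser ->
        parser.firstOrNull()?.get(0)?.toIntOrNull()
    }

fun main() {
    println(idFromFirstColumn("101,John,Smith,90\n")) // original column order: 101
    println(idFromFirstColumn("John,101,Smith,90\n")) // columns swapped: null
}
```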

Reading a CSV File with Headers in Kotlin

It's reasonable to expect that we know which columns exist, but less reasonable to expect that we know the order they're in.

Usually, CSV files have a header line that specifies the names of the columns, such as StudentID, FirstName, etc. When constructing the CSVParser instance, following the Builder Design Pattern, we can specify whether the file we're reading has a header row or not, in the CSVFormat.

By default, the CSVFormat assumes that the file doesn't have a header. Let's first add a header row to our CSV file:

StudentID,FirstName,LastName,Score
101,John,Smith,90
203,Mary,Jane,88
309,John,Wayne,96

Now, let's initialize the CSVParser instance, and set a few options in the CSVFormat along the way:

val bufferedReader = Files.newBufferedReader(Paths.get("/resources/students.csv"))

val csvParser = CSVParser(bufferedReader, CSVFormat.DEFAULT
        .withFirstRecordAsHeader()
        .withIgnoreHeaderCase()
        .withTrim())

This way, the first record (row) in the file will be treated as the header row, and the values in that row will be used as the column names.

We've also specified that the case of the header names doesn't matter to us, so columns are looked up case-insensitively.

Finally, we've also told the parser to trim the records, which removes redundant whitespace from the starts and ends of values if there is any. Some of the other options that you can fiddle around with are:

CSVFormat.DEFAULT
    .withDelimiter(',')
    .withQuote('"')
    .withRecordSeparator("\r\n")

These are used if you'd like to change the default behavior: set a different delimiter, specify the quote character (quotes can oftentimes break the parsing logic), or specify the record separator that's written at the end of each record when printing.
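
As an illustration of these options, here's a minimal sketch that parses semicolon-separated text. The parseSemicolonSeparated() helper and the sample row are made up for this example, and the format relies on the same 1.5-era with...() methods used in the rest of this article.

```kotlin
import org.apache.commons.csv.CSVFormat
import org.apache.commons.csv.CSVParser
import java.io.StringReader

// Hypothetical helper: splits semicolon-delimited text into rows of cells
fun parseSemicolonSeparated(text: String): List<List<String>> {
    val format = CSVFormat.DEFAULT
        .withDelimiter(';') // cells are separated by ';' instead of ','
        .withQuote('"')     // '"' wraps values that contain the delimiter
    return CSVParser(StringReader(text), format).use { parser ->
        // Copy every cell of every record into a plain list of strings
        parser.map { record -> List(record.size()) { i -> record.get(i) } }
    }
}

fun main() {
    // The quoted last name contains a ';' that must not split the cell
    val rows = parseSemicolonSeparated("101;John;\"Smith; Jr.\";90\n")
    println(rows) // [[101, John, Smith; Jr., 90]]
}
```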

Finally, with the parser configured this way, we can retrieve the values of each CSVRecord by column name instead of by index:

for (csvRecord in csvParser) {
    // Look cells up by header name; numeric cells still need toInt()
    val studentId = csvRecord.get("StudentID").toInt()
    val studentName = csvRecord.get("FirstName")
    val studentLastName = csvRecord.get("LastName")
    val studentScore = csvRecord.get("Score").toInt()
    println(Student(studentId, studentName, studentLastName, studentScore))
}

This is a much more forgiving approach, since we don't need to know the order of the columns themselves. Even if they get changed at any given time, the CSVParser's got us covered.

Running this code also results in:

Student(studentId=101, firstName=John, lastName=Smith, score=90)
Student(studentId=203, firstName=Mary, lastName=Jane, score=88)
Student(studentId=309, firstName=John, lastName=Wayne, score=96)
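
To see that the column order really doesn't matter, here's a minimal, self-contained sketch. The readStudentsByHeader() helper is a hypothetical name, and the input is an in-memory string with deliberately shuffled columns and stray spaces instead of the students.csv file.

```kotlin
import org.apache.commons.csv.CSVFormat
import org.apache.commons.csv.CSVParser
import java.io.Reader
import java.io.StringReader

data class Student(
    val studentId: Int,
    val firstName: String,
    val lastName: String,
    val score: Int
)

// Hypothetical helper: reads students by column name rather than position
fun readStudentsByHeader(reader: Reader): List<Student> {
    val format = CSVFormat.DEFAULT
        .withFirstRecordAsHeader() // first row supplies the column names
        .withIgnoreHeaderCase()    // "studentid" and "StudentID" both match
        .withTrim()                // strip whitespace around each value
    return CSVParser(reader, format).use { parser ->
        parser.map { record ->
            Student(
                record.get("StudentID").toInt(),
                record.get("FirstName"),
                record.get("LastName"),
                record.get("Score").toInt()
            )
        }
    }
}

fun main() {
    // Columns shuffled on purpose, with extra spaces around some values
    val csv = """
        Score,LastName,FirstName,StudentID
        90, Smith, John, 101
        88, Jane, Mary, 203
    """.trimIndent()
    readStudentsByHeader(StringReader(csv)).forEach(::println)
}
```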

Writing a CSV File in Kotlin

Similar to reading files, we can also write CSV files using Apache Commons. This time around, we'll be using the CSVPrinter.

Just like the CSVParser accepts a BufferedReader, the CSVPrinter accepts a BufferedWriter, and the CSVFormat we'd like it to use while writing the file.

Let's create a BufferedWriter with Files.newBufferedWriter(), and instantiate a CSVPrinter instance:

val writer = Files.newBufferedWriter(Paths.get("/resources/students.csv"))

val csvPrinter = CSVPrinter(writer, CSVFormat.DEFAULT
                     .withHeader("StudentID", "FirstName", "LastName", "Score"))

The printRecord() method of the CSVPrinter instance is used to write out records. It accepts all the values for that record and prints them out on a new line. Calling the method over and over allows us to write many records. You can either pass each value as a separate argument, or simply pass in a list of data.

There's no need to use the printRecord() method for the header row itself, since we've already specified it with the withHeader() method of the CSVFormat. Without specifying the header there, we would've had to print out the first row manually.

In general, you can use the csvPrinter like this:

csvPrinter.printRecord("123", "Jane", "Maggie", "100")
csvPrinter.flush()
csvPrinter.close()

Don't forget to flush() and close() the printer after use.
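
Since CSVPrinter is Closeable, one way to avoid forgetting the close() call is Kotlin's use {} function, which closes the printer even if printing throws. The sketch below is one possible shape, not the only one: renderSingleRecord() is a hypothetical helper, and it writes to an in-memory StringWriter rather than a file so the result is easy to inspect.

```kotlin
import org.apache.commons.csv.CSVFormat
import org.apache.commons.csv.CSVPrinter
import java.io.StringWriter

// Hypothetical helper: renders the header row plus one record as CSV text
fun renderSingleRecord(): String {
    val out = StringWriter() // in-memory stand-in for a file writer
    val format = CSVFormat.DEFAULT
        .withHeader("StudentID", "FirstName", "LastName", "Score")
    // use {} closes the printer, which in turn closes the underlying writer
    CSVPrinter(out, format).use { printer ->
        printer.printRecord("123", "Jane", "Maggie", "100")
    }
    return out.toString()
}

fun main() {
    print(renderSingleRecord())
}
```

With a BufferedWriter, closing the writer also flushes whatever is still buffered.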

Since we're working with a list of Student objects here, we can't pass them to printRecord() directly. Instead, we'll loop through the student list, put each student's fields into a new list and print that list of data using the printRecord() method:

val students = listOf(
    Student(101, "John", "Smith", 90),
    Student(203, "Mary", "Jane", 88),
    Student(309, "John", "Wayne", 96)
)

for (student in students) {
    // Collect the fields in header order: StudentID, FirstName, LastName, Score
    val studentData = listOf(
            student.studentId,
            student.firstName,
            student.lastName,
            student.score)

    csvPrinter.printRecord(studentData)
}
csvPrinter.flush()
csvPrinter.close()

This results in a CSV file that contains:

StudentID,FirstName,LastName,Score
101,John,Smith,90
203,Mary,Jane,88
309,John,Wayne,96
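
As a quick sanity check, the written output can be parsed back and compared with the original objects. The following is only a sketch under a few assumptions: writeStudents() and readStudents() are hypothetical helpers, and a StringWriter stands in for the file so that the round trip runs on its own.

```kotlin
import org.apache.commons.csv.CSVFormat
import org.apache.commons.csv.CSVParser
import org.apache.commons.csv.CSVPrinter
import java.io.StringReader
import java.io.StringWriter

data class Student(
    val studentId: Int,
    val firstName: String,
    val lastName: String,
    val score: Int
)

// Hypothetical helper: writes the header row plus one record per student
fun writeStudents(students: List<Student>, out: Appendable) {
    val format = CSVFormat.DEFAULT
        .withHeader("StudentID", "FirstName", "LastName", "Score")
    CSVPrinter(out, format).use { printer ->
        for (s in students) {
            printer.printRecord(s.studentId, s.firstName, s.lastName, s.score)
        }
    }
}

// Hypothetical helper: parses text produced by writeStudents() back into objects
fun readStudents(csv: String): List<Student> =
    CSVParser(StringReader(csv), CSVFormat.DEFAULT.withFirstRecordAsHeader()).use { parser ->
        parser.map { r ->
            Student(
                r.get("StudentID").toInt(),
                r.get("FirstName"),
                r.get("LastName"),
                r.get("Score").toInt()
            )
        }
    }

fun main() {
    val students = listOf(
        Student(101, "John", "Smith", 90),
        Student(203, "Mary", "Jane", 88)
    )
    val out = StringWriter()
    writeStudents(students, out)
    print(out)                                        // the generated CSV text
    println(readStudents(out.toString()) == students) // true if the round trip is lossless
}
```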

Conclusion

In this tutorial, we've gone over how to read and write CSV files in Kotlin, using the Apache Commons CSV library.

Last Updated: February 27th, 2023

© 2013-2024 Stack Abuse. All rights reserved.