DEV Community

Cover image for A closer look at Go's append function
Andy Haskell
Andy Haskell

Posted on • Updated on

A closer look at Go's append function

Appending items onto arrays and array-like objects is common in many programming languages, with functions like JavaScript's Array.push and Python's List.append to add items. In Go, the function for adding items to its slice type is called append, and it looks pretty similar to what you'd expect from another language.

var numbers []int
numbers = append(numbers, 250)
numbers = append(numbers, 2071)
numbers = append(numbers, 1123)
// logs [250, 2071, 1123]
fmt.Println(numbers)

In this code snippet, we add the numbers 250, 2071, and 1123 to a slice of int, and then print out its content.

Most of the time, it's easy think of append as just adding items onto a slice. However, knowing what goes on behind the scenes when you append to a slice helps you both write Go code that runs faster, and avoid an easy to miss pitfall of working with slices and the append function. So in this tutorial, we're going to take a closer look at what your computer is doing when you run append.

Watching a slice grow

When you have a slice, there are five things it can tell you:

🐛 Its length; len(slice)
📦 Its capacity; cap(slice)
🎊 What's inside it; fmt.Sprintf("%v", slice)
🗺️ Where, in your computer's memory, it is; fmt.Sprintf("%p", slice)
❓ What type its items are; fmt.Sprintf("%T", slice)

So let's take a look at what happens in the first four of those if we append to a slice five times:

var numbers []int
for i := 0; i < 5; i++  {
    numbers = append(numbers, i)
    fmt.Printf("address: %p, length: %d, capacity: %d, items: %v\n",
        numbers, len(numbers), cap(numbers), numbers)
}

Here's my output when I ran this. Any time you run it, the memory addresses should be different, but everything else should be the same:

address: 0xc00009a000, length: 1, capacity: 1, items: [0]
address: 0xc00009a030, length: 2, capacity: 2, items: [0 1]
address: 0xc0000a6000, length: 3, capacity: 4, items: [0 1 2]
address: 0xc0000a6000, length: 4, capacity: 4, items: [0 1 2 3]
address: 0xc000092080, length: 5, capacity: 8, items: [0 1 2 3 4]

Notice that:

  • Every time we add an item, both len(numbers) and the items in the numbers slice always change, just as we'd expect.
  • Each time len(numbers) and cap(numbers) are the same, the slice is at maximum capacity. If we add another item to a slice at maximum capacity, then the slice's capacity grows.
address: 0xc0000a6000, length: 3, capacity: 4, items: [0 1 2]
address: 0xc0000a6000, length: 4, capacity: 4, items: [0 1 2 3]
address: 0xc000092080, length: 5, capacity: 8, items: [0 1 2 3 4]
  • Not only that, but when we add to a slice that's at maximum capacity, its address also changes!

On that last point, that means that when that fifth number is added to the slice, your computer is allocating a new piece of memory for our slice. The numbers variable now points to whatever section of memory just got allocated for the slice. To understand why we need that allocation, let's take a look at the data structure behind slices: Go arrays!

Arrays in Go

If you're working with a contiguous, ordered collection of items in Go, chances are you're using a slice. However, that doesn't mean there's no such thing as Go arrays. Go uses arrays, but the reason we don't often use them directly is that arrays are mainly used for storing the underlying data for slices.

One big difference between a Go array and a Go slice is that you have to say the size of an array up front when you declare it. For example, you would declare an array of two strings with the type [2]string instead of []string:

// A slice of strings, with two items in it
vehiclesSlice := []string{"car", "Pokémon that knows SURF"}
// An array with two items in it, with its size set to two
vehiclesArray := [2]string{"car", "Pokémon that knows SURF"}

These look similar, and you can access items in an array or a slice the same way. For example, to see all your vehicles, this loop would work for both vehiclesSlice and vehiclesArray:

for i, v := range vehiclesArray {
    if i == 0 {
        fmt.Printf("I get around by riding a %s\n", v)
    } else {
        fmt.Printf("My other ride is a %s\n", v)
    }
}

However, besides syntax, the big difference between a slice and an array, and why you have to declare an array's size up front, is that while a slice can change sizes, arrays are fixed-size. And because a fixed-size array is where the items in a slice live, when you append to a slice with a line of code like vehiclesSlice = append(vehiclesSlice, "Catbus"), what actually happens is:

1) The slice increments its length, so len(vehicles) is now 3
2a) If there is still space left in the underlying array, our slice isn't at maximum capacity, so the item is just added to a new slot in the array:

A Go Gopher holding a pencil, writing the word "Catbus" into the third slot in a Go slice

2b) But if the underlying array was already full, because arrays are fixed-size, that means we need to allocate a new, bigger array in memory and then copy everything over to it.

A Go Gopher pulling a lever to operate a crane, carrying a newly-allocated slice of strings

Either way, from the perspective of a slice's items, the result will be the same; the slice will have an underlying array containing this well-drawn Catbus:

A sloppy drawing of Catbus from My Neighbor Totoro, but it looks more like a turtle with ears

Boston Museum of Fine Arts, here I come!

Slices, arrays, and performance

You won't notice it in examples with just a few items, but the reason these re-allocations from appending are a big deal is that if you have millions of items in your slice, you can end up spending a lot of time on allocating memory and re-copying it. And memory allocations are a slow operation from the computer's perspective.

To see this in action, try running this Go testing benchmark that fills a slice with a billion 8-bit integers (if you have more than four gigabytes of RAM).

package main

import (
    "testing"
)

var aBillion=1000*1000*1000

// Add items to a slice with a billion-item array allocated up-front
func BenchmarkBillionPreallocatedSlice(b *testing.B) {
    b.ReportAllocs()

    for i := 0; i < b.N; i++ {
        items := make([]uint8, 0, aBillion)
        for j := 0; j < aBillion; j++ {
            items = append(items, 255)
        }
    }
}

// Add items to a slice that starts out nil; the underlying array gets
// re-allocated and re-copied several times
func BenchmarkBillionEmptySlice(b *testing.B) {
    b.ReportAllocs()

    for i := 0; i < b.N; i++ {
        var items []uint8
        for j := 0; j < aBillion; j++ {
            items = append(items, 255)
        }
    }
}

On my Mac, adding a billion items to a slice where the memory isn't pre-allocated typically took around .5-.6 seconds, while adding them to a slice that started out empty took around 2.5-2.8 seconds, with six times as many bytes being processed!

goos: darwin
goarch: amd64
pkg: benchmark
BenchmarkBillionPreallocatedSlice-8            2     583063935 ns/op    1000005728 B/op        2 allocs/op
BenchmarkBillionEmptySlice-8                   1    2799137688 ns/op    6168961152 B/op       73 allocs/op

This shows how much the extra allocations and repeated data copying can slow down your code. Because of this, if you are looking to improve the performance of code that works heavily with a slice, you should look into preventing unnecessary re-allocations.

For example, here is a code snippet where we fill a slice up and then process its data, but we use the slicing operator to allow us to reuse its underlying array when we fill it up again.

s := make([]int, 0, 1000)

// Fill the slice up and then send the items in the slice on their merry way
// down our data pipeline.
for i := 0; i < 1000; i++ {
    s = append(s, i)
}
fmt.Printf("address: %p, length: %d, capacity: %d\n", s, len(s), cap(s))
processSliceItems(s)

// Set the slice's length to 0 so we can start re-filling it. We're still using
// the same underlying array, so we can append to it and fill it up as many
// times as we want without any new allocations slowing down our code's
// performance.
s = s[:0]
fmt.Printf("address: %p, length: %d, capacity: %d\n", s, len(s), cap(s))

for i := 0; i < 1000; i++ {
    s = append(s, i)
}

When we say s = s[:0], that means that we are assigning s a slice, with the same underlying array, but with the slice's length reset. Because of this, no new memory allocations have to happen, so we'll see the same address both times we print it.

Slice before and after emptying it with the slicing operator. In the slice before emptying, length is 1000 and capacity is 0. In the slice after emptying, length is 0, capacity is 1000, and its underlying arrays still have the same items as before, with the caption "These numbers now are not in the slice, but are still in the underlying array"

Be safe with appending to slices

In addition to performance, there is one other implication a slice's underlying array has on how we use append. It's that to use slice append safely, we need to be sure we know where in the array we are appending to. If we take this sloth family's grocery list:

groceryList := []string{
    "Cecropia leaves",     // Leaves sloths eat a lot of
    "Hibiscus flowers",    // 🌺 They're like sloth chocolate!
    "Green Sheen shampoo", // How sloths groom the algae ecosystem in their fur
}

foodstuffs := groceryList[:2]
personalCare := groceryList[2:]

fmt.Println("Food grocery items:", foodstuffs)
fmt.Println("Personal care items:", personalCare)

This looks correct so far; we print out the foodstuffs slice and get our first two items, and we print out the personalCare slice and get our third item.

At the memory level we have three slices in-play, and all of them are working with the same underlying array:

An array of the strings "Cecropia leaves", "Hibiscus flowers", and "Green Sheen shampoo". There is a black bracket labeled "groceryList" below the array, a blue bracket labeled "foodstuffs" above the array's first two items, and a green bracket labeled "personalCare" above the array's third item

However, that can spell trouble if we try to then append to the foodstuffs slice. If we appended "avocados" to foodstuffs like this (ancient giant sloths helped disperse the seeds of what would evolve to be today's avocados):

foodstuffs := groceryList[0:2]
personalCare := groceryList[2:3]

foodstuffs = append(foodstuffs, "Avocados")

fmt.Println("Food grocery items:", foodstuffs)
fmt.Println("Personal care items:", personalCare)

Then even though the statements foodstuffs := groceryList[0:2] and foodstuffs = append(foodstuffs, "Avocados") look like they're making fresh arrays, we're still using the same sections of memory. This means when we append to the foodstuffs slice, what we're actually doing looks like this:

An array of the strings "Cecropia leaves", "Hibiscus flowers", and "Green Sheen shampoo", with "Green Sheen shampoo" crossed out and replaced with "avocados". Now, the green bracket labeled "personalCare" still is only over the third item, but the blue bracket labeled "foodstuffs" is over all three items in the array.

We put "Avocados" into the third slot of the underlying array, and in memory, that is also the location of first slot of the personalCare slice. So now the foodstuffs slice is []string{"Cecropia leaves", "Hibiscus flowers", "Avocados"}, but the personalCare slice is []string{"Avocados"}. We wiped "Green Sheen shampoo" out of the array!

Because of this, to append to slices safely in Go, you need to be sure you are appending to the slice containing every item we have added, in this case groceryList. This means that in general, the foodstuffs and personalCare slices should really only be read from, not appended to.

If you need a copy of the slice to append to, you can use Go's built-in copy function, like this:

// The foodstuffs slice now has a different underlying array from groceryList;
// we can now append to it without overwriting personalCare[0]!
foodstuffs := make([]string, 2)
copy(foodstuffs, groceryList[:2])
foodstuffs = append(foodstuffs, "Avocados")

fmt.Println("Food grocery items:", foodstuffs)
fmt.Println("Personal care items:", personalCare)

Now, with foodstuffs having a different underlying array, avocados can be added to the foodstuffs slice, and that won't overwrite Green Sheen shampoo on the grocery list! The sloths are ready to go to the store, and you're ready to write code with append that's both performant and safe. Until next time,

Two-toed sloth climbing a tree

STAY SLOTHFUL!

Image credit for the sloth picture to Stevenj on Wikipedia, original photographer is Leyo on Wikipedia; picture is licensed CC-BY-2.5.

The Go Gopher on the title banner was originally drawn by Renee French and is licensed CC-BY-3.0.

Also shoutout to Nicole Archambault's MetroWest freeCodeCampers meetup, which is where I drew that Catbus drawing to demonstrate slice re-allocation!

Top comments (4)

Collapse
 
begmaroman profile image
Roman Behma

The capacity doesn't always double. When it's grown to ~1024 (depends on OS) there applies another logic to increase the capacity of a slice.

Collapse
 
init profile image
Alexander Bykov

This behavior changed in go 1.18 to more smooth allocation:
starting cap growth factor
256 2.0
512 1.63
1024 1.44
2048 1.35
4096 1.30

commit:
github.com/golang/go/commit/2dda92...

Collapse
 
andyhaskell profile image
Andy Haskell

Thanks for the feedback! Will update the post with that info

Collapse
 
larrydwp profile image
larrydwp • Edited

Thanks @andyhaskell Nice write up.

One question about printing the address part:

fmt.Sprintf("%p", slice)
Enter fullscreen mode Exit fullscreen mode

does it print out the address of slice header or the underlying array?