Golang: Compact large slices after appends

The way in which arrays and slices are designed in golang it is possible to hold a chunck of unused memory when using append to append a large set of elements to slices. The builtin append function is designed in a way to strike a balance between memory usage and frequent copying required when growing the slice.
A quick primer
Arrays in golang are a fixed alocation of bytes. When you declare an array of 10 uint64 like this:
	var data [10]uint64
Golang will allocate 10 * 8 = 80 bytes of memory. Slices in golang are segments of arrays. A slice can begin from any index in the array and have a lecgth. For example here are few slices using the above declared array of 10 uint64:
	s1 := data[:]  // a slice pointing to the entire array
	s2 := data[5:] // a slice pointing to the 5th index to the last element.
Because of this design, manipulating slices are very fast and easy. The thing to always remember is that any slice in golang always refer to an underlying backing array.
The problem with append
The built in append function must before appending to the slice check that the slice has the required capacity. If the capacity is not enough, it must create a new array to hold the existing data in the slice + the new elements being appended. This process is called growing the slice. The question here is, how much to grow? If new array has only room for the existing data + the new elements, it will have to be copyed again if another apend is performed. To avoid this possibly frequent copying and allocation of new arrays, the append function reserves few extra bytes when it must grow the slice. Currently append doubles the allocation for smaller slices (ess 1024 elements) and roughly grows 25 percent extra after the 1024 elements. This might change in the future go versions.
Compacting
For the large slices (thousands or millions of elements) the problem arises the when all the appending is done. At this stage there usually is some unused space at the end of the backing array. If you are done with appending, you can compact the slice and release the backing memory to the GC. Here is a function for compacting slices after large number of appends.
func AppendCompact(r []*Record) []*Record {
	if cap(r) == len(r) {
		return r // no compacting required.
	}
	n := make([]*Record, len(r))
	copy(n, r)
	return n
}
Here we create a new backing array with just enough space for the slice length and copy the existing slice to the new array and return it. You can use this like this:
	var rec []*Record
	//populate it
	for i := 0; i < 10000; i++ {
		rec = append(rec, NewRecord())
	}

	//print the length and capacity
	fmt.Printf("Lenghth: %d, Cap: %d \n", len(rec), cap(rec))
	//compact
	rec = AppendCompact(rec)
	//print the length and capacity
	fmt.Printf("Lenghth: %d, Cap: %d\n", len(rec), cap(rec))
The downside is that if the slice is large, we will be creating a copy of it. For some time till the GC collects the old allocation, we will be using more memory.
github.com/jmoiron/sqlx and selects
Recently I've been working on a system where large sets of database rows are to be loaded in memory. I used jmoiron/sqlx package's Select method to select and store the rows in a slice. Since this package uses append internally, the slice had some unused capacity. The AppendCompact function is a good choice here to compact the slice. If possible, always allocate the slice with correct count before passing it to the Select function. Example:
	records := make([]*Record, 0, count)
Read more
© 2018 Nirandas Thavorath. All Rights Reserved.
Last updated at Thursday January 11, 2018 04:39 AM