Python zip function tutorial (Simple Examples)

The zip() function is a built-in Python function that takes any number of iterables (such as lists, tuples, or strings) and returns an iterator that aggregates their elements in parallel.

This process is known as ‘zipping’, a name that comes from the idea of zipping two separate collections of items together.

 

 

Python zip() syntax and usage

The basic syntax of the zip() function in Python is as follows:

zip(*iterables, strict=False)

Here, ‘*iterables’ stands for zero or more iterable objects. The call returns an iterator of tuples: the first tuple holds the first item from each iterable, the second tuple holds the second item from each, and so on.

If the iterables have different lengths, the one with the fewest items determines the length of the result.

Here’s an example of its basic usage:

fruits = ["apple", "banana", "cherry"]
numbers = [1, 2, 3]
result = zip(fruits, numbers)

# convert result to list
print(list(result))

Output:

[('apple', 1), ('banana', 2), ('cherry', 3)]

In this example, we have two lists, fruits and numbers. We pass them as arguments to the zip() function, which returns a zip object.

The zip object is an iterator of tuples: the first items of the two lists are paired together, then the second items, and so on.

Finally, we convert the zip object to a list of tuples.
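Because the zip object is an iterator, it can be consumed only once. The sketch below reuses the fruits and numbers lists to show that a second pass yields nothing; the names first_pass and second_pass are illustrative.

```python
fruits = ["apple", "banana", "cherry"]
numbers = [1, 2, 3]
result = zip(fruits, numbers)

first_pass = list(result)   # consumes every pair
second_pass = list(result)  # the iterator is now exhausted

print(first_pass)   # [('apple', 1), ('banana', 2), ('cherry', 3)]
print(second_pass)  # []
```

If you need the pairs more than once, keep the list or call zip() again.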

Using strict keyword

In Python 3.10 and above, the zip() function accepts an additional strict keyword argument.

If strict is True, zip() will check that all input iterables have the same length and will raise a ValueError if the lengths do not match.

Here’s an example of zip() with strict=True:

numbers = [1, 2, 3]
letters = ['a', 'b']
try:
    zipped = zip(numbers, letters, strict=True)    
    zipped_list = list(zipped)
    print(zipped_list)
except ValueError as ve:
    print(f"Caught an exception: {ve}")

Output:

Caught an exception: zip() argument 2 is shorter than argument 1

In this code, the zip() function raises a ValueError because the numbers list has three items, while the letters list has only two, and strict is set to True.

 

zip() with two lists

The zip() function is commonly used with two lists, but can be used with more than two as well. Here’s an example of using zip() with two lists:

fruits = ["Apple", "Banana", "Cherry"]
colors = ["Red", "Yellow", "Red"]
zipped = zip(fruits, colors)
print(list(zipped))

Output:

[('Apple', 'Red'), ('Banana', 'Yellow'), ('Cherry', 'Red')]

In this example, we define two lists, fruits and colors, and use the zip() function to combine them into pairs.

The zip() function returns a zip object which is an iterator that contains tuples. Each tuple consists of elements from the same index in the input lists.

The first element in each tuple is from the first list, and the second element is from the second list. Finally, we convert the zip object to a list to make it easier to read.
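In practice, the pairs are often consumed directly rather than printed as a list. Here is a small sketch of two common patterns: unpacking each pair in a for loop, and building a dictionary with dict(). The name fruit_colors is illustrative.

```python
fruits = ["Apple", "Banana", "Cherry"]
colors = ["Red", "Yellow", "Red"]

# Unpack each pair directly in the for loop
for fruit, color in zip(fruits, colors):
    print(f"{fruit} is {color}")

# Build a dictionary that maps each fruit to its color
fruit_colors = dict(zip(fruits, colors))
print(fruit_colors)  # {'Apple': 'Red', 'Banana': 'Yellow', 'Cherry': 'Red'}
```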

 

zip() with more than two lists

The zip() function can take any number of iterables. Each resulting tuple contains one element from every iterable, so zipping three lists produces tuples of three elements.

fruits = ["Apple", "Banana", "Cherry"]
colors = ["Red", "Yellow", "Red"]
weights = [120, 150, 50]
result = zip(fruits, colors, weights)
print(list(result))

Output:

[('Apple', 'Red', 120), ('Banana', 'Yellow', 150), ('Cherry', 'Red', 50)]

In this example, we’ve used the zip() function to pair elements from three different lists – fruits, colors, and weights.

The zip() function iterates over these lists in parallel, taking one element from each list and grouping them into a tuple.

This is repeated until the shortest list is exhausted – in this case, all the lists are of equal length, so all their elements are paired up.
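One possible use of a three-way zip is turning parallel lists into a list of records. The sketch below is only an illustration; the dictionary keys and the name records are assumptions made for this example.

```python
fruits = ["Apple", "Banana", "Cherry"]
colors = ["Red", "Yellow", "Red"]
weights = [120, 150, 50]

# Unpack three values per iteration and build one dictionary per fruit
records = [
    {"fruit": fruit, "color": color, "weight": weight}
    for fruit, color, weight in zip(fruits, colors, weights)
]

print(records[0])    # {'fruit': 'Apple', 'color': 'Red', 'weight': 120}
print(len(records))  # 3
```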

 

zip() with tuples

The zip() function can also be used with tuples. The function pairs up the elements of each tuple at the same index, just like it does with lists.

tuple1 = ("Apple", "Banana", "Cherry")
tuple2 = ("Red", "Yellow", "Red")
result = zip(tuple1, tuple2)
print(list(result))

Output:

[('Apple', 'Red'), ('Banana', 'Yellow'), ('Cherry', 'Red')]

In this example, we defined two tuples, tuple1 and tuple2. We then used the zip() function to create a zip object that pairs the corresponding elements from each tuple together, resulting in a list of tuples.

 

zip() with sets

Python’s zip() function can also be used with sets. However, sets are unordered collections of unique elements, so which elements get paired can vary between runs, and duplicate values are dropped before zipping.

set1 = {"Apple", "Banana", "Cherry"}
set2 = {"Red", "Yellow", "Red"}
result = zip(set1, set2)
print(list(result))

Output (could vary due to the unordered nature of sets):

[('Cherry', 'Red'), ('Banana', 'Yellow')]

In this example, we defined two sets, set1 and set2. Because a set stores each value only once, set2 contains just two elements, "Red" and "Yellow", so zip() stops after two pairs. Sets have no positions, so which elements end up paired is arbitrary.
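If a reproducible pairing is needed, one option is to sort each set before zipping. This is a sketch only: set2 here uses three distinct colors (an assumption for this example), and the alphabetical pairing is deterministic but not otherwise meaningful.

```python
set1 = {"Apple", "Banana", "Cherry"}
set2 = {"Red", "Yellow", "Green"}  # hypothetical set with three distinct values

# sorted() returns a list, which gives each element a stable position
pairs = list(zip(sorted(set1), sorted(set2)))
print(pairs)  # [('Apple', 'Green'), ('Banana', 'Red'), ('Cherry', 'Yellow')]
```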

 

zip() with dictionaries

When the zip() function is used with dictionaries, it will iterate over the keys by default:

dict1 = {"name": "Alice", "age": 25, "country": "USA"}
dict2 = {"name": "Bob", "age": 30, "country": "Canada"}
result = zip(dict1, dict2)
print(list(result))

Output:

[('name', 'name'), ('age', 'age'), ('country', 'country')]

In this example, we defined two dictionaries, dict1 and dict2. We then used the zip() function to create a zip object that pairs the corresponding keys from each dictionary together, resulting in a list of tuples.

If you want to zip the values of the dictionaries instead, call the values() method on each:

result_values = zip(dict1.values(), dict2.values())
print(list(result_values))

Output:

[('Alice', 'Bob'), (25, 30), ('USA', 'Canada')]

Here, zip() is used on the values of dict1 and dict2, so the resulting list of tuples contains paired values from both dictionaries.
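Zipping values() pairs entries by position, which matches by field only when both dictionaries were built with the same key order. The sketch below shows what can happen otherwise, along with a key-based alternative; dict2 is deliberately reordered here, and the names by_position and by_key are illustrative.

```python
dict1 = {"name": "Alice", "age": 25, "country": "USA"}
dict2 = {"country": "Canada", "name": "Bob", "age": 30}  # different key order

# Zipping values() pairs by position, so mismatched insertion order mixes fields
by_position = list(zip(dict1.values(), dict2.values()))
print(by_position)  # [('Alice', 'Canada'), (25, 'Bob'), ('USA', 30)]

# Looking up each key in both dictionaries pairs the values reliably
by_key = {key: (dict1[key], dict2[key]) for key in dict1}
print(by_key)
# {'name': ('Alice', 'Bob'), 'age': (25, 30), 'country': ('USA', 'Canada')}
```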

 

zip() with strings

Each string is an iterable of characters, so zip() will pair the characters at the same indexes in the strings.

str1 = "ABC"
str2 = "123"
result = zip(str1, str2)
print(list(result))

Output:

[('A', '1'), ('B', '2'), ('C', '3')]

In this example, we defined two strings, str1 and str2. We then used the zip() function to create a zip object that pairs the corresponding characters from each string together.
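A character-by-character pairing is handy for comparing strings. As a small sketch, the following counts the positions at which two equal-length strings differ; the sample strings and the name differences are chosen only for illustration.

```python
str1 = "karolin"
str2 = "kathrin"

# Compare the characters at each index and count the mismatches
differences = sum(1 for a, b in zip(str1, str2) if a != b)
print(differences)  # 3
```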

 

Zipping iterables with different lengths

When the zip() function is used with iterables of different lengths, it stops creating tuples as soon as the shortest iterable is exhausted. Let’s see this in action:

# Define three lists of different lengths
list1 = [1, 2, 3, 4, 5]
list2 = ["a", "b", "c"]
list3 = [1.1, 2.2, 3.3, 4.4]

# Use zip function
result = zip(list1, list2, list3)

# Convert result to list
print(list(result))

Output:

[(1, 'a', 1.1), (2, 'b', 2.2), (3, 'c', 3.3)]

When we use the zip() function on these three lists, it stops creating tuples after the third element because list2 (the shortest list) only has three elements. So the resulting list of tuples only contains three tuples.
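Stopping at the shortest iterable also makes it safe to zip a finite list with an infinite iterator. The sketch below uses itertools.count() as the infinite side; the names list and the variable numbered are illustrative.

```python
import itertools

names = ["a", "b", "c"]

# itertools.count(1) never ends, but zip() stops when names is exhausted
numbered = list(zip(itertools.count(1), names))
print(numbered)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```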

 

 

zip() alternative (zip_longest())

As we saw above, when working with iterables of uneven length, the shortest iterable decides the length of the result. But what if you don’t want that?

This is where the zip_longest() function from the itertools module comes in.

It works similarly to zip(), but instead of stopping the iteration when the shortest iterable is exhausted, it fills in None for the remaining values of the longer iterables.

You can also provide a different fill value using the fillvalue parameter.

import itertools
list1 = [1, 2, 3, 4, 5]
list2 = ["a", "b", "c"]
result = itertools.zip_longest(list1, list2)
print(list(result))

Output:

[(1, 'a'), (2, 'b'), (3, 'c'), (4, None), (5, None)]

When we use the itertools.zip_longest() function on these lists, it continues creating tuples after the third element by filling in None for the remaining values of list2.
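The fillvalue parameter mentioned earlier can be sketched with the same two lists. The placeholder "?" is an arbitrary choice for this example.

```python
import itertools

list1 = [1, 2, 3, 4, 5]
list2 = ["a", "b", "c"]

# Missing values from the shorter list are replaced with "?" instead of None
result = list(itertools.zip_longest(list1, list2, fillvalue="?"))
print(result)  # [(1, 'a'), (2, 'b'), (3, 'c'), (4, '?'), (5, '?')]
```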

 

zip() with the * operator (Unzip)

Python’s zip() function can be used in conjunction with the asterisk ‘*’ operator to unpack iterables. This is often used to “unzip” a list of tuples.

pairs = [("a", 1), ("b", 2), ("c", 3)]
letters, numbers = zip(*pairs)
print(letters)
print(numbers)

Output:

('a', 'b', 'c')
(1, 2, 3)

In this example, we use the zip() function with the * operator to unpack the tuples in pairs, resulting in two tuples: letters, which contains all the first elements from the tuples, and numbers, which contains all the second elements from the tuples.
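The same unpacking idea can be used to transpose rows and columns. Here is a minimal sketch with a small hypothetical matrix stored as a list of lists:

```python
matrix = [
    [1, 2, 3],
    [4, 5, 6],
]

# zip(*matrix) passes each row as a separate argument, so each tuple is a column
transposed = [list(column) for column in zip(*matrix)]
print(transposed)  # [[1, 4], [2, 5], [3, 6]]
```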

 

Common errors when using zip()

Errors can occur when using the zip() function if you do not provide the correct arguments, or attempt to use the function in ways it is not intended. Here are a few common errors:

TypeError: zip argument must support iteration

This error occurs when you try to use zip() with a non-iterable argument.

number = 5
list(zip(number))

This raises a TypeError. Recent Python versions report 'int' object is not iterable; older versions word it as zip argument #1 must support iteration.

Only use zip() with iterable arguments, such as lists, tuples, sets, dictionaries, and strings.

TypeError: ‘zip’ object is not subscriptable

This error occurs when you try to index a zip object.

list1 = [1, 2, 3]
list2 = ["a", "b", "c"]
zipped = zip(list1, list2)
print(zipped[0])

This will raise an error: TypeError: 'zip' object is not subscriptable

Zip objects are iterators, and they do not support indexing. To index the pairs, first convert the zip object into a list or another subscriptable type like this:

list1 = [1, 2, 3]
list2 = ["a", "b", "c"]
zipped = zip(list1, list2)
print(list(zipped)[0])
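If only one pair is needed, building the whole list may be unnecessary. As a sketch, next() returns the first pair, and itertools.islice() can skip ahead to a later one; the names first_pair and third_pair are illustrative.

```python
import itertools

list1 = [1, 2, 3]
list2 = ["a", "b", "c"]

# next() pulls just the first pair from the iterator
first_pair = next(zip(list1, list2))
print(first_pair)  # (1, 'a')

# islice skips ahead without building the full list; index 2 is the third pair
third_pair = next(itertools.islice(zip(list1, list2), 2, None))
print(third_pair)  # (3, 'c')
```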

 

Memory optimization with zip()

One of the advantages of the zip() function is its efficiency in terms of memory usage. This is because zip() returns an iterator, not a list or another sequence.

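The iterator produces the pairs one at a time, only as they are requested, instead of storing all the pairs in memory at once.

This can make a significant difference when working with large datasets.

To illustrate, we will use sys.getsizeof() to compare the size of two large lists with the size of a zip object built from them. Note that sys.getsizeof() reports only the size of the container itself, not of the objects it refers to.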

import sys

# Two large lists of numbers
list1 = list(range(1, 10000001))  # numbers from 1 to 10 million
list2 = list(range(10000001, 20000001))  # numbers from 10 million + 1 to 20 million

list_memory = sys.getsizeof(list1) + sys.getsizeof(list2)
zipped = zip(list1, list2)
zip_memory = sys.getsizeof(zipped)

print(f"Memory used by lists: {list_memory} bytes")
print(f"Memory used by zip object: {zip_memory} bytes")

Output:

Memory used by lists: 160000112 bytes
Memory used by zip object: 64 bytes

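The zip object itself occupies only a few dozen bytes because it holds references to the two lists rather than copies of their elements; the exact figures depend on the Python version and platform. The saving comes from not building a third list of pairs, since the input lists still exist in memory.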
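A more direct comparison is between a lazy zip object and the list you would get by materializing all of its pairs. The sketch below uses a smaller size n (an arbitrary choice so that it runs quickly) and range objects instead of lists; the exact byte counts will differ between Python versions.

```python
import sys

n = 100_000  # smaller than the 10 million above so the sketch runs quickly

lazy_pairs = zip(range(n), range(n, 2 * n))
materialized_pairs = list(zip(range(n), range(n, 2 * n)))

lazy_size = sys.getsizeof(lazy_pairs)
# Shallow size only: the tuples inside the list are not counted
materialized_size = sys.getsizeof(materialized_pairs)

print(f"zip object: {lazy_size} bytes")
print(f"list of pairs: {materialized_size} bytes")

# The lazy version can still be consumed in a single pass
total = sum(a + b for a, b in lazy_pairs)
print(total)
```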

 

zip() with lambda functions

Let’s look at an example where we pair elements from two lists, and then use a lambda function to create a new list where each element is the sum of the elements in each pair.

list1 = [1, 2, 3, 4, 5]
list2 = [6, 7, 8, 9, 10]
pairs = zip(list1, list2)

# Use a lambda function to sum each pair
sums = list(map(lambda pair: pair[0] + pair[1], pairs))
print(sums)

Output:

[7, 9, 11, 13, 15]

In this example, we first used zip(list1, list2) to create a zip object that pairs together elements from list1 and list2.

Then, we used the map() function with a lambda function lambda pair: pair[0] + pair[1] to create a new list sums.

This lambda function takes a pair of numbers and returns their sum.

This way, zip() pairs elements from multiple iterables, and a lambda function lets you process those pairs compactly.
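The same sums can be written without a lambda. Two alternatives are sketched here: a list comprehension that unpacks each pair, and map() with operator.add, which accepts several iterables directly. The variable names are illustrative.

```python
from operator import add

list1 = [1, 2, 3, 4, 5]
list2 = [6, 7, 8, 9, 10]

# Unpack each pair into a and b inside a list comprehension
sums_comprehension = [a + b for a, b in zip(list1, list2)]
print(sums_comprehension)  # [7, 9, 11, 13, 15]

# map() takes one element from each iterable per call, so no zip() is needed
sums_map = list(map(add, list1, list2))
print(sums_map)  # [7, 9, 11, 13, 15]
```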

 

Nested zip()

You can nest zip() calls to combine iterables in more complex ways. Consider two lists of tuples where you want to pair the columns of the first list with the columns of the second: all the first elements together, then all the second elements together.
Here is the code:

# Two lists of tuples
list1 = [(1, 2), (3, 4), (5, 6)]
list2 = [('a', 'b'), ('c', 'd'), ('e', 'f')]

# Transpose each list with zip(*...), then pair the resulting columns
pairs = list(zip(*[zip(*list1), zip(*list2)]))
print(pairs)

Output:

[((1, 3, 5), ('a', 'c', 'e')), ((2, 4, 6), ('b', 'd', 'f'))]

In this example, zip(*list1) and zip(*list2) first unpack the tuples in list1 and list2 into separate tuples of the first elements and the second elements.

Then, zip(*[zip(*list1), zip(*list2)]) pairs the corresponding tuples from those two results, so each output pair combines one column of list1 with the matching column of list2. The outer call is equivalent to zip(zip(*list1), zip(*list2)).

If you only need to pair the original tuples, a plain zip(list1, list2) is enough and yields ((1, 2), ('a', 'b')), ((3, 4), ('c', 'd')), and ((5, 6), ('e', 'f')).

Note that nested zip() calls can become hard to read and understand if overused or nested too deeply.
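Breaking a nested expression into named steps makes each stage visible. In the sketch below, columns1, columns2, paired_columns, and paired_rows are illustrative names; the comments show what each step produces.

```python
list1 = [(1, 2), (3, 4), (5, 6)]
list2 = [('a', 'b'), ('c', 'd'), ('e', 'f')]

# Step 1: transpose each list into its columns
columns1 = list(zip(*list1))
columns2 = list(zip(*list2))
print(columns1)  # [(1, 3, 5), (2, 4, 6)]
print(columns2)  # [('a', 'c', 'e'), ('b', 'd', 'f')]

# Step 2: pair the columns of the two lists
paired_columns = list(zip(columns1, columns2))
print(paired_columns)  # [((1, 3, 5), ('a', 'c', 'e')), ((2, 4, 6), ('b', 'd', 'f'))]

# For comparison, a single zip() pairs the original tuples row by row
paired_rows = list(zip(list1, list2))
print(paired_rows)  # [((1, 2), ('a', 'b')), ((3, 4), ('c', 'd')), ((5, 6), ('e', 'f'))]
```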

 

Real-world examples of using zip()

To better appreciate the power and utility of the zip() function, let’s explore a couple of real-world examples where zip() might come in handy.

Pairing lines from two files

Suppose you have two text files, file1.txt and file2.txt, and you want to pair together the corresponding lines from each file.

with open('file1.txt', 'r') as file1, open('file2.txt', 'r') as file2:
    # Use zip to pair lines from the two files
    for line1, line2 in zip(file1, file2):
        print(line1.strip(), line2.strip())

In this example, zip(file1, file2) pairs together the corresponding lines from file1 and file2. The loop then prints each pair of lines.
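One thing to keep in mind is that if the files have different numbers of lines, zip() silently drops the extra lines of the longer file. The sketch below uses io.StringIO objects as stand-ins for open files (an assumption so that the example runs without files on disk); the sample contents are made up.

```python
import io

# io.StringIO objects stand in for the two open files
file1 = io.StringIO("alpha\nbeta\ngamma\n")
file2 = io.StringIO("one\ntwo\n")

# The third line of file1 has no partner, so it never appears in the result
paired_lines = [
    (line1.strip(), line2.strip()) for line1, line2 in zip(file1, file2)
]
print(paired_lines)  # [('alpha', 'one'), ('beta', 'two')]
```

If the extra lines matter, itertools.zip_longest() keeps them, and strict=True (Python 3.10 and above) reports the mismatch as an error.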

Feature and Label Separation

In machine learning, you often work with datasets where each instance is represented as a tuple (or list) of features and a label.

For preprocessing, you might need to separate the features and labels. The zip() function is perfect for this.

dataset = [
    ((1.1, 2.2, 3.3), 0),
    ((4.4, 5.5, 6.6), 1),
    ((7.7, 8.8, 9.9), 0),
]

# Use zip to separate the features and labels
features, labels = zip(*dataset)

print("Features:")
for feature in features:
    print(feature)

print("\nLabels:")
for label in labels:
    print(label)

Output:

Features:
(1.1, 2.2, 3.3)
(4.4, 5.5, 6.6)
(7.7, 8.8, 9.9)

Labels:
0
1
0

In this example, zip(*dataset) separates the tuples in dataset into separate tuples of features and labels.
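Since unzipping and zipping are inverse operations, the separated features and labels can be recombined into the original structure. A short sketch using the same dataset follows; rebuilt, feature_list, and label_list are illustrative names.

```python
dataset = [
    ((1.1, 2.2, 3.3), 0),
    ((4.4, 5.5, 6.6), 1),
    ((7.7, 8.8, 9.9), 0),
]

features, labels = zip(*dataset)

# Zipping the two tuples back together restores the original pairs
rebuilt = list(zip(features, labels))
print(rebuilt == dataset)  # True

# zip(*...) yields tuples; convert to lists if mutable containers are needed
feature_list = [list(row) for row in features]
label_list = list(labels)
print(label_list)  # [0, 1, 0]
```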

 

Resources

https://docs.python.org/3/library/functions.html#zip
