One of the most common tasks in Python data processing is flattening: taking a nested 2D matrix (a list of lists) and turning it into a single 1D list.
The standard nested for loop is verbose. The list comprehension is more compact, but the order of its for clauses often trips people up.
[!NOTE] The Syntax Cheat Code
- Syntax: [item for outer in matrix for item in outer]
- Mnemonic: "Read it left to right, exactly like nested for loops."
Table of Contents
- The Flattening Syntax
- Visualizing the Loop Order
- Flattening with Filtering
- Flattening 3+ Levels Deep
- Alternatives: sum() vs itertools vs numpy
- Real-world Logic: API Pagination
- FAQ
- Test Your Knowledge (Quiz)
The Flattening Syntax
The Goal:
Convert [[1, 2], [3, 4]] into [1, 2, 3, 4].
The Logic:
- Loop over the main list to get each sublist.
- Loop over each sublist to get each item.
The Comprehension:
[item for sublist in matrix for item in sublist]
Wait, why is it written that way? Let's check the loop comparison.
Visualizing the Loop Order
Confused about the order? Just read it from Left to Right, exactly like nested for statements.
matrix = [ [1,2], [3,4] ]
^
|
for sublist in matrix: (Outer Loop)
|
v
[1, 2]
^
|
for item in sublist: (Inner Loop)
|
v
1, 2 ...
Standard Loops:
flattened = []
for sublist in matrix:            # Outer Loop
    for item in sublist:          # Inner Loop
        flattened.append(item)
Comprehension:
#  Expr   Outer Loop              Inner Loop
[  item   for sublist in matrix   for item in sublist  ]
Think of it as taking the nested loops, moving the appended expression to the front, and writing the for clauses on one line in the same order.
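To check that the two forms really are interchangeable, here is a minimal sketch that runs both on the same input. The ragged sample matrix and the variable names are chosen for illustration only.

```python
# Rows of different lengths work fine: the inner loop just runs
# as many times as each sublist has items.
matrix = [[1, 2, 3], [4], [5, 6]]

# Nested-loop version: outer loop first, inner loop second.
flattened_loop = []
for sublist in matrix:
    for item in sublist:
        flattened_loop.append(item)

# Comprehension version: the same two `for` clauses, in the same order.
flattened_comp = [item for sublist in matrix for item in sublist]

print(flattened_loop)  # [1, 2, 3, 4, 5, 6]
print(flattened_comp)  # [1, 2, 3, 4, 5, 6]
```

Both versions visit the items in the same order, so the results are identical.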
Flattening with Filtering
You can add if conditions at any level of the flattening process.
1. Filter the Sublist
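Skip sublists that fail a condition. Empty sublists are used here for illustration; they would contribute no items even without the filter.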
matrix = [[1, 2], [], [3, 4]]
# Only loop over sublist if it has items
[x for sublist in matrix if len(sublist) > 0 for x in sublist]
2. Filter the Items
Flatten, but keep only even numbers.
matrix = [[1, 2], [3, 4]]
[x for sublist in matrix for x in sublist if x % 2 == 0]
# Result: [2, 4]
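The two kinds of filter can also be combined in one comprehension. The sketch below uses made-up sample data and an arbitrary "at least two items" rule for the sublist filter, purely to show where each if clause goes.

```python
matrix = [[1, 2], [], [3, 4], [5, 6, 8], [10]]

# Each `if` applies to the `for` clause directly before it.
result = [
    x
    for sublist in matrix
    if len(sublist) >= 2   # sublist-level filter: skips [] and [10]
    for x in sublist
    if x % 2 == 0          # item-level filter: keeps only even items
]

print(result)  # [2, 4, 6, 8]
```

Splitting the comprehension across lines, one clause per line, keeps the pairing between each for and its if easy to see.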
Flattening 3+ Levels Deep
What if you have [[[1], [2]], [[3], [4]]]?
Technically, you just keep adding loops at the end.
# [x for dim1 in data for dim2 in dim1 for x in dim2]
Recommendation: Avoid this. It quickly becomes hard to read. For three or more levels of nesting, use a recursive function or a library helper such as deepflatten from the iteration_utilities package.
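One possible shape for such a recursive function is sketched below. The name flatten_deep is hypothetical, and treating strings and bytes as leaf values is an assumption made here, not something the comprehension syntax dictates.

```python
from collections.abc import Iterable


def flatten_deep(data):
    """Yield the leaf items of an arbitrarily nested iterable, depth-first."""
    for element in data:
        # Treat str/bytes as leaves: iterating a one-character string
        # yields that same string again, which would recurse forever.
        if isinstance(element, Iterable) and not isinstance(element, (str, bytes)):
            yield from flatten_deep(element)
        else:
            yield element


nested = [[[1], [2]], [[3], [4]]]
print(list(flatten_deep(nested)))  # [1, 2, 3, 4]
```

Because it is a generator, it works for any depth up to Python's recursion limit and only builds a list when you ask for one.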
Alternatives: sum() vs itertools vs numpy
List comprehension isn't the only way. Is it the best?
1. The sum() Hack (Avoid This)
flattened = sum(matrix, [])
This looks clever, but it has quadratic O(N^2) complexity: every addition builds a new list and copies all the items accumulated so far. It is very slow for large inputs.
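To see where the extra work comes from, the sketch below spells out roughly what sum(matrix, []) does step by step, next to a linear alternative that uses list.extend. The sample data is illustrative.

```python
matrix = [[1, 2], [3, 4], [5, 6]]

# Roughly what sum(matrix, []) does: each `+` builds a brand-new list
# and copies everything accumulated so far into it.
acc = []
for sublist in matrix:
    acc = acc + sublist  # copies len(acc) + len(sublist) items every time

assert acc == sum(matrix, [])

# A linear alternative: extend() appends in place instead of copying.
flat = []
for sublist in matrix:
    flat.extend(sublist)

print(flat)  # [1, 2, 3, 4, 5, 6]
```

The results are the same; only the amount of copying differs, and that difference grows with the number of sublists.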
2. itertools.chain (Best for Memory)
If you just need to iterate over the flattened items and don't need a list immediately:
import itertools
flattened_iter = itertools.chain.from_iterable(matrix)
# Use it in a loop (items are produced lazily; no flattened list is built)
for x in flattened_iter:
    print(x)
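One consequence of this laziness is easy to miss: the chain object is a one-shot iterator. The short sketch below, with illustrative variable names, shows items being pulled one at a time and the iterator ending up exhausted.

```python
import itertools

matrix = [[1, 2], [3, 4]]

flattened_iter = itertools.chain.from_iterable(matrix)

first = next(flattened_iter)   # items are produced one at a time
rest = list(flattened_iter)    # consumes whatever is left
again = list(flattened_iter)   # the iterator is now exhausted

print(first)  # 1
print(rest)   # [2, 3, 4]
print(again)  # []
```

If you need to walk the flattened data more than once, build a list or create a fresh chain each time.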
3. numpy ndarray.flatten() (Best for Numbers)
If you are doing data science:
import numpy as np
arr = np.array([[1, 2], [3, 4]])
flat = arr.flatten()
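This is usually the fastest option when the data is already a NumPy array. It requires rectangular data (every sublist the same length), and converting a large Python list into an array has its own cost. Note that flatten() always returns a copy, while ravel() returns a view when possible.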
Real-world Logic: API Pagination
Imagine fetching user data from an API. You get pages of users.
pages = [
    [{'id': 1}, {'id': 2}],  # Page 1
    [{'id': 3}, {'id': 4}],  # Page 2
]
# Flatten into a single list of IDs
all_ids = [u['id'] for page in pages for u in page]
# Result: [1, 2, 3, 4]
This is cleaner than writing a double loop just to extract IDs.
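The pages do not have to sit in a list first. As a rough sketch, assume a hypothetical fetch_pages generator standing in for a real API client; the comprehension can consume it directly, one page at a time.

```python
def fetch_pages():
    """Hypothetical stand-in for an API client: yields one page of users at a time."""
    yield [{'id': 1}, {'id': 2}]  # Page 1
    yield [{'id': 3}, {'id': 4}]  # Page 2
    yield []                      # An empty final page contributes nothing


# The outer loop accepts any iterable of pages, not only a list.
all_ids = [u['id'] for page in fetch_pages() for u in page]

print(all_ids)  # [1, 2, 3, 4]
```

In real code the generator would make network requests and handle errors; that part is omitted here.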
FAQ
Q1: Why does [x for x in sublist for sublist in matrix] fail?
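A: Because the for clauses are evaluated left to right. The first loop uses sublist before the second loop defines it, so Python raises a NameError (or, worse, silently picks up an unrelated sublist variable left over from earlier code). The outer loop must come first.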
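The failure is easy to reproduce. The sketch below assumes no variable named sublist already exists in the surrounding scope; it catches the error from the wrong ordering and then runs the correct one.

```python
matrix = [[1, 2], [3, 4]]

error = None
try:
    # Wrong order: the first `for` needs `sublist`, which is not bound yet.
    wrong = [x for x in sublist for sublist in matrix]
except NameError as exc:
    error = exc

print(f"Wrong order raised: {error!r}")

# Correct order: outer loop first, then the loop that depends on it.
right = [x for sublist in matrix for x in sublist]
print(right)  # [1, 2, 3, 4]
```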
Q2: Can I flatten mixed data like [1, [2, 3]]?
A: No. A plain comprehension fails with a TypeError because the integer 1 is not iterable. You would need a custom function with isinstance checks.
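For data that is mixed at only one level, a small helper with an isinstance check is enough. In the sketch below, as_list is a hypothetical helper name, and only list elements are unpacked; deeper nesting would need a recursive approach instead.

```python
def as_list(element):
    """Hypothetical helper: wrap a non-list element so it can be iterated."""
    return element if isinstance(element, list) else [element]


mixed = [1, [2, 3], 4, [5]]

# Every element now looks like a sublist, so the usual pattern applies.
flat = [x for element in mixed for x in as_list(element)]

print(flat)  # [1, 2, 3, 4, 5]
```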
Q3: Is itertools.chain faster than comprehension?
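A: For simply iterating, it avoids building the flattened list at all, so it is usually the cheaper option. For creating a new list, list(itertools.chain.from_iterable(matrix)) is roughly the same speed as a comprehension. Measure on your own data if the difference matters.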
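Timings depend on the Python version, the machine, and the shape of the data, so it is worth measuring rather than guessing. Here is a minimal sketch using the standard timeit module; the matrix size and the run count are arbitrary choices.

```python
import itertools
import timeit

# 1,000 sublists of 100 integers each (size chosen arbitrarily).
matrix = [list(range(100)) for _ in range(1_000)]


def with_comprehension():
    return [item for sublist in matrix for item in sublist]


def with_chain():
    return list(itertools.chain.from_iterable(matrix))


# Both approaches must produce the same list before timing means anything.
assert with_comprehension() == with_chain()

for fn in (with_comprehension, with_chain):
    seconds = timeit.timeit(fn, number=50)
    print(f"{fn.__name__}: {seconds:.4f} s for 50 runs")
```

No expected numbers are shown because they vary from run to run; compare the two printed values on your own setup.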
Test Your Knowledge
Conclusion
Flattening is a tool you will reach for constantly. The syntax seems backwards until you realize it is just a "one-line nested loop".
Next Steps:
- Learn what not to do: Best Practices
- Go Deeper: Advanced List Comprehensions