Did you know that Python has thousands of external libraries, but some of the most powerful tools are already included in the standard library?

This series, ‘Deep Dive into the Python Standard Library’, explores standard library modules that are widely used but rarely discussed in depth, one at a time.

The goal is not to list functions, but to understand the concepts behind them through practical examples and take your Python to the next level.


In-depth Usage of collections: From Basics to Practical Applications

1. Why start with collections?

The collections module provides specialized container types that complement Python's built-in data types (list, dict, tuple) in both performance and structure. They come up often in practice but are rarely discussed in depth.

I will focus on the five most practical classes, starting with Counter in this article, and explain ‘why to use them’, ‘how to use them’, and ‘when they are beneficial’.


2. Counter – The Definitive Tool for Counting Frequencies and Beyond

Count all the things!

Basic Concept

collections.Counter is one of the most useful classes in the standard library's collections module. As the name implies, it is a counter: a dict subclass specialized for counting occurrences (frequencies) of hashable items.

from collections import Counter

c = Counter(['a', 'b', 'c', 'a', 'b', 'a'])
print(c)  # Counter({'a': 3, 'b': 2, 'c': 1})

It counts how many times each element appears in any iterable you pass in, such as a list, string, or tuple. A dictionary is handled differently: its values are taken as the counts for its keys.


Main Features and Methods

📌 Various Initialization Methods

Counter can be initialized from an iterable, from a mapping of elements to counts, or from keyword arguments.

from collections import Counter

print(Counter(['a', 'b', 'a']))
# Counter({'a': 2, 'b': 1}) → List

print(Counter({'a': 2, 'b': 1}))
# Counter({'a': 2, 'b': 1}) → Dictionary

print(Counter(a=2, b=1))
# Counter({'a': 2, 'b': 1}) → Keyword arguments

📌 Element Access

Counter behaves like a dict, but when accessing a non-existent key, it returns 0 instead of raising a KeyError.

c = Counter('hello')
print(c['l'])  # 2 (The character 'l' appears twice)
print(c['x'])  # 0 ('x' does not appear, but returns 0 instead of an error)
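
One detail that follows from this behavior: reading a missing key returns 0 without adding the key to the counter. The sketch below illustrates this and contrasts it with a plain dict; the variable names (missing, plain) are only illustrative.

```python
from collections import Counter

c = Counter('hello')

# Reading a missing key returns 0 and does not insert the key.
missing = c['x']
print(missing)     # 0
print('x' in c)    # False
print(len(c))      # 4 distinct characters: h, e, l, o

# A plain dict raises KeyError for the same lookup.
plain = dict(c)
try:
    plain['x']
except KeyError:
    print('plain dict raised KeyError')
```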

📌 Adding/Modifying Elements

You can increment or assign counts directly. Assigning to a key that does not exist yet adds it to the counter.

c = Counter('hello')
c['l'] += 3
print(c)
# Counter({'l': 5, 'h': 1, 'e': 1, 'o': 1})
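
As a small sketch of the same idea, the snippet below increments a key that is not in the counter yet and then adds several counts at once with update(). The input strings are arbitrary sample data.

```python
from collections import Counter

c = Counter('hello')

# 'z' is not in the counter yet; += starts from the implicit 0.
c['z'] += 1

# update() adds counts from another iterable (or mapping) in bulk.
c.update('lol')  # adds 2 to 'l' and 1 to 'o'

print(c['z'])  # 1
print(c['l'])  # 4
print(c['o'])  # 2
```

Unlike dict.update(), which replaces values, Counter.update() adds to the existing counts.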

📌 most_common(n) – Extracting Most Frequent Elements

Returns a list of (element, count) tuples for the n most common elements, from most to least frequent. Elements with equal counts are ordered by first appearance.

c = Counter('banana')
print(c.most_common(2))
# [('a', 3), ('n', 2)] → 'a' appears 3 times, 'n' appears 2 times
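
Two related usages, sketched here on the same 'banana' input: calling most_common() with no argument returns every element, and a reverse slice of that list gives the least common elements. The variable n is just an illustrative parameter.

```python
from collections import Counter

c = Counter('banana')

# With no argument, every element is returned, most frequent first.
everything = c.most_common()
print(everything)  # [('a', 3), ('n', 2), ('b', 1)]

# The n least common elements: take a reverse slice of the full list.
n = 2
least = c.most_common()[:-n - 1:-1]
print(least)       # [('b', 1), ('n', 2)]
```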

📌 elements() – Iterating Over Elements by Count

Returns an iterator that yields each element as many times as its count.

c = Counter('banana')
print(list(c.elements()))
# ['b', 'a', 'a', 'a', 'n', 'n']

Elements whose count is zero or negative are skipped.
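
A minimal sketch of that exclusion rule, using a counter built with made-up zero and negative counts:

```python
from collections import Counter

counts = Counter(a=2, b=0, c=-1)

# Only 'a' has a positive count, so only 'a' is yielded.
expanded = list(counts.elements())
print(expanded)        # ['a', 'a']

# The zero and negative entries are still stored in the counter itself.
print(sorted(counts))  # ['a', 'b', 'c']
```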


📌 Support for Mathematical Operations (Counter operations using +, -, &, |)

One of Counter's strengths is that it supports arithmetic and multiset (set-like) operations.

c1 = Counter(a=3, b=1)
c2 = Counter(a=1, b=2)

print(c1 + c2)
# Counter({'a': 4, 'b': 3}) → Same keys are combined

print(c1 - c2)
# Counter({'a': 2}) → Counts are subtracted; 'b' would be -1, and results of zero or less are dropped

print(c1 & c2)
# Counter({'a': 1, 'b': 1}) → Intersection, based on minimum value

print(c1 | c2)
# Counter({'a': 3, 'b': 2}) → Union, based on maximum value
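
One way these operators can be put to work is a "do I have enough of each item?" check. The sketch below relies on subtraction dropping zero and negative results; can_build is a hypothetical helper name, not part of the collections module.

```python
from collections import Counter

def can_build(word, letters):
    """Return True if `word` can be spelled using the characters in `letters`."""
    need = Counter(word)
    have = Counter(letters)
    # Subtraction drops zero and negative results, so anything left in
    # `need - have` is a character we are short of.
    return not (need - have)

print(can_build('banana', 'aaabnnxyz'))  # True
print(can_build('banana', 'aabnn'))      # False (only two 'a's available)
```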

Practical Examples

📌 Analyzing Word Frequencies in Strings

text = "the quick brown fox jumps over the lazy dog"
counter = Counter(text.split())
print(counter)
# Counter({'the': 2, 'quick': 1, 'brown': 1, 'fox': 1, 'jumps': 1, 'over': 1, 'lazy': 1, 'dog': 1})

📌 Log Frequency Analysis

logs = ['INFO', 'ERROR', 'INFO', 'DEBUG', 'ERROR', 'ERROR']
print(Counter(logs))  # Counter({'ERROR': 3, 'INFO': 2, 'DEBUG': 1})

📌 Counting Duplicate Elements in a List

nums = [1, 2, 2, 3, 3, 3]
print(Counter(nums))  # Counter({3: 3, 2: 2, 1: 1})
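
Real text usually has mixed case and punctuation, which a plain split() would count as separate words ("The", "the", "dog!"). The sketch below shows one possible normalization step before counting; the sample sentence and the regular expression are assumptions chosen for illustration, not a general-purpose tokenizer.

```python
import re
from collections import Counter

text = "The quick brown fox. The lazy dog! the end?"

# Lowercase the text, then keep runs of letters and apostrophes as words.
words = re.findall(r"[a-z']+", text.lower())

top = Counter(words).most_common(1)
print(top)  # [('the', 3)]
```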

Important Points to Note

  • Counter is a dict subclass, so since Python 3.7 it keeps keys in insertion order, not frequency order. If you need frequency order, use most_common().
  • With subtract() or direct assignment, items are not removed when their counts drop to 0 or below, so you may need to filter them manually.

c = Counter(a=3)
c.subtract({'a': 5})
print(c)  # Counter({'a': -2}) → the item stays even though its count is below 0
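
Two ways to do that manual filtering, sketched on made-up counts: unary plus returns a new Counter containing only the positive entries, and a dict comprehension does the same with the condition spelled out.

```python
from collections import Counter

c = Counter(a=3, b=1, d=2)
c.subtract({'a': 5, 'b': 1})
print(c)         # Counter({'d': 2, 'b': 0, 'a': -2})

# Unary plus returns a new Counter with only the positive counts.
positive = +c
print(positive)  # Counter({'d': 2})

# An explicit filter gives the same result and makes the condition visible.
filtered = Counter({key: n for key, n in c.items() if n > 0})
print(filtered)  # Counter({'d': 2})
```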

Tip: Starting Empty and Accumulating with update()

counter = Counter()
with open("data.txt") as f:
    for line in f:
        counter.update(line.strip().split())
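
To try the same pattern without a data.txt on disk, the sketch below substitutes io.StringIO for the open file. The sample lines are invented for illustration.

```python
import io
from collections import Counter

# io.StringIO stands in for an open text file so the sketch runs as-is.
fake_file = io.StringIO("red green\nblue red\n\nred\n")

counter = Counter()
for line in fake_file:
    # split() with no arguments ignores surrounding whitespace,
    # so blank lines simply contribute nothing.
    counter.update(line.split())

print(counter.most_common(1))  # [('red', 3)]
print(counter['green'])        # 1
```

Because the counter is updated one line at a time, the whole file never has to be held in memory.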

Conclusion

collections.Counter is a powerful tool for data analysis, log processing, and text mining. For beginners it is an easy way to count frequencies; for experienced users, its arithmetic and set operations make it a compact tool for more advanced processing.


Next Episode Preview

  • defaultdict – A world without KeyErrors, more flexible than dict! Stay tuned for the next episode!

Understanding the standard library thoroughly and using it properly will improve the quality of your code.