Mastering the Python Standard Library ① - collections.Counter

Did you know that although Python has thousands of third-party libraries, some of its most powerful tools are already included in the standard library?

This series, themed ‘Deep Dive into the Python Standard Library’, explores, one by one, standard library modules that are widely used but rarely discussed in depth.

The goal is not just to list functions, but to understand the concepts through practical examples and take your Python skills to the next level.

In-depth Usage of collections: From Basics to Practical Applications

1. Why start with `collections`?

The collections module provides specialized container types that complement Python's built-in ones (list, dict, tuple) in both performance and structure. They come up often in real-world code but are rarely discussed in depth.

In this article, I will focus on the five most practical classes and explain why to use them, how to use them, and when they pay off.

2. `Counter` – The Definitive Tool for Counting Frequencies and Beyond

Basic Concept

collections.Counter is one of the most useful classes in the standard library's collections module. As the name implies, it is a counter: a dict subclass specialized for counting occurrences (frequencies) of hashable items.

from collections import Counter

c = Counter(['a', 'b', 'c', 'a', 'b', 'a'])
print(c)  # Counter({'a': 3, 'b': 2, 'c': 1})

Pass it an iterable such as a list, string, or tuple, and it counts how many times each element appears. If you pass a dictionary (or another mapping) instead, its values are taken as the counts rather than being counted.
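As a small sketch of how different inputs are treated (the sample data and variable names are only illustrative), strings and tuples are counted element by element, while a mapping supplies the counts directly:

```python
from collections import Counter

# A string is iterated character by character.
from_string = Counter("mississippi")

# A tuple (or any iterable of hashable items) is counted element by element.
from_tuple = Counter((1, 1, 2, 3, 3, 3))

# A mapping is NOT counted: its values are used as the initial counts.
from_mapping = Counter({"apple": 4, "pear": 1})

print(from_string["s"])       # 4
print(from_tuple[3])          # 3
print(from_mapping["apple"])  # 4
```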

Main Features and Methods

📌 Various Initialization Methods

Counter can be initialized in various ways, allowing flexible data analysis.

from collections import Counter

print(Counter(['a', 'b', 'a']))
# Counter({'a': 2, 'b': 1}) → List

print(Counter({'a': 2, 'b': 1}))
# Counter({'a': 2, 'b': 1}) → Dictionary

print(Counter(a=2, b=1))
# Counter({'a': 2, 'b': 1}) → Keyword arguments

📌 Element Access

Counter behaves like a dict, but when accessing a non-existent key, it returns 0 instead of raising a KeyError.

c = Counter('hello')
print(c['l'])  # 2 (The character 'l' appears twice)
print(c['x'])  # 0 ('x' does not appear, but returns 0 instead of an error)
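One detail worth seeing in code: looking up a missing key returns 0 but does not store the key, and the inherited .get() method still behaves like the plain dict version. A minimal sketch:

```python
from collections import Counter

c = Counter("hello")

# Looking up a missing key returns 0 ...
missing = c["x"]

# ... but the lookup does not add the key to the Counter.
has_x = "x" in c

# .get() is the ordinary dict method and returns None by default.
via_get = c.get("x")

print(missing, has_x, via_get)  # 0 False None
```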

📌 Adding/Modifying Elements

You can increment or assign counts directly. Assigning to a key that does not exist yet adds it automatically.

c = Counter('hello')
c['l'] += 3
print(c)
# Counter({'l': 5, 'h': 1, 'e': 1, 'o': 1}) → on Python 3.7+, ties are shown in insertion order
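The following sketch shows the three ways an entry can change: incrementing a key that is not there yet, setting a count to 0, and deleting the entry outright. Only del actually removes the key.

```python
from collections import Counter

c = Counter("hello")

c["z"] += 1  # 'z' was absent: 0 + 1 is stored under the new key
c["h"] = 0   # setting a count to 0 keeps the key in the Counter
del c["e"]   # del removes the entry entirely

print("z" in c, "h" in c, "e" in c)  # True True False
```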

📌 `most_common(n)` – Extracting Most Frequent Elements

Returns a list of (element, count) tuples for the n most common elements, ordered from most to least frequent.

c = Counter('banana')
print(c.most_common(2))
# [('a', 3), ('n', 2)] → 'a' appears 3 times, 'n' appears 2 times
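Two related uses, sketched here with the same 'banana' data: calling most_common() without an argument returns every element, and slicing that list backwards gives the least common ones.

```python
from collections import Counter

c = Counter("banana")

# Without an argument, every element is returned, most frequent first.
everything = c.most_common()

# Slicing the full list backwards yields the n least common elements.
n = 2
least_common = c.most_common()[:-n - 1:-1]

print(everything)    # [('a', 3), ('n', 2), ('b', 1)]
print(least_common)  # [('b', 1), ('n', 2)]
```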

📌 `elements()` – Iterating Over Elements by Count

Returns an iterator that yields each element as many times as its count.

c = Counter('banana')
print(list(c.elements()))
# ['b', 'a', 'a', 'a', 'n', 'n']

However, elements with values less than or equal to 0 are excluded.
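A quick sketch of that exclusion rule, using made-up counts that include a zero and a negative value:

```python
from collections import Counter

c = Counter(a=2, b=0, c=-1)

# Only 'a' has a positive count, so only 'a' is produced.
expanded = list(c.elements())

print(expanded)  # ['a', 'a']
```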

📌 Support for Mathematical Operations (Counter operations using +, -, &, |)

One of Counter's strengths is that it supports arithmetic and set-like (multiset) operations.

c1 = Counter(a=3, b=1)
c2 = Counter(a=1, b=2)

print(c1 + c2)
# Counter({'a': 4, 'b': 3}) → Same keys are combined

print(c1 - c2)
# Counter({'a': 2}) → Results of zero or less are dropped; 'b' would be -1, so it is omitted

print(c1 & c2)
# Counter({'a': 1, 'b': 1}) → Intersection, based on minimum value

print(c1 | c2)
# Counter({'a': 3, 'b': 2}) → Union, based on maximum value
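Because subtraction drops anything that is not positive, it can double as a "do I have enough of everything?" check. The helper below, can_build, is a hypothetical name used only for this sketch:

```python
from collections import Counter


def can_build(word: str, letters: str) -> bool:
    """Return True if `word` can be spelled using the characters in `letters`."""
    # Subtraction drops non-positive results, so an empty result means
    # every character needed by `word` is available in `letters`.
    shortfall = Counter(word) - Counter(letters)
    return not shortfall


print(can_build("nab", "banana"))  # True
print(can_build("bob", "banana"))  # False: no 'o', and only one 'b'
```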

Practical Examples

📌 Analyzing Word Frequencies in Strings

text = "the quick brown fox jumps over the lazy dog"
counter = Counter(text.split())
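print(counter)
# Counter({'the': 2, 'quick': 1, 'brown': 1, 'fox': 1, 'jumps': 1, 'over': 1, 'lazy': 1, 'dog': 1})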
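Real text usually has capital letters and punctuation, which a plain split() would count as separate words ("The" vs "the", "dog." vs "dog"). One possible way to normalize first, assuming a simple definition of a word as a run of letters or apostrophes (the sample sentence is made up):

```python
import re
from collections import Counter

text = "The quick brown fox jumps over the lazy dog. The dog sleeps."

# Lowercase the text and keep only runs of letters/apostrophes as words.
words = re.findall(r"[a-z']+", text.lower())
word_counts = Counter(words)

print(word_counts.most_common(2))  # [('the', 3), ('dog', 2)]
```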

📌 Log Frequency Analysis

logs = ['INFO', 'ERROR', 'INFO', 'DEBUG', 'ERROR', 'ERROR']
print(Counter(logs))  # Counter({'ERROR': 3, 'INFO': 2, 'DEBUG': 1})
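In practice the level has to be pulled out of each log line first. The sketch below assumes a hypothetical "<date> <time> <LEVEL> <message>" layout; adjust the field index for your own format.

```python
from collections import Counter

# Hypothetical log lines; the "<date> <time> <LEVEL> <message>" layout is an assumption.
log_lines = [
    "2024-01-01 12:00:00 INFO service started",
    "2024-01-01 12:00:05 ERROR connection refused",
    "2024-01-01 12:00:09 INFO retrying",
    "2024-01-01 12:00:10 ERROR connection refused",
]

# The level is the third whitespace-separated field in this assumed layout.
level_counts = Counter(line.split()[2] for line in log_lines)

print(level_counts)  # Counter({'INFO': 2, 'ERROR': 2})
```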

📌 Counting Duplicate Elements in a List

nums = [1, 2, 2, 3, 3, 3]
print(Counter(nums))  # Counter({3: 3, 2: 2, 1: 1})
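To list only the values that are actually duplicated, filter on the count. A minimal sketch with the same numbers:

```python
from collections import Counter

nums = [1, 2, 2, 3, 3, 3]
counts = Counter(nums)

# Keep only the values that occur more than once.
duplicates = [value for value, count in counts.items() if count > 1]

print(duplicates)  # [2, 3]
```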

Important Points to Note

Counter inherits from dict, so on Python 3.7+ it remembers insertion order, but it is not kept sorted by count. If you need frequency order, use most_common().
Items are not removed when their counts drop to 0 or below, so you may need to filter them out manually.

c = Counter(a=3)
c.subtract({'a': 5})
print(c)  # Counter({'a': -2}) → the item stays even though its count is below 0
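Two ways to do that filtering, sketched with an extra key 'b' added for illustration: the unary plus operator, or an explicit comprehension when you want the threshold to be visible.

```python
from collections import Counter

c = Counter(a=3, b=1)
c.subtract({"a": 5})  # a: -2, b: 1

# Unary plus returns a new Counter containing only the positive counts.
positive_only = +c

# An explicit comprehension does the same and makes the threshold visible.
filtered = Counter({key: count for key, count in c.items() if count > 0})

print(positive_only)  # Counter({'b': 1})
print(filtered)       # Counter({'b': 1})
```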

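Tip: Accumulating Counts from an Empty Counter

Start with an empty Counter and feed it with update(), which adds to the existing counts instead of replacing them.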

counter = Counter()
with open("data.txt") as f:
    for line in f:
        counter.update(line.strip().split())
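If you want to try this pattern without creating data.txt, io.StringIO can stand in for the open file. The sample lines are invented for the sketch:

```python
import io
from collections import Counter

# io.StringIO stands in for an open text file so the sketch runs without data.txt.
fake_file = io.StringIO("red green\nblue red\n\nred\n")

counter = Counter()
for line in fake_file:
    # update() adds to the existing counts instead of replacing them.
    counter.update(line.split())

print(counter)  # Counter({'red': 3, 'green': 1, 'blue': 1})
```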

Conclusion

collections.Counter is a powerful tool that is close to indispensable in data analysis, log processing, and text mining. For beginners it is an easy way to count frequencies; for experienced developers it becomes a more advanced tool once its arithmetic operations and filtering are combined.

Next Episode Preview

defaultdict – A world without KeyErrors, more flexible than dict! Stay tuned for the next episode!

Thoroughly understanding and properly using the standard library will noticeably improve the quality of your code.

whitedec
