Revision Ver-4

Handling Numbers with Python’s Standard Library: math and statistics

Series 05 – Calculations are precise, summaries are tidy

Numerical code may look simple, but floating-point errors and choices like population vs. sample can subtly shift results. The math and statistics modules in Python’s standard library give you a solid foundation for these scenarios.

What math and statistics Cover

math: mathematical functions and constants, rounding, combinatorics, floating-point comparison, more accurate summation
statistics: mean, median, mode, variance, standard deviation, and other summary statistics

1. `math`: The Toolbox of Mathematics

1.1 Constants and Basic Functions

import math

print(math.pi)     # π
print(math.e)      # e

print(math.sqrt(2))     # square root
print(math.pow(2, 10))  # exponentiation; always returns a float (1024.0), whereas 2**10 returns the int 1024

1.2 Rounding: `ceil`, `floor`, `trunc`

The behavior with negative values can be confusing, but a quick table clarifies the differences.

Value (x)	`math.ceil(x)`	`math.floor(x)`	`math.trunc(x)`
3.7	4	3	3
-3.7	-3	-4	-3

import math

for x in [3.7, -3.7]:
    print(x, math.ceil(x), math.floor(x), math.trunc(x))

ceil: round up toward positive infinity (3.7→4, -3.7→-3)
floor: round down toward negative infinity (3.7→3, -3.7→-4)
trunc: drop the fractional part, rounding toward zero (3.7→3, -3.7→-3)
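The same two rounding directions show up in everyday Python operators. The short sketch below compares int() and floor division with their math counterparts; the variable name results is only for illustration.

```python
import math

# int() on a float truncates toward zero, just like math.trunc().
assert int(-3.7) == math.trunc(-3.7) == -3

# Floor division rounds toward negative infinity, just like math.floor().
assert -7 // 2 == math.floor(-7 / 2) == -4

# All three functions return int, not float.
results = [math.ceil(-3.7), math.floor(-3.7), math.trunc(-3.7)]
print(results)  # [-3, -4, -3]
```

If a negative value may reach your code, decide up front whether you want "toward zero" or "toward negative infinity" and pick the function accordingly.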

1.3 Comparing Floats: `isclose()` over `==`

import math

a = 0.1 + 0.2
b = 0.3
print(a == b)            # False (0.30000000000000004 != 0.3)
print(math.isclose(a, b))  # True

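math.isclose(a, b, rel_tol=1e-09, abs_tol=0.0) treats two values as close when their difference is within the relative tolerance (scaled by the larger magnitude) or the absolute tolerance, whichever is larger. The defaults shown mean that comparisons against 0.0 need an explicit abs_tol.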
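A minimal sketch of how the two tolerances behave, assuming illustrative values (tiny, 1000.0, 1005.0) that are not from the original text:

```python
import math

tiny = 1e-12

# With the defaults (rel_tol=1e-09, abs_tol=0.0) only 0.0 itself is
# "close" to zero, because a relative tolerance scales with the inputs.
near_zero_default = math.isclose(tiny, 0.0)                # False

# An absolute tolerance sets a fixed floor for the allowed difference.
near_zero_abs = math.isclose(tiny, 0.0, abs_tol=1e-9)      # True

# A looser relative tolerance accepts a 0.5% difference between large values.
large_values = math.isclose(1000.0, 1005.0, rel_tol=0.01)  # True

print(near_zero_default, near_zero_abs, large_values)
```

As a rule of thumb, rel_tol suits values of varying magnitude, while abs_tol is the one to set when the expected value may be zero.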

1.4 More Accurate Summation: `fsum()`

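sum() is fast and usually fine, but when many floating-point values accumulate, rounding error can grow. math.fsum() avoids that loss of precision by tracking multiple intermediate partial sums. Since Python 3.12 the built-in sum() also uses compensated (Neumaier) summation for floats, so the two calls below may print the same value on recent versions.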

import math

values = [0.1] * 10_000
print(sum(values))
print(math.fsum(values))
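To see the difference regardless of Python version, the sketch below compares fsum() with a plain left-to-right loop. naive_sum is a hypothetical helper written for this illustration, not part of the standard library.

```python
import math

def naive_sum(values):
    """Left-to-right float accumulation (hypothetical helper for illustration)."""
    total = 0.0
    for v in values:
        total += v
    return total

# The 1.0 is lost when added to 1e16, because it is below float resolution there.
cancelling = [1e16, 1.0, -1e16]
print(naive_sum(cancelling))   # 0.0
print(math.fsum(cancelling))   # 1.0

tenths = [0.1] * 10
print(naive_sum(tenths) == 1.0)   # False
print(math.fsum(tenths) == 1.0)   # True
```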

1.5 Combinations and Permutations: `comb`, `perm`

import math

print(math.comb(10, 3))  # 120: ways to choose 3 out of 10, order ignored
print(math.perm(10, 3))  # 720: ways to arrange 3 out of 10, order matters
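Both functions (available since Python 3.8) follow the usual factorial formulas. The sketch below checks them against math.factorial; the names perm_by_formula and comb_by_formula are only for illustration.

```python
import math

n, k = 10, 3

# perm(n, k) = n! / (n - k)!
perm_by_formula = math.factorial(n) // math.factorial(n - k)

# comb(n, k) = n! / (k! * (n - k)!)
comb_by_formula = perm_by_formula // math.factorial(k)

print(perm_by_formula, math.perm(n, k))  # 720 720
print(comb_by_formula, math.comb(n, k))  # 120 120

# Choosing more items than are available yields 0 rather than an error.
print(math.comb(3, 5))  # 0
```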

2. `statistics`: The Basics of Data Summaries

statistics takes a list (or any iterable) and returns mean, median, variance, and more.

2.1 Mean: `mean` vs `fmean`

import statistics as st

data = [10, 12, 13, 12, 100]
print(st.mean(data))   # arithmetic mean
print(st.fmean(data))  # float‑based mean (always returns float)
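The practical difference shows up with exact numeric types. As a small sketch with made-up inputs, mean() keeps Fraction and Decimal results exact, while fmean() converts to float:

```python
import statistics as st
from decimal import Decimal
from fractions import Fraction

fractions_data = [Fraction(1, 4), Fraction(3, 4)]
decimal_data = [Decimal("0.5"), Decimal("0.75")]

# mean() keeps the input type, so exact types stay exact.
exact_fraction_mean = st.mean(fractions_data)   # Fraction(1, 2)
exact_decimal_mean = st.mean(decimal_data)      # Decimal('0.625')

# fmean() converts everything to float first; faster, but the result is a float.
float_mean = st.fmean(fractions_data)           # 0.5

print(exact_fraction_mean, exact_decimal_mean, float_mean)
```

A reasonable default is fmean() for plain float or int data and mean() when the inputs are Fraction or Decimal and exactness matters.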

2.2 Median and Mode: `median`, `mode`, `multimode`

import statistics as st

data = [10, 12, 13, 12, 100]
print(st.median(data))  # 12
print(st.mode(data))    # 12

When multiple modes exist, multimode() is more appropriate.

import statistics as st

data = [1, 1, 2, 2, 3]
print(st.multimode(data))  # [1, 2]
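For comparison, here is what mode() does with tied data. This sketch assumes Python 3.8 or later, where mode() returns the first mode encountered instead of raising an error; the sample lists are invented for illustration.

```python
import statistics as st

tied = [2, 2, 1, 1, 3]

# Since Python 3.8, mode() returns the first mode encountered in the data.
first_mode = st.mode(tied)          # 2

# multimode() returns every mode, in order of first appearance.
all_modes = st.multimode(tied)      # [2, 1]

# mode() also works on non-numeric (nominal) data.
color_mode = st.mode(["red", "blue", "red"])   # 'red'

print(first_mode, all_modes, color_mode)
```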

2.3 Variance and Standard Deviation: Population vs. Sample

import statistics as st

data = [10, 12, 13, 12, 100]

print(st.pvariance(data))  # population variance
print(st.variance(data))   # sample variance (n-1)

print(st.pstdev(data))     # population std dev
print(st.stdev(data))      # sample std dev
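The only difference between the two variants is the divisor. The sketch below recomputes both variances by hand from the same sum of squared deviations; the variable names are illustrative.

```python
import math
import statistics as st

data = [10, 12, 13, 12, 100]
n = len(data)
mean = st.mean(data)

# Sum of squared deviations from the mean, shared by both formulas.
squared_dev = math.fsum((x - mean) ** 2 for x in data)

population_var = squared_dev / n        # divide by n
sample_var = squared_dev / (n - 1)      # divide by n - 1 (Bessel's correction)

print(math.isclose(population_var, st.pvariance(data)))  # True
print(math.isclose(sample_var, st.variance(data)))       # True
```

Use the population versions when the data is the entire group of interest, and the sample versions when it is a subset used to estimate a larger population.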

2.4 Weighted Mean: `fmean(weights=…)` in Python 3.11+

Older code often calculated weighted means manually, but Python 3.11+ adds a weights argument to statistics.fmean().

import statistics as st

values = [80, 90, 100]
weights = [1, 2, 1]

# Python 3.11+:
weighted_mean = st.fmean(values, weights=weights)
print(weighted_mean)

For versions prior to 3.11, the manual calculation is still handy.

values = [80, 90, 100]
weights = [1, 2, 1]

weighted_mean = sum(v*w for v, w in zip(values, weights)) / sum(weights)
print(weighted_mean)
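If the same code has to run on both older and newer interpreters, one option is to wrap the two approaches in a single function. The helper below, weighted_mean, is a hypothetical sketch and not part of the statistics module; the zero-weight check is an added assumption about how you might want to fail.

```python
import statistics as st
import sys

def weighted_mean(values, weights):
    """Weighted mean that works before and after Python 3.11 (hypothetical helper)."""
    if sys.version_info >= (3, 11):
        return st.fmean(values, weights=weights)
    # Fallback: manual formula for older interpreters.
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

print(weighted_mean([80, 90, 100], [1, 2, 1]))  # 90.0
```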

3. Three Common Pitfalls in Numerical Work

3.1 `float` Cannot Represent All Decimals Exactly

This isn’t a Python quirk; it’s inherent to binary floating-point arithmetic. When you need exact decimal results (money, for example), use the decimal module, which is designed for precise decimal arithmetic.

from decimal import Decimal

print(0.1 + 0.2)                        # 0.30000000000000004
print(Decimal("0.1") + Decimal("0.2"))  # 0.3
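Two details tend to matter in practice: build Decimal values from strings rather than floats, and round with an explicit rule. The sketch below illustrates both; the price value is made up for the example.

```python
from decimal import Decimal, ROUND_HALF_UP

# Constructing from a float copies the float's binary approximation exactly.
from_float = Decimal(0.1)
from_string = Decimal("0.1")
print(from_float == from_string)   # False

# quantize() rounds to a fixed number of decimal places with an explicit rule.
price = Decimal("2.675")
rounded = price.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(rounded)                     # 2.68

# The float equivalent rounds down, because 2.675 is stored slightly below 2.675.
print(round(2.675, 2))             # 2.67
```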

3.2 Look at Both Mean and Median

Outliers can skew the mean. Pairing mean with median gives a clearer picture.
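A quick sketch with the dataset used earlier shows the effect of a single outlier (the split into typical and with_outlier is only for illustration):

```python
import statistics as st

typical = [10, 12, 13, 12]
with_outlier = typical + [100]

# One extreme value drags the mean far more than the median.
print(st.mean(typical), st.median(typical))            # 11.75 12.0
print(st.mean(with_outlier), st.median(with_outlier))  # 29.4 12
```

A large gap between the two numbers is a cheap signal that the data is skewed or contains outliers worth inspecting.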

3.3 `median()` Can Be Expensive on Large Data

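statistics.median() sorts the data, which takes O(n log n) time and a full copy in memory, so it becomes costly for huge datasets. In such cases, consider computing the median on a random sample, using a streaming approximation, or switching to an external library such as NumPy.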
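One possible strategy, shown here only as a rough sketch, is to estimate the median from a random sample. approximate_median is a hypothetical helper, and its sample_size and seed defaults are arbitrary assumptions; the result is an estimate, not the exact median.

```python
import random
import statistics as st

def approximate_median(data, sample_size=10_001, seed=0):
    """Estimate the median from a random sample (hypothetical helper).

    Trades exactness for speed: only sample_size values get sorted.
    """
    if len(data) <= sample_size:
        return st.median(data)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return st.median(rng.sample(data, sample_size))

population = range(1_000_000)
estimate = approximate_median(population)
exact = st.median(population)

print(exact)                           # 499999.5
print(abs(estimate - exact) < 50_000)  # estimate lands near the exact median
```

Whether the approximation is acceptable depends on how much error your use case tolerates; a larger sample narrows the error at the cost of more sorting.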

4. A Quick Example: Summarizing Sensor Readings

Below is a concise script that summarizes a list of sensor values (floats). It uses statistics for central tendency and math.fsum() for a more accurate sum of differences.

import math
import statistics as st

# Example sensor readings (21.5 stands out as an outlier)
readings = [20.1, 20.0, 20.2, 19.9, 20.1, 21.5]

# 1) Mean: quick sense of overall level
print("fmean:", st.fmean(readings))   # fmean: 20.3

# 2) Median: robust to outliers
print("median:", st.median(readings))   # median: 20.1

# 3) Sample std dev: spread of values
print("stdev(sample):", st.stdev(readings))    # stdev(sample): 0.5966573556070519

# 4) Accurate sum of absolute differences from a reference (e.g., 20.0)
diffs = [abs(x - 20.0) for x in readings]
print("sum abs diff:", math.fsum(diffs))     # sum abs diff: 2.0000000000000036

With just a few lines, you can quickly gauge whether most readings hover around a target and how much variation exists.
