Handling Numbers with Python’s Standard Library: math and statistics

Series 05 – Calculations are precise, summaries are tidy

Numerical code may look simple, but floating-point errors and choices like population vs. sample can subtly shift results. The math and statistics modules in Python’s standard library give you a solid foundation for these scenarios.

What math and statistics Cover

  • math: mathematical functions and constants, rounding, combinatorics, floating-point comparison, more accurate summation
  • statistics: mean, median, mode, variance, standard deviation, and other summary statistics

1. math: The Toolbox of Mathematics

1.1 Constants and Basic Functions

import math

print(math.pi)     # π
print(math.e)      # e

print(math.sqrt(2))     # square root
print(math.pow(2, 10))  # exponentiation; returns the float 1024.0 (2**10 gives the int 1024)
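
One detail worth knowing when choosing between math.pow() and the ** operator: math.pow() always works in floats, while ** on integers stays exact. The snippet below is a small illustration; the variable names are chosen here for the example only.

```python
import math

as_float = math.pow(2, 10)   # math.pow converts its arguments to float
as_int = 2 ** 10             # ** on ints stays an int

print(as_float, type(as_float).__name__)  # 1024.0 float
print(as_int, type(as_int).__name__)      # 1024 int

# For large results the int stays exact, while the float is rounded
big_exact = 3 ** 40
big_float = math.pow(3, 40)
print(int(big_float) == big_exact)        # False
```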

1.2 Rounding: ceil, floor, trunc

The behavior with negative values can be confusing, but a quick table clarifies the differences.

Value (x)   math.ceil(x)   math.floor(x)   math.trunc(x)
3.7         4              3               3
-3.7        -3             -4              -3

import math

for x in [3.7, -3.7]:
    print(x, math.ceil(x), math.floor(x), math.trunc(x))

  • ceil: smallest integer greater than or equal to x (3.7→4, -3.7→-3)
  • floor: largest integer less than or equal to x (3.7→3, -3.7→-4)
  • trunc: drop the fractional part, moving toward zero (3.7→3, -3.7→-3)
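
The same three behaviors show up elsewhere in everyday Python. The sketch below uses a hypothetical helper, pages_needed(), to show a typical use of ceil, then compares int() and floor division with their math counterparts.

```python
import math

def pages_needed(item_count: int, per_page: int) -> int:
    """Pages required to show item_count items (hypothetical helper)."""
    return math.ceil(item_count / per_page)

print(pages_needed(101, 10))  # 11
print(pages_needed(100, 10))  # 10

# int() truncates toward zero, just like math.trunc()
print(int(-3.7), math.trunc(-3.7))   # -3 -3

# Floor division rounds toward negative infinity, like math.floor()
print(-7 // 2, math.floor(-7 / 2))   # -4 -4
```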

1.3 Comparing Floats: isclose() over ==

import math

a = 0.1 + 0.2
b = 0.3
print(a == b)            # False (0.30000000000000004 != 0.3)
print(math.isclose(a, b))  # True

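math.isclose(a, b, rel_tol=..., abs_tol=...) treats two values as close when their difference is within a relative tolerance (rel_tol, default 1e-09) or an absolute tolerance (abs_tol, default 0.0). Because abs_tol defaults to zero, comparisons against 0.0 need an explicit abs_tol.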
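
A short sketch of how the two tolerances behave. The tolerance values are picked for illustration, not as recommendations.

```python
import math

# Default rel_tol=1e-09 works for values of ordinary magnitude
default_ok = math.isclose(0.1 + 0.2, 0.3)
print(default_ok)                                   # True

# A looser relative tolerance (0.1%) accepts a larger difference
strict = math.isclose(1000.0, 1000.1)
loose = math.isclose(1000.0, 1000.1, rel_tol=1e-3)
print(strict, loose)                                # False True

# Near zero, a relative tolerance alone never matches; pass abs_tol
tiny = 1e-12
near_zero_default = math.isclose(tiny, 0.0)
near_zero_abs = math.isclose(tiny, 0.0, abs_tol=1e-9)
print(near_zero_default, near_zero_abs)             # False True
```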


1.4 More Accurate Summation: fsum()

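sum() is fast and usually fine, but when many floating‑point values accumulate, rounding error can grow. math.fsum() tracks the bits that ordinary addition would lose and returns an accurately rounded total. Since Python 3.12, sum() also uses a more accurate algorithm for floats, so the two lines below may print the same value on recent versions.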

import math

values = [0.1] * 10_000
print(sum(values))
print(math.fsum(values))
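
To make the effect easy to see, here is a small sketch with values of very different magnitude. The loop is a plain accumulation written out by hand, so the result does not depend on how a given Python version implements sum().

```python
import math

values = [1e16, 1.0, -1e16]   # the exact sum is 1.0

# Naive left-to-right accumulation: 1e16 + 1.0 rounds back to 1e16,
# so the 1.0 is lost before the large terms cancel.
naive_total = 0.0
for v in values:
    naive_total += v

accurate_total = math.fsum(values)

print(naive_total)     # 0.0
print(accurate_total)  # 1.0
```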

1.5 Combinations and Permutations: comb, perm

import math

print(math.comb(10, 3))  # 120: ways to choose 3 out of 10 (order ignored)
print(math.perm(10, 3))  # 720: ways to arrange 3 out of 10 (order matters)
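
Both functions (available since Python 3.8) follow the usual factorial formulas. The check below is only a sketch to show the relationship; in real code, call comb() and perm() directly.

```python
import math

n, k = 10, 3

# Permutations: n! / (n - k)!
perm_by_formula = math.factorial(n) // math.factorial(n - k)

# Combinations: permutations divided by the k! orderings of each selection
comb_by_formula = perm_by_formula // math.factorial(k)

print(perm_by_formula == math.perm(n, k))  # True (720)
print(comb_by_formula == math.comb(n, k))  # True (120)
```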

2. statistics: The Basics of Data Summaries

statistics takes a list (or any iterable) and returns mean, median, variance, and more.

2.1 Mean: mean vs fmean

import statistics as st

data = [10, 12, 13, 12, 100]
print(st.mean(data))   # arithmetic mean: 29.4 (pulled up by the outlier 100)
print(st.fmean(data))  # faster float-based mean (always returns float): 29.4
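
The practical difference between the two shows up in the return type. As a sketch (the variable names are illustrative): mean() tries to preserve the input type, while fmean() converts everything to float.

```python
import statistics as st
from decimal import Decimal
from fractions import Fraction

mean_int = st.mean([1, 2, 3])      # stays an int because the result is exact
fmean_int = st.fmean([1, 2, 3])    # always a float
mean_frac = st.mean([Fraction(1, 3), Fraction(2, 3)])
mean_dec = st.mean([Decimal("0.5"), Decimal("0.75")])

print(repr(mean_int), repr(fmean_int))  # 2 2.0
print(repr(mean_frac))                  # Fraction(1, 2)
print(repr(mean_dec))                   # Decimal('0.625')
```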

2.2 Median and Mode: median, mode, multimode

import statistics as st

data = [10, 12, 13, 12, 100]
print(st.median(data))  # 12
print(st.mode(data))    # 12

When multiple modes exist, multimode() is more appropriate.

import statistics as st

data = [1, 1, 2, 2, 3]
print(st.multimode(data))  # [1, 2]

2.3 Variance and Standard Deviation: Population vs. Sample

import statistics as st

data = [10, 12, 13, 12, 100]

print(st.pvariance(data))  # population variance (divide by n)
print(st.variance(data))   # sample variance (divide by n - 1)

print(st.pstdev(data))     # population std dev (square root of pvariance)
print(st.stdev(data))      # sample std dev (square root of variance)
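
The only difference between the population and sample versions is the divisor. A minimal sketch that recomputes both by hand and checks them against the library (the hand-rolled code is for illustration and is not how statistics computes them internally):

```python
import math
import statistics as st

data = [10, 12, 13, 12, 100]
n = len(data)
mean = st.fmean(data)

# Sum of squared deviations from the mean
squared_devs = math.fsum((x - mean) ** 2 for x in data)

population_var = squared_devs / n        # use when data is the whole population
sample_var = squared_devs / (n - 1)      # use when data is a sample of a larger population

print(math.isclose(population_var, st.pvariance(data)))     # True
print(math.isclose(sample_var, st.variance(data)))          # True
print(math.isclose(math.sqrt(sample_var), st.stdev(data)))  # True
```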

2.4 Weighted Mean: fmean(weights=…) in Python 3.11+

Older code often calculated weighted means manually, but Python 3.11 added a weights argument to statistics.fmean().

import statistics as st

values = [80, 90, 100]
weights = [1, 2, 1]

# Python 3.11+:
weighted_mean = st.fmean(values, weights=weights)
print(weighted_mean)

For versions prior to 3.11, the manual calculation is still handy.

values = [80, 90, 100]
weights = [1, 2, 1]

weighted_mean = sum(v*w for v, w in zip(values, weights)) / sum(weights)
print(weighted_mean)
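
If the same code has to run on both older and newer interpreters, one option is to wrap the two approaches in a single function. The helper below, weighted_mean(), is a hypothetical name and only a sketch; note that the fallback branch does not check that values and weights have the same length.

```python
import math
import statistics as st
import sys

def weighted_mean(values, weights):
    """Weighted mean for Python versions before and after 3.11 (hypothetical helper)."""
    if sys.version_info >= (3, 11):
        return st.fmean(values, weights=weights)
    # Fallback: same formula, using fsum() for both totals
    total_weight = math.fsum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return math.fsum(v * w for v, w in zip(values, weights)) / total_weight

result = weighted_mean([80, 90, 100], [1, 2, 1])
print(result)  # 90.0
```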

3. Three Common Pitfalls in Numerical Work

3.1 float Cannot Represent All Decimals Exactly

This isn’t a Python quirk; it’s inherent to binary floating-point arithmetic. When you need exact decimal results (for money, say), use the decimal module.

from decimal import Decimal

print(0.1 + 0.2)                        # 0.30000000000000004
print(Decimal("0.1") + Decimal("0.2"))  # 0.3

Build Decimal values from strings rather than floats: Decimal(0.1) would carry over the float's binary representation error.
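
A short sketch of why the way a Decimal is constructed matters, followed by an exact accumulation. The variable names are illustrative.

```python
from decimal import Decimal

from_string = Decimal("0.1")   # exactly one tenth
from_float = Decimal(0.1)      # captures the float's exact binary value

print(from_string == from_float)   # False
print(from_float > from_string)    # True: the float 0.1 is slightly above 0.1

# Exact decimal accumulation: ten values of 0.10 total exactly 1.00
total = sum(Decimal("0.10") for _ in range(10))
print(total == Decimal("1.00"))    # True
```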

3.2 Look at Both Mean and Median

Outliers can skew the mean. Pairing mean with median gives a clearer picture.
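
Reusing the data from section 2, a quick sketch shows how far a single outlier moves the mean while the median barely changes:

```python
import statistics as st

data = [10, 12, 13, 12, 100]
without_outlier = [10, 12, 13, 12]

mean_with, median_with = st.mean(data), st.median(data)
mean_without, median_without = st.mean(without_outlier), st.median(without_outlier)

print(mean_with, median_with)        # 29.4 12
print(mean_without, median_without)  # 11.75 12.0
```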

3.3 median() Can Be Expensive on Large Data

statistics.median() sorts a copy of the data, which costs O(n log n) time and extra memory on huge datasets. In such cases, consider estimating the median from a random sample, or use an array library such as NumPy.
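
One alternative strategy is to estimate the median from a random sample instead of sorting everything. The sketch below is illustrative only: the synthetic data, the sample size of 5,001, and the fixed seed are all assumptions made for the example, and the estimate is approximate by design.

```python
import random
import statistics as st

rng = random.Random(0)   # fixed seed so the sketch is repeatable
data = [rng.gauss(50.0, 10.0) for _ in range(200_000)]

# Sort only a small random sample instead of the full dataset
sample = rng.sample(data, 5_001)   # sample size is an arbitrary choice
estimated_median = st.median(sample)

# Exact value for comparison (this sorts all 200,000 values)
exact_median = st.median(data)

print(estimated_median, exact_median)
```

A larger sample narrows the gap between the estimate and the exact median at the cost of more sorting work.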


4. A Quick Example: Summarizing Sensor Readings

Below is a concise script that summarizes a list of sensor values (floats). It uses statistics for central tendency and math.fsum() for a more accurate sum of differences.

import math
import statistics as st

# Example sensor readings (21.5 stands out as an outlier)
readings = [20.1, 20.0, 20.2, 19.9, 20.1, 21.5]

# 1) Mean: quick sense of overall level
print("fmean:", st.fmean(readings))   # fmean: 20.3

# 2) Median: robust to outliers
print("median:", st.median(readings))   # median: 20.1

# 3) Sample std dev: spread of values
print("stdev(sample):", st.stdev(readings))    # stdev(sample): 0.5966573556070519

# 4) Accurate sum of absolute differences from a reference (e.g., 20.0)
diffs = [abs(x - 20.0) for x in readings]
print("sum abs diff:", math.fsum(diffs))     # sum abs diff: 2.0000000000000036

With just a few lines, you can quickly gauge whether most readings hover around a target and how much variation exists.

