Handling Numbers with Python’s Standard Library: math and statistics

Series 05 – Calculations are precise, summaries are tidy

Numeric code may look simple, but floating-point rounding error and choices such as population vs. sample statistics can subtly shift results. The math and statistics modules in Python’s standard library give you a solid foundation for handling both.

What math and statistics Cover

  • math: mathematical functions and constants, rounding, combinatorics, tolerance-based float comparison, more accurate summation
  • statistics: mean, median, variance, and other summary statistics to outline your data

1. math: The Toolbox of Mathematics

1.1 Constants and Basic Functions

import math

print(math.pi)     # π
print(math.e)      # e

print(math.sqrt(2))     # square root
print(math.pow(2, 10))  # exponentiation; returns the float 1024.0 (2**10 returns the int 1024)

1.2 Rounding: ceil, floor, trunc

Negative values can be confusing, but a quick table clarifies the differences.

Value (x)   math.ceil(x)   math.floor(x)   math.trunc(x)
3.7         4              3               3
-3.7        -3             -4              -3

import math

for x in [3.7, -3.7]:
    print(x, math.ceil(x), math.floor(x), math.trunc(x))

  • ceil: round up, toward positive infinity (3.7→4, -3.7→-3)
  • floor: round down, toward negative infinity (3.7→3, -3.7→-4)
  • trunc: drop the fractional part, toward zero (3.7→3, -3.7→-3)

1.3 Comparing Floats: isclose() over ==

import math

a = 0.1 + 0.2
b = 0.3
print(a == b)            # False (0.30000000000000004 != 0.3)
print(math.isclose(a, b))  # True

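math.isclose(a, b, rel_tol=..., abs_tol=...) treats two values as close when their difference is within a relative tolerance (rel_tol, default 1e-09) or an absolute tolerance (abs_tol, default 0.0). Because the default abs_tol is zero, comparisons against 0.0 need an explicit abs_tol.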
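
To make the two tolerances concrete, here is a small sketch. The specific values and tolerance settings are illustrative choices, not recommendations; pick tolerances that match the scale of your own data.

```python
import math

# Default tolerances: rel_tol=1e-09, abs_tol=0.0
near_zero = 1e-12

# A relative tolerance scales with the magnitude of the inputs,
# so against 0.0 it allows essentially nothing.
default_check = math.isclose(near_zero, 0.0)                  # False
absolute_check = math.isclose(near_zero, 0.0, abs_tol=1e-9)   # True

# For large values, a relative tolerance is usually the more natural choice.
loose_check = math.isclose(1_000_000.0, 1_000_001.0, rel_tol=1e-5)  # True
strict_check = math.isclose(1_000_000.0, 1_000_001.0)               # False

print(default_check, absolute_check, loose_check, strict_check)
```

A common pattern is to set both: rel_tol for ordinary values and a small abs_tol as a floor for values near zero.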


1.4 More Accurate Summation: fsum()

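sum() is fast and usually fine, but when many floating-point values accumulate, rounding error can grow. math.fsum() avoids this loss of precision by tracking multiple intermediate partial sums. Since Python 3.12, sum() itself uses a compensated algorithm for floats, so the gap between the two is smaller on recent versions.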

import math

values = [0.1] * 10_000
print(sum(values))
print(math.fsum(values))
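
Because the behavior of sum() depends on the Python version, the sketch below uses a hand-written accumulation loop to show the drift that fsum() avoids. The ten-element list is just a small illustrative input.

```python
import math

values = [0.1] * 10

# Naive left-to-right accumulation: every addition rounds,
# and the rounding errors add up.
naive_total = 0.0
for v in values:
    naive_total += v

exact_total = math.fsum(values)

print(naive_total)   # 0.9999999999999999
print(exact_total)   # 1.0
```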

1.5 Combinations and Permutations: comb, perm

import math

print(math.comb(10, 3))  # ways to choose 3 out of 10, order ignored → 120
print(math.perm(10, 3))  # ways to arrange 3 out of 10, order matters → 720
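
If the relationship between the two is unclear, it can help to rebuild them from factorials. The sketch below does that by hand and compares against math.perm and math.comb (both available from Python 3.8); the names perm_by_hand and comb_by_hand are only for illustration.

```python
import math

n, k = 10, 3

# Permutations: n! / (n - k)!  (order matters)
perm_by_hand = math.factorial(n) // math.factorial(n - k)

# Combinations: each unordered selection corresponds to k! orderings,
# so divide the permutation count by k!
comb_by_hand = perm_by_hand // math.factorial(k)

print(perm_by_hand, math.perm(n, k))  # 720 720
print(comb_by_hand, math.comb(n, k))  # 120 120
```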

2. statistics: The Basics of Data Summaries

The statistics functions take a list (or any iterable) of numbers and return the mean, median, variance, and other summary values in a single call.

2.1 Mean: mean vs fmean

import statistics as st

data = [10, 12, 13, 12, 100]
print(st.mean(data))   # arithmetic mean; keeps exact types such as Fraction and Decimal
print(st.fmean(data))  # converts to float first; faster and always returns a float
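
With plain integers the two functions print the same value, so the difference is easier to see with exact types. A minimal sketch using Fraction inputs (the sample values are arbitrary):

```python
import statistics as st
from fractions import Fraction

parts = [Fraction(1, 2), Fraction(1, 4)]

exact_mean = st.mean(parts)    # Fraction(3, 8): the input type is preserved
float_mean = st.fmean(parts)   # 0.375: always a float

print(repr(exact_mean), float_mean)
```

Use mean() when exactness matters and fmean() when speed matters and float precision is enough.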

2.2 Median and Mode: median, mode, multimode

import statistics as st

data = [10, 12, 13, 12, 100]
print(st.median(data))  # 12
print(st.mode(data))    # 12

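When several values tie for most common, mode() returns only the first one it encounters (in Python 3.8+; earlier versions raised StatisticsError), so multimode() is safer.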

import statistics as st

data = [1, 1, 2, 2, 3]
print(st.multimode(data))  # [1, 2]
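
A side-by-side sketch of the two functions on tied data, assuming Python 3.8 or later. The votes list is made-up sample data.

```python
import statistics as st

votes = ["red", "blue", "blue", "red", "green"]

first_mode = st.mode(votes)       # "red": the first of the tied values encountered
all_modes = st.multimode(votes)   # ["red", "blue"]: every tied value, in first-seen order

# multimode() returns an empty list for empty input,
# while mode() raises StatisticsError.
empty_modes = st.multimode([])

print(first_mode, all_modes, empty_modes)
```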

2.3 Variance and Standard Deviation: Population vs. Sample

import statistics as st

data = [10, 12, 13, 12, 100]

print(st.pvariance(data))  # population variance (divides by n)
print(st.variance(data))   # sample variance (divides by n-1)

print(st.pstdev(data))     # population std dev
print(st.stdev(data))      # sample std dev
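
The only difference between the population and sample versions is the divisor. The sketch below recomputes both by hand to make that visible; the variable names are illustrative, and the library functions remain the better choice in real code.

```python
import math
import statistics as st

data = [10, 12, 13, 12, 100]
n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean, shared by both formulas
ss = math.fsum((x - mean) ** 2 for x in data)

population_var = ss / n        # divide by n: the data is the entire population
sample_var = ss / (n - 1)      # divide by n - 1: the data is a sample of something larger

print(population_var, st.pvariance(data))
print(sample_var, st.variance(data))
```

As a rule of thumb, use the sample versions (variance, stdev) when the data is a subset used to estimate a larger population, and the population versions (pvariance, pstdev) when the data covers everything you care about.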

2.4 Weighted Mean: fmean(weights=…) in Python 3.11+

Older code often calculated weighted means manually, but Python 3.11+ adds a weights argument to statistics.fmean().

import statistics as st

values = [80, 90, 100]
weights = [1, 2, 1]

# Python 3.11+:
weighted_mean = st.fmean(values, weights=weights)
print(weighted_mean)

The manual calculation works on every Python version:

values = [80, 90, 100]
weights = [1, 2, 1]

weighted_mean = sum(v*w for v, w in zip(values, weights)) / sum(weights)
print(weighted_mean)
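
If the same code has to run on both old and new interpreters, the two approaches can be combined. The helper below is a hypothetical sketch, not part of the standard library; it checks sys.version_info and falls back to the manual formula.

```python
import sys
import statistics as st


def weighted_mean(values, weights):
    """Weighted mean via fmean(weights=...) when available (hypothetical helper)."""
    if sys.version_info >= (3, 11):
        return st.fmean(values, weights=weights)
    # Fallback for older versions: same formula, computed by hand
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


print(weighted_mean([80, 90, 100], [1, 2, 1]))  # 90.0
```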

3. Three Common Pitfalls in Numerical Work

3.1 float Cannot Represent All Decimals Exactly

This isn’t a Python quirk; it is a property of binary floating-point itself. The official Python tutorial recommends the decimal module when you need exact decimal arithmetic.

from decimal import Decimal

print(0.1 + 0.2)                        # 0.30000000000000004
print(Decimal("0.1") + Decimal("0.2"))  # 0.3

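Build Decimal values from strings rather than floats: Decimal(0.1) inherits the float’s binary rounding error, while Decimal("0.1") is exactly one tenth.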
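
A short sketch of how Decimal behaves in practice. The price value and the two-decimal rounding are illustrative assumptions about a money-like use case.

```python
from decimal import Decimal

from_string = Decimal("0.1")
from_float = Decimal(0.1)   # exact value of the binary float, which is not exactly 0.1

print(from_string == from_float)                          # False
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True

# quantize() rounds to a fixed number of decimal places, e.g. cents
price = Decimal("19.999").quantize(Decimal("0.01"))
print(price)   # 20.00
```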

3.2 Look at Both Mean and Median

Outliers can skew the mean. Pairing mean with median gives a clearer picture.
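
A quick illustration with made-up numbers: the two lists differ only in their last value, yet the mean moves a lot while the median stays put.

```python
import statistics as st

typical = [10, 12, 13, 12, 11]
with_outlier = [10, 12, 13, 12, 100]

print(st.mean(typical), st.median(typical))            # 11.6 12
print(st.mean(with_outlier), st.median(with_outlier))  # 29.4 12
```

A large gap between mean and median is a useful hint that the data is skewed or contains outliers.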

3.3 median() Can Be Expensive on Large Data

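statistics.median() sorts a copy of the data, which costs O(n log n) time plus extra memory and becomes expensive for huge datasets. In such cases, consider an approximation such as the median of a random sample, or an external library such as NumPy.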
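
One possible strategy, sketched here under simplifying assumptions, is to estimate the median from a random sample. The population, sample size, and fixed seed are all illustrative; the result is an approximation, and how close it lands depends on the sample size and the shape of the data.

```python
import random
import statistics as st

random.seed(0)  # fixed seed so the sketch is reproducible

# Stand-in for a large dataset; its true median is 500_000
population = range(1_000_001)

# The median of a random sample approximates the median of the full data
sample = random.sample(population, 10_001)
approx_median = st.median(sample)

print(approx_median)
```

This only works when the data supports random access or can be sampled while streaming; otherwise a dedicated library is likely the better route.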


4. A Quick Example: Summarizing Sensor Readings

Below is a concise script that summarizes a list of sensor values (floats). It uses statistics for central tendency and spread, and math.fsum() to total the absolute deviations from a reference value accurately.

import math
import statistics as st

# Example sensor readings (21.5 stands out as an outlier)
readings = [20.1, 20.0, 20.2, 19.9, 20.1, 21.5]

# 1) Mean: quick sense of overall level
print("fmean:", st.fmean(readings))   # fmean: 20.3

# 2) Median: robust to outliers
print("median:", st.median(readings))   # median: 20.1

# 3) Sample std dev: spread of values
print("stdev(sample):", st.stdev(readings))    # stdev(sample): 0.5966573556070519

# 4) Accurate sum of absolute differences from a reference (e.g., 20.0)
diffs = [abs(x - 20.0) for x in readings]
print("sum abs diff:", math.fsum(diffs))     # sum abs diff: 2.0000000000000036

With just a few lines, you can quickly gauge whether most readings hover around a target and how much variation exists.

