Python runstats Module: Compute running statistics and regression in a single pass over the data.

Features and benefits:

• Statistics summary computes arithmetic mean, variance, standard deviation, skewness, and kurtosis.
• Regression summary computes slope, intercept, and correlation.
• Based on the Knuth and Welford method for computing standard deviation in one pass, as described in The Art of Computer Programming, Vol. 2, 3rd edition, p. 232.
• Numerically stable and accurate.
• Supports combining summary objects with + and += operators.
• Requires only one pass over the data points, and can report up-to-date statistics and regression values at any time while values are still being added.
• Install Python runstats from PyPI
• Fork Python runstats on GitHub
• Read the Docs for Python runstats
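
To make the single-pass idea in the list above concrete, here is a minimal pure-Python sketch of Welford's update for the mean and variance. The class name `RunningStats`, its attributes, and the `ddof` parameter are hypothetical and chosen for illustration; this is not the runstats source, and it leaves out skewness and kurtosis.

```python
import math


class RunningStats:
    """Hypothetical minimal accumulator (illustration only, not runstats)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def push(self, value):
        """Fold one value into the summary using Welford's update."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        # Uses the old delta and the freshly updated mean.
        self.m2 += delta * (value - self.mean)

    def variance(self, ddof=1):
        """Sample variance by default (divides by count - ddof)."""
        return self.m2 / (self.count - ddof)

    def stddev(self, ddof=1):
        """Square root of the variance for the same ddof."""
        return math.sqrt(self.variance(ddof))


summary = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    summary.push(value)

print('Mean:', summary.mean)
print('Population variance:', summary.variance(ddof=0))
```

Because each `push` touches only three numbers, the summary can be queried after every value, which is what makes running statistics cheap.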

Python runstats Examples

Statistics

    import random

    from runstats import Statistics

    stats = Statistics()

    # Add values to statistics summary.
    for num in range(1000):
        stats.push(random.random())

    # Copy statistics summary.
    more_stats = stats.copy()

    for num in range(1000):
        more_stats.push(random.random())

    # Combine summaries.
    stats += more_stats

    # Display statistics.
    print('Count:', len(stats))
    print('Mean:', stats.mean())
    print('Variance:', stats.variance())
    print('Standard Deviation:', stats.stddev())
    print('Skewness:', stats.skewness())
    print('Kurtosis:', stats.kurtosis())
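
The example above combines two summaries with `+=`. As a rough illustration of how such a merge can work without revisiting the data, the sketch below applies a standard pairwise-merge formula to `(count, mean, m2)` tuples. The helper names `summarize` and `combine` are hypothetical, and the sketch covers only mean and variance; a full merge would also need to carry the higher moments used for skewness and kurtosis.

```python
def summarize(values):
    """Return (count, mean, m2) for a sequence using Welford's update."""
    count, mean, m2 = 0, 0.0, 0.0
    for value in values:
        count += 1
        delta = value - mean
        mean += delta / count
        m2 += delta * (value - mean)
    return count, mean, m2


def combine(left, right):
    """Merge two (count, mean, m2) summaries without revisiting the data."""
    n_a, mean_a, m2_a = left
    n_b, mean_b, m2_b = right
    total = n_a + n_b
    if total == 0:
        return 0, 0.0, 0.0
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / total
    # The cross term accounts for the gap between the two partial means.
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / total
    return total, mean, m2


first = [1.0, 2.0, 3.0, 4.0]
second = [10.0, 20.0, 30.0]

merged = combine(summarize(first), summarize(second))
whole = summarize(first + second)

print('Merged:', merged)
print('Single pass:', whole)
```

The merged tuple should agree with the single-pass result up to floating-point rounding, which is the property that lets partial summaries be computed separately and added later.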

Simple Linear Regression

    import random

    from runstats import Regression

    def linear_noisy_func(x_coord):
        """Linear function with some noise."""
        alpha, beta = 12, 42
        noise = 20 * (random.random() - 0.5)
        return alpha * x_coord + beta + noise

    regr = Regression()

    # Add values to simple linear regression.
    for num in range(1000):
        regr.push(num, linear_noisy_func(num))

    # Copy regression summary.
    regr_copy = regr.copy()

    regr_more = Regression()

    for num in range(1000, 2000):
        regr_more.push(num, linear_noisy_func(num))

    # Combine summaries.
    regr = regr_copy + regr_more

    # Display regression values.
    print('Count:', len(regr))
    print('Slope:', regr.slope())
    print('Intercept:', regr.intercept())
    print('Correlation:', regr.correlation())
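
For readers curious how slope, intercept, and correlation can be maintained in one pass, the following sketch keeps running means plus the sums of squared deviations and the co-moment of x and y. The class `RunningRegression` and its attribute names are hypothetical, and this is an illustration of the general technique rather than the library's implementation.

```python
import math


class RunningRegression:
    """Hypothetical one-pass least-squares fit (illustration only)."""

    def __init__(self):
        self.count = 0
        self.mean_x = 0.0
        self.mean_y = 0.0
        self.s_xx = 0.0  # sum of squared deviations of x
        self.s_yy = 0.0  # sum of squared deviations of y
        self.s_xy = 0.0  # co-moment of x and y

    def push(self, x, y):
        """Fold one (x, y) pair into the running sums."""
        self.count += 1
        dx = x - self.mean_x
        dy = y - self.mean_y
        self.mean_x += dx / self.count
        self.mean_y += dy / self.count
        # Each update pairs an old deviation with a new one.
        self.s_xx += dx * (x - self.mean_x)
        self.s_yy += dy * (y - self.mean_y)
        self.s_xy += dx * (y - self.mean_y)

    def slope(self):
        return self.s_xy / self.s_xx

    def intercept(self):
        return self.mean_y - self.slope() * self.mean_x

    def correlation(self):
        return self.s_xy / math.sqrt(self.s_xx * self.s_yy)


fit = RunningRegression()
for x in range(100):
    fit.push(x, 12 * x + 42)  # noise-free line, so the fit should recover it

print('Slope:', fit.slope())
print('Intercept:', fit.intercept())
print('Correlation:', fit.correlation())
```

With noise-free input the recovered slope and intercept should match 12 and 42 up to rounding; with noisy input, as in the example above, they only approximate the underlying line.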

Installation

Install from PyPI with pip:

    pip install runstats

Credits

Based entirely on the C++ versions by John Cook: