Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
LOWESS : Locally weighted regression
"""
This module implements the Lowess function for nonparametric regression.
Functions:
lowess Fit a smooth nonparametric regression curve to a scatterplot.
For more information, see
William S. Cleveland: "Robust locally weighted regression and smoothing
scatterplots", Journal of the American Statistical Association, December 1979,
volume 74, number 368, pp. 829-836.
William S. Cleveland and Susan J. Devlin: "Locally weighted regression: An
approach to regression analysis by local fitting", Journal of the American
Statistical Association, September 1988, volume 83, number 403, pp. 596-610.
"""
# Authors: Alexandre Gramfort <alexandre.gramfort@telecom-paristech.fr>
#
# License: BSD (3-clause)
from math import ceil
import numpy as np
from scipy import linalg
def lowess(x, y, f=2. / 3., iter=3):
"""lowess(x, y, f=2./3., iter=3) -> yest
Lowess smoother: Robust locally weighted regression.
The lowess function fits a nonparametric regression curve to a scatterplot.
The arrays x and y contain an equal number of elements; each pair
(x[i], y[i]) defines a data point in the scatterplot. The function returns
the estimated (smooth) values of y.
The smoothing span is given by f. A larger value for f will result in a
smoother curve. The number of robustifying iterations is given by iter. The
function will run faster with a smaller number of iterations.
"""
n = len(x)
r = int(ceil(f * n))
h = [np.sort(np.abs(x - x[i]))[r] for i in range(n)]
w = np.clip(np.abs((x[:, None] - x[None, :]) / h), 0.0, 1.0)
w = (1 - w ** 3) ** 3
yest = np.zeros(n)
delta = np.ones(n)
for iteration in range(iter):
for i in range(n):
weights = delta * w[:, i]
b = np.array([np.sum(weights * y), np.sum(weights * y * x)])
A = np.array([[np.sum(weights), np.sum(weights * x)],
[np.sum(weights * x), np.sum(weights * x * x)]])
beta = linalg.solve(A, b)
yest[i] = beta[0] + beta[1] * x[i]
residuals = y - yest
s = np.median(np.abs(residuals))
delta = np.clip(residuals / (6.0 * s), -1, 1)
delta = (1 - delta ** 2) ** 2
return yest
if __name__ == '__main__':
import math
n = 100
x = np.linspace(0, 2 * math.pi, n)
y = np.sin(x) + 0.3 * np.random.randn(n)
f = 0.25
yest = lowess(x, y, f=f, iter=3)
import pylab as pl
pl.clf()
pl.plot(x, y, label='y noisy')
pl.plot(x, yest, label='y pred')
pl.legend()
pl.show()
@nickv2002

This comment has been minimized.

Copy link

commented Nov 6, 2013

Thanks for this... it's very useful and easier than installing the statsmodels packages.

Note that numpy has a value for pi so you can remove import math on line 59 and change math.pi to np.pi. I think you can also use np.ceil instead of importing math.ceil in the same respect.

@Zujio

This comment has been minimized.

Copy link

commented Aug 20, 2014

Wonderful! Thank you for this.

@davidliwei

This comment has been minimized.

Copy link

commented May 15, 2015

Thanks very much! Really useful.

@neale

This comment has been minimized.

Copy link

commented Jul 9, 2015

This is great! Just as effective as anything else I've used and it avoids heavy dependencies.

@mzhao4

This comment has been minimized.

Copy link

commented Sep 8, 2015

Good code. But a minor suggestion that might take into consideration is that if input x is integers, the w matrix might have problem and result in singular matrix in solving linear systems. I would add a conversion sentence such as x = x.astype('float') before implementing w = np.clip(np.abs((x[:,None] - x[None,:]) / h), 0.0, 1.0). Or simply force h to be float.

@NelleV

This comment has been minimized.

Copy link

commented Oct 14, 2015

Hi Alex,
Do you mind adding a licence?
Cheers,
N

@agramfort

This comment has been minimized.

Copy link
Owner Author

commented Oct 21, 2015

done

@Xunius

This comment has been minimized.

Copy link

commented Mar 9, 2017

Thanks for the code.
Is it possible to extend this to 2D? I'm looking for a solution that works on data in the format of maps/images. Is it possible to allow the local subset to be non-square, e.g. 10 grid wide x 6 grid height?

@bbartley

This comment has been minimized.

Copy link

commented May 13, 2017

This is really cool, but it doesn't quite do what I need. I guess what I'm looking for is a gist that estimates beta[1] and the standard error for beta[1]. If I were a little smarter I could probably figure out how to modify this.

@mridolfi

This comment has been minimized.

Copy link

commented Feb 7, 2018

Thanks, this is very nice. However, I have an issue. When I change f I often get this error:
numpy.linalg.linalg.LinAlgError: Singular matrix

What can I do about it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.