Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
LOWESS : Locally weighted regression
"""
This module implements the Lowess function for nonparametric regression.
Functions:
lowess Fit a smooth nonparametric regression curve to a scatterplot.
For more information, see
William S. Cleveland: "Robust locally weighted regression and smoothing
scatterplots", Journal of the American Statistical Association, December 1979,
volume 74, number 368, pp. 829-836.
William S. Cleveland and Susan J. Devlin: "Locally weighted regression: An
approach to regression analysis by local fitting", Journal of the American
Statistical Association, September 1988, volume 83, number 403, pp. 596-610.
"""
# Authors: Alexandre Gramfort <alexandre.gramfort@telecom-paristech.fr>
#
# License: BSD (3-clause)
from math import ceil
import numpy as np
from scipy import linalg
def lowess(x, y, f=2. / 3., iter=3):
"""lowess(x, y, f=2./3., iter=3) -> yest
Lowess smoother: Robust locally weighted regression.
The lowess function fits a nonparametric regression curve to a scatterplot.
The arrays x and y contain an equal number of elements; each pair
(x[i], y[i]) defines a data point in the scatterplot. The function returns
the estimated (smooth) values of y.
The smoothing span is given by f. A larger value for f will result in a
smoother curve. The number of robustifying iterations is given by iter. The
function will run faster with a smaller number of iterations.
"""
n = len(x)
r = int(ceil(f * n))
h = [np.sort(np.abs(x - x[i]))[r] for i in range(n)]
w = np.clip(np.abs((x[:, None] - x[None, :]) / h), 0.0, 1.0)
w = (1 - w ** 3) ** 3
yest = np.zeros(n)
delta = np.ones(n)
for iteration in range(iter):
for i in range(n):
weights = delta * w[:, i]
b = np.array([np.sum(weights * y), np.sum(weights * y * x)])
A = np.array([[np.sum(weights), np.sum(weights * x)],
[np.sum(weights * x), np.sum(weights * x * x)]])
beta = linalg.solve(A, b)
yest[i] = beta[0] + beta[1] * x[i]
residuals = y - yest
s = np.median(np.abs(residuals))
delta = np.clip(residuals / (6.0 * s), -1, 1)
delta = (1 - delta ** 2) ** 2
return yest
if __name__ == '__main__':
import math
n = 100
x = np.linspace(0, 2 * math.pi, n)
y = np.sin(x) + 0.3 * np.random.randn(n)
f = 0.25
yest = lowess(x, y, f=f, iter=3)
import pylab as pl
pl.clf()
pl.plot(x, y, label='y noisy')
pl.plot(x, yest, label='y pred')
pl.legend()
pl.show()
@nickv2002

This comment has been minimized.

Copy link

nickv2002 commented Nov 6, 2013

Thanks for this... it's very useful and easier than installing the statsmodels packages.

Note that numpy has a value for pi so you can remove import math on line 59 and change math.pi to np.pi. I think you can also use np.ceil instead of importing math.ceil in the same respect.

@Zujio

This comment has been minimized.

Copy link

Zujio commented Aug 20, 2014

Wonderful! Thank you for this.

@davidliwei

This comment has been minimized.

Copy link

davidliwei commented May 15, 2015

Thanks very much! Really useful.

@neale

This comment has been minimized.

Copy link

neale commented Jul 9, 2015

This is great! Just as effective as anything else I've used and it avoids heavy dependencies.

@mzhao4

This comment has been minimized.

Copy link

mzhao4 commented Sep 8, 2015

Good code. But a minor suggestion that might take into consideration is that if input x is integers, the w matrix might have problem and result in singular matrix in solving linear systems. I would add a conversion sentence such as x = x.astype('float') before implementing w = np.clip(np.abs((x[:,None] - x[None,:]) / h), 0.0, 1.0). Or simply force h to be float.

@NelleV

This comment has been minimized.

Copy link

NelleV commented Oct 14, 2015

Hi Alex,
Do you mind adding a licence?
Cheers,
N

@agramfort

This comment has been minimized.

Copy link
Owner Author

agramfort commented Oct 21, 2015

done

@Xunius

This comment has been minimized.

Copy link

Xunius commented Mar 9, 2017

Thanks for the code.
Is it possible to extend this to 2D? I'm looking for a solution that works on data in the format of maps/images. Is it possible to allow the local subset to be non-square, e.g. 10 grid wide x 6 grid height?

@bbartley

This comment has been minimized.

Copy link

bbartley commented May 13, 2017

This is really cool, but it doesn't quite do what I need. I guess what I'm looking for is a gist that estimates beta[1] and the standard error for beta[1]. If I were a little smarter I could probably figure out how to modify this.

@mridolfi

This comment has been minimized.

Copy link

mridolfi commented Feb 7, 2018

Thanks, this is very nice. However, I have an issue. When I change f I often get this error:
numpy.linalg.linalg.LinAlgError: Singular matrix

What can I do about it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.