@evu
Created August 20, 2021 17:33
Scatterplot with regression lines
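import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Assumes x and y are pandas Series indexed by sample name (e.g. "sample10")
palette = iter(sns.color_palette("tab10"))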
# Plot points
g = sns.scatterplot(x=x, y=y, color=next(palette))
g.grid(False)
# Plot regression line
m, b = np.polyfit(x, y, 1)
linecolor = next(palette)
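# Evaluate the fit on a NumPy array so it also works for non-integer data and reaches x.max()
xr = np.linspace(x.min(), x.max(), 100)
line1 = plt.plot(xr, m * xr + b, color=linecolor, linestyle="solid")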
outliers = ["sample10", "sample27"]
# Label outliers on plot
for o in outliers:
    plt.annotate(
        o,
        (
            x[o] + 2,
            y[o],
        ),
        color="gray",
    )
# Drop outliers
x = x.drop(index=outliers)
y = y.drop(index=outliers)
# Refit without outliers
m, b = np.polyfit(x, y, 1)
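# Use a separate array for the line so the data in x is not overwritten
xr_clean = np.linspace(x.min(), x.max(), 100)
line2 = plt.plot(xr_clean, m * xr_clean + b, color=linecolor, linestyle="dotted")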
# Add legend
g.legend(
[line1[0], line2[0]],
["all data", "outliers removed"],
loc="lower right"
)
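
For reference, `np.polyfit(x, y, 1)` returns the slope and intercept of an ordinary least-squares line. The sketch below reproduces that computation in plain Python to show how a single outlier can shift the fit. The function name `fit_line` and the sample data are hypothetical, chosen for illustration only; they are not part of the original snippet.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two points and equal-length inputs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x; zero means the line would be vertical
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    if sxx == 0:
        raise ValueError("x values are all identical")
    # Sum of cross deviations between x and y
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: four points on y = 2x + 1, plus one outlier at the end
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 30]

m_all, b_all = fit_line(xs, ys)                # fit including the outlier
m_clean, b_clean = fit_line(xs[:-1], ys[:-1])  # fit with the outlier dropped
```

Comparing `m_all` with `m_clean` gives the same "all data" versus "outliers removed" contrast that the solid and dotted lines show in the plot.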