@mahimairaja
Created February 11, 2023 13:04
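Plots the density of a data column and marks the ranges within 1, 2, and 3 standard deviations of the mean, labelled with the share of data each range covers under a normal distribution.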
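from scipy import stats
import matplotlib.pyplot as plt
import pandas as pd

# Load the data; replace the placeholder with the path to your CSV file.
df = pd.read_csv('your_csv_name')
# Drop missing values first, since gaussian_kde does not handle NaN.
col = df['Grade'].dropna()
# Kernel density estimate, used below to read curve heights at chosen x values.
density = stats.gaussian_kde(col)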
# Draw the density curve of the column.
col.plot.density()
# Sample standard deviation and mean of the column.
s = col.std()
m = col.mean()
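# Segment joining the curve at one standard deviation on either side of the mean.
# The labelled percentages are the normal-distribution values, not measured from the data.
x1 = [m - s, m + s]
y1 = density(x1)
plt.plot(x1, y1, color='magenta')
plt.annotate("1 std (68.27%)", (x1[1], y1[1]))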
# Two standard deviations: about 95.45% of a normal distribution.
x2 = [m - (2 * s), m + (2 * s)]
y2 = density(x2)
plt.plot(x2, y2, color='yellow')
plt.annotate("2 std (95.45%)", (x2[1], y2[1]))
# Three standard deviations: about 99.73% of a normal distribution.
x3 = [m - (3 * s), m + (3 * s)]
y3 = density(x3)
plt.plot(x3, y3, color='orange')
plt.annotate("3 std (99.73%)", (x3[1], y3[1]))
# Mark the mean with a dashed vertical line.
plt.axvline(m, color='cyan', linestyle='dashed', linewidth=2)
# Hide the axes so only the curve and annotations are shown.
plt.axis('off')
plt.show()
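The percentages in the annotations are the theoretical shares for a normal distribution, so they may not match real data that is skewed or heavy-tailed. As a rough check, the sketch below computes the theoretical shares from the normal CDF and compares them with the share actually observed in a sample. The function names `normal_coverage` and `empirical_coverage` and the sample grades are illustrative assumptions, not part of the original script.

```python
import numpy as np
from scipy import stats


def normal_coverage(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return stats.norm.cdf(k) - stats.norm.cdf(-k)


def empirical_coverage(values, k):
    """Share of the observed values lying within k sample standard deviations of their mean."""
    values = np.asarray(values, dtype=float)
    m = values.mean()
    s = values.std(ddof=1)  # ddof=1 matches the default of pandas Series.std()
    return float(np.mean(np.abs(values - m) <= k * s))


if __name__ == "__main__":
    # Made-up grades, for illustration only.
    grades = [50, 50, 47, 97, 49, 3, 53, 42, 26, 74]
    for k in (1, 2, 3):
        print(f"{k} std: normal {normal_coverage(k):.2%}, "
              f"observed {empirical_coverage(grades, k):.2%}")
```

If the observed shares differ a lot from the theoretical ones, the fixed labels on the plot may be misleading for that column, and annotating with the observed shares could be a better choice.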