Skip to content

Instantly share code, notes, and snippets.

@HarryStevens
Last active June 27, 2017 19:51
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save HarryStevens/be559bed98d662f69e68fc8a7e0ad097 to your computer and use it in GitHub Desktop.
Save HarryStevens/be559bed98d662f69e68fc8a7e0ad097 to your computer and use it in GitHub Desktop.
Linear Regression for Scatter Plot
license: gpl-3.0

Want to add a "trendline" to your d3.js scatter plot? You will have to calculate the linear regression of your data. I wrote a javascript function that does just that, based on this excellent tutorial.

name x y
Noah 11.27003571 88.77343766
Liam 19.73385696 79.04811495
Ethan 3.761929642 85.68535388
Lucas 5.421437532 82.80845477
Mason 2.549084585 79.97105809
Oliver 33.36555772 58.44080684
Aiden 1.594728697 92.31637968
Elijah 5.028269931 84.06370038
Benjamin 29.62285349 70.40983908
James 4.300790399 65.71298073
Logan 1.018589568 92.72992499
Jacob 6.189890001 62.0497844
Jackson 2.500819713 91.00477379
Michael 4.354894457 76.83763512
Carter 14.050125 63.6362308
William 1.988261951 84.52044745
Daniel 11.94771355 55.17891294
Alexander 7.257912644 76.05499515
Luke 8.287677281 77.36675344
Owen 1.536290248 88.48360983
Jack 3.226684682 96.3338812
Gabriel 1.709155851 88.068512
Matthew 6.537126369 90.79426085
Henry 1.959780252 92.16606293
Sebastian 4.613524263 84.75412344
Wyatt 2.546143547 84.67401936
Grayson 2.203033882 86.39972227
Isaac 2.802066804 85.69300716
Ryan 8.737254378 84.5829933
Nathan 9.391822654 75.38122126
Jayden 1.629713601 89.32460249
Jaxon 2.067608467 79.76530605
Caleb 2.071654475 90.27542871
David 3.169471649 91.46316395
Levi 14.12227801 66.81414111
Eli 3.151841812 85.71375168
Julian 9.458633815 89.29621373
Andrew 3.727868003 89.33320868
Dylan 1.691066983 88.59192863
Hunter 2.151502825 91.26024528
Emma 8.994238858 83.04809322
Olivia 3.173394263 89.79970872
Ava 9.931748154 82.77564203
Sophia 9.257203392 84.31899348
Isabella 2.125536698 85.79547586
Mia 5.751086254 82.15530514
Charlotte 4.476227873 76.55001202
Harper 2.86511372 95.26945167
Amelia 6.165794958 77.07868992
Abigail 1.527170789 81.8296847
Emily 3.310661013 93.05720137
Madison 19.26649252 93.47722192
Lily 4.397012544 90.71884874
Ella 1.903926077 80.23402878
Avery 24.53763972 63.39872735
Evelyn 2.362863791 91.81263692
Sofia 19.99035985 48.32296873
Aria 28.36302159 57.69543829
Riley 21.56565684 71.3442642
Chloe 4.674148004 85.11159217
Scarlett 4.659473604 91.32189682
Ellie 8.262626513 45.97488076
Elizabeth 19.2708132 56.74192862
Aubrey 40.18594637 66.66316631
Grace 4.605922517 75.82782712
Layla 2.787963348 86.69875359
Addison 3.359345533 80.24495283
Zoey 21.67122862 57.09057474
Hannah 9.039155653 68.79092791
Mila 1.836441804 69.9295158
Victoria 1.784124503 79.29206832
Brooklyn 1.34223936 93.35376759
Zoe 2.534419191 82.7585241
Penelope 7.46372613 87.89232417
Lucy 2.937304061 84.5204076
<html>
<head>
<style>
body {
margin: 0 auto;
display: table;
font-family: "Helvetica Neue", sans-serif;
}
.regression {
stroke-width: 2px;
stroke: steelblue;
stroke-dasharray: 10,5;
}
.equation {
font-size: 12px;
margin-top: 10px;
text-align: center;
}
</style>
</head>
<body>
<div class="chart"></div>
<div class="equation"></div>
<div class="equation"></div>
<script src="https://d3js.org/d3.v4.min.js"></script>
<script>
var margin = {top: 5, right: 5, bottom: 20, left: 20},
width = 450 - margin.left - margin.right,
height = 450 - margin.top - margin.bottom;
var svg = d3.select(".chart").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
var x = d3.scaleLinear()
.range([0,width]);
var y = d3.scaleLinear()
.range([height,0]);
var xAxis = d3.axisBottom()
.scale(x);
var yAxis = d3.axisLeft()
.scale(y);
d3.csv("data.csv", types, function(error, data){
y.domain(d3.extent(data, function(d){ return d.y}));
x.domain(d3.extent(data, function(d){ return d.x}));
// see below for an explanation of the calcLinear function
var lg = calcLinear(data, "x", "y", d3.min(data, function(d){ return d.x}), d3.min(data, function(d){ return d.x}));
svg.append("line")
.attr("class", "regression")
.attr("x1", x(lg.ptA.x))
.attr("y1", y(lg.ptA.y))
.attr("x2", x(lg.ptB.x))
.attr("y2", y(lg.ptB.y));
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
svg.append("g")
.attr("class", "y axis")
.call(yAxis);
svg.selectAll(".point")
.data(data)
.enter().append("circle")
.attr("class", "point")
.attr("r", 3)
.attr("cy", function(d){ return y(d.y); })
.attr("cx", function(d){ return x(d.x); });
});
function types(d){
d.x = +d.x;
d.y = +d.y;
return d;
}
// Calculate a linear regression from the data
// Takes 5 parameters:
// (1) Your data
// (2) The column of data plotted on your x-axis
// (3) The column of data plotted on your y-axis
// (4) The minimum value of your x-axis
// (5) The minimum value of your y-axis
// Returns an object with two points, where each point is an object with an x and y coordinate
function calcLinear(data, x, y, minX, minY){
/////////
//SLOPE//
/////////
// Let n = the number of data points
var n = data.length;
// Get just the points
var pts = [];
data.forEach(function(d,i){
var obj = {};
obj.x = d[x];
obj.y = d[y];
obj.mult = obj.x*obj.y;
pts.push(obj);
});
// Let a equal n times the summation of all x-values multiplied by their corresponding y-values
// Let b equal the sum of all x-values times the sum of all y-values
// Let c equal n times the sum of all squared x-values
// Let d equal the squared sum of all x-values
var sum = 0;
var xSum = 0;
var ySum = 0;
var sumSq = 0;
pts.forEach(function(pt){
sum = sum + pt.mult;
xSum = xSum + pt.x;
ySum = ySum + pt.y;
sumSq = sumSq + (pt.x * pt.x);
});
var a = sum * n;
var b = xSum * ySum;
var c = sumSq * n;
var d = xSum * xSum;
// Plug the values that you calculated for a, b, c, and d into the following equation to calculate the slope
// slope = m = (a - b) / (c - d)
var m = (a - b) / (c - d);
/////////////
//INTERCEPT//
/////////////
// Let e equal the sum of all y-values
var e = ySum;
// Let f equal the slope times the sum of all x-values
var f = m * xSum;
// Plug the values you have calculated for e and f into the following equation for the y-intercept
// y-intercept = b = (e - f) / n
var b = (e - f) / n;
// Print the equation below the chart
document.getElementsByClassName("equation")[0].innerHTML = "y = " + m + "x + " + b;
document.getElementsByClassName("equation")[1].innerHTML = "x = ( y - " + b + " ) / " + m;
// return an object of two points
// each point is an object with an x and y coordinate
return {
ptA : {
x: minX,
y: m * minX + b
},
ptB : {
y: minY,
x: (minY - b) / m
}
}
}
</script>
</body>
</html>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment