@emagee
Created April 18, 2015 02:54
Scattered chocolate

### Process notes

  1. I started with the original data set, but that seemed too small for a scatterplot, and I wanted to find a dataset for which I had pre-1999 values; specifically, I wanted the data to cover some pre-NAFTA years. I did track down that data. The earliest year available was 1989, which is fine, since NAFTA took effect in 1994. The new data set also included 2014, whereas the original set only went as far as 2013. Whee!

  2. As part of my data cleaning, I took out all countries from which the US did not import chocolate in 1989, and set the minimum value of the chocolate import per country at 1 million; this allowed me to take out all countries that had a "0" in their 1989 column. (Not sure at which point values were rounded down to zero; the metadata mentioned a different reason for the zeros anyway.)

  3. Because of the large number of countries in the original list, I opted to go without the "rest of world" entry that was in the original dataset. Should I have soldiered on and found the average "rest of world" value for each year?

  4. Next, I plotted the new data. At first I had 1989 imports on the x axis, 2014 on the y. Soon after I had an inexplicable need to flip this setup. It just seemed more appropriate. I cannot put my finger on why.

  5. I then multiplied all 1989 values by 1.93, the inflation factor (1989-2014) that I got from a random inflation calculator on the Internet. Should I have done this in the original dataset instead, or would the utility of that depend on if and how I wanted to use this scatterplot on this dataset?

  6. It dawned on me that I could have retrieved a data set of chocolate QUANTITY instead of the DOLLAR VALUE for each year, which would have negated the need to adjust for inflation, but I needed to move on. Of course, charting both quantity and value might have made a nice statement about the cost of chocolate, but I digress...

  7. Since I prefer all my graph elements to be inside the graph confines, as opposed to the highest values extending to or being plotted at the edge of either axis, I padded each scale's domain a bit, just five percent. Is this kosher? The code:

        xScale.domain([ 0, d3.max(data, function(d) {
            return +d[2014] + (+d[2014])*0.05;
        }) ]);

  8. A few outliers exist. Canada was expected, but Brazil was a big surprise!
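
As a rough sketch (not the code used in the chart), the cleaning, inflation, and padding steps from notes 2, 5, and 7 could be written as small helpers. The function names and the sample rows are hypothetical; the 1.934 factor and the 5% padding are the values that appear in the chart script, and the "drop rows without a positive 1989 value" rule is my reading of note 2.

```javascript
// Assumed constants: the factor and padding used in the chart script.
const INFLATION_1989_TO_2014 = 1.934;
const PADDING = 0.05;

// Note 2: keep only rows whose 1989 value is a positive number.
// Number("-") is NaN, so placeholder dashes are dropped as well.
function keepCountriesWith1989Imports(rows) {
  return rows.filter(function (row) {
    const value = Number(row["1989"]);
    return Number.isFinite(value) && value > 0;
  });
}

// Note 5: express a 1989 value in 2014 dollars.
function adjustForInflation(value1989) {
  return Number(value1989) * INFLATION_1989_TO_2014;
}

// Note 7: upper bound for a scale domain, with 5% headroom.
function paddedMax(values) {
  return Math.max.apply(null, values) * (1 + PADDING);
}

// Hypothetical sample rows shaped like the parsed CSV (all values are strings).
const sample = [
  { Source: "Canada", "1989": "100", "2014": "1177" },
  { Source: "Brazil", "1989": "115", "2014": "54" },
  { Source: "Nowhere", "1989": "-", "2014": "3" } // made-up row to show the filter
];

const kept = keepCountriesWith1989Imports(sample);
const xMax = paddedMax(kept.map(function (d) { return Number(d["2014"]); }));
const yMax = paddedMax(kept.map(function (d) { return adjustForInflation(d["1989"]); }));

console.log(kept.length, xMax, yMax);
```

Multiplying the maximum by 1.05 gives the same result as adding 5% of the value to itself, and adjusting for inflation before padding keeps the headroom proportional on both axes.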

### Lessons learned

  1. Borrowing code from classmates, instructors, and other helpful souls is only effective when you know what you're doing. My head continues to spin; my brain stays tied in knots.

  2. Writing orderly, well-formatted, well-commented code is a noble goal that eludes me time and time again. I close with the link to today's (quite relevant) xkcd comic: [http://xkcd.com/1513/](http://xkcd.com/1513/). (Be sure to hover over the first frame to reveal a tooltip.)

Cheers!

Source,1989,1990,1991,1992,1993,1994,1995,1996,1997,1998,1999,2000,2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014
Canada,100,115,127,151,164,178,222,262,316,348,346,390,456,525,705,721,712,709,726,758,647,872,941,989,1046,1177
Mexico,16,23,12,10,17,15,25,26,27,23,35,34,37,54,73,89,102,131,144,215,355,454,533,515,499,478
Indonesia,4,12,16,17,18,20,47,62,97,62,57,44,47,50,60,62,68,64,83,139,100,122,166,157,160,251
Malaysia,22,46,51,55,30,38,43,42,69,66,57,43,47,38,90,110,112,108,121,256,242,308,278,148,134,185
Cote d'Ivoire,37,35,33,28,23,31,30,26,23,39,30,24,13,30,60,62,78,79,67,91,136,176,129,154,146,179
Germany,18,17,18,22,24,21,25,29,29,29,30,40,54,52,58,53,58,64,82,78,79,150,166,182,174,178
Netherlands,71,80,53,54,66,67,66,67,73,75,71,79,101,124,175,163,134,122,107,132,163,250,297,267,202,169
Belgium-Luxembourg,19,24,22,23,25,23,25,27,35,40,51,48,55,50,61,70,76,89,103,96,80,93,115,120,124,131
France,7,6,7,9,9,11,12,11,15,17,26,21,24,31,42,46,47,55,61,56,52,71,73,63,64,62
Switzerland,19,21,17,19,17,20,22,23,21,26,28,22,22,25,30,37,44,52,64,67,57,51,60,50,57,61
Brazil,115,125,140,127,111,66,40,49,33,39,23,38,41,46,123,120,147,147,111,117,59,70,73,50,24,54
Italy,15,12,10,13,14,15,23,27,28,28,12,11,15,15,17,18,25,30,30,39,26,30,39,38,47,46
Ireland,5,7,7,7,7,7,4,5,7,9,11,10,13,14,12,15,21,29,23,30,24,13,25,30,29,37
Spain,2,3,2,1,2,3,3,3,4,3,4,5,5,10,16,15,13,13,21,18,17,39,55,63,52,36
India,1,2,-,-,-,1,1,-,1,-,-,-,-,-,-,-,1,1,1,4,4,7,4,3,7,36
Peru,13,11,13,8,6,4,-,1,13,11,11,9,5,4,3,4,4,8,5,14,17,8,15,11,23,29
United Kingdom,21,27,18,23,25,28,33,32,44,50,56,63,59,61,58,61,54,48,47,51,32,30,19,30,31,24
Singapore,18,26,17,24,14,23,23,19,18,34,14,18,11,17,26,13,17,36,59,67,61,57,53,28,22,24
Ecuador,27,42,42,27,21,20,22,30,41,17,18,22,12,5,14,13,10,3,2,22,12,10,12,7,18,21
Colombia,5,9,9,5,5,5,3,6,11,6,9,4,8,6,7,10,10,4,7,13,10,13,10,7,10,21
Sweden,3,2,2,3,2,3,3,3,3,3,3,3,3,4,4,4,5,5,5,5,6,11,22,9,12,17
Austria,1,1,-,-,-,-,-,1,1,1,2,-,4,7,5,6,7,6,6,8,8,10,12,12,11,13
Israel,3,3,2,3,4,2,3,5,6,6,5,5,8,4,5,5,6,7,7,8,8,12,12,14,10,12
Philippines,11,17,15,9,5,4,5,6,5,4,3,1,2,2,4,3,2,3,2,5,2,2,2,3,10,12
Cameroon,8,6,2,1,1,1,2,2,1,1,1,4,1,14,18,13,16,15,10,6,11,31,36,34,21,12
China,9,12,13,21,30,24,24,36,49,33,21,17,11,10,13,22,42,48,41,66,12,7,5,7,11,12
Dominican Republic,5,5,4,3,4,4,5,5,7,7,4,4,3,3,5,7,7,6,4,10,7,6,6,4,6,11
Denmark,1,1,1,1,1,1,1,0,1,1,1,3,3,4,5,5,6,5,5,4,3,2,3,3,3,7
Venezuela,2,2,1,2,2,2,2,3,3,2,1,1,1,2,3,4,3,2,2,3,2,2,2,1,5,5
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<title>U.S. imports of chocolate</title>
<style>
body {
background-color: #FFFFFD;
font-family: helvetica, sans-serif;
max-width: 800px;
}
svg {
margin: 12px 0 6px 12px;
}
.graphBlock {
border-radius: 15px;
}
#textBlock {
padding: 10px 10px 0 10px;
margin: 1px 6px 0px 6px;
}
#innerTextBlock{
margin-left: 94px;
margin-top: 24px;
width: 592px;
padding: 0px 15px 9px 15px;
border: 1px dotted darkkhaki;
background-color: oldlace;
}
.hangingindent {
padding-left: 1px ;
text-indent: -8px ;
}
h1 {
margin: 24px 6px 4px 6px;
font-weight: bold;
padding-left: 8px;
padding-bottom: 2px;
font-size: 20px;
border-bottom: 3px solid darkgoldenrod;
color: sienna;
}
h2 {
margin: 24px 0 0 0;
font-weight: bold;
font-size: 20px;
color: sienna;
min-width: 600px;
}
h3 {
color:sienna;
font-size: 14px;
line-height: 125%;
margin: 8px 0 0px 0;
padding-top:3px;
}
.intro {
font-size: 15px;
font-weight: bold;
color: black;
min-width: 600px;
margin-top: -6px;
}
.sidenote {
line-height: 125%;
font-size:12px;
margin: 3px 0 0px 0px;
}
.title {
line-height: 125%;
font-size:14px;
font-weight: bold;
color: sienna;
margin: 34px 0 -10px 98px;
}
.source {
padding: 0 40px 0 0;
font-size: 11px;
color: black;
font-style: italic;
text-align: right;
margin-top: -5px;
}
a {
text-decoration: none;
color: sienna;
}
.axis path,
.axis line {
fill: none;
stroke: darkkhaki;
stroke-dasharray:1, 2;
shape-rendering: crispEdges;
}
.axis text {
font-family: sans-serif;
font-size: 10px;
}
.label {
fill: sienna;
font-size: 13px;
}
.noBold {
font-weight: normal;
}
</style>
<script type="text/javascript" src="https://d3js.org/d3.v3.min.js" charset="utf-8"></script>
</head>
<body>
<header>
<h1>Canada continues to be the major exporter of chocolate to the United States.</h1>
</header>
<div id="textBlock">
<p class="intro">Mexico is a distant second; once-mighty Brazil and other countries are no longer big players.</p>
<div id="innerTextBlock">
<h3>The charted values cover mostly wholesale chocolate for the food-service and food-manufacturing industries.</h3>
<p class="sidenote">Here &ldquo;chocolate&rdquo; comprises chocolate bars and slabs weighing more than two kilograms each, cocoa paste, cocoa butter, and cocoa powder. Popular sweets such as Mars Bars, Kit Kats, and Hershey's Kisses are <em>not</em> included in the data. </p>
<h3>The power of NAFTA</h3>
<p class="sidenote">The US Department of Agriculture says that &ldquo;North American Free Trade Agreement partners Canada and Mexico are not only the main destinations for US exports but also the main suppliers of chocolate candy to the US market.&rdquo; </p>
<p class="sidenote">But that doesn't necessarily mean that Canada's and Mexico's stronger presence is <em>because</em> of NAFTA&mdash;does it? NAFTA took effect in 1994; perhaps a line chart covering the years 1989 through 2014 would reveal some clues. Stay tuned!</p>
</div> <!-- end innerTextBlock-->
<p class="title">Import values of chocolate entering US ports and their origin of shipment, 1989 and 2014
<br /><span class="noBold">(1989 values were adjusted for inflation.)</span></p>
</div>
<svg class="graphBlock"></svg>
<p class="source">Source: USDA, <a href="http://www.fas.usda.gov/gats">www.fas.usda.gov/gats</a></p>
<script>
/*
The addCommas function is from http://www.mredkj.com/javascript/nfbasic.html. It's used to format the dollar amounts in the titles/tooltips.
*/
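function addCommas(nStr) {
nStr += '';
// Declare the working variables locally so they do not leak into the global scope.
var x = nStr.split('.');
var x1 = x[0];
var x2 = x.length > 1 ? '.' + x[1] : '';
var rgx = /(\d+)(\d{3})/;
// Insert a comma before each trailing group of three digits.
while (rgx.test(x1)) {
x1 = x1.replace(rgx, '$1' + ',' + '$2');
}
return x1 + x2;
}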
var w = 800;
var h = 550;
var padding = [ 10, 60, 60, 100 ]; //Top, right, bottom, left
var xScale = d3.scale.linear()
.range([ 0, w - padding[1] - padding[3] ]);
var yScale = d3.scale.linear()
.range([ padding[0], h - padding[2] ]);
var xAxis = d3.svg.axis()
.scale(xScale)
.tickSize(-(h-padding[0]-padding[2]))
.outerTickSize(0)
.tickPadding(6)
.orient("bottom")
.ticks(15);
var yAxis = d3.svg.axis()
.scale(yScale)
.tickSize(-(w-padding[3]-padding[1]))
.outerTickSize(0)
.tickPadding(6)
.orient("left")
.ticks(12);
var svg = d3.select("svg")
.attr("width", w)
.attr("height", h);
d3.csv("chocolate_imports_1989-2014.csv", function(data) {
/** data.sort(function(a,b) {
// thank you for the hint about forcing the strings!
return d3.ascending(+a["2014"], +b["2014"]);
}); **/
xScale.domain([ 0, d3.max(data, function(d) {
return +d[2014] + (+d[2014])*0.05;
}) ]);
yScale.domain([
d3.max(data, function(d) {
return +d[1989] * 1.9340 * 1.05; // pad the inflation-adjusted value by five percent
}), 0
]);
var circles = svg.selectAll("circle")
.data(data)
.enter()
.append("circle");
circles.attr("cx", function(d) {
return xScale(+d[2014]) + padding[3];
})
.attr("cy", function(d) {
return yScale(+d[1989] * 1.9340); // same inflation factor as the y domain and tooltips
})
.attr("r", 625)
.style("opacity", 0.6)
.style("fill", "oldlace")
.attr("class", "bar")
.append("title")
.text(function (d) {
var dollarAmount1989 = Math.round(+d[1989] * 1.9340 * 1000000); // round away floating-point residue
dollarAmount1989 = addCommas(dollarAmount1989);
var dollarAmount2014 = +d[2014] * 1000000;
dollarAmount2014 = addCommas(dollarAmount2014);
var country = d.Source;
var countryCaps = country.toUpperCase();
return "$" + dollarAmount1989 + " worth of chocolate was imported from "
+ countryCaps + " in 1989; " + "$" + dollarAmount2014
+ " was imported in 2014.";
});
circles.transition()
.delay(800)
.duration(2000)
.attr("r", 4)
.style("opacity", 0.7)
.style("fill", "sienna");
var labels = svg.selectAll("text")
.data(data)
.enter()
.append("text");
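labels.attr("x" , function(d) {
// Convert the value to pixels first, then add the left padding and a small gap.
return xScale(+d[2014]) + padding[3] + 8;
})
.attr("y" , function(d) {
// Nudge the label a few pixels below the center of its circle.
return yScale(+d[1989] * 1.9340) + 4;
})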
.text (function(d) {
if (+d[2014] > 140 || +d[1989] > 30) {
return d["Source"];
}
})
.attr("opacity", 0)
.attr("fill", "oldlace")
.attr("text-anchor", "start")
.attr("font-size", "12px")
.transition()
.delay(1300)
.duration(3500)
.attr("opacity", 1)
.attr("fill", "saddlebrown");
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate("+ padding[3] + "," +(h - padding[2]) + ")")
.call(xAxis);
svg.append("g")
.attr("class", "y axis")
.attr("transform", "translate(" + padding[3] + ",0)")
.call(yAxis);
svg.append("text")
.attr("class", "label")
.attr("y", (h - padding[2]/2))
.attr("x", w/2)
.attr("dy", "1em")
.style("text-anchor", "middle")
.text("Value of imported chocolate, in millions of dollars, 2014");
svg.append("text")
.attr("class", "label")
.attr("transform", "rotate(-90)")
.attr("y", padding[3]-60)
.attr("x",0 - (h/2))
.attr("dy", "1em")
.style("text-anchor", "middle")
.text("Value of imported chocolate, in millions of dollars, 1989");
});
</script>
</body>
</html>