@HussamHallak
Last active January 31, 2018 09:30
S18 - HW 3 - Bar Chart
license: gpl-3.0

CS 725/825 - Spring 2018 - Homework 3 - Bar Chart

Hussam Hallak

Bar Chart

1. What is the type of mark used in the bar chart?

2. List the channels, the attribute they are mapped to, and the data type of that attribute.

Answer:

The bar chart encodes two attributes with line marks. The vertical spatial position channel encodes the quantitative attribute, letter usage frequency, and the horizontal spatial position channel encodes the categorical attribute, the letter itself. The letter data type is an item, since each letter is an individual, discrete entity. The letter frequency data type is an attribute: a specific property that can be measured, observed, and logged.

Edit the top chart to use the color hue channel to express the letter attribute. You will need to look at the d3 Color Scales links on the class Links page. Hint: This isn't a one-line change -- you have to setup a color scale and then apply it to the bars based on the appropriate value.

I edited the top chart: I set up a color scale and applied it to the bars based on the letter value, so that the color hue channel expresses the letter attribute.

First, define a variable color and assign a color scale to it:

var color = d3.scaleOrdinal(d3.schemeCategory20);

Then change the fill of the bars from the constant steelblue to a function of the data that passes the letter field of each datum d to the color scale.

I can show you better than I can tell you. Basically, change this line:

.style("fill", "steelblue")  // color of the bars

to this:

.style("fill", function(d) { return color(d.letter);})  // color of the bars
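
One thing worth checking with this scale: d3.schemeCategory20 holds 20 colors, but there are 26 letters, and an ordinal scale recycles its range once the colors run out. The sketch below illustrates that behavior in plain JavaScript. The function makeOrdinalScale and the placeholder palette are hypothetical stand-ins written for illustration, not D3's implementation.

```javascript
// Hypothetical stand-in for an ordinal scale (an assumption for illustration,
// not D3's code): each new key gets the next range entry, wrapping around.
function makeOrdinalScale(range) {
  var seen = new Map(); // key -> order in which the key was first seen
  return function (key) {
    if (!seen.has(key)) seen.set(key, seen.size);
    return range[seen.get(key) % range.length];
  };
}

// Placeholder palette with 20 entries, the same size as d3.schemeCategory20.
var palette = [];
for (var i = 0; i < 20; i++) palette.push("color" + i);

var letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ".split("");
var color = makeOrdinalScale(palette);
var fills = letters.map(function (letter) { return color(letter); });

// Letters from the 21st onward reuse colors already given to A, B, C, ...
var reused = letters.filter(function (letter, idx) { return idx >= palette.length; });
console.log(reused.join(""));        // the letters that share a hue with an earlier letter
console.log(fills[0] === fills[20]); // A and U receive the same palette entry
```

With a 20-color palette, U through Z end up sharing hues with A through F, which is one more way the hue encoding can mislead.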

3. After your changes, list the channels, the attribute they are mapped to, and the data type of that attribute.

The bar chart still encodes two attributes with line marks. The vertical spatial position channel encodes the quantitative attribute, letter usage frequency, while two channels, horizontal spatial position and color hue, now encode the categorical attribute, the letter itself. The letter data type is an item, since each letter is an individual, discrete entity. The letter frequency data type is an attribute: a specific property that can be measured, observed, and logged. The question is: did we add any useful information by doing this? The answer is NO! The different colors do not group letters together based on their frequency. We are confusing the viewer, who might think that the colors mean something when they do not.

Edit the bottom chart to use the color saturation channel to express the frequency attribute.

We can do this similarly by first adding this line of code, which defines a color2 variable and sets it to light blue:

var color2 = d3.color("lightblue");

Then we add the following lines, which extract the frequency values from the data and find their minimum and maximum in order to construct a multiplier that makes the differences in shade visible. This step was the hardest because it took trial and error with the constant to find a good multiplier. Ideally, we would normalize the frequency, but this multiplier gave good results.

  var dataVals = data.map(function(e) { return e.frequency; });
  var minVal = d3.min(dataVals);
  var maxVal = d3.max(dataVals);
  var multiplier = 11/(maxVal-minVal);
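
For comparison, here is a minimal sketch of the normalization mentioned above, using three frequencies copied from data.tsv. The helper normalize is hypothetical and not part of the original code. It maps the smallest value to 0 and the largest to 1, whereas the multiplier approach yields products between roughly 0.06 and 11.

```javascript
// Three rows copied from data.tsv (E is the maximum, Z the minimum).
var sample = [
  { letter: "E", frequency: 0.12702 },
  { letter: "A", frequency: 0.08167 },
  { letter: "Z", frequency: 0.00074 }
];
var values = sample.map(function (d) { return d.frequency; });
var minVal = Math.min.apply(null, values);
var maxVal = Math.max.apply(null, values);

// Hypothetical helper: min-max normalization into [0, 1].
function normalize(f) { return (f - minVal) / (maxVal - minVal); }

// The multiplier used in the text, for comparison.
var multiplier = 11 / (maxVal - minVal);

// Print letter, normalized value, and frequency * multiplier side by side.
sample.forEach(function (d) {
  console.log(d.letter, normalize(d.frequency).toFixed(3), (d.frequency * multiplier).toFixed(2));
});
```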

Now we multiply the frequency by the multiplier and pass the product to darker(k). I initially passed the raw frequency to darker(), but the color barely changed because the values are so small. At first I thought my code was wrong, but it turned out that it only needed some tweaking to fit this kind of data.

.style("fill", function(d) { return d3.hsl(color2).darker(d.frequency * multiplier);})  // color of the bars

The darker the color, the more frequently the letter is used and the longer the bar.
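
A caveat, offered as a reading of d3-color's behavior rather than of the original intent: for an HSL color, darker(k) scales the lightness component by 0.7 raised to the power k and leaves saturation unchanged, so the fill above varies lightness. The sketch below shows one way to vary saturation directly by building CSS hsl() strings. The hue of 200, the lightness of 50%, and the helper names lightnessFactor and saturationFill are assumptions chosen for illustration.

```javascript
var minVal = 0.00074; // Z, the smallest frequency in data.tsv
var maxVal = 0.12702; // E, the largest frequency in data.tsv

// Lightness factor applied by darker(k) on an HSL color: 0.7^k.
function lightnessFactor(k) { return Math.pow(0.7, k); }

// Hypothetical helper: map a frequency to a CSS hsl() string whose
// saturation grows from 0% to 100%; hue and lightness stay fixed.
function saturationFill(frequency) {
  var t = (frequency - minVal) / (maxVal - minVal);
  return "hsl(200, " + Math.round(t * 100) + "%, 50%)";
}

var multiplier = 11 / (maxVal - minVal);
console.log(lightnessFactor(maxVal * multiplier)); // about 0.02, so E is nearly black
console.log(saturationFill(maxVal)); // "hsl(200, 100%, 50%)"
console.log(saturationFill(minVal)); // "hsl(200, 0%, 50%)"
```

In the chart, the fill callback could then return saturationFill(d.frequency) in place of the darker() call.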

4. After your changes, list the channels, the attribute they are mapped to, and the data type of that attribute. List all of the channels mapped to the frequency attribute.

The bar chart still encodes two attributes with line marks. The quantitative attribute, letter usage frequency, is now mapped to two channels: vertical spatial position and color saturation. The categorical attribute, the letter itself, is mapped to one channel, horizontal spatial position. The letter data type is an item, since each letter is an individual, discrete entity. The letter frequency data type is an attribute: a specific property that can be measured, observed, and logged. Again, did we add any useful information by doing this? The answer is NO! The different shades of the same color repeat the information already conveyed by the length of the bars. This time the color does group letters together based on their frequency, but the grouping is redundant. The encoding would be ideal if we were short on space and could not afford a full bar chart. In that case we could make all the bars short (squares) and let color saturation alone convey the frequency. Here, we are confusing the viewer without adding any new information.

Scatterplot

1. What is the type of mark used in the scatterplot?

2. List the channels, the attribute they are mapped to, and the data type of that attribute.

Answer:

The mark used in the scatterplot is a point. The channels are the horizontal and vertical spatial positions, which encode two quantitative attributes: the number of passing touchdowns (TDs) and the number of rushing TDs for each player. The data type for both TD counts is attribute. Each point represents a player, which is an item (an entity).

Edit the chart to use the color hue channel to express the "Conf" attribute (there are 11 conferences) and use the size channel to express another attribute in the dataset.

Similar to what I did for the bar charts, I added this line of code:

  var color = d3.scaleOrdinal(d3.schemeCategory20);

I also added this line in the // draw dots section of the code, which colors the dots based on the "Conf" value:

.style("fill", function(d) { return color(d.Conf)})

I also added the size channel by adding this line under the // draw dots section, which sets the radius of each point based on a value that combines two different columns in the data file:

.attr("r", function(d) { return d.Pct * d.Rate * 0.0005})

I have absolutely no idea what Pct and Rate are, but I tried a couple of different attributes, such as Attempts. None of them made the difference in point sizes obvious enough. I tried using a log scale on them, but that did not help, so I commented that code out. Finally, I tried multiplying Rate by Pct and scaling the product by 0.0005, which gave the best result. This is the // draw dots section of the code:

  // draw dots
  g.selectAll(".dot")
      .data(data)
    .enter().append("circle")
      .attr("class", "dot")
      .attr("r", function(d) { return d.Pct * d.Rate * 0.0005})
      .attr("cx", xMap)
      .attr("cy", yMap)
      .style("fill", function(d) { return color(d.Conf); })
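
When a quantitative value drives circle size, a common approach is to let the radius follow the square root of the value, so that circle area rather than radius tracks the data. The sketch below does this in plain JavaScript. D3 v4 also provides d3.scaleSqrt for this kind of mapping. The helper radiusFor, the pixel range, and the sample values are made up for illustration and do not come from the dataset.

```javascript
// Hypothetical helper: map a value in [vMin, vMax] to a radius in [rMin, rMax],
// with the radius following the square root of the normalized value.
function radiusFor(value, vMin, vMax, rMin, rMax) {
  var t = Math.sqrt((value - vMin) / (vMax - vMin)); // 0 at vMin, 1 at vMax
  return rMin + t * (rMax - rMin);
}

// Made-up sample values standing in for a column such as Rate.
var samples = [100, 125, 200];
var radii = samples.map(function (v) { return radiusFor(v, 100, 200, 2, 10); });
console.log(radii); // [2, 6, 10]
```

Fixing the output range in pixels keeps the smallest points visible and the largest from overwhelming the plot, which may be easier to tune than a single constant such as 0.0005.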

3. After your changes, list the channels, the attribute they are mapped to, and the data type of that attribute.

Answer:

The channels are the horizontal and vertical spatial positions, which encode two quantitative attributes: the number of passing touchdowns (TDs) and the number of rushing TDs for each player. The data type for both TD counts is attribute. Each point represents a player, which is an item (an entity). Another channel is color hue, an identity channel that encodes the conference, a categorical attribute whose data type is item. The last channel is size, which encodes Pct * Rate * k, where k = 0.0005. This attribute is quantitative and its data type is attribute. In this case, the value is a new quantitative attribute constructed from the two existing quantitative attributes Pct and Rate.

See the assignment instructions at http://www.cs.odu.edu/~mweigle/CS725-S18/HW3

Explanation of how the bar chart is constructed (for the most part) is available at https://bost.ocks.org/mike/bar/3/

Other helpful links (scale, domain, range):

Original README.md

This simple bar chart is constructed from a TSV file storing the frequency of letters in the English language. The chart employs conventional margins and a number of D3 features:

forked from mbostock's block: Bar Chart

forked from weiglemc's block: S18 - HW 3 - Bar Chart

letter frequency
A .08167
B .01492
C .02782
D .04253
E .12702
F .02288
G .02015
H .06094
I .06966
J .00153
K .00772
L .04025
M .02406
N .06749
O .07507
P .01929
Q .00095
R .05987
S .06327
T .09056
U .02758
V .00978
W .02360
X .00150
Y .01974
Z .00074
<!DOCTYPE html>
<html>
<meta charset="utf-8">
<script src="https://d3js.org/d3.v4.min.js"></script>
<script src="https://d3js.org/d3-scale-chromatic.v1.min.js"></script> <!-- for color scales -->
<style>
body {font-family: calibri;}
.axis {font: 14px calibri;}
.label {font: 16px calibri;}
</style>
<body>
<p>Frequency of usage of letters in English</p>
<h2>Bar Chart 1</h2>
<div><svg id="chart1" width="800" height="400"></svg></div>
<h2>Bar Chart 2</h2>
<div><svg id="chart2" width="800" height="400"></svg></div>
<script>
// chart 1
var svg1 = d3.select("#chart1"),
margin = {top: 20, right: 20, bottom: 30, left: 40},
width = +svg1.attr("width") - margin.left - margin.right,
height = +svg1.attr("height") - margin.top - margin.bottom;
// See https://github.com/d3/d3-scale
var x1 = d3.scaleBand().rangeRound([0, width]).padding(0.1),
y1 = d3.scaleLinear().rangeRound([height, 0]); // note that we've reversed the range
// creates new svg <g> space, sets new (0,0) at left, top margin
var g1 = svg1.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
d3.tsv("data.tsv", function(d) {
d.frequency = +d.frequency; // convert text to number
return d;
}, function(error, data) {
if (error) throw error;
// See https://www.dashingd3js.com/d3js-scales
// maps domain of x values (letters) to range of positions on x-axis
x1.domain(data.map(function(d) { return d.letter; }));
// maps domain of y values (frequencies 0, max freq) to range of positions on y-axis
y1.domain([0, d3.max(data, function(d) { return d.frequency; })]);
// x-axis
g1.append("g")
.attr("class", "axis x-axis")
.attr("transform", "translate(0," + height + ")") // move axis to bottom of chart
.call(d3.axisBottom(x1));
// y-axis
g1.append("g")
.attr("class", "axis y-axis")
.call(d3.axisLeft(y1).ticks(10, "#")); // number of ticks and type
// y-axis label
g1.append("text")
.attr("class", "label")
.attr("x", 0-margin.left) // set x position of label
.attr("y", 0-margin.top/2) // set y position of label
.style("text-anchor", "start") // left-justify
.text ("Frequency")
// bars
g1.selectAll(".bar")
.data(data)
.enter().append("rect")
.attr("class", "bar")
.attr("x", function(d) { console.log ("letter: " + d.letter + ", x: " + x1(d.letter)); return x1(d.letter); })
.attr("y", function(d) { console.log ("freq: " + d.frequency + ", y: " + y1(d.frequency)); return y1(d.frequency); })
.attr("width", x1.bandwidth()) // width of each band
.attr("height", function(d) { return height - y1(d.frequency); })
.style("fill", "steelblue") // color of the bars
;
});
// chart 2
var svg2 = d3.select("#chart2"),
margin = {top: 20, right: 20, bottom: 30, left: 40},
width = +svg2.attr("width") - margin.left - margin.right,
height = +svg2.attr("height") - margin.top - margin.bottom;
var x2 = d3.scaleBand().rangeRound([0, width]).padding(0.1),
y2 = d3.scaleLinear().rangeRound([height, 0]);
var g2 = svg2.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
d3.tsv("data.tsv", function(d) {
d.frequency = +d.frequency;
return d;
}, function(error, data) {
if (error) throw error;
x2.domain(data.map(function(d) { return d.letter; }));
y2.domain([0, d3.max(data, function(d) { return d.frequency; })]);
// x-axis
g2.append("g")
.attr("class", "axis axis--x")
.attr("transform", "translate(0," + height + ")")
.call(d3.axisBottom(x2));
// y-axis
g2.append("g")
.attr("class", "axis axis--y")
.call(d3.axisLeft(y2).ticks(10, "#"));
// y-axis label
g2.append("text")
.attr("class", "label")
.attr("x", 0-margin.left)
.attr("y", 0-margin.top/2)
.style("text-anchor", "start")
.text ("Frequency")
// bars
g2.selectAll(".bar")
.data(data)
.enter().append("rect")
.attr("class", "bar")
.attr("x", function(d) { return x2(d.letter); })
.attr("y", function(d) { return y2(d.frequency); })
.attr("width", x2.bandwidth())
.attr("height", function(d) { return height - y2(d.frequency); })
.style("fill", "teal") // color of the bars
;
});
</script>
</body>
</html>