Skip to content

Instantly share code, notes, and snippets.

@gardym
Created July 1, 2013 11:49
Show Gist options
  • Save gardym/5900173 to your computer and use it in GitHub Desktop.
Save gardym/5900173 to your computer and use it in GitHub Desktop.

In improve this we take a look at a reader submitted test, user interface, story or block of code and we try and improve it, without context, explaining what we did as we went.

In this issue, Mike sent a link to an event source to a realtime social media visualization.

Before we ever apply brainpower, let's apply computer power. JSLint and JSHint are both tools to find mistakes and oversights.

lib/streamers/twitter_stream_source.js :
  Implied globals:
		require: 1,2,3
		process: 6,7,8,9
		console: 34,38
		exports: 52
	Unused Variables:
		mongo(2), data(33), err(39), err(42),

The implied globals are all OK. The unused variables are not. We see immediately that:

  1. The mongo dependency isn't used.
  2. There is no error handling around inserting records into the database.

The first problem is easily solved. The second problem we'll report and ignore, because it appears throughout the rest of the program:

$ ack 'err' lib/
lib/streamers/instagram_source.js
38:      db.collection('grams', function(err, grams_collection) {
41:            grams_collection.find({id: gram.id}, function(err, cursor) {
42:              cursor.count(function(err, count) {
45:                  db.collection('events', function(err, events_collection) {
47:                      events_collection.insert(evt, function(err, inserted) {
59:          function(err) {

lib/streamers/twitter_stream_source.js
33:    stream.on('error', function(data, err) {
34:      console.log("++ Twitter -- ERROR -- %s", err);
39:        db.collection('tweets', function(err, tweets_collection) {
42:        db.collection('events', function(err, events_collection) {

lib/streamers.js
10:  var start_streaming = function(err, db) {

Let's apply brainpower. Three things stand out:

  1. map_tweet_to_event seems to have an unnecessary callback.
    This should be an easy fix.

  2. tweet.coordinates is both null-checked and uses magic numbers.
    This isn't a problem; but, data structures with optional nulls are easy to trip on in normal use and complicate testing.

  3. start_streaming is a set of deeply nested callbacks.
    This one is four levels deep. Not a serious offence by Javascript standards; but, we can do better.

Sadly, this code came with no tests. We write a characterization test to give confidence that we won't break anything. The bottom of nested callbacks are good places to find expected behaviors:

describe('twitter_stream_source', function() {
  var twitter_stream_source = rewire('../lib/streamers/twitter_stream_source');

  describe('start_streaming', function() {
    var db = {id: 'db'},
        tweets_collection = {id: 'tweets_collection'},
        events_collection = {id: 'events_collection'},
        expected_data = {id: 'expected_data'},
        expected_event = {id: 'expected_event'};

    it('should insert streamed tweets', function() {
      twitter_stream_source.track_current_user(db, 'fake terms');

      expect(tweets_collection.insert).to.be.calledWith(expected_data);
    });

    it('should insert streamed events', function() {
      twitter_stream_source.track_current_user(db, 'fake terms');

      expect(events_collection.insert).to.be.calledWith(expected_event);
    });
  });
});

We flesh out the test guided by the test failures.

Now, we can refactor with (more) confidence:

First, we collapse the map_tweet_to_event callback.

var map_tweet_to_event = function(tweet) {
  return {
    provider: 'Twitter',
    username: tweet.user.screen_name,
    name: tweet.user.name,
    profile_image: tweet.user.profile_image_url,
    text: tweet.text,
    at: tweet.created_at,
    coordinates: tweet.coordinates ? {
      lat: tweet.coordinates.coordinates[1],
      lng: tweet.coordinates.coordinates[0],
    } : null,
    place: tweet.place,
    recorded_at: moment().toDate()
  };
};

// ...

db.collection('events', function(err, events_collection) {
  events_collection.insert(map_tweet_to_event(data));
});

Second, we split up start_streaming up by responsibility. Those responsibilities— right now— are:

  1. Streaming tweets.
  2. Logging.
  3. Filtering tweets.
  4. Saving raw tweets.
  5. Saving events (processed tweets).

1 through 3 involve the Twitter stream. 4 and 5 involve the database.

We create a stream_tweets function:

var stream_tweets = function(terms, callback) {
  endpoint = 'user'
  params = {'with': 'followings', 'track': terms};
  Twitter.stream(endpoint, params, function(stream) {
    stream.on('error', function(data, err) {
      console.log("++ Twitter -- ERROR -- %s", err);
    });
    stream.on('data', function(data) {
      if(data.user && data.text) {
        console.log("-- Twitter (%s) -- @%s: %s", endpoint, data.user.screen_name, data.text);
        callback(data);
      }
    });
  });
};

Notice we inline the params object that was previously initialized in track_current_user because it is only used by the Twitter.stream method.

Then, we create a record_tweet function:

var record_tweet = function(db) {
  return function(data) {
    db.collection('tweets', function(err, tweets_collection) {
      tweets_collection.insert(data);
    });
    db.collection('events', function(err, events_collection) {
      events_collection.insert(map_tweet_to_event(data));
    });
  }
};

This function returns the callback function, but keeps the db in scope.

Finally, we update track_current_user:

exports.track_current_user = function(db, terms) {
  stream_tweets(terms, record_tweet(db));
};

The tests pass! That means it works, right? ;-)

var chai = require('chai'),
rewire = require('rewire'),
sinon = require('sinon'),
sinonChai = require('sinon-chai');
var expect = chai.expect;
chai.use(sinonChai);
describe('twitter_stream_source', function() {
var twitter_stream_source = rewire('../lib/streamers/twitter_stream_source');
describe('start_streaming', function() {
var clock,
db = {id: 'db'},
tweets_collection = {id: 'tweets_collection'},
events_collection = {id: 'events_collection'},
error = 'error string',
expected_data = {id: 'expected_data',
user: 'user', text: 'text',
coordinates: {coordinates: ['lng', 'lat']}},
expected_event = {at: undefined,
coordinates: {lat: 'lat', lng: 'lng'},
name: undefined,
place: undefined,
profile_image: undefined,
provider: 'Twitter',
recorded_at: undefined,
text: 'text',
username: undefined};
beforeEach(function() {
tweets_collection.insert = sinon.spy();
events_collection.insert = sinon.spy();
var stream = {on: sinon.stub()};
stream.on.withArgs('error').yields(null, error);
stream.on.withArgs('data').yields(expected_data);
var Twitter = {id: 'Twitter', stream: sinon.stub().yields(stream)};
twitter_stream_source.__set__({Twitter: Twitter});
db.collection = sinon.stub();
db.collection.withArgs('tweets').yields(null, tweets_collection);
db.collection.withArgs('events').yields(null, events_collection);
clock = sinon.useFakeTimers();
expected_event.recorded_at = new Date();
});
afterEach(function() {
clock.restore();
});
it('should insert streamed tweets', function() {
twitter_stream_source.track_current_user(db, 'fake terms');
expect(tweets_collection.insert).to.be.calledWith(expected_data);
});
it('should insert streamed events', function() {
twitter_stream_source.track_current_user(db, 'fake terms');
expect(events_collection.insert).to.be.calledWith(expected_event);
});
});
});
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment