timfitzzz/tlccheckin.md

## tlccheckin.md

      
            
          
A pretty good week

Or: ...no, yeah it was good, I awno

Last week at TLC we spent a lot of time talking about my MongoJS implementation for twinput, which is of course the Twitter-centric OWS archive tool that I've been working on. This was very helpful! At the end, though, Guillaume reminded me to consult the user stories that we had previously developed -- which I hadn't really even looked at in a week or so -- and pointed out, to my surprised surprise, that integrating Mongo really isn't even a prerequisite for the main story I need to be working on. Lesson: I need to always check the user stories! I had been misremembering them in a way that helped me to justify chasing rabbits down wrong holes.

However, perhaps because I am nothing if not great at doggedly pursuing the wrong rabbits down the wrong holes, I decided I would fix my Mongo implementation before turning back to the main user story for the week. This turned out to take quite a bit longer than I thought it would, but it led to some interesting discoveries thanks to some pairing wisdom from Austin (thanks buddy!), including:

- The Node debugger, which was quite brilliant at helping me figure out exactly where one of my functions was going wrong. To use it, you insert the statement "debugger;" at the points in your app where you want it to pause for inspection, and then launch your app with "node debug <mycode.js>" (newer versions of Node use "node inspect <mycode.js>" instead). The debugger stops on the first line at startup; enter 'cont' and it will run until it hits the point you set. There you can do inspectiony things like enter 'repl' to get into the REPL and then look at the contents of any variable that is in scope wherever you put your 'debugger' point. Very cool.
- Austin was all "Tests tests tests tests tests" and I was like "I don't know how to start doing that [test-driven development] at this point but maybe next time". This is a thing I would like to understand a little better -- I sort of get the concept with tests, but where is the right place to situate them, both in the code and in the workflow?
- Separation of concerns -- namely, internal document/object handling versus IO operations. My initial Mongo implementation wound up teaching Tweet documents how to save themselves, which works OK but does put duplicate code into each document when it could probably live somewhere else. However, I'm not really sure where those functions ought to go.
- De-duplication was an issue. I decided to handle it during the importation stage, but in the future, when these functions are used to add tweets to the existing database rather than just loading them up the first time, they will need to be taught how to consider the Mongo database during de-duping.
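
One possible answer to the last two points, sketched under assumptions: keep Tweet objects as plain data and move saving and de-duping into a separate store object. The names `TweetStore`, `save`, `importAll`, and `count` are hypothetical, and the in-memory object stands in for a Mongo collection; this is not twinput's actual code.

```javascript
// Hypothetical sketch: Tweet objects stay plain data, and a separate store
// owns persistence and de-duplication. TweetStore and its methods are
// illustrative names, not part of twinput or MongoJS.

function Tweet(raw) {
  // Tweet ids can exceed what JS numbers hold exactly, so prefer id_str.
  this.id = String(raw.id_str || raw.id);
  this.text = raw.text;
}

function TweetStore() {
  this.byId = {}; // in-memory stand-in for a collection keyed on tweet id
}

// Save one tweet unless a tweet with the same id is already stored.
// Returns true if it was added, false if it was a duplicate.
TweetStore.prototype.save = function (tweet) {
  if (Object.prototype.hasOwnProperty.call(this.byId, tweet.id)) {
    return false;
  }
  this.byId[tweet.id] = tweet;
  return true;
};

// Import a batch and report how many were new versus skipped.
TweetStore.prototype.importAll = function (rawTweets) {
  var added = 0;
  var skipped = 0;
  rawTweets.forEach(function (raw) {
    if (this.save(new Tweet(raw))) {
      added += 1;
    } else {
      skipped += 1;
    }
  }, this);
  return { added: added, skipped: skipped };
};

TweetStore.prototype.count = function () {
  return Object.keys(this.byId).length;
};
```

Because the store checks what it already holds, the same `importAll` call works for the first load and for later additions. With a real database, a unique index on the tweet id could play the role of the `byId` check.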

So after I managed to get this all to work, I commented out the lines that import into Mongo (leaving the importation process to simply populate a global variable) and set all that work aside to pursue the following user story: "can delete/hide tweets between beginning and endpoints". I didn't quite get there (and yes, maybe I would have if I had set the Mongo stuff aside earlier), but here's what I did get done:

- Built a one-page front-end for paging through tweets using Bootstrap and this WYSIWYG tool called Pinegrow Web Designer that was on sale a few months ago and is pretty cool. I was able to take the HTML it spit out and convert it to Jade using this tool, and then break it up into templates.
- Implemented loadtweets() in an Express-configured app.js, then implemented routes for getting Tweets from the tweet_collection global variable. To do this, I ended up adding a tweetManager.js module under a "controllers" folder as well. This seems messy, structurally.
- Wrote some basic jQuery for making the web UI talk to the server, and got that to work.
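
For the routes-plus-controller arrangement, here is a minimal sketch of one way the pieces could divide up: the controller knows about the collection, and the route handler only parses the request and delegates. The function names `getPage` and `listTweets` and the `page`/`perPage` query parameters are assumptions for illustration; `req.query` and `res.json` follow Express conventions, but nothing here requires Express to run.

```javascript
// Hypothetical sketch of keeping route handlers thin. getPage and listTweets
// are illustrative names; the (req, res) shape mirrors an Express handler.

var tweet_collection = []; // stand-in for the global that loadtweets() fills

var tweetManager = {
  // Return one page of tweets plus enough metadata for the client to page.
  getPage: function (page, perPage) {
    var start = (page - 1) * perPage;
    return {
      page: page,
      perPage: perPage,
      total: tweet_collection.length,
      tweets: tweet_collection.slice(start, start + perPage)
    };
  }
};

// Route handler: sanitize the query string values, then delegate.
function listTweets(req, res) {
  var page = Math.max(1, parseInt(req.query.page, 10) || 1);
  var perPage = Math.min(100, Math.max(1, parseInt(req.query.perPage, 10) || 20));
  res.json(tweetManager.getPage(page, perPage));
}

// In an Express app this could be registered as:
//   app.get('/tweets', listTweets);
```

Since the handler only touches `req.query` and `res.json`, it can be exercised with plain stub objects, which is one low-effort place to start writing tests.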

So, I didn't get to the point where we can delete or hide tweets from the UI -- and I didn't get to the point where we're browsing pre-set beginning and endpoints at all, either -- it just shows the entire library. But those things feel a hop, skip, jump away from where I'm at now, so that's super duper exciting. A couple of things that came up during this:

- Nearly all the tweets have broken userpic URLs right now -- Twitter seems to have changed their mapping. I'm also realizing that this archive isn't going to show the userpics that originally appeared by each tweet -- it'll be showing their current userpic. This is probably fine, but it's not as historically accurate as I'd like -- that said, I don't think there's likely anything to be done about this, unless these original pics are archived somewhere else that we can eventually get access to (like Topsy or Datasift or another Twitter archiver that could take a tweet_ID and get an original pic that was associated with it). But at any rate, I will need to teach the app how to grab the correct URLs for userpics, and/or cache the pics themselves locally somehow.
- Structure of the code -- more on that in my questions.
- When there's only one set of data to view -- right now, all tweets in the collection -- paging is easy, because the client can keep track of what it's looking at and send a really simple request for the next page to the server. But what about when there are more users and they're looking at maybe non-cached data like the contents of a search? Should this be handled by making the server somehow session-aware, or what? No idea how to think about this just yet.
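
On the paging question, one option that avoids sessions entirely is to have every request carry everything the server needs: the search term plus a cursor naming the last tweet already shown. The sketch below assumes tweets are objects with `id` and `text` fields held in a stable order; `searchPage` and the option names `q`, `after`, and `limit` are made up for illustration.

```javascript
// Hypothetical sketch of stateless paging: the server keeps no per-user
// state. Each request repeats the search term and a cursor (the id of the
// last tweet the client already has).

function searchPage(tweets, opts) {
  var q = (opts.q || '').toLowerCase();
  var limit = opts.limit || 20;
  var afterIndex = -1; // an unknown or missing cursor starts from the top

  // Filter first, so the cursor is interpreted within this search's results.
  var matches = tweets.filter(function (t) {
    return t.text.toLowerCase().indexOf(q) !== -1;
  });

  if (opts.after) {
    for (var i = 0; i < matches.length; i++) {
      if (matches[i].id === opts.after) {
        afterIndex = i;
        break;
      }
    }
  }

  var start = afterIndex + 1;
  var items = matches.slice(start, start + limit);
  var hasMore = start + limit < matches.length;
  return {
    items: items,
    // Cursor to send with the next request, or null when nothing is left.
    next: hasMore ? items[items.length - 1].id : null
  };
}
```

The client holds onto `next` and sends it back as `after`, so two users running different searches never share any server-side state. The trade-off is that the server re-runs the filter on each request, which a database query with an index would make cheaper.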

Also, the code itself is pretty messy right now -- I ran out of time and didn't get to do much cleanup. So you can see, warts and all, how well I comment as I go (not very well).

Questions for this week:

1. What's the best way to structure all these helper modules? Right now for tweets, I have three: loadtweets (for loading tweets from files into memory), TweetConstructor (for constructing documents of class Tweet), and tweetManager (for handling I/O of tweets from collection), as well as routes/tweets.js, which provides the Express routes for pulling tweets or metadata about tweets from the server. I'm not completely sure about this, but I think in Mongoose the first three all basically get combined in the Mongoose schema, which leads to question 2:
2. Somehow Mongoose lets you define both 'methods' and 'statics' on a schema, with methods being added to each new document / class instance, and statics living on the constructor itself. How does this happen? As I understand it, properties of the constructor appear in each new instance of the documents it constructs. How does Mongoose avoid this?
3. What's the best way to "number" DOM elements, like tweets, so that client-side jQuery and the server can have consensus about which object is being manipulated via the UI? (Austin had some thoughts, but maybe it's worth talking about).
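
A note toward question 2, in plain JavaScript terms: properties set directly on a constructor function are not copied to the objects it constructs. Instances inherit only from the constructor's `prototype` object. So a schema-style helper can put "methods" on the prototype and "statics" on the constructor, and the two never mix. The sketch below is a toy to show that split; `buildModel` is a made-up helper and this is not a claim about how Mongoose is implemented internally.

```javascript
// Toy illustration only: buildModel is a hypothetical helper, not Mongoose.

function buildModel(definition) {
  function Model(fields) {
    Object.keys(fields).forEach(function (key) {
      this[key] = fields[key];
    }, this);
  }

  // "methods" go on the prototype, so every instance inherits them.
  Object.keys(definition.methods || {}).forEach(function (name) {
    Model.prototype[name] = definition.methods[name];
  });

  // "statics" become properties of the constructor function itself.
  // Instances do not see them, because property lookup on an instance
  // follows Model.prototype, not Model.
  Object.keys(definition.statics || {}).forEach(function (name) {
    Model[name] = definition.statics[name];
  });

  return Model;
}

var Tweet = buildModel({
  methods: {
    summary: function () {
      return '@' + this.user + ': ' + this.text;
    }
  },
  statics: {
    // Called as Tweet.fromRaw(...), so `this` is the constructor.
    fromRaw: function (raw) {
      return new this({ user: raw.user.screen_name, text: raw.text });
    }
  }
});

var t = Tweet.fromRaw({ user: { screen_name: 'example' }, text: 'hello' });
console.log(t.summary());          // "@example: hello"
console.log(typeof t.fromRaw);     // "undefined": statics are not on instances
console.log(typeof Tweet.summary); // "undefined": methods are not on the constructor
```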

I think that's enough for now!