terabyte/2013-01-04-using-gist-to-blog.md

## 2013-01-04-using-gist-to-blog.md

      
Using Gist to Blog

Gists seem like a great way to quickly publish easily browsable blog entries using git, which is pretty much the easiest tool for me.  Furthermore, GitHub's Markdown is particularly well suited to blogs that contain lots of code snippets.
Like this:
    require 'foo'
    puts "Hello, #{World}!"
    exit 0
And this:
    import foo
    print "Hello, " + World + "!\n"
    exit 0
And this:
    use Foo;
    print "Hello, $World!\n";
    exit 0;
So...let's see how this works!

  
## 2013-01-07-ssl-and-java.md

      
Java and SSL - From the Client Perspective

If you work seriously with Java web services, eventually you will need to deal with SSL, so let me be the first to say: "I'm sorry".  It isn't my fault, but that doesn't lessen my intense, sincere empathy for you and your misfortune.
SSL in Java is nothing short of disgraceful: an absolute, convoluted, poorly
understood, poorly documented disaster to figure out.  I'm writing this post as
much for myself as for you, kindred reader, so that our lives may be less sad
when either of us must again read this document.
What you need to know about SSL

Well, you need to know a lot about SSL, but I'm going to assume you want the Cliff Notes version.  SSL allows two parties to communicate securely despite having NEVER shared a secret (sorta).  How does this happen?  MAGIC.  Sorry, you wanted a little more than that?
See Diffie-Hellman on Wikipedia.
For SSL details, see SSL on Wikipedia.
Great. For you Cliff Notes folks, all you really need to know is that every SSL server has a certificate.  You can communicate securely with that server using the server's certificate (which you don't need to have in advance, thanks to Diffie-Hellman).  BUT what if someone used a Man In The Middle (MITM) attack to pretend to be that server?  The solution is to trust only certificates signed by authorities that you trust not to sign a certificate for anyone who doesn't own the website the certificate is for.
So all SSL certificates on the internet ought to be signed by a certificate
authority (CA).  There are a limited number of certificate authorities in the
world.  These companies are trusted, for whatever reason: because they won't
sign a certificate for just anyone, and because they charge money to do so.
Browsers such as Chrome and Firefox compile a list of CAs they trust and ship
it with the browser, so if you want to use SSL on the public internet, you pay
a trusted CA to sign your certificates.
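Java keeps a list like this too: the runtime ships with a default trust store (normally the bundled cacerts file).  As a rough sketch, and not something taken from any particular setup, this is one way to peek at which CAs a given Java install trusts out of the box.  The class and method names are made up for illustration.

```java
import java.security.KeyStore;
import java.security.cert.X509Certificate;
import java.util.ArrayList;
import java.util.List;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

// Hypothetical helper class, named only for this example.
class DefaultCaList {
    /** Returns the subject names of the CAs this Java runtime trusts by default. */
    static List<String> defaultTrustedCaNames() throws Exception {
        TrustManagerFactory tmf =
            TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        // A null keystore falls back to the runtime's default trust store
        // (normally the bundled cacerts file).
        tmf.init((KeyStore) null);

        List<String> names = new ArrayList<>();
        for (TrustManager tm : tmf.getTrustManagers()) {
            if (tm instanceof X509TrustManager) {
                for (X509Certificate ca : ((X509TrustManager) tm).getAcceptedIssuers()) {
                    names.add(ca.getSubjectX500Principal().getName());
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Print one trusted CA subject per line.
        for (String name : defaultTrustedCaNames()) {
            System.out.println(name);
        }
    }
}
```

How long the list is depends entirely on the Java distribution you are running.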
So now we come to the SSL service you want to run.  You just have a simple little web service, running inside your firewall, and you want it to talk to your client and vice versa.  Do you need to buy a certificate?
Thankfully, the answer is no.  You can also tell clients to trust a specific
certificate, or you can generate your own certificate authority and tell your
clients to trust that authority.  If you run a web server, asking every person
to trust your CA on every browser of every computer they own is probably not a
good idea, but if you control both the server and the client, such as when you
are running a simple web service in your company, you can do that.
For the remainder of this post, I am going to assume your company has its own internal certificate authority which it uses to issue server certificates, so you want your service to vend a specific certificate, and you want your clients to accept any certificate signed by your own CA (and possibly not any others - after all, why would your web service client be connecting to a server with a publicly signed certificate?).
How to use SSL via openssl CLI

So this bit has nothing to do with Java.  Here is how you could generate a certificate using openssl, for use with, say, Apache.
First you have to generate a keypair - which has a private and a public component.
$ openssl genrsa -des3 -out server.key 4096
Generating RSA private key, 4096 bit long modulus
.....................................................++
............................++
e is 65537 (0x10001)
Enter pass phrase for server.key:
Verifying - Enter pass phrase for server.key:

This creates a private key (which has its public key embedded in it as well) in the file server.key.
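To see the "public key embedded in the private key" bit from the Java side, here is a small sketch using the standard java.security APIs.  It assumes the provider hands back the private key in CRT form (the stock JDK RSA provider does), and it uses 2048 bits rather than 4096 only to keep the demo quick.  `KeyPairDemo` and its method names are invented for this example.

```java
import java.security.KeyFactory;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.interfaces.RSAPrivateCrtKey;
import java.security.spec.RSAPublicKeySpec;
import java.util.Arrays;

// Hypothetical demo class, named only for this example.
class KeyPairDemo {
    /** Generates an RSA key pair; 2048 bits keeps the demo fast. */
    static KeyPair generate() throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    /** Rebuilds the public key from the modulus and public exponent stored in the private key. */
    static PublicKey publicFromPrivate(RSAPrivateCrtKey privateKey) throws Exception {
        RSAPublicKeySpec spec =
            new RSAPublicKeySpec(privateKey.getModulus(), privateKey.getPublicExponent());
        return KeyFactory.getInstance("RSA").generatePublic(spec);
    }

    public static void main(String[] args) throws Exception {
        KeyPair pair = generate();
        // Assumption: the provider returns a CRT-form private key.
        PublicKey derived = publicFromPrivate((RSAPrivateCrtKey) pair.getPrivate());
        // Compare the encoded form of the derived key with the generated public key.
        System.out.println(Arrays.equals(derived.getEncoded(), pair.getPublic().getEncoded()));
    }
}
```

This is why handing someone server.key gives away both halves, while the certificate only ever carries the public half.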
If you are using a CA (public, or your company's), next you create a certificate request - a file that asks the CA to sign your certificate.
$ openssl req -new -key server.key -out server.csr
Enter pass phrase for server.key:
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:CA
Locality Name (eg, city) []:Palo Alto
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Internet Widgits Pty Ltd
Organizational Unit Name (eg, section) []:IT
Common Name (eg, YOUR name) []:Joe ITGuy
Email Address []:dude@example.com

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:

If you just want a certificate for testing, or want it self-signed so you can trust the certificate directly, then you can do that in a single command:
$ openssl req -new -x509 -key server.key -out signedcert.pem -days 1095
Enter pass phrase for server.key:
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:CA
Locality Name (eg, city) []:Palo Alto
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Internet Widgits Pty Ltd
Organizational Unit Name (eg, section) []:IT
Common Name (eg, YOUR name) []:Joe ITGuy
Email Address []:dude@example.com

Now you have a signedcert.pem file.  You can extract just the certificate itself (no private bits) like this:
$ openssl x509 -in signedcert.pem -text | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > publickey.pem

Now publickey.pem holds just the certificate, which contains the public key.  You can verify it like this:
$ openssl verify -CAfile publickey.pem signedcert.pem
signedcert.pem: OK

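Let's say you want to run a test server with the certificate; openssl does that for you.  (If you don't pass -key, s_server looks for the private key in the certificate file, so mycert.pem here needs to contain both.)
# the -www option will send back an HTML-formatted status page
# to any HTTP clients that request a page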
$ openssl s_server -cert mycert.pem -www

Doing it in Java

Before Java 1.6, you couldn't import a certificate's private piece into a keystore
(for use with a Java server); you could only import certificates to trust (the
public piece).  To generate a keystore which contains a full certificate, private key included, you can do this:
$ keytool -genkeypair -keystore mykeystore -alias serverkey -keyalg RSA
Enter keystore password:  
Re-enter new password: 
What is your first and last name?
  [Unknown]:  www.example.com
What is the name of your organizational unit?
  [Unknown]:  IT
What is the name of your organization?
  [Unknown]:  Example Industries
What is the name of your City or Locality?
  [Unknown]:  Palo ALto
What is the name of your State or Province?
  [Unknown]:  California
What is the two-letter country code for this unit?
  [Unknown]:  US
Is CN=www.example.com, OU=IT, O=Example Industries, L=Palo ALto, ST=California, C=US correct?
  [no]:  yes

Enter key password for <serverkey>
        (RETURN if same as keystore password):

$ keytool -certreq -alias serverkey -keystore mykeystore

That generates the Certificate Signing Request (CSR) which you can have your CA sign.  Finally, you can import the CA's certificate as well as your newly signed cert as follows:
$ keytool -import -keystore mykeystore -alias ca -file ca.pem
$ keytool -import -keystore mykeystore -alias serverkey -file serverkey.pem
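
Since this post is about the client perspective, here is a minimal sketch of what a client might do with a keystore like this: trust only what is in it, rather than the JRE's default CA list.  The class name, method names, and whatever path and password you pass in are hypothetical; treat it as a starting point under those assumptions, not a tested recipe.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.security.KeyStore;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical client helper, named only for this example.
class InternalCaClient {
    /** Loads a keystore file, such as one produced by keytool. */
    static KeyStore loadTrustStore(String path, char[] password) throws Exception {
        KeyStore store = KeyStore.getInstance(KeyStore.getDefaultType());
        try (InputStream in = new FileInputStream(path)) {
            store.load(in, password);
        }
        return store;
    }

    /** Builds an SSLContext whose trust decisions come only from the given keystore. */
    static SSLContext contextTrusting(KeyStore trustStore) throws Exception {
        TrustManagerFactory tmf =
            TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);

        SSLContext context = SSLContext.getInstance("TLS");
        // No client key managers and the default SecureRandom; only trust is customized.
        context.init(null, tmf.getTrustManagers(), null);
        return context;
    }

    /** Prepares (but does not yet connect) an HTTPS connection that uses the given context. */
    static HttpsURLConnection open(URL url, SSLContext context) throws IOException {
        HttpsURLConnection connection = (HttpsURLConnection) url.openConnection();
        connection.setSSLSocketFactory(context.getSocketFactory());
        return connection;
    }
}
```

A connection opened this way should only complete a handshake with servers whose certificates chain to an entry in that keystore.  In practice a client would probably carry its own keystore holding just the CA certificate, rather than a copy of the server's keystore.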

TODO: test all of this
TODO: say how to directly import for java 1.6
References

http://www.madboa.com/geek/openssl/#cert-exam

  
## 2013-01-21-how-to-be-a-professional-programmer.md

      
A Day in the Life of a Professional Programmer

People often ask me "what do you do all day?".  More frequently than that, however, people assume things about what I spend most of my time doing that are flat-out wrong.  I think part of this comes from TV, for which I will make a brief tangent.
What Didn't TV Get Wrong?

I've been watching the show Alias with my girlfriend, and while it is an enjoyable little romp, there is a character in the show that just bugs me.  It bugs me the same way the characters of the newer hit show "Big Bang Theory" bug me.  To quote a more eloquent nerd than I, "And here’s my issue, here’s why The Big Bang Theory makes me feel uncomfortable. We aren’t laughing with Leonard, Sheldon, Raj and Howard. We’re laughing at them. Chuck Lorre has given us four exceptionally intelligent, nerdy main characters and he’s positioned us as an audience against them. When I watch Big Bang it becomes more and more obvious that I’m not supposed to relate to the guys (or more recently Amy Farrah-Fowler). I’m expected to relate to Penny. You only need to pay attention to the audience laughter to realise that TBBT relies on positioning us as an outsider to the nerds, as someone like Penny who doesn’t understand their references, their science, their vocabulary even, and who doesn’t care to learn."(1).  The character in Alias is named "Marshall"(2) and he is the "in-house creator and proprietor of a number of gadgets and sophisticated tools" but he is also a commentary about how "normal people" view "the people who create these things" and how fundamentally incompatible people think they are with "normal society".  Marshall stutters, speaks using strained metaphors, is the only age-appropriate main character on the show the main female protagonist has no sexual tension with or interest in, and generally spends most of his screen time spouting gibberish that the writers couldn't even be bothered to make plausible.  Unlike TBBT's four nerd characters, we are clearly on Marshall's side, but it is despite his frequently nerdy anti-social behavior, not because of it.
That's not really the half of it though.  What really annoys me is just how wrong they get technology.  I know plot is king and all, but there just isn't any good reason to butcher technology so badly, even 10 years ago when computers were less common (but still pretty commonplace, especially in the intelligence field they depict).  As MythBusters has shown, you can't actually shoot a gun at a lock to break it open.  Gun enthusiasts may be just as annoyed as I am when they see this.  "Why not just have them use a crowbar?"  Well, technology is the same fail.  Why use a "wireless modem", the magic box they set on the computer to "hack it and download 'the hard drive'..."?  They could just have popped in a USB thumb drive, and probably gotten some product-placement dollars while they were at it.  Season 1 of Alias was 2001-2002, but the thumb drive was invented in 1999 and came to market in 2000 (3).  It'd be so easy to use the terminology right, like network security, encryption, or firewalls, but they just throw the terms around with utter disregard for what they actually mean.  Among all the Alias drinking games, the most dangerous is the "every time Marshall says the word 'firewall'" game.  I don't recommend the "every time encrypted text is shown MOVING on the screen into readable text as the machine decrypts it" drinking game either.
What Does it Mean To Be a Programmer?

Ok, tangent over.  Contrary to the myopic, insulting, and one-dimensional depictions of Marshall, Leonard, and Sheldon, working as a professional programmer (like most stereotypical technology positions) does not require inside jokes, social awkwardness, nerf guns, or the lack of social grace commonly depicted.  It only requires intelligence, confidence, creativity, and dedication.  What I am going to do next is explain what exactly it is I do.  I am going to explain the kind of problems I solve, how I typically approach them, and what that typically involves.  I hope it gives you an idea of what my day actually looks like.
I work at an amazing company.  I arrive whenever I feel like it, work as long or short as I like, then leave.  As long as your work gets done, when you work isn't important, but there are meetings and the like on occasion which are important to attend.  One such meeting is 'the Monday morning meeting', where the entire product group meets to sync up about what is going on that week, what expectations are, and distribute other company news and so on.  This meeting is at 11:45am on Monday, to ensure nobody has to wake up particularly early.  I come in between 8:00 and 10:30, depending on the morning.  Sometimes I work out for an hour before I head to my desk.  I try to read all my email and plan out my day before the Monday morning meeting.  Other mornings I read web comics or wake up in other ways to prepare myself for the upcoming mental stresses of the day.
What are these mental stresses?  What sort of thinking does a programmer do?  They solve problems using indirection, mostly.  The "Fundamental Theorem of Software Engineering", a remark by Butler Lampson, is "We can solve any problem by introducing an extra level of indirection (or abstraction)"(5). What does that mean?  Imagine you are reading a book and you want to be able to quickly locate any references to a particular subject, even though the text is not sorted by subject.  You could build a table of every subject mentioned, along with page numbers, so you can easily find all references.  This is called an index.  Now what about finding all the references across all the books?  You could "add another level of indirection" by building an index of all of the indexes of every book.  So first you look up the subject in this "index of indexes", find all books whose indexes mention the given subject, then look in each of those indexes to find all the places you must look.  This is an example of incrementally solving a problem by adding additional layers of abstraction, or indirection.  Problem solving in computer science is often like that, to various degrees.  Knowing where to add these layers of indirection is very challenging and requires creativity, forethought, experience, and deep understanding.
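For the curious, the "index of indexes" idea can be sketched in a few lines of Java.  Everything here is invented for illustration (the class, the two sample books, the subjects and page numbers); the only point is that a lookup walks two levels, subject to books and then book to pages.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical example class; the sample data below is made up.
class IndexOfIndexes {
    /** First level: for each book title, that book's own index (subject -> pages). */
    static Map<String, Map<String, List<Integer>>> sampleLibrary() {
        Map<String, List<Integer>> cookbook = new TreeMap<>();
        cookbook.put("salt", Arrays.asList(3, 41));
        cookbook.put("yeast", Arrays.asList(12));

        Map<String, List<Integer>> chemistry = new TreeMap<>();
        chemistry.put("salt", Arrays.asList(7));

        Map<String, Map<String, List<Integer>>> library = new TreeMap<>();
        library.put("Cookbook", cookbook);
        library.put("Chemistry", chemistry);
        return library;
    }

    /** Second level: subject -> titles of the books whose index mentions it. */
    static Map<String, Set<String>> buildIndexOfIndexes(
            Map<String, Map<String, List<Integer>>> library) {
        Map<String, Set<String>> subjectToBooks = new TreeMap<>();
        for (Map.Entry<String, Map<String, List<Integer>>> book : library.entrySet()) {
            for (String subject : book.getValue().keySet()) {
                subjectToBooks.computeIfAbsent(subject, s -> new TreeSet<>()).add(book.getKey());
            }
        }
        return subjectToBooks;
    }

    /** Follows both levels: subject -> books, then each book's index -> pages. */
    static Map<String, List<Integer>> lookup(
            String subject,
            Map<String, Set<String>> indexOfIndexes,
            Map<String, Map<String, List<Integer>>> library) {
        Map<String, List<Integer>> hits = new TreeMap<>();
        for (String title : indexOfIndexes.getOrDefault(subject, Collections.<String>emptySet())) {
            hits.put(title, library.get(title).get(subject));
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, Map<String, List<Integer>>> library = sampleLibrary();
        System.out.println(lookup("salt", buildIndexOfIndexes(library), library));
    }
}
```

Without the second-level map, finding "salt" would mean opening every book's index in turn; with it, you only open the books that actually mention the subject.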
After my morning of meetings, reflection, and preparation, it's usually about noon, so I grab some lunch.  If I don't have someone to eat lunch with, I might just eat at my desk.  After lunch, it's time to really get to work.  I have two main modes, "feature development" and "bug fixing".
Bug Fixing

We'll discuss "bug fixing" mode first.  I work on the Internal Tools team, so all my customers are actually other developers at my company.  As a support-focused team, it is a little different from your average professional programmer, but not too much so.  We have a ticket queue where people (us and our customers) submit tickets along the lines of "X doesn't work", "Y is too slow", "Z would be a really useful feature".  We, with the help of our lead, prioritize this list based upon how much time each thing will take, and how big of an impact it has on us and our customers.  Then, we try to "fix" each thing from the top of the list to the bottom.  So I pick the first thing on the list, without much thought, and mark it as "in progress" so none of my teammates try to work on it.
Next, the real engineering starts.  First I have to understand the problem.  The submitter may or may not have been specific, and may or may not have included the necessary details.  Using my existing understanding of the systems (which I have built up since joining the team) along with documentation (we have an internal wiki where we describe how things work), I try to understand the problem.  If there is not enough information, I ask the requester for more information, and/or explore on my own.  I might connect to the machine where the service is running and examine logs to see what is happening, and look for errors or things out of place.  I might check how loaded the box is, and how much free memory there is, if the complaint is about performance.  I might find the source code for the application, and read it to see how the program works, and why it might be malfunctioning.  I might connect to the database, and see if the data in it looks correct (example: You call Amazon and tell them 'I signed up for prime, but it won't let me select free shipping' - the first thing they might do is check the database and see if your account says it has prime).
Once I have collected all the understanding necessary to solve the problem, I...solve the problem.  It's not always straightforward.  If it is a software defect, I first reproduce the problem locally (on software running on my desktop, which I have built from the raw source code, so that I can modify that code and see the change immediately, to see that it fixes the problem).  Sometimes this involves some unguided exploration, sometimes I just dig in and start looking at what is going on.  Using a debugger, I can examine the state of the program and watch the failure happen to try to figure out why and how to fix it (this is another reason to run the code locally).  Once I have a solution, I show it to my coworkers.  If it is code, I submit a code review.  If it is a change in process, I discuss it with them in an email thread.  If it impacts customers, I include them in the discussion.  If people have a better idea for how to fix it, I may revise my solution.  It is almost always a discussion, one where the best idea wins.  Involving politics, seniority, or anything else in the process is universally considered the sign of a poisonous company culture and something professional programmers despise.  Furthermore, if someone comes up with a better way, I am relieved; I do not take it personally.  The better the solution I use, the less likely I will have to deal with it breaking again.  Contrary to most depictions of programmers as egotistical, prideful, and narcissistic, most good software shops filter these assholes out.  Team fit, and willingness to let the best idea win, is a core competency for good software engineers.
Once the change is approved, we aren't done yet!  Next the change has to be deployed to production.  Frequently this is handed off to another person, but it depends on the levels of specialization, size of team, etc.  Sometimes it must be done during a particular time (like when nobody is using the system at 2am), other times you just shove it out there and make sure it works.  For really important fixes, even in a system where you might otherwise wait, it is sometimes worth doing an emergency fix.  You might email your customers to warn them the system will be down for a "Brief outage".  Sometimes deployment requires more careful planning.  You might have to have a plan for what to do if things go wrong - what if your fix mistakenly causes a bigger problem than it fixes?  You might have to "roll back" to the previous version.  Is that easy?  Did you make any changes to the database so that the old version will no longer work?  Deployment is a very hard problem and each system and change is a unique snowflake.
Finally, after the change is deployed, hopefully you monitor and test and make sure the thing you meant to change, and all the things you didn't mean to change, are working as expected now.  Your customers are the final arbiters of whether or not the fix is acceptable.  When they are happy, you close the ticket as "fixed" and move on to the next one.  Because deployment is sometimes delayed, frequently you will have several changes in the pipe at a time, working on new fixes while older ones are waiting to be deployed.  If they are changes to the same system, this can make things complicated.  If you deploy two changes at once and things break - which change is the problem?  This boils down to release engineering, a whole career and field of study itself, one that devs frequently don't think about, but one that I am particularly aware of as it is an important part of my job on the Internal Tools team (we make tools that help devs deploy our software safely).
Feature Development

Feature development is a little bit different, but has many similar aspects.  Like with bug fixing, you start out in research mode.  Unlike bug fixing, the goal of feature development research is not only to understand the problem, but to attempt to understand how long it will take to implement the feature and exactly what scope the feature will have.  Say your feature is to write a tool that helps developers test code by preparing a database with test data.  You would have to carefully decide exactly what is and is not "in scope" for the feature.  Writing code which prepares the database, for example, is in scope, but maybe writing code which automatically runs the tests after setting up the database is not in scope.  Maybe that feature will be implemented later, but if you keep adding to the features you eventually get "feature creep" and you'll never be "done" enough to deploy something that people can start benefiting from.  It is always your goal to break up your features into small, self-contained chunks that you can deliver incrementally.  You might get feedback that helps you decide whether or not the next piece is the right thing to do to deliver a big impact to your customers.
So, first you write what we call an RFC (Request For Comments) document.  You try to plan out what is in scope, how long it will take, and explicitly decide how it will impact your customers.  List out customer use cases like "As a developer, I want to be able to prepare a database with test data with a single click".  You might also include some guidelines on how you plan to implement it, a test plan describing how you will test it, and list a few customers who have agreed to be the "first guinea pigs" to help try it out and give you early feedback.
As before, once your planning is done you dig in and start writing code, but after each chunk is done, you seek a code review.  This could be an informal "over the shoulder" code review, or a more formal process.  Sometimes code goes through several rounds of review.  Again, the goal is that the best ideas win.  When people disagree which method is best, the final arbiter is usually the owner of the code, or the person writing it, unless there is some way to test both ways and compare them objectively.
All the same things above about testing and deployment apply here too.  Frequently, during feature dev, it is important to write tests which exercise the code and ensure it does what you expect at the same time as, or even before, you write the code.  Not only does this give you confidence the code works as you expect, but if you or others need to change the code later, knowing that the tests still pass gives you higher confidence nothing unexpected has changed.  Also, if there are comprehensive tests written at the time code is developed, when you find bugs, it is easy to just add a few tests to exercise the newly found problem, to ensure future changes don't cause it again.
Wankery

One of the most difficult parts of my job, and simultaneously one of the most difficult to describe, is what I like to call "wankery".  As an example, just today, a coworker came by to discuss the incoming new-hire rush.  Many of our systems impact the developer work cycle directly.  If we have a lot more developers, then our systems will run slower, making those developers less efficient.  That would be bad.  So we need to prepare our systems to scale to higher demand, which usually means ordering more hardware.  But how much hardware do we need?  Do we even have enough now?  If we have 50 developers and 2 servers, will 75 developers require 3 servers total?  Or 4?  Or more?  Are the 2 servers we have now even 'enough'?  Would adding more servers make it faster, or are the systems as fast as they can be?  How do we begin to measure and answer these questions?
It isn't always clear what problems need to be solved, and customers will frequently say "I need a Ferrari" instead of "I need to get from point A to point B in this much time".  Frequently customers don't even know they need to get to point B until we tell them such a thing is possible.  Frequently the best ideas come from impromptu brainstorming discussions.
Walkups

In addition to chatting with coworkers and brainstorming solutions, frequently in a support-focused role such as mine, customers will come to me with off-the-cuff problems or ideas, or seeking help with something in particular.  The ability to change context quickly, and juggle several projects, is of great value.  You have to be flexible, nimble, and prepared for interruptions.  At the same time, when it is needed, you have to be able to focus.
Summary

After a long day of responding to quickly changing requirements, thinking on your feet, coming up with creative solutions, and diving deep to understand complicated, intricate systems, it's time to grab dinner.  Frequently, spending social and non-work time with your customers and coworkers also gives you a great opportunity to brainstorm new solutions, or overhear people talking about problems that it hasn't even occurred to them to bring to your attention yet.  This is one reason why so many companies find it well worth the costs of catering meals, sponsoring company events, and having things like video game rooms, gyms, poker tables, and board games for relaxing at work.
Hopefully this writeup gives you an idea of just what it is I do every day, why I love it, and why I don't have to be a social outcast to do it.  People who "sling bits for a living", people who "make a career of telling computers what to do", are not so different from anyone else.  It is often difficult to be honest in your self-appraisals, but if I had to guess, the only difference between me now and 15 years ago, when I would rush home from school to play on the computer, is that I am better equipped with the tools and experience to solve these problems.  I am confident enough that I don't ask if I can figure out solutions, but rather ask how long it will take, and whether the solutions I come up with are the best they could be.  That's the confidence part.  As in many creative fields, you never stop learning.  An artist is always looking for new techniques, new brush strokes, new types of paint, different ways to achieve the exact result they are looking for.  Professional programming is an art (4), and the talented artists who realize software designs from imagination to reality leave an unmistakable impact on our world, often extending far beyond their direct customers.  There are many ways to change the world, but none so easy as via software.
Hopefully, this gives you a better idea of what a programmer does with their day, just what that looks like, and clears up some of the mystery.
References


1. http://butmyopinionisright.tumblr.com/post/31079561065/the-problem-with-the-big-bang-theory
2. http://en.wikipedia.org/wiki/Marshall_Flinkman
3. http://en.wikipedia.org/wiki/USB_flash_drive#History
4. http://en.wikiquote.org/wiki/Donald_Knuth - "Computer Programming is an art, because it applies accumulated knowledge to the world, because it requires skill and ingenuity, and especially because it produces objects of beauty", Donald Knuth, 1974.
5. http://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering