To begin our explorations, let's look at an example from the Internet Relay Chat (IRC) protocol. The following string represents the command that you'd send to an IRC server to post a message to a particular channel:
"PRIVMSG #practicing-ruby-testing :Seasons greetings to you all!\r\n"
Even if you've never used IRC before or looked into its implementation details, you can extract a great deal of meaning from this single line of text. The structure is very simple, so it's fairly obvious that PRIVMSG represents a command, #practicing-ruby-testing represents the channel, and that the message to be delivered is "Seasons greetings to you all!". If I asked you to parse this string to produce the following array, you probably would have no trouble doing so without any further instruction:
["PRIVMSG", "#practicing-ruby-testing", "Seasons greetings to you all!"]
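As a rough illustration, here is the kind of parser you might write from that one example alone. It is only a sketch: the method name parse_message is made up for this article, and it assumes every message has exactly the shape shown above (a command, a target, then a body introduced by a colon).

```ruby
# A deliberately naive parser (hypothetical helper, not part of any library).
# Assumes the line always looks like "COMMAND target :body\r\n".
def parse_message(line)
  # Remove the trailing "\r\n", then split on spaces into at most three fields
  command, target, body = line.chomp("\r\n").split(" ", 3)

  # Drop the ":" that introduces the message body
  [command, target, body.delete_prefix(":")]
end

p parse_message("PRIVMSG #practicing-ruby-testing :Seasons greetings to you all!\r\n")
```

This works for the sample string, but every choice in it (splitting on spaces, stripping a single leading colon, expecting exactly three fields) is a guess based on one data point.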
But if this were a real project and not just an academic exercise, you might start to wonder more about the nuances of the protocol. Here are a few questions that might come up after a few minutes of careful thought:
- What is the significance of the : character? Does it always signify the start of the message contents, or does it mean something else?
- Why does the message end in \r\n? Can messages contain newlines, and if so, should they be represented as \n or \r\n, or something else entirely?
- Will messages always take the form "PRIVMSG #channelname :Message Body\r\n", or are there cases where additional information will be provided?
- Can channel names include spaces? How about : characters?
Try as we might, no amount of analyzing this single example will answer these questions for us. That leads us to a very important point: Understanding the meaning of a message doesn't necessarily mean that we know how to process the information contained within it.
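To make this concrete, here is a small sketch of two parsers that encode different guesses about the format. Both method names and the second input string are hypothetical, invented purely for illustration; whether that second input is even legal is exactly the kind of thing the single example cannot tell us.

```ruby
SAMPLE = "PRIVMSG #practicing-ruby-testing :Seasons greetings to you all!\r\n"

# Guess A: the first " :" marks the start of the body,
# and everything before it is a list of space-separated fields.
def parse_a(line)
  head, body = line.chomp("\r\n").split(" :", 2)
  head.split(" ") + [body]
end

# Guess B: there are always exactly three space-separated fields,
# and the third may begin with a ":" that should be dropped.
def parse_b(line)
  command, target, body = line.chomp("\r\n").split(" ", 3)
  [command, target, body.delete_prefix(":")]
end

# Both guesses agree on the one example we have
p parse_a(SAMPLE) == parse_b(SAMPLE)

# A made-up input with a space in the channel position
tricky = "PRIVMSG #my channel :hello\r\n"

p parse_a(tricky) # four fields: the channel is split in two
p parse_b(tricky) # three fields: the body absorbs "channel :hello"
```

Both parsers are consistent with the sample message, yet they disagree as soon as the input strays from it. Nothing in the example itself can tell us which behavior, if either, is correct.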