@tonywok
Created May 16, 2010 23:38
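# Ruby 1.9.2+ no longer has "." on the load path, so load the helper relative
# to this file (assumes lib/ sits next to it).
require_relative 'lib/string_util'
require 'singleton'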
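# Splits a source file into tokens, one whitespace-delimited chunk at a time.
class Tokenizer
  include Singleton

  attr_accessor :source, :chunks

  # Singleton calls `new` without arguments, so Tokenizer.instance always gets
  # the empty default; assign `source`/`chunks` afterwards in that case.
  def initialize(filename = '')
    @source = filename.empty? ? [] : IO.readlines(filename, '')
    # Array#to_s returns the inspect form on Ruby 1.9+, so join explicitly.
    @chunks = @source.join.split(' ')
  end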
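  # Returns the next token, or nil once every chunk has been consumed.
  def get_next_token
    return nil if chunks.empty?

    if valid_token?(chunks.first)
      chunks.shift
    elsif chunks.first =~ /\Aend/
      # Peel a leading `end` keyword off a chunk such as "end;". The match is
      # anchored so that "end" in the middle of a chunk is left alone.
      chunks.first.slice!(0..2)
    else
      get_token_from_chunk
    end
  end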
  # Splits the first chunk into tokens, puts them back at the front of the
  # queue in their original order, and returns the first one.
  def get_token_from_chunk
    chunks.unshift(*split_em(chunks.shift))
    chunks.shift
  end
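  # True when the chunk is a complete token. whitespace?, integer?,
  # identifier? and keyword? are String extensions that are not defined in
  # this file (presumably they come from lib/string_util.rb).
  def valid_token?(chunk)
    chunk.whitespace? || chunk.integer? || chunk.identifier? ||
      symbol?(chunk) || chunk.keyword?
  end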
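  # Greedily splits a chunk into tokens: each character extends the current
  # token while the result is still valid, otherwise it starts a new token.
  def split_em(chunk)
    tokens = chunk.each_char.inject(['']) do |acc, char|
      if valid_token?(acc.last + char)
        acc[-1] = acc.last + char
      else
        acc << char
      end
      acc
    end
    # The seed string stays empty when the first character is not a valid
    # token on its own; drop it so callers never receive an empty token.
    tokens.reject(&:empty?)
  end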
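  # Returns the next chunk without consuming it. A chunk may still contain
  # several tokens that have not been split apart yet.
  def peek
    @chunks.first
  end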
  # Checks whether the token is a CORE symbol. SYMBOLS is not defined in this
  # file; it is expected to be a Hash keyed by the symbol text.
  def symbol?(token)
    SYMBOLS.key?(token)
  end
end
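
The greedy splitting that `split_em` performs can be tried in isolation. The sketch below is a rough, self-contained illustration only: `lib/string_util.rb` and `SYMBOLS` are not part of this gist, so `SKETCH_SYMBOLS`, `SKETCH_KEYWORDS`, `sketch_valid_token?`, `greedy_split` and the token rules themselves are all hypothetical stand-ins, not the author's definitions.

```ruby
# Hypothetical stand-ins for SYMBOLS and the keyword list (assumptions).
SKETCH_SYMBOLS  = { ';' => :semicolon, '=' => :assign, '+' => :plus }.freeze
SKETCH_KEYWORDS = %w[program begin end int].freeze

# Assumed token rules: an integer is a run of digits, an identifier is a
# letter followed by letters or digits. The real predicates may differ.
def sketch_valid_token?(text)
  text.match?(/\A\d+\z/) ||
    text.match?(/\A[A-Za-z][A-Za-z0-9]*\z/) ||
    SKETCH_SYMBOLS.key?(text) ||
    SKETCH_KEYWORDS.include?(text)
end

# Extend the current token while it stays valid; otherwise start a new one.
def greedy_split(chunk)
  chunk.each_char.with_object([]) do |char, tokens|
    if !tokens.empty? && sketch_valid_token?(tokens.last + char)
      tokens[-1] += char
    else
      tokens << char
    end
  end
end

p greedy_split('x=5;')   # => ["x", "=", "5", ";"]
p greedy_split('end;')   # => ["end", ";"]
```

Under these assumed rules every prefix of a keyword is itself a valid identifier, so `end;` splits cleanly. If identifiers were restricted so that a keyword's prefixes were not valid tokens, the greedy pass would break the keyword into single characters, which may be why `get_next_token` handles `end` as a special case.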