@ryanbriones
Forked from tonywok/tokenizer.rb
Created May 17, 2010 01:35
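# string_util.rb is expected to add the String predicates used below
# (whitespace?, integer?, identifier?, keyword?).
require 'lib/string_util.rb'
require 'singleton'
require 'rubygems'
require 'active_support/core_ext/class/attribute_accessors'
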
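# Splits a source file into tokens. State is shared through class-level accessors.
class Tokenizer
  include Singleton
  cattr_accessor :source, :chunks

  # Reads the file and breaks it into whitespace-separated chunks.
  def self.filename=(filename)
    self.source = IO.readlines(filename, '')
    # join, not to_s: on Ruby 1.9+ Array#to_s returns the inspect form.
    self.chunks = self.source.join.split(' ')
  end
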
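  # Returns the next token, splitting the leading chunk if it holds several.
  def get_next_token
    if valid_token?(self.chunks.first)
      self.chunks.shift
    elsif self.chunks.first =~ /\Aend/
      # Chunk starts with "end" glued to more text: peel off the keyword.
      # Anchored with \A because slice!(0..2) always removes the first three characters.
      self.chunks.first.slice!(0..2)
    else
      get_token_from_chunk
    end
  end
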
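  # Splits the leading chunk into tokens, pushes them back in their
  # original order, and returns the first one.
  def get_token_from_chunk
    split_em(self.chunks.shift).reverse.each do |token|
      self.chunks.unshift(token)
    end
    self.chunks.shift
  end
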
  # True when the chunk is, on its own, one complete token.
  def valid_token?(chunk)
    chunk.whitespace? || chunk.integer? || chunk.identifier? ||
      self.symbol?(chunk) || chunk.keyword?
  end

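  # Greedily grows the current token one character at a time and starts a
  # new token as soon as appending the next character would make it invalid.
  def split_em(chunk)
    count = 0
    chunk.split(//).inject([""]) do |arr, e|
      if self.valid_token?(arr[count] + e)
        arr[count] = arr[count] + e
        arr
      else
        count += 1
        arr << e
      end
    end
  end
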
  # Returns the next chunk without consuming it (it may still hold several tokens).
  def peek
    self.chunks.first
  end

  # Checks whether the token is a CORE symbol (SYMBOLS is defined outside this file).
  def symbol?(token)
    SYMBOLS.has_key?(token)
  end
end
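
The greedy splitting done by split_em can be tried on its own with a minimal sketch like the one below. It does not use lib/string_util.rb or the real SYMBOLS table, since neither is shown here. The TokenSketch module, its KEYWORDS and SYMBOLS lists, and the identifier rule (a letter followed by letters or digits) are all assumptions made for illustration, not the actual CORE grammar.

```ruby
# Hypothetical stand-in for the Tokenizer's helpers; every name and token
# rule in this module is an assumption, not taken from lib/string_util.rb.
module TokenSketch
  KEYWORDS = %w[program begin end int if then else while loop read write].freeze
  SYMBOLS  = %w[; , = ( ) + - *].freeze

  module_function

  # Mirrors Tokenizer#valid_token?: true when text is one complete token.
  def valid_token?(text)
    text.match?(/\A\s+\z/) ||               # whitespace
      text.match?(/\A\d+\z/) ||             # integer
      text.match?(/\A[A-Za-z][A-Za-z0-9]*\z/) || # identifier (assumed rule)
      SYMBOLS.include?(text) ||
      KEYWORDS.include?(text)
  end

  # Mirrors Tokenizer#split_em: extend the current token while the result
  # is still valid, otherwise start a new token with the character.
  def greedy_split(chunk)
    index = 0
    chunk.chars.inject(['']) do |tokens, char|
      if valid_token?(tokens[index] + char)
        tokens[index] += char
      else
        index += 1
        tokens << char
      end
      tokens
    end
  end
end

p TokenSketch.greedy_split('x=42;')  # => ["x", "=", "42", ";"]
p TokenSketch.greedy_split('end;')   # => ["end", ";"]
```

Under these assumed rules every prefix of a keyword is also a valid identifier, so "end;" splits cleanly. With a stricter identifier rule that would not hold, which is presumably why get_next_token special-cases chunks that start with "end".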