@lokeshh
Last active March 30, 2016 11:44
class Formula
  The Formula class will be used by Statsample to parse a regression formula.
  It consists of two data members.
  attr_accessor :left_terms # This will store all the left terms. For example: ['y'] in case of 'y ~ a*b'
  attr_accessor :right_terms # This will store all the right terms. For example: ['a', 'b', ['a', 'b']]
  # in case of 'y ~ a*b'
  
  It will expose a 'from_formula' function to Statsample for parsing formulas.
  For example, to parse the formula 'y ~ a*b', Statsample will call the following function
  f = Formula.new.from_formula('y ~ a*b')
  and the result will be a formula object with two data members
  f.left_terms = ['y']
  f.right_terms = ['a', 'b', ['a', 'b']]
  
  With these two terms, Statsample can code the required categorical variables
  and perform regression.
  
  
def from_formula exp
  This will parse the formula and store left and right terms
  
  To parse the formula, it will first convert it to a reduced
  language, which I have defined as containing only the operators
  '+' and ':'.
  Then it will call the from_reduced_formula function to parse the terms
  and update left_terms and right_terms.
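
  The reduction step described above could be sketched roughly as follows.
  This is only an illustration, not the planned Statsample implementation:
  the helper name reduce_formula is hypothetical, and the sketch assumes the
  formula contains a '~', has no parentheses, and uses no operators other
  than '+', '*' and ':'.

```ruby
# Hypothetical helper (not part of Statsample): rewrites a formula so that
# it only contains '+' and ':' on the right-hand side.
# Assumes: exactly one '~', no parentheses, operators limited to + * :
def reduce_formula(exp)
  lhs, rhs = exp.split('~', 2).map(&:strip)

  terms = rhs.split('+').map(&:strip).flat_map do |term|
    factors = term.split('*').map(&:strip)
    # A product such as a*b expands to every non-empty combination
    # of its factors: a, b, a:b
    (1..factors.size).flat_map do |k|
      factors.combination(k).map { |combo| combo.join(':') }
    end
  end

  # uniq drops terms that appear more than once, e.g. in 'a + a*b'
  "#{lhs} ~ #{terms.uniq.join(' + ')}"
end

puts reduce_formula('y ~ a*b')
```

  With this sketch, a product of three variables such as 'a*b*c' would expand
  to all main effects plus every two-way and three-way interaction.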

def from_reduced_formula reduced_exp
  This accepts an expression in the formula language consisting only of '+' and ':',
  and its function is to take such an expression and fill the left and right terms.
  
  The working of this should be simple. It will be passed a reduced-language
  expression, say 'y ~ a + b + a:b'.

  It will first separate the left side from the right side at '~', then
  split the right side by '+' to form the array ['a', 'b', 'a:b'].
  Then it will split each element by ':' to parse interaction terms.
  The final result will be that the following data members are set:
  - left_terms = ['y']
  - right_terms = ['a', 'b', ['a', 'b']]
end
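
A minimal sketch of from_reduced_formula, under the assumptions stated in the notes above: the input contains a single '~' and only the operators '+' and ':'. Returning self is an assumption made here so that the result can be assigned as in the 'f = Formula.new.from_formula(...)' example. Single terms are kept as strings and interaction terms become arrays, matching the expected right_terms shown above.

```ruby
class Formula
  attr_accessor :left_terms, :right_terms

  # Parses a reduced-language expression such as 'y ~ a + b + a:b'.
  # Assumes exactly one '~' and no operators besides '+' and ':'.
  def from_reduced_formula(reduced_exp)
    lhs, rhs = reduced_exp.split('~', 2).map(&:strip)

    @left_terms = lhs.split('+').map(&:strip)

    @right_terms = rhs.split('+').map do |term|
      parts = term.split(':').map(&:strip)
      # A lone variable stays a string; an interaction becomes an array
      parts.size == 1 ? parts.first : parts
    end

    self # assumption: return the object so calls can be chained
  end
end

f = Formula.new.from_reduced_formula('y ~ a + b + a:b')
p f.left_terms
p f.right_terms
```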