Last active
August 29, 2015 13:57
Notepad++ Python Script for Analyzing Nested Parentheses and Comma-Separated Arguments. Posted to https://sourceforge.net/p/notepad-plus/discussion/1290590/thread/c17ed459/
#!/usr/bin/python
# input_text = editor.getSelText()
input_text = '=HLOOKUP(TablePeriods[[#Headers],[YearMonth]],TablePeriods[[#All],[YearMonth]],MATCH([@Date],INDIRECT("TablePeriods[[#All],["&[@Account]&"]]"),-1))'

# initialize the state table
state_table = {}

# event list and corresponding state table
# each state row has one extra, final entry for any character that matches none of these events
event_list = ( "\"", "&", "(", ")", ",", "\n\r\t ", "[", "]", "{", "}" )
state_table[0] = ((1,0), (0,1), (5,0), (0,3), (0,4), (0,5), (2,0), (0,0), (6,0), (0,0), (0,0)) # normal state
state_table[1] = ((0,0), (1,0), (1,0), (1,0), (1,0), (1,0), (1,0), (1,0), (1,0), (1,0), (1,0)) # inside double-quoted string
state_table[2] = ((2,0), (2,0), (2,0), (2,0), (2,0), (2,0), (3,0), (0,0), (2,0), (2,0), (2,0)) # inside bracketed table reference
state_table[3] = ((3,0), (3,0), (3,0), (3,0), (3,0), (3,0), (4,0), (2,0), (3,0), (3,0), (3,0)) # inside double-bracketed table reference
state_table[4] = ((4,0), (4,0), (4,0), (4,0), (4,0), (4,0), (-1,0), (3,0), (4,0), (4,0), (4,0)) # inside triple-bracketed table reference (I don't think this exists); a further "[" leads to the undefined state -1
state_table[5] = ((1,2), (0,2), (5,2), (0,0), (0,2), (0,5), (2,2), (0,2), (0,2), (0,2), (0,2)) # found left-paren; only wrap and indent if not empty, like =row()
state_table[6] = ((6,0), (6,0), (6,0), (6,0), (6,0), (6,0), (6,0), (6,0), (6,0), (0,0), (6,0)) # inside curly-braced array

# table of actions to take, corresponding to the right-hand part of the state_table pairs;
# undefined numbers (such as 5, used for whitespace) emit nothing
action_table = {
    0: "{z}",
    1: "{newline}{indent}{z}{newline}{indent}",
    2: "{newline}{indent}{z}",
    3: "{newline}{indent}{z}",
    4: "{z}{newline}{indent}"
}

# table of amounts to change the paren_depth based on the action
depth_table = {
    2: +1,
    3: -1
}

# create mapping of characters to event number, such as mapping "&" to 1
char_to_event = {char: i for i, event_chars in enumerate(event_list) for char in event_chars}

# initialize the state, parenthesis depth, and output text
current_state = 0
paren_depth = 0
output_text = ""

# define the tab and new line characters
TAB_CHAR = "\t"
NEW_LINE = "\r\n"

for z in input_text:
    event = char_to_event.get(z, -1) # determine the event from the character; default to -1 (the last column) for everything not specified
    current_state, take_action = state_table[current_state][event] # move to the next state and set the take_action value
    paren_depth += depth_table.get(take_action, 0) # adjust the depth before formatting so the indent reflects the new depth
    output_text += action_table.get(take_action, "").format(
        z=z,
        indent=TAB_CHAR*paren_depth,
        newline=NEW_LINE)

# editor.replaceSel(output_text)
print(output_text)
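For anyone reading the script, the event lookup may be the least obvious part, so here is a small standalone sketch of it. It copies `event_list` and the state 0 row from the script; the name `normal_row` and the sample characters are only for illustration. A character that is not in any event string falls back to -1, which Python treats as an index to the last entry of the row.

```python
# Copied from the script: ten event strings, one or more characters each.
event_list = ('"', "&", "(", ")", ",", "\n\r\t ", "[", "]", "{", "}")
char_to_event = {char: i for i, event_chars in enumerate(event_list) for char in event_chars}

# State 0 row from the script (illustrative name): eleven (next_state, action) pairs.
normal_row = ((1,0), (0,1), (5,0), (0,3), (0,4), (0,5), (2,0), (0,0), (6,0), (0,0), (0,0))

for ch in ("&", "\t", " ", "x"):
    # Unlisted characters get -1, which selects the final "anything else" column.
    event = char_to_event.get(ch, -1)
    print(repr(ch), event, normal_row[event])
```

All four whitespace characters share event 5, so they take the same transition, and "x" lands on the catch-all column without needing its own entry.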
Oh man, as soon as I posted this, I realized that I left out support for references to Table cells, such as TablePeriods[[#Headers],[YearMonth]]. I'll have to add that soon.
Okay. I believe that states 3-5 handle square-bracket depth pretty well. Square brackets inside quotes will be ignored. Should be all good on that front.
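To illustrate the idea, here is a simplified sketch, not the script itself. It swaps the numbered bracket states for a plain depth counter, and the function name `top_level_commas` is made up for this example. It reports only the commas that sit outside both quotes and square brackets, which are the ones that should trigger a line break.

```python
def top_level_commas(formula):
    """Return the indexes of commas outside quotes and square brackets.

    Hypothetical helper: uses a depth counter where the script uses one
    state per bracket level.
    """
    in_quotes = False
    bracket_depth = 0
    positions = []
    for i, ch in enumerate(formula):
        if in_quotes:
            # Inside quotes everything is ignored until the closing quote.
            in_quotes = ch != '"'
        elif bracket_depth > 0:
            # Inside brackets only bracket characters matter.
            if ch == "[":
                bracket_depth += 1
            elif ch == "]":
                bracket_depth -= 1
        elif ch == '"':
            in_quotes = True
        elif ch == "[":
            bracket_depth = 1
        elif ch == ",":
            positions.append(i)
    return positions


print(top_level_commas('F(A[[x],[y]],"a,b",c)'))  # prints [12, 18]
```

The comma between `[x]` and `[y]` and the comma inside `"a,b"` are skipped; only the two argument separators are reported.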
Uploading an improvement that handles curly braces and empty parentheses
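For checking this behaviour outside Notepad++, one option is to wrap the same tables in a function. This is only a sketch: `format_formula` and the upper-case table names are invented here, while the table contents are copied from the script.

```python
# Tables copied from the script; the names are illustrative.
EVENTS = ('"', "&", "(", ")", ",", "\n\r\t ", "[", "]", "{", "}")
STATES = {
    0: ((1,0), (0,1), (5,0), (0,3), (0,4), (0,5), (2,0), (0,0), (6,0), (0,0), (0,0)),
    1: ((0,0), (1,0), (1,0), (1,0), (1,0), (1,0), (1,0), (1,0), (1,0), (1,0), (1,0)),
    2: ((2,0), (2,0), (2,0), (2,0), (2,0), (2,0), (3,0), (0,0), (2,0), (2,0), (2,0)),
    3: ((3,0), (3,0), (3,0), (3,0), (3,0), (3,0), (4,0), (2,0), (3,0), (3,0), (3,0)),
    4: ((4,0), (4,0), (4,0), (4,0), (4,0), (4,0), (-1,0), (3,0), (4,0), (4,0), (4,0)),
    5: ((1,2), (0,2), (5,2), (0,0), (0,2), (0,5), (2,2), (0,2), (0,2), (0,2), (0,2)),
    6: ((6,0), (6,0), (6,0), (6,0), (6,0), (6,0), (6,0), (6,0), (6,0), (0,0), (6,0)),
}
ACTIONS = {
    0: "{z}",
    1: "{newline}{indent}{z}{newline}{indent}",
    2: "{newline}{indent}{z}",
    3: "{newline}{indent}{z}",
    4: "{z}{newline}{indent}",
}
DEPTH = {2: +1, 3: -1}
CHAR_TO_EVENT = {ch: i for i, chars in enumerate(EVENTS) for ch in chars}


def format_formula(text, tab="\t", newline="\r\n"):
    """Run the state machine over text and return the formatted string."""
    state, depth, pieces = 0, 0, []
    for ch in text:
        state, action = STATES[state][CHAR_TO_EVENT.get(ch, -1)]
        depth += DEPTH.get(action, 0)
        pieces.append(ACTIONS.get(action, "").format(
            z=ch, indent=tab * depth, newline=newline))
    return "".join(pieces)


print(repr(format_formula("=ROW()")))     # '=ROW()'
print(repr(format_formula("={1,2}")))     # '={1,2}'
print(repr(format_formula("=SUM(1,2)")))  # '=SUM(\r\n\t1,\r\n\t2\r\n)'
```

Empty parentheses and curly-braced arrays pass through unchanged, while a call with arguments is split onto indented lines.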
Now using spaces instead of tabs for code.
I just made some major improvements based on suggestions from http://codereview.stackexchange.com/questions/44699/how-pythonic-is-my-excel-formula-analysis-state-machine
This is my implementation of a state machine. No idea if this is the Pythonic way to do a state machine...