Created
March 25, 2014 12:10
Parsing CSV Data With An Input Stream And A Finite State Machine
<cfcomponent
	output="false"
	hint="I listen for CSV parser events and compile an array of arrays.">
	<cffunction
		name="init"
		access="public"
		returntype="any"
		output="false"
		hint="I initialize this component.">
		<!--- Set up the data. --->
		<cfset variables.csvData = [] />
		<!--- Return this object reference. --->
		<cfreturn this />
	</cffunction>
	<cffunction
		name="getData"
		access="public"
		returntype="array"
		output="false"
		hint="I return the current data collection.">
		<cfreturn duplicate( variables.csvData ) />
	</cffunction>
	<cffunction
		name="handleEvent"
		access="public"
		returntype="any"
		output="false"
		hint="I listen for and then respond to events published by a CSV parser.">
		<!--- Define arguments. --->
		<cfargument
			name="eventType"
			type="string"
			required="true"
			hint="I am the type of event being raised."
			/>
		<cfargument
			name="eventData"
			type="string"
			required="false"
			default=""
			hint="I am the (optional) data being published along with the CSV parsing event."
			/>
		<!---
		<cffile
			action="append"
			file="#expandPath( './log.txt' )#"
			output="#arguments.eventType# [#arguments.eventData#]"
			addnewline="true"
			/>
		--->
		<!--- Check to see what kind of event we have. --->
		<cfif (arguments.eventType eq "startRow")>
			<!--- Push a new row onto the data. --->
			<cfset arrayAppend(
				variables.csvData,
				arrayNew( 1 )
				) />
		<cfelseif (arguments.eventType eq "endField")>
			<!--- Push this field onto the latest row. --->
			<cfset arrayAppend(
				variables.csvData[ arrayLen( variables.csvData ) ],
				arguments.eventData
				) />
		</cfif>
		<!--- Return this object reference. --->
		<cfreturn this />
	</cffunction>
</cfcomponent>
<!---
	Get the file path to the CSV data file that we will be reading
	in with an Input Stream (so as not to have to read the whole
	file at one time).
--->
<cfset filePath = expandPath( "./widgets.csv" ) />
<!---
	Create our handler. This must have one method - handleEvent() -
	which can respond to events published by the CSV parser.
--->
<cfset handler = createObject( "component", "Handler" ).init() />
<!---
	Create our evented CSV data parser. Parsing starts as part of
	init(), so the handler holds all of the data once init() returns.
--->
<cfset parser = createObject( "component", "CSVParser" ).init(
	filePath,
	handler
	) />
<!--- Output the result. --->
<cfdump
	var="#handler.getData()#"
	label="CSV Data"
	/>
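Because the parser only requires a handleEvent() method on its listener, other handlers can be swapped in without touching the parser. The following is a minimal sketch, not part of the original gist: RowCounter is a hypothetical component name, and it assumes the parser publishes the same startRow and endField events that the Handler component above listens for. It tallies rows and fields instead of storing them, so the handler itself holds only two counters regardless of file size.

```cfml
<!--- RowCounter.cfc (hypothetical file name; illustration only). --->
<cfcomponent
	output="false"
	hint="I count the rows and fields announced by a CSV parser.">
	<cffunction name="init" access="public" returntype="any" output="false">
		<!--- Start both tallies at zero. --->
		<cfset variables.rowCount = 0 />
		<cfset variables.fieldCount = 0 />
		<cfreturn this />
	</cffunction>
	<cffunction name="handleEvent" access="public" returntype="any" output="false">
		<cfargument name="eventType" type="string" required="true" />
		<cfargument name="eventData" type="string" required="false" default="" />
		<!--- Only two of the published events matter for counting. --->
		<cfif (arguments.eventType eq "startRow")>
			<cfset variables.rowCount = variables.rowCount + 1 />
		<cfelseif (arguments.eventType eq "endField")>
			<cfset variables.fieldCount = variables.fieldCount + 1 />
		</cfif>
		<cfreturn this />
	</cffunction>
	<cffunction name="getCounts" access="public" returntype="struct" output="false">
		<!--- Return a fresh struct so callers cannot alter the tallies. --->
		<cfset var counts = {} />
		<cfset counts.rows = variables.rowCount />
		<cfset counts.fields = variables.fieldCount />
		<cfreturn counts />
	</cffunction>
</cfcomponent>
```

It would be wired up the same way as Handler: create it with createObject( "component", "RowCounter" ).init(), pass it as the second argument to the parser's init(), then dump getCounts().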
<cfcomponent
	output="false"
	hint="I parse a CSV file using a buffered input reader. Rather than parsing the entire file at one time, events are published as aspects of the file are read.">
	<!---
		This finite state machine is used to parse Comma Separated
		Values. The available states are:
		- Pre-Data (first state - only used once)
		- Between Fields
		- Non-Quoted Value
		- Quoted Value
		- Escaped Value
		- Carriage Return
		- New Line
	--->
	<cffunction
		name="init"
		access="public"
		returntype="any"
		output="false"
		hint="I initialize this component instance.">
		<!--- Define arguments. --->
		<cfargument
			name="filePath"
			type="string"
			required="true"
			hint="I am the file path to the CSV data."
			/>
		<cfargument
			name="handler"
			type="any"
			required="true"
			hint="I am the object that listens for CSV parsing events. Only one method is required: handleEvent()."
			/>
		<!--- Store the file path. --->
		<cfset variables.filePath = arguments.filePath />
		<!--- Store the handler for the parsing events. --->
		<cfset variables.handler = arguments.handler />
		<!---
			Store a buffered input stream to the given file path.
			This will allow us to optimize the input process while,
			at the same time, not having to hold the entire file
			in memory at any given time.
		--->
		<cfset variables.inputStream = createObject( "java", "java.io.BufferedInputStream" ).init(
			createObject( "java", "java.io.FileInputStream" ).init(
				javaCast( "string", variables.filePath )
				)
			) />
		<!---
			I am the current value buffer. As we are building field
			values up, a character at a time, we will need a place
			to hold them before we publish a field event.
		--->
		<cfset variables.fieldBuffer = [] />
		<!---
			I am the current state. In our case, a state is
			represented by a function that takes one character
			at a time. To begin with, we will put the parser into
			the pre-data state (this is the only time that it will
			be used, in order to see if any data is in the document).
		--->
		<cfset variables.state = this.inPreData />
		<!--- Start the actual CSV input stream parsing. --->
		<cfset this.parse() />
		<!--- Return this object reference. --->
		<cfreturn this />
	</cffunction>
	<cffunction
		name="parse"
		access="public"
		returntype="any"
		output="false"
		hint="I perform the actual parsing of the CSV input stream.">
		<!--- Define the local scope. --->
		<cfset var local = {} />
		<!---
			Even if the document has no data, we will, at the very
			least, start and end the document.
		--->
		<cfset this.publish( "startDocument" ) />
		<cftry>
			<!--- Read the first byte in the CSV input stream. --->
			<cfset local.nextByte = variables.inputStream.read() />
			<!---
				The input stream will be providing a single byte at a
				time. It will continue doing this until it hits the end
				of the stream, at which point, it will return -1.
			--->
			<cfloop condition="(local.nextByte neq -1)">
				<!---
					Get the character version of the byte.
					NOTE: Each byte is treated as one character, so
					multi-byte encodings (such as UTF-8 beyond ASCII)
					will not be decoded correctly.
				--->
				<cfset local.nextCharacter = chr( local.nextByte ) />
				<!---
					Pass the character off to the current state. When the
					state looks at the character, it will (potentially)
					announce events and then return the state to which we
					should transition.
				--->
				<cfset variables.state = variables.state( local.nextCharacter ) />
				<!--- Read the next byte. --->
				<cfset local.nextByte = variables.inputStream.read() />
			</cfloop>
			<!---
				Now that the document has ended, we need to pass the
				end-of-file onto the current state so that it can wrap
				up appropriately (or fail if the End-of-File is in an
				inappropriate place). At this point, we don't care about
				storing the resultant state since we are done parsing.
				NOTE: We are using EOT (end of transmission) to denote
				the "End of File" since we can use RegEx to find that.
			--->
			<cfset variables.state( chr( 4 ) ) />
			<cffinally>
				<!--- Release the file handle, even if parsing failed. --->
				<cfset variables.inputStream.close() />
			</cffinally>
		</cftry>
		<!--- End the document. --->
		<cfset this.publish( "endDocument" ) />
		<!--- Return this object reference. --->
		<cfreturn this />
	</cffunction>
	<cffunction
		name="publish"
		access="public"
		returntype="any"
		output="false"
		hint="I publish the given event with the given data.">
		<!--- Define arguments. --->
		<cfargument
			name="eventType"
			type="string"
			required="true"
			hint="I am the event type. Possible types are: startDocument, startRow, startField, endField, endRow, endDocument."
			/>
		<cfargument
			name="eventData"
			type="string"
			required="false"
			hint="I am the optional data to announce with the event."
			/>
		<!---
			For our purposes, we'll just pass the invocation
			arguments along to the event handler.
		--->
		<cfset variables.handler.handleEvent(
			argumentCollection = arguments
			) />
		<!--- Return this object reference for method chaining. --->
		<cfreturn this />
	</cffunction>
	<cffunction
		name="inBetweenFields"
		access="public"
		returntype="any"
		output="false"
		hint="I process the next character when the parser is between fields (just after a comma).">
		<!--- Define arguments. --->
		<cfargument
			name="nextCharacter"
			type="string"
			required="true"
			hint="I am the next character in the input stream."
			/>
		<!--- Define the local scope. --->
		<cfset var local = {} />
		<!--- Field character. --->
		<cfif reFind( "[^\r\n,""\x04]", arguments.nextCharacter )>
			<!--- Start the new field. --->
			<cfset this.publish( "startField" ) />
			<!---
				Add the current character to the field buffer (as we
				begin to build up the field value).
			--->
			<cfset arrayAppend(
				variables.fieldBuffer,
				arguments.nextCharacter
				) />
			<!--- Move to the non-quoted value. --->
			<cfreturn this.inNonQuotedValue />
		<!--- Comma. --->
		<cfelseif (arguments.nextCharacter eq ",")>
			<!--- Start and end an empty field. --->
			<cfset this.publish( "startField" ) />
			<cfset this.publish( "endField", "" ) />
			<!--- Stay in between fields. --->
			<cfreturn this.inBetweenFields />
		<!--- Carriage return. --->
		<cfelseif reFind( "\r", arguments.nextCharacter )>
			<!--- Start and end an empty field. --->
			<cfset this.publish( "startField" ) />
			<cfset this.publish( "endField", "" ) />
			<!--- End the row. --->
			<cfset this.publish( "endRow" ) />
			<!--- Move to the carriage return. --->
			<cfreturn this.inCarriageReturn />
		<!--- New line. --->
		<cfelseif reFind( "\n", arguments.nextCharacter )>
			<!--- Start and end an empty field. --->
			<cfset this.publish( "startField" ) />
			<cfset this.publish( "endField", "" ) />
			<!--- End the row. --->
			<cfset this.publish( "endRow" ) />
			<!--- Move to the new line. --->
			<cfreturn this.inNewLine />
		<!--- Double quote. --->
		<cfelseif (arguments.nextCharacter eq """")>
			<!--- Start the new field. --->
			<cfset this.publish( "startField" ) />
			<!--- Move to the quoted value. --->
			<cfreturn this.inQuotedValue />
		<!--- End of Transmission. --->
		<cfelseif reFind( "\x04", arguments.nextCharacter )>
			<!--- Start and end an empty field. --->
			<cfset this.publish( "startField" ) />
			<cfset this.publish( "endField", "" ) />
			<!--- End the row. --->
			<cfset this.publish( "endRow" ) />
		<cfelse>
			<!---
				If we made it this far, this state has been put
				into an invalid state / transition.
			--->
			<cfthrow
				type="InvalidStateTransition"
				message="inBetweenFields[#arguments.nextCharacter#]"
				/>
		</cfif>
	</cffunction>
	<cffunction
		name="inCarriageReturn"
		access="public"
		returntype="any"
		output="false"
		hint="I process the next character when the previous character was a carriage return.">
		<!--- Define arguments. --->
		<cfargument
			name="nextCharacter"
			type="string"
			required="true"
			hint="I am the next character in the input stream."
			/>
		<!--- Define the local scope. --->
		<cfset var local = {} />
		<!--- New line. --->
		<cfif reFind( "\n", arguments.nextCharacter )>
			<!--- Move to the new line. --->
			<cfreturn this.inNewLine />
		<!--- Carriage return. --->
		<cfelseif reFind( "\r", arguments.nextCharacter )>
			<!--- Start and end an empty row. --->
			<cfset this.publish( "startRow" ) />
			<cfset this.publish( "endRow" ) />
			<!--- Stay in the carriage return. --->
			<cfreturn this.inCarriageReturn />
		<!--- Field character. --->
		<cfelseif reFind( "[^\r\n,""\x04]", arguments.nextCharacter )>
			<!--- Start the next row. --->
			<cfset this.publish( "startRow" ) />
			<!--- Start the next field. --->
			<cfset this.publish( "startField" ) />
			<!--- Add the current character to the field buffer. --->
			<cfset arrayAppend(
				variables.fieldBuffer,
				arguments.nextCharacter
				) />
			<!--- Move to the non-quoted value. --->
			<cfreturn this.inNonQuotedValue />
		<!--- Comma. --->
		<cfelseif (arguments.nextCharacter eq ",")>
			<!--- Start the new row. --->
			<cfset this.publish( "startRow" ) />
			<!--- Start and end the empty field. --->
			<cfset this.publish( "startField" ) />
			<cfset this.publish( "endField", "" ) />
			<!--- Move to in between fields. --->
			<cfreturn this.inBetweenFields />
		<!---
			Double quote. Without this branch, a quoted field at the
			start of a row that follows a bare carriage return would
			fall through to the invalid-transition error below.
		--->
		<cfelseif (arguments.nextCharacter eq """")>
			<!--- Start the new row. --->
			<cfset this.publish( "startRow" ) />
			<!--- Start the next field. --->
			<cfset this.publish( "startField" ) />
			<!--- Move to the quoted value. --->
			<cfreturn this.inQuotedValue />
		<!--- End of Transmission. --->
		<cfelseif reFind( "\x04", arguments.nextCharacter )>
			<!--- Already ended the row - nothing to publish. --->
		<cfelse>
			<!---
				If we made it this far, this state has been put
				into an invalid state / transition.
			--->
			<cfthrow
				type="InvalidStateTransition"
				message="inCarriageReturn[#arguments.nextCharacter#]"
				/>
		</cfif>
	</cffunction>
	<cffunction
		name="inEscapedValue"
		access="public"
		returntype="any"
		output="false"
		hint="I process the next character after a double quote was seen inside a quoted value (either an escaped quote or the end of the field).">
		<!--- Define arguments. --->
		<cfargument
			name="nextCharacter"
			type="string"
			required="true"
			hint="I am the next character in the input stream."
			/>
		<!--- Define the local scope. --->
		<cfset var local = {} />
		<!--- Double-escaped quote. --->
		<cfif (arguments.nextCharacter eq """")>
			<!---
				This is just an embedded quote. Add it to the
				field buffer. We don't have to worry about the
				previous double-quote as it was only used to
				escape this one.
			--->
			<cfset arrayAppend(
				variables.fieldBuffer,
				arguments.nextCharacter
				) />
			<!--- Return back to the quoted value. --->
			<cfreturn this.inQuotedValue />
		<!--- Comma. --->
		<cfelseif (arguments.nextCharacter eq ",")>
			<!---
				The previous quote was actually the end of the
				current field. End the current field.
			--->
			<cfset this.publish(
				"endField",
				arrayToList( variables.fieldBuffer, "" )
				) />
			<!--- Clear the field buffer. --->
			<cfset variables.fieldBuffer = [] />
			<!--- Move to in between fields. --->
			<cfreturn this.inBetweenFields />
		<!--- Carriage return. --->
		<cfelseif reFind( "\r", arguments.nextCharacter )>
			<!---
				The previous quote was actually the end of the
				current field. End the current field.
			--->
			<cfset this.publish(
				"endField",
				arrayToList( variables.fieldBuffer, "" )
				) />
			<!--- Clear the field buffer. --->
			<cfset variables.fieldBuffer = [] />
			<!--- End the row. --->
			<cfset this.publish( "endRow" ) />
			<!--- Move to the carriage return. --->
			<cfreturn this.inCarriageReturn />
		<!--- New line. --->
		<cfelseif reFind( "\n", arguments.nextCharacter )>
			<!---
				The previous quote was actually the end of the
				current field. End the current field.
			--->
			<cfset this.publish(
				"endField",
				arrayToList( variables.fieldBuffer, "" )
				) />
			<!--- Clear the field buffer. --->
			<cfset variables.fieldBuffer = [] />
			<!--- End the row. --->
			<cfset this.publish( "endRow" ) />
			<!--- Move to the new line. --->
			<cfreturn this.inNewLine />
		<!--- End of Transmission. --->
		<cfelseif reFind( "\x04", arguments.nextCharacter )>
			<!---
				The previous quote was actually the end of the
				current field. End the current field.
			--->
			<cfset this.publish(
				"endField",
				arrayToList( variables.fieldBuffer, "" )
				) />
			<!--- Clear the field buffer. --->
			<cfset variables.fieldBuffer = [] />
			<!--- End the row. --->
			<cfset this.publish( "endRow" ) />
		<cfelse>
			<!---
				If we made it this far, this state has been put
				into an invalid state / transition.
			--->
			<cfthrow
				type="InvalidStateTransition"
				message="inEscapedValue[#arguments.nextCharacter#]"
				/>
		</cfif>
	</cffunction>
	<cffunction
		name="inNewLine"
		access="public"
		returntype="any"
		output="false"
		hint="I process the next character when the previous character was a new line.">
		<!--- Define arguments. --->
		<cfargument
			name="nextCharacter"
			type="string"
			required="true"
			hint="I am the next character in the input stream."
			/>
		<!--- Define the local scope. --->
		<cfset var local = {} />
		<!--- Carriage return. --->
		<cfif reFind( "\r", arguments.nextCharacter )>
			<!---
				The previous row has already ended, so a carriage
				return here begins another line break. Start and end
				an empty row (an endRow on its own would be unpaired).
			--->
			<cfset this.publish( "startRow" ) />
			<cfset this.publish( "endRow" ) />
			<!--- Move to the carriage return. --->
			<cfreturn this.inCarriageReturn />
		<!--- New line. --->
		<cfelseif reFind( "\n", arguments.nextCharacter )>
			<!--- Start and end an empty row. --->
			<cfset this.publish( "startRow" ) />
			<cfset this.publish( "endRow" ) />
			<!--- Stay in the new line. --->
			<cfreturn this.inNewLine />
		<!--- Field character. --->
		<cfelseif reFind( "[^\r\n,""\x04]", arguments.nextCharacter )>
			<!--- Start the next row. --->
			<cfset this.publish( "startRow" ) />
			<!--- Start the next field. --->
			<cfset this.publish( "startField" ) />
			<!--- Add the current character to the field buffer. --->
			<cfset arrayAppend(
				variables.fieldBuffer,
				arguments.nextCharacter
				) />
			<!--- Move to the non-quoted value. --->
			<cfreturn this.inNonQuotedValue />
		<!--- Comma. --->
		<cfelseif (arguments.nextCharacter eq ",")>
			<!--- Start the new row. --->
			<cfset this.publish( "startRow" ) />
			<!--- Start and end the empty field. --->
			<cfset this.publish( "startField" ) />
			<cfset this.publish( "endField", "" ) />
			<!--- Move to in between fields. --->
			<cfreturn this.inBetweenFields />
		<!--- Double quote. --->
		<cfelseif (arguments.nextCharacter eq """")>
			<!--- Start the new row. --->
			<cfset this.publish( "startRow" ) />
			<!--- Start the next field. --->
			<cfset this.publish( "startField" ) />
			<!--- Move to the quoted value. --->
			<cfreturn this.inQuotedValue />
		<!--- End of Transmission. --->
		<cfelseif reFind( "\x04", arguments.nextCharacter )>
			<!--- Already ended the row, nothing left to do. --->
		<cfelse>
			<!---
				If we made it this far, this state has been put
				into an invalid state / transition.
			--->
			<cfthrow
				type="InvalidStateTransition"
				message="inNewLine[#arguments.nextCharacter#]"
				/>
		</cfif>
	</cffunction>
	<cffunction
		name="inNonQuotedValue"
		access="public"
		returntype="any"
		output="false"
		hint="I process the next character when the parser is inside a non-quoted field value.">
		<!--- Define arguments. --->
		<cfargument
			name="nextCharacter"
			type="string"
			required="true"
			hint="I am the next character in the input stream."
			/>
		<!--- Define the local scope. --->
		<cfset var local = {} />
		<!--- Field character. --->
		<cfif reFind( "[^\r\n,""\x04]", arguments.nextCharacter )>
			<!--- Add the current character to the field buffer. --->
			<cfset arrayAppend(
				variables.fieldBuffer,
				arguments.nextCharacter
				) />
			<!--- Stay in the non-quoted value. --->
			<cfreturn this.inNonQuotedValue />
		<!--- Comma. --->
		<cfelseif (arguments.nextCharacter eq ",")>
			<!--- End the current field. --->
			<cfset this.publish(
				"endField",
				arrayToList( variables.fieldBuffer, "" )
				) />
			<!--- Clear the field buffer. --->
			<cfset variables.fieldBuffer = [] />
			<!--- Move to in between fields. --->
			<cfreturn this.inBetweenFields />
		<!--- Carriage return. --->
		<cfelseif reFind( "\r", arguments.nextCharacter )>
			<!--- End the current field. --->
			<cfset this.publish(
				"endField",
				arrayToList( variables.fieldBuffer, "" )
				) />
			<!--- Clear the field buffer. --->
			<cfset variables.fieldBuffer = [] />
			<!--- End the row. --->
			<cfset this.publish( "endRow" ) />
			<!--- Move to the carriage return. --->
			<cfreturn this.inCarriageReturn />
		<!--- New line. --->
		<cfelseif reFind( "\n", arguments.nextCharacter )>
			<!--- End the current field. --->
			<cfset this.publish(
				"endField",
				arrayToList( variables.fieldBuffer, "" )
				) />
			<!--- Clear the field buffer. --->
			<cfset variables.fieldBuffer = [] />
			<!--- End the row. --->
			<cfset this.publish( "endRow" ) />
			<!--- Move to the new line. --->
			<cfreturn this.inNewLine />
		<!--- End of Transmission. --->
		<cfelseif reFind( "\x04", arguments.nextCharacter )>
			<!--- End the current field. --->
			<cfset this.publish(
				"endField",
				arrayToList( variables.fieldBuffer, "" )
				) />
			<!--- Clear the field buffer. --->
			<cfset variables.fieldBuffer = [] />
			<!--- End the row. --->
			<cfset this.publish( "endRow" ) />
		<cfelse>
			<!---
				If we made it this far, this state has been put
				into an invalid state / transition.
			--->
			<cfthrow
				type="InvalidStateTransition"
				message="inNonQuotedValue[#arguments.nextCharacter#]"
				/>
		</cfif>
	</cffunction>
	<cffunction
		name="inPreData"
		access="public"
		returntype="any"
		output="false"
		hint="I process the first character in the input stream (this state is only used once).">
		<!--- Define arguments. --->
		<cfargument
			name="nextCharacter"
			type="string"
			required="true"
			hint="I am the next character in the input stream."
			/>
		<!--- Define the local scope. --->
		<cfset var local = {} />
		<!--- Comma. --->
		<cfif (arguments.nextCharacter eq ",")>
			<!--- Start the current row. --->
			<cfset this.publish( "startRow" ) />
			<!--- Start and end an empty field. --->
			<cfset this.publish( "startField" ) />
			<cfset this.publish( "endField", "" ) />
			<!--- Move to in between fields. --->
			<cfreturn this.inBetweenFields />
		<!--- Carriage return. --->
		<cfelseif reFind( "\r", arguments.nextCharacter )>
			<!--- Start and end the row. --->
			<cfset this.publish( "startRow" ) />
			<cfset this.publish( "endRow" ) />
			<!--- Move to the carriage return. --->
			<cfreturn this.inCarriageReturn />
		<!--- New line. --->
		<cfelseif reFind( "\n", arguments.nextCharacter )>
			<!--- Start and end the row. --->
			<cfset this.publish( "startRow" ) />
			<cfset this.publish( "endRow" ) />
			<!--- Move to the new line. --->
			<cfreturn this.inNewLine />
		<!--- Double quote. --->
		<cfelseif (arguments.nextCharacter eq """")>
			<!--- Start the first row. --->
			<cfset this.publish( "startRow" ) />
			<!--- Start the new field. --->
			<cfset this.publish( "startField" ) />
			<!--- Move to the quoted value. --->
			<cfreturn this.inQuotedValue />
		<!--- Field character. --->
		<cfelseif reFind( "[^\r\n,""\x04]", arguments.nextCharacter )>
			<!--- Start the first row. --->
			<cfset this.publish( "startRow" ) />
			<!--- Start the new field. --->
			<cfset this.publish( "startField" ) />
			<!---
				Add the current character to the field buffer (as we
				begin to build up the field value).
			--->
			<cfset arrayAppend(
				variables.fieldBuffer,
				arguments.nextCharacter
				) />
			<!--- Move to the non-quoted value. --->
			<cfreturn this.inNonQuotedValue />
		<!--- End of Transmission. --->
		<cfelseif reFind( "\x04", arguments.nextCharacter )>
			<!--- This file had no data, nothing left to do. --->
		<cfelse>
			<!---
				If we made it this far, this state has been put
				into an invalid state / transition.
			--->
			<cfthrow
				type="InvalidStateTransition"
				message="inPreData[#arguments.nextCharacter#]"
				/>
		</cfif>
	</cffunction>
	<cffunction
		name="inQuotedValue"
		access="public"
		returntype="any"
		output="false"
		hint="I process the next character when the parser is inside a quoted field value.">
		<!--- Define arguments. --->
		<cfargument
			name="nextCharacter"
			type="string"
			required="true"
			hint="I am the next character in the input stream."
			/>
		<!--- Define the local scope. --->
		<cfset var local = {} />
		<!--- End of Transmission. --->
		<cfif reFind( "\x04", arguments.nextCharacter )>
			<!---
				The input ended before the closing quote was found.
				Fail here rather than silently dropping the field.
			--->
			<cfthrow
				type="InvalidStateTransition"
				message="inQuotedValue[#arguments.nextCharacter#]"
				detail="The input ended inside a quoted value."
				/>
		<!--- Double quote. --->
		<cfelseif (arguments.nextCharacter eq """")>
			<!---
				Not sure if this quote is an escaped quote or is the
				end of this quoted field. Move to the escaped state
				for further testing.
			--->
			<cfreturn this.inEscapedValue />
		<!--- Any other character (including commas and line breaks). --->
		<cfelse>
			<!--- Add the current character to the field buffer. --->
			<cfset arrayAppend(
				variables.fieldBuffer,
				arguments.nextCharacter
				) />
			<!--- Stay in the quoted value. --->
			<cfreturn this.inQuotedValue />
		</cfif>
	</cffunction>
</cfcomponent>
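Each state function throws an InvalidStateTransition error when it receives a character it cannot handle, for example a stray character directly after the closing quote of a quoted field. A calling page may want to trap that. The following is a minimal sketch under the same assumptions as the usage script above (Handler.cfc, CSVParser.cfc and widgets.csv in the current directory); the wording of the error message is illustrative only.

```cfml
<!--- Create the handler first so it is still available if parsing fails. --->
<cfset handler = createObject( "component", "Handler" ).init() />
<cftry>
	<!--- init() runs the parse, so any parsing error surfaces here. --->
	<cfset createObject( "component", "CSVParser" ).init(
		expandPath( "./widgets.csv" ),
		handler
		) />
	<cfcatch type="InvalidStateTransition">
		<!--- The message names the state and the offending character. --->
		<cfoutput>
			<p>The CSV data is malformed: #htmlEditFormat( cfcatch.message )#</p>
		</cfoutput>
	</cfcatch>
</cftry>
<!--- Fields published before the failure are still in the handler. --->
<cfdump var="#handler.getData()#" label="CSV Data (rows parsed so far)" />
```

Because the handler receives events as they are published, it keeps whatever was parsed before the error, which can help locate the offending row.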