@mattyb149
Created May 19, 2017 19:04
A Groovy script for NiFi ExecuteScript to extract the schema from the header line of a CSV file
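import groovy.json.*

def flowFile = session.get()
if (!flowFile) return

// Default to a comma; an optional dynamic property named 'delimiter' overrides it
def delim = ','
try {
    delim = delimiter?.value ?: ','
} catch (MissingPropertyException mpe) {
    // No 'delimiter' property is set on the processor, so keep the default
}

try {
    // Read only the header line; withReader closes the reader and the underlying stream
    def line
    session.read(flowFile).withReader { line = it.readLine() }
    if (line == null) throw new IOException('Flow file is empty, no header line to read')

    // Build an Avro record schema with one nullable string field per column name.
    // tokenize() treats each character of delim as a separator and drops empty tokens.
    def json = new JsonBuilder()
    json {
        type('record')
        name('csv_record')
        fields(line.tokenize(delim).collect { col ->
            ['name': col, 'type': ['null', 'string']]
        })
    }
    flowFile = session.putAttribute(flowFile, 'avro.schema', json.toString())
    session.transfer(flowFile, REL_SUCCESS)
} catch (Exception e) {
    log.error('Error processing file, transferring to failure', e)
    session.transfer(flowFile, REL_FAILURE)
}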
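
To see what ends up in the avro.schema attribute without running NiFi, the schema-building step can be tried in a plain Groovy console. This is only a sketch: schemaFromHeader is a hypothetical helper that is not part of the gist, and the header line 'id,name,email' is made-up sample input. It reuses the same JsonBuilder and tokenize calls as the script above.

```groovy
import groovy.json.JsonBuilder
import groovy.json.JsonOutput
import groovy.json.JsonSlurper

// Hypothetical helper (not in the gist): the same schema-building logic,
// without the NiFi session, flow file, or logging calls.
String schemaFromHeader(String headerLine, String delim = ',') {
    def json = new JsonBuilder()
    json {
        type('record')
        name('csv_record')
        // One nullable string field per column name in the header
        fields(headerLine.tokenize(delim).collect { col ->
            ['name': col, 'type': ['null', 'string']]
        })
    }
    json.toString()
}

// Made-up sample header, for illustration only
def schema = schemaFromHeader('id,name,email')
println JsonOutput.prettyPrint(schema)

// Parse the result back to check its structure
def parsed = new JsonSlurper().parseText(schema)
assert parsed.type == 'record'
assert parsed.fields*.name == ['id', 'name', 'email']

// tokenize() skips empty tokens, so a blank column name produces no field
assert new JsonSlurper().parseText(schemaFromHeader('a,,b')).fields*.name == ['a', 'b']
```

Every column is typed as a union of null and string, so a reader using this schema will accept missing values but will not infer numeric or date types. Because tokenize() drops empty tokens, a header with a blank column name yields fewer fields than the data rows have columns.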