NiFi template that fetches CSV data from randomuser.me, capitalizes the first and last names, and outputs the full name
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><template><description>This template takes CSV data from randomuser.me and capitalizes the first name and last name from the input and outputs the full name</description><name>InvokeScriptedProcessor_Groovy_PrintFullName</name><snippet><connections><id>3287d7e9-d003-4b4e-8a84-83cba7972923</id><parentGroupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</parentGroupId><backPressureDataSizeThreshold>0 MB</backPressureDataSizeThreshold><backPressureObjectThreshold>0</backPressureObjectThreshold><destination><groupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</groupId><id>a2fc65ea-dea4-4e31-aaae-02adf8cdbce0</id><type>PROCESSOR</type></destination><flowFileExpiration>0 sec</flowFileExpiration><labelIndex>1</labelIndex><name></name><selectedRelationships>success</selectedRelationships><source><groupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</groupId><id>76d2915c-5ae5-484a-baaf-7640b36ba9e5</id><type>PROCESSOR</type></source><zIndex>0</zIndex></connections><connections><id>a7c6aaa6-d43e-4fe5-a33f-7da550c12d99</id><parentGroupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</parentGroupId><backPressureDataSizeThreshold>0 MB</backPressureDataSizeThreshold><backPressureObjectThreshold>0</backPressureObjectThreshold><destination><groupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</groupId><id>c171bb7a-fb18-4ce1-b60e-3d1fd1fb4c24</id><type>PROCESSOR</type></destination><flowFileExpiration>0 sec</flowFileExpiration><labelIndex>1</labelIndex><name></name><selectedRelationships>failure</selectedRelationships><selectedRelationships>success</selectedRelationships><source><groupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</groupId><id>13301a72-2a31-4b52-9135-88e3fdd89861</id><type>PROCESSOR</type></source><zIndex>0</zIndex></connections><connections><id>797091ef-e0e3-4e1e-b6e0-25b950fc1c9f</id><parentGroupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</parentGroupId><backPressureDataSizeThreshold>0 
MB</backPressureDataSizeThreshold><backPressureObjectThreshold>0</backPressureObjectThreshold><destination><groupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</groupId><id>76d2915c-5ae5-484a-baaf-7640b36ba9e5</id><type>PROCESSOR</type></destination><flowFileExpiration>0 sec</flowFileExpiration><labelIndex>1</labelIndex><name></name><selectedRelationships>success</selectedRelationships><source><groupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</groupId><id>c171bb7a-fb18-4ce1-b60e-3d1fd1fb4c24</id><type>PROCESSOR</type></source><zIndex>0</zIndex></connections><connections><id>2e003733-d75d-40e8-be56-dab03b89cfac</id><parentGroupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</parentGroupId><backPressureDataSizeThreshold>0 MB</backPressureDataSizeThreshold><backPressureObjectThreshold>0</backPressureObjectThreshold><destination><groupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</groupId><id>13301a72-2a31-4b52-9135-88e3fdd89861</id><type>PROCESSOR</type></destination><flowFileExpiration>0 sec</flowFileExpiration><labelIndex>1</labelIndex><name></name><selectedRelationships>Response</selectedRelationships><source><groupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</groupId><id>8f1c2905-425f-4971-bab4-29f95a4b3dba</id><type>PROCESSOR</type></source><zIndex>0</zIndex></connections><processors><id>13301a72-2a31-4b52-9135-88e3fdd89861</id><parentGroupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</parentGroupId><position><x>531.6305632075491</x><y>968.8695583000617</y></position><config><bulletinLevel>WARN</bulletinLevel><comments>The CSV Data coming from randomuser.me contains a header line but does not have a # on the beginning. 
This Processor adds the # to the beginning of the first line.</comments><concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount><defaultConcurrentTasks><entry><key>TIMER_DRIVEN</key><value>1</value></entry><entry><key>EVENT_DRIVEN</key><value>0</value></entry><entry><key>CRON_DRIVEN</key><value>1</value></entry></defaultConcurrentTasks><defaultSchedulingPeriod><entry><key>TIMER_DRIVEN</key><value>0 sec</value></entry><entry><key>CRON_DRIVEN</key><value>* * * * * ?</value></entry></defaultSchedulingPeriod><descriptors><entry><key>Regular Expression</key><value><defaultValue>(?s:^.*$)</defaultValue><description>The Search Value to search for in the FlowFile content. Only used for 'Literal Replace' and 'Regex Replace' matching strategies</description><displayName>Search Value</displayName><dynamic>false</dynamic><name>Regular Expression</name><required>true</required><sensitive>false</sensitive><supportsEl>true</supportsEl></value></entry><entry><key>Replacement Value</key><value><defaultValue>$1</defaultValue><description>The value to insert using the 'Replacement Strategy'. Using &quot;Regex Replace&quot; back-references to Regular Expression capturing groups are supported, but back-references that reference capturing groups that do not exist in the regular expression will be treated as literal value. Back References may also be referenced using the Expression Language, as '$1', '$2', etc. 
The single-tick marks MUST be included, as these variables are not &quot;Standard&quot; attribute names (attribute names must be quoted unless they contain only numbers, letters, and _).</description><displayName>Replacement Value</displayName><dynamic>false</dynamic><name>Replacement Value</name><required>true</required><sensitive>false</sensitive><supportsEl>true</supportsEl></value></entry><entry><key>Character Set</key><value><defaultValue>UTF-8</defaultValue><description>The Character Set in which the file is encoded</description><displayName>Character Set</displayName><dynamic>false</dynamic><name>Character Set</name><required>true</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Maximum Buffer Size</key><value><defaultValue>1 MB</defaultValue><description>Specifies the maximum amount of data to buffer (per file or per line, depending on the Evaluation Mode) in order to apply the replacement. If 'Entire Text' (in Evaluation Mode) is selected and the FlowFile is larger than this value, the FlowFile will be routed to 'failure'. In 'Line-by-Line' Mode, if a single line is larger than this value, the FlowFile will be routed to 'failure'. A default value of 1 MB is provided, primarily for 'Entire Text' mode. In 'Line-by-Line' Mode, a value such as 8 KB or 16 KB is suggested. This value is ignored if the &lt;Replacement Strategy&gt; property is set to one of: Append, Prepend, Always Replace</description><displayName>Maximum Buffer Size</displayName><dynamic>false</dynamic><name>Maximum Buffer Size</name><required>true</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Replacement Strategy</key><value><allowableValues><description>Insert the Replacement Value at the beginning of the FlowFile or the beginning of each line (depending on the Evaluation Mode). For &quot;Line-by-Line&quot; Evaluation Mode, the value will be prepended to each line. 
For &quot;Entire Text&quot; evaluation mode, the value will be prepended to the entire text.</description><displayName>Prepend</displayName><value>Prepend</value></allowableValues><allowableValues><description>Insert the Replacement Value at the end of the FlowFile or the end of each line (depending on the Evluation Mode). For &quot;Line-by-Line&quot; Evaluation Mode, the value will be appended to each line. For &quot;Entire Text&quot; evaluation mode, the value will be appended to the entire text.</description><displayName>Append</displayName><value>Append</value></allowableValues><allowableValues><description>Interpret the Search Value as a Regular Expression and replace all matches with the Replacement Value. The Replacement Value may reference Capturing Groups used in the Search Value by using a dollar-sign followed by the Capturing Group number, such as $1 or $2. If the Search Value is set to .* then everything is replaced without even evaluating the Regular Expression.</description><displayName>Regex Replace</displayName><value>Regex Replace</value></allowableValues><allowableValues><description>Search for all instances of the Search Value and replace the matches with the Replacement Value.</description><displayName>Literal Replace</displayName><value>Literal Replace</value></allowableValues><allowableValues><description>Always replaces the entire line or the entire contents of the FlowFile (depending on the value of the &lt;Evaluation Mode&gt; property) and does not bother searching for any value. 
When this strategy is chosen, the &lt;Search Value&gt; property is ignored.</description><displayName>Always Replace</displayName><value>Always Replace</value></allowableValues><defaultValue>Regex Replace</defaultValue><description>The strategy for how and what to replace within the FlowFile's text content.</description><displayName>Replacement Strategy</displayName><dynamic>false</dynamic><name>Replacement Strategy</name><required>true</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Evaluation Mode</key><value><allowableValues><displayName>Line-by-Line</displayName><value>Line-by-Line</value></allowableValues><allowableValues><displayName>Entire text</displayName><value>Entire text</value></allowableValues><defaultValue>Entire text</defaultValue><description>Run the 'Replacement Strategy' against each line separately (Line-by-Line) or buffer the entire file into memory (Entire Text) and run against that.</description><displayName>Evaluation Mode</displayName><dynamic>false</dynamic><name>Evaluation Mode</name><required>true</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry></descriptors><lossTolerant>false</lossTolerant><penaltyDuration>30 sec</penaltyDuration><properties><entry><key>Regular Expression</key><value>(?s:^.*$)</value></entry><entry><key>Replacement Value</key><value>#</value></entry><entry><key>Character Set</key><value>UTF-8</value></entry><entry><key>Maximum Buffer Size</key><value>1 MB</value></entry><entry><key>Replacement Strategy</key><value>Prepend</value></entry><entry><key>Evaluation Mode</key><value>Entire text</value></entry></properties><runDurationMillis>0</runDurationMillis><schedulingPeriod>0 sec</schedulingPeriod><schedulingStrategy>TIMER_DRIVEN</schedulingStrategy><yieldDuration>1 sec</yieldDuration></config><name>Ensure Header Line</name><relationships><autoTerminate>false</autoTerminate><description>FlowFiles that could not be updated are routed to 
this relationship</description><name>failure</name></relationships><relationships><autoTerminate>false</autoTerminate><description>FlowFiles that have been successfully processed are routed to this relationship. This includes both FlowFiles that had text replaced and those that did not.</description><name>success</name></relationships><state>STOPPED</state><style/><supportsEventDriven>true</supportsEventDriven><supportsParallelProcessing>true</supportsParallelProcessing><type>org.apache.nifi.processors.standard.ReplaceText</type></processors><processors><id>8f1c2905-425f-4971-bab4-29f95a4b3dba</id><parentGroupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</parentGroupId><position><x>527.1134796142578</x><y>779.0995788574219</y></position><config><bulletinLevel>WARN</bulletinLevel><comments></comments><concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount><defaultConcurrentTasks><entry><key>TIMER_DRIVEN</key><value>1</value></entry><entry><key>EVENT_DRIVEN</key><value>0</value></entry><entry><key>CRON_DRIVEN</key><value>1</value></entry></defaultConcurrentTasks><defaultSchedulingPeriod><entry><key>TIMER_DRIVEN</key><value>0 sec</value></entry><entry><key>CRON_DRIVEN</key><value>* * * * * ?</value></entry></defaultSchedulingPeriod><descriptors><entry><key>HTTP Method</key><value><defaultValue>GET</defaultValue><description>HTTP request method (GET, POST, PUT, DELETE, HEAD, OPTIONS). Arbitrary methods are also supported. 
Methods other than POST and PUT will be sent without a message body.</description><displayName>HTTP Method</displayName><dynamic>false</dynamic><name>HTTP Method</name><required>true</required><sensitive>false</sensitive><supportsEl>true</supportsEl></value></entry><entry><key>Remote URL</key><value><description>Remote URL which will be connected to, including scheme, host, port, path.</description><displayName>Remote URL</displayName><dynamic>false</dynamic><name>Remote URL</name><required>true</required><sensitive>false</sensitive><supportsEl>true</supportsEl></value></entry><entry><key>SSL Context Service</key><value><description>The SSL Context Service used to provide client certificate information for TLS/SSL (https) connections.</description><displayName>SSL Context Service</displayName><dynamic>false</dynamic><identifiesControllerService>org.apache.nifi.ssl.SSLContextService</identifiesControllerService><name>SSL Context Service</name><required>false</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Connection Timeout</key><value><defaultValue>5 secs</defaultValue><description>Max wait time for connection to remote service.</description><displayName>Connection Timeout</displayName><dynamic>false</dynamic><name>Connection Timeout</name><required>true</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Read Timeout</key><value><defaultValue>15 secs</defaultValue><description>Max wait time for response from remote service.</description><displayName>Read Timeout</displayName><dynamic>false</dynamic><name>Read Timeout</name><required>true</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Include Date 
Header</key><value><allowableValues><displayName>True</displayName><value>True</value></allowableValues><allowableValues><displayName>False</displayName><value>False</value></allowableValues><defaultValue>True</defaultValue><description>Include an RFC-2616 Date header in the request.</description><displayName>Include Date Header</displayName><dynamic>false</dynamic><name>Include Date Header</name><required>true</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Follow Redirects</key><value><allowableValues><displayName>True</displayName><value>True</value></allowableValues><allowableValues><displayName>False</displayName><value>False</value></allowableValues><defaultValue>True</defaultValue><description>Follow HTTP redirects issued by remote server.</description><displayName>Follow Redirects</displayName><dynamic>false</dynamic><name>Follow Redirects</name><required>true</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Attributes to Send</key><value><description>Regular expression that defines which attributes to send as HTTP headers in the request. If not defined, no attributes are sent as headers. Also any dynamic properties set will be sent as headers. The dynamic property key will be the header key and the dynamic property value will be interpreted as expression language will be the header value.</description><displayName>Attributes to Send</displayName><dynamic>false</dynamic><name>Attributes to Send</name><required>false</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Basic Authentication Username</key><value><description>The username to be used by the client to authenticate against the Remote URL. 
Cannot include control characters (0-31), ':', or DEL (127).</description><displayName>Basic Authentication Username</displayName><dynamic>false</dynamic><name>Basic Authentication Username</name><required>false</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Basic Authentication Password</key><value><description>The password to be used by the client to authenticate against the Remote URL.</description><displayName>Basic Authentication Password</displayName><dynamic>false</dynamic><name>Basic Authentication Password</name><required>false</required><sensitive>true</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Proxy Host</key><value><description>The fully qualified hostname or IP address of the proxy server</description><displayName>Proxy Host</displayName><dynamic>false</dynamic><name>Proxy Host</name><required>false</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Proxy Port</key><value><description>The port of the proxy server</description><displayName>Proxy Port</displayName><dynamic>false</dynamic><name>Proxy Port</name><required>false</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Put Response Body In Attribute</key><value><description>If set, the response body received back will be put into an attribute of the original FlowFile instead of a separate FlowFile. The attribute key to put to is determined by evaluating value of this property. 
</description><displayName>Put Response Body In Attribute</displayName><dynamic>false</dynamic><name>Put Response Body In Attribute</name><required>false</required><sensitive>false</sensitive><supportsEl>true</supportsEl></value></entry><entry><key>Max Length To Put In Attribute</key><value><defaultValue>256</defaultValue><description>If routing the response body to an attribute of the original (by setting the &quot;Put response body in attribute&quot; property or by receiving an error status code), the number of characters put to the attribute value will be at most this amount. This is important because attributes are held in memory and large attributes will quickly cause out of memory issues. If the output goes longer than this value, it will be truncated to fit. Consider making this smaller if able.</description><displayName>Max Length To Put In Attribute</displayName><dynamic>false</dynamic><name>Max Length To Put In Attribute</name><required>false</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Digest Authentication</key><value><allowableValues><displayName>true</displayName><value>true</value></allowableValues><allowableValues><displayName>false</displayName><value>false</value></allowableValues><defaultValue>false</defaultValue><description>Whether to communicate with the website using Digest Authentication. 
'Basic Authentication Username' and 'Basic Authentication Password' are used for authentication.</description><displayName>Use Digest Authentication</displayName><dynamic>false</dynamic><name>Digest Authentication</name><required>false</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Always Output Response</key><value><allowableValues><displayName>true</displayName><value>true</value></allowableValues><allowableValues><displayName>false</displayName><value>false</value></allowableValues><defaultValue>false</defaultValue><description>Will force a response FlowFile to be generated and routed to the 'Response' relationship regardless of what the server status code received is or if the processor is configured to put the server response body in the request attribute. In the later configuration a request FlowFile with the response body in the attribute and a typical response FlowFile will be emitted to their respective relationships.</description><displayName>Always Output Response</displayName><dynamic>false</dynamic><name>Always Output Response</name><required>false</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Trusted Hostname</key><value><description>Bypass the normal truststore hostname verifier to allow the specified remote hostname as trusted. Enabling this property has MITM security implications, use wisely. Will still accept other connections based on the normal truststore hostname verifier. 
Only valid with SSL (HTTPS) connections.</description><displayName>Trusted Hostname</displayName><dynamic>false</dynamic><name>Trusted Hostname</name><required>false</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Add Response Headers to Request</key><value><allowableValues><displayName>true</displayName><value>true</value></allowableValues><allowableValues><displayName>false</displayName><value>false</value></allowableValues><defaultValue>false</defaultValue><description>Enabling this property saves all the response headers to the original request. This may be when the response headers are needed but a response is not generated due to the status code received.</description><displayName>Add Response Headers to Request</displayName><dynamic>false</dynamic><name>Add Response Headers to Request</name><required>false</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Content-Type</key><value><defaultValue>${mime.type}</defaultValue><description>The Content-Type to specify for when content is being transmitted through a PUT or POST. In the case of an empty value after evaluating an expression language expression, Content-Type defaults to application/octet-stream</description><displayName>Content-Type</displayName><dynamic>false</dynamic><name>Content-Type</name><required>true</required><sensitive>false</sensitive><supportsEl>true</supportsEl></value></entry><entry><key>Use Chunked Encoding</key><value><allowableValues><displayName>true</displayName><value>true</value></allowableValues><allowableValues><displayName>false</displayName><value>false</value></allowableValues><defaultValue>false</defaultValue><description>When POST'ing or PUT'ing content set this property to true in order to not pass the 'Content-length' header and instead send 'Transfer-Encoding' with a value of 'chunked'. 
This will enable the data transfer mechanism which was introduced in HTTP 1.1 to pass data of unknown lengths in chunks.</description><displayName>Use Chunked Encoding</displayName><dynamic>false</dynamic><name>Use Chunked Encoding</name><required>true</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Penalize on &quot;No Retry&quot;</key><value><allowableValues><displayName>true</displayName><value>true</value></allowableValues><allowableValues><displayName>false</displayName><value>false</value></allowableValues><defaultValue>false</defaultValue><description>Enabling this property will penalize FlowFiles that are routed to the &quot;No Retry&quot; relationship.</description><displayName>Penalize on &quot;No Retry&quot;</displayName><dynamic>false</dynamic><name>Penalize on &quot;No Retry&quot;</name><required>false</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry></descriptors><lossTolerant>false</lossTolerant><penaltyDuration>30 sec</penaltyDuration><properties><entry><key>HTTP Method</key><value>GET</value></entry><entry><key>Remote URL</key><value>http://api.randomuser.me/0.6/?format=csv&amp;nat=us&amp;results=100</value></entry><entry><key>SSL Context Service</key></entry><entry><key>Connection Timeout</key><value>5 secs</value></entry><entry><key>Read Timeout</key><value>15 secs</value></entry><entry><key>Include Date Header</key><value>True</value></entry><entry><key>Follow Redirects</key><value>True</value></entry><entry><key>Attributes to Send</key></entry><entry><key>Basic Authentication Username</key></entry><entry><key>Basic Authentication Password</key></entry><entry><key>Proxy Host</key></entry><entry><key>Proxy Port</key></entry><entry><key>Put Response Body In Attribute</key></entry><entry><key>Max Length To Put In Attribute</key><value>256</value></entry><entry><key>Digest Authentication</key><value>false</value></entry><entry><key>Always Output 
Response</key><value>false</value></entry><entry><key>Trusted Hostname</key></entry><entry><key>Add Response Headers to Request</key><value>false</value></entry><entry><key>Content-Type</key><value>${mime.type}</value></entry><entry><key>Use Chunked Encoding</key><value>false</value></entry><entry><key>Penalize on &quot;No Retry&quot;</key></entry></properties><runDurationMillis>0</runDurationMillis><schedulingPeriod>2 sec</schedulingPeriod><schedulingStrategy>TIMER_DRIVEN</schedulingStrategy><yieldDuration>1 sec</yieldDuration></config><name>Fetch CSV Data</name><relationships><autoTerminate>true</autoTerminate><description>The original FlowFile will be routed on any type of connection failure, timeout or general exception. It will have new attributes detailing the request.</description><name>Failure</name></relationships><relationships><autoTerminate>true</autoTerminate><description>The original FlowFile will be routed on any status code that should NOT be retried (1xx, 3xx, 4xx status codes). It will have new attributes detailing the request.</description><name>No Retry</name></relationships><relationships><autoTerminate>true</autoTerminate><description>The original FlowFile will be routed upon success (2xx status codes). It will have new attributes detailing the success of the request.</description><name>Original</name></relationships><relationships><autoTerminate>false</autoTerminate><description>A Response FlowFile will be routed upon success (2xx status codes). If the 'Output Response Regardless' property is true then the response will be sent to this relationship regardless of the status code received.</description><name>Response</name></relationships><relationships><autoTerminate>true</autoTerminate><description>The original FlowFile will be routed on any status code that can be retried (5xx status codes). 
It will have new attributes detailing the request.</description><name>Retry</name></relationships><state>STOPPED</state><style/><supportsEventDriven>false</supportsEventDriven><supportsParallelProcessing>true</supportsParallelProcessing><type>org.apache.nifi.processors.standard.InvokeHTTP</type></processors><processors><id>c171bb7a-fb18-4ce1-b60e-3d1fd1fb4c24</id><parentGroupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</parentGroupId><position><x>535.3472059045525</x><y>1150.985021525768</y></position><config><bulletinLevel>WARN</bulletinLevel><comments></comments><concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount><defaultConcurrentTasks><entry><key>TIMER_DRIVEN</key><value>1</value></entry><entry><key>EVENT_DRIVEN</key><value>0</value></entry><entry><key>CRON_DRIVEN</key><value>1</value></entry></defaultConcurrentTasks><defaultSchedulingPeriod><entry><key>TIMER_DRIVEN</key><value>0 sec</value></entry><entry><key>CRON_DRIVEN</key><value>* * * * * ?</value></entry></defaultSchedulingPeriod><descriptors><entry><key>Delete Attributes Expression</key><value><description>Regular expression for attributes to be deleted from flowfiles.</description><displayName>Delete Attributes Expression</displayName><dynamic>false</dynamic><name>Delete Attributes Expression</name><required>false</required><sensitive>false</sensitive><supportsEl>true</supportsEl></value></entry><entry><key>filename</key><value><description></description><displayName>filename</displayName><dynamic>true</dynamic><name>filename</name><required>false</required><sensitive>false</sensitive><supportsEl>true</supportsEl></value></entry></descriptors><lossTolerant>false</lossTolerant><penaltyDuration>30 sec</penaltyDuration><properties><entry><key>Delete Attributes Expression</key></entry><entry><key>filename</key><value>${uuid}.csv</value></entry></properties><runDurationMillis>0</runDurationMillis><schedulingPeriod>0 
sec</schedulingPeriod><schedulingStrategy>TIMER_DRIVEN</schedulingStrategy><yieldDuration>1 sec</yieldDuration></config><name>Assign Unique Filename</name><relationships><autoTerminate>false</autoTerminate><description>All FlowFiles are routed to this relationship</description><name>success</name></relationships><state>STOPPED</state><style/><supportsEventDriven>true</supportsEventDriven><supportsParallelProcessing>true</supportsParallelProcessing><type>org.apache.nifi.processors.attributes.UpdateAttribute</type></processors><processors><id>a2fc65ea-dea4-4e31-aaae-02adf8cdbce0</id><parentGroupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</parentGroupId><position><x>989.5000637145927</x><y>1151.131399091444</y></position><config><bulletinLevel>WARN</bulletinLevel><comments></comments><concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount><defaultConcurrentTasks><entry><key>TIMER_DRIVEN</key><value>1</value></entry><entry><key>EVENT_DRIVEN</key><value>0</value></entry><entry><key>CRON_DRIVEN</key><value>1</value></entry></defaultConcurrentTasks><defaultSchedulingPeriod><entry><key>TIMER_DRIVEN</key><value>0 sec</value></entry><entry><key>CRON_DRIVEN</key><value>* * * * * ?</value></entry></defaultSchedulingPeriod><descriptors><entry><key>Directory</key><value><description>The directory to which files should be written. 
You may use expression language such as /aa/bb/${path}</description><displayName>Directory</displayName><dynamic>false</dynamic><name>Directory</name><required>true</required><sensitive>false</sensitive><supportsEl>true</supportsEl></value></entry><entry><key>Conflict Resolution Strategy</key><value><allowableValues><displayName>replace</displayName><value>replace</value></allowableValues><allowableValues><displayName>ignore</displayName><value>ignore</value></allowableValues><allowableValues><displayName>fail</displayName><value>fail</value></allowableValues><defaultValue>fail</defaultValue><description>Indicates what should happen when a file with the same name already exists in the output directory</description><displayName>Conflict Resolution Strategy</displayName><dynamic>false</dynamic><name>Conflict Resolution Strategy</name><required>true</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Create Missing Directories</key><value><allowableValues><displayName>true</displayName><value>true</value></allowableValues><allowableValues><displayName>false</displayName><value>false</value></allowableValues><defaultValue>true</defaultValue><description>If true, then missing destination directories will be created. 
If false, flowfiles are penalized and sent to failure.</description><displayName>Create Missing Directories</displayName><dynamic>false</dynamic><name>Create Missing Directories</name><required>true</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Maximum File Count</key><value><description>Specifies the maximum number of files that can exist in the output directory</description><displayName>Maximum File Count</displayName><dynamic>false</dynamic><name>Maximum File Count</name><required>false</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Last Modified Time</key><value><description>Sets the lastModifiedTime on the output file to the value of this attribute. Format must be yyyy-MM-dd'T'HH:mm:ssZ. You may also use expression language such as ${file.lastModifiedTime}.</description><displayName>Last Modified Time</displayName><dynamic>false</dynamic><name>Last Modified Time</name><required>false</required><sensitive>false</sensitive><supportsEl>true</supportsEl></value></entry><entry><key>Permissions</key><value><description>Sets the permissions on the output file to the value of this attribute. Format must be either UNIX rwxrwxrwx with a - in place of denied permissions (e.g. rw-r--r--) or an octal number (e.g. 644). You may also use expression language such as ${file.permissions}.</description><displayName>Permissions</displayName><dynamic>false</dynamic><name>Permissions</name><required>false</required><sensitive>false</sensitive><supportsEl>true</supportsEl></value></entry><entry><key>Owner</key><value><description>Sets the owner on the output file to the value of this attribute. 
You may also use expression language such as ${file.owner}.</description><displayName>Owner</displayName><dynamic>false</dynamic><name>Owner</name><required>false</required><sensitive>false</sensitive><supportsEl>true</supportsEl></value></entry><entry><key>Group</key><value><description>Sets the group on the output file to the value of this attribute. You may also use expression language such as ${file.group}.</description><displayName>Group</displayName><dynamic>false</dynamic><name>Group</name><required>false</required><sensitive>false</sensitive><supportsEl>true</supportsEl></value></entry></descriptors><lossTolerant>false</lossTolerant><penaltyDuration>30 sec</penaltyDuration><properties><entry><key>Directory</key><value>./data/out</value></entry><entry><key>Conflict Resolution Strategy</key><value>replace</value></entry><entry><key>Create Missing Directories</key><value>true</value></entry><entry><key>Maximum File Count</key></entry><entry><key>Last Modified Time</key></entry><entry><key>Permissions</key></entry><entry><key>Owner</key></entry><entry><key>Group</key></entry></properties><runDurationMillis>0</runDurationMillis><schedulingPeriod>0 sec</schedulingPeriod><schedulingStrategy>TIMER_DRIVEN</schedulingStrategy><yieldDuration>1 sec</yieldDuration></config><name>PutFile</name><relationships><autoTerminate>true</autoTerminate><description>Files that could not be written to the output directory for some reason are transferred to this relationship</description><name>failure</name></relationships><relationships><autoTerminate>true</autoTerminate><description>Files that have been successfully written to the output directory are transferred to this 
relationship</description><name>success</name></relationships><state>STOPPED</state><style/><supportsEventDriven>false</supportsEventDriven><supportsParallelProcessing>true</supportsParallelProcessing><type>org.apache.nifi.processors.standard.PutFile</type></processors><processors><id>76d2915c-5ae5-484a-baaf-7640b36ba9e5</id><parentGroupId>af7523d7-add4-43b6-ba5d-efe10bfd727f</parentGroupId><position><x>994.7486178659256</x><y>779.4191923373919</y></position><config><bulletinLevel>WARN</bulletinLevel><comments></comments><concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount><defaultConcurrentTasks><entry><key>TIMER_DRIVEN</key><value>1</value></entry><entry><key>EVENT_DRIVEN</key><value>0</value></entry><entry><key>CRON_DRIVEN</key><value>1</value></entry></defaultConcurrentTasks><defaultSchedulingPeriod><entry><key>TIMER_DRIVEN</key><value>0 sec</value></entry><entry><key>CRON_DRIVEN</key><value>* * * * * ?</value></entry></defaultSchedulingPeriod><descriptors><entry><key>Script Engine</key><value><allowableValues><displayName>ECMAScript</displayName><value>ECMAScript</value></allowableValues><allowableValues><displayName>Groovy</displayName><value>Groovy</value></allowableValues><allowableValues><displayName>lua</displayName><value>lua</value></allowableValues><allowableValues><displayName>python</displayName><value>python</value></allowableValues><allowableValues><displayName>ruby</displayName><value>ruby</value></allowableValues><defaultValue>ECMAScript</defaultValue><description>The engine to execute scripts</description><displayName>Script Engine</displayName><dynamic>false</dynamic><name>Script Engine</name><required>true</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Script File</key><value><description>Path to script file to execute. 
Only one of Script File or Script Body may be used</description><displayName>Script File</displayName><dynamic>false</dynamic><name>Script File</name><required>false</required><sensitive>false</sensitive><supportsEl>true</supportsEl></value></entry><entry><key>Script Body</key><value><description>Body of script to execute. Only one of Script File or Script Body may be used</description><displayName>Script Body</displayName><dynamic>false</dynamic><name>Script Body</name><required>false</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry><entry><key>Module Directory</key><value><description>Comma-separated list of paths to files and/or directories which contain modules required by the script.</description><displayName>Module Directory</displayName><dynamic>false</dynamic><name>Module Directory</name><required>false</required><sensitive>false</sensitive><supportsEl>false</supportsEl></value></entry></descriptors><lossTolerant>false</lossTolerant><penaltyDuration>30 sec</penaltyDuration><properties><entry><key>Script Engine</key><value>Groovy</value></entry><entry><key>Script File</key></entry><entry><key>Script Body</key><value>
// Reads CSV from the incoming flow file and writes one capitalized full name per line
class GroovyProcessor implements Processor {
def REL_SUCCESS = new Relationship.Builder().name(&quot;success&quot;).description(&quot;FlowFiles that were successfully processed&quot;).build();
def ProcessorLog log
@Override
void initialize(ProcessorInitializationContext context) {
log = context.getLogger()
}
@Override
Set&lt;Relationship&gt; getRelationships() {
return [REL_SUCCESS] as Set
}
@Override
void onTrigger(ProcessContext context, ProcessSessionFactory sessionFactory) throws ProcessException {
try {
def session = sessionFactory.createSession()
def flowFile = session.get()
if (!flowFile) return
def selectedColumns = ''
flowFile = session.write(flowFile,
{ inputStream, outputStream -&gt;
String line
final BufferedReader inReader = new BufferedReader(new InputStreamReader(inputStream, 'UTF-8'))
line = inReader.readLine()
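String[] header = line?.split(',')
// Record the header names of the two columns written below (indices 2 and 3); stays empty if the header is missing or too short
selectedColumns = header?.length &gt; 3 ? &quot;${header[2]},${header[3]}&quot; : ''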
// Each remaining row: columns 2 and 3 hold the first and last name
while (line = inReader.readLine()) {
String[] cols = line.split(',')
outputStream.write(&quot;${cols[2].capitalize()} ${cols[3].capitalize()}\n&quot;.getBytes('UTF-8'))
}
} as StreamCallback)
flowFile = session.putAttribute(flowFile, &quot;selected.columns&quot;, selectedColumns)
flowFile = session.putAttribute(flowFile, &quot;filename&quot;, &quot;split_cols_invoke.txt&quot;)
// Route the rewritten flow file to success and commit the session
session.transfer(flowFile, REL_SUCCESS)
session.commit()
}
catch (e) {
throw new ProcessException(e)
}
}
@Override
Collection&lt;ValidationResult&gt; validate(ValidationContext context) { return null }
@Override
PropertyDescriptor getPropertyDescriptor(String name) { return null }
@Override
void onPropertyModified(PropertyDescriptor descriptor, String oldValue, String newValue) { }
@Override
List&lt;PropertyDescriptor&gt; getPropertyDescriptors() { return null }
@Override
String getIdentifier() { return null }
}
processor = new GroovyProcessor()</value></entry><entry><key>Module Directory</key></entry></properties><runDurationMillis>0</runDurationMillis><schedulingPeriod>0 sec</schedulingPeriod><schedulingStrategy>TIMER_DRIVEN</schedulingStrategy><yieldDuration>1 sec</yieldDuration></config><name>InvokeScriptedProcessor</name><relationships><autoTerminate>false</autoTerminate><description>FlowFiles that were successfully processed</description><name>success</name></relationships><state>STOPPED</state><style/><supportsEventDriven>false</supportsEventDriven><supportsParallelProcessing>true</supportsParallelProcessing><type>org.apache.nifi.processors.script.InvokeScriptedProcessor</type></processors></snippet><timestamp>02/10/2016 21:10:29 EST</timestamp></template>