Koji Kawamura ijokarumawak

@ijokarumawak
ijokarumawak / ExtractEmailHeaders.java
Created July 1, 2016 09:17
NIFI-1899: ExtractEmailHeaders in PR-483
@Override
public void onTrigger(final ProcessContext context, final ProcessSession session) {
final ComponentLog logger = getLogger();
final List<FlowFile> invalidFlowFilesList = new ArrayList<>();
final List<FlowFile> processedFlowFilesList = new ArrayList<>();
final FlowFile originalFlowFile = session.get();
if (originalFlowFile == null) {
return;
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<template>
<description>This template generates messages and puts them to a Kafka topic. Another processor then gets the messages from
Kafka and puts them on HDFS.
</description>
<name>Kerberized Kafka and HDFS</name>
<snippet>
<connections>
<id>2b93ffcd-0698-44a9-86f6-ce0ea6fc4145</id>
<parentGroupId>3bdd324d-db87-4a21-8149-f88d7a46741e</parentGroupId>
@ijokarumawak
ijokarumawak / ncm-nifi.properties
Last active July 19, 2016 06:38
NiFi 0.x cluster configuration
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
@ijokarumawak
ijokarumawak / jolt-shift-spec.json
Last active July 21, 2016 00:44
NIFI-2310 test data
{
"\\@context": {
"name": "&1.Name",
"ingredient": "&1.Inputs",
"yield": "\\@context.Makes",
"*": "&1.&"
},
"name": "Name",
"ingredient": "Inputs",
"yield": "Makes",
@ijokarumawak
ijokarumawak / 00.README.md
Last active February 15, 2022 11:20
NiFi 1.0.0 Site-to-Site performance test

Key findings

  • Measuring the performance of a streaming application is difficult. GenerateFlowFile can be useful, but it is important to understand NiFi backpressure and scheduling.
  • Push provides better load distribution than Pull.
  • Pull can provide the same level of throughput as Push, but with higher latency. Increasing the backpressure threshold is encouraged.
  • Fewer larger flow-files provide better throughput than many smaller flow-files.
  • HTTP provides the same throughput as RAW Site-to-Site, but uses slightly more CPU resources.
  • Be careful with the Provenance repository max.storage.time setting: if it is too long for your use case, the CPU will be occupied rolling over the provenance storage and other tasks can't be executed. Once the provenance storage has too many journal files, it applies backpressure and holds a lock until it clears old events.

Environment

@ijokarumawak
ijokarumawak / 0.README.md
Last active February 15, 2022 11:19
Simple test cases and sample output illustrating an issue that occurs with HttpAsyncRequestProducer over HTTPS when the library is used incorrectly.

To use HttpAsyncRequestProducer correctly, it's important to know how it works. This Gist has two examples: one uses it correctly, and the other does not.

@ijokarumawak
ijokarumawak / 0.NiFi-Loop-flow-example.md
Last active March 20, 2024 13:36
NiFi Loop Flow Example

NiFi Loop flow example

This template is the NiFi data flow analogue of the traditional for(i = 0; i < x; i++) loop.
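
As a rough illustration of the analogy, the loop can be pictured as a counter carried with each FlowFile as an attribute: it is initialized on entry, compared against a limit on every pass, and incremented before looping back. The sketch below simulates that idea in plain Java. The attribute name `loop.counter`, the class `LoopSketch`, and the method `runLoop` are hypothetical and are not taken from the template itself.

```java
import java.util.HashMap;
import java.util.Map;

public class LoopSketch {
    // Hypothetical attribute name; the actual template may use a different one.
    static final String COUNTER = "loop.counter";

    // Simulates for (i = 0; i < x; i++) with a string attribute map,
    // similar to how a FlowFile carries attributes. Returns the number of passes.
    static int runLoop(Map<String, String> attributes, int x) {
        attributes.put(COUNTER, "0");                              // i = 0
        int passes = 0;
        while (Integer.parseInt(attributes.get(COUNTER)) < x) {    // i < x
            passes++;                                              // loop body
            int next = Integer.parseInt(attributes.get(COUNTER)) + 1;
            attributes.put(COUNTER, String.valueOf(next));         // i++
        }
        return passes;
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new HashMap<>();
        int passes = runLoop(attrs, 3);
        System.out.println(passes + " passes, counter=" + attrs.get(COUNTER));
    }
}
```

Keeping the counter as a string mirrors the fact that FlowFile attributes are string key/value pairs, so the comparison has to parse the value on each pass.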

@ijokarumawak
ijokarumawak / 0.NiFi-S2S-BTW-1.0-0.7.md
Last active February 15, 2022 11:19
NiFi Site-to-Site between 1.0 and 0.7.

NiFi Example: Site-to-Site between 1.0 and 0.7

It's possible to connect NiFi 1.0 and 0.7 using the Site-to-Site protocol (RemoteProcessGroup and Input/Output ports).

The HTTP transport protocol was added in NiFi 1.0, so when you connect an RPG to NiFi 0.7, only the RAW transport protocol works.
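
The compatibility rule above can be expressed as a set intersection: each side supports some transport protocols, and only those common to both are usable. The following is a minimal sketch of that reasoning in plain Java, assuming only what is stated here (HTTP exists from 1.0 onward, RAW exists in both). `TransportSketch`, `supportedBy`, and `usableBetween` are hypothetical names and are not part of the NiFi API.

```java
import java.util.EnumSet;
import java.util.Set;

public class TransportSketch {
    enum Transport { RAW, HTTP }

    // Assumption: RAW is always available; HTTP is available from major version 1.
    static Set<Transport> supportedBy(String version) {
        int major = Integer.parseInt(version.split("\\.")[0]);
        return major >= 1
                ? EnumSet.of(Transport.RAW, Transport.HTTP)
                : EnumSet.of(Transport.RAW);
    }

    // Protocols usable between two instances are those both sides support.
    static Set<Transport> usableBetween(String versionA, String versionB) {
        Set<Transport> common = EnumSet.copyOf(supportedBy(versionA));
        common.retainAll(supportedBy(versionB));
        return common;
    }

    public static void main(String[] args) {
        System.out.println("1.0 <-> 0.7: " + usableBetween("1.0", "0.7"));
        System.out.println("1.0 <-> 1.0: " + usableBetween("1.0", "1.0"));
    }
}
```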

@ijokarumawak
ijokarumawak / 0.NiFi-Move-Files-Between-S3-Buckets.md
Last active January 9, 2024 15:17
Move Files between S3 buckets using NiFi

NiFi Flow Example: Move Files between S3 Buckets

Be careful to specify the correct bucket name, region, and credentials.

When I misconfigured the region, I got the following error:

2016-11-17 10:51:07,828 ERROR [Timer-Driven Process Thread-9] o.a.nifi.processors.aws.s3.PutS3Object
com.amazonaws.services.s3.model.AmazonS3Exception: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint. (Service: Amazon S3; Status Code: 301; ErrorCode: PermanentRedirect; Request ID: 99A62426D8544997)
@ijokarumawak
ijokarumawak / HTML_Processors_Test.xml
Created November 25, 2016 09:06
A NiFi template to test NIFI-3101.
<?xml version="1.0" ?>
<template encoding-version="1.0">
<description>A Process Group using Get/Modify/PutHTMLElement processors.</description>
<groupId>d3fed114-0156-1000-5b68-63e2f6052f7a</groupId>
<name>HTML Processors Test</name>
<snippet>
<processGroups>
<id>9a9dbd8e-0158-1000-0000-000000000000</id>
<parentGroupId>d3fed114-0156-1000-0000-000000000000</parentGroupId>
<position>