@mattyb149
mattyb149 / get_pentaho_tweets.groovy
Created September 22, 2014 17:47
"Get Pentaho Tweets" scripted datasource in Groovy for Pentaho Report Designer
@Grab(group="org.twitter4j", module="twitter4j-core", version="4.0.2")
import twitter4j.*
import twitter4j.api.*
import twitter4j.conf.*
import org.pentaho.reporting.engine.classic.core.util.TypedTableModel;
colNames = ['Screen Name', 'Tweet Date', 'Text'] as String[]
colTypes = [String, Date, String] as Class[]
model = new TypedTableModel(colNames, colTypes);
mattyb149 / zklist.groovy
Last active February 4, 2023 06:24
Recursively list Zookeeper nodes and data
@Grab('org.apache.zookeeper:zookeeper:3.4.6')
import org.apache.zookeeper.*
import org.apache.zookeeper.data.*
import org.apache.zookeeper.AsyncCallback.StatCallback
import static org.apache.zookeeper.ZooKeeper.States.*
final int TIMEOUT_MSEC = 5000
final int RETRY_MSEC = 100
mattyb149 / FlattenJSON.groovy
Created October 17, 2014 20:37
PDI Groovy script to flatten JSON into key/value pairs, with the JSON path being the key
@Grab(group='com.fasterxml.jackson.core', module='jackson-databind', version='2.3.3')
import com.fasterxml.jackson.databind.JsonNode
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.node.ArrayNode
import com.fasterxml.jackson.databind.node.ObjectNode
import com.fasterxml.jackson.databind.node.ValueNode
import org.pentaho.di.core.row.*
import org.pentaho.di.core.row.value.*
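The preview above is cut off before the flattening logic, so the following is only a minimal sketch of the idea in the description (JSON path as key, leaf value as value). It swaps Jackson for Groovy's built-in `JsonSlurper` so it runs without `@Grab`, and the helper name `flatten` and the `$.key[index]` path syntax are assumptions, not taken from the gist.

```groovy
import groovy.json.JsonSlurper

// Hypothetical helper (not from the gist): walks a parsed JSON value and
// collects path -> value pairs. The path syntax ($.key and [index]) is an
// assumption made for this sketch.
def flatten(node, String path = '$', Map out = [:]) {
    if (node instanceof Map) {
        // JSON object: descend into each member, extending the path with the key
        node.each { k, v -> flatten(v, "${path}.${k}", out) }
    } else if (node instanceof List) {
        // JSON array: descend into each element, extending the path with the index
        node.eachWithIndex { v, i -> flatten(v, "${path}[${i}]", out) }
    } else {
        // Leaf value (string, number, boolean, or null)
        out[path] = node
    }
    out
}

def json = '{"a": {"b": [1, 2]}, "c": "x"}'
def flat = flatten(new JsonSlurper().parseText(json))
flat.each { k, v -> println "$k = $v" }
```

For the sample input this yields three entries, with keys `$.a.b[0]`, `$.a.b[1]` and `$.c`. Empty objects and arrays produce no entries in this sketch; in a PDI step each pair would presumably be written out as one row.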
mattyb149 / TransformationFinish_printRuntimeInfo.groovy
Created October 29, 2014 21:31
Extension point script to dump runtime info
// Build up a usage map
stringList = _object_.transMeta.getStringList( true, true, false, true );
varMap = [:]
// Look around in the strings, see what we find...
stringList.each { result ->
def varList = []
parent = result.parentObject
org.pentaho.di.core.util.StringUtil.getUsedVariables( result.string, varList, false );
varList.each { var ->
mattyb149 / Spearman_sort_test.ktr
Created December 9, 2014 03:03
Determine sortedness of data in PDI with Spearman coefficient
<?xml version="1.0" encoding="UTF-8"?>
<transformation>
<info>
<name>Spearman_sort_test</name>
<description/>
<extended_description/>
<trans_version/>
<trans_type>Normal</trans_type>
<directory>&#x2f;</directory>
<parameters>
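The transformation XML is truncated here, so how the gist computes the coefficient is not visible. As a rough illustration of the idea in the description, the sketch below computes Spearman's rho between each value's row position and its rank in sorted order, using the textbook formula 1 - 6Σd²/(n(n²-1)). The function name `spearmanSortedness` is hypothetical, and the sketch assumes at least two values with no ties.

```groovy
// Hypothetical helper (not from the gist): Spearman's rho between the row
// position of each value and its rank in ascending order.
// Assumes values.size() >= 2 and that all values are distinct (no ties).
def spearmanSortedness(List values) {
    int n = values.size()
    // Rank of a value = its index in an ascending sorted copy (input is not mutated)
    def rankOf = [:]
    values.sort(false).eachWithIndex { v, i -> rankOf[v] = i }
    // Sum of squared differences between position and rank
    double sumSq = 0
    values.eachWithIndex { v, i ->
        double d = i - rankOf[v]
        sumSq += d * d
    }
    1 - (6 * sumSq) / (n * (n * n - 1))
}

println spearmanSortedness([1, 2, 3, 4])
println spearmanSortedness([4, 3, 2, 1])
println spearmanSortedness([2, 1, 4, 3])
```

A result near 1 suggests the data is already close to ascending order, a result near -1 suggests descending order, and values around 0 suggest no particular order.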
mattyb149 / split_fields_num_records.xml
Last active August 29, 2015 14:13
Split rows based on a field value in Pentaho Data Integration
<?xml version="1.0" encoding="UTF-8"?>
<transformation>
<info>
<name>split_field_num_records</name>
<description/>
<extended_description/>
<trans_version/>
<trans_type>Normal</trans_type>
<directory>&#x2f;</directory>
<parameters>
mattyb149 / rolling_distinct.xml
Created January 14, 2015 19:36
Calculate rolling distinct count based on date in PDI
<?xml version="1.0" encoding="UTF-8"?>
<transformation>
<info>
<name>rolling_distinct</name>
<description/>
<extended_description/>
<trans_version/>
<trans_type>Normal</trans_type>
<directory>&#x2f;</directory>
<parameters>
mattyb149 / pdi-pig.xml
Created February 4, 2015 22:46
PDI transformation for use as an Apache Pig UDF
<?xml version="1.0" encoding="UTF-8"?>
<transformation>
<info>
<name>pdi-pig</name>
<description/>
<extended_description/>
<trans_version/>
<trans_type>Normal</trans_type>
<directory>&#x2f;</directory>
<parameters>
mattyb149 / FlattenJSON_SuperScript.groovy
Created February 23, 2015 16:20
Groovy script for PDI SuperScript step to flatten JSON into key/value pairs, with the JSON path being the key
@Grab(group='com.fasterxml.jackson.core', module='jackson-databind', version='2.3.3')
import com.fasterxml.jackson.databind.JsonNode
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.node.ArrayNode
import com.fasterxml.jackson.databind.node.ObjectNode
import com.fasterxml.jackson.databind.node.ValueNode
import org.pentaho.di.core.row.*
import org.pentaho.di.core.row.value.*
mattyb149 / applescript_infield.xml
Created March 2, 2015 19:46
PDI SuperScript using AppleScript with incoming field values (applescript_infield.ktr)
<?xml version="1.0" encoding="UTF-8"?>
<transformation>
<info>
<name>applescript_infield</name>
<description/>
<extended_description/>
<trans_version/>
<trans_type>Normal</trans_type>
<directory>&#x2f;</directory>
<parameters>