@mattyb149
mattyb149 / mbeanserver.groovy
Created March 15, 2014 15:48
Display registered MBeans in CRaSH
import java.lang.management.ManagementFactory
import javax.management.*
MBeanServer mbeanServer = ManagementFactory.platformMBeanServer
def mbeans = mbeanServer.queryMBeans(null, ObjectName.WILDCARD)
mbeans.each {
    def name = it.objectName.getKeyProperty 'name'
    println name ?: it.toString()
}

The World Cup Graph

Initial Data Setup

@mattyb149
mattyb149 / jardiff.groovy
Created September 21, 2014 02:10
Find different versions of the same JAR in two paths
commonLibs = [:]
leftLibs = []
rightLibs = []
if(args?.length > 1) {
    new File(args[0]).eachFileRecurse {
        fileName = (it.name - '.jar')
        fileNameNoVersion = fileName[0..fileName.lastIndexOf('-')-1]
        leftLibs += fileNameNoVersion
        commonLibs.putAt(fileNameNoVersion, commonLibs[fileNameNoVersion] ? (commonLibs[fileNameNoVersion] + [1:fileName]) : [1:fileName])
    }
@mattyb149
mattyb149 / get_pentaho_tweets.groovy
Created September 22, 2014 17:47
"Get Pentaho Tweets" scripted datasource in Groovy for Pentaho Report Designer
@Grab(group="org.twitter4j", module="twitter4j-core", version="4.0.2")
import twitter4j.*
import twitter4j.api.*
import twitter4j.conf.*
import org.pentaho.reporting.engine.classic.core.util.TypedTableModel;
colNames = ['Screen Name', 'Tweet Date', 'Text'] as String[]
colTypes = [String, Date, String] as Class[]
model = new TypedTableModel(colNames, colTypes);
@mattyb149
mattyb149 / TransformationFinish_printRuntimeInfo.groovy
Created October 29, 2014 21:31
Extension point script to dump runtime info
// Build up a usage map
stringList = _object_.transMeta.getStringList( true, true, false, true );
varMap = [:]
// Look around in the strings, see what we find...
stringList.each { result ->
    def varList = []
    parent = result.parentObject
    org.pentaho.di.core.util.StringUtil.getUsedVariables( result.string, varList, false );
    varList.each { var ->
@mattyb149
mattyb149 / Spearman_sort_test.ktr
Created December 9, 2014 03:03
Determine sortedness of data in PDI with Spearman coefficient
<?xml version="1.0" encoding="UTF-8"?>
<transformation>
<info>
<name>Spearman_sort_test</name>
<description/>
<extended_description/>
<trans_version/>
<trans_type>Normal</trans_type>
<directory>&#x2f;</directory>
<parameters>
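The preview above stops before the transformation's steps, so how it actually computes the coefficient is not shown. As a rough sketch of the idea named in the description, the Groovy below measures sortedness as Spearman's rho between each value's current position and its rank in sorted order. The function name `spearmanSortedness` is hypothetical, and the no-ties shortcut formula is an assumption, so all values are assumed to be distinct.

```groovy
// Spearman's rho between each value's position and its rank in sorted order.
// Assumption: values are distinct, so the no-ties formula
//   rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))
// applies. This is an illustrative sketch, not the logic of the .ktr file.
double spearmanSortedness(List values) {
    int n = values.size()
    if (n < 2) return 1.0d                   // zero or one row is trivially sorted
    def sorted = values.sort(false)          // sorted copy; the input list is not mutated
    def rankOf = [:]
    sorted.eachWithIndex { v, i -> rankOf[v] = i }
    double sumSq = 0
    values.eachWithIndex { v, i ->
        double d = i - rankOf[v]             // distance between position and sorted rank
        sumSq += d * d
    }
    1.0d - (6.0d * sumSq) / (n * ((double) n * n - 1))
}

println spearmanSortedness([1, 2, 3, 4, 5])
println spearmanSortedness([5, 4, 3, 2, 1])
println spearmanSortedness([2, 1, 3, 5, 4])
```

Under these assumptions, a fully ascending list scores 1, a fully descending list scores -1, and partially ordered data falls in between.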
@mattyb149
mattyb149 / split_fields_num_records.xml
Last active August 29, 2015 14:13
Split rows based on a field value in Pentaho Data Integration
<?xml version="1.0" encoding="UTF-8"?>
<transformation>
<info>
<name>split_field_num_records</name>
<description/>
<extended_description/>
<trans_version/>
<trans_type>Normal</trans_type>
<directory>&#x2f;</directory>
<parameters>
@mattyb149
mattyb149 / rolling_distinct.xml
Created January 14, 2015 19:36
Calculate rolling distinct count based on date in PDI
<?xml version="1.0" encoding="UTF-8"?>
<transformation>
<info>
<name>rolling_distinct</name>
<description/>
<extended_description/>
<trans_version/>
<trans_type>Normal</trans_type>
<directory>&#x2f;</directory>
<parameters>
@mattyb149
mattyb149 / pdi-pig.xml
Created February 4, 2015 22:46
PDI transformation for use as an Apache Pig UDF
<?xml version="1.0" encoding="UTF-8"?>
<transformation>
<info>
<name>pdi-pig</name>
<description/>
<extended_description/>
<trans_version/>
<trans_type>Normal</trans_type>
<directory>&#x2f;</directory>
<parameters>
@mattyb149
mattyb149 / FlattenJSON_SuperScript.groovy
Created February 23, 2015 16:20
Groovy script for PDI SuperScript step to flatten JSON into key/value pairs, with the JSON path being the key
@Grab(group='com.fasterxml.jackson.core', module='jackson-databind', version='2.3.3')
import com.fasterxml.jackson.databind.JsonNode
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.node.ArrayNode
import com.fasterxml.jackson.databind.node.ObjectNode
import com.fasterxml.jackson.databind.node.ValueNode
import org.pentaho.di.core.row.*
import org.pentaho.di.core.row.value.*
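The preview ends at the imports, so the flattening logic itself is not visible here. The sketch below illustrates the technique named in the description (JSON flattened into key/value pairs, with the path as the key) using Groovy's built-in `JsonSlurper` in place of Jackson, and without any PDI row handling. The `flatten` helper, the sample input, and the `$.a.b[0]` path syntax are assumptions chosen for illustration, not taken from the gist.

```groovy
import groovy.json.JsonSlurper

// Recursively walk parsed JSON, collecting leaf values keyed by their path.
// Hypothetical helper: the path format ("$" root, ".key", "[index]") is an assumption.
def flatten(node, String path = '$', Map out = [:]) {
    if (node instanceof Map) {
        node.each { k, v -> flatten(v, "${path}.${k}", out) }
    } else if (node instanceof List) {
        node.eachWithIndex { v, i -> flatten(v, "${path}[${i}]", out) }
    } else {
        out[path] = node   // scalar leaf: string, number, boolean, or null
    }
    out
}

def json = new JsonSlurper().parseText(
    '{"user": {"name": "matt", "tags": ["pdi", "groovy"]}, "active": true}')
def flat = flatten(json)
flat.each { k, v -> println "$k = $v" }
```

In a PDI step, each entry of the resulting map would presumably become one output row with a key field and a value field, though that wiring depends on the step's API and is not shown in the source.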