Skip to content

Instantly share code, notes, and snippets.

@s3u
Created April 8, 2012 02:39
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save s3u/2334012 to your computer and use it in GitHub Desktop.
Save s3u/2334012 to your computer and use it in GitHub Desktop.
ql.io - User Defined Functions

User Defined Functions

A user-defined function is a Javascript function that can be used in where clauses and columns clauses.

UDFs in the Where Clause

When specified in the where clause, a UDF can apply custom logic to filter out or re-shape a row from a data set.

Example:

udfs = require('myudfs');
u = select * from users where guid = 'user01234';
return select id, price, tax, shipping from items where udfs.applyDiscount("{u.id}");

where applyDiscount() computes a new price. This logic may even involve network IO.

exports.applyDiscount = function(price, u) {
    // Current row in the result
    var row = this.row;

    // Compute - can be async too.
    row.price = ...;

    // First arg is error, second result
    // A null/undefined second arg triggers a removal of the row from the result set.
    this.next(null, row);
}

UDFs in the where clause are invoked after invoking all non-UDF constraints in a where clause.

UDFs in the Column Clause

When specified in the columns clause, a UDF can apply custom logic to project a field in a row.

Example:

udfs = require('myudfs');
return select id, myudfs.sum(price, tax, shipping) from product where zip = 98074;

where sum() is defined as

exports.sum = function(price, tax, shipping) {
    // first arg is error, second result
    return this.next(null, price + tax + shipping);
}

Defining UDFs

You can define UDFs using the node.js module pattern.

// file: myudfs.js
exports.func1 = function() { ... };
exports.func2 = function() { ... };

Importing UDFs

In order to use a UDF in a script, you must import its module first. Importing is done via a built-in function called require.

udfs = require('myudfs')

Each user defined function can declare the following types of arguments:

  • JSON literals (such as strings, numbers, true, false, null, object)
  • Column names (i.e., names of fields listed for a projection in the columns clause of select statements.

The UDF will receive its arguments in the order they were used in the script.

UDF in Where Clause

A UDF declared in the clause receives a this object with the following fields:

  • All the variables currently available in the script's execution context.
  • All the params available in the script's execution context.
  • A rows field whose value is the data set. In the case of joins, the data set is the joined set before applying this UDF.
  • An index field whose value is the current row number.
  • A row field which is the current row.
  • A next function to return updated row. The UDF must call this function to continue processing.

UDF in Columns Clause

A UDF declared in the where clause receives a this object with the following fields:

  • All the variables currently available in the script's execution context.
  • All the params available in the script's execution context.
  • A rows field whose value is the data set.
  • An index field whose value is the current row number.
  • A row field which is the current row.
  • A next function to return a projection.

Example Script

u = require("filter.js");
a1 = [{"name": "Brand-A", "keys" : [{ "name": "G1"},{"name": "G2"},{"name": "G3"}]},
      {"name": "Brand-B", "keys" : [{ "name": "G1"},{"name": "G2"}]},
      {"name": "Brand-C", "keys" : [{ "name": "G4"},{"name": "G2"}]}];
a2 = [{"name": "Brand-A", "details": [{"name": "G3","count": 32},
                                      {"name": "G5","count": 18},
                                      {"name": "G1","count": 40}]},
      {"name": "Brand-C", "details": [{"name": "G3","count": 32}, 
                                      {"name": "G5","count": 18}]}];
return select a2.name, a2.details from a1 as a1, a2 as a2 where a1.name = a2.name and 
              u.filterRow(a1.keys)

Example UDF Impl

Here is an implementation of the UDF

exports.filterRow = function(keys) {
    var found = false;
    var details = [];
    for(var i = 0; i < this.row[1].length; i++) {
        for(var j = 0; j < keys.length; j++) {
            if(keys[j].name === this.row[1][i].name) {
                details.push(this.row[1][i]);
                found = true;
                break;
            }
        }
    }
    if(found) {
        this.row[1] = details;
    }
    return this.next(null, found ? this.row : null);
};
@rbdixon
Copy link

rbdixon commented Jun 18, 2012

Is there a way to access the log object as you can in a monkey patch?

@s3u
Copy link
Author

s3u commented Jun 18, 2012

@rbdixon
Copy link

rbdixon commented Jun 19, 2012

I'm afraid I don't follow. The monkey patch documentation specifies that the log function is passed in as an element of the args:

// Args: An object with uri, params, statement, response status, response headers, and body
exports['patch status'] = function(args) {
    var log = args.log; // Get the logger fn.

The new udf definition appears to pass the arguments to the udf exclusively and conflicts with the examples on the monkey patch page.

This is what I'd expect but it doesn't seem to work:

exports.filter = function() {
    this.log('warn','testmessage');
    return this.next(null, this.row);
}

Invoked as:

f = require('filter.js');
select url from github.repos where user = "rbdixon" and f.filter();

@s3u
Copy link
Author

s3u commented Jun 19, 2012

Sorry - my answer was incorrect - I spoke too soon. The this reference does not have log yet. I opened ql-io/ql.io#515.

The wrapper object in https://github.com/ql-io/ql.io/blob/master/modules/engine/lib/engine/udf.js like the way it is done in https://github.com/ql-io/ql.io/blob/master/modules/engine/lib/engine/source/verb.js.

cc @prabhakhar, @shimonchayim

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment