A user-defined function is a Javascript function that can be used in where
clauses and columns
clauses.
When specified in the where clause, a UDF can apply custom logic to filter out or re-shape a row from a data set.
Example:
udfs = require('myudfs');
u = select * from users where guid = 'user01234';
return select id, price, tax, shipping from items where udfs.applyDiscount("{u.id}");
where applyDiscount()
computes a new price. This logic may even involve network IO.
exports.applyDiscount = function(price, u) {
// Current row in the result
var row = this.row;
// Compute - can be async too.
row.price = ...;
// First arg is error, second result
// A null/undefined second arg triggers a removal of the row from the result set.
this.next(null, row);
}
UDFs in the where clause are invoked after invoking all non-UDF constraints in a where clause.
When specified in the columns clause, a UDF can apply custom logic to project a field in a row.
Example:
udfs = require('myudfs');
return select id, myudfs.sum(price, tax, shipping) from product where zip = 98074;
where sum()
is defined as
exports.sum = function(price, tax, shipping) {
// first arg is error, second result
return this.next(null, price + tax + shipping);
}
You can define UDFs using the node.js module pattern.
// file: myudfs.js
exports.func1 = function() { ... };
exports.func2 = function() { ... };
In order to use a UDF in a script, you must import its module first. Importing is done via a
built-in function called require
.
udfs = require('myudfs')
Each user defined function can declare the following types of arguments:
- JSON literals (such as strings, numbers, true, false, null, object)
- Column names (i.e., names of fields listed for a projection in the columns clause of select statements.
The UDF will receive its arguments in the order they were used in the script.
A UDF declared in the clause receives a this
object with the following fields:
- All the variables currently available in the script's execution context.
- All the params available in the script's execution context.
- A
rows
field whose value is the data set. In the case of joins, the data set is the joined set before applying this UDF. - An
index
field whose value is the current row number. - A
row
field which is the current row. - A
next
function to return updated row. The UDF must call this function to continue processing.
A UDF declared in the where clause receives a this
object with the following fields:
- All the variables currently available in the script's execution context.
- All the params available in the script's execution context.
- A
rows
field whose value is the data set. - An
index
field whose value is the current row number. - A
row
field which is the current row. - A
next
function to return a projection.
u = require("filter.js");
a1 = [{"name": "Brand-A", "keys" : [{ "name": "G1"},{"name": "G2"},{"name": "G3"}]},
{"name": "Brand-B", "keys" : [{ "name": "G1"},{"name": "G2"}]},
{"name": "Brand-C", "keys" : [{ "name": "G4"},{"name": "G2"}]}];
a2 = [{"name": "Brand-A", "details": [{"name": "G3","count": 32},
{"name": "G5","count": 18},
{"name": "G1","count": 40}]},
{"name": "Brand-C", "details": [{"name": "G3","count": 32},
{"name": "G5","count": 18}]}];
return select a2.name, a2.details from a1 as a1, a2 as a2 where a1.name = a2.name and
u.filterRow(a1.keys)
Here is an implementation of the UDF
exports.filterRow = function(keys) {
var found = false;
var details = [];
for(var i = 0; i < this.row[1].length; i++) {
for(var j = 0; j < keys.length; j++) {
if(keys[j].name === this.row[1][i].name) {
details.push(this.row[1][i]);
found = true;
break;
}
}
}
if(found) {
this.row[1] = details;
}
return this.next(null, found ? this.row : null);
};
I'm afraid I don't follow. The monkey patch documentation specifies that the log function is passed in as an element of the args:
The new udf definition appears to pass the arguments to the udf exclusively and conflicts with the examples on the monkey patch page.
This is what I'd expect but it doesn't seem to work:
Invoked as: