@kenwebb
Last active February 13, 2019 17:09
tidyverse dplyr
<?xml version="1.0" encoding="UTF-8"?>
<!--Xholon Workbook http://www.primordion.com/Xholon/gwt/ MIT License, Copyright (C) Ken Webb, Wed Feb 13 2019 12:09:24 GMT-0500 (Eastern Standard Time)-->
<XholonWorkbook>
<Notes><![CDATA[
Xholon
------
Title: tidyverse dplyr
Description:
Url: http://www.primordion.com/Xholon/gwt/
InternalName: 5c0ca96bc2cf1aca750616925706f108
Keywords:
My Notes
--------
February 10, 2019
Figure out how to do some dplyr-like stuff in Xholon.
See also: my tidyverse/xhdplyr.js library
TODO:
- deal with tibble data types
- R can write to the binary RDS and feather formats that do include the data type
- I will do this manually by adding coltypes="dbl,dbl,dbl,dbl,fct" to Xholon Tibble
- possibly allow storing only a rudimentary Xholon-like object in intermediate arrays such as ccache
- ex: {coldata: [1,2,3,"four"]}
- OR let Tibble contain a single array of arrays
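A rough sketch of the "rudimentary object" idea in the TODO list above, using plain JavaScript objects instead of Xholon nodes. The sample rows and the plain-array versions of select, filter and mutate are my own assumptions for illustration; they are not the functions in this workbook's script, and they return new arrays rather than changing their input.

```javascript
// Plain row objects, each holding one coldata array (sample values only).
var rows = [
  {coldata: [5.1, 3.5, "setosa"]},
  {coldata: [7.0, 3.2, "versicolor"]},
  {coldata: [6.3, 3.3, "virginica"]}
];

// select: return the values of one column as a new array
var select = function(rows, colix) {
  return rows.map(function(row) { return row.coldata[colix]; });
};

// filter: return a new array holding only the rows that pass the predicate
var filter = function(rows, predicate) {
  return rows.filter(function(row) { return predicate(row.coldata); });
};

// mutate: return new rows, each with one extra computed column appended;
// concat() builds a new coldata array, so the input rows are not changed
var mutate = function(rows, func) {
  return rows.map(function(row) {
    return {coldata: row.coldata.concat([func(row.coldata)])};
  });
};

var wide = filter(rows, function(c) { return c[0] > 6.0; });
var withMean = mutate(rows, function(c) { return (c[0] + c[1]) / 2; });
console.log(select(wide, 2));
console.log(withMean[0].coldata.length, rows[0].coldata.length);
```

Because each function returns a new array, the calls can be chained without disturbing the original rows.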
List of tibble data types:
int
dbl
chr
fct
date
dttm
list
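A minimal sketch of how a coltypes string could drive type conversion when reading CSV text. The converters table and the parseRow function are hypothetical names, and the choice of conversion for each type is an assumption (fct is kept as a plain string, and list is left out). Splitting on "," does not handle quoted fields.

```javascript
// Hypothetical converters, keyed by tibble type abbreviation.
// This mapping is an assumption; it is not part of Xholon or dplyr.
var converters = {
  int: function(s) { return parseInt(s, 10); },
  dbl: function(s) { return Number(s); },
  chr: function(s) { return s; },
  fct: function(s) { return s; }, // factor levels are kept as plain strings here
  date: function(s) { return new Date(s); },
  dttm: function(s) { return new Date(s); }
};

// Convert one CSV line into a typed coldata array.
// Unknown types fall back to the chr converter.
var parseRow = function(line, coltypes) {
  return line.trim().split(",").map(function(field, i) {
    var convert = converters[coltypes[i]] || converters.chr;
    return convert(field);
  });
};

var coltypes = "dbl,dbl,dbl,dbl,fct".split(",");
var coldata = parseRow("5.1,3.5,1.4,0.2,setosa", coltypes);
console.log(coldata);
```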
XholonConsole code
------------------
<XingTibblebehavior implName="org.primordion.xholon.base.Behavior_gwtjs">&lt;![CDATA[
var me, beh = {
postConfigure: function() {
me = this.cnode.parent();
},
handleNodeSelection: function() {
return me.colnames + "\n" + me.coltypes + "\n" + me.first().coldata;
}
}
//# sourceURL=behaviorScript.js
]]&gt;</XingTibblebehavior>
References
----------
(1) https://dplyr.tidyverse.org/
(2) https://stackoverflow.com/questions/37460524/how-to-rearrange-an-array-by-indices-array
How to rearrange an array by indices array?
(3) https://sugarjs.com/
Sugar is a JavaScript utility library for working with native objects.
includes numerous Array functions
(4) https://github.com/mihaifm/linq
linq.js - LINQ for JavaScript
This is a JavaScript implementation of the .NET LINQ library.
It contains all the original .NET methods plus additional ones.
SQL-like
(5) https://stackoverflow.com/questions/14446511/most-efficient-method-to-groupby-on-a-array-of-objects
(6) https://stackoverflow.com/questions/122102/what-is-the-most-efficient-way-to-deep-clone-an-object-in-javascript
the following is in ES6, and is more complete than the Xholon clone methods such as node.cloneAfter()
var clone = Object.assign({}, obj);
this answer also includes a polyfill
BUT Object.assign() makes only a shallow copy, and does not clone the XholonJsApi functions
) https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/assign
(7) https://github.com/pvorb/clone
) https://github.com/pvorb/clone/blob/master/clone.js
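A small sketch related to references (5) and (6), using plain objects rather than Xholon nodes. The tbl object and the arrangeCopy and groupBy functions are hypothetical names of my own. It shows that Object.assign() shares nested arrays with the original, that sorting a slice() leaves the original order alone, and one way to group rows with reduce().

```javascript
// A stand-in for a cached tibble (sample values only).
var tbl = {
  colnames: ["Sepal.Width", "Species"],
  ccache: [
    {coldata: [3.5, "setosa"]},
    {coldata: [3.2, "versicolor"]},
    {coldata: [3.3, "virginica"]},
    {coldata: [3.0, "setosa"]}
  ]
};

// Object.assign() copies property values, so both objects hold the same ccache array.
var shallow = Object.assign({}, tbl);
console.log(shallow.ccache === tbl.ccache); // true

// Sort a copy of the row array; the original array keeps its order.
var arrangeCopy = function(rows, colix) {
  return rows.slice().sort(function(a, b) {
    if (a.coldata[colix] < b.coldata[colix]) { return -1; }
    if (a.coldata[colix] > b.coldata[colix]) { return 1; }
    return 0;
  });
};

// Group rows by the value found in one column.
var groupBy = function(rows, colix) {
  return rows.reduce(function(groups, row) {
    var key = row.coldata[colix];
    (groups[key] = groups[key] || []).push(row);
    return groups;
  }, {});
};

var sorted = arrangeCopy(tbl.ccache, 0);
var groups = groupBy(tbl.ccache, 1);
console.log(sorted[0].coldata[0], tbl.ccache[0].coldata[0]);
console.log(Object.keys(groups));
```

Unlike an array of group separator indexes, this grouping does not need the rows to be sorted first, and it works for string values as well as numbers.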
]]></Notes>
<_-.XholonClass>
<PhysicalSystem/>
<Tibble childSuperClass="Row"/>
</_-.XholonClass>
<xholonClassDetails>
</xholonClassDetails>
<PhysicalSystem>
<!--
I obtained the iris data through R using:
write_csv(as_tibble(iris), "iris.csv")
OR
readr::write_csv(tibble::as_tibble(iris), "iris1.csv")
-->
<Tibble roleName="Iris" coltypes="dbl,dbl,dbl,dbl,fct">
<Attribute_String>
Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species
5.1,3.5,1.4,0.2,setosa
4.9,3,1.4,0.2,setosa
4.7,3.2,1.3,0.2,setosa
4.6,3.1,1.5,0.2,setosa
5,3.6,1.4,0.2,setosa
5.4,3.9,1.7,0.4,setosa
4.6,3.4,1.4,0.3,setosa
5,3.4,1.5,0.2,setosa
4.4,2.9,1.4,0.2,setosa
4.9,3.1,1.5,0.1,setosa
5.4,3.7,1.5,0.2,setosa
4.8,3.4,1.6,0.2,setosa
4.8,3,1.4,0.1,setosa
4.3,3,1.1,0.1,setosa
5.8,4,1.2,0.2,setosa
5.7,4.4,1.5,0.4,setosa
5.4,3.9,1.3,0.4,setosa
5.1,3.5,1.4,0.3,setosa
5.7,3.8,1.7,0.3,setosa
5.1,3.8,1.5,0.3,setosa
5.4,3.4,1.7,0.2,setosa
5.1,3.7,1.5,0.4,setosa
4.6,3.6,1,0.2,setosa
5.1,3.3,1.7,0.5,setosa
4.8,3.4,1.9,0.2,setosa
5,3,1.6,0.2,setosa
5,3.4,1.6,0.4,setosa
5.2,3.5,1.5,0.2,setosa
5.2,3.4,1.4,0.2,setosa
4.7,3.2,1.6,0.2,setosa
4.8,3.1,1.6,0.2,setosa
5.4,3.4,1.5,0.4,setosa
5.2,4.1,1.5,0.1,setosa
5.5,4.2,1.4,0.2,setosa
4.9,3.1,1.5,0.2,setosa
5,3.2,1.2,0.2,setosa
5.5,3.5,1.3,0.2,setosa
4.9,3.6,1.4,0.1,setosa
4.4,3,1.3,0.2,setosa
5.1,3.4,1.5,0.2,setosa
5,3.5,1.3,0.3,setosa
4.5,2.3,1.3,0.3,setosa
4.4,3.2,1.3,0.2,setosa
5,3.5,1.6,0.6,setosa
5.1,3.8,1.9,0.4,setosa
4.8,3,1.4,0.3,setosa
5.1,3.8,1.6,0.2,setosa
4.6,3.2,1.4,0.2,setosa
5.3,3.7,1.5,0.2,setosa
5,3.3,1.4,0.2,setosa
7,3.2,4.7,1.4,versicolor
6.4,3.2,4.5,1.5,versicolor
6.9,3.1,4.9,1.5,versicolor
5.5,2.3,4,1.3,versicolor
6.5,2.8,4.6,1.5,versicolor
5.7,2.8,4.5,1.3,versicolor
6.3,3.3,4.7,1.6,versicolor
4.9,2.4,3.3,1,versicolor
6.6,2.9,4.6,1.3,versicolor
5.2,2.7,3.9,1.4,versicolor
5,2,3.5,1,versicolor
5.9,3,4.2,1.5,versicolor
6,2.2,4,1,versicolor
6.1,2.9,4.7,1.4,versicolor
5.6,2.9,3.6,1.3,versicolor
6.7,3.1,4.4,1.4,versicolor
5.6,3,4.5,1.5,versicolor
5.8,2.7,4.1,1,versicolor
6.2,2.2,4.5,1.5,versicolor
5.6,2.5,3.9,1.1,versicolor
5.9,3.2,4.8,1.8,versicolor
6.1,2.8,4,1.3,versicolor
6.3,2.5,4.9,1.5,versicolor
6.1,2.8,4.7,1.2,versicolor
6.4,2.9,4.3,1.3,versicolor
6.6,3,4.4,1.4,versicolor
6.8,2.8,4.8,1.4,versicolor
6.7,3,5,1.7,versicolor
6,2.9,4.5,1.5,versicolor
5.7,2.6,3.5,1,versicolor
5.5,2.4,3.8,1.1,versicolor
5.5,2.4,3.7,1,versicolor
5.8,2.7,3.9,1.2,versicolor
6,2.7,5.1,1.6,versicolor
5.4,3,4.5,1.5,versicolor
6,3.4,4.5,1.6,versicolor
6.7,3.1,4.7,1.5,versicolor
6.3,2.3,4.4,1.3,versicolor
5.6,3,4.1,1.3,versicolor
5.5,2.5,4,1.3,versicolor
5.5,2.6,4.4,1.2,versicolor
6.1,3,4.6,1.4,versicolor
5.8,2.6,4,1.2,versicolor
5,2.3,3.3,1,versicolor
5.6,2.7,4.2,1.3,versicolor
5.7,3,4.2,1.2,versicolor
5.7,2.9,4.2,1.3,versicolor
6.2,2.9,4.3,1.3,versicolor
5.1,2.5,3,1.1,versicolor
5.7,2.8,4.1,1.3,versicolor
6.3,3.3,6,2.5,virginica
5.8,2.7,5.1,1.9,virginica
7.1,3,5.9,2.1,virginica
6.3,2.9,5.6,1.8,virginica
6.5,3,5.8,2.2,virginica
7.6,3,6.6,2.1,virginica
4.9,2.5,4.5,1.7,virginica
7.3,2.9,6.3,1.8,virginica
6.7,2.5,5.8,1.8,virginica
7.2,3.6,6.1,2.5,virginica
6.5,3.2,5.1,2,virginica
6.4,2.7,5.3,1.9,virginica
6.8,3,5.5,2.1,virginica
5.7,2.5,5,2,virginica
5.8,2.8,5.1,2.4,virginica
6.4,3.2,5.3,2.3,virginica
6.5,3,5.5,1.8,virginica
7.7,3.8,6.7,2.2,virginica
7.7,2.6,6.9,2.3,virginica
6,2.2,5,1.5,virginica
6.9,3.2,5.7,2.3,virginica
5.6,2.8,4.9,2,virginica
7.7,2.8,6.7,2,virginica
6.3,2.7,4.9,1.8,virginica
6.7,3.3,5.7,2.1,virginica
7.2,3.2,6,1.8,virginica
6.2,2.8,4.8,1.8,virginica
6.1,3,4.9,1.8,virginica
6.4,2.8,5.6,2.1,virginica
7.2,3,5.8,1.6,virginica
7.4,2.8,6.1,1.9,virginica
7.9,3.8,6.4,2,virginica
6.4,2.8,5.6,2.2,virginica
6.3,2.8,5.1,1.5,virginica
6.1,2.6,5.6,1.4,virginica
7.7,3,6.1,2.3,virginica
6.3,3.4,5.6,2.4,virginica
6.4,3.1,5.5,1.8,virginica
6,3,4.8,1.8,virginica
6.9,3.1,5.4,2.1,virginica
6.7,3.1,5.6,2.4,virginica
6.9,3.1,5.1,2.3,virginica
5.8,2.7,5.1,1.9,virginica
6.8,3.2,5.9,2.3,virginica
6.7,3.3,5.7,2.5,virginica
6.7,3,5.2,2.3,virginica
6.3,2.5,5,1.9,virginica
6.5,3,5.2,2,virginica
6.2,3.4,5.4,2.3,virginica
5.9,3,5.1,1.8,virginica
</Attribute_String>
<script><![CDATA[
// read iris text data into JS array
var tibble = this.parent();
var data = this.prev().remove().text().trim().split("\n");
this.remove();
// convert JS array into Xholon subtree
tibble.colnames = data.shift().split(",");
tibble.coltypes = tibble.coltypes.split(",");
var rowXhcName = tibble.xhc().attr("childSuperClass");
tibble.xhc().parent().append("<" + rowXhcName + "/>");
tibble.append('<_-.rows><' + rowXhcName + ' multiplicity="' + data.length + '"/></_-.rows>');
var row = tibble.first();
data.forEach(function(item) {
if (row) {
var itemarr = item.trim().split(",");
for (var i = 0; i < itemarr.length; i++) {
switch (tibble.coltypes[i]) {
case "dbl":
itemarr[i] = Number(itemarr[i]);
break;
case "fct":
default:
break;
}
}
row.coldata = itemarr;
row = row.next();
}
else {
tibble.println("row is null");
}
});
$wnd.console.log(tibble);
$wnd.console.log(tibble.first());
// operate on the tibble data
tibble.cache();
tibble.println(tibble.length);
tibble.println(tibble[0].name());
tibble.println(tibble.colnames);
tibble.println(tibble.coltypes);
for (var j = 0; j < 10; j++) {
tibble.println(tibble[j].coldata);
}
tibble.uncache();
// clone - for use by arrange()
// this must be called before calling tibble.cache("ccache")
var clone = function(tbl) {
var tblout = tbl.cloneAfter().next().remove();
tblout.colnames = tbl.colnames.slice();
tblout.coltypes = tbl.coltypes.slice();
var rowin = tbl.first();
var rowout = tblout.first();
while (rowin && rowout) {
rowout.coldata = rowin.coldata.slice();
rowin = rowin.next();
rowout = rowout.next();
}
return tblout;
}
// select the values of a column, where colix is the coldata index (0:3)
// TODO return a new Xholon Tibble
var select = function(tbl, colix) {
tbl.ccache.forEach(function(xhnode) {
tibble.println(xhnode.coldata[colix]);
});
}
// calculate sum and mean of the values of a column, where colix is the coldata index (0:3)
var sum = function(tbl, colix) {
var total = 0.0;
tbl.ccache.forEach(function(xhnode) {
total += xhnode.coldata[colix];
});
tibble.println("sum:" + total.toFixed(1) + " mean:" + (total / tbl.ccache.length).toFixed(1));
}
// add an extra column
// OR, I could return an array of cloned Xholon nodes, so the original nodes are retained
// TODO return a new Xholon Tibble
var mutate = function(tbl, colindexes, colname, coltype, func) {
tbl.colnames.push(colname);
tbl.coltypes.push(coltype);
tbl.ccache.forEach(function(xhnode) {
xhnode.coldata.push(func(xhnode.coldata[colindexes[0]], xhnode.coldata[colindexes[1]]));
});
}
// create and return a new array of Xholon nodes
// TODO return a new Xholon Tibble, not just the row array
var filter = function(tbl, colix, value) {
// keep only the rows whose value in column colix is greater than value
return tbl.ccache.filter(function(xhnode) {
return xhnode.coldata[colix] > value;
});
}
var sortFuncColix = 0;
var sortfunc = function(xhnode1, xhnode2) {
//$wnd.console.log("sortfunc: " + xhnode1.name() + " " + xhnode2.name());
if (xhnode1.coldata[sortFuncColix] > xhnode2.coldata[sortFuncColix]) {
return 1;
}
else if (xhnode1.coldata[sortFuncColix] < xhnode2.coldata[sortFuncColix]) {
return -1;
}
else {
return 0;
}
}
// arrange
// NOTE: Object.assign({}, tbl) makes a shallow copy, so tblout.ccache is the same array as tbl.ccache.
// The sort below therefore also reorders the rows cached on tbl; group_by() relies on this.
var arrange = function(tbl, colix, desc=false) {
sortFuncColix = colix;
//var tblout = tbl; // will sort in-place
var tblout = Object.assign({}, tbl); // shallow copy: shares ccache with tbl, and has none of the XholonJsApi functions
//var tblout = tbl.cloneAfter().next().remove(); // does not include app.specific properties such as ccache
//var tblout = clone(tbl); // won't work if tbl has a ccache
var arr = tblout.ccache;
arr.sort(sortfunc);
if (desc) {
arr.reverse();
}
return tblout;
}
// Make an array of indexes, that serve to separate a tibble into groups.
// The tibble must already be sorted by the colix.
// For now, assume that values are non-negative numbers.
var makeGroupSepsArray = function(tbl, colix) {
var garr = [];
var value = -1;
var ix = 0;
tbl.ccache.forEach(function(xhnode) {
if (xhnode.coldata[colix] != value) {
value = xhnode.coldata[colix];
garr.push(ix);
}
ix++;
});
//tibble.println(ix + " " + tbl.ccache.length);
garr.push(ix);
return garr;
}
// group_by, for subsequent use by summarize()
var group_by = function(tbl, colix) {
arrange(tbl, colix, false);
var sarr = makeGroupSepsArray(tbl, colix);
tibble.println(sarr);
tibble.println(tbl.ccache.slice(sarr[0], sarr[1])); // first group
tibble.println(tbl.ccache.slice(sarr[1], sarr[2])); // second group
tibble.println(tbl.ccache.slice(sarr[2], sarr[3])); // third group
}
// summarize - arr.reduce()
var summarize = function(tbl, colix, callback, initialValue) {
var result = tbl.ccache.reduce(callback, initialValue);
return result;
}
tibble.cache("ccache");
select(tibble, 0);
sum(tibble, 0);
sum(tibble, tibble.colnames.indexOf("Sepal.Length"));
sum(tibble, tibble.colnames.indexOf("Sepal.Width"));
var func1 = function(x, y) {return (x + y) / 2;}
// columns 0 and 2 are Sepal.Length and Petal.Length
mutate(tibble, [0, 2], "mean.Length", "dbl", func1);
var filtarr = filter(tibble, 0, 7.0);
$wnd.console.log(filtarr);
// test
//sortfunc(tibble.ccache[0], tibble.ccache[1]);
var testColix = 1;
// arrange descending
var tibbleb = arrange(tibble, testColix, true);
$wnd.console.log(tibbleb);
select(tibbleb, testColix);
// arrange ascending
var tibblec = arrange(tibble, testColix, false);
select(tibblec, testColix);
group_by(tibble, testColix);
const reducer = function(accumulator, currentValue) {return accumulator + currentValue.coldata[testColix];}
var summ = summarize(tibble, testColix, reducer, 0.0);
tibble.println(summ.toFixed(1));
tibble.uncache("ccache");
// test clone()
var clone1 = clone(tibble);
$wnd.console.log("clone1");
$wnd.console.log(clone1);
//# sourceURL=Tibblescript.js
]]></script>
</Tibble>
</PhysicalSystem>
<SvgClient><Attribute_String roleName="svgUri"><![CDATA[data:image/svg+xml,
<svg width="100" height="50" xmlns="http://www.w3.org/2000/svg">
<g>
<title>Iris Tibble</title>
<rect id="PhysicalSystem/Tibble[@roleName='Iris']" fill="#98FB98" height="50" width="50" x="25" y="0"/>
<g>
<title>Row 1</title>
<rect id="PhysicalSystem/Tibble[@roleName='Iris']/Row[1]" fill="#6AB06A" height="50" width="10" x="80" y="0"/>
</g>
</g>
</svg>
]]></Attribute_String><Attribute_String roleName="setup">${MODELNAME_DEFAULT},${SVGURI_DEFAULT}</Attribute_String></SvgClient>
</XholonWorkbook>