Last active
September 14, 2018 11:18
Graphs, Rules - Machine Learning
<?xml version="1.0" encoding="UTF-8"?>
<!--Xholon Workbook http://www.primordion.com/Xholon/gwt/ MIT License, Copyright (C) Ken Webb, Fri Sep 14 2018 07:18:12 GMT-0400 (Eastern Daylight Time)-->
<XholonWorkbook>
<Notes><![CDATA[
Xholon
------
Title: Graphs, Rules - Machine Learning
Description:
Url: http://www.primordion.com/Xholon/gwt/
InternalName: 1384981d4264e9282ec0e6eacaf24d2c
Keywords:
My Notes
--------
September 6, 2018
see notes from Sept 5
see my Documents/GraphsRulesML folder
Be able to predict the right-hand side (RHS) from the left-hand side (LHS) in a set of rules, especially where the LHS and RHS are graphs.
Explore what approaches are offered by Machine Learning (ML) and Data Science (DS).
Typically the LHS will be more or less abstract, while the RHS will be more or less concrete.
One Xholon scenario (Island app):
- capture data from multiple human players, where each player is represented in the game by an Avatar
- the data will consist of the Avatar's "before" state (LHS), an action taken, and the Avatar's "after" state (RHS)
- the data will be saved to the in-browser IndexedDB database
- subsequently, I want non-human players to be able to use the captured data to make reasonably intelligent moves in the game space
- they could choose randomly from a list of available rules, or
- they could choose matching rules based on probabilities (which RHS was most often chosen by human players), or
- they could apply ML/DS algorithms
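The probability-weighted option above can be sketched as roulette-wheel selection over the captured rules. The record shape ({lhs, rhs, count}) and the function name are hypothetical, not the actual Island-app schema:

```javascript
// Sketch: pick a RHS for a matched LHS, weighted by how often human
// players chose it. Rule records {lhs, rhs, count} are a hypothetical
// encoding of the captured before-state / after-state data.
function chooseRhs(rules, lhsKey, random) {
  random = random || Math.random;
  // keep only the rules whose LHS matches the avatar's current state
  var matches = rules.filter(function(r) { return r.lhs === lhsKey; });
  var total = matches.reduce(function(sum, r) { return sum + r.count; }, 0);
  if (total === 0) return null; // no captured rule applies
  // roulette wheel: probability of each RHS = count / total
  var threshold = random() * total;
  var acc = 0;
  for (var i = 0; i < matches.length; i++) {
    acc += matches[i].count;
    if (threshold < acc) return matches[i].rhs;
  }
  return matches[matches.length - 1].rhs;
}
```

Passing `random` explicitly keeps the choice reproducible for testing; a non-human player would omit it and fall back to Math.random.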
References
----------
(1) search: "machine learning" rules
(2) search: "machine learning" graph
(3) https://en.wikipedia.org/wiki/Rule-based_machine_learning
Rule-based machine learning (RBML) is a term in computer science intended to encompass any machine learning method that identifies, learns, or evolves 'rules' to store, manipulate or apply. The defining characteristic of a rule-based machine learner is the identification and utilization of a set of relational rules that collectively represent the knowledge captured by the system. This is in contrast to other machine learners that commonly identify a singular model that can be universally applied to any instance in order to make a prediction.
Rule-based machine learning approaches include learning classifier systems, association rule learning, artificial immune systems, and any other method that relies on a set of rules, each covering contextual knowledge. While rule-based machine learning is conceptually a type of rule-based system, it is distinct from traditional rule-based systems, which are often hand-crafted, and other rule-based decision makers. This is because rule-based machine learning applies some form of learning algorithm to automatically identify useful rules, rather than a human needing to apply prior domain knowledge to manually construct rules and curate a rule set.
KSW: this is probably the inverse of what I am looking for
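The contrast described above (a set of rules, each covering part of the input space, rather than one global model) can be sketched minimally. The rule encoding and the majority-vote conflict resolution are illustrative assumptions, not taken from any specific RBML system:

```javascript
// Sketch of the RBML idea: knowledge held as a set of rules, each
// covering a region of the input space, instead of a single model.
function predict(ruleSet, instance) {
  // collect every rule whose condition covers this instance
  var fired = ruleSet.filter(function(r) { return r.covers(instance); });
  if (fired.length === 0) return "unknown";
  // resolve overlapping rules by a simple majority vote (an assumption)
  var votes = {};
  fired.forEach(function(r) { votes[r.label] = (votes[r.label] || 0) + 1; });
  return Object.keys(votes).sort(function(a, b) { return votes[b] - votes[a]; })[0];
}

// a toy rule set; each rule covers only part of the input space
var ruleSet = [
  { covers: function(x) { return x.height > 10; }, label: "tall" },
  { covers: function(x) { return x.height <= 10; }, label: "short" },
  { covers: function(x) { return x.width > 10; }, label: "tall" }
];
```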
(4) http://www.plantcell.org/content/23/9/3101
http://www.plantcell.org/content/plantcell/23/9/3101.full.pdf
Functional Network Construction in Arabidopsis Using Rule-Based Machine Learning on Large-Scale Data Sets
George W. Bassel, Enrico Glaab, Julietta Marquez, Michael J. Holdsworth, Jaume Bacardit
Published September 2011.
Rule-based learning methods ... where a model is composed of an arbitrary set of decision rules, where each rule specifies a subset of the input space.
An additional benefit of rule-based learning methods is that they produce human-readable rules (Figure 1A), in contrast with other methods such as nonlinear SVM and artificial neural networks.
Here, we propose a novel approach in the construction of functional networks based on gene expression data, through the use of rule-based ML. The premise of this approach is that genes present within the same rule that predicts a developmental outcome will have an increased likelihood of being functionally related in the developmental process in question, given their collective ability to act together in generating the prediction. We term this associative measure “coprediction.”
The use of coprediction enables functional gene associations to be inferred that cannot be detected using coexpression analysis, for example. Coprediction is not restricted by similarities in expression pattern, which are the only measure used by coexpression. Another difference between these two methods stems from rule-based methods focusing on identifying interactions within subsets of samples, such as those belonging to the discrete developmental states or transitions between them, rather than correlations across all samples, as is the case with coexpression. In this way, state-dependent data can be considered independently from one another and novel knowledge extracted from the data.
KSW: perhaps break Xholon subgraphs into fragments that are at least spatio-temporally coordinated with each other, for example inventory and siblings
- use these in a way similar to what is described above
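The coprediction measure quoted above amounts to counting how often two genes co-occur in the same predictive rule. A minimal sketch, with rules simplified to arrays of gene names (the real system's rule format is richer):

```javascript
// Sketch of the paper's "coprediction" measure: two genes are
// associated if they appear together in the same predictive rule.
// Rules are simplified here to plain arrays of gene names.
function coprediction(rules) {
  var counts = {}; // "geneA|geneB" -> number of rules containing both
  rules.forEach(function(genes) {
    for (var i = 0; i < genes.length; i++) {
      for (var j = i + 1; j < genes.length; j++) {
        // sort so each unordered pair maps to one key
        var key = [genes[i], genes[j]].sort().join("|");
        counts[key] = (counts[key] || 0) + 1;
      }
    }
  });
  return counts;
}
```

High counts mark gene pairs that repeatedly act together in generating predictions, which is the network edge weight the paper derives.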
In this study we aimed to
(1) investigate whether rule-based ML methods can be used to predict the developmental output of a biological system,
(2) identify novel regulators using these predictions, and
(3) uncover functional associations between genes controlling a developmental phase transition using this approach.
We used the large number of publicly available transcriptomic data sets representing each of the dormant and germinating states of Arabidopsis seeds (Bassel et al., 2008) to generate a functional network using rule-based ML. Four different ML methods representing different types of prediction techniques were compared to determine the accuracy with which they can predict developmental fate in seeds.
(5) http://vis.cs.ucdavis.edu/papers/kwon_infovis17.pdf
What Would a Graph Look Like in This Layout? A Machine Learning Approach to Large Graph Visualization
Oh-Hyun Kwon, Student Member, IEEE, Tarik Crnovrsanin, and Kwan-Liu Ma, Fellow, IEEE
In the field of machine learning, several methods have been used to predict the properties of graphs, such as the classes of graphs. One prominent approach to predicting such properties is to use a graph kernel. Graph kernel methods enable us to apply various kernelized machine learning techniques, such as the Support Vector Machine (SVM) [18], on graphs.
Based on our assumption, we need to measure the topological similarities between graphs. Depending on the discipline, this problem is called graph matching, graph comparison, or network alignment. In the last five decades, numerous approaches have been proposed to this problem [22, 29]. Each approach measures the similarity between graphs based on different aspects, such as isomorphism relation [48, 79, 92], graph edit distance [11], or graph measures [4, 19]. However, many of these traditional approaches are either computationally expensive, not expressive enough to capture the topological features, or difficult to adapt to different problems [64].
Graph kernels have been recently introduced for measuring pairwise similarities between graphs in the field of machine learning. They allow us to apply many different kernelized machine learning techniques, e.g. SVM [18], on graph problems, including graph classification problems found in bioinformatics and chemoinformatics.
A graph kernel can be considered to be an instance of an R-convolution kernel [42]. It measures the similarity between two graphs based on the recursively decomposed substructures of said graphs. The measure of similarity varies with each graph kernel based on different types of substructures in graphs. These substructures include walks [35, 50, 85], shortest paths [8], subtrees [72, 74, 75], cycles [44], and graphlets [76]. Selecting a graph kernel is challenging as many kernels are available. To exacerbate the problem, there is no theoretical justification as to why one graph kernel works better than another for a given problem [76].
Many graph kernels often have similar limitations as previously mentioned approaches. They do not scale well to large graphs ... or do not work well for unlabeled graphs. To overcome this problem, a graph kernel based on sampling a fixed number of graphlets has been introduced to be accurate and efficient on large graphs [64, 76].
Graphlets are small, induced, and non-isomorphic subgraph patterns in a graph [67] (Fig. 2). Graphlet frequencies (Fig. 3) have been used to characterize biological networks [66, 67], identify disease genes [59], and analyze social network structures [82]. Depending on the definition, the relative frequencies are called graphlet frequency distribution, graphlet degree distribution, or graphlet concentrations. While these works often have different definitions, the fundamental idea is to count the individual graphlets and compare their relative frequencies of occurrence between graphs. A graph kernel based on graphlet frequencies, called a graphlet kernel, was first proposed by Shervashidze et al. [76]. The main idea is to use a graphlet frequency vector as the feature vector of a graph, then compute the similarity between graphs by defining the inner product of the feature vectors.
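The graphlet-kernel idea above reduces to a small computation once the graphlet counts exist: normalize each count vector into relative frequencies, then take the inner product. A sketch, assuming the counting step (the expensive part) is done elsewhere:

```javascript
// Sketch of a graphlet kernel: graphs are compared via the inner
// product of their relative graphlet-frequency vectors. The counts
// per graphlet type are assumed to be computed elsewhere.
function graphletKernel(countsA, countsB) {
  var normalize = function(counts) {
    var total = counts.reduce(function(s, c) { return s + c; }, 0);
    // relative frequency of each graphlet type (zero vector if empty)
    return counts.map(function(c) { return total ? c / total : 0; });
  };
  var fA = normalize(countsA);
  var fB = normalize(countsB);
  // inner product of the two frequency vectors
  return fA.reduce(function(s, v, i) { return s + v * fB[i]; }, 0);
}
```

Graphs with similar graphlet-frequency profiles score high regardless of their absolute sizes, which is why the counts are normalized first.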
(6) http://graphvis.net/wgl
(7) https://en.wikipedia.org/wiki/Graphlets
Graphlets are small connected non-isomorphic induced subgraphs of a large network. Graphlets differ from network motifs, since they must be induced subgraphs, whereas motifs are partial subgraphs. An induced subgraph must contain all edges between its nodes that are present in the large network, while a partial subgraph may contain only some of these edges. Moreover, graphlets do not need to be over-represented in the data when compared with randomized networks, while motifs do.
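The induced-versus-partial distinction above can be made concrete by counting the host graph's edges among a chosen node subset: an induced match must account for every one of them, a partial match need not. The edge-list representation and function names here are illustrative:

```javascript
// return the host-graph edges whose endpoints both lie in `nodes`
function edgesAmong(hostEdges, nodes) {
  var inSet = {};
  nodes.forEach(function(n) { inSet[n] = true; });
  return hostEdges.filter(function(e) { return inSet[e[0]] && inSet[e[1]]; });
}

// induced (graphlet-style): the pattern must contain ALL host edges
// among the chosen nodes, so the edge counts must be equal
function isInducedMatch(hostEdges, nodes, patternEdgeCount) {
  return edgesAmong(hostEdges, nodes).length === patternEdgeCount;
}

// partial (motif-style): the pattern may omit some of those edges
function isPartialMatch(hostEdges, nodes, patternEdgeCount) {
  return edgesAmong(hostEdges, nodes).length >= patternEdgeCount;
}
```

For example, a 2-edge path on three nodes of a triangle is a partial match but not an induced one, because the triangle's third edge is missing from the pattern.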
(8) https://en.wikipedia.org/wiki/Network_motif
All networks, including biological networks, social networks, technological networks (e.g., computer networks and electrical circuits) and more, can be represented as graphs, which include a wide variety of subgraphs. One important local property of networks is the presence of so-called network motifs, which are defined as recurrent and statistically significant sub-graphs or patterns.
Motif Discovery Algorithms
Here, a review on computational aspects of major algorithms is given, and their related benefits and drawbacks from an algorithmic perspective are discussed.
Well-Established Motifs and Their Functions
(9) http://robotics.stanford.edu/~quocle/CaeCheLeSmo07.pdf
Learning Graph Matching
Tiberio S. Caetano, Li Cheng, Quoc V. Le and Alex J. Smola
Statistical Machine Learning Program, NICTA and ANU, Canberra ACT 0200, Australia
(10) https://arxiv.org/pdf/1506.05163.pdf
Deep Convolutional Networks on Graph-Structured Data
Mikael Henaff, Joan Bruna, Yann LeCun
(11) https://en.wikipedia.org/wiki/Convolutional_neural_network
(12) https://www.quora.com/How-would-you-apply-deep-learning-to-graph-data-What-algorithm-would-you-use?share=1
How would you apply deep learning to graph data? What algorithm would you use?
Dileep Patchigolla, Specialist Data science at Hike Messenger
DL is used widely for network embeddings of graph data. Basically, instead of representing each node of a graph with its links and features, there are several DL techniques with which each node can be represented in a low-dimensional latent space. Once you get these embeddings, they are typically used for some downstream ML applications, i.e., you can use these embeddings as features for your ML models.
CNNs are generally used to learn these node embeddings from a node's own attributes and attributes from its neighborhood. Depending on how 'deep' you can go, a convolution operator can learn from the features of a node and its (extended) neighborhood.
This GitHub location has a list of several great paper implementations:
chihming/awesome-network-embedding
(13) https://github.com/chihming/awesome-network-embedding
A curated list of network embedding techniques. Also called network representation learning, graph embedding, knowledge embedding, etc. The task is to learn the representations of the vertices from a given network.
(14) https://arxiv.org/pdf/1511.05493.pdf
GATED GRAPH SEQUENCE NEURAL NETWORKS
Yujia Li, Richard Zemel, Marc Brockschmidt, Daniel Tarlow
ABSTRACT
Graph-structured data appears frequently in domains including chemistry, natural language semantics, social networks, and knowledge bases. In this work, we study feature learning techniques for graph-structured inputs. Our starting point is previous work on Graph Neural Networks (Scarselli et al., 2009), which we modify to use gated recurrent units and modern optimization techniques and then extend to output sequences. The result is a flexible and broadly useful class of neural network models that has favorable inductive biases relative to purely sequence-based models (e.g., LSTMs) when the problem is graph-structured. We demonstrate the capabilities on some simple AI (bAbI) and graph algorithm learning tasks. We then show it achieves state-of-the-art performance on a problem from program verification, in which subgraphs need to be described as abstract data structures.
(15) https://en.wikipedia.org/wiki/Induced_subgraph
(16) https://github.com/josephmisiti/awesome-machine-learning#javascript
A curated list of awesome Machine Learning frameworks, libraries and software.
(17) http://www.weizmann.ac.il/mcb/UriAlon/homepage
Understanding the protein circuits that perform computations within the cell is a central problem in biology.
network motifs
]]></Notes>
<_-.XholonClass>
<!-- domain objects -->
<PhysicalSystem/>
<Block/>
<Brick/>
<!-- quantities -->
<Height superClass="Quantity"/>
</_-.XholonClass>
<xholonClassDetails>
<Block>
<port name="height" connector="Height"/>
</Block>
</xholonClassDetails>
<PhysicalSystem>
<Block>
<Height>0.1 m</Height>
</Block>
<Brick multiplicity="2"/>
</PhysicalSystem>
<Blockbehavior implName="org.primordion.xholon.base.Behavior_gwtjs"><![CDATA[
// multiply two constants and log the result to the browser console
var a = 123;
var b = 456;
var c = a * b;
if (console) {
  console.log(c);
}
//# sourceURL=Blockbehavior.js
]]></Blockbehavior>
<Heightbehavior implName="org.primordion.xholon.base.Behavior_gwtjs"><![CDATA[
var myHeight, testing;
var beh = {
  postConfigure: function() {
    // pick a random integer 0-9 and cache the parent Height node
    testing = Math.floor(Math.random() * 10);
    myHeight = this.cnode.parent();
  },
  act: function() {
    // print this behavior's state via the parent node each timestep
    myHeight.println(this.toString());
  },
  toString: function() {
    return "testing:" + testing;
  }
};
//# sourceURL=Heightbehavior.js
]]></Heightbehavior>
<Brickbehavior implName="org.primordion.xholon.base.Behavior_gwtjs"><![CDATA[
// behavior attached to a Brick node; prints its identity each timestep
$wnd.xh.Brickbehavior = function Brickbehavior() {}
$wnd.xh.Brickbehavior.prototype.postConfigure = function() {
  this.brick = this.cnode.parent();
  this.iam = " red brick";
};
$wnd.xh.Brickbehavior.prototype.act = function() {
  this.brick.println("I am a" + this.iam);
};
//# sourceURL=Brickbehavior.js
]]></Brickbehavior>
<Brickbehavior implName="org.primordion.xholon.base.Behavior_gwtjs"><![CDATA[
console.log("I'm another brick behavior");
]]></Brickbehavior>
<SvgClient><Attribute_String roleName="svgUri"><![CDATA[data:image/svg+xml,
<svg width="100" height="50" xmlns="http://www.w3.org/2000/svg">
<g>
<title>Block</title>
<rect id="PhysicalSystem/Block" fill="#98FB98" height="50" width="50" x="25" y="0"/>
<g>
<title>Height</title>
<rect id="PhysicalSystem/Block/Height" fill="#6AB06A" height="50" width="10" x="80" y="0"/>
</g>
</g>
</svg>
]]></Attribute_String><Attribute_String roleName="setup">${MODELNAME_DEFAULT},${SVGURI_DEFAULT}</Attribute_String></SvgClient>
</XholonWorkbook>