
Christophe Willemsen ikwattro

MATCH (n:NER_Organization) 
RETURN n.value, size((n)<-[:HAS_TAG]-()) AS freq
ORDER BY freq DESC
╒══════════════════════════════════════════════════════════════╤══════╕
│"n.value"                                                     │"freq"│
╞══════════════════════════════════════════════════════════════╪══════╡

Create an NLP pipeline

CALL ga.nlp.processor.addPipeline({
name:"transcript", 
textProcessor: 'com.graphaware.nlp.processor.stanford.StanfordTextProcessor',
processingSteps: {tokenize:true, ner:true, dependencies:true}
})

Import the files

CALL ga.nlp.utils.listFiles("/Users/ikwattro/dev/_transcript", ".vtt")
YIELD filePath
MERGE (v:VideoTranscript {path: filePath})
WITH v, filePath
CALL ga.nlp.parser.webvtt(filePath)
YIELD startTime, endTime, text
MERGE (c:Caption {id: filePath + startTime + endTime}) SET c.text = text, c.start = startTime, c.end = endTime
MERGE (v)-[:HAS_CAPTION]->(c)
mkdir transcripts && cd transcripts
youtube-dl --all-subs --skip-download 'https://www.youtube.com/channel/UCvze3hU6OZBkB1vkhH2lH9Q'
ikwattro / queries.md
Last active July 8, 2018 11:56
NLP with Whitelist
CALL ga.nlp.createSchema
CALL ga.nlp.processor.addPipeline({
name:"whitelist", 
whitelist:"hello,john,ibm", 
textProcessor:"com.graphaware.nlp.enterprise.processor.EnterpriseStanfordTextProcessor", 
╒═══════════════════════════════════════════════════════════╤════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╕
│"t.id" │"collect(distinct k)" │
╞═══════════════════════════════════════════════════════════╪════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│"Anthropology" │["the","and","have","for","with","government","country","that","from","include","they","not","who","after","during","which","can"] │
├───────────────────────────────────────────────────────────┼───────────────────────────────────────
ikwattro / pizza.txt
Created February 3, 2018 21:31 — forked from edsu/pizza.txt
An example of looking up Pizza in Wikidata to see the equivalent entities on other Wikipedias, using the wbgetentities call in the Wikidata API http://wikidata.org/w/api.php. Python is used as a filter on the end of curl only to pretty-print the JSON; you can leave it off if you don't have Python available. In response to https:/…
curl "https://www.wikidata.org/w/api.php?action=wbgetentities&sites=itwiki&titles=Pizza&format=json" | python -mjson.tool
{
    "entities": {
        "q177": {
            "descriptions": {
                "de": {
                    "language": "de",
                    "value": "Gericht"
                },
                "en": {
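
For readers without curl, the same lookup can be sketched in Python using only the standard library. This is a minimal sketch, not part of the original gist: the function names `build_lookup_url`, `pretty` and `fetch_entities` and the `User-Agent` string are assumptions made for illustration, while the query parameters are copied from the curl command above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the curl command in the gist.
API_URL = "https://www.wikidata.org/w/api.php"


def build_lookup_url(title, site="itwiki"):
    """Build the wbgetentities URL for a page title on a given wiki."""
    params = {
        "action": "wbgetentities",
        "sites": site,
        "titles": title,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def pretty(payload):
    """Re-indent a JSON string, similar to piping it through json.tool."""
    return json.dumps(json.loads(payload), indent=4)


def fetch_entities(title, site="itwiki"):
    """Fetch the entity record over the network and return pretty JSON.

    The User-Agent value is a hypothetical placeholder; replace it with
    one that identifies your own script.
    """
    request = Request(
        build_lookup_url(title, site),
        headers={"User-Agent": "wikidata-lookup-example/0.1"},
    )
    with urlopen(request) as response:
        return pretty(response.read().decode("utf-8"))
```

Calling `print(fetch_entities("Pizza"))` should produce output of the same shape as the JSON above, assuming network access to wikidata.org.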
[{"id":"BE.NMBS.008811007","@id":"http://irail.be/stations/NMBS/008811007","name":"Schaarbeek/Schaerbeek","standardname":"Schaarbeek/Schaerbeek","adjacents":[{"id":"BE.NMBS.008812005","locationX":"4.360846","locationY":"50.859663","@id":"http://irail.be/stations/NMBS/008812005","name":"Brussels-North","standardname":"Brussel-Noord/Bruxelles-Nord","wikidata":{"adjacent_station":[{"exact_match":{"type":"uri","value":"http://irail.be/stations/NMBS/008812005"},"exact_match_adjacent":{"type":"uri","value":"http://irail.be/stations/NMBS/008811916"},"adjacent_stationLabel":{"xml:lang":"en","type":"literal","value":"Brussels-Schuman railway station"},"adjacent_stationUrl":{"type":"uri","value":"http://www.wikidata.org/entity/Q800589"}},{"exact_match":{"type":"uri","value":"http://irail.be/stations/NMBS/008812005"},"exact_match_adjacent":{"type":"uri","value":"http://irail.be/stations/NMBS/008811007"},"adjacent_stationLabel":{"xml:lang":"en","type":"literal","value":"Schaarbeek railway station"},"adjacent_stationUrl":
multiple_100 100
multiple_100 200
multiple_100 300
multiple_100 400
multiple_100 500
multiple_100 600
multiple_100 700
multiple_100 800
multiple_100 900
multiple_100 1000
class MyController
{
    private $client;

    public function __construct(Client $client)
    {
        $this->client = $client;
    }