Alfredo Serafini seralf

@cbuil
cbuil / RDF4JLoad.java
Created January 20, 2023 16:45
loading wikidata on RDF4J
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import org.eclipse.rdf4j.common.exception.RDF4JException;
import org.eclipse.rdf4j.common.transaction.IsolationLevels;
@cgivre
cgivre / gist:a5c5c24048fe799278b79f971b39e6e5
Last active August 18, 2021 21:55
Convert ANSI SQL to T-SQL

One of the major challenges you may face is converting "normal" SQL to T-SQL, which is Microsoft's dialect of SQL. I couldn't find any easy way to do this, but while doing some other work I found that Apache Calcite can actually perform this conversion quite simply. So... here's some code that does exactly that!

import org.apache.calcite.config.Lex;
import org.apache.calcite.sql.SqlDialect;
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.sql.parser.SqlParseException;

lakeFS with MinIO

lakeFS gives Git-like capabilities over your MinIO storage, allowing you to coordinate with colleagues when working on your data.

In the following example, we will use lakeFS to create a branch on your storage, commit changes to it, and then merge it to the master branch.

Prerequisites

  • Install MinIO Server from here.
  • Install mc from here.
  • Install docker-compose from here.
@pebbie
pebbie / sparqlqueryviz.py
Last active August 30, 2022 02:53
visualize BGP triples in SPARQL query
import sys
import os.path as path
from rdflib import Namespace, XSD, RDF, RDFS, OWL
from rdflib.term import Variable, URIRef, BNode, Literal
from rdflib.plugins.sparql.parser import parseQuery
from rdflib.plugins.sparql.parserutils import prettify_parsetree
from rdflib.plugins.sparql import prepareQuery
from rdflib.paths import Path
import pprint
import pygraphviz as pgv
@huchenxucs
huchenxucs / pos_embed.py
Created July 23, 2020 06:09
T5 relative positional embedding
import math
import torch
import torch.nn as nn
from torch.nn import functional as F
class RelativePositionBias(nn.Module):
    def __init__(self, bidirectional=True, num_buckets=32, max_distance=128, n_heads=2):
        super(RelativePositionBias, self).__init__()
        self.bidirectional = bidirectional
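The preview above stops right after the constructor, so the bucketing logic itself is not shown. As a rough illustration of what `bidirectional`, `num_buckets` and `max_distance` usually control in a T5-style relative position bias, here is a scalar, pure-Python sketch. The function name `relative_position_bucket` is hypothetical, and the sign convention (positive offsets go to the upper half of the buckets) is an assumption borrowed from common implementations; the gist's own code may differ.

```python
import math


def relative_position_bucket(relative_position, bidirectional=True,
                             num_buckets=32, max_distance=128):
    """Map a signed token offset to a bucket index (illustrative sketch only)."""
    bucket = 0
    n = relative_position
    if bidirectional:
        # Half of the buckets serve negative offsets, half serve positive ones.
        num_buckets //= 2
        if n > 0:
            bucket += num_buckets
        n = abs(n)
    else:
        # Unidirectional: positive offsets all collapse to distance 0.
        n = max(-n, 0)

    # Small distances each get their own bucket.
    max_exact = num_buckets // 2
    if n < max_exact:
        return bucket + n

    # Larger distances share logarithmically sized buckets up to max_distance.
    scaled = (math.log(n / max_exact)
              / math.log(max_distance / max_exact)
              * (num_buckets - max_exact))
    large = max_exact + int(scaled)
    # Anything beyond max_distance lands in the last bucket.
    return bucket + min(large, num_buckets - 1)
```

In a module like the one above, an index of this kind would typically be used to look up a learned per-head bias in an embedding table of shape `(num_buckets, n_heads)`; that step is omitted here because the preview does not show it.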
@RobertAKARobin
RobertAKARobin / python.md
Last active June 13, 2024 04:24
Python Is Not A Great Programming Language
@vkocaman
vkocaman / annotators.csv
Created September 27, 2019 21:05
list of annotators offered by Spark NLP
Annotator,Description,Version,Annotator Approach,Annotator Model
Tokenizer*,Identifies tokens with tokenization open standards,Opensource,-,+
Normalizer*,Removes all dirty characters from text,Opensource,-,+
Stemmer*,Returns hard-stems out of words with the objective of retrieving the meaningful part of the word,Opensource,+,-
Lemmatizer*,Retrieves lemmas out of words with the objective of returning a base dictionary word,Opensource,-,+
RegexMatcher*,Uses a reference file to match a set of regular expressions and put them inside a provided key.,Opensource,+,+
TextMatcher*,Annotator to match entire phrases (by token) provided in a file against a Document,Opensource,+,+
Chunker*,Matches a pattern of part-of-speech tags in order to return meaningful phrases from document,Opensource,+,-
DateMatcher*,Reads from different forms of date and time expressions and converts them to a provided date format,Opensource,+,-
SentenceDetector*,Finds sentence bounds in raw text. Applies rules from Pragmatic Segmenter,Opensou
@jerieljan
jerieljan / How I Do PlantUML.md
Last active January 1, 2024 06:23
PlantUML with Style -- How I do PlantUML

I use PlantUML a lot. It's what I use for drawing all sorts of diagrams, and it's handy because of its easy markup (once you get used to it) while making things easy to maintain as projects grow (thanks to version control).

This gist explains how I do my PlantUML workspace in a project.

  • The idea is to keep a globals directory for all diagrams to follow (like the "stylesheet" below) to keep things consistent.
  • I use a stylesheet.iuml file that keeps the use of colors consistent through use of basic FOREGROUND, BACKGROUND and ACCENT colors.
  • The style-presets.iuml file defines these colors so you can make "presets" or "themes" out of them.
  • As stated in the stylesheet.iuml, you'll need the Roboto Condensed and Inconsolata fonts for these to work properly.
  • You can choose to either run the PlantUML jar over your file/s, or use an IDE like VSCode with the PlantUML extension. For example, here's a preview of example-sequence.puml: https://imgur.com/Klk3w2F

GraphQL-LD

GraphQL-LD is a way to query Linked Data using GraphQL.

Instead of querying GraphQL interfaces, Linked Data interfaces are queried, such as SPARQL endpoints, TPF interfaces, Linked Data documents, ... This is done by semantifying GraphQL queries using a JSON-LD context.
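To make "semantifying" a little more concrete, here is a toy Python sketch of the idea: a JSON-LD-style context maps short GraphQL field names to IRIs, and each field becomes a triple pattern. This is not GraphQL-LD's actual implementation or API; the function `fields_to_patterns`, the example context and the flat field list are all assumptions made for illustration.

```python
# Hypothetical JSON-LD-style context: short field names mapped to full IRIs.
CONTEXT = {
    "label": "http://www.w3.org/2000/01/rdf-schema#label",
    "writer": "http://dbpedia.org/ontology/writer",
}


def fields_to_patterns(fields, context, subject="?s"):
    """Turn flat GraphQL field names into SPARQL-like triple patterns."""
    patterns = []
    for name in fields:
        # A field missing from the context raises KeyError: it has no meaning yet.
        iri = context[name]
        patterns.append(f"{subject} <{iri}> ?{name} .")
    return patterns


if __name__ == "__main__":
    for pattern in fields_to_patterns(["label", "writer"], CONTEXT):
        print(pattern)
```

Real GraphQL-LD queries can be nested and carry arguments, which this sketch leaves out.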

Try it out from your browser: http://query.linkeddatafragments.org/

Alternatively, install GraphQL-LD or Comunica SPARQL and execute GraphQL-LD queries on your machine.

@evan-burke
evan-burke / dremio-ubuntu
Last active July 10, 2021 01:01 — forked from jcaristy/info.sh
[DREMIO: Install dremio on Ubuntu] #dremio
Installing Dremio 1.4 on Ubuntu 16
### NOTE: this is significantly out of date since I last edited it in Jan 2018.
# See the comments on the gist for suggested changes for more recent versions.
##Install links / references
https://www.dremio.com/tutorials/recommender-scikit-learn-dremio-postgres-mongodb/
https://www.dremio.com/tutorials/dremio-oracle-aws/
https://docs.dremio.com/deployment/standalone-tarball.html