Alex Olieman aolieman

View keybase.md

Keybase proof

I hereby claim:

  • I am aolieman on github.
  • I am alioli (https://keybase.io/alioli) on keybase.
  • I have a public key ASD9fx_rJMX8LRULGEOq3ymh-4MgCDo30SsGFyj3PIH_dwo

To claim this, I am signing this object:

aolieman / linking_parties.md
Last active Aug 29, 2015
Strategy for recognizing and linking Dutch political parties

Strategy for linking political parties

Build a name tree and a lookup map

  1. Retrieve all party identifiers
  2. For each party: parse XML and map known names to its identifier
  3. Build an Aho-Corasick tree from the party names
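The first two steps can be sketched as follows. This is a minimal illustration, not the original implementation: the XML layout (`<party id="..."><name>...</name></party>`), the identifiers, and the function name `build_lookup` are all assumptions, since the actual party documents are not shown here.

```python
import xml.etree.ElementTree as ET


def build_lookup(party_xml_docs):
    """Map every known (lowercased) party name to its party identifier.

    Assumes a hypothetical layout: <party id="..."><name>...</name></party>.
    """
    name_to_id = {}
    for xml_text in party_xml_docs:
        root = ET.fromstring(xml_text)
        party_id = root.get("id")
        for name_el in root.iter("name"):
            name = (name_el.text or "").strip()
            if name:
                # Lowercase keys so that matching can be case-insensitive.
                name_to_id[name.lower()] = party_id
    return name_to_id


# Hypothetical input documents, one per party.
EXAMPLE_DOCS = [
    '<party id="party-1"><name>Groene Unie</name><name>GU</name></party>',
    '<party id="party-2"><name>Unie</name></party>',
]
EXAMPLE_LOOKUP = build_lookup(EXAMPLE_DOCS)
```

The keys of the resulting map are the names from which the Aho-Corasick tree would be built.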

String matching and linking

The tree is searched with the proceedings input string (case-insensitive, leftmost-longest match), yielding the names that were matched. The lookup map is then used to find the party identifier that corresponds to each matched name. A name is linked unless it has already been recognized as a member's name. A found name of a (single-member) party is also not linked if it is part of a longer name, such as that of a motion or committee.
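The matching step might look roughly like the sketch below. It uses a small pure-Python Aho-Corasick automaton rather than any particular library, and resolves overlaps by sorting the matches afterwards. The party names, identifiers, and the `member_spans` parameter (character spans already recognized as member names) are hypothetical. The check for longer names, such as motions or committees, is left out.

```python
from collections import deque


class AhoCorasick:
    def __init__(self, names):
        self.goto = [{}]   # goto[state][char] -> next state; state 0 is the root
        self.fail = [0]    # failure link of each state
        self.out = [[]]    # names that end at each state
        for name in names:
            self._add(name.lower())
        self._link()

    def _add(self, word):
        state = 0
        for ch in word:
            if ch not in self.goto[state]:
                self.goto[state][ch] = len(self.goto)
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
            state = self.goto[state][ch]
        self.out[state].append(word)

    def _link(self):
        # Breadth-first pass: depth-1 states keep the root as failure link.
        queue = deque(self.goto[0].values())
        while queue:
            state = queue.popleft()
            for ch, nxt in self.goto[state].items():
                queue.append(nxt)
                f = self.fail[state]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(ch, 0)
                # Inherit names that end at the failure state (proper suffixes).
                self.out[nxt] = self.out[nxt] + self.out[self.fail[nxt]]

    def find_all(self, text):
        """Yield (start, end, name) for every match, overlaps included."""
        state = 0
        # Assumes lowercasing does not change the length of the text.
        for i, ch in enumerate(text.lower()):
            while state and ch not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(ch, 0)
            for word in self.out[state]:
                yield (i - len(word) + 1, i + 1, word)


def leftmost_longest(matches):
    """Keep non-overlapping matches, preferring earlier and then longer ones."""
    chosen, last_end = [], 0
    for start, end, word in sorted(matches, key=lambda m: (m[0], m[0] - m[1])):
        if start >= last_end:
            chosen.append((start, end, word))
            last_end = end
    return chosen


def link_parties(text, automaton, name_to_id, member_spans=()):
    """Return (start, end, party_id), skipping spans already taken by members."""
    links = []
    for start, end, word in leftmost_longest(automaton.find_all(text)):
        if any(s < end and start < e for s, e in member_spans):
            continue
        links.append((start, end, name_to_id[word]))
    return links


# Hypothetical names and identifiers.
NAME_TO_ID = {"groene unie": "party-1", "unie": "party-2"}
AUTOMATON = AhoCorasick(NAME_TO_ID)
TEXT = "De Groene Unie en de unie stemmen voor."
LINKS = link_parties(TEXT, AUTOMATON, NAME_TO_ID)
```

In this example the shorter name "unie" also matches inside "Groene Unie", but the leftmost-longest rule keeps only the longer match there.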

aolieman / linking_gov_parl_members.md
Last active Aug 29, 2015
Strategy for linking mentioned Dutch government and parliament members

Strategy for linking mentioned Dutch government and parliament members

Spotting

Mentions of government and parliament members in a text are found with regular expressions. At least three cases can be distinguished here.

  1. A single person is mentioned by name
  2. Multiple persons are mentioned by name
  3. A person is mentioned only by his/her function

In the first two cases, the strategy exploits the highly regular forms of address that are used in parliamentary speech, and therefore also in the proceedings. Some examples (in translation) are: sir, madam, member, colleague, minister, and secretary of state. Such a form of address is immediately followed by the member's last name. The pattern that is used to find these names needs to span as much of the name as possible, without including any subsequent words.
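A pattern for the first case (a single person mentioned by name) could look like the sketch below. The Dutch address terms, the list of name particles, and the example sentence are assumptions made for illustration. The actual pattern is not shown in this text and presumably covers more forms, including the plural addresses needed for the second case.

```python
import re

# Assumed Dutch equivalents of the address styles listed above.
ADDRESS = r"(?i:de heer|mevrouw|het lid|collega|minister|staatssecretaris)"
# Assumed (incomplete) set of name particles: van, de, den, der, te, ten, ter.
PARTICLE = r"(?:[Vv]an|[Dd]e[nr]?|[Tt]e[nr]?)"
# A capitalized word, optionally hyphenated; ASCII letters only in this sketch.
NAME_PART = r"[A-Z][a-z]+(?:-[A-Z][a-z]+)*"

MENTION_RE = re.compile(
    rf"\b(?P<address>{ADDRESS})\s+(?P<name>(?:{PARTICLE}\s+)*{NAME_PART})"
)


def find_mentions(text):
    """Return (address, last name) pairs for singly mentioned persons."""
    return [(m.group("address"), m.group("name")) for m in MENTION_RE.finditer(text)]


# Hypothetical sentence with made-up names.
EXAMPLE = (
    "De heer Van der Berg vraagt mevrouw Jansen om een reactie "
    "van minister De Vries."
)
MENTIONS = find_mentions(EXAMPLE)
```

Only the address group is matched case-insensitively. The name itself stays case-sensitive, so the match stops at the first lowercase word that is not a particle.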

aolieman / 0_reuse_code.js
Last active Aug 29, 2015
Here are some things you can do with Gists in GistBox.
// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console
aolieman / custom_elasticsearch.md
Last active Mar 27, 2019

Extending Haystack's Elasticsearch backend

Haystack provides an interface similar to Django's QuerySet, but one that enables easy querying in one or more popular search backends instead of a database. Because the Haystack SearchQuerySet API is meant to hook up to several search backends, however, not all the functionality of those backends has been implemented in the API. In this article we show how Haystack's Elasticsearch backend can be extended with advanced querying functionality.

As an example use case, we'll focus on implementing Elasticsearch's Nested Query in the SearchQuerySet API, to enable, for example, weighted tags on documents. The usage of this extended API will be shown first, after which we'll go through the necessary implementation steps.
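For context, the raw Elasticsearch request body that such a nested query amounts to can be sketched as a plain Python dict. This is not Haystack's API and not the implementation discussed in this article. The field names (`tags`, `tags.name`, `tags.weight`) and the helper `weighted_tag_query` are hypothetical, and the sketch assumes `tags` is mapped as a nested field.

```python
def weighted_tag_query(tag, path="tags"):
    """Build a request body that scores documents by the weight of a matching tag.

    Assumes each nested tag object has a text field "name" and a numeric
    field "weight" (hypothetical mapping).
    """
    return {
        "query": {
            "nested": {
                "path": path,
                # Combine the scores of all matching nested tag objects.
                "score_mode": "sum",
                "query": {
                    "function_score": {
                        "query": {"match": {f"{path}.name": tag}},
                        # Multiply the match score by the tag's weight.
                        "field_value_factor": {"field": f"{path}.weight"},
                    }
                },
            }
        }
    }


EXAMPLE_BODY = weighted_tag_query("politics")
```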

ConfigurableSearchQuerySet API Usage

import search.custom_elasticsearch as ces
from files import FileObject