Guilhem Piat gpiat

## 01_Prolog_Intro.pl
/* Created by Guilhem Piat, 2022
View the interactive notebook version of this guide here:
https://swish.swi-prolog.org/p/intro_to_pl.swinb

        ##### INTRODUCTION TO PROLOG FOR CSPs #####

Prolog is a logic programming language that is typically meant to
solve logic problems, but can also easily be used as a CSP solver.

To use Prolog, you may use the online editor/console SWISH:

## aircc.cls
% ============================================================================
%% aircc.cls V 1.2, 2012/09/13, (c) 2012 Thomas Zink
%%  minor modifications by Artem Melentyev, 2014 and Guilhem Piat, 2022.
%%
%% This is an unofficial LaTeX class for authors of AIRCC papers.
%% It tries to follow the formatting guidelines set in the official
%% template "aircc_template.doc" as closely as possible.
%% Unfortunately, some are not easily applicable in LaTeX. Examples include
%% text style combinations like bold italicized small caps, which are
%% generally not supported. Some font sizes are also not directly supported.

## .ater_blacklist
0020087J
0141408E
0141720U
0170145R
0171463Y
0211237F
0211139Z
0290119X
0290346U
0310152X

## get_books_and_wiki_corpora.md

      
Last active November 19, 2021 15:52

This document describes how I acquire plaintext versions of the books and wikipedia corpora.

How to: get books corpus and wikipedia corpus

As specified by the authors, the books corpus needs to be downloaded from Smashwords. However, there is no easy download option; it apparently has to be scraped.
The Wikipedia dataset can be downloaded from Wikimedia, but only as XML.
Hugging Face makes both datasets available, which makes them easier to acquire.
The steps are as follows:
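
One possible way to carry out the Hugging Face route in Python is sketched below. It is not the author's procedure: the helper name `dump_plaintext` is hypothetical, and the assumption that each record is a mapping with a `text` field should be checked against the dataset actually used. The stand-in records let the sketch run offline.

```python
import os
import tempfile


def dump_plaintext(records, out_path, field="text"):
    """Write the `field` of every record to `out_path`, one record per line.

    `records` is any iterable of mappings. The field name "text" is an
    assumption about the dataset schema, not something the source states.
    Returns the number of records written.
    """
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for record in records:
            out.write(record[field].strip() + "\n")
            count += 1
    return count


# Stand-in records (illustrative only) so the sketch runs without a download.
sample = [{"text": "first line "}, {"text": "second line"}]
tmp_path = os.path.join(tempfile.mkdtemp(), "corpus.txt")
written = dump_plaintext(sample, tmp_path)
```

With the `datasets` library installed, `records` could instead be the object returned by `datasets.load_dataset(...)` for the corpus in question; the dataset identifier and split to pass are assumptions to verify on the Hugging Face Hub.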

  
## IOB2_to_IOBES.py
from sys import argv
from os import path


def parse_line(line):
    if line is None:
        return None, None, None
    semtype = None
    tok, tag = line
    if tag == 'O':
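
For reference, the IOB2 to IOBES conversion that the file name refers to can be sketched as follows. This is a minimal illustration, not the script's actual implementation: the function name `iob2_to_iobes` is hypothetical, and it assumes one sentence is given as a list of tags of the form `O`, `B-LABEL` or `I-LABEL`.

```python
def iob2_to_iobes(tags):
    """Convert one sentence of IOB2 tags to IOBES tags.

    Assumes well-formed IOB2 input: 'O', 'B-<label>' or 'I-<label>'.
    """
    converted = []
    for i, tag in enumerate(tags):
        if tag == 'O':
            converted.append(tag)
            continue
        prefix, _, label = tag.partition('-')
        next_tag = tags[i + 1] if i + 1 < len(tags) else 'O'
        # The entity continues only if the next tag is I- with the same label.
        continues = next_tag == 'I-' + label
        if prefix == 'B':
            # A B- tag with no continuation is a single-token entity.
            converted.append(('B-' if continues else 'S-') + label)
        elif prefix == 'I':
            # An I- tag with no continuation closes the entity.
            converted.append(('I-' if continues else 'E-') + label)
        else:
            raise ValueError(f"not an IOB2 tag: {tag!r}")
    return converted


example = iob2_to_iobes(
    ['B-PER', 'I-PER', 'O', 'B-LOC', 'B-ORG', 'I-ORG', 'I-ORG']
)
print(example)
# ['B-PER', 'E-PER', 'O', 'S-LOC', 'B-ORG', 'I-ORG', 'E-ORG']
```

Only the tag that follows is inspected, so the conversion runs in a single left-to-right pass over the sentence.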

## interactive_BELT_training.ipynb

      
Last active November 19, 2020 16:06

Jupyter notebook for interactively training a BELT model. The necessary pickle files can be found in the comments.
    
## huggingface_ner_germeval_preprocess.sh
#!/bin/bash

# find the tsv files here: https://drive.google.com/drive/folders/1kC0I2UGl2ltrluI9NqDjaQJGw5iliw_J

grep -v "^#" NER-de-train.tsv | cut -f 2,3 | tr '\t' ' ' > train.txt.tmp
grep -v "^#" NER-de-dev.tsv | cut -f 2,3 | tr '\t' ' ' > dev.txt.tmp
grep -v "^#" NER-de-test.tsv | cut -f 2,3 | tr '\t' ' ' > test.txt.tmp
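
To make explicit what the `grep | cut | tr` pipeline above does to each line, here is a rough Python equivalent. It is a sketch only: the function name `preprocess_lines` is hypothetical and the sample rows are made up for illustration, not taken from the GermEval files.

```python
def preprocess_lines(lines):
    """Mimic `grep -v "^#" | cut -f 2,3 | tr '\\t' ' '` on an iterable of lines."""
    out = []
    for line in lines:
        line = line.rstrip('\n')
        if line.startswith('#'):
            # grep -v "^#": drop comment lines.
            continue
        if '\t' not in line:
            # cut prints lines without a delimiter unchanged (e.g. blank
            # lines separating sentences).
            out.append(line)
            continue
        fields = line.split('\t')
        # cut -f 2,3 keeps the second and third columns; tr turns the
        # remaining tab into a space.
        out.append(' '.join(fields[1:3]))
    return out


# Illustrative rows only: index, token, outer tag, inner tag.
sample = [
    "#\tsome source line\n",
    "1\tSchartau\tB-PER\tO\n",
    "2\tsagte\tO\tO\n",
    "\n",
]
result = preprocess_lines(sample)
print(result)
# ['Schartau B-PER', 'sagte O', '']
```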

## convert_PubTator_BIO.py
import os
import pickle
import sys


def convert_targets(mode, targets):
    if mode == 'bin':
        return ['O' if t is None else 'M' for t in targets]
    elif mode == 'cuid':
        return ['O' if t is None else t[0] for t in targets]