Linus Kohl linuskohl

@linuskohl
linuskohl / dataset.json
Created April 13, 2024 08:50
Staedel Museum dataset
[
{
"identifier": "sg1153",
"title": "The Rose Lover",
"subjects": [
"adult man",
"walking, hiking (recreation)",
"peeping, voyeur",
"flowers: rose",
"couple of lovers",
# Create evaluation DataFrame containing BIOSSES pairings and additional information for evaluation
evaluation = pd.DataFrame(biosses_meta.loc[:,['Text1', 'Text2', 'Avg', 'Var']])
# Add CUI information
evaluation['Text1_CUIs'] = evaluation['Text1'].apply(lambda x: biosses_texts.loc[x,'UMLS_CUIs'])
evaluation['Text2_CUIs'] = evaluation['Text2'].apply(lambda x: biosses_texts.loc[x,'UMLS_CUIs'])
# Add UMLS terms
evaluation['Text1_UMLS_TERMS'] = evaluation['Text1'].apply(lambda x: biosses_texts.loc[x,'UMLS_Terms'])
evaluation['Text2_UMLS_TERMS'] = evaluation['Text2'].apply(lambda x: biosses_texts.loc[x,'UMLS_Terms'])
# Add texts for evaluation purposes
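# (done last, because this overwrites the text IDs used for the lookups above)
evaluation['Text1'] = evaluation['Text1'].apply(lambda x: biosses_texts.loc[x,'Text'])
evaluation['Text2'] = evaluation['Text2'].apply(lambda x: biosses_texts.loc[x,'Text'])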
# Load similarities
cui_similarities = pd.read_csv("cui_pairings_out.csv", header=None, names=["cui_0","cui_1","lch","path","wup"])
# Build index for faster access
# Mirror each pair so that (a, b) and (b, a) both resolve
cui_similarities_reverse = cui_similarities.rename(columns={"cui_0": "cui_1", "cui_1": "cui_0"})
cui_table = pd.concat([cui_similarities, cui_similarities_reverse], sort=False)
# Index by CUI pair and sort once across both levels
cui_table = cui_table.set_index(["cui_0", "cui_1"]).sort_index()
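A minimal sketch of how a symmetric pair index like the one above might be queried, assuming pandas is installed. The CUI codes and scores are invented placeholders rather than rows from `cui_pairings_out.csv`, and `lookup_similarity` is a hypothetical helper that does not appear in the original gist.

```python
import pandas as pd

# Placeholder rows standing in for cui_pairings_out.csv (all values invented).
cui_similarities = pd.DataFrame(
    [["C0000001", "C0000002", 1.5, 0.25, 0.8],
     ["C0000001", "C0000003", 1.1, 0.125, 0.6]],
    columns=["cui_0", "cui_1", "lch", "path", "wup"],
)

# Mirror every pair so that (a, b) and (b, a) both resolve.
reverse = cui_similarities.rename(columns={"cui_0": "cui_1", "cui_1": "cui_0"})
cui_table = (
    pd.concat([cui_similarities, reverse], sort=False)
    .set_index(["cui_0", "cui_1"])
    .sort_index()
)


def lookup_similarity(table, cui_a, cui_b, measure="wup"):
    """Return the stored score for a CUI pair, or None if the pair is absent.

    Assumes each ordered pair occurs at most once in the index.
    """
    try:
        return float(table.loc[(cui_a, cui_b), measure])
    except KeyError:
        return None


# The pair was stored as (C0000001, C0000002) but resolves in reverse order too.
print(lookup_similarity(cui_table, "C0000002", "C0000001"))
```

Doubling the table costs memory but keeps every lookup a single indexed access, with no need to normalize the order of the two CUIs first.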
import io
import os
import string
import csv
import xml
import re
import unicodedata
import itertools
import requests
from functools import partial
@linuskohl
linuskohl / biosses_texts.csv
Created June 26, 2020 19:23
Gizem Soğancıoğlu, Hakime Öztürk, Arzucan Özgür; BIOSSES: a semantic sentence similarity estimation system for the biomedical domain. Bioinformatics 2017; 33 (14): i49-i58. doi: 10.1093/bioinformatics/btx238
Id,Text
0,It has recently been shown that Craf is essential for Kras G12D-induced NSCLC.
1,"The Bcl-2 inhibitor ABT-737 induces regression of solid tumors and its derivatives are in the early clinical phase as cancer therapeutics; however, it targets Bcl-2, Bcl-XL, and Bcl-w, but not Mcl-1, which induces resistance against apoptotic cell death triggered by ABT-737."
2,Previous studies demonstrated that the decrease level of 5 hmC in tumors was due to the reduced expression of TET1/2/3 and IDH2 genes or tumor derived IDH1 and IDH2 mutations.
3,"More recently, IDH mutations and resultant 2-hydroxyglutarate (2HG) production in leukemia cells were reported to induce global DNA hypermethylation through impaired TET2 catalytic function."
4,Recent in vitro studies using shRNA-based approaches have suggested a role for TET2 in regulating myeloid differentiation and in regulating stem/progenitor cell proliferation.
5,"Recently, it was reported that expression of IDH1R132H suppresses TET2 activity and the mutations of
@linuskohl
linuskohl / biosses_meta.csv
Last active June 26, 2020 19:24
Gizem Soğancıoğlu, Hakime Öztürk, Arzucan Özgür; BIOSSES: a semantic sentence similarity estimation system for the biomedical domain. Bioinformatics 2017; 33 (14): i49-i58. doi: 10.1093/bioinformatics/btx238
Id,Text1,Text2,Annotator A,Annotator B,Annotator C,Annotator D,Annotator E,Avg,Var
1,0,94,4,4,4,4,4,4,0
2,1,95,3,3,3,3,3,3,0
3,2,96,2,2,3,2,2,2.2,0.2
4,3,97,3,3,4,3,3,3.2,0.2
5,4,98,3,3,4,3,3,3.2,0.2
6,5,99,3,3,4,3,3,3.2,0.2
7,6,100,1,1,3,2,1,1.6,0.8
8,7,101,3,3,3,3,3,3,0
9,8,102,2,1,1,2,1,1.4,0.3
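The `Avg` and `Var` columns in the rows above appear to be the mean and the sample variance (divisor n - 1) of the five annotator scores. The short check below recomputes them for three of the listed rows using only the standard library; `summarize` is a hypothetical helper name, and whether the dataset authors computed the columns this way is an assumption.

```python
from statistics import mean, variance

# Annotator scores (A to E) copied from three rows of the table above, keyed by Id.
rows = {
    3: [2, 2, 3, 2, 2],
    7: [1, 1, 3, 2, 1],
    9: [2, 1, 1, 2, 1],
}


def summarize(scores):
    """Return (mean, sample variance), each rounded to one decimal place."""
    return round(mean(scores), 1), round(variance(scores), 1)


for row_id, scores in rows.items():
    print(row_id, summarize(scores))
```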
#!/usr/bin/perl
#
# Author: Linus Kohl
# E-Mail: kohl@munichresearch.com
# Org: MunichResearch
#
use strict;
use UMLS::Interface;
use UMLS::Similarity::lch;
use UMLS::Similarity::path;
@linuskohl
linuskohl / extract_links.js
Created April 9, 2020 09:30
Extract list of links by class in jQuery
@linuskohl
linuskohl / jwt_helpers.py
Last active December 30, 2022 10:10
Helper functions to validate JSON Web Tokens for flask RESTful APIs by fetching JWKs from OpenID Provider Metadata. Used with Okta.
from functools import wraps
from flask import request, abort, g
import json
import jwt
import requests
from typing import Union, List
from ..config import cache
from ..env import JWT_ISSUER, JWT_CLIENTID, JWT_AUDIENCE
DISCOVERY_URL = "/.well-known/oauth-authorization-server"
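The preview stops before any helper is defined. As a rough sketch of the key-lookup step the description mentions, the code below reads the `jwks_uri` field from the issuer's metadata document and picks the JWK whose `kid` matches the token header. `token_kid`, `find_signing_key` and the injected `fetch_json` callable are hypothetical names, not taken from the gist, and signature verification itself (for example with a JWT library) is deliberately left out.

```python
import base64
import json

DISCOVERY_URL = "/.well-known/oauth-authorization-server"


def token_kid(token):
    """Read the `kid` from a JWT header WITHOUT verifying the signature."""
    header_b64 = token.split(".")[0]
    # base64url segments in JWTs are unpadded; restore the padding first.
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded)).get("kid")


def find_signing_key(issuer, token, fetch_json):
    """Return the JWK matching the token's `kid`, or None if none matches.

    `fetch_json` is a hypothetical callable (url -> dict); in practice it
    could be a cached wrapper around an HTTP GET that parses the JSON body.
    """
    metadata = fetch_json(issuer.rstrip("/") + DISCOVERY_URL)
    jwks = fetch_json(metadata["jwks_uri"])
    kid = token_kid(token)
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    return None
```

Passing the fetcher in as an argument keeps the lookup testable without network access and leaves room for caching the metadata and key set between requests.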
@linuskohl
linuskohl / hsozkult.py
Created August 20, 2019 16:01
Tiny script to send notification emails on new openings on H-Soz-Kult
#!/usr/bin/env python
# coding: utf-8
import os
import feedparser
import requests
import sqlite3
from sqlite3 import IntegrityError
from lxml import html
import lxml
from string import Template
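The imports of `sqlite3` and `IntegrityError` suggest that already-seen postings are tracked in a small database, although the preview ends before that logic appears. The sketch below shows one common way to do this: a primary-key column rejects duplicate links, so a failed insert signals an entry that was seen before. The table name `seen` and the helpers `open_store` and `is_new` are assumptions for illustration only.

```python
import sqlite3
from sqlite3 import IntegrityError


def open_store(path=":memory:"):
    """Open (or create) the store of links that were already reported."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS seen (link TEXT PRIMARY KEY)")
    return conn


def is_new(conn, link):
    """Record `link`; return True only the first time it is seen."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO seen (link) VALUES (?)", (link,))
        return True
    except IntegrityError:
        # The primary key already holds this link: not a new posting.
        return False


conn = open_store()
for link in ["https://example.org/job/1", "https://example.org/job/1"]:
    print(link, is_new(conn, link))
```

A notification would then be sent only for entries where `is_new` returns True, which keeps repeated runs of the script from mailing the same posting twice.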