Jiho Noh romanegloo

  • University of Kentucky
  • Lexington, KY
@romanegloo
romanegloo / BioASQ_questions_6b.txt
Created January 24, 2019 14:36
All the questions that appeared in the BioASQ 6b training data
"Is Hirschsprung disease a mendelian or a multifactorial disorder?"
"List signaling molecules (ligands) that interact with the receptor EGFR?"
"Is the protein Papilin secreted?"
"Are long non coding RNAs spliced?"
"Is RANKL secreted from the cells?"
"Does metformin interfere thyroxine absorption?"
"Which miRNAs could be used as potential biomarkers for epithelial ovarian cancer?"
"Which acetylcholinesterase inhibitors are used for treatment of myasthenia gravis?"
"Has Denosumab (Prolia) been approved by FDA?"
"List the human genes encoding for the dishevelled proteins?"

Lab 03

Schedule

  • Project 1 is due by Feb 8
  • Lab 2 submission is due tonight

Tips for Lab 03

How to print a floating-point number with fixed precision

e.g., 41.4451 printed with two decimal places is 41.45
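A minimal sketch of one way to do this in C++, assuming the lab uses the standard stream manipulators `std::fixed` and `std::setprecision` from `<iomanip>`. The helper name `format_fixed` is hypothetical and is not part of the lab handout.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not from the lab handout): format a value with a
// fixed number of digits after the decimal point.
std::string format_fixed(double value, int digits) {
    std::ostringstream out;
    // std::fixed switches to fixed-point notation; std::setprecision then
    // controls the number of digits printed after the decimal point.
    out << std::fixed << std::setprecision(digits) << value;
    return out.str();
}
```

The same two manipulators work directly on `std::cout`, for example `std::cout << std::fixed << std::setprecision(2) << 41.4451;`.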

romanegloo / installation.md
Created February 2, 2019 03:41
PyLucene Installation

On Mac OS

Java

Both Oracle Java 1.8 and Apple's Java 1.6 are required. You can download Oracle Java 1.8 from here.

For Apple's Java 1.6, download the package from here and install it.

romanegloo / lucene_indexing.py
Created February 2, 2019 17:42
Indexing using PyLucene, example code
'''
Indexing using PyLucene, example code
'''
import os
from pathlib import Path
import lucene
from java.nio.file import Paths
from org.apache.lucene.analysis.standard import StandardAnalyzer
romanegloo / ukycs215_lab4.md
Last active February 7, 2019 14:01
cs215 lab note 4

Lab04

Objectives

  • read input temperatures by day
  • write a summary to an output file

from tempin.txt

01/24/2019 5 22 27 31 26 19
01/25/2019 8 20 25 30 35 40 38 32 29
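The two objectives can be sketched in C++ as follows. This assumes each line of `tempin.txt` holds a date, then a count, then that many integer temperatures (the sample lines above fit that pattern). The choice of low, high, and average as the summary, the output layout, and the names `DaySummary`, `summarize_line`, and `summarize` are all assumptions made for illustration, not the lab's required format.

```cpp
#include <algorithm>
#include <iomanip>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical record for one day's summary (field choice is an assumption).
struct DaySummary {
    std::string date;
    int low = 0;
    int high = 0;
    double average = 0.0;
};

// Parse one line of the assumed form "date count t1 ... tcount".
// Returns false if the line is empty or malformed.
bool summarize_line(const std::string& line, DaySummary& out) {
    std::istringstream in(line);
    int count = 0;
    if (!(in >> out.date >> count) || count <= 0) return false;
    int sum = 0;
    for (int i = 0; i < count; ++i) {
        int t = 0;
        if (!(in >> t)) return false;  // fewer readings than announced
        if (i == 0) out.low = out.high = t;
        out.low = std::min(out.low, t);
        out.high = std::max(out.high, t);
        sum += t;
    }
    out.average = static_cast<double>(sum) / count;
    return true;
}

// Read every line from `in` and write one summary line per valid day to `out`.
void summarize(std::istream& in, std::ostream& out) {
    std::string line;
    DaySummary day;
    while (std::getline(in, line)) {
        if (summarize_line(line, day)) {
            out << day.date << " low " << day.low << " high " << day.high
                << " avg " << std::fixed << std::setprecision(1)
                << day.average << '\n';
        }
    }
}
```

Because `summarize` takes generic streams, it can be called with a `std::ifstream` opened on `tempin.txt` and a `std::ofstream` opened on whatever output file the lab specifies.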
romanegloo / ukycs215_lab5.md
Created February 13, 2019 20:27
lab05_note

Lab 05

Lab 05 instruction

Easiest so far!!!

Checklist

  • menu option is single char
  • use do-while for the menu loop
romanegloo / ukycs215_lab7.md
Last active March 7, 2019 16:43
lab7 note
romanegloo / ukycs215_lab8.md
Last active March 29, 2019 14:23
lab8 note

Lab 08

instruction note (lab8)

Lab 9 is canceled (lab 8 is extended by a week)

Reviews

  1. Include contact.h from contactList.h, using double quotation marks: #include "contact.h"
  2. #pragma once tells the preprocessor to include the header file only once per translation unit
import csv
import sqlite3
eval_file = "data/eval/MayoSRS_mesh.csv"
db_file = "data/pubtator/pubtator-20190725-6496be10.db"
words = []
with open(eval_file) as f:
    csv_reader = csv.DictReader(f)
    for row in csv_reader:
romanegloo / script_gen_pubtator_db.py
Created August 30, 2019 13:58
A script that reads MeSH descriptors and PubTator document data from data files and creates an SQLite database to store the encoded documents for later use in training. (deprecated)
#!/usr/bin/env python3
"""Preprocess PubTator corpus and ScopeNotes of MeSH descriptors for language
model training (LmBMET).
1) Given the original PubTator bioconcept-annotated documents, this interpolates
the concept codes into the document texts. Before that, it counts word
frequencies and generates a vocabulary that includes the entire set of
bioconcepts (MeSH in particular). If a pre-trained embeddings file
(.vec) is provided, the vocabulary is obtained from the embeddings instead.