git clone https://github.com/nlplab/brat.git
{
  "title": "DeFi Portal 1inch Launches Wallet App on Android",
  "link": "https://www.coindesk.com/business/2022/03/31/defi-portal-1inch-launches-wallet-on-android/?utm_medium=referral&utm_source=rss&utm_campaign=headlines",
  "pub_date": "31 Mar 2022 15:00:00 ",
  "summary": "The move comes nearly one year after the app became available on Apple's iPhone.",
  "image": "https://www.coindesk.com/resizer/QOu3JOV1i6UgnbwUc4nD1hXvaeo=/800x600/cloudfront-us-east-1.images.arcpublishing.com/coindesk/GXWHA5WEUJEFJJGAL44U5IXVKQ.png"
}
{
  "title": "The state of Web3: Community talks about opportunities around the world",
  "link": "https://cointelegraph.com/news/the-state-of-web3-community-talks-about-opportunities-around-the-world",
# Unwrap every attribute from its {"S": value} string wrapper, in place.
# `data` is a list of dicts loaded elsewhere.
for item in data:
    for key in item:
        item[key] = item[key]["S"]
print(data[:2])
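The loop above only works when every attribute is wrapped as `{"S": value}`. As a hedged sketch, the variant below also accepts other single-tag wrappers and converts `"N"` values to numbers. It assumes the records use a typed-attribute layout like the one DynamoDB exports produce; the source does not say where `data` comes from, and both `unwrap` and the sample records are hypothetical.

```python
def unwrap(item):
    """Return a copy of `item` with each {"<type>": value} wrapper replaced by its value."""
    plain = {}
    for key, wrapped in item.items():
        # Assumption: each wrapper holds exactly one type tag, e.g. {"S": "text"}.
        (type_tag, value), = wrapped.items()
        if type_tag == "N":
            # Assumption: numbers are serialized as strings.
            value = float(value) if "." in value else int(value)
        plain[key] = value
    return plain


# Hypothetical sample records, for illustration only.
sample = [
    {"title": {"S": "Example headline"}, "score": {"N": "3"}},
    {"title": {"S": "Another headline"}, "score": {"N": "4.5"}},
]
records = [unwrap(item) for item in sample]
print(records)
```

Building new dicts instead of mutating `data` keeps the raw records available for debugging.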
# Convert exported annotations into (start, end, LABEL) entity tuples per text.
# `data` is the parsed annotation export, loaded elsewhere.
training_data = {'classes': ['MEDICINE', 'MEDICALCONDITION', 'PATHOGEN'], 'annotations': []}
for example in data['examples']:
    temp_dict = {}
    temp_dict['text'] = example['content']
    temp_dict['entities'] = []
    for annotation in example['annotations']:
        start = annotation['start']
        end = annotation['end']
        label = annotation['tag_name'].upper()
        temp_dict['entities'].append((start, end, label))
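The snippet above ends inside the inner loop, so it never shows `temp_dict` being stored. A minimal self-contained sketch of the whole conversion follows. The final append into `training_data['annotations']` is an assumption about the missing step, and `to_training_data` and the sample export are hypothetical.

```python
def to_training_data(data, classes):
    """Collect each example's text with its (start, end, LABEL) entity spans."""
    training_data = {"classes": list(classes), "annotations": []}
    for example in data["examples"]:
        entities = [
            (ann["start"], ann["end"], ann["tag_name"].upper())
            for ann in example["annotations"]
        ]
        # Assumed final step: store the text together with its entity spans.
        training_data["annotations"].append(
            {"text": example["content"], "entities": entities}
        )
    return training_data


# Hypothetical export with one annotated sentence.
sample = {
    "examples": [
        {
            "content": "Aspirin treats headaches.",
            "annotations": [
                {"start": 0, "end": 7, "tag_name": "Medicine"},
                {"start": 15, "end": 24, "tag_name": "MedicalCondition"},
            ],
        }
    ]
}
result = to_training_data(sample, ["MEDICINE", "MEDICALCONDITION", "PATHOGEN"])
print(result["annotations"][0])
```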
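# Register company names and ticker symbols as phrase patterns.
# `nlp`, `matcher` (presumably a spaCy PhraseMatcher), `names`, and `data` are defined elsewhere.
patterns = [nlp.make_doc(name) for name in names]
matcher.add("COMPANY", patterns)
patterns = [nlp.make_doc(symbol) for symbol in data['Symbol']]
matcher.add("SYMBOL", patterns)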
from newscatcher import describe_url

# Print what newscatcher knows about each site.
websites = ['nytimes.com', 'cronachediordinariorazzismo.org', 'libertaegiustizia.it']
for website in websites:
    print(describe_url(website))
import gensim

def tagged_document(list_of_list_of_words):
    # Tag each tokenized document with its index so Doc2Vec can identify it.
    for i, list_of_words in enumerate(list_of_list_of_words):
        yield gensim.models.doc2vec.TaggedDocument(list_of_words, [i])

# `data` is a list of token lists, prepared elsewhere.
training_data = list(tagged_document(data))
model = gensim.models.doc2vec.Doc2Vec(vector_size=40, min_count=2, epochs=30)
model.build_vocab(training_data)
model.train(training_data, total_examples=model.corpus_count, epochs=model.epochs)
# `nlp` and the get_* helpers are defined elsewhere.
def dividend_info(article):
    headline = nlp(article['title'])
    # Only headlines that mention a date are treated as dividend announcements.
    if 'date' in [token.text.lower() for token in headline]:
        date = get_date(headline)
        if date:
            org = get_org(headline)
            ticker = get_ticker(headline)
            summary = nlp(article['summary'])  # parse the summary once
            amount = get_amount_summary(summary)
            pay_date = get_pay_date(summary)
            print("HEADLINE: " + article['title'])
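import os

# Rename every file in the current directory to 0.ext, 1.ext, ... keeping its extension.
count = 0
for name in sorted(os.listdir()):
    if not os.path.isfile(name):
        continue  # leave directories untouched
    ext = os.path.splitext(name)[1]  # includes the leading dot; empty if there is none
    os.rename(name, str(count) + ext)
    count += 1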
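Renaming in a single pass can overwrite a file whose name already looks like a target, for example an existing `0.txt`. One way to avoid that, sketched here under the assumption that sequential numbering in sorted order is the goal, is to rename in two phases through temporary names. `renumber_files` and the temporary-name prefix are hypothetical, not part of the original snippet.

```python
import uuid
from pathlib import Path


def renumber_files(directory):
    """Rename regular files in `directory` to 0.ext, 1.ext, ... and return the new names."""
    directory = Path(directory)
    files = sorted(p for p in directory.iterdir() if p.is_file())

    # Phase 1: move every file to a unique temporary name, so a target such as
    # "0.txt" can never overwrite a file that has not been processed yet.
    staged = []
    for path in files:
        temp = directory / f".renumber-{uuid.uuid4().hex}"
        path.rename(temp)
        staged.append((temp, path.suffix))

    # Phase 2: assign the final sequential names, keeping each original extension.
    new_names = []
    for index, (temp, suffix) in enumerate(staged):
        target = directory / f"{index}{suffix}"
        temp.rename(target)
        new_names.append(target.name)
    return new_names


# Example call (the path is a placeholder):
# renumber_files("path/to/files")
```

Taking the directory as an argument also keeps the script itself from being renamed when it lives in the working directory.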