This file has been truncated, but you can view the full file.
[
  {
    "publish_date": "2014-07-31",
    "title": "Did Coulson\u2019s News of the World Incite Others to Commit Crimes and Cause Unsafe Convictions?",
    "url": "https://www.bellingcat.com/news/uk-and-europe/2014/07/31/did-coulsons-news-of-the-world-incite-others-to-commit-crimes-and-cause-unsafe-convictions/",
    "articles_text": "\n\nMore on the Fake Sheikh, the Police, and News of the World by occasional blogger @jpublik.\n\nAndy Coulson\u2018s News of the World sent a man to jail after luring him to sell them drugs he was terrified of carrying by promising him a job. He was sentenced to four years in prison before his conviction was quashed \u2013 after he\u2019d already served his time.\n\nIn a case which has hardly received any publicity, according to high court documents, Albanian Besnik Qema was asked to supply News of the World cocaine and a passport on a promise of job as security for a wealthy Arab family.\n\nThe High Court documents detail how in January 2005, Mazher Mahmood had asked Florim
year,month,url,path,articles_text,publish_date,title
2014,7,https://www.bellingcat.com/news/uk-and-europe/2014/07/31/did-coulsons-news-of-the-world-incite-others-to-commit-crimes-and-cause-unsafe-convictions/,/news/uk-and-europe/2014/07/31/did-coulsons-news-of-the-world-incite-others-to-commit-crimes-and-cause-unsafe-convictions/,"
More on the Fake Sheikh, the Police, and News of the World by occasional blogger @jpublik.
Andy Coulson‘s News of the World sent a man to jail after luring him to sell them drugs he was terrified of carrying by promising him a job. He was sentenced to four years in prison before his conviction was quashed – after he’d already served his time.
In a case which has hardly received any publicity, according to high court documents, Albanian Besnik Qema was asked to supply News of the World cocaine and a passport on a promise of job as security for a wealthy Arab family.
The High Court documents detail how in January 2005, Mazher Mahmood had asked Florim Gashi, a contact of his who h
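The `year`, `month` and `path` columns in the CSV header above appear to be derivable from the `url` column. The following is a minimal sketch of that derivation, assuming the site's URLs follow a `/YYYY/MM/DD/slug/` layout as in the sample row; the function name `url_to_row` is hypothetical and does not come from the original scraper.

```python
import re
from urllib.parse import urlparse


def url_to_row(url):
    """Split an article URL into year, month and path fields (hypothetical helper)."""
    path = urlparse(url).path
    # Assumption: the path contains a /YYYY/MM/ date segment, as in the sample row.
    match = re.search(r"/(\d{4})/(\d{2})/", path)
    if match is None:
        raise ValueError(f"no /YYYY/MM/ segment found in {url!r}")
    return {
        "year": int(match.group(1)),
        "month": int(match.group(2)),
        "url": url,
        "path": path,
    }


row = url_to_row(
    "https://www.bellingcat.com/news/uk-and-europe/2014/07/31/"
    "did-coulsons-news-of-the-world-incite-others-to-commit-crimes-"
    "and-cause-unsafe-convictions/"
)
print(row["year"], row["month"], row["path"])
```

Raising on a missing date segment, rather than returning empty fields, keeps malformed URLs from silently entering the dataset.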
from dataclasses import dataclass
from functools import reduce

import pandas as pd
import requests
from bs4 import BeautifulSoup
from newspaper import Article

BASE_URL = "https://www.bellingcat.com"
BELLINGCAT_START_YEAR = 2014  # earliest article on site
class LanguageModelMulti(Module):
    """
    Deepens the model by stacking n_layers recurrent layers with the built-in nn.RNN class, aiming for better accuracy
    """
    def __init__(self, vocab_sz, n_hidden, n_layers):
        self.i_h = nn.Embedding(vocab_sz, n_hidden)
        # Creates a stacked RNN with n_layers layers
        self.rnn = nn.RNN(n_hidden, n_hidden, n_layers, batch_first=True)
        self.h_o = nn.Linear(n_hidden, vocab_sz)
        # Creates zeros for all layers
class LanguageModelRecurrentState(Module):
    """
    State is saved by moving the reset to the init method
    Gradients are detached for all but the last 3 layers (time steps)
    """
    def __init__(self, vocab_sz, n_hidden):
        self.i_h = nn.Embedding(vocab_sz, n_hidden)
        self.h_h = nn.Linear(n_hidden, n_hidden)
        self.h_o = nn.Linear(n_hidden, vocab_sz)
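The docstring above describes keeping the hidden state between calls while detaching its gradient history. The following is a minimal sketch of that idea, assuming plain PyTorch (`torch.nn.Module` instead of the fastai `Module`) and assuming the truncated class continues with a stored state `self.h`, a `forward` loop and a `reset` method; none of these are visible in the listing, and the class name `StatefulSketch` is hypothetical.

```python
import torch
import torch.nn.functional as F
from torch import nn


class StatefulSketch(nn.Module):
    """Hypothetical completion: hidden state persists across forward calls."""

    def __init__(self, vocab_sz, n_hidden):
        super().__init__()
        self.i_h = nn.Embedding(vocab_sz, n_hidden)
        self.h_h = nn.Linear(n_hidden, n_hidden)
        self.h_o = nn.Linear(n_hidden, vocab_sz)
        self.h = 0  # state is initialised once here, not on every forward call

    def forward(self, x):
        # x holds token indices with shape (batch, sequence_length)
        for i in range(x.shape[1]):
            self.h = self.h + self.i_h(x[:, i])
            self.h = F.relu(self.h_h(self.h))
        out = self.h_o(self.h)
        # Keep the values but drop the gradient history, so backpropagation
        # only covers the steps taken in this call.
        self.h = self.h.detach()
        return out

    def reset(self):
        self.h = 0


torch.manual_seed(0)
model = StatefulSketch(vocab_sz=10, n_hidden=4)
batch = torch.randint(0, 10, (2, 3))  # 2 sequences of 3 token indices
first = model(batch)
second = model(batch)  # starts from the state left by the first call
print(first.shape, model.h.requires_grad)
```

Calling `model.reset()` at the start of each epoch or validation pass would clear the stored state, which is the usual reason to expose such a method.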
class LanguageModel(Module):
    """
    Takes three words as input and returns a prediction for the next word
    The 1st layer will use the 1st word's embedding
    The 2nd layer will use the 2nd word's embedding plus the 1st word's output activations
    The 3rd layer will use the 3rd word's embedding plus the 2nd word's output activations
    """
    def __init__(self, vocab_sz, n_hidden):
        self.i_h = nn.Embedding(vocab_sz, n_hidden)  # Converts the indices to a vector
        self.h_h = nn.Linear(n_hidden, n_hidden)  # Creates the activations for the successive word
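The three steps in the docstring above can be sketched as an explicit forward pass. This is a minimal sketch, assuming plain PyTorch (`torch.nn.Module` rather than the fastai `Module`); the output layer `h_o`, the ReLU activation and the class name `ThreeWordSketch` are assumptions, since the listing is cut off before the rest of the class.

```python
import torch
import torch.nn.functional as F
from torch import nn


class ThreeWordSketch(nn.Module):
    """Hypothetical completion: predicts the next word from three input words."""

    def __init__(self, vocab_sz, n_hidden):
        super().__init__()
        self.i_h = nn.Embedding(vocab_sz, n_hidden)  # word index -> vector
        self.h_h = nn.Linear(n_hidden, n_hidden)     # hidden -> hidden
        self.h_o = nn.Linear(n_hidden, vocab_sz)     # hidden -> vocabulary scores (assumed)

    def forward(self, x):
        # 1st layer: the 1st word's embedding only
        h = F.relu(self.h_h(self.i_h(x[:, 0])))
        # 2nd layer: the 2nd word's embedding plus the 1st layer's activations
        h = F.relu(self.h_h(h + self.i_h(x[:, 1])))
        # 3rd layer: the 3rd word's embedding plus the 2nd layer's activations
        h = F.relu(self.h_h(h + self.i_h(x[:, 2])))
        return self.h_o(h)


torch.manual_seed(0)
sketch = ThreeWordSketch(vocab_sz=10, n_hidden=4)
words = torch.randint(0, 10, (2, 3))  # 2 examples, 3 word indices each
scores = sketch(words)
print(scores.shape)
```

The same `h_h` layer is reused at every step, which is what lets the three explicit steps later collapse into a loop.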
welcome back here we go again great to
see you and congratulations
thank you you will never forget what is
going on in the world when you think
about when your child is born you will
from sklearn.feature_extraction.text import CountVectorizer


def parse_txt(txt_file):
    """
    Takes the path to a text file and returns a list with one element per line in the file
    """
    with open(txt_file, "r") as f:
        # Reads the file and splits it into a list of lines without newline characters
        words = f.read().splitlines()
    # Removes None elements
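The parsing step above can be illustrated without touching the file system. This is a minimal sketch, assuming that "None elements" refers to empty lines and using `collections.Counter` as a simple stand-in for the word counting that `CountVectorizer` would do (its tokenisation rules differ); the names `parse_lines` and `count_words` are hypothetical.

```python
from collections import Counter


def parse_lines(text):
    """Return the non-empty lines of a transcript as a list of strings."""
    # Assumption: "None elements" in the original means empty or blank lines.
    return [line.strip() for line in text.splitlines() if line.strip()]


def count_words(lines):
    """Count whitespace-separated words across all lines (stand-in for CountVectorizer)."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


sample = (
    "welcome back here we go again great to\n"
    "see you and congratulations\n"
    "\n"
    "thank you you will never forget what is\n"
)
lines = parse_lines(sample)
counts = count_words(lines)
print(len(lines), counts.most_common(1))
```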
from fastai import *
from fastbook import *


def create_params(size):
    """
    Pass tensor shape
    Returns model parameters initialised from a normal distribution (mean 0, std 0.01)
    """
    return nn.Parameter(torch.zeros(*size).normal_(0, 0.01))