Matei Suica mateisuica

// Kotlin
var bob: User? = null
...
fun printBob() {
    someOutput(bob?.name)
    someOutput(bob?.surname)
    if (bob?.age == 5) { // This works: `==` is null-safe, so a null `bob` just makes the check false.
        doSomeMoreStuff(bob?.dateOfBirth)
    }
}
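# Extract the page text the same way the training data was prepared before classifying it.
test = parseContent(loadPage('http://www.mateisuica.com'))
predicted = text_clf.predict([test])
print(predicted)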
mateisuica / gist:3c6b5c6d53a64b1a295c15c86eb99fc4
Created September 11, 2017 18:13
Filter and Train Naive Bayes
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
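# Bag-of-words counts -> TF-IDF weights -> multinomial Naive Bayes
text_clf = Pipeline([('vect', CountVectorizer()),
                     ('tfidf', TfidfTransformer()),
                     ('clf', MultinomialNB()),
                     ])
final_data = []  # extracted text of each page
target = []      # label of each page
# `sites` holds the [url, label] rows read from data.csv
for link in sites:
    print(link[0])
    page = loadPage(link[0])
    data = parseContent(page)
    final_data.append(data)
    target.append(link[1])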
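
The lines shown stop once `final_data` and `target` are filled; the fitting step itself is not visible. A minimal sketch of how the pipeline could then be fitted and queried, assuming scikit-learn is installed. The toy documents are invented stand-ins for the scraped text, and reading label 1 as "Android article" and 0 as "other" is a guess from the training links, not something the gist states.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Same three-stage pipeline as in the gist.
text_clf = Pipeline([
    ("vect", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("clf", MultinomialNB()),
])

# Hypothetical stand-ins for final_data / target (assumed: 1 = Android, 0 = other).
final_data = [
    "android studio plugins gradle kotlin",
    "kotlin weekly update android developer",
    "habit mindfulness alcohol break performance",
    "basecamp hybrid architecture business life",
]
target = [1, 1, 0, 0]

# fit() learns the vocabulary, the IDF weights, and the per-class statistics.
text_clf.fit(final_data, target)

# predict() expects a list of documents, even for a single page.
predicted = text_clf.predict(["kotlin android tips"])
print(predicted)
```

Integer labels are used here for readability; the string labels that `csv.reader` yields would work with `fit()` as well.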
mateisuica / gist:e4f8bb3a79eb1f77a66cb9c3243a330f
Created September 11, 2017 18:04
Extract text from an article HTML
from bs4 import BeautifulSoup
def parseContent(content):
    # Parse the HTML with Beautiful Soup and store the result in `soup`
    soup = BeautifulSoup(content, 'html.parser')
    # Collect the heading, paragraph, and link elements
    content = soup.find_all(['h1', 'h2', 'h3', 'h4', 'h5', 'p', 'a'])
    text = ""
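
The function is cut off after the `text` variable is initialised. One way the remaining steps might look, as a sketch only: join the text of every matched element and return it. The name `extract_text` and the single-space separator are assumptions, not the author's code, and Beautiful Soup (`bs4`) must be installed.

```python
from bs4 import BeautifulSoup

def extract_text(content):
    """Return the visible text of headings, paragraphs, and links."""
    soup = BeautifulSoup(content, "html.parser")
    elements = soup.find_all(["h1", "h2", "h3", "h4", "h5", "p", "a"])
    # get_text() drops the markup; strip=True trims surrounding whitespace.
    pieces = [el.get_text(" ", strip=True) for el in elements]
    # Skip elements that contain no text at all.
    return " ".join(piece for piece in pieces if piece)

# Hypothetical sample page; bytes are accepted, matching what response.read() returns.
html = b"<html><body><h1>Title</h1><p>Some <b>bold</b> text.</p></body></html>"
print(extract_text(html))
```

With this tag list, a link nested inside a paragraph is matched twice, once as part of the `p` element and once as the `a` element, so its text appears twice in the result.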
mateisuica / gist:3bd789b11bd5650a9aec0b90466dc001
Created September 11, 2017 18:02
Load a webpage content in Python
import urllib.request
def loadPage(site):
    # Send browser-like headers; some sites reject urllib's default User-Agent
    hdr = {'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', 'User-Agent': 'Mozilla/5.0'}
    request = urllib.request.Request(site, None, hdr)  # the assembled request
    response = urllib.request.urlopen(request)
    page = response.read()  # the raw response body, as bytes
    return page
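
`response.read()` returns bytes, and `urlopen` applies no timeout unless one is passed or a global socket default is set. A hedged variant that adds a timeout and decodes the body to `str`; the name `load_page_text`, the 10-second default, and the UTF-8 fallback are assumptions for illustration.

```python
import urllib.request

def load_page_text(site, timeout=10.0):
    """Fetch `site` and return its body decoded to text."""
    hdr = {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "User-Agent": "Mozilla/5.0",
    }
    request = urllib.request.Request(site, None, hdr)
    # The context manager closes the response even if read() raises.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw = response.read()
        # Use the charset declared in Content-Type, else fall back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
    # errors="replace" keeps going when a page contains invalid bytes.
    return raw.decode(charset, errors="replace")
```

Network failures still surface as `urllib.error.URLError`, so a caller looping over many links may want to catch that and skip the page.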
mateisuica / gist:5645ec4e437487f2ffbe7676c269727d
Created September 11, 2017 17:57
Open a CSV file in Python
import csv
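sites = []
# newline='' lets the csv module handle line endings itself
with open('data.csv', 'rt', newline='') as csvfile:
    spamreader = csv.reader(csvfile)
    for row in spamreader:
        sites.append(row)  # each row is a list of strings: [url, label]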
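
Each row that `csv.reader` yields is a list of strings. A small sketch of turning such rows into `(url, label)` pairs with integer labels; the name `read_sites` and the in-memory sample are made up for illustration so the example runs without `data.csv`.

```python
import csv
import io

def read_sites(csvfile):
    """Return a list of (url, label) tuples from an open CSV file."""
    sites = []
    for row in csv.reader(csvfile):
        if not row:          # csv.reader yields [] for blank lines
            continue
        url, label = row     # expects exactly two columns per row
        sites.append((url, int(label)))
    return sites

# Hypothetical sample in the same url,label layout.
sample = io.StringIO("https://example.com/a,1\n\nhttps://example.com/b,0\n")
sites = read_sites(sample)
print(sites)
```

A row with a missing or extra column raises `ValueError` at the unpacking step, which makes a malformed file fail early instead of during training.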
mateisuica / gist:06f1717cdb09aa38311f33b83bcf1f5b
Created September 11, 2017 17:52
Links to train the classifier
https://blog.mindorks.com/how-to-become-more-productive-in-android-with-android-studio-plugins-3beb3861fa7,1
https://blog.aritraroy.in/30-bite-sized-pro-tips-to-become-a-better-android-developer-b311fd641089,1
https://blog.aritraroy.in/what-my-2-years-of-android-development-have-taught-me-the-hard-way-52b495ba5c51,1
https://betterhumans.coach.me/how-to-get-over-the-habit-hump-e8c037aacc25,0
https://betterhumans.coach.me/developing-mindfulness-using-a-didgeridoo-just-dont-call-it-that-e3bc3fa61f2c,0
https://betterhumans.coach.me/how-a-break-from-alcohol-can-unlock-peak-performance-in-business-and-life-ae75d2cbfc9e,0
https://m.signalvnoise.com/basecamp-3-for-ios-hybrid-architecture-afc071589c25,0
https://android.jlelse.eu/make-robolectric-compatible-with-latest-gradle-tools-3-0-0-eca085681b3b,1
https://blog.mindorks.com/kotlin-weekly-update-6-64c3a7f1063c,1
https://m.signalvnoise.com/peaked-78ec0b147aeb,0
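
The last comma-separated field of each line is the class label. Before training, it can be worth checking that the classes are balanced; a sketch using three of the rows above (the name `label_counts` is an assumption). Splitting on the last comma only, not on every comma, keeps a URL intact even if it contains commas itself.

```python
from collections import Counter

def label_counts(lines):
    """Count how many lines carry each label."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # rsplit with maxsplit=1 splits only at the final comma.
        url, label = line.rsplit(",", 1)
        counts[label] += 1
    return counts

rows = [
    "https://blog.mindorks.com/kotlin-weekly-update-6-64c3a7f1063c,1",
    "https://m.signalvnoise.com/peaked-78ec0b147aeb,0",
    "https://betterhumans.coach.me/how-to-get-over-the-habit-hump-e8c037aacc25,0",
]
print(label_counts(rows))
```

Run over all ten rows listed here, this gives five lines for each label.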