{
 "metadata": {
  "name": "Playing with MySql"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
{
 "metadata": {
  "name": "CSV read with Pandas"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
import re
import sys

_, infile, outfile = sys.argv

s_pat_row = r'''
"([^"]+)"   # match column; this is group 1
\s*\t\s*    # match separating tab and any optional white space
([^\t]+)    # match a string of non-tab chars; this is group 2
\s*\t\s*    # match separating tab and any optional white space
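The pattern above is cut off after the second tab separator, so the rest of it is unknown. As a minimal sketch of how a commented pattern like this is usually compiled and applied, the example below uses `re.VERBOSE`. The third group (`(.*)`, the remainder of the row) and the helper name `parse_row` are assumptions made for illustration and are not taken from the original script.

```python
import re

# Sketch only: groups 1 and 2 follow the preview above; group 3 is an
# assumed placeholder because the original pattern is truncated.
ROW_PATTERN = re.compile(
    r'''
    "([^"]+)"   # quoted column; this is group 1
    \s*\t\s*    # separating tab and any optional white space
    ([^\t]+)    # a string of non-tab chars; this is group 2
    \s*\t\s*    # separating tab and any optional white space
    (.*)        # assumed: whatever remains on the row; group 3
    ''',
    re.VERBOSE,
)


def parse_row(line):
    """Return the captured groups for one row, or None if it does not match."""
    match = ROW_PATTERN.match(line)
    return match.groups() if match else None
```

With `re.VERBOSE`, white space and `#` comments inside the pattern are ignored, which is what allows each piece of the expression to carry its own explanation.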
# define the function to split the file into smaller chunks
def splitFile(inputFile, chunkSize):
    # read the contents of the file
    f = open(inputFile, 'rb')
    data = f.read()
    f.close()
    # get the length of data, ie size of the input file in bytes
    bytes = len(data)
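The preview stops before any chunks are written, so the original chunking logic is not visible. The following is a minimal sketch of one way to split a file into fixed-size pieces. It reads one chunk at a time instead of loading the whole file, and the function name `split_file` and the `.chunkN` output naming are assumptions, not the original author's choices.

```python
def split_file(input_path, chunk_size):
    """Write input_path out as numbered chunk files of at most chunk_size bytes.

    Returns the list of chunk file paths in order. The '<input>.chunk<N>'
    naming scheme is a hypothetical convention chosen for this sketch.
    """
    if chunk_size <= 0:
        raise ValueError('chunk_size must be a positive number of bytes')

    chunk_paths = []
    with open(input_path, 'rb') as source:
        index = 0
        while True:
            # Read at most chunk_size bytes; an empty result means end of file.
            chunk = source.read(chunk_size)
            if not chunk:
                break
            chunk_path = '%s.chunk%d' % (input_path, index)
            with open(chunk_path, 'wb') as target:
                target.write(chunk)
            chunk_paths.append(chunk_path)
            index += 1
    return chunk_paths
```

Reading in a loop keeps memory use bounded by `chunk_size`, which matters when the input is larger than available RAM. Concatenating the chunk files in order reproduces the original bytes.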
{
 "metadata": {
  "name": ""
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
import regex
import logging
import gensim
from gensim import corpora, models

logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)


class MySentences(object):
    def __init__(self, fname):
        self.fname = fname
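The class above is truncated after its constructor. A class of this shape is commonly used as a streamed corpus: an iterable that re-opens the file on every pass and yields one token list per line, so that a model can make several passes without holding the corpus in memory. The sketch below shows that pattern under stated assumptions: the name `LineSentences`, the UTF-8 encoding, the whitespace tokenisation, and Python 3 are all illustrative choices, and the original `__iter__` may differ.

```python
class LineSentences(object):
    """Restartable iterable that yields one list of tokens per non-empty line."""

    def __init__(self, fname):
        self.fname = fname

    def __iter__(self):
        # Re-open the file on each iteration so the object can be
        # consumed more than once (a plain generator could not be).
        with open(self.fname, encoding='utf-8') as handle:
            for line in handle:
                tokens = line.split()  # assumed: simple whitespace tokenisation
                if tokens:
                    yield tokens
```

Because `__iter__` starts from the top of the file each time it is called, an instance can be iterated repeatedly, which a one-shot generator would not allow.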
Working:
import requests
text = "Fördomen har alltid sin rot i vardagslivet - Olof Palme 🙈🙈🙌🙆👪👏yüá🎧ÖÅÄê"
r = requests.post("http://json-tagger.herokuapp.com/tag",data=dict(data=text))
r.json()
{'entities': [{'token_ids': ['tok:0:8', 'tok:0:9', 'tok:0:10'], 'word_form': 'Olof Palme 🙈🙈🙌🙆👪👏yüá🎧ÖÅÄê'}], 'sentences': [[{'morph_feat': 'UTR|SIN|DEF|NOM',
"0307.n.0003": {
  "cats": null,
  "meta_cats": null,
  "filename": "Utgrävningar_i_Teotihuacan_(1932)_-_SMVK_-_0307.n.0003",
  "info": "{{photograph\n|photographer = {{creator:Sigvald_Linné}}\n|title = \n|description = {{sv|Chichén Itzá, Dzitas. Utgrävningar i Teotihuacan (1932).}}\n{{en|Images from the 1932 Sigvald Linné archeological expedition at Teotihuacán, Mexico.}}\n|depicted place = Chichén Itzá, Dzitas\n|date = 1932\n|medium = \n|dimensions = \n|institution = {{Institution:Statens museer för världskultur}}\n|department = [[:d:Q1371375|Etnografiska muséet]]\n|references = \n|object history = \n|exhibition history = \n|credit line = \n|inscriptions = \n|notes = \n|accession number = {{SMVK-EM-link|1=foto|2=2803890|3=0307.n.0003}}\n|source = Original file name, as received from SMVK: <br /> '''0307.n.0003.tif'''\n{{SMVK_cooperation_project|COH|museu
/Users/mos/anaconda/bin/python /Users/mos/PycharmProjects/Medelhavsmuseet_2016-08/check_pages_without_images_smvk-em.py
--- [[commons:File:Från utgrävningarna vid Xolalpan - SMVK - 0307.a.0154.tif]]
{{speedydelete|broken file upload}}
{{photograph
|photographer = {{creator:Sigvald_Linné}}
|title =
|description = {{sv|Från utgrävningarna vid Xolalpan. Teotihuacan. Utgrävningar i Teotihuacan (1932).}}
{{en|Images from the 1932 Sigvald Linné archeological expedition at Teotihuacán, Mexico.}}
|depicted place = Q172613