k5.trismegistus k5trismegistus

k5trismegistus / test
Created Sep 6, 2014
Doujinshi Lexicon Parser
View test
# coding: UTF-8
import urllib.request

class LexiconParser():
    def __init__(self):
        self.comicinfodict = {}

    def parse(self, url):
k5trismegistus / Info.html
Created Sep 6, 2014
The part of the Lexicon page that contains the information I want
View Info.html
<tr><td><B>原題:</B></td><td>バナナイスの一日</td></tr>
<tr><td><B>タイトル:</B></td><td></td></tr>
<tr><td><B>頁数:</B></td><td>28</td></tr>
<tr><td><B>Free:</B></td><td>No</td></tr>
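Rows like these pair a label cell with a value cell, so they map naturally onto a dictionary. The following is a minimal sketch of that idea, not the gist author's parser: it uses only the standard library's `html.parser` rather than BeautifulSoup, and the names `RowParser` and `parse_info_rows` are hypothetical.

```python
from html.parser import HTMLParser


class RowParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # one list of cell strings per <tr>
        self._in_cell = False   # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <B> and <TD> arrive as "b", "td".
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def parse_info_rows(html):
    """Return {label: value} for every two-cell row (hypothetical helper)."""
    parser = RowParser()
    parser.feed(html)
    parser.close()
    info = {}
    for cells in parser.rows:
        if len(cells) == 2:
            label = cells[0].strip().rstrip(":")
            info[label] = cells[1].strip()
    return info


sample = (
    "<tr><td><B>原題:</B></td><td>バナナイスの一日</td></tr>"
    "<tr><td><B>タイトル:</B></td><td></td></tr>"
    "<tr><td><B>頁数:</B></td><td>28</td></tr>"
    "<tr><td><B>Free:</B></td><td>No</td></tr>"
)
info = parse_info_rows(sample)
```

Empty value cells, such as タイトル above, are kept as empty strings so the caller can tell a missing value from a missing row.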
k5trismegistus / Lexicon Parser
Created Oct 11, 2014
Let's scrape the Doujinshi & Manga Lexicon
View Lexicon Parser
circle = soup.find_all(href=re.compile(r'/browse/author/[0-9]+'))
View 欲しい列 (the row I want)
<tr><td><img src="/images/aamn_2.gif" alt="サークル同人"> <a href="/browse/author/22476/">華村色花</A></td><td><a href="/browse/author/22476/"></A></td><td></td></tr>
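This row contains two links to the same author page, and the second one has no text. A rough sketch of pulling out the author ID and name while skipping the empty link follows. It swaps the gist's `soup.find_all` call for a plain regular expression so that it runs without BeautifulSoup; `AUTHOR_LINK` and `extract_authors` are hypothetical names, and the pattern assumes the exact link markup shown in the row.

```python
import re

# Matches <a href="/browse/author/<digits>/">name</a>; the closing tag may be
# upper case (</A>), hence IGNORECASE.
AUTHOR_LINK = re.compile(
    r'<a\s+href="/browse/author/([0-9]+)/">(.*?)</a>',
    re.IGNORECASE,
)


def extract_authors(html):
    """Return {author_id: name}, ignoring links whose text is empty."""
    authors = {}
    for author_id, name in AUTHOR_LINK.findall(html):
        name = name.strip()
        if name:
            authors[int(author_id)] = name
    return authors


row = (
    '<tr><td><img src="/images/aamn_2.gif" alt="サークル同人"> '
    '<a href="/browse/author/22476/">華村色花</A></td>'
    '<td><a href="/browse/author/22476/"></A></td><td></td></tr>'
)
authors = extract_authors(row)
```

A regular expression is brittle against attribute reordering or extra whitespace, so a real scraper would probably keep the BeautifulSoup lookup and filter out tags with empty text instead.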
k5trismegistus / 神奈子様夢妄想
Created Oct 11, 2014
Full HTML of the 神奈子様夢妄想 page
View 神奈子様夢妄想
<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Strict//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="description" lang="en" content="Information about: 神奈子様夢妄想 - The Doujinshi & Manga Lexicon An atempt to document everything related to the manga art style.">
<meta name="description" lang="jp" content="同人誌データーベース 3.0 漫画アート・スタイルの作品を網羅し記録する試みです">
<meta name="keywords" lang="en" content="Kanako-sama Yume Mousou, Comic Communication 13, No Collections, Circle Nuruma-ya, Tsukiwani, Touhou Project, Kotiya Sanae, Moriya Suwako, Yasaka Kanako, Big ass, Bondage, Restraint, Rape, doujinshi, manga, The doujinshi DB project">
<meta name="keywords" lang="jp" content="神奈子様夢妄想, コミックコミュニケーション 13, シリーズでない, サークルぬるま屋, 月わに, 東方Project, 東風谷早苗, 洩矢諏訪子, 八坂神奈子, 拘束, 緊縛, 強姦, 同人誌, 一般コミック, 成年コミック, 同人">
<link REL="SHORTCUT ICON" HREF="http://www.doujinshi.org/DoujinDB.png">
<link rel="stylesheet" type="text/css" href="/style/main.css">
k5trismegistus / scraper
Last active Aug 29, 2015
class LexiconScraper
View scraper
import urllib.request
import bs4
import re

class LexiconScraper():
    def __init__(self, url):
        html = urllib.request.urlopen(url).read().decode('utf-8')
        self.soup = bs4.BeautifulSoup(html)
View ComicInfoEditor.py
import os
import wx
import InfoEditor

class ComicInfoGetter(wx.App):
    def OnInit(self):
        frm = GuiWindow("ComicInfoEditor")
        frm.Show()
k5trismegistus / search_by_title.html
Last active Aug 29, 2015
Title search: 東方浮世絵巻
View search_by_title.html
<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Strict//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="description" lang="en" content="The front page - The Doujinshi & Manga Lexicon An atempt to document everything related to the manga art style.">
<meta name="description" lang="jp" content="同人誌データーベース 3.0 漫画アート・スタイルの作品を網羅し記録する試みです">
<meta name="keywords" lang="en" content="doujinshi, manga, The doujinshi DB project">
<meta name="keywords" lang="jp" content="同人誌, 一般コミック, 成年コミック, 同人">
<link REL="SHORTCUT ICON" HREF="http://www.doujinshi.org/DoujinDB.png">
<link rel="stylesheet" type="text/css" href="/style/main.css">
k5trismegistus / 0_search_from_key.py
Last active Aug 29, 2015
Can search by part of a title
View 0_search_from_key.py
def search_from_keyword(keyword):
    """Search by keyword (title) and return candidate matches."""

def get_metadata(div):
    metadata = {}
    metadata['Series'] = get_series(div)
    metadata['Writer'] = get_writer(div)
    metadata['Penciller'] = get_penciller(div)
    metadata['Genre'] = get_genre(div)
View bookinfo.html
<div class="bookinfo">
<span class="tab L0"><B>原題:</B></span>
<span class="tab LPEXACT1">東方浮世絵巻 パチュリー・ノーリッジ</span>
<BR>
<span class="tab L0"><B>タイトル:</B></span>
<span class="tab LPEXACT1">Touhou Ukiyo Emaki Patchouli Knowledge</span>
<BR>
<span class="tab L0"><B>サークル:</B></span>
<span class="tab LPEXACT1">PARANOIA CAT</span>
<BR>
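In this `bookinfo` block each label sits in a `span` with class `tab L0` and its value in the `span` that follows. One possible way to turn that into a dictionary is sketched below. It is an illustration only, not the gist's `get_metadata`: it uses the standard library's `html.parser`, the names `BookInfoParser` and `parse_bookinfo` are hypothetical, and it assumes every label span is immediately followed by exactly one value span.

```python
from html.parser import HTMLParser


class BookInfoParser(HTMLParser):
    """Record (class attribute, text) for every <span> in document order."""

    def __init__(self):
        super().__init__()
        self.spans = []
        self._open = False   # True while inside a <span>

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            css_class = dict(attrs).get("class") or ""
            self.spans.append([css_class, ""])
            self._open = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._open = False

    def handle_data(self, data):
        if self._open:
            self.spans[-1][1] += data


def parse_bookinfo(html):
    """Pair each 'L0' label span with the span after it (hypothetical helper)."""
    parser = BookInfoParser()
    parser.feed(html)
    parser.close()
    info = {}
    label = None
    for css_class, text in parser.spans:
        text = text.strip()
        if "L0" in css_class.split():
            label = text.rstrip(":")      # "原題:" -> "原題"
        elif label is not None:
            info[label] = text
            label = None
    return info


sample = """
<div class="bookinfo">
<span class="tab L0"><B>原題:</B></span>
<span class="tab LPEXACT1">東方浮世絵巻 パチュリー・ノーリッジ</span>
<BR>
<span class="tab L0"><B>タイトル:</B></span>
<span class="tab LPEXACT1">Touhou Ukiyo Emaki Patchouli Knowledge</span>
<BR>
<span class="tab L0"><B>サークル:</B></span>
<span class="tab LPEXACT1">PARANOIA CAT</span>
<BR>
</div>
"""
info = parse_bookinfo(sample)
```

Keying on the `L0` class rather than on span position means a stray unlabeled span is ignored instead of shifting every later label onto the wrong value.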