Mour mylamour

@mylamour
mylamour / musicbrainzUrlSchemaParseCsvToJSON.py
Created Dec 31, 2016
Extracts useful information from the MusicBrainz URL schema, exports it to CSV, and parses that CSV to JSON. Because the URL types vary so much, the source is decided in an amusing way.
View musicbrainzUrlSchemaParseCsvToJSON.py
import csv
import json
import os
from urlparse import urlparse
csvfile = open('url.csv', 'r')
jsonfile = open('test.json', 'w')
fieldnames = ("@id","sourceUrl")
reader = csv.DictReader( csvfile, fieldnames)
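The preview above stops before the part that decides the source, so the following is only a minimal Python 3 sketch of the general idea, not the gist's actual logic. The helper names `source_from_url` and `csv_to_json_lines`, and the rule of taking the host label just before the top-level domain, are assumptions made for illustration.

```python
import csv
import io
import json
from urllib.parse import urlparse


def source_from_url(url):
    """Guess a short source name from the URL's host (hypothetical heuristic)."""
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    # Take the label just before the top-level domain, e.g. "discogs".
    # This is naive: a host such as "example.co.uk" would yield "co".
    return labels[-2] if len(labels) >= 2 else host


def csv_to_json_lines(csv_text):
    """Turn two-column CSV text into one JSON string per row."""
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=("@id", "sourceUrl"))
    lines = []
    for row in reader:
        row["source"] = source_from_url(row["sourceUrl"])
        lines.append(json.dumps(row))
    return lines
```

Each returned string can be written to the output file followed by a newline, which keeps one JSON object per line.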
@mylamour
mylamour / luigiPgDemo.py
Created Jan 1, 2017
Uses luigi to back up PostgreSQL. There is one problem: it is not flexible. But I still think this is what the workflow should look like.
View luigiPgDemo.py
import luigi
import psycopg2

class QueryBackToTmp(luigi.Task):
    def run(self):
        conn_string = "host='ec2-54-zzz-xxx-yyy.cn-north-1.compute.amazonaws.com.cn' " \
                      "dbname='musicbrainz' " \
                      "user='postgres' " \
                      "password='password'"
@mylamour
mylamour / README.md
Last active Jan 2, 2017
Use crontab to define a job that effectively runs every second; it works like a simple log counter.
View README.md
  • Use crontab -e to define a job that runs every minute. The script takes about one minute to finish (it loops 60 times with a 1-second sleep), so the log receives output every second.
  • To watch something else, just change listEverySecond.sh; you can easily get file info and more.
  1. Insert this line: */1 * * * * /bin/bash ~/listEverySecond.sh >> cat.log
  2. Run a test in your destination directory: wget -m -p -c http://your.test.domain.name
  3. View the log (in a new terminal window): tail -f cat.log
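Cron cannot schedule anything more often than once a minute, which is why the script loops internally. The same trick can be sketched in Python, assuming a hypothetical helper `run_every_step`; the `sleep` parameter is injectable only so the loop can be exercised without actually waiting.

```python
import time


def run_every_step(action, duration=60, step=1, sleep=time.sleep):
    """Call action() once per step until `duration` seconds are covered.

    Started once a minute by cron with the defaults, this yields
    roughly one call per second.
    """
    results = []
    for _ in range(0, duration, step):
        results.append(action())
        sleep(step)
    return results
```

Because each call to `action` takes some time itself, the loop drifts slightly past one minute; for a rough log counter that is usually acceptable.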
@mylamour
mylamour / listEverySecond.sh
Created Jan 2, 2017
Use crontab to define a job that effectively runs every second; it works like a simple log counter.
View listEverySecond.sh
#!/bin/bash
# Print a count once per second for one minute (cron starts this every minute).
step=1
for (( i = 0; i < 60; i=(i+step) )); do
    # ls ~/tmp/javfor.me/ | wc -l
    # ls ~/tmp/javfor.me/ | grep -E "html\."
    ls -R ~/jav/ | wc -l
    sleep $step
done
@mylamour
mylamour / dialog.sh
Last active Jan 3, 2017
A progress bar in the shell.
View dialog.sh
#!/bin/bash
# The C-style for loop is a bash feature, so the shebang must not be plain sh.
for((i=0;i<100;i++))
do
    sleep 0.1
    echo $i | dialog --title 'Copy' --gauge 'Backup file from PostgreSQL!' 10 70 0
done
@mylamour
mylamour / httpserver.md
Last active Jan 12, 2017
Different ways to start an HTTP server (temporary or not)
View httpserver.md

python:

  • python2 -m SimpleHTTPServer
  • python3 -m http.server
  • twistd -n web -p 8000 --path .

ruby:

  • ruby -rwebrick -e "WEBrick::HTTPServer.new(:Port => 8888, :DocumentRoot => Dir.pwd).start"
  • ruby -run -ehttpd . -p8000
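The `python3 -m http.server` one-liner can also be started from code. A minimal sketch using only the standard library (Python 3.7 or later); the function name `make_static_server` and its defaults are assumptions for illustration.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_static_server(directory=".", port=8000, host="127.0.0.1"):
    """Build a server that serves files from `directory`.

    Passing port=0 lets the operating system pick a free port.
    """
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer((host, port), handler)
```

Calling `make_static_server().serve_forever()` blocks until interrupted, much like the command-line form. Binding to 127.0.0.1 keeps a temporary server private to the local machine; use "0.0.0.0" only when it should be reachable from the network.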
@mylamour
mylamour / uncompelete regex example.md
Last active Jan 18, 2017
Useful regexes from http://www.regexr.com/v1/RegExr.php. Programming languages differ in which regex features they support, so take care when reusing them.
View uncompelete regex example.md
  • name="UniProt+Fastaheader"
`/^>[^\|]*\|([^\|]*)\|.*OS=([^=]*).*GN=([^ ]*).*$/g`

Matches the UniProt accession number, organism, and gene name (capture groups 1, 2, and 3) in a UniProt FASTA header

  • name="E-mail+validator+for+International+Domain"
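As a check on the UniProt FASTA header pattern above, here is a short Python sketch. The `/.../g` delimiters and flag belong to the JavaScript-style notation, so only the pattern body is used; the function name `parse_uniprot_header` is an assumption for illustration. Group 2 stops just before `GN=`, so it keeps a trailing space that the sketch strips.

```python
import re

# Pattern body copied from the entry above, without the /.../g wrapper.
UNIPROT_HEADER = re.compile(r"^>[^\|]*\|([^\|]*)\|.*OS=([^=]*).*GN=([^ ]*).*$")


def parse_uniprot_header(header):
    """Return accession, organism and gene name, or None if there is no match."""
    match = UNIPROT_HEADER.match(header)
    if match is None:
        return None
    accession, organism, gene = match.groups()
    return {"accession": accession, "organism": organism.strip(), "gene": gene}
```

For example, `parse_uniprot_header(">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens GN=HBA1 PE=1 SV=2")` yields the accession `P69905`, the organism `Homo sapiens`, and the gene `HBA1`. Headers without a `GN=` field do not match at all.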
@mylamour
mylamour / unnicodeConvert.py
Last active Jan 28, 2017
Convert single-line JSON with Unicode escapes to readable text; fixes garbled person names when reading the data.
View unnicodeConvert.py
# This one is really strange: for the data fetched from pg, no conversion worked,
# and in the end this was the only way I could fix it.
# s.decode("UTF-8").encode("GBK") worked for an earlier crawler, but does not suit this scenario.
import json

with open('/home/ubuntu/origin/musicgroup.json') as origin, open('/home/ubuntu/json/musicgroup.json','w') as dest:
    for i in origin.readlines():
        t = json.loads(i)
        dest.write(json.dumps(t, ensure_ascii=False).encode('utf8') + '\n')
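The script above is Python 2. In Python 3 the same idea needs no manual `.encode('utf8')`, because files are opened with an explicit encoding. A minimal sketch, where the names `unescape_json_line` and `convert_file` are assumptions for illustration:

```python
import json


def unescape_json_line(line):
    """Re-serialize one JSON line so \\uXXXX escapes become real characters."""
    return json.dumps(json.loads(line), ensure_ascii=False)


def convert_file(src_path, dest_path):
    """Rewrite a file of one-JSON-object-per-line with readable Unicode."""
    with open(src_path, encoding="utf-8") as origin, \
            open(dest_path, "w", encoding="utf-8") as dest:
        for line in origin:
            if line.strip():  # skip blank lines
                dest.write(unescape_json_line(line) + "\n")
```

The key is `ensure_ascii=False`: by default `json.dumps` escapes every non-ASCII character, which is what makes names look garbled in the output file.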
View jsondumpsdataime.py
import datetime
from time import mktime

try:
    import simplejson as json
except ImportError:
    import json

class DateTimeEncoder(json.JSONEncoder):  # extends JSONEncoder
    def default(self, obj):
        if isinstance(obj, datetime.datetime):
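The preview is cut off before the body of the `if`, so what the original returns is not shown. A minimal sketch of a complete encoder, assuming ISO 8601 strings are an acceptable output (the original imports `mktime`, so it may return a Unix timestamp instead):

```python
import datetime
import json


class DateTimeEncoder(json.JSONEncoder):
    """JSONEncoder that also accepts datetime objects."""

    def default(self, obj):
        if isinstance(obj, datetime.datetime):
            # Assumption: serialize as an ISO 8601 string.
            return obj.isoformat()
        # Let the base class raise TypeError for anything else.
        return super().default(obj)
```

It is used by passing `cls=DateTimeEncoder` to `json.dumps`, for instance `json.dumps({"t": datetime.datetime(2017, 1, 28, 12, 0, 0)}, cls=DateTimeEncoder)`.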
@mylamour
mylamour / checkProxy.py
Last active Feb 16, 2017
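Earlier, running masscan on Vultr turned up more than 10 million IPs with port 8080 open, so I wrote a script to verify them. Of course, this script still needs a lot of improvement. Scraping proxy-list websites would be more reliable.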
View checkProxy.py
#!/usr/bin/env python
import requests
import cPickle as pickle

s = requests.Session()
with open('./iplist.list') as proxylists,open("needproxy.pkl","a") as usefull:
    for proxy in proxylists.readlines():
        tmp = {
            # strip the trailing newline that readlines() keeps on each entry
            "http": "http://{}:8080".format(proxy.strip()),
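The preview ends inside the proxy dictionary, so the actual check is not shown. The following is a minimal Python 3 sketch of one way to test a candidate, using only the standard library instead of `requests`. The helper names, the test URL `http://example.com/`, and the 5-second timeout are all assumptions for illustration.

```python
import http.client
import urllib.request


def build_proxies(line, port=8080):
    """Build a proxy mapping from one line of the IP list."""
    host = line.strip()  # lines read from a file keep their trailing newline
    return {"http": "http://{}:{}".format(host, port)}


def proxy_works(proxies, test_url="http://example.com/", timeout=5):
    """Return True if a request routed through the proxy answers with HTTP 200."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
    try:
        with opener.open(test_url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, HTTP errors and malformed replies.
        return False
```

With millions of candidates, a sequential loop over `proxy_works` would take far too long, so some form of concurrency would be needed in practice; that is outside this sketch.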