Md. Sany Ahmed sany2k8

  • Khulna, Bangladesh

API walkthrough

  1. Open a browser

    # start an instance of firefox with selenium-webdriver
    driver = Selenium::WebDriver.for :firefox
    # :chrome -> chrome
    # :ie     -> iexplore
    
  • Go to a specified URL
# 10_basic.py
# 15_make_soup.py
# 20_search.py
# 25_navigation.py
# 30_edit.py
# 40_encoding.py
# 50_parse_only_part.py
sany2k8 / scrapy_cheatsheet.md
Created February 24, 2021 06:40 — forked from zlin888/scrapy_cheatsheet.md
scrapy cheatsheet

Scrapy Cheatsheet

For test

scrapy shell https://example.com/

Run

scrapy crawl spider_name

Output

scrapy crawl dapps -o data/07-07-dapps.csv
scrapy crawl dapps -t csv -o - >"data/dapp/$DATE-dapp.csv"
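The dated output file name above relies on a `$DATE` shell variable that is presumably set elsewhere. As a rough sketch, the same command could be assembled in Python. The helper name `build_crawl_command` is hypothetical, and the `%m-%d` date format is an assumption inferred from the `07-07-dapps.csv` example.

```python
from datetime import date
from pathlib import Path


def build_crawl_command(spider, out_dir, day):
    """Return the argument list for `scrapy crawl <spider> -o <dated csv>`.

    Hypothetical helper: the month-day stamp mirrors the "07-07" prefix
    used in the example above, which is an assumption about $DATE.
    """
    stamp = day.strftime("%m-%d")
    out_path = Path(out_dir) / f"{stamp}-{spider}.csv"
    # Scrapy infers the CSV feed format from the .csv extension.
    return ["scrapy", "crawl", spider, "-o", out_path.as_posix()]
```

The returned list can be passed to `subprocess.run(...)` from inside a Scrapy project directory, for example `build_crawl_command("dapps", "data", date.today())`.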

sany2k8 / get_url_link.py
Created February 22, 2021 09:20 — forked from elena-roff/get_url_link.py
Creates a clickable URL from two fields of the pandas DataFrame
sany2k8 / pg_stat_statements
Created January 6, 2021 08:00 — forked from troyk/pg_stat_statements
enable postgres pg_stat_statements
1) see re: increasing shmmax http://stackoverflow.com/a/10629164/1283020
2) add to postgresql.conf:
shared_preload_libraries = 'pg_stat_statements' # (change requires restart)
pg_stat_statements.max = 1000
pg_stat_statements.track = all
3) restart postgres
4) check it out in psql
sany2k8 / notes.md
Last active January 5, 2021 14:56 — forked from ian-whitestone/notes.md
Best practices for presto sql

Presto Specific

  • Don’t SELECT *; specify explicit column names (columnar store)
  • Avoid large JOINs (filter each table first)
    • In Presto, tables are joined in the order they are listed!
    • Join small tables earlier in the plan and leave larger fact tables to the end
    • Avoid cross joins or one-to-many joins, as these can degrade performance
  • ORDER BY and GROUP BY are expensive
    • Only use ORDER BY in subqueries if it is really necessary
  • When using GROUP BY, order the columns from the highest cardinality (that is, the largest number of unique values) to the lowest.
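The GROUP BY ordering advice can be illustrated with a small sketch that counts distinct values per column in a sample of rows and emits the columns from highest to lowest cardinality. The function name `group_by_clause` and the sample column names are hypothetical; on a real table the counts would more likely come from the query engine itself than from rows loaded into Python.

```python
def group_by_clause(rows, columns):
    """Build a GROUP BY clause with columns ordered by descending cardinality.

    `rows` is a list of dicts holding sample data (an assumption for this sketch).
    """
    # Number of distinct values seen for each column in the sample.
    cardinality = {col: len({row[col] for row in rows}) for col in columns}
    # Highest cardinality first; ties keep their original relative order.
    ordered = sorted(columns, key=lambda col: cardinality[col], reverse=True)
    return "GROUP BY " + ", ".join(ordered)


sample = [
    {"country": "BD", "user_id": 1, "device": "web"},
    {"country": "BD", "user_id": 2, "device": "web"},
    {"country": "US", "user_id": 3, "device": "web"},
]
print(group_by_clause(sample, ["country", "device", "user_id"]))
```

Here `user_id` has three distinct values, `country` two, and `device` one, so the clause lists them in that order.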
sany2k8 / mysql_cheat_sheet.md
Created December 11, 2020 12:30 — forked from bradtraversy/mysql_cheat_sheet.md
MySQL Cheat Sheet

MySQL Cheat Sheet

Help with SQL commands to interact with a MySQL database

MySQL Locations

  • Mac /usr/local/mysql/bin
  • Windows /Program Files/MySQL/MySQL version/bin
  • Xampp /xampp/mysql/bin

Add mysql to your PATH

sany2k8 / Crontab
Created February 18, 2020 09:03
Crontab manual
# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
# │ │ ┌───────────── day of month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12)
# │ │ │ │ ┌───────────── day of week (0 - 6) (Sunday to Saturday;
# │ │ │ │ │ 7 is also Sunday on some systems)
# │ │ │ │ │
# │ │ │ │ │
# * * * * * command_to_execute
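To make the five-field layout concrete, here is a minimal sketch that splits a crontab line into its fields and checks whether a given time matches. It is not a full cron implementation: it only understands `*`, single numbers, comma lists, and `a-b` ranges (no `/step` values or month/day names), and the function names are hypothetical.

```python
from datetime import datetime

FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron_line(line):
    """Split a crontab line into the five time fields plus the command."""
    parts = line.split(None, 5)
    if len(parts) < 6:
        raise ValueError("expected five time fields followed by a command")
    entry = dict(zip(FIELDS, parts[:5]))
    entry["command"] = parts[5]
    return entry


def field_matches(expr, value):
    """Match one field; supports '*', numbers, comma lists, and a-b ranges."""
    for part in expr.split(","):
        if part == "*":
            return True
        low, _, high = part.partition("-")
        if int(low) <= value <= int(high or low):
            return True
    return False


def matches(entry, when):
    """Return True if the datetime `when` satisfies the entry's time fields."""
    # cron counts Sunday as 0; datetime.weekday() counts Monday as 0.
    dow = (when.weekday() + 1) % 7
    dow_ok = field_matches(entry["day_of_week"], dow) or (
        dow == 0 and field_matches(entry["day_of_week"], 7)  # 7 as Sunday
    )
    dom_ok = field_matches(entry["day_of_month"], when.day)
    if entry["day_of_month"] != "*" and entry["day_of_week"] != "*":
        day_ok = dom_ok or dow_ok  # both restricted: either one may match
    else:
        day_ok = dom_ok and dow_ok
    return (
        field_matches(entry["minute"], when.minute)
        and field_matches(entry["hour"], when.hour)
        and field_matches(entry["month"], when.month)
        and day_ok
    )
```

For example, `split_cron_line("30 2 * * 1 /usr/bin/backup.sh --full")` describes a job that runs at 02:30 every Monday; the script path is made up for illustration.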
sany2k8 / tarcheatsheet.md
Created November 9, 2019 09:35 — forked from haskaalo/tarcheatsheet.md
Tar usage / Tar Cheat Sheet

Tar Usage / Cheat Sheet

Compress a file or directory

e.g. tar -czvf name-of-archive.tar.gz /path/to/directory-or-file

  • -c: Create an archive.
  • -z: Compress the archive with gzip.
  • -v: Makes tar talk a lot. Verbose output shows you all the files being archived and more.
  • -f: Allows you to specify the filename of the archive.
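For scripts, roughly the same result as `tar -czvf` can be had from Python's standard `tarfile` module. The sketch below is an approximation rather than an exact equivalent: the function name `create_tar_gz` is hypothetical, and it returns the archived member names instead of printing them as `-v` would.

```python
import tarfile
from pathlib import Path


def create_tar_gz(archive_path, source):
    """Create a gzip-compressed tar archive of `source` and list its members."""
    source = Path(source)
    # "w:gz" creates a new archive (-c) compressed with gzip (-z).
    with tarfile.open(str(archive_path), "w:gz") as archive:
        # arcname stores the entry under its base name, not the full path.
        archive.add(str(source), arcname=source.name)
    # Reopen the archive to report what was stored (similar in spirit to -v).
    with tarfile.open(str(archive_path), "r:gz") as archive:
        return archive.getnames()
```

Calling `create_tar_gz("name-of-archive.tar.gz", "/path/to/directory-or-file")` mirrors the command above; keep the archive outside the directory being archived.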
sany2k8 / install virtualenv ubuntu 16.04.md
Created September 26, 2019 07:31 — forked from frfahim/install virtualenv ubuntu 16.04.md
How to install virtual environment on ubuntu 16.04

How to install virtualenv:

Install pip first

sudo apt-get install python3-pip

Then install virtualenv using pip3

sudo pip3 install virtualenv