A personal diary of DataFrame munging over the years.
Convert Series datatype to numeric (will error if column has non-numeric values)
(h/t @makmanalp)
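A minimal sketch of the conversion described above, assuming pandas and its `pd.to_numeric` function. The DataFrame and the column names `price` and `qty` are made up for illustration, and the original snippet may well have used a different call. With the default `errors="raise"`, a non-numeric value stops the conversion; `errors="coerce"` is the lenient alternative that turns such values into NaN.

```python
import pandas as pd

# Hypothetical data: both columns arrive as strings, and "qty" has one bad value.
df = pd.DataFrame({"price": ["1.5", "2", "3.25"], "qty": ["4", "five", "6"]})

# Strict conversion: succeeds because every value in "price" parses as a number.
df["price"] = pd.to_numeric(df["price"])

# Strict conversion on "qty" raises, because "five" is not numeric.
conversion_error = None
try:
    pd.to_numeric(df["qty"])
except (ValueError, TypeError) as exc:
    conversion_error = str(exc)

# Lenient alternative: unparseable values become NaN instead of raising.
qty_lenient = pd.to_numeric(df["qty"], errors="coerce")
```

Checking `qty_lenient.isna()` afterwards shows which rows failed to parse, which is often a useful step before deciding whether to drop or repair them.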
#!/usr/bin/env python
"""
No dependency Python script for joining two paired end FASTQ files.
Supports concatenating reads with a separator ("NNNNN") or interleaving reads via the
--interleave option. Auto-detects gzip'd files, offers header checking via a --strict flag,
and supports output to STDOUT a gzip'd FASTQ or an uncompressed FASTQ (--uncompressed flag).
"""
import argparse
import gzip
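The script preview above stops at its imports, so here is a rough, standard-library-only sketch of the behaviour its docstring describes. It is not the original implementation. The function names (`open_maybe_gzip`, `read_fastq`, `join_pairs`) are hypothetical, padding the quality string with `!` under the separator is an assumption, the header comparison rule is an assumption, and no reverse-complementing of the second read is attempted.

```python
import gzip
import io


def open_maybe_gzip(path):
    """Open a file as text, decompressing it if it starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # skip the "+" line
        quality = handle.readline().rstrip("\n")
        yield header, sequence, quality


def read_name(header):
    """Assumed naming rule: drop any description and a trailing /1 or /2 suffix."""
    return header.split()[0].split("/")[0]


def join_pairs(reads1, reads2, separator="NNNNN", interleave=False, strict=False):
    """Interleave paired records, or concatenate each pair around a separator."""
    for (h1, s1, q1), (h2, s2, q2) in zip(reads1, reads2):
        if strict and read_name(h1) != read_name(h2):
            raise ValueError("header mismatch: %s vs %s" % (h1, h2))
        if interleave:
            yield h1, s1, q1
            yield h2, s2, q2
        else:
            # Assumption: separator bases get the lowest Phred+33 quality, "!".
            yield h1, s1 + separator + s2, q1 + "!" * len(separator) + q2


# Usage with in-memory data; real files would go through open_maybe_gzip().
forward = io.StringIO("@r1/1\nACGT\n+\nIIII\n")
reverse = io.StringIO("@r1/2\nTTGG\n+\nHHHH\n")
joined = list(join_pairs(read_fastq(forward), read_fastq(reverse), strict=True))
```

Passing `interleave=True` instead yields the two records of each pair one after the other, unchanged.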
A lot of these are outright stolen from Edward O'Campo-Gooding's list of questions. I really like his list.
I'm having some trouble paring this down to a manageable list of questions -- I realistically want to know all of these things before starting to work at a company, but it's a lot to ask all at once. My current game plan is to pick 6 before an interview and ask those.
I'd love comments and suggestions about any of these.
I've found questions like "do you have smart people? Can I learn a lot at your company?" to be basically totally useless -- everybody will say "yeah, definitely!" and it's hard to learn anything from them. So I'm trying to make all of these questions pretty concrete -- if a team doesn't have an issue tracker, they don't have an issue tracker.
I'm also mostly not asking about principles, but the way things are -- not "do you think code review is important?", but "Does all code get reviewed?".
import urllib2
import re
import sys
from collections import defaultdict
from random import random
"""
PLEASE DO NOT RUN THIS QUOTED CODE FOR THE SAKE OF daemonology's SERVER, IT IS
NOT MY SERVER AND I FEEL BAD FOR ABUSING IT. JUST GET THE RESULTS OF THE
CRAWL HERE: http://pastebin.com/raw.php?i=nqpsnTtW AND SAVE THEM TO "archive.txt"
# coding=UTF-8
from __future__ import division
import nltk
from collections import Counter
# This is a simple tool for adding automatic hashtags into an article title
# Created by Shlomi Babluki
# Sep, 2013
Let's have some command-line fun with curl, [jq][1], and the [new GitHub Search API][2].
Today we're looking for: