Paul-Yuchao-Dong Apr 12, 2016

Great gist! This is really helpful to people who finished Wes' great book and want to catch up on the further improvements to pandas. I cannot believe I am the first one to leave a message here.

However, I did come asking for help. I'm not sure whether the POST request still works. The 3rd cell gave me a traceback:
BadZipfile: File is not a zip file
Or is it a Python 2/3 issue? I'm running anaconda=4.0 with Python 2.7.

Actually, it would be helpful if you could show how I should download the zip file manually.

Thank you again for the great post explaining the recent developments.

EDIT: I think I understand the problem now. It is indeed a Python 2/3 problem; I think Py2 didn't wait until the request was complete for some reason. I separated the 3rd cell and got it to run smoothly.

thanks!
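
For anyone else who hits BadZipfile, one way to narrow it down is to check whether the downloaded bytes are a zip archive at all before unzipping. The sketch below is only an illustration: `describe_download` is a hypothetical helper, and the in-memory archive and HTML string are made-up stand-ins for a real response body, not the gist's actual download.

```python
import io
import zipfile


def describe_download(payload: bytes) -> str:
    """Say whether downloaded bytes look like a zip archive (hypothetical helper)."""
    if zipfile.is_zipfile(io.BytesIO(payload)):
        return "zip archive"
    # Not a zip: show the first bytes, which often reveal an HTML error page.
    return "not a zip, starts with: " + repr(payload[:40])


# Build a small in-memory archive to stand in for a successful download.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("flights.csv", "FL_DATE,DEP_DELAY\n2014-01-01,5\n")

print(describe_download(buffer.getvalue()))
print(describe_download(b"<html><body>Request failed</body></html>"))
```

If the second kind of message shows up for a real download, the server most likely returned something other than the archive, and the request itself is worth inspecting.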

andportnoy Apr 24, 2016

Hi, Tom.

I might be wrong, but in the second cell

with open("flights.csv", 'wb') as f:

should be replaced with

with open("flights.csv.zip", 'wb') as f:

since that's what you are then unzipping in the following cell.

P.S. Great post series, and I can't wait to see the second edition of Wes's book!
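
A minimal sketch of the point above, assuming the goal is simply that the file written and the file unzipped share one path. The in-memory archive stands in for the response body, and the temporary directory and variable names are assumptions made for this illustration.

```python
import io
import os
import tempfile
import zipfile

# Stand-in for the bytes of the HTTP response (an assumption for this sketch).
payload = io.BytesIO()
with zipfile.ZipFile(payload, "w") as zf:
    zf.writestr("flights.csv", "FL_DATE,DEP_DELAY\n2014-01-01,5\n")

workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "flights.csv.zip")

# Save the download under the .zip name...
with open(archive_path, "wb") as f:
    f.write(payload.getvalue())

# ...and open that same path when extracting.
with zipfile.ZipFile(archive_path) as zf:
    extracted_names = zf.namelist()
    zf.extractall(workdir)

print(extracted_names)
```

Keeping the path in one variable avoids the mismatch between the name used when writing and the name used when unzipping.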

andportnoy May 15, 2016

As of pandas 0.18.1:
read_csv will now raise a TypeError if parse_dates is neither a boolean, list, or dictionary

rebost Sep 18, 2016

@andportnoy, replace
df = pd.read_csv(fp, parse_dates="FL_DATE").rename(columns=str.lower)
with
df = pd.read_csv(fp, parse_dates=["FL_DATE"]).rename(columns=str.lower)

@TomAugspurger, thanks for this great resource
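
The list form of parse_dates can be tried on a tiny in-memory sample, as a rough check that the column comes back as a datetime. The two rows below are made up for illustration and are not taken from the real flights file.

```python
import io

import pandas as pd

# Two made-up rows standing in for the flights file.
sample = io.StringIO("FL_DATE,DEP_DELAY\n2014-01-01,5\n2014-01-02,-3\n")

# parse_dates takes a list of column names here, not a bare string.
df = pd.read_csv(sample, parse_dates=["FL_DATE"]).rename(columns=str.lower)
print(df.dtypes)
```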

sbraden Jun 14, 2017

I was not able to use cells 1 through 3 to download the data. I downloaded the data manually and it appears that the format has changed a bit. "FL_DATE" is now "FlightDate" for example. Thank you for writing these "not exactly for beginners" tutorials.
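
If the column names differ between downloads, one possible workaround is to look at the header first and pick whichever date column is present. This is only a sketch: `find_date_column` is a hypothetical helper, the sample rows are made up, and `DepDelay` in the second sample is an assumed name used purely for illustration.

```python
import csv
import io

# Candidate names for the flight-date column, taken from the comments here.
DATE_COLUMN_CANDIDATES = ("FL_DATE", "FlightDate")


def find_date_column(header):
    """Return the first candidate name present in the header, else None."""
    for name in DATE_COLUMN_CANDIDATES:
        if name in header:
            return name
    return None


# Made-up samples in the older and newer naming styles.
old_style = io.StringIO("FL_DATE,DEP_DELAY\n2014-01-01,5\n")
new_style = io.StringIO("FlightDate,DepDelay\n2014-01-01,5\n")

old_header = next(csv.reader(old_style))
new_header = next(csv.reader(new_style))
print(find_date_column(old_header))
print(find_date_column(new_header))
```

The name that comes back could then be passed along as `parse_dates=[name]` when reading the file with pandas.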

lidgen Jun 19, 2017

Hi @sbraden, you can open this link https://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time in your browser, set the year to 2014, tick the items in the following list, and then click the 'Download' button on the right. You will get a zip file that should have what you need.

FL_DATE
UNIQUE_CARRIER
AIRLINE_ID
TAIL_NUM
FL_NUM
ORIGIN_AIRPORT_ID
ORIGIN_AIRPORT_SEQ_ID
ORIGIN_CITY_MARKET_ID
ORIGIN
ORIGIN_CITY_NAME
ORIGIN_STATE_NM
DEST_AIRPORT_ID
DEST_AIRPORT_SEQ_ID
DEST_CITY_MARKET_ID
DEST
DEST_CITY_NAME
DEST_STATE_NM
CRS_DEP_TIME
DEP_TIME
DEP_DELAY
TAXI_OUT
WHEELS_OFF
WHEELS_ON
TAXI_IN
CRS_ARR_TIME
ARR_TIME
ARR_DELAY
CANCELLED
CANCELLATION_CODE
DIVERTED
DISTANCE
CARRIER_DELAY
WEATHER_DELAY
NAS_DELAY
SECURITY_DELAY
LATE_AIRCRAFT_DELAY
