@JoseALermaIII
Last active October 19, 2018 18:37
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# How to save lightcurve data locally?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Saving lightcurve data locally avoids repeated downloads and reduces the load on the APIs that provide it. There are many ways to save data locally; this tutorial covers three basic ones: CSV files, FITS files, and pickle serialization (\"pickling\")."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example: Saving a lightcurve as a CSV file"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The LightCurve class features a built-in `to_csv` method that generates a CSV file.\n",
"\n",
"First, obtain a Kepler Target Pixel File from the MAST data archive and convert it into a lightcurve:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from lightkurve import KeplerTargetPixelFile, KeplerLightCurve\n",
"# Open a Kepler Target Pixel File\n",
"tpf = KeplerTargetPixelFile.from_archive(6922244, quarter=4)\n",
"\n",
"# Convert the target pixel file into a light curve\n",
"lc = tpf.to_lightcurve(aperture_mask=tpf.pipeline_mask)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's confirm we have a lightcurve by checking the metadata:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Kepler'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"lc.mission"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"lc.quarter"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we can save the lightcurve using the `to_csv` method:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"lc.to_csv('K6922244q4.csv')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The file name can be any string with the `.csv` extension, but including the Kepler ID and quarter helps identify its contents.\n",
"\n",
"If no file name is specified, `to_csv` returns a CSV-formatted string instead."
]
},
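{
"cell_type": "markdown",
"metadata": {},
"source": [
"A saved CSV file can be read back without lightkurve. The cell below is a minimal sketch that uses only the Python standard library. It first writes a small stand-in file so that it runs on its own; the column names (`time`, `flux`, `flux_err`) and the values are assumptions for illustration and may not match what `to_csv` writes, so inspect the header row of a real file before relying on them."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import csv\n",
"import os\n",
"import tempfile\n",
"\n",
"# Hypothetical stand-in for a file written by to_csv(): the column\n",
"# names and values are made up, so check the header of a real file.\n",
"sample_rows = [\n",
"    ['time', 'flux', 'flux_err'],\n",
"    ['352.4', '301526.7', '12.1'],\n",
"    ['352.5', '301533.7', '12.0'],\n",
"]\n",
"csv_path = os.path.join(tempfile.mkdtemp(), 'K6922244q4.csv')\n",
"with open(csv_path, 'w', newline='') as csv_file:\n",
"    csv.writer(csv_file).writerows(sample_rows)\n",
"\n",
"# Read the file back: one dictionary per row, keyed by column name\n",
"with open(csv_path, newline='') as csv_file:\n",
"    rows = list(csv.DictReader(csv_file))\n",
"\n",
"# CSV values are read as strings, so convert them before doing math\n",
"flux = [float(row['flux']) for row in rows]\n",
"print(len(rows), flux)"
]
},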
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example: Saving a lightcurve as a FITS file"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Similar to CSV files, the LightCurve class also has a built-in `to_fits` method that generates a FITS file.\n",
"\n",
"First, obtain a Kepler Light Curve File from the MAST data archive and extract the SAP flux:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"from lightkurve import KeplerLightCurveFile, KeplerLightCurve\n",
"lcf = KeplerLightCurveFile.from_archive(6679295, quarter=4)\n",
"sapflux = lcf.SAP_FLUX"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, edit the lightcurve by removing outliers:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"no_outlier = sapflux.remove_outliers()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"An edited lightcurve can be effectively saved by using the `to_fits()` method:"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[<astropy.io.fits.hdu.image.PrimaryHDU object at 0x7f7214e71898>, <astropy.io.fits.hdu.table.BinTableHDU object at 0x7f7214e71ef0>]"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"no_outlier.to_fits('K6679295q4.fits', overwrite=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Any string can be used as the file name, but including the Kepler ID and quarter helps identify the contents.\n",
"\n",
"If no file name is specified, `to_fits` returns an `astropy.io.fits` HDU list object. As the output above shows, it returns one when a file name is given as well.\n",
"\n",
"We can now open the FITS file and confirm the contents:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'astropy.io.fits.hdu.hdulist.HDUList'>\n"
]
}
],
"source": [
"from astropy.io import fits\n",
"fits_file = fits.open('K6679295q4.fits')\n",
"print(type(fits_file))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also check the header of the first HDU object:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SIMPLE = T / conforms to FITS standards BITPIX = 8 / array data type NAXIS = 0 / number of array dimensions EXTEND = T / file contains extensions NEXTEND = 2 / number of standard extensions EXTNAME = 'PRIMARY ' / name of extension EXTVER = 1 / extension version number (not format version) ORIGIN = 'Unofficial data product' / institution responsible for file DATE = '2018-10-13' / file creation date. CREATOR = 'lightkurve' / pipeline job and program used t TELESCOP= 'KEPLER ' / telescope INSTRUME= 'Kepler Photometer' / detector type OBJECT = '6679295 ' / string version of target id KEPLERID= 6679295 / unique Kepler target identifier CHANNEL = 47 / CCD channel RADESYS = 'ICRS ' / reference frame of celestial coordinates RA_OBJ = 287.94861 / [deg] right ascension DEC_OBJ = 42.15614 / [deg] declination EQUINOX = 2000 / equinox of celestial coordinate system PROCVER = '1.0b18 ' QUARTER = 4 MISSION = 'Kepler ' DATE-OBS= '2009-12-19T21:02:09.596' CHECKSUM= 'LQigLQffLQffLQff' / HDU checksum updated 2018-10-13T20:49:34 DATASUM = '0 ' / data unit checksum updated 2018-10-13T20:49:34 END \n"
]
}
],
"source": [
"print(fits_file[0].header)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example: Pickling a lightcurve and saving it in a file"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, obtain a Kepler Light Curve File from the MAST data archive and extract both the PDCSAP and SAP fluxes:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from lightkurve import KeplerLightCurveFile\n",
"lcf = KeplerLightCurveFile.from_archive(757076, quarter=3)\n",
"pdcsap = lcf.PDCSAP_FLUX\n",
"sapflux = lcf.SAP_FLUX"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, pickle the fluxes and save them in files:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import pickle\n",
"with open('K757076q3pdc.pickle', 'wb') as pickle_file:\n",
"    pickle.dump(pdcsap, pickle_file)\n",
"with open('K757076q3sap.pickle', 'wb') as pickle_file:\n",
"    pickle.dump(sapflux, pickle_file)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Line 2 opens a file called `K757076q3pdc.pickle` in the current working directory in binary write mode (`'wb'`). Line 3 pickles the PDCSAP flux data from `757076`'s 3rd quarter and saves it in the open file, while lines 4 and 5 do the same for the SAP flux data. Files opened using `with` statements are closed automatically.\n",
"\n",
"The file names can be any string, but generic file names like `lcf` risk being overwritten later, and they make the data impossible to identify without unpickling it.\n",
"\n",
"Once the pickled data has been dumped to a file, the program can exit safely; the data remains saved in the file.\n",
"\n",
"To access the saved data, reopen the file in binary read mode (`'rb'`) and unpickle its contents:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"with open('K757076q3pdc.pickle', 'rb') as pickle_file:\n",
"    pdc = pickle.load(pickle_file)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we can access the flux data from the variable directly:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'lightkurve.lightcurve.KeplerLightCurve'>\n"
]
}
],
"source": [
"print(type(pdc))"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"KeplerLightCurve(ID: 757076)"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pdc"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Kepler'"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pdc.mission"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pdc.quarter"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ nan, 301526.7 , 301533.7 , ..., 301663.22, 301678.34,\n 301723.8 ], dtype=float32)"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pdc.flux"
]
},
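{
"cell_type": "markdown",
"metadata": {},
"source": [
"The pickle workflow above can be sketched end to end without downloading anything. The cell below is a minimal illustration, not part of lightkurve: it pickles a plain dictionary as a hypothetical stand-in for a lightcurve (the keys and values are made up) into a temporary directory. It also opens the file in exclusive-create mode (`'xb'`), which raises `FileExistsError` rather than overwriting an existing file, as one possible guard against the accidental overwrites mentioned earlier."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import pickle\n",
"import tempfile\n",
"\n",
"# Hypothetical stand-in for a lightcurve object; keys and values are made up\n",
"original = {'target_id': 757076, 'quarter': 3,\n",
"            'flux': [301526.7, 301533.7, 301663.22]}\n",
"pickle_path = os.path.join(tempfile.mkdtemp(), 'K757076q3pdc.pickle')\n",
"\n",
"# 'xb' creates the file and fails if a file with that name already exists\n",
"with open(pickle_path, 'xb') as pickle_file:\n",
"    pickle.dump(original, pickle_file)\n",
"\n",
"# Reopen in binary read mode and unpickle the contents\n",
"with open(pickle_path, 'rb') as pickle_file:\n",
"    restored = pickle.load(pickle_file)\n",
"\n",
"# A second exclusive-create attempt is refused, so nothing is overwritten\n",
"try:\n",
"    with open(pickle_path, 'xb') as pickle_file:\n",
"        pickle.dump(original, pickle_file)\n",
"    overwritten = True\n",
"except FileExistsError:\n",
"    overwritten = False\n",
"\n",
"print(restored == original, overwritten)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Only unpickle files that you created yourself or that come from a source you trust, because loading a pickle can execute arbitrary code."
]
},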
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 1
}