Skip to content

Instantly share code, notes, and snippets.

@phn
Created July 5, 2012 17:15
Show Gist options
  • Star 7 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save phn/3054997 to your computer and use it in GitHub Desktop.
Save phn/3054997 to your computer and use it in GitHub Desktop.
Astropython.org PyFits tutorial

PyFITS: FITS files in Python

In this article, we provide examples of using the python module PyFITS for working with FITS data. We first go through a brief overview of the FITS standard, and then we describe ways for accessing information in FITS files, using convenience functions defined in PyFITS. PyFITS offers facilities that provide more advanced access to information in FITS files, but these will not be discussed here. Perhaps, a future article will discuss these.

These instructions are based on the PyFITS Handbook (link is to a PDF file), which has exhaustive information on the facilties provided by PyFITS.

Brief overview of the FITS standard

FITS stands for Flexibile Image Transport System, and is the IAU recognized standard for storing astronomical data.

A FITS file consists of one or more header and data units, referred to as HDUs. An HDU consists of a header, which describes the data contained in the HDU, followed by the data itself. The first HDU is called the Primary HDU. Other HDUs, if present, are referred to as extensions.

In a valid FITS file the data part of an HDU can be empty. The simplest valid FITS file is one with just the header of the Primary HDU. A FITS file with only the Primary HDU is refered to as a Basic FITS file or a Single Image FITS (SIF) file. A FITS file with extensions is referred to as a Multi-Extension FITS (MEF) file.

A header consists of records, also called card images, each having a maximum of 80 characters. A record contains a keyword with maximum length of 8 characters, an "=" sign at the 9th position, a space at the 10th position, followed by a value for the keyword. A "/" after the value of the keyword indicates the beginning of a comment for the record and the characters from that point, through the 80th character becomes the comment. The only characters allowed in a header record are ASCII values 32 to 126. A record can be a blank record in which case the entire record will be filled with ASCII space character (ASCII 32). The special keywords HISTORY and COMMENT, do not use "=" sign and simply uses positions 11 to 80, to store the value.

There is a set of standard keywords, some mandatory and others optional, defined in the FITS standard. Document describing the standard is available from the FITS Support Office. In addition to these there are many non-standard FITS keywords that are used quite frequently. The exact keywords included in an header depends on the type of the data in the data part of the HDU. A list of standard and, non-standard but frequently used, keywords is available at http://fits.gsfc.nasa.gov/fits_dictionary.html. In addition, "World Coordinate System" (WCS) information is stored in keywords defined in appropriate standards, which are also linked to at the above url.

The value of a header keyword can be of the following types.

  1. String; ascii characters enclosed in single quotes.
  2. Logical; letter T or letter F.
  3. Integers; signed decimal numbers.
  4. Floating point numbers; similar to integers, with E or D denoting the exponent part.
  5. Complex numbers; specified as (real, imag).

The data in an HDU can be an image, which is an array of dimensions 1 to 999, a binary table or an ASCII table. There is another type of data structure called random groups, which is used only in radio interferometry work.

Data can be 8-bit characters, integers (8-bit unsigned, and 16-bit, 32-bit and 64-bit signed integers) and floating point numbers (32-bit and 64-bit). The data type is indicated using the BITPIX keyword, with value set to the number of bits, e.g., 8 for characters and unsigned integers. The values -32 and -64 are used for 32-bit and 64-bit formats, respectively.

An important thing to keep in mind is that the value stored in a data array, the raw value, need not be the actual value representing the physical quantity. If this is so, then the keywords BSCALE and BZERO will have values other than 1.0 and 0.0, respectively. The actual quantity that is being conveyed can be calculated as

physical_value = raw_value * BSCALE + BZERO.

Note

PyFITS will automatically perform this conversion, when it reads in data from a FITS file. The BITPIX value will also be changed to reflect this.

For tables, the corresponding keywords are TSCALEn and TZEROn, where "n" denotes the field, i.e., table column, for which this value is applicable. So we have, 1 <= n <= TFIELDS, where TFIELDS is the number of columns in the table.

The number of "dimensions" for data arrays is specified using the keyword NAXIS. The length of each dimension is specified using the keyword NAXISn where 1 <= n <= NAXIS.

Consider a "data cube" with 200 rows (y-axis), 200 columns (x-axis) and 4 "pages" (z-axis). Then NAXIS = 3, NAXIS1 = 200 (x-axis), NAXIS2 = 200 (y-axis) and NAXIS3 = 4 (z-axis). The data is stored in "row-major" format so that, we would access an element in row 10, col 8 and page 1 as data[0][9][8] in C, and as data[0][9][8] or data[0,9,8] in Python.

Tables can only appear in extensions and not in Primary HDU and the type of the table will be indicated in the XTENSION keyword of the header of the concerned HDU. If an extension contains image data, i.e., an array, then XTENSION = IMAGE.

For an ASCII table, i.e., XTENSION = TABLE, NAXIS will always be 2. NAXIS1 will be the total number of 8-bit characters in a row, and NAXIS2 will be the number of rows. In FITS a "column" is a position in a row, where as a "field" represents a column in the table. The number of columns in a table, is therefore, given by the value of the keyword TFIELDS.

Binary tables, i.e., XTENSION = BINTABLE, have the same interpretation for NAXIS and NAXIS2. In ASCII tables a value in a particular "cell" will be a scalar. But in a binary table this can be a vector. The value of NAXIS1 for a binary table will take this into account.

Accesing FITS files with PyFITS

PyFITS provides several "convenience" functions that allow the user to work with FITS data, without having to deal with opening and closing files. As mentioned before, there are methods available that provide much finer control over accessing FITS data.

In the following examples, we use the FITS file "WFPC2u5780205r_c0fx.fits", which is linked to by the first link in the table at http://fits.gsfc.nasa.gov/fits_samples.html.

In the code samples below, lines beginning with ">>>" are code entered into the python shell. Only part of long outputs are shown and these are indicated with the string "...".

Getting basic information on a FITS file

>>>  import pyfits
>>>  pyfits.info("WFPC2u5780205r_c0fx.fits")
Filename: WFPC2u5780205r_c0fx.fits
No.    Name         Type      Cards   Dimensions   Format
0    PRIMARY     PrimaryHDU     262  (200, 200, 4)  float32
1    U5780205R_CVT.C0H.TAB  TableHDU       353  4R x 49C      [D25.17,
D25.17, E15.7, E15.7, E15.7, E15.7, E15.7, E15.7, E15.7, E15.7,
A1, E15.7, I12, I12, D25.17, D25.17, A8, A8, I12, E15.7, E15.7,
E15.7, E15.7, E15.7, E15.7, I12, I12, I12, I12, I12, I12, I12,
I12, A48, E15.7, E15.7, E15.7, E15.7, E15.7, E15.7, E15.7,
E15.7, E15.7, E15.7, E15.7, E15.7, E15.7, E15.7, E15.7]

The output shows that the file has two HDUs. The header of the Primary HDU has 262 header records and the data part is a 4x200x200 array, i.e., 4 "pages", 200 rows and 200 columns, of 32 bit floating point numbers. The file has a table extension which has a header of 353 records and the data part has 4 rows and 49 columns. The data format in each column is also given. The total number of 8-bit bytes in one row is 747+49, i.e., 747 bytes indicated by the format strings and one 8-bit separator associated with each column.

Note that while using PyFITS we often refer to the Primary HDU as extension number 0 and the first FITS extension as extension number 1.

Getting information from headers of FITS files

The getheader function returns the header of a specified extension. If no extension is given then the header of the Primary HDU is returned.

>>> header_primary = pyfits.getheader("WFPC2u5780205r_c0fx.fits") 

The header can be treated as a python dictionary. So to find all keywords in the header we can execute

>>> header.keys()
['SIMPLE',
 'BITPIX',
 'NAXIS',
 'NAXIS1',
 'NAXIS2',
 'NAXIS3',
 'EXTEND',
 'COMMENT',
 'BSCALE',
  ...

The keys can be used to access information in the header. Note that the keys are case-insensitive.

>>> header['BITPIX']
-32
>>> header['bitpix']
-32
>>> header['naxis']
3
>>> header['naxis1']
200
>>> header['naxis2']
200
>>> header['naxis3]
4

To get the header of the first extension we can give a number, indicating the extension, to the getheader function. Here, since the table is the first extension we give the number 1. To get the primary header we can either omit the number or give the number 0.

>>> header_table = pyfits.getheader("WFPC2u5780205r_c0fx.fits",1)
>>> header_table.keys()
['XTENSION',
 'BITPIX',
 'NAXIS',
 'NAXIS1',
 'NAXIS2',
 'PCOUNT',
 'GCOUNT',
 'TFIELDS',
 'EXTNAME',
 ...
>>> header_table['naxis']
2
>>> header_table['tfields'] # A field is a table column.
49
>>> header_table['naxis2']
4
>>> header_table['naxis1']
796

If we know that a header has a keyword, and we just want to find its value, we can use the getval function.

>>> pyfits.getval('WFPC2u5780205r_c0fx.fits','bitpix',0)
-32
>>> pyfits.getval('WFPC2u5780205r_c0fx.fits','bitpix',1)
8
>>> pyfits.getval('WFPC2u5780205r_c0fx.fits','xtension',1)
'TABLE'

The last line of the output shows that the table in the FITS file is an ASCII table. A binary table will have the value 'BINTABLE' and an image array will have the value 'IMAGE'.

To see the comment associated with a keyword, we will need to access the list of Card Object in the header.

>>> header1_cards = header1.ascardlist()
>>> print header1_cards['bitpix']
BITPIX  =                    8 / Data typex

Getting data from FITS files

To get the data part from an HDU we use the getdata function, passing it the name of the FITS file and the HDU from which data needs to be obtained.

>>> data_cube = pyfits.getdata('WFPC2u5780205r_c0fx.fits",0)

To get header along with data set the header option to True.

>>> data_cube, header_data_cube = pyfits.getdata(
   'WFPC2u5780205r_c0fx.fits', 0, header=True)

If data in the HDU is an image, then the data returned is a numpy ndarray object .

>>> type(data_cube)
<type 'numpy.ndarray'>
>>> data_cube.shape
(4, 200, 200)

The shape parameter returns the shape of the array as a tuple of the format (NAXIS3, NAXIS2, NAXIS1) or in general (NAXISn, ..., NAXIS1), where n is the value of the keyword NAXIS. This means that NAXIS1 is the number of columns (x axis), NAXIS2 is the number of rows (y axis) and NAXIS3 is the number of pages (z-axes).

The following illustrates that the "data cube" behaves just like a regular array. Here, numdisplay is used to display the array on DS9. DS9 should be running before calling numdisplay.display.

>>> import copy
>>> data_cube_copy = copy.deepcopy(data_cube)
>>> data_cube_copy_page_3 = data_cube_copy[3]
>>> import numdisplay
>>> numdisplay.display(data_cube[3])
Image displayed with Z1:  -4.29414  Z2: 813.35
>>> data_cube_copy_page_3[90:110,:] = 32000.0
>>> numdisplay.display(data_cube_copy_page_3)
Image displayed with Z1:  -4.29414  Z2: 32000.0
>>> data_cube_copy_page_3[:,20:30] = 32000.0
>>> numdisplay.display(data_cube_copy_page_3)
Image displayed with Z1:  -4.29414  Z2: 32000.0
>>> data_cube_copy_page_3[:,-30:-20] = 32000.0
>>> numdisplay.display(data_cube_copy_page_3)
Image displayed with Z1:  -4.29414  Z2: 32000.0

If data in the HDU is a table then an object similar to a numpy record array is returned. As mentioned before, in FITS, what we call a "table column" is referred to as a "field". A "column" in FITS referes to an 8-bit byte in a row, i.e., a position in a row. Hence a "field" will span a range of "columns".

>>> data_table, header_table =
    pyfits.getdata("WFPC2u5780205r_c0fx.fits", 1, header=True)
>>> type(data_table)
<class 'pyfits.core.FITS_rec'>
>>> data_table[0] # First row
...
>>> data_table[0][0] # First row, first column
182.63118863080001
>>> data_table.field(0) # first column
array([ 182.63118863,  182.62552336,  182.65237923,  182.65002236])
>>> data_table.names # Names of individual columns
...
>>> len(data_table.names) # Should be 49, since there are 49 columns
49
>>> data_table.field('crval1') # Data in column named CRVAL1
array([ 182.63118863,  182.62552336,  182.65237923,  182.65002236])
>>> b = data_table.field('backgrnd')
>>> b[0] # first row in the column 'backgrnd'
-0.36763531

Creating and modifying FITS files

Keys can be easily added to and removed from headers, using the methods provided by the "header object".

>>> del header_table['bitpix']
>>> header_table.has_key('bitpix')
False
>>> header_table.update('bitpix', 8, comment="Data type in bits.")
>>> header_table['bitpix']
8

To add a key use the update method of the header. This method will add a keyword if that is not present, and will modify the value of the keyword if it is present.

>>> header_table.has_key('avg1')
False
>>> header_table.update('AVG1', 23.0, "Avg in BOX1")
>>> header_table['avg1']
23.0
>>> header_table.ascardlist()['avg1']
AVG1    =                 23.0 / Average in BOX1   

The methods, add_comment, add_history and add_blank can be used to insert COMMENT, HISTORY and blank keywords, while the get_comment and get_history methods will retrieve a list of COMMENT and HISTORY keywords in the header. One can use the syntax header['COMMENT'] to get the value of a comment, but since a FITS header can contain more than 1 COMMENT and HISTORY keywords, this will only return the first such keyword. The methods to add the above keywords also allow one to specify where the keywords must be inserted.

The code

>>> header_table.add_comment("A new comment", before="median")

will insert a COMMENT key, just before the key MEDIAN. There is a keyword parameter, after, to do the insertion after a specified keyword.

To create a new FITS file with some data and header information we can use the writeto function, providing it with a filename, some data and a header.

>>> pyfits.writeto(filename, data, header)

Example:

>>> pyfits.writeto("WFPC_copy_page_3.fits", data_copy_page_3)

Since a header was not provided, a minimal header will be inserted into the file.

The function append can be used to append an HDU to an already existing FITS file.

>>> pyfits.append(filename, data, header)

The data and header provided will be appended as an HDU extension to the FITS file represented by filename.

Note that in both writeto and append functions, the header is optional. If not given a minimal header will be inserted into the HDU extension.

The function update is used to update an HDU.

>>> pyfits.update(filename, data, header, extension)

Examples:

  • Update extension number 3 with provided data and header

    >>> pyfits.update(filename, data, header, ext=3)
  • Update extension 3 with data but do not change the header

    >>> pyfits.update(filename, data, ext=3)
  • Update the extension named 'sci'

    >>> pyfits.update(filename, data, ext='sci')

FITS header keywords

THe following are some of the important keywords defined in the FITS standard.

Keyword Description
SIMPLE Set to T if the file conforms to the FITS standard and to F if not.
BITPIX Datatype of the data in the HDU.
XTENSION Type of extension: IMAGE, TABLE, BINTABLE.
NAXIS Number of dimensions of a data array. For tables this is always 2.
NAXISn Length of each dimension. Here 1 <= n <= NAXIS. For tables NAXIS1 is the number of table rows.
TFIELDS Number of fields, i.e., table columns, in the table.
TFORMn Format of data in the nth table column. See table 7.3 in page 50 of the FITS Standard.
BSCALE Factor by which data in an image has been scaled.
BZERO Zero point of the data in an image array.
TSCALEn Same as BSCALE, but for table column n.
TZEROn Same as BZERO, but for table column n.
  • PyFITS

    Python module for working with FITS data.

  • NASA/GSFC FITS Support Office

    Authoritative information on the FITS format, including documentation and links to software for manipulating FITS files.

  • Numpy

    Basic python package for scientific computing, including facilities for the creation and manipulation of N-dimensional arrays.

  • Numdisplay

    Python module for interacting with image display softwares, such as DS9.

FITS files in PyFITS

PyFITS has many methods that provide more control over manipulating FITS files, than that provided by the convenience functions discussed in part 1 of this article. In the following sections we discuss several of these methods.

HDUList

PyFITS treats a FITS file as a collection of HDU objects, each of which represents a FITS HDU. The function pyfits.open returns an HDUList object, which is a collection of all the HDU objects in the FITS file.

>>> hdulist = pyfits.open("WFPC2u5780205r_c0fx.fits")
>>> type(hdulist)
<class 'pyfits.core.HDUList'>
>>> len(hdulist)
2
>>> hdulist.info()
Filename: WFPC2u5780205r_c0fx.fits
No.    Name         Type      Cards   Dimensions   Format
0    PRIMARY     PrimaryHDU     262  (200, 200, 4)  float32
1    U5780205R_CVT.C0H.TAB  TableHDU       353  4R x 49C ['D25.17', 
 'D25.17', 'E15.7', 'E15.7', 'E15.7', 'E15.7', 'E15.7', 
 'E15.7', 'E15.7', 'E15.7', 'A1', 'E15.7', 'I12', 'I12', 'D25.17',
 'D25.17', 'A8', 'A8', 'I12', 'E15.7', 'E15.7', 'E15.7', 'E15.7', 
 'E15.7', 'E15.7', 'I12', 'I12', 'I12', 'I12', 'I12', 'I12', 'I12',
 'I12', 'A48', 'E15.7', 'E15.7', 'E15.7', 'E15.7', 'E15.7',
 'E15.7', 'E15.7', 'E15.7', 'E15.7', 'E15.7', 'E15.7', 'E15.7', 
 'E15.7', 'E15.7', 'E15.7']

This function takes a parameter, mode, that will set the mode in which the file is opened, based on the string passed as its value:

mode string file mode
copyonwrite rb
readonly rb
update rb+
append ab+
ostream w

If we plan to modify the file then it should be opened with the appropriate mode string passed to pyfits.open, using the mode parameter.

HDU object

To access an HDU object in the HDUList, treat the latter as a list or as a dictionary with the extension names as keys.

>>> hdu0 = hdulist[0] 
>>> type(hdu0)
<class 'pyfits.core.PrimaryHDU'>
>>> for i in hdulist: print i.name
.... 
PRIMARY
U5780205R_CVT.C0H.TAB
>>> del i
>>> hdulist['primary'] is hdulist[0]
True
>>> hdulist['u5780205r_cvt.c0h.tab'] is hdulist[1]
True
>>> hdu1 = hdulist[1]
>>> type(hdu1)
<class 'pyfits.core.TableHDU'>

An HDU object has a header attribute and a data attribute, which contain the FITS header and data, respectively, of the FITS HDU.

Header

The header of an extension is stored in the header attribute of the corresponding HDU object.

>>> header0 = hdu0.header
>>> header1 = hdu1.header

In PyFITS, a header is a dictionary like object with FITS keywords as keys. A header can also be treated like a list. This feature is needed to access keywords that can occur multiple times in a header, for example COMMENT; dictionary syntax will extract only the first occurrence. Headers support only simple indexing, i.e., no slices.

>>> header0.keys()
['SIMPLE',
'BITPIX',
'NAXIS',
'NAXIS1',
'NAXIS2',
'NAXIS3',
'EXTEND',
'COMMENT',
...
>>> header1.keys()
['XTENSION',
'BITPIX',
'NAXIS',
'NAXIS1',
'NAXIS2',
'PCOUNT',
'GCOUNT',
'TFIELDS',
'EXTNAME',
...
>>> header0['ra_targ']
182.63550000000001
>>> header0['gal_long']
155.079532
>>> header1['NAXIS']
2
>>> header1['naxis'] # Case insensitive
2
>>> header1['naxis2']
4
>>> header1['tfields']
49

Keyword values in a header can be changed, and new keywords can be added using the update method of the header:

>>> header1.update("Author", "I Me Myself", "Name")
>>> header1['author']
'I Me Myself'
>>> del header1['author']
>>> header1['author']
...
KeyError: "Keyword 'author' not found."

The changes occur only in memory and so, in order to make the changes permanent, the FITS file must be written to disk.

To add new HISTORY, COMMENT and blank records use the appropriately named methods of the header object. Blank records will be placed at the end of the header, while COMMENT and HISTORY records will be placed immediately after the last occurrence of the keyword, unless an explicit position is given.

>>> header1.add_history("Changing history")
>>> header1.add_blank("A blank record. No keyword")
>>> header1.add_comment("A new comment line.")
>>> header1.add_comment("Some comment", before='histwide')
>>> header1.add_comment("Some more", after='skewness')
>>> header1.keys()
...
'MEDSHADO',
'COMMENT',
'HISTWIDE',
'SKEWNESS',
'MEANC10',
'MEANC25'  
...

The before and after argument can also be integers, in which case they will be used as list index numbers.

From the above output, we can see that only the first COMMENT record can be accessed using a dictionary key. To see that we have indeed added a second COMMENT record after the SKEWNESS record , we can print the header or use the get_comment method of the header object. The latter returns all the COMMENT records as a list of strings. The method get_history extracts all HISTORY records.

>>> print header1
...
MEDSHADO= 'median pixel value in shadow of pyramid edge' /
COMMENT Some comment                    
HISTWIDE= 'width of the histogram' /
SKEWNESS= 'skewness of the histogram' /
COMMENT Some more
MEANC10 = 'mean of a 10x10 region at center of chip' / 
... 
>>> header1.get_comment()
["  FITS (Flexible Image Transport System) format is defined in 'Astronomy",
"  and Astrophysics', volume 376, page 359; bibcode: 2001A&A...376..359H",
'A new comment line.',
'Some comment',
'Some more']

The following code can be used to find the position of a COMMENT card in a Card List. The ascardlist method is explained in the Card object and card lists section:

>>> for i, card in enumerate(header1.ascardlist()):
 ....:     if card.key == 'COMMENT':
 ....:         print i, card.key, card.value, card.comment
 ....:         
 ....:         
7 COMMENT   FITS (Flexible Image Transport System) format is defined in 'Astronomy 
8 COMMENT   and Astrophysics', volume 376, page 359; bibcode: 2001A&A...376..359H 
9 COMMENT A new comment line. 
60 COMMENT Some comment 
63 COMMENT Some more

To get all the blank records we can check to see if a card object in a card list has a blank key:

>>> for i, card in enumerate(header1.ascardlist()):
....:    if card.key == "":
....:        print i, card.value, card.comment  
....:
75  
79  
80       / WFPC-II DATA DESCRIPTOR KEYWORDS 
81  
86  
87       / SCIENCE INSTRUMENT CONFIGURATION 
88  
...
185  
192  
193       / EXPOSURE INFORMATION 
194  
199  
206  
207       / TARGET & PROPOSAL ID 
215    
>>> print data_header_cards[205:210]
EXPFLAG = 'NORMAL       '      / Exposure interruption indicator                

              / TARGET & PROPOSAL ID                                            
TARGNAME= 'NGC4151                       ' / proposer's target name             
RA_TARG =   1.826355000000E+02 / right ascension of the target (deg) (J2000)

Card object and card lists

A header itself consists of Card Objects, each representing a record in the FITS header. The list of card object in a header can be obtained using the ascardlist function of the header object.

>>> card_list = header1.ascardlist()
>>> len(card_list)
>>> cardlist.keys()
...
...
>>> type(card_list)
<class 'pyfits.core.CardList'>
>>> card_list['BITPIX']
BITPIX  =                    8 / 8-bits per 'pixels'
>>> card_list[1]
BITPIX  =                    8 / 8-bits per 'pixels'

Unlike the Header object, a card list supports list slicing:

>>> card_list[0:10].keys()
['XTENSION',
 'BITPIX',
 'NAXIS',
 'NAXIS1',
 'NAXIS2',
 'PCOUNT',
 'GCOUNT',
 'TFIELDS',
 'EXTNAME',
 '']

The maximum length of a standard FITS keyword is 8 characters and each record can have a maximum of 80 characters. There are two keywords that can be used to overcome these limitations: HIERARCH and CONTINUE. See pages 19 and 20 of the PyFITS Handbook, for more information.

Each object in a card list is a Card Image. A Card Image represents one FITS header record.

>>> a_card = card_list[0]
>>> type(a_card)
<class 'pyfits.core.Card'>
>>> print a_card
XTENSION= 'TABLE   '           / Ascii table extension

A Card object has three main attributes: key, value and comment. They represent the keyword name, the value and the comment of the corresponding FITS record, respectively.

>>> a_card.key
'XTENSION'
>>> a_card.value
'TABLE'
>>> a_card.comment
'Ascii table extension'

Creating a new header

There are two ways of creating a new FITS header: create an empty header and use its update method to add records, or create a list of Card images and then create a header by passing this list to the pyfits.Header constructor.

The following illustrates the first method:

>>> header_0 = pyfits.Header()
>>> header_0.update(key="simple", value="T", comment="Conforms to standard")
>>> header_0.update(key="bitpix", value=32)
>>> header_0.update(key="date", value="2010-10-10")
>>> header_0.update(key="origin", value="kpno", comment="2010B: Prj. X")
>>> print header_0
SIMPLE  = 'T       '           / Conforms to standard  
BITPIX  =                   32
DATE    = '2010-10-10'   
ORIGIN  = 'kpno    '           / 2010B: Prj. X

A Card object can be created using the pyfits.Card constructor. The constructor takes as arguments, a key, a value and a comment. Note that within PyFITS, keywords are case insensitive. Also, the constructor will check if the resultant FITS record is valid or not. If for some reason, we want to create non-conforming Cards, then use the pyfits.Card().fromstring method.

>>> card_0 = pyfits.Card("simple", "T")
>>> card_1 = pyfits.Card("bitpix', -32)
>>> card_2 = pyfits.Card("date", "2010-10-10")
>>> card_3 = pyfits.Card("origin", "kpno", "2010B run for Prj. X")
>>> card_4 = pyfits.Card("dddddddddddddddddddddd","123", "222")
...
ValueError: keyword name dddddddddddddddddddddd is too long (> 8), use HIERARCH.
>>> print card_0
SIMPLE  = 'T       '
>>> print card_1
BITPIX  =                  -32
>>> print card_2
DATE    = '2010-10-10'
>>> print card_3
ORIGIN  = 'kpno    '           / 2010B: Prj. X

The verify method of a Card object can be used to verify that it conforms to the FITS standard.

If the header is going to be part of an extension, rather than the Primary HDU, then add the keyword "XTENSION", with value set to "IMAGE", "TABLE" or "BINTABLE", as appropriate.

Once we have a set of Cards, we can create a header by passing the list of Cards to the pyfits.Header constructor.

>>> card_list = [card_0, card_1, card_2, card_3]
>>> header_0 = pyfits.Header(cards=card_list)
>>> print header_0  
SIMPLE = 'T ' 
BITPIX = -32 
DATE = '2010-10-10' 
ORIGIN = 'kpno ' / 2010B: Prj. X

Data

As mentioned before, a FITS HDU consists of a data part and a header part. The data part can be accessed through the data attribute of a HDU object. In the Primary HDU and extensions with XTENSION = IMAGE, the data part can be a multi-dimensional array. Extensions with XTENSION = TABLE and XTENSION = BINTABLE, will contain tabular data in ASCII and binary formats, respectively.

Image data

Image data consists of a multi-dimensional array, and can occur as the data part of the Primary HDU or an extension HDU with XTENSION = IMAGE. In PyFITS the image data is stored as numpy arrays. The dimensions and data type of this array must match the appropriate keywords in the header.

To create a Primary HDU with image data we can use the pyfits.PrimaryHDU constructor.

>>> data = numpy.zeros((2,4,3), dtype=numpy.float64)
>>> primary_hdu = pyfits.PrimaryHDU(data=data, header=header_0)
>>> print primary_hdu.header
SIMPLE = T / conforms to FITS standard 
BITPIX = -64 / array data type 
NAXIS = 3 / number of array dimensions 
NAXIS1 = 3 
NAXIS2 = 4
NAXIS3 = 2 
DATE = '2010-10-10' 
ORIGIN = 'kpno ' / 2010B: Prj. X

The header, header_0, is the same as that in the creating a new header section of this document; PyFITS has automatically changed the BITPIX value to reflect the actual data type, and has also inserted the NAXIS and NAXISn keywords to store information on the dimensions of the data. It has also added comments to the keyword SIMPLE.

See page 22 of the PyFITS Handbook for information on how to handle "scaled" data. Page 24 of the document lists methods for working with large data sets.

Tables

The header associated with a table extension stores meta-data for the columns in a table, in addition to the regular FITS header records. Some of the metadata stored are:

FITS keyword Property
NAXIS2 Number of rows
TFIELDS Number of columns
TFORMn Format of data in column n
TTYPEn Name of column n
TUNITn Unit for data in column n
>>> hdu_list = pyfits.open("WFPC2u5780205r_c0fx.fits")
>>> table_hdu = hdu_list[1]
>>> table_header = table_hdu.header
>>> table_header['tfields']
49
>>> table_header['naxis2']
4
>>> table_header['tform1']
'D25.17'
>>> table_header['tform2']
'D25.17'
>>> table_header['tform3']
'E15.7'
>>> table_header['ttype1']
'CRVAL1'
>>> table_header['ttype2']
'CRVAL2'
>>> table_header['ttype3']
'CRPIX1'

A table HDU object has a property columns, which gives the properties of the columns in the table; this object is a ColDefs object. A table column stores several properties that describe the data in it: name, format, unit, a display format disp, null values, the data array and others. Of these, only format is mandatory.

>>> table_hdu.columns
ColDefs(name = 'CRVAL1'
format = 'D25.17'
unit = ' '
disp = 'G25.16'
start = 1, name = 'CRVAL2'
format = 'D25.17'
unit = ' '
disp = 'G25.16'
start = 27, name = 'CRPIX1'
format = 'E15.7'
unit = ' '
disp = 'G15.7'
start = 53, name = 'CRPIX2'
...

Data in tables, both ASCII and binary tables, are stored using a data type similar to numpy record arrays. The data part of the table has several properties that can be used to extract rows and columns from the table.

>>> table_data = table_hdu.data
>>> table_data.field(0) # First column
array([ 182.63118863,  182.62552336,  182.65237923,  182.65002236])
>>> table_hdu.names
['CRVAL1',
'CRVAL2',
'CRPIX1',
'CRPIX2',
'CD1_1',
'CD1_2',
'CD2_1',
'CD2_2',
'DATAMIN',
'DATAMAX',
...
'MEANC100',
'MEANC200',
'MEANC300',
'BACKGRND']
>>> table_data.field('crval1') # Column named 'crval1'
array([ 182.63118863,  182.62552336,  182.65237923,  182.65002236])
>>> table_data['crval1'] # Column named 'crval1'
array([ 182.63118863,  182.62552336,  182.65237923,  182.65002236])
>>> table_data[0] # First row
(182.63118863080001, 39.396336734110001, 420.0, 424.5,
...
)
>>> table_data[0]['crval1'] # First row of column 'crval1'
182.63118863080001

Some differences between ASCII and binary tables are as follows:

  • Binary tables can hold all data types allowed in the FITS standard. ASCII tables can only have characters, integers and floating point numbers. It cannot have Boolean and complex numbers.
  • The value in an ASCII table cell must be a scalar. In binary tables, a cell can contain an array.
  • The TFORMn values differ for ASCII and binary tables.

The keyword TFORMn indicates the format of the data stored in column number n.

Format codes for ASCII tables:

Aw Character string of width w; w is an integer
Iw Integers
Fw.d 32-bit floating point number
Ew.d, Dw.d 64-bit floating point number, in exponential notation

Format codes for binary tables:

L Boolean of length 1 byte
B Unsigned byte
I, J, K 16-bit, 32-bit and 64-bit integers, respectively
A Characters
E, D 32-bit and 64-bit floating point numbers, respectively
C, M 32-bit and 64-bit complex numbers, respectively

For binary tables, an integer preceding the format codes for numerical types sets the length of data in each cell; a number greater than 1 indicates that the each cell holds a multidimensional data array. An integer preceding or succeeding 'A' indicates the length of the string in each cell of the column.

Creating a new FITS table

Creating a new FITS table involves creating several Column objects, creating a ColDefs object using these and then creating a TableHDU or a BinTableHDU using the ColDefs object. Since tables can only be present in FITS extensions, an HDUList is created containing a PrimaryHDU and the table HDU. The resulting HDUList can then be written onto the disk. Headers can be provided explicitly; if not present PyFITS will insert a header with appropriate keywords for the table HDU.

In the following example, we will create a binary table with 3 columns with names "target", "counts", and "spectrum". The "target" column will hold names of objects; a name will be a string of maximum 10 characters. The "counts" column will hold 32 bit integer numbers. Each cell in the "spectrum" column will hold an array of 1000 floating point numbers. The table will hold data for 4 objects, i.e., 4 rows. We will name the table "MYTAB", which will be stored in the header keyword EXTNAME.

>>> targets = numpy.array(['ngc1', 'ngc2', 'ngc3', 'ngc4'])
>>> counts = numpy.array([312, 334, 308, 317])
>>> spectrum = numpy.ones((4,1000), dtype=numpy.float32)

Each row in the spectrum array, in the code fragment above, is the "spectrum" for an object.

PyFITS Column constructor is used to create columns. This constructor takes as argument a name for the column, a format describing the format of the data and the data array itself.

>>> c1 = pyfits.Column(name='target', format='10A', array=targets)
>>> c2 = pyfits.Column(name="counts", format="J", array=counts)
>>> c3 = pyfits.Column(name="spectrum", format="1000E", array=spectrum)

To create a new table we can use the pyfits.new_table function. This function will accept ColDefs object or a list of Column object. So we don't have to perform the intermediate step of creating a ColDefs object from Column object.

>>> table_hdu = pyfits.new_table([c1, c2, c3])
>>> table_hdu.columns
ColDefs(name = 'target'
format = '10A', name = 'counts'
format = 'J', name = 'spectrum'
format = '1000E')
>>> print table_hdu.header
XTENSION= 'BINTABLE'           / binary table extension 
BITPIX  =                    8 / array data type        
NAXIS   =                    2 / number of array dimensions  
NAXIS1  =                 4014 / length of dimension 1
NAXIS2  =                    4 / length of dimension 2  
PCOUNT  =                    0 / number of group parameters
GCOUNT  =                    1 / number of groups
TFIELDS =                    3 / number of table fields
TTYPE1  = 'target  '
TFORM1  = '10A     '  TTYPE2  = 'counts  '
TFORM2  = 'J       '  TTYPE3  = 'spectrum'
TFORM3  = '1000E   '  >>> table_hdu.header['naxis2'] # Number of rows
4
>>> table_hdu.header['naxis1'] # Number of bytes in a row = 10 + 4 + 4*1000
4014
>>> table_hdu.header['tfields'] # Number of columns
3
>>> table_hdu.header['tform1'] # Format of first column
10A
>>> table_hdu.header['tform2'] # Format of second column
J
>>> table_hdu.header['tform1'] # Format of third column
1000E

Note that PyFITS has automatically inserted the required keywords with proper values. By default the table created is a binary table; pass the keyword tbtype with value set to TableHDU to create ASCII tables. To add additional records to the header, we can use the update method of the header.

The new_table function takes a keyword parameter, nrows, that can be used to specify the number of rows in a table. If we need to add more rows, we can create a new table with data from the existing table, and setting nrows to the desired value. The empty rows can then be filled in with data.

>>> table_hdu = pyfits.new_table(table_hdu.data, nrows=10)
>>> print table_hdu.header
XTENSION= 'BINTABLE'           / binary table extension  
BITPIX  =                    8 / array data type  
NAXIS   =                    2 / number of array dimensions
NAXIS1  =                 4014 / length of dimension 1
NAXIS2  =                   10 / length of dimension 2
PCOUNT  =                    0 / number of group parameters
GCOUNT  =                    1 / number of groups
TFIELDS =                    3 / number of table fields
TTYPE1  = 'target  '
TFORM1  = '10A     '
TTYPE2  = 'counts  '
TFORM2  = 'J       '
TTYPE3  = 'spectrum'
TFORM3  = '1000E   '
>>> print table_hdu.header['naxis2']
10
>>> print table_hdu.data['target']
['ngc1' 'ngc2' 'ngc3' 'ngc4' '0.0' '0.0' '0.0' '0.0' '0.0' '0.0']
>>> table_hdu.data[4]['target'] = "ngc1"
>>> print table_hdu.data['target']
['ngc1' 'ngc2' 'ngc3' 'ngc4' 'ngc5' '0.0' '0.0' '0.0' '0.0' '0.0']
>>> table_hdu.data[4]['counts'] = 315
>>> print table_hdu.data['counts']
[312 334 308 317 315   0   0   0   0   0]
>>> table_hdu.data[4]['spectrum'] = numpy.random.random(1000)
>>> print table_hdu.data[4]['spectrum']
[  7.87860215e-01   5.22207081e-01   4.59663749e-01   2.55125035e-02
   1.91508323e-01   9.30920959e-01   7.40450263e-01   6.01666987e-01
   7.82849014e-01   8.67393970e-01   7.16790557e-02   5.63477933e-01
   ...
   4.05643554e-03   9.19736564e-01   8.43929410e-01   2.26114035e-01
   9.95179892e-01   4.55000669e-01   4.56752568e-01   2.68411219e-01]

Since a table cannot be part of the Primary HDU, we create an empty Primary HDU and use it with the table HDU to create a FITS file.

>>> phdu = pyfits.PrimaryHDU()
>>> print phdu.header
SIMPLE  =                    T / conforms to FITS standard 
BITPIX  =                    8 / array data type
NAXIS   =                    0 / number of array dimensions
EXTEND  =                    T 
>>> table_hdu.name = "MyTab" # EXTNAME header keyword
>>> print table_hdu.header
XTENSION= 'BINTABLE'           / binary table extension
BITPIX  =                    8 / array data type
NAXIS   =                    2 / number of array dimensions
NAXIS1  =                 4014 / length of dimension 1
NAXIS2  =                   10 / length of dimension 2
PCOUNT  =                    0 / number of group parameters
GCOUNT  =                    1 / number of groups
TFIELDS =                    3 / number of table fields
TTYPE1  = 'target  ' 
TFORM1  = '10A     '
TTYPE2  = 'counts  '
TFORM2  = 'J       '
TTYPE3  = 'spectrum'
TFORM3  = '1000E   '
EXTNAME = 'MYTAB   '           / extension name 
>>> hdulist = pyfits.HDUList([phdu, table_hdu])
>>> hdulist.info()
Filename: (No file associated with this HDUList)
No.    Name         Type      Cards   Dimensions   Format
0    PRIMARY     PrimaryHDU       4  ()            
1    MYTAB        BinTableHDU     15  10R x 3C      ['10A', 'J', '1000E']
>>> hdulist.writeto('table.fits')
>>> pyfits.info('table.fits')
Filename: table.fits
No.    Name         Type      Cards   Dimensions   Format
0    PRIMARY     PrimaryHDU       4  ()            uint8
1    MYTAB       BinTableHDU     15  4R x 3C       [10A, J, 1000E]

Section 5.2, in page 26, of the PyFITS Handbook describes ways to append columns to a table and merge two tables.

Finally

http://twitter.com/#!/doug_burke/status/2375100477214720

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment