@crmackay
Last active May 7, 2021 16:27
description of illumina's bcl file format

Taken from: https://github.com/broadinstitute/picard/blob/6f1c99263c2b8f712f06a07b50d4db0550ed4885/src/java/picard/illumina/parser/readers/BclReader.java#L47-74

BCL files are binary base call and quality score files containing one (base, quality) pair for each successive cluster. The file is structured as follows:

Bytes 1-4 : unsigned int numClusters
Bytes 5 to numClusters + 4 : 1 byte base/quality score per cluster
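The layout above can be sketched in Python as follows. This is a minimal sketch, not Picard's implementation: the helper name `split_bcl` is hypothetical, the cluster count is assumed to be stored as a little-endian 32-bit unsigned integer, and the input is assumed to be the uncompressed file contents already read into memory.

```python
import struct
from typing import Tuple


def split_bcl(data: bytes) -> Tuple[int, bytes]:
    """Split raw BCL bytes into (numClusters, per-cluster bytes).

    Hypothetical helper; assumes a little-endian 4-byte cluster count.
    """
    if len(data) < 4:
        raise ValueError("BCL data must start with a 4-byte cluster count")
    # Bytes 1-4: unsigned 32-bit cluster count ("<I" = little-endian uint32).
    (num_clusters,) = struct.unpack_from("<I", data, 0)
    # One base/quality byte per cluster follows the 4-byte header.
    records = data[4:4 + num_clusters]
    if len(records) != num_clusters:
        raise ValueError("fewer cluster bytes than the header declares")
    return num_clusters, records
```

The length check is a design choice for the sketch: it rejects truncated input early instead of silently returning fewer records than the header promises.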

The base/quality scores are organized as follows (with one exception, SEE BELOW):

The two rightmost bits (the LEAST significant bits) indicate the base, where A=00 (0x00), C=01 (0x01), G=10 (0x02), and T=11 (0x03).

The remaining six bits compose the quality score, which is an unsigned int.

EXCEPTION: If a byte is entirely 0 (i.e. byteRead == 0), it is a no-call: the base becomes '.' and the quality becomes 2, the default Illumina masking value.
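The decoding rule and its exception can be sketched in Python as follows. This is an illustrative sketch rather than the Picard code; the names `BASES` and `decode_bcl_byte` are hypothetical.

```python
from typing import Tuple

BASES = "ACGT"  # index is the 2-bit base code: A=0, C=1, G=2, T=3


def decode_bcl_byte(value: int) -> Tuple[str, int]:
    """Decode one BCL byte into a (base, quality) pair (hypothetical helper)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    if value == 0:
        # Exception: an all-zero byte is a no-call with the masking quality 2.
        return ".", 2
    base = BASES[value & 0b11]  # the two least significant bits select the base
    quality = value >> 2        # the remaining six bits are the quality score
    return base, quality
```

Note that the zero check has to come first: without it, a zero byte would decode as base A with quality 0 instead of a no-call.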

For example, a byte with the binary value 10001011 is transformed as follows:

Value read: 10001011 (0x8B)

Quality     Base
100010      11
00100010    0x03
0x22        T
34          T

So the output base/quality will be (T/34).

kemin711 commented Feb 5, 2021

I could not find a full description of the bcl file format. Is there an official document? Maybe one can read the source code of the bcl2fastq program, which is still open source (last updated 2017). The newer bcl convert program is only distributed as a binary.

kemin711 commented Feb 5, 2021

Found it: pages 7-9 of the Illumina bcl2fastq program manual have some description.
