@crmackay
Last active May 7, 2021 16:27
Description of Illumina's BCL file format

Taken from: https://github.com/broadinstitute/picard/blob/6f1c99263c2b8f712f06a07b50d4db0550ed4885/src/java/picard/illumina/parser/readers/BclReader.java#L47-74

BCL files are binary base call and quality score files containing one (base, quality) pair for each successive cluster. The file is structured as follows:

  Bytes 1-4                : unsigned int numClusters
  Bytes 5 to numClusters+4 : one base/quality byte per cluster
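The layout above can be sketched in Python roughly as follows. The function name split_bcl is made up for illustration, and the little-endian byte order of the cluster count is an assumption that the description above does not state (it matches how Illumina documents the format, but check it against your own files).

```python
import struct


def split_bcl(data: bytes):
    """Split raw BCL bytes into (num_clusters, payload).

    Hypothetical helper for illustration, not part of Picard.
    """
    if len(data) < 4:
        raise ValueError("BCL data is shorter than the 4-byte header")
    # Assumption: the cluster count is a little-endian unsigned 32-bit int.
    (num_clusters,) = struct.unpack_from("<I", data, 0)
    # One base/quality byte per cluster follows the header.
    payload = data[4:4 + num_clusters]
    if len(payload) != num_clusters:
        raise ValueError("BCL data is truncated")
    return num_clusters, payload


# A tiny in-memory example: a header announcing 3 clusters, then 3 bytes.
example = struct.pack("<I", 3) + bytes([0x8B, 0x00, 0x04])
count, payload = split_bcl(example)
print(count, payload.hex())
```

Working on bytes already held in memory keeps the sketch independent of how the file is opened; real BCL files are often gzip-compressed, which this sketch does not handle.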

The base/quality bytes are organized as follows (with one exception, see below): the 2 rightmost bits (the least significant bits) indicate the base, where A=00 (0x00), C=01 (0x01), G=10 (0x02), and T=11 (0x03).

The remaining 6 bits compose the quality score, which is an unsigned int.

EXCEPTION: If a byte is entirely 0 (i.e. byteRead == 0), it is a no-call: the base becomes '.' and the quality becomes 2, the default Illumina masking value.
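A minimal Python sketch of the per-byte rule, including the no-call exception. The names decode_call and BASES are made up for illustration; this is not the Picard implementation, which is written in Java.

```python
BASES = "ACGT"  # index = value of the 2 least significant bits


def decode_call(byte: int):
    """Decode one BCL byte into a (base, quality) pair."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte value (0-255)")
    if byte == 0:
        # No-call: masked base with the default quality of 2.
        return ".", 2
    base = BASES[byte & 0b11]  # low 2 bits select the base
    quality = byte >> 2        # high 6 bits are the quality score
    return base, quality


for raw in (0x8B, 0x00, 0x04):
    print(hex(raw), decode_call(raw))
```

The zero check has to come first: without it, a zero byte would decode as base A with quality 0 instead of a no-call.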

For example, a byte with the binary value 10001011 is decoded as follows:

Value read: 10001011 (0x8B)

Quality     Base
100010      11
00100010    0x03
0x22        T
34          T

So the output base/quality pair is (T, 34).
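The same arithmetic can be checked step by step in Python. This is only a sketch of the bit manipulation for this one example byte; the variable names are chosen for illustration.

```python
value = 0b10001011  # the example byte, 0x8B

quality_bits = value >> 2  # drop the 2 base bits, leaving the top 6 bits
base_bits = value & 0b11   # keep only the 2 least significant bits
base = "ACGT"[base_bits]

print(f"{value:08b} -> quality {quality_bits:06b} ({quality_bits}), "
      f"base {base_bits:02b} ({base})")

# Packing the two fields back together reproduces the original byte.
repacked = (quality_bits << 2) | base_bits
print(hex(repacked))
```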

@kemin711
kemin711 commented Feb 5, 2021

Pages 7-9 of the Illumina bcl2fastq program manual also describe this format.