NickRoz1

## test.py
import math
import mmap
import os
import random
import tempfile

MEGABYTE_SIZE = 1048576  # 1 MiB in bytes (1024 * 1024)
FILE_SIZE = 4000 * MEGABYTE_SIZE  # 4000 MiB
BUFFER_SIZE = 1000 * MEGABYTE_SIZE  # 1000 MiB
RECORD_SIZE = 300  # size of one record (presumably in bytes)

## final_blog.md

      
        
          
            
              
Last active October 3, 2019 11:56
            
              
Final Blog

Nick Rozinsky

The code is located at https://github.com/NickRoz1/cBAM
History:

The CBAM file format is intended to eliminate unnecessary disk overhead when processing BAM data requires only a few fields of each BAM record. The initial implementation was a simple tool for parsing full columns and rowgroups; it wasn't suitable for big files and lacked features. The current implementation provides a convenient API.
Since the July blog post, the CBAM reader implementation has been completely refactored. It previously lacked two major features: iteration over a column in foreach, and iteration over several columns simultaneously.
These features are implemented on the basis of a new Column primitive. Now, to acquire a column of a CBAM file, one may use:
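
To make the disk-overhead argument concrete, here is a minimal Python sketch comparing how many bytes must be read to extract one field from a row-oriented layout versus a column-oriented one. It is only an illustration under stated assumptions: the fixed-width record with `position`, `mapq` and `flag` fields, and the helper names, are hypothetical and do not reflect the actual BAM or CBAM on-disk layout.

```python
import struct

# Hypothetical fixed-width record: (position: int32, mapq: uint8, flag: uint16).
# Real BAM records are variable-length; this layout is an assumption made
# purely for illustration.
RECORD = struct.Struct("<iBH")

records = [(100 * i, i % 60, i % 4) for i in range(1000)]

# Row-oriented layout: whole records stored one after another.
row_blob = b"".join(RECORD.pack(*r) for r in records)

# Column-oriented layout: each field stored contiguously on its own.
pos_col = struct.pack(f"<{len(records)}i", *(r[0] for r in records))
mapq_col = bytes(r[1] for r in records)
flag_col = struct.pack(f"<{len(records)}H", *(r[2] for r in records))


def mapq_from_rows(blob):
    """Extract mapq from the row layout; every record has to be scanned."""
    return [mapq for _, mapq, _ in RECORD.iter_unpack(blob)]


def mapq_from_column(col):
    """Extract mapq from the column layout; only this column is read."""
    return list(col)


bytes_read_rows = len(row_blob)    # 7 bytes per record
bytes_read_column = len(mapq_col)  # 1 byte per record
```

Both functions return the same values, but in this toy layout the row-oriented read touches seven times as many bytes as the column-oriented one.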
	MEGABYTE_SIZE = 1048576
	FILE_SIZE = 4000 * MEGABYTE_SIZE
	BUFFER_SIZE = 1000 * MEGABYTE_SIZE
	RECORD_SIZE = 300

	import random
	import tempfile
	import os
	import mmap
	import math
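
The two iteration features described above (foreach over one column, and lockstep iteration over several columns) can be sketched in Python as follows. This is not cBAM's actual API: the `Column` class, its constructor argument and `iterate_together` are hypothetical names chosen for illustration, and in-memory lists stand in for rowgroups that a real reader would decode from disk.

```python
class Column:
    """Hypothetical column that yields values rowgroup by rowgroup."""

    def __init__(self, rowgroups):
        # rowgroups: list of lists, standing in for chunks decoded from disk
        self._rowgroups = rowgroups

    def __iter__(self):
        # Walk the rowgroups in order and yield their values one by one.
        for group in self._rowgroups:
            yield from group


def iterate_together(*columns):
    """Iterate several columns in lockstep, yielding one tuple per record."""
    return zip(*columns)


names = Column([["r1", "r2"], ["r3"]])
mapqs = Column([[60, 0], [37]])

# Iteration over a single column.
single = [name for name in names]

# Iteration over several columns simultaneously.
paired = list(iterate_together(names, mapqs))
```

Because `__iter__` creates a fresh generator each time, the same `Column` object can be iterated more than once, alone or together with other columns.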