Skip to content

Instantly share code, notes, and snippets.

  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
Star You must be signed in to star a gist
Save samuell/5559768 to your computer and use it in GitHub Desktop.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXFLEN 70000000 /* Larger than the file. */
/* Assume that sizeof(short)==2, unaligned reads ok, etc. */
int main()
{
char *m=malloc(MAXFLEN);
char tablegc[65536]={0,};
char tabletotal[65536]={0,};
int gc=0;
int total=0;
FILE *f=fopen("Homo_sapiens.GRCh37.67.dna_rm.chromosome.Y.fa","r");
int items=fread(m,1,MAXFLEN,f); /* Read the whole file into memory. */
char *ptr=m;
short *sptr;
int nums, i;
while (*ptr++!='\n'); /* Find end of first line. */
for (i=0; i<256; i++)
{
tabletotal['A'*256+i]++;
tabletotal[i*256+'A']++;
tabletotal['T'*256+i]++;
tabletotal[i*256+'T']++;
tabletotal['C'*256+i]++;
tabletotal[i*256+'C']++;
tabletotal['G'*256+i]++;
tabletotal[i*256+'G']++;
tablegc['C'*256+i]++;
tablegc[i*256+'C']++;
tablegc['G'*256+i]++;
tablegc[i*256+'G']++;
}
nums=(items-(ptr-m));
nums=nums/2+nums%2; /* There is also EOL, so no need to worry about that, just
round the number of items up. */
sptr=(short*)ptr;
#pragma omp parallel for private(i),reduction(+:gc),reduction(+:total)
for (i=0; i<nums; i++) {
short s=sptr[i];
total+=(int)tabletotal[(int)s];
gc+=(int)tablegc[(int)s];
}
fclose(f);
free(m);
printf("%.10f\n",(100.*gc)/total);
return 0;
}
@JervenBolleman
Copy link

I believe this produces incorrect output for Homo_sapiens.GRCh38.dna.chromsome.1.fa

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment