@gpertea
Created February 1, 2018 16:09
fast in-place parse of a list of comma-delimited int values in a string SAM tag
char* str=brec->tag_str("ZD"); //let's say the tag is "ZD"
GVec<int> vals;
char* p=str; //slice start
for (int i=0;;++i) {
    char ch=str[i];
    if (ch==',') {
        str[i]=0; //terminate the current slice in place
        int v=atoi(p); //check for int parsing errors?
        vals.Add(v);
        p=str+i+1; //next slice starts right after the delimiter
    } else if (ch==0) {
        int v=atoi(p); //check for int parsing errors?
        vals.Add(v);
        break;
    }
}
//now vals should have all parsed int values
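
The `//check for int parsing errors?` comments above leave error handling open. One possible answer, as a minimal sketch: swap atoi() for strtol(), which reports where parsing stopped, so malformed or out-of-range fields can be rejected. As a side effect the string no longer needs to be modified. The function name `parseIntList` and the use of std::vector instead of GVec are assumptions made here to keep the example self-contained.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <vector>

// Hypothetical helper (not part of the original gist): parses a
// comma-delimited list of ints from a NUL-terminated string.
// Returns false at the first empty, malformed or out-of-range field.
bool parseIntList(const char* str, std::vector<int>& vals) {
    assert(str != nullptr);
    const char* p = str; // start of the current field
    while (true) {
        char* end = nullptr;
        errno = 0;
        long v = std::strtol(p, &end, 10);
        if (end == p) return false; // no digits were consumed
        if (errno == ERANGE || v < INT_MIN || v > INT_MAX) return false;
        vals.push_back(static_cast<int>(v));
        if (*end == '\0') return true; // that was the last field
        if (*end != ',') return false; // unexpected character after the number
        p = end + 1; // skip the delimiter
    }
}
```

Note that strtol() skips leading whitespace, so a field like " 7" is accepted; values parsed before a bad field remain in `vals` when the function returns false.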

gpertea commented Feb 1, 2018

I wonder whether, for longer strings, using strchr() would be faster (assuming it is implemented with SSE optimizations). In that case the code would look like this:

char* str=brec->tag_str("ZD"); //let's say the tag is "ZD"
GVec<int> vals;
char *p=str; //slice start
char *pd=NULL; //position of last delimiter found
while (1) {
    pd=strchr(p, ','); //comma-delimited, as in the loop above
    if (pd!=NULL) {
        *pd=0; //end the slice string
        int v=atoi(p);
        vals.Add(v);
        p=pd+1;
    }
    else {
        int v=atoi(p); //last slice, no delimiter left
        vals.Add(v);
        break;
    }
}

If any one of several single-character delimiters may separate the values, the code is very similar: use strpbrk(p, delim) instead of strchr(), where delim is a string holding the set of characters that can act as slice delimiters (e.g. const char* delim=",;.:";).
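
The strpbrk() variant described above might look like the following sketch. It is written as a standalone function so it can be compiled on its own; the name `parseIntsAnyDelim` and the use of std::vector instead of GVec are assumptions for illustration, not part of the original code.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical helper: splits str in place at any character found in
// delims and converts each slice with atoi() (no error checking, as in
// the original loops).
std::vector<int> parseIntsAnyDelim(char* str, const char* delims) {
    assert(str != nullptr && delims != nullptr);
    std::vector<int> vals;
    char* p = str; // slice start
    while (true) {
        char* pd = std::strpbrk(p, delims); // next delimiter, if any
        if (pd == nullptr) {
            vals.push_back(std::atoi(p)); // last slice
            break;
        }
        *pd = '\0'; // end the slice string
        vals.push_back(std::atoi(p));
        p = pd + 1;
    }
    return vals;
}
```

As with the strchr() loop, the input buffer is overwritten: every delimiter found is replaced by a terminating 0.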

If the delimiter is itself a multi-character string, strstr() is the way to go (especially if it's SSE-optimized, which it should be):

char* str=brec->tag_str("ZD"); //let's say the tag is "ZD"
GVec<int> vals;
char *p=str; //slice start
char *pd=NULL; //position of last delimiter found
const char* delim="^^";
size_t dlen=strlen(delim); //strlen() returns size_t
while (1) {
    pd=strstr(p, delim);
    if (pd!=NULL) {
        *pd=0; //end the slice string
        int v=atoi(p);
        vals.Add(v);
        p=pd+dlen; //skip the whole delimiter
    }
    else {
        int v=atoi(p); //last slice, no delimiter left
        vals.Add(v);
        break;
    }
}