@rofl0r
Created August 30, 2010 15:11
OOC: Changes to String
String is now a class that wraps a Buffer in an immutable way.
This means every method that would change the string, such as trim, runs on a clone of the internal buffer and returns a new String wrapping that clone.
The other difference between String and Buffer is that Buffer methods mostly return void, while String methods always return a new instance, so you can chain calls like this:
// s is inferred to be a String here
s := " /tmp/aaa/bbb " trim() replaceAll("/", "\\") reverse()
Here a new String is created from the literal; then each call creates a clone and applies the method to the clone's buffer.
That makes at least 4 memory allocations, each with a complete memcpy, which isn't very efficient,
especially if, e.g., the trim call turned out to be unnecessary.
The advantage of this style is the immutability of the string, which can eventually be exploited for parallel computing.
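To make the cost visible, here is a minimal sketch in plain C (not the actual ooc SDK code) of what an immutable trim has to do. The names Str, str_new, str_free and str_trim are hypothetical and only serve the illustration. The sketch copies just the surviving bytes instead of cloning first and trimming the clone, but the point is the same: every call allocates, even when there is nothing to trim.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an immutable string:
   data is never modified after creation. */
typedef struct {
    size_t size;  /* length without the terminating zero */
    char  *data;  /* always zero terminated */
} Str;

/* Allocates a new Str holding a copy of n bytes from src. */
static Str *str_new(const char *src, size_t n) {
    Str *s = malloc(sizeof *s);
    if (!s) return NULL;
    s->data = malloc(n + 1);
    if (!s->data) { free(s); return NULL; }
    memcpy(s->data, src, n);
    s->data[n] = '\0';
    s->size = n;
    return s;
}

static void str_free(Str *s) {
    if (s) { free(s->data); free(s); }
}

/* Returns a NEW Str without leading/trailing spaces.
   The receiver is left untouched, so a copy is made
   even if there was nothing to trim. */
static Str *str_trim(const Str *s) {
    size_t start = 0, end = s->size;
    while (start < end && s->data[start] == ' ') start++;
    while (end > start && s->data[end - 1] == ' ') end--;
    return str_new(s->data + start, end - start);
}
```

Chaining three such calls on a literal gives the four allocations counted above: one for the initial string and one per method.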
// this has the same effect
b := Buffer new()
b append(" /tmp/aaa/bbb ") // the string literal is inferred to be a String, which is appended to b
b trim() // trim is applied directly to the mutable buffer
b replaceAll("/", "\\") // also applied directly to the mutable buffer
b reverse() // reversed in place as well
s := b toString() // creates a new String using our buffer (no copy here)
This version produces the same result but is faster, since fewer memory copies are involved, and trim can, e.g., check whether there is any work to do and simply do nothing if there isn't.
However, appending the string literal is still costly.
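The in-place variant can be sketched the same way in plain C. Buf and buf_trim are hypothetical names, and the return value is an addition for the sake of the example (the Buffer methods described above mostly return void). It shows the early exit: if nothing needs trimming, no byte is moved and nothing is allocated.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a mutable buffer. */
typedef struct {
    size_t size;  /* length without the terminating zero */
    char  *data;  /* writable, zero terminated */
} Buf;

/* Trims spaces in place.
   Returns 1 if the buffer changed, 0 if there was nothing to do. */
static int buf_trim(Buf *b) {
    size_t start = 0, end = b->size;
    while (start < end && b->data[start] == ' ') start++;
    while (end > start && b->data[end - 1] == ' ') end--;
    if (start == 0 && end == b->size)
        return 0;                      /* no work: no copy, no allocation */
    memmove(b->data, b->data + start, end - start);
    b->size = end - start;
    b->data[b->size] = '\0';           /* keep zero termination */
    return 1;
}
```

A second buf_trim on the same buffer takes the early return, which is exactly the saving the immutable version cannot have.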
String and Buffer each have a size member, which speeds things up since strlen does not have to be called every time the length is needed.
I also introduced the CString type in Character.ooc, which is a primitive representing the plain char array pointer that C uses. It features some basic methods like length(), clone() and toString().
A CString can be converted to a String via mycstring toString()
A String can be "converted" to a CString via mystring toCString() (basically it only returns the pointer to the String's data)
Buffer internally takes care of zero termination. If you only use the supplied methods, it is guaranteed (through assertions) to always be zero terminated (it may contain zeroes itself though). This way, when a C function is called, you can safely pass the data pointer returned by toCString() without any further runtime cost.
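As a rough illustration of the size member and the zero termination guarantee, here is a small C sketch. Buf, buf_init and buf_to_cstring are hypothetical names, not the SDK's. The stored size stays correct even when the data contains a zero byte, while strlen on the same pointer stops at the first zero.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer that stores its length next to the data. */
typedef struct {
    size_t size;  /* number of payload bytes, may include zero bytes */
    char  *data;  /* size + 1 bytes, the last one always '\0' */
} Buf;

/* Copies n bytes from src and appends the terminating zero.
   Returns 0 on success, -1 if the allocation failed. */
static int buf_init(Buf *b, const char *src, size_t n) {
    b->data = malloc(n + 1);
    if (!b->data) return -1;
    memcpy(b->data, src, n);
    b->data[n] = '\0';
    b->size = n;
    return 0;
}

/* "Conversion" to a C string: just hands out the pointer, no copy. */
static const char *buf_to_cstring(const Buf *b) {
    return b->data;
}
```

Reading b.size is a plain field access, whereas strlen has to walk the bytes each time, and a C function receiving the pointer only sees the part up to the first zero.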
There's one catch though, regarding C variadic functions, since C doesn't know how to handle a String class.
printf("%s\n", mystring) will print nothing or garbage for that reason.
You must now use: printf("%s\n", mystring toCString())
Note that theoretically even printf("%s\n" toCString(), mystring toCString()) would be required, but for extern functions that are declared to accept a parameter of type CString or Char*, an implicit conversion is done via an "implicit as" operator.
The same holds for format, etc.
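Seen from the C side, the problem looks roughly like this. Str, str_to_cstring and format_line are hypothetical names for the sketch. A variadic function has no declared parameter type for its extra arguments, so no conversion can happen: whatever is passed for %s must already be a char pointer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical string object as C would see it: a struct, not a char*. */
typedef struct {
    size_t      size;
    const char *data;  /* zero terminated */
} Str;

/* Hands out the raw character pointer. */
static const char *str_to_cstring(const Str *s) {
    return s->data;
}

/* Formats the string followed by a newline into out.
   Passing `s` itself for %s would be undefined behaviour, because
   snprintf would interpret the struct's address as character data. */
static int format_line(char *out, size_t cap, const Str *s) {
    return snprintf(out, cap, "%s\n", str_to_cstring(s));
}
```

The first two parameters of snprintf are declared, so they are type checked; only the arguments after the format string are unchecked, which matches the case described above where the format argument gets converted implicitly but mystring does not.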
We're working on a better solution for varargs.
The author would very much like a concat: func(...) for Buffer that takes a variable number of strings and buffers and appends them with a single malloc, since the total size of all elements can be calculated in advance.
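The idea behind such a concat can be sketched in C with stdarg. This is only an illustration under assumptions: the function name concat, the explicit count parameter and the use of plain C strings are choices made for the sketch, not a proposed ooc signature. The arguments are walked twice, once to add up the sizes and once to copy.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Concatenates `count` zero-terminated strings into one freshly
   allocated buffer using a single malloc. The caller frees the result.
   Returns NULL if the allocation fails. */
static char *concat(size_t count, ...) {
    va_list ap;
    size_t total = 0;

    /* first pass: compute the total size in advance */
    va_start(ap, count);
    for (size_t i = 0; i < count; i++)
        total += strlen(va_arg(ap, const char *));
    va_end(ap);

    char *out = malloc(total + 1);
    if (!out) return NULL;

    /* second pass: copy every piece to its final position */
    char *p = out;
    va_start(ap, count);
    for (size_t i = 0; i < count; i++) {
        const char *s = va_arg(ap, const char *);
        size_t n = strlen(s);
        memcpy(p, s, n);
        p += n;
    }
    va_end(ap);
    *p = '\0';
    return out;
}
```

With String and Buffer arguments the strlen calls would not be needed, since each element already carries its size.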
Casting:
Some rock developers used to cast Char*'s to Strings and vice versa. That leads to a bug now, since it would overwrite the String's class struct with the character data.