@saxbophone
Last active August 3, 2023 03:33
Python StringView implementation. Supports slicing, iteration and creating sub-views of existing StringViews. No copying, only reference semantics.
class StringView:
    """
    StringView implementation using minimal copying with maximum use of
    reference semantics. Creating a sub-view of an existing StringView, either
    by slicing it or by constructing one from another, will reüse the same
    source string object, using a reference rather than a copy.
    The contents() method can similarly be used to get an iterator (Generator)
    to access the view contents sequentially without putting it all in memory
    at once.
    A brand new string object is only created if the StringView is cast to str.
    """
    def __init__(self, source: 'str | StringView', start=0, stop=None):
        # slices such as view[:3] pass start=None; treat that as 0
        # (negative start/stop values are not supported)
        start = 0 if start is None else start
        if isinstance(source, StringView):
            # share the parent's string and translate bounds into its coordinates
            self.__source = source.__source
            self.__start = source.__start + start
            # a missing stop means "up to the end of the parent view"
            self.__stop = source.__start + (min(stop, len(source)) if stop is not None else len(source))
        else:
            self.__source = source
            self.__start = start
            # clamp stop so that len() never exceeds the underlying string
            self.__stop = min(stop, len(source)) if stop is not None else len(source)
        assert self.__start <= self.__stop

    def __str__(self):
        return self.__source[self.__start:self.__stop]

    def __repr__(self):
        return 'StringView({}, {}, {})'.format(
            repr(self.__source),
            self.__start,
            self.__stop
        )

    # these next two methods are provided so we can produce StringViews from StringViews
    def __len__(self):
        return self.__stop - self.__start

    def __getitem__(self, key):
        if isinstance(key, slice):
            if key.step is not None:
                raise TypeError('StringView does not support step when slicing')
            return StringView(self, key.start, key.stop)
        # not only is there no point returning a StringView of length 1, it's also
        # slightly less memory-intensive to just return a str of length 1...
        if key < 0:
            key += len(self)  # negative indices count from the end of the view
        if not 0 <= key < len(self):
            raise IndexError('StringView index out of range')
        # an integer index is relative to the view, so offset it into the source
        return self.__source[self.__start + key]

    def contents(self):
        """
        Returns Generator for efficient no-copy iteration over string contents
        """
        return (self.__source[i] for i in range(self.__start, self.__stop))
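
The offset arithmetic behind sub-views can be easier to follow in isolation. The sketch below is a condensed illustration rather than the gist's code: `MiniView` and its public `source`, `start` and `stop` attributes are hypothetical names chosen for the example, and it assumes non-negative bounds. It shows that a view built from another view keeps a reference to the same string object, and that a new string is only built when `str()` is called.

```python
from itertools import islice


class MiniView:
    """Hypothetical, stripped-down stand-in for the StringView idea."""

    def __init__(self, source, start=0, stop=None):
        if isinstance(source, MiniView):
            # Sub-view: clamp to the parent's length, then shift both bounds
            # into the coordinates of the shared underlying string.
            stop = len(source) if stop is None else min(stop, len(source))
            source, start, stop = source.source, source.start + start, source.start + stop
        elif stop is None:
            stop = len(source)
        self.source, self.start, self.stop = source, start, stop

    def __len__(self):
        return self.stop - self.start

    def __str__(self):
        # The only place where a new string object is built.
        return self.source[self.start:self.stop]

    def contents(self):
        # Lazy, character-by-character access without building a substring.
        return (self.source[i] for i in range(self.start, self.stop))


text = "the quick brown fox"
outer = MiniView(text, 4, 15)   # views "quick brown"
inner = MiniView(outer, 6)      # views "brown"; 6 is relative to outer

shares_source = inner.source is text                # True: no intermediate copy
inner_text = str(inner)                             # "brown"
first_three = "".join(islice(inner.contents(), 3))  # "bro"
```

Because `inner` stores only a reference plus two integers, nesting views to any depth costs the same as a single view over the original string.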