What could possibly happen when typing a single key in the Chrome omnibox

# Lifetime of an omnibox key press

The stuff below is all made up. I have not worked on the omnibox in Chrome. But I wouldn't be surprised if this is close to reality.

  • Should we initiate an IME? Chrome has one built in.

    • I'm sure there are lots of allocations required for an IME, but let's skip those
  • Create a request to ask Google for things that match what we're typing

  • Create the HTTP request data to send. Let's assume it's UTF-8 JSON like {query:"k"}.

    • Note: Being UTF-8, we're going to have to write that query on the fly.
    • Since the omnibox has no limit on input size, we can't just use a fixed-size buffer to generate this string (see the sketch below).
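
    A minimal sketch of that in C (made-up names, not Chrome's actual code): the buffer is sized from the input at the moment of the keystroke, so it has to be allocated dynamically.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Build {"query":"..."} for arbitrary-length input. Worst case every
       byte needs a backslash escape, so size the buffer from the input. */
    char* BuildQueryJson(const char* input) {
      size_t cap = strlen(input) * 2 + sizeof("{\"query\":\"\"}");
      char* json = malloc(cap);          /* <-- an allocation per keystroke */
      if (!json) return NULL;
      char* p = json;
      p += sprintf(p, "{\"query\":\"");
      for (const char* s = input; *s; ++s) {
        if (*s == '"' || *s == '\\') *p++ = '\\';
        *p++ = *s;
      }
      sprintf(p, "\"}");
      return json;                       /* caller must free() */
    }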
  • Create something to manage sending and receiving the request. Let's call it an HTTPRequest. Because this is async we can't statically allocate it: we have no idea how many milliseconds or seconds will pass until we get a response, and the user may have typed 50 more characters before we get it back. We'll be sending a request and expecting a response for each key. In other words, if we're typing "amazon" we're going to first do a request for "a", then "am", then "ama", then "amaz", then "amazo", etc.
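
    A minimal sketch of what that object might look like (hypothetical names and signatures, not Chrome's actual types): it's heap-allocated because its lifetime is tied to the network, not to any stack frame.

    #include <stdlib.h>
    #include <string.h>

    typedef struct HTTPRequest HTTPRequest;
    typedef void (*ResponseFn)(HTTPRequest* req, const char* data, size_t len);

    struct HTTPRequest {
      char* url;            /* owned copy; freed when the request completes */
      ResponseFn on_done;   /* called on some later event-loop iteration */
      void* user_data;
    };

    HTTPRequest* StartRequest(const char* url, ResponseFn on_done, void* user_data) {
      HTTPRequest* req = malloc(sizeof *req);  /* can't be static: 50 of */
      req->url = strdup(url);                  /* these may be in flight */
      req->on_done = on_done;
      req->user_data = user_data;
      /* ... hand off to a network thread here ... */
      return req;
    }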

  • Each of those requests has no idea how big the response will be. It could be 1 byte; it could be 1 MB. We could reject large results, but the point is we don't know. So we're going to have to allocate buffers to receive the data, or give every request a buffer large enough to hold the largest result we support.
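
    A minimal growable-buffer sketch (again hypothetical, not Chrome's code): it grows as response bytes arrive, since the final size is unknown.

    #include <stdlib.h>
    #include <string.h>

    typedef struct {
      char*  data;
      size_t len, cap;
    } GrowBuf;

    void GrowBufAppend(GrowBuf* b, const char* bytes, size_t n) {
      if (b->len + n > b->cap) {
        size_t cap = b->cap ? b->cap : 4096;
        while (cap < b->len + n) cap *= 2;
        b->data = realloc(b->data, cap);   /* allocation driven by the network */
        b->cap = cap;
      }
      memcpy(b->data + b->len, bytes, n);
      b->len += n;
    }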

  • Once the data arrives we have to parse it. Assume it's something like

    { matches: [
       { desc: "amazon.com", url: "http://amazon.com", },
       { desc: "amazing.com", url: "http://amazing.com", },
       { desc: "the amazing randy", url: http://soandsosblog.com/amazing-randy.html", },
      ]
    }
    

    We have to parse that. With creative coding we could possibly parse it in place. But we've no idea how many matches there will be, so the list of results will probably have to be allocated.
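
    Even "parse in place" still needs a dynamically sized list, because the match count is unknown until the JSON has been consumed. A hypothetical sketch:

    #include <stdlib.h>

    typedef struct {
      const char* desc;   /* pointers into the response buffer (parsed in place) */
      const char* url;
    } Match;

    typedef struct {
      Match* items;
      size_t count, cap;
    } MatchList;

    void MatchListPush(MatchList* l, Match m) {
      if (l->count == l->cap) {
        l->cap = l->cap ? l->cap * 2 : 8;
        l->items = realloc(l->items, l->cap * sizeof(Match));  /* still allocates */
      }
      l->items[l->count++] = m;
    }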

  • We'll probably want to cache those requests so that if the user types "a" again, or "am" again, or just types "a", "[DEL]", "a", "[DEL]" over and over, we're not making a gazillion requests. So, hey, more allocations to put things in a cache.
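
    A hypothetical cache entry, reusing the MatchList type from the sketch above: both the key and the results get copied into heap-owned storage.

    #include <stdlib.h>
    #include <string.h>

    typedef struct CacheEntry {
      char* query;        /* strdup'd key: "a", "am", "ama", ... */
      MatchList results;  /* owned copy of the parsed matches */
      struct CacheEntry* next;
    } CacheEntry;

    static CacheEntry* g_cache;  /* simple linked list, just for illustration */

    void CachePut(const char* query, MatchList results) {
      CacheEntry* e = malloc(sizeof *e);  /* one more allocation per keystroke */
      e->query = strdup(query);
      e->results = results;
      e->next = g_cache;
      g_cache = e;
    }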

  • We'll also be querying our local database of history and bookmarks. That database is also accessed async because we don't want the omnibox to get janky if the database is large. So, filling out those requests: again allocated, because again we have no idea how long they will take until the answer arrives. They'll be sent to another thread or possibly another process, effectively repeating all the steps above.
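
    A hypothetical sketch of posting such a query to a database thread, reusing the MatchList type from above (PostToDatabaseThread is made up): the task must be heap-allocated because it outlives the stack frame that created it.

    #include <stdlib.h>
    #include <string.h>

    typedef struct {
      char* text;  /* owned copy of what the user has typed so far */
      void (*reply)(void* ctx, MatchList results);
      void* ctx;
    } HistoryQuery;

    void PostToDatabaseThread(HistoryQuery* q);  /* hypothetical message queue */

    void QueryHistoryAsync(const char* text,
                           void (*reply)(void* ctx, MatchList results),
                           void* ctx) {
      HistoryQuery* q = malloc(sizeof *q);
      q->text = strdup(text);   /* this stack frame is long gone before */
      q->reply = reply;         /* the DB thread answers */
      q->ctx = ctx;
      PostToDatabaseThread(q);
    }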

  • We've also got a list of "special" URLs, like how if you start typing "youtube.com" you'll see after only one character "Press [tab] to search Youtube Video Search". I'm only guessing, but that system is probably similar to our history DB. I mean, why write an entirely new system to query that? There are effectively infinite URLs, so we might as well repurpose the same system, since it's already async and will therefore gracefully handle it if there's ever a giant list of those special URLs. So, repeat the last few steps.

  • Let's assume we somehow manage to get both the history and the search results. We now need some way to merge those results into the list of things we want to display, maybe something like the sketch below.
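
    A hypothetical merge, again reusing the Match/MatchList types from above: the combined display list is yet another allocation.

    #include <stdlib.h>
    #include <string.h>

    MatchList MergeResults(const MatchList* server, const MatchList* history) {
      MatchList out = {0};
      out.cap = server->count + history->count;
      out.items = malloc(out.cap * sizeof(Match));
      /* A real merge would interleave by relevance score;
         simple concatenation is shown here. */
      memcpy(out.items, history->items, history->count * sizeof(Match));
      memcpy(out.items + history->count, server->items,
             server->count * sizeof(Match));
      out.count = out.cap;
      return out;
    }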

  • Chrome keeps tons of statistics which it (opt-in) reports back to the Chrome team (what are people doing with the omnibox?). I have a feeling there is allocation for those structures as well.

  • Next we need to put that data in some format that represents what we want to draw. Some characters will be bold, some in different colors, and depending on the type of URL there will be different icons. If those icons haven't been loaded yet we'll need to start threads to download and decode them, again async so we don't block user input.

  • Now we want to display them. We support unlimited fonts and full Unicode, so we can't have a pre-made texture of every glyph in every font. We have to have a glyph cache, which means the first time any letter is used it needs to be added to the cache in each size and style it will be drawn in. In other words, a bold letter will be in the cache separate from its non-bold self. It's even possible it will be cached 2 or 3 times: once pulled out of the font and processed into something more useful (as a point of reference, Unicode Arial is 22 MB), then again rasterized into a bitmap, and finally turned into a texture for GPU-accelerated rendering. I'm guessing there isn't one bitmap nor one texture per glyph, but even if there isn't, there have to be structures for knowing where in a bitmap/texture a particular glyph has been cached.
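
    A hypothetical sketch of the cache key and lookup (LookupGlyph, RasterizeIntoAtlas, and InsertGlyph are stand-ins for real machinery): the first use of any glyph in any size/style allocates an entry recording where it landed in the atlas texture.

    #include <stdint.h>
    #include <stdlib.h>

    typedef struct {
      uint32_t codepoint;
      uint16_t font_id, pixel_size;
      uint8_t  bold, italic;
    } GlyphKey;

    typedef struct {
      GlyphKey key;
      int tex_x, tex_y, width, height;  /* where it lives in the atlas texture */
    } GlyphEntry;

    /* Stand-ins for the real hash table and rasterizer: */
    const GlyphEntry* LookupGlyph(GlyphKey key);
    void RasterizeIntoAtlas(GlyphKey key, GlyphEntry* e);
    const GlyphEntry* InsertGlyph(GlyphEntry* e);

    const GlyphEntry* GetGlyph(GlyphKey key) {
      const GlyphEntry* hit = LookupGlyph(key);
      if (hit) return hit;
      GlyphEntry* e = malloc(sizeof *e);  /* first use: allocate, rasterize, upload */
      e->key = key;
      RasterizeIntoAtlas(key, e);         /* fills in tex_x/tex_y/width/height */
      return InsertGlyph(e);              /* adds it to the cache's hash table */
    }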

    It should be pointed out that some people would think their IMGUI would do this better: load a texture of ASCII glyphs and use a custom allocator to insert quads that point to glyphs in that texture. There are several problems with that approach. One, it can't support all of Unicode and fit in memory. Two, a pre-rendered ASCII texture isn't correct for the platform: Windows should look different than OSX, Linux, and Android. Three, it doesn't handle user prefs; some OSes let the user pick the font, or if not the font then at least the size, plus HD-DPI vs. non. Four, it doesn't handle font styles; you'd need to load more textures for bold, italic, etc. Five, it likely doesn't handle the OS's accessibility APIs for people with disabilities.

    Here's a screenshot of the omnibox results. I count at least 60 different transitions/areas: icon vs. text, bold vs. normal, green, black, gray.

    [screenshot: omnibox results]

    Here's another

    [screenshot: more omnibox results]

    And right to left input

    [screenshot: right-to-left input]

    Using the native text reading code means interfacing with the native APIs, which often require allocations.

  • Chrome takes security seriously

    For example, that means JavaScript is always run in a sandboxed process. You can type JavaScript into the omnibox: type javascript:alert("test"). That JavaScript has to be run in a sandboxed process; otherwise any bug in JavaScript, or in any other API accessible from JavaScript, would let it own your machine.

    Similarly, GPU access has to be from one sandboxed process. That process has no other privileges (it can neither read nor write files), so in the off chance Chrome's GPU command validation still lets someone reach a GPU driver bug, they can't access things they aren't supposed to.

    That means lots of cross-process communication happening to make sure things stay secure. That often means allocation, since data has to be transferred across processes; see the sketch below.
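
    A hypothetical sketch of building such a message (not Chrome's actual IPC code): the payload has to be copied into a freshly allocated buffer because the other process can't see our memory.

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct {
      uint32_t type;
      uint32_t payload_len;
      char     payload[];  /* flexible array member; data follows the header */
    } IPCMessage;

    IPCMessage* MakeMessage(uint32_t type, const void* data, uint32_t len) {
      IPCMessage* m = malloc(sizeof *m + len);
      m->type = type;
      m->payload_len = len;
      memcpy(m->payload, data, len);
      return m;  /* handed to the pipe/socket layer, freed after sending */
    }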

PS: I consider custom allocation to still be allocation. In other words, if I do this

struct Entry {
  const char* key;
  const char* value;
};
struct Dict {
  int maxEntries;
  int numEntries;
  Entry firstEntry[1];  // entries are stored in place after the header
};

Dict* InPlaceStringDict(int maxEntries, void* memToManageDictIn) {
   Dict* d = (Dict*)memToManageDictIn;
   d->maxEntries = maxEntries;
   d->numEntries = 0;
   return d;
}

bool AddToDict(Dict* dict, const char* key, const char* value) {
   if (dict->numEntries == dict->maxEntries) {
     return false;
   }
   Entry* e = &dict->firstEntry[dict->numEntries++];
   e->key = key;
   e->value = value;
   return true;
}

I can then write code like this

Dict* dict = InPlaceStringDict(50, someMemWithSpacefor50EntriesPlusDictMaybePreallocatedOrMalloced);
while(stillParsing) {
  if (foundKeyAndValue) {
    if (!AddToDict(dict, key, value)) {
      // oops, filled the dictionary
      break;
    }
  }
}

AddToDict is still allocation. I've just written a custom allocator. I haven't gotten rid of allocation.

As another example, here's a memory allocator I've used in games

#include <assert.h>
#include <stdlib.h>

#define MAX_MEM_STACK 16

static char* allMem;
static char* highMem;
static char* lowMem;
static char* memStack[MAX_MEM_STACK];
static int memStackPtr = 0;

void InitMemSystem(void) {
   allMem = malloc(512 * 1024);
   lowMem = allMem;
   highMem = allMem + 512 * 1024;
}

void SaveMemState() {
  memStack[memStackPtr++] = lowMem;
}

void* Alloc(int size) {
  assert(highMem - lowMem >= size);
  void* p = lowMem;
  lowMem += size;
  return p;
}

void Free(void* p) {
  // NO-OP: memory is reclaimed in bulk by PopMemState
  (void)p;
}

void PopMemState() {
  lowMem = memStack[--memStackPtr];
}

At the start of a level I call SaveMemState. I never de-allocate anything. At the end of a level I call PopMemState. Allocation is now super efficient. IT'S STILL AN ALLOCATOR.
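
For example, usage might look like this (Monster is a made-up type, just to show the pattern):

typedef struct { int hp, x, y; } Monster;  // made-up type for illustration

void RunLevel(void) {
  SaveMemState();                                    // mark the current top
  Monster* monsters = Alloc(100 * sizeof(Monster));  // bump the pointer
  // ... play the level; any Free() calls are no-ops ...
  PopMemState();                                     // everything "freed" at once
}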

I'd also like to point out that Chrome is a large project with hundreds of programmers. What I mean by that is I need to write code that's hard for the next programmer to use incorrectly. If I'm manually managing memory so as to avoid allocations, I'd have to be aware that the next programmer won't have my mental image of how memory is organized, what things last what lifetimes, etc. I'm not saying there's no way to do it. I am saying that direct pointer manipulation into chunks of memory is not likely a good practice for a team with hundreds of programmers.
