What could possibly happen when typing a single key in the Chrome omnibox

# Lifetime of an omnibox key press

The stuff below is all made up. I have not worked on the omnibox in Chrome. But I wouldn't be surprised if this is close to reality.

  • Should we initiate an IME? Chrome has one built in.

    • I'm sure there are lots of allocations required for an IME, but let's skip those
  • Create a request to ask Google for things that match what we're typing

  • Create the HTTP request data to send. Let's assume it's UTF-8 JSON like {query:"k"}.

    • Note: Being UTF-8, we're going to have to write that query on the fly.
    • Since the omnibox has no limit on input size, we can't just use a fixed-size buffer to generate this string (see the sketch below).
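
    A minimal sketch of that in C (made-up names, not Chrome's actual code): the buffer is sized from the input at the moment of the keystroke, so it has to be allocated dynamically.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Build {"query":"..."} for arbitrary-length input. Worst case every
       byte needs a backslash escape, so size the buffer from the input. */
    char* BuildQueryJson(const char* input) {
      size_t cap = strlen(input) * 2 + sizeof("{\"query\":\"\"}");
      char* json = malloc(cap);          /* <-- an allocation per keystroke */
      if (!json) return NULL;
      char* p = json;
      p += sprintf(p, "{\"query\":\"");
      for (const char* s = input; *s; ++s) {
        if (*s == '"' || *s == '\\') *p++ = '\\';
        *p++ = *s;
      }
      sprintf(p, "\"}");
      return json;                       /* caller must free() */
    }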
  • Create something to manage sending and receiving the request. Let's call it an HTTPRequest. Because this is async we can't statically allocate it: we have no idea how many milliseconds or seconds will pass until we get a response, and the user may have typed 50 more characters before we get it back. We'll be sending a request and expecting a response for each key. In other words, if we're typing "amazon" we're going to first do a request for "a", then "am", then "ama", then "amaz", then "amazo", etc.
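
    A minimal sketch of what that object might look like (hypothetical names and signatures, not Chrome's actual types): it's heap-allocated because its lifetime is tied to the network, not to any stack frame.

    #include <stdlib.h>
    #include <string.h>

    typedef struct HTTPRequest HTTPRequest;
    typedef void (*ResponseFn)(HTTPRequest* req, const char* data, size_t len);

    struct HTTPRequest {
      char* url;            /* owned copy; freed when the request completes */
      ResponseFn on_done;   /* called on some later event-loop iteration */
      void* user_data;
    };

    HTTPRequest* StartRequest(const char* url, ResponseFn on_done, void* user_data) {
      HTTPRequest* req = malloc(sizeof *req);  /* can't be static: 50 of */
      req->url = strdup(url);                  /* these may be in flight */
      req->on_done = on_done;
      req->user_data = user_data;
      /* ... hand off to a network thread here ... */
      return req;
    }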

  • Each of those requests has no idea how big the response will be. It could be 1 byte; it could be 1 MB. We could reject large results, but the point is we don't know. So we're going to have to allocate buffers to receive the data, or give every request a buffer large enough to hold the largest result we support.
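
    A minimal growable-buffer sketch (again hypothetical, not Chrome's code): it grows as response bytes arrive, since the final size is unknown.

    #include <stdlib.h>
    #include <string.h>

    typedef struct {
      char*  data;
      size_t len, cap;
    } GrowBuf;

    void GrowBufAppend(GrowBuf* b, const char* bytes, size_t n) {
      if (b->len + n > b->cap) {
        size_t cap = b->cap ? b->cap : 4096;
        while (cap < b->len + n) cap *= 2;
        b->data = realloc(b->data, cap);   /* allocation driven by the network */
        b->cap = cap;
      }
      memcpy(b->data + b->len, bytes, n);
      b->len += n;
    }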

  • Once the data arrives we have to parse it. Assume it's something like

    { matches: [
       { desc: "amazon.com", url: "http://amazon.com", },
       { desc: "amazing.com", url: "http://amazing.com", },
       { desc: "the amazing randy", url: http://soandsosblog.com/amazing-randy.html", },
      ]
    }
    

    We have to parse that. With creative coding we could possibly parse it in place. But we've no idea how many matches there will be, so the list of results will probably have to be allocated.
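
    Even "parse in place" still needs a dynamically sized list, because the match count is unknown until the JSON has been consumed. A hypothetical sketch:

    #include <stdlib.h>

    typedef struct {
      const char* desc;   /* pointers into the response buffer (parsed in place) */
      const char* url;
    } Match;

    typedef struct {
      Match* items;
      size_t count, cap;
    } MatchList;

    void MatchListPush(MatchList* l, Match m) {
      if (l->count == l->cap) {
        l->cap = l->cap ? l->cap * 2 : 8;
        l->items = realloc(l->items, l->cap * sizeof(Match));  /* still allocates */
      }
      l->items[l->count++] = m;
    }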

  • We'll probably want to cache those requests so that if the user types "a" again, or "am" again, or just types "a", "[DEL]", "a", "[DEL]" over and over, we're not making a gazillion requests. So, hey, more allocations to put things in a cache.
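
    A hypothetical cache entry, reusing the MatchList type from the sketch above: both the key and the results get copied into heap-owned storage.

    #include <stdlib.h>
    #include <string.h>

    typedef struct CacheEntry {
      char* query;        /* strdup'd key: "a", "am", "ama", ... */
      MatchList results;  /* owned copy of the parsed matches */
      struct CacheEntry* next;
    } CacheEntry;

    static CacheEntry* g_cache;  /* simple linked list, just for illustration */

    void CachePut(const char* query, MatchList results) {
      CacheEntry* e = malloc(sizeof *e);  /* one more allocation per keystroke */
      e->query = strdup(query);
      e->results = results;
      e->next = g_cache;
      g_cache = e;
    }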

  • We'll also be querying our local database of history and bookmarks. That database is also accessed async because we don't want the omnibox to get janky if the database is large. So, filling out those requests: again allocated, because again we have no idea how long they will take until the answer arrives. They'll be sent to another thread or possibly another process, effectively repeating all the steps above.
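
    A hypothetical sketch of posting such a query to a database thread, reusing the MatchList type from above (PostToDatabaseThread is made up): the task must be heap-allocated because it outlives the stack frame that created it.

    #include <stdlib.h>
    #include <string.h>

    typedef struct {
      char* text;  /* owned copy of what the user has typed so far */
      void (*reply)(void* ctx, MatchList results);
      void* ctx;
    } HistoryQuery;

    void PostToDatabaseThread(HistoryQuery* q);  /* hypothetical message queue */

    void QueryHistoryAsync(const char* text,
                           void (*reply)(void* ctx, MatchList results),
                           void* ctx) {
      HistoryQuery* q = malloc(sizeof *q);
      q->text = strdup(text);   /* this stack frame is long gone before */
      q->reply = reply;         /* the DB thread answers */
      q->ctx = ctx;
      PostToDatabaseThread(q);
    }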

  • We've also got a list of "special" URLs, like how if you start typing "youtube.com" you'll see after only one character "Press [tab] to search Youtube Video Search". I'm only guessing, but that system is probably similar to our history DB. I mean, why write an entirely new system to query that? There are effectively infinite URLs, so we might as well repurpose the same system, since it's already async and will therefore gracefully handle it if there's ever a giant list of those special URLs. So, repeat the last few steps.

  • Let's assume we somehow manage to get both the history and the search results. We now need some way to merge those results into the list of things we want to display, maybe something like the sketch below.
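
    A hypothetical merge, again reusing the Match/MatchList types from above: the combined display list is yet another allocation.

    #include <stdlib.h>
    #include <string.h>

    MatchList MergeResults(const MatchList* server, const MatchList* history) {
      MatchList out = {0};
      out.cap = server->count + history->count;
      out.items = malloc(out.cap * sizeof(Match));
      /* A real merge would interleave by relevance score;
         simple concatenation is shown here. */
      memcpy(out.items, history->items, history->count * sizeof(Match));
      memcpy(out.items + history->count, server->items,
             server->count * sizeof(Match));
      out.count = out.cap;
      return out;
    }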

  • Chrome keeps tons of statistics which it (opt-in) reports back to the Chrome team (what are people doing with the omnibox?). I have a feeling there is allocation for those structures as well.

  • Next we need to put that data in some format that represents what we want to draw. Some characters will be bold, some in different colors, and depending on the type of URL there will be different icons. If those icons haven't been loaded yet we'll need to start threads to download and decode them, again async so we don't block user input.

  • Now we want to display them. We support unlimited fonts and full Unicode, so we can't have a pre-made texture of every glyph in every font. We have to have a glyph cache, which means the first time any letter is used it needs to be added to the cache in each size and style it will be drawn in. In other words, a bold letter will be in the cache separate from its non-bold self. It's even possible it will be cached 2 or 3 times: once pulled out of the font and processed into something more useful (as a point of reference, Unicode Arial is 22 MB), then again rasterized into a bitmap, and finally turned into a texture for GPU-accelerated rendering. I'm guessing there isn't one bitmap nor one texture per glyph, but even if there isn't, there have to be structures for knowing where in a bitmap/texture a particular glyph has been cached.
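
    A hypothetical sketch of the cache key and lookup (LookupGlyph, RasterizeIntoAtlas, and InsertGlyph are stand-ins for real machinery): the first use of any glyph in any size/style allocates an entry recording where it landed in the atlas texture.

    #include <stdint.h>
    #include <stdlib.h>

    typedef struct {
      uint32_t codepoint;
      uint16_t font_id, pixel_size;
      uint8_t  bold, italic;
    } GlyphKey;

    typedef struct {
      GlyphKey key;
      int tex_x, tex_y, width, height;  /* where it lives in the atlas texture */
    } GlyphEntry;

    /* Stand-ins for the real hash table and rasterizer: */
    const GlyphEntry* LookupGlyph(GlyphKey key);
    void RasterizeIntoAtlas(GlyphKey key, GlyphEntry* e);
    const GlyphEntry* InsertGlyph(GlyphEntry* e);

    const GlyphEntry* GetGlyph(GlyphKey key) {
      const GlyphEntry* hit = LookupGlyph(key);
      if (hit) return hit;
      GlyphEntry* e = malloc(sizeof *e);  /* first use: allocate, rasterize, upload */
      e->key = key;
      RasterizeIntoAtlas(key, e);         /* fills in tex_x/tex_y/width/height */
      return InsertGlyph(e);              /* adds it to the cache's hash table */
    }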

    It should be pointed out that some people would think their IMGUI would do this better: load a texture of ASCII glyphs and use a custom allocator to insert quads that point to glyphs in that texture. There are several problems with that approach. One, it can't support all of Unicode and fit in memory. Two, a pre-rendered ASCII texture isn't correct for the platform: Windows should look different than OSX, Linux, and Android. Three, it doesn't handle user prefs; some OSes let the user pick the font, or if not the font then at least the size, plus HD-DPI vs. non. Four, it doesn't handle font styles; you'd need to load more textures for bold, italic, etc. Five, it likely doesn't handle the OS's accessibility APIs for people with disabilities.

    Here's a screenshot of the omnibox results. I count at least 60 different transitions/areas: icon vs. text, bold vs. normal, green, black, gray.

    [screenshot: omnibox results]

    Here's another

    [screenshot: more omnibox results]

    And right to left input

    [screenshot: right-to-left input]

    Using the native text reading code means interfacing with the native APIs, which often require allocations.

  • Chrome takes security seriously

    For example, that means JavaScript is always run in a sandboxed process. You can type JavaScript into the omnibox: type javascript:alert("test"). That JavaScript has to be run in a sandboxed process; otherwise any bug in JavaScript, or in any other API accessible from JavaScript, would let it own your machine.

    Similarly, GPU access has to be from one sandboxed process. That process has no other privileges (it can neither read nor write files), so in the off chance Chrome's GPU command validation still lets someone reach a GPU driver bug, they can't access things they aren't supposed to.

    That means lots of cross-process communication happening to make sure things stay secure. That often means allocation, since data has to be transferred across processes; see the sketch below.
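
    A hypothetical sketch of building such a message (not Chrome's actual IPC code): the payload has to be copied into a freshly allocated buffer because the other process can't see our memory.

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct {
      uint32_t type;
      uint32_t payload_len;
      char     payload[];  /* flexible array member; data follows the header */
    } IPCMessage;

    IPCMessage* MakeMessage(uint32_t type, const void* data, uint32_t len) {
      IPCMessage* m = malloc(sizeof *m + len);
      m->type = type;
      m->payload_len = len;
      memcpy(m->payload, data, len);
      return m;  /* handed to the pipe/socket layer, freed after sending */
    }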

PS: I consider custom allocation to still be allocation. In other words, if I do this

struct Entry {
  const char* key;
  const char* value;
};
struct Dict {
  int maxEntries;
  int numEntries;
  Entry firstEntry[1];  // entries are stored in place after the header
};

Dict* InPlaceStringDict(int maxEntries, void* memToManageDictIn) {
   Dict* d = (Dict*)memToManageDictIn;
   d->maxEntries = maxEntries;
   d->numEntries = 0;
   return d;
}

bool AddToDict(Dict* dict, const char* key, const char* value) {
   if (dict->numEntries == dict->maxEntries) {
     return false;
   }
   Entry* e = &dict->firstEntry[dict->numEntries++];
   e->key = key;
   e->value = value;
   return true;
}

I can then write code like this

Dict* dict = InPlaceStringDict(50, someMemWithSpacefor50EntriesPlusDictMaybePreallocatedOrMalloced);
while(stillParsing) {
  if (foundKeyAndValue) {
    if (!AddToDict(dict, key, value)) {
      // oops, filled the dictionary
      break;
    }
  }
}

AddToDict is still allocation. I've just written a custom allocator. I haven't gotten rid of allocation.

As another example, here's a memory allocator I've used in games

#include <assert.h>
#include <stdlib.h>

#define MAX_MEM_STACK 16

static char* allMem;
static char* highMem;
static char* lowMem;
static char* memStack[MAX_MEM_STACK];
static int memStackPtr = 0;

void InitMemSystem(void) {
   allMem = malloc(512 * 1024);
   lowMem = allMem;
   highMem = allMem + 512 * 1024;
}

void SaveMemState() {
  memStack[memStackPtr++] = lowMem;
}

void* Alloc(int size) {
  assert(highMem - lowMem >= size);
  void* p = lowMem;
  lowMem += size;
  return p;
}

void Free(void* p) {
  // NO-OP: memory is reclaimed in bulk by PopMemState
  (void)p;
}

void PopMemState() {
  lowMem = memStack[--memStackPtr];
}

At the start of a level I call SaveMemState. I never de-allocate anything. At the end of a level I call PopMemState. Allocation is now super efficient. IT'S STILL AN ALLOCATOR.
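
For example, usage might look like this (Monster is a made-up type, just to show the pattern):

typedef struct { int hp, x, y; } Monster;  // made-up type for illustration

void RunLevel(void) {
  SaveMemState();                                    // mark the current top
  Monster* monsters = Alloc(100 * sizeof(Monster));  // bump the pointer
  // ... play the level; any Free() calls are no-ops ...
  PopMemState();                                     // everything "freed" at once
}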

I'd also like to point out that Chrome is a large project with hundreds of programmers. What I mean by that is I need to write code that's hard for the next programmer to use incorrectly. If I'm manually managing memory so as to avoid allocations, I'd have to be aware that the next programmer won't have my mental image of how memory is organized, what things last what lifetimes, etc. I'm not saying there's no way to do it. I am saying that direct pointer manipulation into chunks of memory is not likely a good practice for a team with hundreds of programmers.
