@netroy
Last active October 28, 2015 01:58
A better IndexedDB IMHO

--------- Work In progress ---------

##Context

Multiple discussion threads spun out after this tweet mentioning my talk on Full-Text Search in IndexedDB using Inverted Indices at BerlinJS.

Only the beginning of my talk was about my discontent with the current IDB API and how it is a much needed technology. Unfortunately it's not terrific as a spec and has rather messy implementations across browsers. The API is way too low-level and needs a decent amount of abstraction for everyday web-developers to work with it.

Sharing my slides isn't going to be of much help, and neither is discussing them in 140 characters. Hence, I'm going to use this gist to propose changes to the IDB API spec and the implementations.

##Issues

Transactions

I appreciate the idea of transactions in IDB. They let the system use locks if the transaction is a "Write", but let the system perform multiple "Read" operations when there is no lock, ensuring consistency in the storage.

But in most cases IDB transactions are single-operation and auto-commit once the last callback has fired. So why can't a single-action operation decide the transaction type itself and hide it from the developer who doesn't explicitly need transactions?

So,

// DB - the handle to the database
// storeName - the name of the objectStore
var writeTransaction = DB.transaction([storeName], IDBTransaction.READ_WRITE);
var store = writeTransaction.objectStore(storeName);
var writeRequest = store.put({
  'foo': 'bar'
});
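writeRequest.onsuccess = function (e) {
  // e.target.result is the key of the stored record
  callback(null, e.target.result);
};
writeRequest.onerror = function (e) {
  // pass the underlying error rather than the event object
  callback(e.target.error);
};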

could be written in a much simpler way as,

DB.objectStores[storeName].put({
  'foo': 'bar'
}, callback);

This also saves us from the IDBTransaction.READ_WRITE vs 1 vs readwrite differences across implementations. (You'd know this if you've written any app that runs across a few browsers.)
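A minimal sketch of how such a shorthand could be layered on top of the existing API. The helper name `putOne` is hypothetical and not part of any spec, `db` is assumed to be an open database handle, and the implementation is assumed to accept the string mode `'readwrite'`.

```javascript
// Hypothetical helper, not part of the IDB spec.
// db        - an open IDBDatabase handle
// storeName - name of an existing objectStore
// callback  - node-style function (err, key)
function putOne(db, storeName, value, callback) {
  var request;
  try {
    // A single-operation write: the transaction type is implied by the call.
    request = db.transaction([storeName], 'readwrite')
      .objectStore(storeName)
      .put(value);
  } catch (err) {
    // transaction() and put() can throw synchronously (e.g. unknown store).
    return callback(err);
  }
  request.onsuccess = function (e) {
    callback(null, e.target.result); // the key of the stored record
  };
  request.onerror = function (e) {
    callback(e.target.error);
  };
}
```

Usage would then look like `putOne(DB, storeName, { foo: 'bar' }, callback);`, which is close to the shape proposed above.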

Versioning

I like the idea of having the databases versioned. It lets you create migration paths whenever the developer decides to change the DB structure.

But I also think that for a large number of use-cases there should be a way of storing data without having to deal with versions at all. The most common use case would be an asynchronous localStorage replacement (it's about time we had one). Someone looking for a simple key-value store shouldn't need to resort to localStorage + JSON.stringify, two synchronous APIs that are enough to kill any app's performance.
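To make the idea concrete, here is a rough sketch of an asynchronous key-value wrapper that hides versioning by pinning the database to version 1. The names `openKeyValueStore`, `setItem`, `getItem` and the store name `'kv'` are all made up for illustration; `factory` stands for `window.indexedDB`, and string transaction modes are assumed.

```javascript
// Hypothetical helper: a tiny async key-value store on top of IndexedDB.
// factory  - an IDBFactory (window.indexedDB in a browser)
// callback - node-style function (err, store)
function openKeyValueStore(factory, dbName, callback) {
  var request = factory.open(dbName, 1); // version pinned to 1, never migrated
  request.onupgradeneeded = function (e) {
    // Runs only the first time; creates one store with out-of-line keys.
    e.target.result.createObjectStore('kv');
  };
  request.onerror = function (e) {
    callback(e.target.error);
  };
  request.onsuccess = function (e) {
    var db = e.target.result;
    callback(null, {
      setItem: function (key, value, done) {
        var req = db.transaction(['kv'], 'readwrite').objectStore('kv').put(value, key);
        req.onsuccess = function () { done(null); };
        req.onerror = function (ev) { done(ev.target.error); };
      },
      getItem: function (key, done) {
        var req = db.transaction(['kv'], 'readonly').objectStore('kv').get(key);
        // A missing key yields undefined rather than an error.
        req.onsuccess = function (ev) { done(null, ev.target.result); };
        req.onerror = function (ev) { done(ev.target.error); };
      }
    });
  };
}
```

Unlike localStorage, values go through structured cloning, so no JSON.stringify round-trip is needed.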

Querying

Why can't we just have IDBKeyRange.only and IDBKeyRange.bound as the only key-range functions, and pass undefined for open-ended queries like <, ≤, ≥ or >? This way, the order of the arguments also defines the sort direction, so we can just open the cursor on the bounds and iterate through without IDBCursor.PREV or IDBCursor.NEXT. The second argument to index.openCursor, if true, can then be used for iterating over unique keys only.
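As an illustration only, the mapping from "two bounds, either of which may be undefined" onto today's key-range functions could look like the sketch below. `rangeFor` is a hypothetical name; `KeyRange` stands for `IDBKeyRange`, and the bounds are assumed to be primitives of the same type (numbers or strings) so that `>` and `===` compare them meaningfully.

```javascript
// Hypothetical helper sketching the proposal; not part of the IDB spec.
// KeyRange - IDBKeyRange in a browser (passed in so the sketch uses no globals)
// from, to - bounds; undefined means "open-ended on this side"
function rangeFor(KeyRange, from, to) {
  var direction = 'next';
  var range;
  if (from === undefined && to === undefined) {
    range = null;                      // no bounds: iterate the whole index
  } else if (from === undefined) {
    range = KeyRange.upperBound(to);   // key <= to
  } else if (to === undefined) {
    range = KeyRange.lowerBound(from); // key >= from
  } else if (from === to) {
    range = KeyRange.only(from);
  } else if (from > to) {
    // Reversed arguments request a descending scan over the same interval.
    range = KeyRange.bound(to, from);
    direction = 'prev';
  } else {
    range = KeyRange.bound(from, to);
  }
  return { range: range, direction: direction };
}
```

A caller would pass the result straight through, e.g. `var q = rangeFor(IDBKeyRange, 50, 10); index.openCursor(q.range, q.direction);`. With only one bound given the direction cannot be inferred, so this sketch falls back to ascending order.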

It'd be great to also have utility functions that do the iterations for a bound, limit and offset.

So,

// DB - the handle to the database
// storeName - the name of the objectStore
// indexKey - name of the index
// lowerBound & upperBound - ranges
// limit & offset - for pagination
var queryTransaction = DB.transaction([storeName], IDBTransaction.READ_ONLY);
var store = queryTransaction.objectStore(storeName);
var index = store.index(indexKey);
var lower, upper, direction, results = [], count = 0;
if(lowerBound > upperBound) {
  lower = upperBound;
  upper = lowerBound;
  direction = IDBCursor.PREV;
} else {
  lower = lowerBound;
  upper = upperBound;
  direction = IDBCursor.NEXT;
}
var bounds = IDBKeyRange.bound(lower, upper, true, true);
var queryCursor = index.openCursor(bounds, direction);
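queryCursor.onsuccess = function (e) {
  var cursor = e.target.result;
  if (!cursor) {
    // no more records in the range
    return callback(null, results);
  }
  if (count >= offset) {
    results.push(cursor.value);
    if (results.length === limit) {
      // page is full: stop iterating instead of jumping past the upper
      // bound, which fails for non-numeric keys and for IDBCursor.PREV
      return callback(null, results);
    }
  }
  count++;
  cursor.continue();
};
queryCursor.onerror = function (e) {
  callback(e.target.error);
};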

could be simplified as

var bounds = IDBKeyRange.bound(lower, upper, true, true);
var cursor = DB.objectStores[storeName].index(indexKey).openCursor(bounds, false);
IDBUtils.readCursor(limit, offset, callback);
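One possible shape for such a utility is sketched below. It is not an existing API: the signature differs from the proposal above in that the cursor request is passed in explicitly, and it uses `cursor.advance()` to skip the offset in a single step. `limit` is assumed to be at least 1.

```javascript
// Hypothetical utility, not part of the IDB spec.
// request  - the IDBRequest returned by openCursor()
// limit    - maximum number of records to collect (assumed >= 1)
// offset   - number of records to skip first
// callback - node-style function (err, results)
function readCursor(request, limit, offset, callback) {
  var results = [];
  var toSkip = offset > 0 ? offset : 0;
  request.onsuccess = function (e) {
    var cursor = e.target.result;
    if (!cursor) {
      return callback(null, results); // ran out of records
    }
    if (toSkip > 0) {
      // Skip the offset in one jump instead of visiting each record.
      var n = toSkip;
      toSkip = 0;
      return cursor.advance(n);
    }
    results.push(cursor.value);
    if (results.length >= limit) {
      return callback(null, results); // page is full, stop iterating
    }
    cursor.continue();
  };
  request.onerror = function (e) {
    callback(e.target.error);
  };
}
```

With this shape the call becomes `readCursor(index.openCursor(bounds), limit, offset, callback);`.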

Memory Leaks

I know it's not a spec's problem if the implementations are bad at managing memory. But the spec can propose that all open handles to IDB databases be closed on context/window unload.

Unclosed DB handles on a page reload account for the majority of the memory leak issues I've seen. A good developer would learn it the hard way and close the DB handles. Unfortunately we can't assume that all IDB consumers are good developers.
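Until implementations do this themselves, an application can approximate it. The sketch below is one way to do so under stated assumptions: `closeOnUnload` is a hypothetical helper name, `target` stands for `window`, and the `pagehide` event is used as the unload signal.

```javascript
// Hypothetical helper, not part of the IDB spec.
// target - the window (anything with addEventListener)
// db     - an open IDBDatabase handle
function closeOnUnload(target, db) {
  function close() {
    db.close();
  }
  // Release the handle when the page is being torn down or reloaded.
  target.addEventListener('pagehide', close);
  // Also release it when another context wants to upgrade or delete the
  // database, so this connection does not block that request.
  db.onversionchange = close;
  return db;
}
```

Calling `closeOnUnload(window, db)` right after a successful open keeps the bookkeeping in one place.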

Sync API

Again, not a problem with the spec. But if the browsers implement this soon, developers can move all their non-DOM code to a worker, keep the code simpler and unblock the UI. I'm fine with the Async API, but it'd be really good to have both options.

Error codes

Some IDB error codes are helpful, some aren't. Some are documented in MDN's Obsolete IDBDatabaseException page, some aren't. And some remind me of IE6's 'undefined' is null or not an object at line 0.

For something as important as IDB, we definitely need better error logging. For most JS errors, the error message in the console is helpful enough. I'd personally like to see equally verbose error messages for IDB.
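As a stopgap, an app can translate error names into something more readable itself. The sketch below is illustrative only: `describeIdbError` and the hint table are made up, the hints are paraphrases rather than official wording, and it assumes the implementation reports errors as objects with a `name` (and optionally a `message`).

```javascript
// Hypothetical lookup table; the hints are paraphrased, not spec text.
var IDB_ERROR_HINTS = {
  ConstraintError: 'a key or unique-index constraint was violated',
  DataError: 'the supplied key or value is not valid for this operation',
  QuotaExceededError: 'the storage quota for this origin is exhausted',
  ReadOnlyError: 'a write was attempted inside a read-only transaction',
  TransactionInactiveError: 'the transaction had already finished or was not active',
  VersionError: 'the database was opened with a version lower than the stored one'
};

// error   - the object found on request.error or thrown by an IDB call
// context - short description of what the app was doing
function describeIdbError(error, context) {
  var name = (error && error.name) || 'UnknownError';
  var known = Object.prototype.hasOwnProperty.call(IDB_ERROR_HINTS, name);
  var hint = known ? IDB_ERROR_HINTS[name] : 'no additional hint available';
  var detail = error && error.message ? ' (' + error.message + ')' : '';
  return name + ' while ' + context + ': ' + hint + detail;
}
```

A request handler could then log `describeIdbError(e.target.error, 'saving a draft')` instead of a bare code.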

Full Text Search

While creating indexes, if a flag like "text"=true is passed in optionalParameters to the createIndex method of the objectStore, the browser should create an Inverted Index, or a Suffix Tree, or even something as simple as a Trie structure for doing a prefix search.

For more advanced use-cases, if these tokenizing/indexing/ranking functions are overridable, then we can build phonetic search or auto-correct into the apps.
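To show what an inverted index with an overridable tokenizer amounts to, here is a small in-memory sketch. It does not touch IndexedDB at all; all function names are hypothetical, and records are identified simply by their position in the input array. Inside IndexedDB the token list could, for example, be stored as an array property and indexed with `multiEntry: true`.

```javascript
// Default tokenizer: lowercase, split on anything that is not a-z or 0-9.
function defaultTokenizer(text) {
  return String(text).toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Build a map of token -> list of record ids (array positions).
// tokenizer is optional, so apps can plug in stemming, phonetics, etc.
function buildInvertedIndex(records, field, tokenizer) {
  var tokenize = tokenizer || defaultTokenizer;
  var index = Object.create(null);
  records.forEach(function (record, id) {
    tokenize(record[field]).forEach(function (token) {
      var ids = index[token] || (index[token] = []);
      if (ids.indexOf(id) === -1) {
        ids.push(id);
      }
    });
  });
  return index;
}

// Return the ids of records containing every token of the query.
function search(index, query, tokenizer) {
  var tokens = (tokenizer || defaultTokenizer)(query);
  if (tokens.length === 0) {
    return [];
  }
  return tokens
    .map(function (token) { return index[token] || []; })
    .reduce(function (acc, ids) {
      return acc.filter(function (id) { return ids.indexOf(id) !== -1; });
    });
}
```

Swapping the tokenizer is the extension point: a phonetic search would pass a function that maps each word to its phonetic code, both when indexing and when querying.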

Binary Data

Again an implementation issue, not the spec. While Firefox & IE10 support storing blobs (and TypedArrays??), Chrome still doesn't.

Opera & Safari

Even if we iron most of these issues out, how do we get Opera & Apple to support IDB ??

@mikeal

mikeal commented Aug 21, 2012

Given that the API takes a single callback, should I assume it matches the function pattern common in node.js, function (err, result) {}?

@netroy
Author

netroy commented Aug 21, 2012

@mikeal yep .. which i personally think makes writing async code easier ..

@axemclion

Some comments-

Disclaimer - The jquery plugin that I wrote - http://axemclion.github.com/jquery-indexeddb tries to fix many of the annoyances described above, so I may be a little bit biased.

Transactions

Having methods like objectStore.put or objectStore.get as util methods would be awesome, but I think they would only add bulk to the specification. Multiple ways of doing something often confuse users, and IMHO there would be people complaining about doing the same thing multiple ways. I think the spec should be as basic as possible, and util methods can always be added on top at almost zero cost. Btw, the jquery plugin does recognize this, and hence has put and get both at the object store and at the objectStore.transaction levels.

Versioning

Though a lot of people would not worry about versions, there should be a place where the 'schema' is specified. Having this piece of code separate from the usual read/write operations helps, and I think that the onupgradeneeded callback does exactly that. For people trying to replace localStorage, a simple schema setup in one place should not be very heavy. The jquery plugin has a migration section, allowing users to specify easy migration paths.

Querying

I think I agree with you about the range syntax. Maybe I am a minimalist, but I would prefer just one method that takes undefined or null for the other values. However, I believe that there is value in iterating over cursors, as continue may not be the only option. Deleting, advancing or updating records is also important, and a simple callback syntax may not make that simple. I tried using an objectstore.each(callback) syntax where the return value determined whether a record should be deleted, updated, etc. But the syntax is not any simpler.

Error Codes

Totally agree - in fact I would love for the error code to have more debug information. A great example of error handling is the LINQ2IndexedDB library that exposes the lower level errors for debugging.

Full Text Search

This and database collations are not defined, but from what I know, that is mostly because both are hard problems solved in totally different ways by the participating browsers. Any specification here could have changed the underlying database that was used in the implementation, and I think the committee decided to get a first version of the spec out, and then address these in the next version.
I believe that these two are needed (at least collations), and are big holes in the spec.

Support from all browsers

Would love to see all browsers support it, but till then, I just use the polyfill - http://axemclion.github.com/IndexedDBShim. There are minor differences, but it works for the apps that I have :)

My 2 cents

Since IndexedDB is one of the specifications that relies not on the DOM but only on Javascript, I also wonder how the next version of javascript (with iterators, setters, etc) will change the specification.
Here is a post about it - http://blog.nparashuram.com/2011/11/indexeddb-apis-javascriptnext.html and some really awesome answers by Jonas Sicking (one of the authors of the specification)

@kristofdegrave

Hi,
I was asked by axemclion to give my two cents, so here they are :). (btw I'm currently writing a library on top of the IndexedDB API to ease the pain for developers. It's supported across browsers and should be easy to use. It's called Linq2IndexedDB and you can find it on http://linq2indexeddb.codeplex.com).

Transactions

I understand why you want to do it, and it would be simpler that way, but I think you need to keep in mind that not every transaction handles only a single object store. Btw the issue with the transaction types can be solved with a little shim.
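A shim of that kind might look like the sketch below. This is a guess at the general idea, not the code from any particular library: `transactionMode` is a hypothetical name, `TransactionCtor` stands for `IDBTransaction`, and it assumes older implementations expose the constants READ_ONLY and READ_WRITE while newer ones expect the strings.

```javascript
// Hypothetical shim, not taken from any library.
// TransactionCtor - IDBTransaction in a browser
// mode            - 'readonly' or 'readwrite'
function transactionMode(TransactionCtor, mode) {
  var legacyName = mode === 'readwrite' ? 'READ_WRITE' : 'READ_ONLY';
  // Prefer the legacy constant when the implementation still exposes it,
  // otherwise fall back to the string form.
  if (TransactionCtor && legacyName in TransactionCtor) {
    return TransactionCtor[legacyName];
  }
  return mode;
}
```

Application code would then always write `DB.transaction([storeName], transactionMode(IDBTransaction, 'readwrite'))` and stop caring which form the browser wants.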

Versioning

As you said, I also like the idea of versioning the database. This way you as a developer can easily know how to upgrade the database to its latest version. Of course for simple apps this is overkill. Also, some developers don't even care; they just want to use the database without having to think about its structure (which is a bad idea, but there are developers that think that way). That is why the linq2indexeddb library has an autoCreate function. Whenever you call an object store, it will check whether it already exists. If it doesn't, the framework will increase the version number and create the object store for you. The same goes for indexes: if a developer wants to query on a data property, an index will be created. This enables the developer to start developing prototypes without having to worry about the structure first. That can be dealt with in a later version.
For that versioning, my lib supports several ways. You can define a schema as key-value pairs, where the key is the version and the value is a function that receives a transaction on which you can call createObjectStore, createIndex, ... You can provide a definition, which is a JSON object describing the structure and mutations for every version. The last two options are an onupgrade callback with the current version of the database, the version with which you opened the database and the version change transaction, and a versionchange callback with the version you need to migrate to and the version change transaction. This last callback can be called multiple times depending on the difference between the current version and the requested version. So if the current is 2 and the requested is 4, it gets called 2 times, once with version 3 and once with version 4.
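The "schema as version-to-function map" idea can be sketched on the plain API as follows. This is not the Linq2IndexedDB API; `attachMigrations` and the shape of `schema` are hypothetical, and the sketch assumes the implementation fires `onupgradeneeded` with `oldVersion` and `newVersion` on the event.

```javascript
// Hypothetical helper, not the Linq2IndexedDB API.
// request - the IDBOpenDBRequest returned by indexedDB.open(name, version)
// schema  - map of version number -> function (db, transaction)
function attachMigrations(request, schema) {
  request.onupgradeneeded = function (e) {
    var db = e.target.result;
    var transaction = e.target.transaction; // the versionchange transaction
    // Run every step between the stored version and the requested one, in order.
    for (var version = e.oldVersion + 1; version <= e.newVersion; version++) {
      if (typeof schema[version] === 'function') {
        schema[version](db, transaction);
      }
    }
  };
  return request;
}
```

Opening at version 4 a database that is currently at version 2 would run the steps registered under 3 and 4, matching the stepwise behaviour described above.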

Querying

I think this is the biggest pain in the IndexedDB API, but it is almost impossible to solve. I don't think you can enable things like a full text search. Why? Well, the IndexedDB API doesn't have any idea about the type of the property you want to query on. And even if it knew the type, it can't be sure that all objects containing that property will have the same type for that property. (We are working in js and this is a disadvantage of not working with a strongly typed language, however it can be an advantage as well :)).
Another pain is the fact that you can only query on one property. What if someone wants to narrow the result even more, or wants to use an or operator,...
For solving the querying in my library, I just added a layer above it. The first thing I do when querying is check whether I can use the IndexedDB API to do some of the querying for me. If it is possible, I use the API to do the first query for me; if it isn't, I retrieve all the records from the object store. Next I send all this + filters to a webworker where I do the rest of my querying (things like an in operator, like, ... and even your own filters).
The third pain is sorting. This is the only thing I always do in the webworker. Why? Well, if you sort on an index, only the records containing a value for this index will be retrieved. This is not something I want. I want to have all records when sorting, and not only the ones with a value for that property.
Lastly, I want to note that you do have something like an offset (advance).

Memory leaks

Been there, done that. I can't count the number of times I had issues with unclosed database connections. Luckily restarting the browser solved this, but that is the reason why I don't have a db property holding the db connection in my library. After every action I do, I close the db connection in the complete handler of the transaction. Another option would be keeping a pool of all open connections, and closing them all when I need to upgrade the version. I chose the first option because it is hard to determine when a connection can be closed. I wouldn't like to close a connection that still has an active transaction, which would then result in an error because I closed it.

Error Codes

This is a true hell. None of the browsers handle this the same way. I have just finished wrapping all errors in my lib, so I can give developers a useful error message instead of a domexception code or errorcode or exceptionname, ...

Future

I would love to use the next version of javascript. I have seen something with the yield keyword that makes async calls read like sync calls. This would simplify everything for a lot of developers. It is always a pain having to use callbacks for everything you do.
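For the curious, the yield idea boils down to a small driver like the sketch below. It is a generic illustration and not tied to IndexedDB or to any library: `run` is a hypothetical name, and every yielded value is assumed to be a function that accepts a node-style callback.

```javascript
// Hypothetical driver: runs a generator, resuming it whenever the
// yielded function calls back.
// generatorFn - function* whose yields are functions taking (err, value) callbacks
// done        - node-style function (err, returnValue)
function run(generatorFn, done) {
  var iterator = generatorFn();
  function step(err, value) {
    var state;
    try {
      // Feed the result (or the error) back into the paused generator.
      state = err ? iterator.throw(err) : iterator.next(value);
    } catch (e) {
      return done(e); // the generator did not handle the error
    }
    if (state.done) {
      return done(null, state.value);
    }
    state.value(step); // start the next async operation
  }
  step(null);
}
```

Inside the generator, code then reads top to bottom, e.g. `var user = yield getUser(id);`, with ordinary try/catch for errors.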
If you have any remarks on my 2 cents, let me know. I'm really interested to debate this subject to make the spec even better (and also my library :))

@mikeal

mikeal commented Aug 23, 2012

maybe the way to deal with versions is simply to pass a boolean when creating a store saying whether or not to support versioning. dealing with it at all is annoying and possibly error prone for 90% of the use cases I can think of, while the other 10% are impossible without it. maybe the only solution is to enable or disable it explicitly upon creation.

@mikeal

mikeal commented Aug 23, 2012

also, is the sync API dead now? if it's not officially dead I would advocate killing it entirely.
