##Context
Multiple discussion threads spun out after this tweet mentioning my talk on Full-Text Search in IndexedDB using Inverted Indices at BerlinJS.
Only the beginning of my talk was about my discontent with the current IDB API and how it is a much needed technology. Unfortunately it's not terrific as a spec and has rather messy implementations across browsers. The API is way too low-level and needs a decent amount of abstraction for everyday web-developers to work with it.
Sharing my slides isn't going to be of much help, and neither is discussing them in 140 characters. Hence, I'm going to use this gist to propose changes to the IDB API spec and the implementations.
##Issues
I appreciate the idea of transactions in IDB. They let the system use locks if the transaction is a "Write", but let the system perform multiple "Read" operations when there is no lock, ensuring consistency in the storage.
But in most cases, IDB transactions are single-operation and auto-commit once the last callback is fired. So why can't a single-action operation decide the transaction type itself and abstract it away from the developer who doesn't explicitly need transactions?
So,

    // DB - the handle to the database
    // storeName - the name of the objectStore
    var writeTransaction = DB.transaction([storeName], IDBTransaction.READ_WRITE);
    var store = writeTransaction.objectStore(storeName);
    var writeRequest = store.put({
        'foo': 'bar'
    });
    writeRequest.onsuccess = function (e) {
        callback(null, e.target.result);
    };
    writeRequest.onerror = function (err) {
        callback(err);
    };

could be written in a much simpler way as,

    DB.objectStores[storeName].put({
        'foo': 'bar'
    }, callback);

This also saves us from `IDBTransaction.READ_WRITE` vs `1` vs `readwrite` across the implementations. (You'd know this if you've written any apps that run across a few browsers.)
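
Until something like this lands in the spec, the shortcut can be approximated in user code. The following is only a minimal sketch: `putOnce` is a hypothetical helper name, it assumes the browser accepts the string mode `'readwrite'`, and it reports success when the request succeeds rather than when the transaction commits.

```javascript
// Hypothetical helper (not part of IndexedDB): wraps a single put() in its
// own readwrite transaction and reports the outcome Node-style.
function putOnce(db, storeName, value, callback) {
  var transaction = db.transaction([storeName], 'readwrite');
  var request = transaction.objectStore(storeName).put(value);

  request.onsuccess = function (e) {
    // For put(), the request result is the key of the stored record.
    callback(null, e.target.result);
  };
  request.onerror = function (e) {
    callback(e.target.error);
  };
}
```

With that in place, the write above shrinks to `putOnce(DB, storeName, { 'foo': 'bar' }, callback);`.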
I like the idea of having the databases versioned. It lets you create migration paths whenever the developer decides to change the DB structure.
But I also think that for a large number of use-cases there should be a way of storing data without having to deal with versions at all. The most common use case would be an asynchronous localStorage replacement (it's about time we had one). Someone looking for a simple key-value store shouldn't need to resort to localStorage + JSON.stringify, two synchronous APIs that are enough to kill any app's performance.
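
To make the idea concrete, here is a rough sketch of what such a version-free key-value store could look like when built on the current API. Everything named here (`openKeyValueStore`, the `keyvalue` store name, the `setItem`/`getItem`/`removeItem` methods) is made up for illustration; the version number is simply pinned to 1 so the caller never sees it.

```javascript
// Hypothetical wrapper, not a real API: an async key-value store that hides
// versioning behind a fixed version number and a single object store.
// `factory` is expected to be window.indexedDB (or a vendor-prefixed copy).
function openKeyValueStore(factory, dbName, callback) {
  var STORE = 'keyvalue'; // assumed store name
  var openRequest = factory.open(dbName, 1);

  // Fires only when the database is first created at version 1.
  openRequest.onupgradeneeded = function (e) {
    e.target.result.createObjectStore(STORE);
  };
  openRequest.onerror = function (e) {
    callback(e.target.error);
  };
  openRequest.onsuccess = function (e) {
    var db = e.target.result;

    // Run one request in its own transaction and report back Node-style.
    function run(mode, makeRequest, done) {
      var store = db.transaction([STORE], mode).objectStore(STORE);
      var request = makeRequest(store);
      request.onsuccess = function (ev) { done(null, ev.target.result); };
      request.onerror = function (ev) { done(ev.target.error); };
    }

    callback(null, {
      setItem: function (key, value, done) {
        run('readwrite', function (s) { return s.put(value, key); }, done);
      },
      getItem: function (key, done) {
        run('readonly', function (s) { return s.get(key); }, done);
      },
      removeItem: function (key, done) {
        run('readwrite', function (s) { return s.delete(key); }, done);
      }
    });
  };
}
```

A caller would then write something like `openKeyValueStore(window.indexedDB, 'settings', function (err, kv) { kv.setItem('theme', 'dark', done); });` without ever touching a version number or a transaction.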
Why can't we just have `IDBKeyRange.only` and `IDBKeyRange.bound` as the only key-range functions, and pass `undefined` for open-ended queries like <, ≤, ≥ or >?
This way, the arguments also define the sort direction, so we can just open the cursor on the bounds and iterate through without `IDBCursor.prev` or `IDBCursor.next`. The second argument to `index.openCursor`, if true, can then be used for iterating over unique keys only.
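
A small shim can approximate this on top of the existing range factories. This is only a sketch under a few assumptions: `toRange` is a hypothetical name, keys are compared with plain `>` (fine for numbers and strings, not for array keys), a single `open` flag is applied to both ends, and open-ended ranges are treated as ascending.

```javascript
// Hypothetical helper: maps (a, b) onto the existing IDBKeyRange factories,
// treating undefined as "no bound on this side". The order of a and b
// decides the cursor direction. `KeyRange` is expected to be IDBKeyRange.
function toRange(KeyRange, a, b, open) {
  var descending = a !== undefined && b !== undefined && a > b;
  var lower = descending ? b : a;
  var upper = descending ? a : b;
  var range;

  if (lower === undefined && upper === undefined) {
    range = null; // no restriction: iterate over everything
  } else if (lower === undefined) {
    range = KeyRange.upperBound(upper, open);
  } else if (upper === undefined) {
    range = KeyRange.lowerBound(lower, open);
  } else if (lower === upper) {
    range = KeyRange.only(lower); // equal bounds collapse to a single key
  } else {
    range = KeyRange.bound(lower, upper, open, open);
  }
  return { range: range, direction: descending ? 'prev' : 'next' };
}
```

In a browser that accepts string directions, usage would look like `var q = toRange(IDBKeyRange, 10, 1, false); index.openCursor(q.range, q.direction);`.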
It'd be great to also have utility functions that do the iterations for a bound, limit and offset.
So,

    // DB - the handle to the database
    // storeName - the name of the objectStore
    // indexKey - name of the index
    // lowerBound & upperBound - ranges
    // limit & offset - for pagination
    var queryTransaction = DB.transaction([storeName], IDBTransaction.READ_ONLY);
    var store = queryTransaction.objectStore(storeName);
    var index = store.index(indexKey);
    var lower, upper, direction, results = [], count = 0;
    if (lowerBound > upperBound) {
        lower = upperBound;
        upper = lowerBound;
        direction = IDBCursor.PREV;
    } else {
        lower = lowerBound;
        upper = upperBound;
        direction = IDBCursor.NEXT;
    }
    var bounds = IDBKeyRange.bound(lower, upper, true, true);
    var queryCursor = index.openCursor(bounds, direction);
    queryCursor.onsuccess = function (e) {
        var cursor = e.target.result;
        if (!cursor) {
            callback(null, results);
        } else {
            if (count >= offset) {
                results.push(cursor.value);
                if (results.length === limit) {
                    // Page is full: stop here. Jumping past the upper bound
                    // only works for numeric keys and fails when iterating backwards.
                    return callback(null, results);
                }
            }
            count++;
            cursor.continue();
        }
    };
    queryCursor.onerror = function (err) {
        callback(err);
    };

could be simplified as

    var bounds = IDBKeyRange.bound(lower, upper, true, true);
    var cursor = DB.objectStores[storeName].index(indexKey).openCursor(bounds, false);
    IDBUtils.readCursor(limit, offset, callback);

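
Such a utility is easy enough to write against today's API. The sketch below is one possible shape, not a proposal for the exact signature: it takes the cursor request as an explicit first argument (an assumption on my part), uses `cursor.advance` to skip the offset, and assumes `limit` is at least 1.

```javascript
// Hypothetical utility: collects up to `limit` values from a cursor request
// after skipping the first `offset` records, then calls back Node-style.
function readCursor(cursorRequest, limit, offset, callback) {
  var results = [];
  var skipped = offset <= 0; // nothing to skip when offset is 0

  cursorRequest.onsuccess = function (e) {
    var cursor = e.target.result;
    if (!cursor) {
      return callback(null, results); // ran out of records
    }
    if (!skipped) {
      skipped = true;
      return cursor.advance(offset); // jump over the first `offset` records
    }
    results.push(cursor.value);
    if (results.length >= limit) {
      return callback(null, results); // page is full: stop iterating
    }
    cursor.continue();
  };
  cursorRequest.onerror = function (e) {
    callback(e.target.error);
  };
}
```

Usage would be along the lines of `readCursor(index.openCursor(bounds), limit, offset, callback);`.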
I know it's not a spec's problem if the implementations are bad at managing memory. But the spec can propose that all open handles to IDB databases be closed on context/window unload.
Unclosed DB handles on a page reload account for the vast majority of the memory leak issues I've seen. A good developer would learn it the hard way and close the DB handles. Unfortunately, we can't assume that all IDB consumers are good developers.
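
Until implementations do this on their own, the cleanup has to live in application code. A minimal sketch, assuming a hypothetical `closeOnUnload` helper and the `pagehide` event as the unload signal (substitute `unload` if that suits your target browsers better):

```javascript
// Hypothetical helper: closes the database handle when the page goes away,
// and also when another tab asks to upgrade the database.
// `target` is expected to be the window object.
function closeOnUnload(target, db) {
  target.addEventListener('pagehide', function () {
    db.close();
  });
  db.onversionchange = function () {
    db.close(); // do not block an upgrade requested elsewhere
  };
}
```

Calling `closeOnUnload(window, DB);` right after the database opens is enough to cover the reload case described above.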
Again, not a problem with the spec. But if the browsers implement the synchronous API for workers soon, developers can move all their non-DOM code to a worker, keep the code simpler and unblock the UI. I'm fine with the Async API, but it'd be really good to have both options.
Some IDB error codes are helpful, some aren't. Some are documented in MDN's Obsolete IDBDatabaseException page, some aren't. And some remind me of IE6's `'undefined' is null or not an object at line 0`.
For something as important as IDB, we definitely need better error logging. For most JS errors, the error message in the console is helpful enough. I'd personally like to see equally verbose error messages for IDB.
While creating indexes, if a flag like "text"=true is passed in optionalParameters to the createIndex method of the objectStore, the browser should create an Inverted Index, a Suffix Tree, or even something as simple as a Trie structure for doing a prefix search.
For more advanced use-cases if these tokenizing/indexing/ranking functions are overridable, then we can build phonetic search or auto-correct in the apps.
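
For readers unfamiliar with the term, here is a tiny in-memory sketch of an inverted index with prefix search. It is deliberately naive and is not what a browser would ship: the tokenizer only handles ASCII letters and digits, there is no ranking, and all function names are made up. In IndexedDB itself, one could presumably store the token list on each record and index it with a `multiEntry` index instead.

```javascript
// Split text into lowercase word tokens, dropping duplicates.
function tokenize(text) {
  var seen = Object.create(null); // no inherited keys such as "constructor"
  return (text.toLowerCase().match(/[a-z0-9]+/g) || []).filter(function (t) {
    if (seen[t]) { return false; }
    seen[t] = true;
    return true;
  });
}

// Build a map of token -> [document ids] from an array of { id, text }.
function buildInvertedIndex(docs) {
  var index = Object.create(null);
  docs.forEach(function (doc) {
    tokenize(doc.text).forEach(function (token) {
      (index[token] = index[token] || []).push(doc.id);
    });
  });
  return index;
}

// Prefix search: ids of documents that contain a token starting with `prefix`.
function searchPrefix(index, prefix) {
  var p = prefix.toLowerCase();
  var ids = [];
  Object.keys(index).sort().forEach(function (token) {
    if (token.indexOf(p) === 0) {
      index[token].forEach(function (id) {
        if (ids.indexOf(id) === -1) { ids.push(id); }
      });
    }
  });
  return ids;
}
```

Swapping `tokenize` for a stemming or phonetic function is exactly the kind of override the paragraph above has in mind.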
Again an implementation issue, not the spec. While Firefox & IE10 support storing blobs (and TypedArrays??), Chrome still doesn't.
Even if we iron most of these issues out, how do we get Opera & Apple to support IDB ??
Some comments-
Disclaimer - The jquery plugin that I wrote - http://axemclion.github.com/jquery-indexeddb tries to fix many of the annoyances described above, so I may be a little bit biased.
Transactions
Having util methods like `objectStore.put` or `objectStore.get` would be awesome, but I think they would only add to the bulk of the specification. Multiple ways to do something often confuse users, and IMHO there would be people complaining about doing the same thing multiple ways. I think the spec should be as basic as possible, and util methods can always be added on top at almost zero cost. Btw, the jquery plugin does realize this fact, and hence has `put` and `get` both at the object store and at the objectStore.transaction levels.
Versioning
Though a lot of people would not worry about versions, there should be a place where the 'schema' is specified. Having this piece of code separate from the usual read/write operations helps, and I think that the `onupgradeneeded` callback does exactly that. For people trying to replace localStorage, a simple schema setup in one place should not be very heavy. The jquery plugin has a migration section, allowing users to specify easy migration paths.
Querying
I think I agree with you about the range syntax. Maybe I am a minimalist, but I would prefer just one method that takes undefined or null for other values. However, I believe that there is value in iterating over cursors, as `continue` may not be the only option. Deleting, advancing or updating records is also important, and a simple callback syntax may not make it simple. I tried using an `objectstore.each(callback)` syntax where the return value determined if a record should be deleted, updated, etc. But the syntax is not any simpler.
Error Codes
Totally agree - in fact I would love for the error code to have more debug information. A great example of error handling is the LINQ2IndexedDB library that exposes the lower level errors for debugging.
Full Text Search
This and database collations are not defined in the spec, but from what I know, that is mostly because both are hard problems solved in totally different ways by the participating browsers. Any specification here could have forced changes to the underlying database used in each implementation, and I think the committee decided to get a first version of the spec out and then address these in the next version.
I believe that these two are needed (at least collations), and are big holes in the spec.
Support from all browsers
Would love to see all browsers support it, but till then, I just use the polyfill - http://axemclion.github.com/IndexedDBShim. There are minor differences, but it works for the apps that I have :)
My 2 cents
Since IndexedDB is one of the specifications relying only on JavaScript and not on the DOM, I also wonder how the next version of JavaScript (iterators, setters, etc.) will change the specification.
Here is a post about it - http://blog.nparashuram.com/2011/11/indexeddb-apis-javascriptnext.html and some really awesome answers by Jonas Sicking (one of the authors of the specification)