vhata/gist:2956727

## gistfile1.md

      
In reply to http://www.zopyx.de/blog/goodbye-mongodb
I'm answering the questions as if I'm answering the blogger, not you. :)
First order of business.
http://facility9.com/2010/09/five-reasons-to-use-nosql/
MongoDB was designed for massive scale data storage and the
architecture does it very well.
A design decision is not a flaw if you do not like how it behaves in
your use case.

The current memory model of MongoDB based on memory-mapped files is
brain-dead. Leaving memory management to the operating system is a nice idea,
but in reality it does not scale and does not play very well. There is
no single way to control the memory usage using system tools except
maintaining mongod instances on dedicated virtual machines without
running further services. There are numerous complaints from people
about this stupid architectural decision and 10gen is doing nothing to
change this brain-dead memory model.


That is the way it is supposed to be. Massive dedicated cluster of
boxes for storage. Do not complain when you cannot run other apps on
the box as well. Run Crysis at home and keep data on the MongoCluster.

Locking: a global server lock for a scalable database solution is a
no-go - especially since MongoDB supports only atomic operations. Now
there is relief in the making with more granular locking or the
temporary yielding of the lock during long-running write operations.

I agree with this.

Query engine: the query engine of MongoDB still can only use one
index per query. How insane is this? There is no obvious reason why
this limitation exists. The index model of MongoDB is very similar to
relational databases - in fact: it borrows lots of ideas from
relational databases. Having worked on indexes and search engines
myself for more than a decade I can not recognize any particular
reason why the query engine can not use multiple indexes per query -
the query engine appears poorly implemented.

Did you read 5 reasons to use no-sql? If you want to use complicated
queries instead of map reduce, you are using the wrong database. While
I agree that querying on multiple indexes would be awesome, look at
what MongoDB was designed for instead.
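
To make the multi-index wish concrete, here is a toy sketch in Python. It is not MongoDB's query engine; `build_index`, `query_one_index` and `query_two_indexes` are hypothetical helpers over a plain list of dicts. It only contrasts "use one index, then scan the candidates" with "intersect two indexes".

```python
from collections import defaultdict


def build_index(docs, field):
    """Map each value of `field` to the set of positions of docs holding it."""
    index = defaultdict(set)
    for pos, doc in enumerate(docs):
        if field in doc:
            index[doc[field]].add(pos)
    return index


def query_one_index(docs, index, value, other_field, other_value):
    """Narrow candidates with one index, then scan them for the second condition."""
    candidates = index.get(value, set())
    return sorted(p for p in candidates if docs[p].get(other_field) == other_value)


def query_two_indexes(index_a, value_a, index_b, value_b):
    """Intersect the position sets from two indexes; no document is scanned."""
    return sorted(index_a.get(value_a, set()) & index_b.get(value_b, set()))


docs = [
    {"city": "Berlin", "lang": "de"},
    {"city": "Berlin", "lang": "en"},
    {"city": "Austin", "lang": "en"},
]
city_index = build_index(docs, "city")
lang_index = build_index(docs, "lang")

one = query_one_index(docs, city_index, "Berlin", "lang", "en")
two = query_two_indexes(city_index, "Berlin", lang_index, "en")
```

Both calls find the same document; the difference is only how much has to be scanned once the first index has been consulted.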

Query language: using JSON as a query language was a bad decision. The
current JSON query language works for standard queries but the
functionality of the operators is limited. It is still not possible to
express arbitrary queries like in SQL using JSON. One would argue: not
needed - but in reality there are always cases where you need more
complex queries. The only way around is to implement something
client-side or use the server-side JS code execution (single-threaded,
slow). Having no option to perform an operation comparable to UPDATE
table SET foo=bar WHERE.... (which is possibly a low-hanging fruit).
There are various odds and ends with the query language and its
implementation. E.g. why don't you get an error message when using the
$and operator with a MongoDB version that does not support it? Why does
MongoDB not complain here about an inappropriate usage of operators?
Look at the mailing list and discover such flaws all day long in
various postings. Silently discarding errors is a worse thing. If
there is a problem then raise the issue and don't hide it under the
carpet.

Yes. True. But it all still looks like you are trying to use a
relational database. Go get MySQL or, if you are like me and like real
databases, go get PostgreSQL.
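
The point about silently ignored operators is easy to picture with a toy matcher. This is a sketch only, not how MongoDB evaluates queries: `matches` and `SUPPORTED_OPERATORS` are made-up names, and the operator set is deliberately tiny. The idea is simply that an operator the engine does not know should raise instead of being dropped.

```python
SUPPORTED_OPERATORS = {"$gt", "$lt", "$in"}


def matches(doc, query):
    """Return True if `doc` satisfies `query`; raise on unknown operators."""
    for field, condition in query.items():
        if field.startswith("$"):
            # A top-level operator such as $and that this toy does not implement.
            raise ValueError(f"unsupported operator: {field}")
        if not isinstance(condition, dict):
            # Plain value: simple equality match.
            if doc.get(field) != condition:
                return False
            continue
        for op, arg in condition.items():
            if op not in SUPPORTED_OPERATORS:
                raise ValueError(f"unsupported operator: {op}")
            value = doc.get(field)
            if value is None:
                return False
            if op == "$gt" and not value > arg:
                return False
            if op == "$lt" and not value < arg:
                return False
            if op == "$in" and value not in arg:
                return False
    return True


doc = {"name": "ada", "age": 30}
ok = matches(doc, {"age": {"$gt": 18}, "name": "ada"})

try:
    matches(doc, {"$and": [{"age": {"$gt": 18}}]})
    rejected = False
except ValueError:
    rejected = True
```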

Map-Reduce: Map-reduce in MongoDB feels like a useless appendix added
at some point to MongoDB. Same problem as with server-side code
execution: it blocks. =A0Now instead of fixing a bad implementation or
fixing the underlaying architectural issues, 10gen seems to address
the MR limitations by supporting Hadoop for the MR part - either they
don't trust their own MR implementation or they won't/can't fix it.
No, we do not need more tools for doing map-reduce - there are already
too many moving parts in a setup for scalable applications. Either fix
MR inside MongoDB or throw it out completely.

Yes. The developers need to make a decision on this. Cut your losses
and restart the map reduce.

Sharding: yet another misfeature of MongoDB. Going from a single
server installation to a partitioned setup is a huge step. You need at
least two replica sets for the shards, three config servers and the
load balancers. That's like building a skyscraper beside a small
town-house.

Not an issue. Intended use. You don't start a skyscraper by building a
house first, then start complaining that you need to dig out the old
foundation because you cannot scale up. MongoDB is massive distributed
storage with hot fail-over of replica nodes and intelligent shard
migration. You don't get that with two boxes.

Data-center awareness: yet another feature that has been tinkered
together. Replica sets only support one primary with multiple
secondaries. Writes can only go to one primary. Running a replica set
across multiple data centers is doable but writes can only go to one
primary in one data center. Assume you have a replica set with nodes in
Europe, US and Asia with the current master being located in US: all
writes from US and Asia need to be performed against the master in US
and replicated back to the secondaries in Europe and Asia - insane and
not scalable.

Are you seriously complaining about a method that will ensure data
integrity? Write to primary, sync to secondaries seems like a logical
way to handle this problem.
Now please remember how old MongoDB is and how far they have come. All
features cannot go into version 1.
https://jira.mongodb.org/browse/SERVER-2545

The "safe" mode is off by default: who made this idiotic decision?
Many reports from people about data loss have been seen - just for the
reason that "safe" is off by default. Although this is documented here
and there: does such a decision bring trust to MongoDB? Safe mode must
be enabled by default - people should be able to turn it off for
performance reasons and with the understanding that turning it off may
lead to data loss unless they perform explicit error checking
client-side.

Yes. It is a bad default. Just flip the switch to on and the problem
goes away. You can even set the minimum number of nodes to write to
before returning with a success. A bad default setting doesn't make
the app useless.
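
Here is a rough picture of what "safe" and "minimum number of nodes" buy you, as a toy model in Python. It is not a MongoDB driver: `ToyReplicaSet` and its `reachable` argument are invented for illustration, and only the parameter name `w` is borrowed from MongoDB's write concern option.

```python
class ToyReplicaSet:
    """Toy model: one primary plus secondaries, each just a dict."""

    def __init__(self, secondaries, reachable):
        self.primary = {}
        self.secondaries = [{} for _ in range(secondaries)]
        # `reachable` is how many secondaries currently accept replication.
        self.reachable = reachable

    def write(self, key, value, w=0):
        """Write to the primary, replicate, and require `w` acknowledgements.

        w=0 mimics fire-and-forget: the caller never learns the outcome.
        w>=1 counts the primary plus each secondary that took the write.
        """
        self.primary[key] = value
        acks = 1
        for node in self.secondaries[: self.reachable]:
            node[key] = value
            acks += 1
        if w == 0:
            return None
        if acks < w:
            raise RuntimeError(f"only {acks} of {w} nodes acknowledged")
        return acks


rs = ToyReplicaSet(secondaries=2, reachable=1)
unsafe = rs.write("a", 1)        # fire-and-forget, nothing reported back
acked = rs.write("b", 2, w=2)    # primary plus one secondary confirmed

try:
    rs.write("c", 3, w=3)        # one secondary is unreachable
    failed = False
except RuntimeError:
    failed = True
```

With `w=0` the caller has no idea whether anything went wrong, which is the complaint about the default. With a higher `w` the same problem surfaces as an error the client can handle.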

Journaling: MongoDB pre-allocates 3 GB of data for journaling -
independent of the actual database size(s) - insane for small
installations.

Again: why are you running MongoDB on small installations?
Now talking to Jonathan.
It was a very interesting article. I agree with a few things, but look
at some of his complaints. Non-issues. Foursquare is still raving
about MongoDB. Maybe they are using it like the developers intend it
to be used.
I still like MongoDB a lot, I've played with Mongo/Django apps, but I
still cannot justify using it for one of my apps. It is made for
massive data storage and that is just not something I need. Still cool
to play with though.