In order to get into a trustworthy state, the [big data] toolchain needs to:
- Consolidate. There are too many tools for every job. There are even too many tools to manage your too many tools, and frontends for your frontends.
- Lose weight. Every project depends on way too many other projects, each of which only contributes a tiny fragment for a very specific use case. Get rid of most dependencies!
- Modularize. If you can't get rid of a dependency, but it is still only of