I've been working with Sulu for a few weeks, getting my bearings and just trying to really learn it, and there's this one maddening rabbit hole that just keeps getting deeper. This is going to be a long post, so strap yourself in and grab a cuppa if you'd like to join the discussion.
There's the Sulu CMS. It's structured like this:
[Image: Sulu architecture diagram. Source link not preserved.]
Okay, ignoring the features of Sulu itself in the top part, the "logic" of Sulu is based on Symfony and Symfony CMF, while the data is stored in ... all of that at the bottom? We'll get to that.
Why point out both Symfony and Symfony CMF, if Symfony CMF is just a collection of bundles that "lets users easily add CMS functionality to their Symfony apps"? Eh.
And of the three Data blocks, which one is used for what? Why are all of those listed? Let's look into each separately.
Okay, so Sulu uses PHPCR presumably because Symfony CMF uses PHPCR. So far so good. So what is PHPCR?
PHPCR is a PHPized JCR specification. What? Okay, let's re-check that.
The PHP Content Repository is an adaption of the Java Content Repository (JCR) standard, an open API specification defined in JSR-283. The API defines how to handle hierarchical semi-structured data in a consistent way.
What does this even mean? Okay, let's look into JCR's JSR-283. It turns out it's a 9-year-old specification from Java. Which raises the question: how do you PHPize a specification? A specification is a specification, and PHP implementations could just as well respect the original, no? But okay, we have two specifications now, with PHPCR taking the upper hand because we're in PHP land, so let's move on.
If it's a spec, then it works, right? Then it delivers what it was specced out to do, and no implementation can claim it implements it without actually implementing everything, otherwise it's not an implementation, right? You can't implement an interface in PHP unless you literally implement the required methods, right?
Yeah, about that... If we check the feature table, none of the implementations has all of the spec-prescribed features.
So... WordPress has been powering a large share of the web for well over a decade and has had post revisions for more than half that time, all in MySQL with almost no insurmountable scaling issues, and a "new" implementation of a "new" specification of an "old" specification still doesn't? Even 4 whole years after development on it started?
Lukas Kahwe Smith (@lsmith), 6 June 2017: "but for unstructured data in a tree structure with optional schema its already much better than anything you will quickly whip up yourself"
I think our definitions of "quickly" differ. It would be trivially easy to slap content versions into an RDBMS WP-style, and it would scale just fine if indexed properly. "But this couples you to MySQL" some would say. Sure... and using one of the two different and incomplete implementations of a specification doesn't? Keep in mind that when using PHPCR, you're basically forced to use one of those two right now. Let's focus on that next.
Jackalope Doctrine DBAL
From the website:
Uses a conventional RDBMS (e.g. MySQL) to store the content repository.
So... we went through two layers of specification and ended up with a partial implementation in order to, what, do the same thing WordPress has been doing, but worse? Because this implementation doesn't support versioning.
To summarize, to properly use a broken implementation of an outdated spec written for an obsolete language to store my data, I need not only to know the whole mess of things listed above, but also to be familiar with Doctrine, which carries its own learning curve with it.
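For reference, the WordPress-style approach mentioned above boils down to one table row per saved revision. Here is a minimal sketch of that idea, written in Python with the standard-library `sqlite3` module as a stand-in for MySQL. The table name, column names, and helper functions are all made up for illustration; this is not WordPress's actual schema, nor anything Sulu or PHPCR ships.

```python
import sqlite3

# Hypothetical schema: one row per saved revision of a page.
SCHEMA = """
CREATE TABLE page_versions (
    page_id    INTEGER NOT NULL,
    version    INTEGER NOT NULL,
    title      TEXT    NOT NULL,
    body       TEXT    NOT NULL,
    created_at TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (page_id, version)  -- doubles as the lookup index
);
"""


def open_store():
    """Create an in-memory database holding the revision table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def save_version(conn, page_id, title, body):
    """Append a new revision of a page and return its version number."""
    with conn:  # commits on success, rolls back on error
        (latest,) = conn.execute(
            "SELECT COALESCE(MAX(version), 0) FROM page_versions WHERE page_id = ?",
            (page_id,),
        ).fetchone()
        conn.execute(
            "INSERT INTO page_versions (page_id, version, title, body) "
            "VALUES (?, ?, ?, ?)",
            (page_id, latest + 1, title, body),
        )
    return latest + 1


def current(conn, page_id):
    """Return (version, title, body) of the newest revision, or None."""
    return conn.execute(
        "SELECT version, title, body FROM page_versions "
        "WHERE page_id = ? ORDER BY version DESC LIMIT 1",
        (page_id,),
    ).fetchone()


def history(conn, page_id):
    """Return all version numbers of a page, oldest first."""
    rows = conn.execute(
        "SELECT version FROM page_versions WHERE page_id = ? ORDER BY version",
        (page_id,),
    )
    return [row[0] for row in rows]
```

Old revisions are never overwritten, so rolling back is just reading an earlier row. With concurrent writers the "max plus one" step could collide, in which case the primary key rejects the duplicate and the caller would have to retry.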
So what's my choice if I want to use versioning with Sulu, or any Symfony CMF-based CMS?
Jackrabbit, it appears, is some kind of Java storage engine from Apache. Jackalope Jackrabbit is an implementation of PHPCR (not really, as we saw in the feature table).
Here's what's interesting about Jackrabbit. It can only support up to 10,000 children per node. Since CMS pages are often tree-based, that's fine. A page will have a few children and nothing more to it. But consider, for example, an existing enterprise online magazine that's been online for a while. Consider SitePoint's 50k current posts, still getting hits every day. Consider wanting to migrate. In that case Jackrabbit becomes useless, and a Sulu installation needs something like this: this bundle will auto-shard content into smaller fragments (e.g. by month), so the 10k limit isn't reached. For all intents and purposes it's a hack, but one that works. It's also a hack that's specific to Sulu, and that isn't even in that JCR "solution to all problems in the world" specification.
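To make the sharding idea concrete, here is a rough sketch in Python of bucketing articles under year and month nodes so that no single parent collects every post. The path layout, function names, and root node are assumptions for illustration only; this is not how the Sulu bundle actually lays out its tree.

```python
from collections import Counter
from datetime import date

MAX_CHILDREN = 10_000  # the per-node figure quoted above


def shard_path(root, published, slug):
    """Return a tree path that files an article under year and month nodes."""
    return f"{root}/{published.year:04d}/{published.month:02d}/{slug}"


def children_per_parent(paths):
    """Count how many direct children each parent node would end up with."""
    counts = Counter()
    for path in paths:
        parent = path.rsplit("/", 1)[0]
        counts[parent] += 1
    return counts
```

With a flat layout, 50k posts would all be children of one node. With the layout above, each month node only holds that month's posts, so the count per parent stays far below `MAX_CHILDREN` unless a single month sees more than 10k articles.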
This is all after you install yet another piece of software (Jackrabbit), bog down your server's resources by an extra 30% (Java) and learn to use it.
Edit: just realized that to use the ArticleBundle, you actually need Elasticsearch (why, if PHPCR provides search?), which is a nightmare to install, especially considering everything else we had to install so far. I'm now wondering about the target audience for Sulu. Surely a company which just needs a CMS for pages could easily get those up and running with a static page generator? And a company which wants to make an online magazine (like SitePoint) would do better using WP, which has versioning and doesn't need even 20% of the software running on the server that Sulu does. So I'm kind of confused as to who the intended end user of Sulu apps is. Can someone list some companies and their areas of work that use Sulu so I can learn from examples?
After all this, after 2 specifications and 5 levels of tools spread across all of that, what is a good alternative for a CMS to store content? Here are some thoughts:
- RDBMS like it's been done so far. It works. It's dirty, but it's no more dirty than incomplete specifications that don't work with each other. Add a new row per version and be done with it. It scales just fine, trust us.
- MySQL and PostgreSQL both now support JSON fields. These can easily store and search unstructured data.
- Store content as files. Built-in version control right there. Keep permissions, relations, etc., in RDBMS.
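As a rough illustration of the JSON-field option from the list above, here is a small Python sketch that stores schemaless documents in a single column and filters on a field inside them. It uses the standard-library `sqlite3` module and SQLite's `json_extract` function as a stand-in, and assumes the SQLite build includes the JSON functions (most current builds do). The table and function names are invented for the example. MySQL and PostgreSQL each have their own JSON column types and operators, so the query would look different there.

```python
import json
import sqlite3


def open_json_store():
    """Create an in-memory table with one JSON document per row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE content (id INTEGER PRIMARY KEY, doc TEXT NOT NULL)"
    )
    return conn


def add_document(conn, fields):
    """Store an arbitrary dict as JSON and return the new row id."""
    cur = conn.execute(
        "INSERT INTO content (doc) VALUES (?)", (json.dumps(fields),)
    )
    conn.commit()
    return cur.lastrowid


def find_by_field(conn, path, value):
    """Return (id, document) pairs whose JSON value at `path` equals `value`."""
    rows = conn.execute(
        "SELECT id, doc FROM content WHERE json_extract(doc, ?) = ? ORDER BY id",
        (path, value),
    )
    return [(row_id, json.loads(doc)) for row_id, doc in rows]
```

Documents with different shapes can share the table, and `find_by_field(conn, "$.author.name", "Ann")` reaches into nested fields without any schema change. For frequent lookups a real database would want an index on the extracted value.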
With all this said, I'm desperately looking for human-friendly explanations on why PHPCR exists, why Symfony CMF exists, why it's all so complex and incomplete, and why it's in any way, shape or form a better solution than any of the alternatives listed above. I want to learn this, please help me understand, but try to think from a new user perspective, not someone who used or developed this and thus must love it.
I spent a good bit of time getting Vagrant to play along nicely with Sulu because I approached it from a new user perspective, several times, until it clicked and worked completely. I'd like to feel that click in terms of Sulu, PHPCR, and Symfony CMF in general.
Patrik admits that the PHPCR implementations need more love and contributors, and sure, yeah, I can see that. But in my opinion, a project (PHPCR) so dramatically incomplete and lacking in features should never end up in a project as serious as Sulu, which is now rapidly approaching version 2.0 and has people (and companies) depending on it. As a company owner looking to adopt a new CMS to peddle instead of WP, this concerns me greatly.