joemaller/wordpress-archives.md

## wordpress-archives.md

      
# WordPress Archives: Here we go again

We've been here before. A WordPress site needs to list a collection of things. Native WordPress archives seem like a good idea. But then they're not.
Beyond the most basic use, WordPress archives tend to fail in a number of ways.
For page-dominant sites, archives are unintuitive: they just appear magically, or don't. None of the expected authoring tools are available, and archive endpoints don't show up as a default choice when building menus.
There's no clean way to author content on archive landing pages. Embedding content into template files is too rigid and contrary to the entire point of a CMS. Pulling custom values from a separate ACF options page requires extra documentation (which no one reads), is easily forgotten and is a mess conceptually.
A common solution is to build a custom page template which mimics the desired archive, but allows users to directly author additional content and metadata. The downside of these pages is that authors see them as mostly empty and it all feels kind of broken.
From a development point of view, pages and archives don't mix well. Custom Post Type archive endpoints will override a page with the same permalink. So if we have an archive listing at /thing-archive/ and then create a Thing Archive page with the same /thing-archive/ permalink, the new page won't be accessible, because the archive's rewrite rule matches the request first.

## Which way do we go?

Our biggest need is for authors to be able to add content and metadata to archive landing pages. There appear to be two possible solutions:


1. Mix archives and pages, adjusting the $wp_rewrite rules so WordPress serves page permalinks before archives.

2. Create our own page-based archives by disabling CPT archives and creating custom, page-aware templates.
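
The second option hinges on a single registration argument. Here is a minimal sketch, assuming a hypothetical `thing` post type; `thing_post_type_args()` is an illustrative helper and not part of WordPress, while `has_archive` and `rewrite` are standard `register_post_type()` arguments.

```php
<?php
/**
 * Registration arguments for a hypothetical "thing" post type.
 * With has_archive set to false, WordPress generates no archive
 * rewrite rule, leaving that permalink free for a regular page.
 */
function thing_post_type_args(): array
{
    return [
        'label'       => 'Things',
        'public'      => true,
        'has_archive' => false, // no native archive endpoint
        'rewrite'     => ['slug' => 'thing'],
    ];
}

// Only register inside WordPress, so the file can also be loaded standalone.
if (function_exists('add_action')) {
    add_action('init', function () {
        register_post_type('thing', thing_post_type_args());
    });
}
```

Rewrite rules are cached, so a change to `has_archive` only takes effect after the rules are flushed, for example by re-saving the Permalinks settings screen.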


## TL;DR

Custom page templates using get_query_var('page').

## The Permalink Problem

The default WordPress rewrite rules prioritize archives over pages. Despite their utility, pages are pretty close to the bottom of the routing stack. Only when almost every other routing pattern has failed to match is the request treated as a possible page.
The complete set of rewrite rules is a huge associative array where each key is a regular expression pattern and each value is a query string with captured replacements. This is the heart of WordPress routing. With WP_DEBUG_LOG enabled, the complete set of rules can be dumped to debug.log with something like this:
```php
// Log every rewrite rule, in matching order, once WordPress has loaded.
add_action('wp_loaded', function () {
    global $wp_rewrite;
    error_log(print_r($wp_rewrite->wp_rewrite_rules(), true));
});
```
Down near the bottom is the routing pattern for pages:
```php
'(.?.+?)(?:/([0-9]+))?/?$' => "index.php?pagename=$matches[1]&page=$matches[2]"
```
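
To see what that pattern captures, it can be run outside WordPress. The sketch below assumes WordPress matches rules roughly this way (leading and trailing slashes trimmed, pattern anchored with `^` and wrapped in `#` delimiters); `PAGE_RULE_PATTERN` and `match_page_rule()` are hypothetical names used only for illustration.

```php
<?php
// The page rule's regular expression, copied from the rewrite array.
const PAGE_RULE_PATTERN = '(.?.+?)(?:/([0-9]+))?/?$';

/**
 * Match a request path against the page rule and return the query vars
 * it would produce, or null when the rule does not match.
 */
function match_page_rule(string $path): ?array
{
    $request = trim($path, '/');
    if (!preg_match('#^' . PAGE_RULE_PATTERN . '#', $request, $matches)) {
        return null;
    }
    return [
        'pagename' => $matches[1],
        // An unmatched trailing optional group is absent from $matches.
        'page'     => $matches[2] ?? '',
    ];
}

foreach (['/example-page/', '/example-page/2/', '/parent/child/222/'] as $path) {
    echo $path, ' => ', json_encode(match_page_rule($path)), PHP_EOL;
}
```

Because the first group is lazy and the numeric group is optional, a trailing integer segment always ends up in `page` rather than in `pagename`.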
We could use the WP Rewrite API to rearrange the default routing order so pages would be matched before CPT archives.
But this isn't without risks.
There are certainly sound, non-obvious reasons pages have remained way down in the routing pile. Promoting the page-matching pattern over 100+ other routes could allow a poorly considered page slug to unexpectedly override other areas of the site. Besides every Custom Post Type, other things WordPress matches before pages include tags, categories, search, date-based archives, authors, feeds and the JSON API.
Considering how fundamental this routing table is to WordPress and how much could go wrong, this starts to feel like a Chesterton's Fence decision which could end up causing far more problems than it solves.
Don't fight WordPress. WordPress always wins.

## Paged Pages

Looking more closely at the routing pattern for pages reveals a curiously little-used feature: WordPress always checks for an optional integer page value after the page's name-slug. That value is then passed on to the page itself.
This sounds a little bit insane, but that routing pattern means pretty much every page on every WordPress site is also accessible from 2⁶³ - 1 alternate, numbered URLs. These all point to the same page:

/example-page/
/example-page/2/
/example-page/222/
/example-page/9223372036854775807/

Crazy as that may be, it means pages can be paged. And if a custom page template is already collecting items with a custom query, it's not difficult to add pagination using the query_var's value. With that, there's very little reason to use native archives at all.
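
A paged template along those lines might look like the sketch below. It assumes a hypothetical `thing` post type and a default of 10 items per page; `collection_query_args()` is an illustrative helper, while `WP_Query`, `get_query_var()` and the `paged` argument are standard WordPress.

```php
<?php
/**
 * Build WP_Query arguments for one page of a collection.
 * $page is the raw 'page' query var, which is empty or 0 on the bare page URL.
 */
function collection_query_args(string $postType, $page, int $perPage = 10): array
{
    return [
        'post_type'      => $postType,
        'posts_per_page' => $perPage,
        // /thing/ and /thing/1/ both resolve to the first page of results.
        'paged'          => max(1, (int) $page),
    ];
}

// Inside a page template (WordPress loaded), usage might look like this.
if (class_exists('WP_Query')) {
    $things = new WP_Query(collection_query_args('thing', get_query_var('page')));
    while ($things->have_posts()) {
        $things->the_post();
        the_title('<h2>', '</h2>');
    }
    wp_reset_postdata();
}
```

Keeping the argument-building separate from the loop means the paging logic can be checked without a running WordPress install.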

## A path forward

Compared to the complexity of trying to integrate pages and native archives, paged custom templates seem like a much cleaner, safer solution.
Each CPT requiring an archive page should have a matching top-level, named template file. Authors will create an archive by assigning the appropriate template to the page.
Configuration of collection page templates will be baked into the template source files or set via template-specific edit-page Advanced Custom Fields. The templates will be page-aware, so subsets of the collection can be viewed and paged through.
But this is just the beginning. There's an even better solution.

## Enter Blocks

Instead of a custom template, the new Gutenberg Block Editor means we can create custom collection blocks. These dynamic, page-aware blocks will allow authors to create archive pages without custom templates. Any page or post will be capable of displaying a collection and authors will clearly see that collected content in the editor.
This is by far the most portable, flexible and logical solution to the archive problem.
Ideally, the CPT-specific collection block will be included with and imported from the CPT code, so developer complexity would be reduced to a single PHP include statement.
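
The server side of such a block could be sketched as follows. The block name `example/thing-collection`, the `perPage` attribute and `render_thing_collection()` are all hypothetical; `register_block_type()` with a `render_callback` is the standard way to register a dynamic block in PHP.

```php
<?php
/**
 * Render collected items as an HTML list.
 * $items is a plain array of ['title' => ..., 'url' => ...] pairs, so the
 * markup can be produced without WordPress.
 */
function render_thing_collection(array $items): string
{
    $html = '<ul class="thing-collection">';
    foreach ($items as $item) {
        $html .= sprintf(
            '<li><a href="%s">%s</a></li>',
            htmlspecialchars($item['url'], ENT_QUOTES),
            htmlspecialchars($item['title'], ENT_QUOTES)
        );
    }
    return $html . '</ul>';
}

// Register the dynamic block only when WordPress is loaded.
if (function_exists('add_action')) {
    add_action('init', function () {
        register_block_type('example/thing-collection', [
            'attributes'      => ['perPage' => ['type' => 'number', 'default' => 10]],
            'render_callback' => function (array $attributes): string {
                $query = new WP_Query([
                    'post_type'      => 'thing',
                    'posts_per_page' => (int) $attributes['perPage'],
                    'paged'          => max(1, (int) get_query_var('page')),
                ]);
                $items = array_map(function ($post) {
                    return ['title' => get_the_title($post), 'url' => get_permalink($post)];
                }, $query->posts);
                return render_thing_collection($items);
            },
        ]);
    });
}
```

This covers only the PHP half. The editor still needs a matching JavaScript registration before authors can insert and preview the block.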

## Summary

WordPress pages are self-contained entities; everything on the page should be editable from the page.
Archives are simply collections of items. It's a reasonable assumption that an "archive page" should be created by adding a collection to a page.
Custom page templates and dynamic collection blocks will let authors create archives anywhere, adding supplemental content and metadata with tools they already know. Development complexity will be reduced by removing extra templates, eliminating clumsy metadata workarounds and shrinking the routing surface.

## More page-archive integration issues


- The page-based solution solves the problem of user-authored archive titles.

- If a page-archive shows a different number of items than the archive default, several entries would either become unreachable or appear twice unless we adjust the posts_per_page value. (E.g., with an archive default of 10 items, the first item on the second archive page would be item 11. If the page author sets the block to display 6 items, items 7-10 would become unreachable. If the page author sets the block to display 15 items, then items 11-15 would appear twice.)

- We also might be able to use paged pages to work around that by "paging" our pages and omitting native archives altogether. If we go this route, URLs would be something like /thing/, /thing/2, /thing/6 etc. We would need to calculate which items actually appear on the page. This would eliminate the need to mess with $wp_rewrite but would also complicate convenience functions like paginate_links.

- One suggested workaround for mixing pages and archives is to split archives off into a separate URL structure which doesn't overlap with pages. This feels sloppy and would duplicate some content between the custom page and the first archive page. It could also be difficult to provide an archive click-through when the number of displayed items differs from posts_per_page.
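
The item-count mismatch described in this list is plain arithmetic, sketched below. Both function names are hypothetical, and item numbers are 1-based positions in the full collection.

```php
<?php
/**
 * First and last item numbers (1-based) shown on a given page.
 */
function items_on_page(int $page, int $perPage): array
{
    $first = ($page - 1) * $perPage + 1;
    return [$first, $first + $perPage - 1];
}

/**
 * Items lost or repeated when a landing page showing $shown items hands
 * off to page 2 of an archive that pages by $archivePerPage.
 */
function handoff_mismatch(int $shown, int $archivePerPage): array
{
    [$nextFirst] = items_on_page(2, $archivePerPage);
    return [
        'unreachable' => $shown + 1 < $nextFirst ? range($shown + 1, $nextFirst - 1) : [],
        'duplicated'  => $shown >= $nextFirst ? range($nextFirst, $shown) : [],
    ];
}

print_r(handoff_mismatch(6, 10));  // unreachable: 7, 8, 9, 10
print_r(handoff_mismatch(15, 10)); // duplicated: 11, 12, 13, 14, 15
```

When the page itself does the paging, `items_on_page()` with the page's own item count gives the range directly and the mismatch disappears.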


## Questions and Concerns


- Where do all the source files go? For now we'll just import the custom block into the editor_blocks.js entry point, but this smells bad and likely isn't sustainable. It would be nice to enable everything from a single PHP include, but how would we handle JS transpilation and compiling Sass-based styles?

- Native archive pages don't appear in sitemaps generated by The SEO Framework. Our first pages will appear, since they're just pages, but should additional paged-archive pages also appear?

- While pages are numerically pageable, a CPT with an overlapping slug will hijack that pageability. So a CPT at /thing-collection/[thing] will work with a page named /thing-collection/ but /thing-collection/2 will look for a single CPT and not page 2 of the page.
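
The slug hijack in the last point can be reproduced with a first-match-wins walk over two rules. This is a sketch: the CPT rule shown is an assumed approximation of what WordPress generates for a `thing` post type with a `thing-collection` rewrite slug, and `first_matching_rule()` is a hypothetical stand-in for the core matching loop.

```php
<?php
/**
 * Return the query string of the first rule matching $path, mimicking
 * a first-match-wins walk through the rewrite array.
 */
function first_matching_rule(array $rules, string $path): ?string
{
    $request = trim($path, '/');
    foreach ($rules as $pattern => $query) {
        if (preg_match('#^' . $pattern . '#', $request, $matches)) {
            // Substitute $matches[n] placeholders with the captured groups.
            return preg_replace_callback('/\$matches\[(\d+)\]/', function ($m) use ($matches) {
                return $matches[(int) $m[1]] ?? '';
            }, $query);
        }
    }
    return null;
}

// Assumed rule order: the CPT single-entry rule sits above the page rule.
$rules = [
    'thing-collection/([^/]+)(?:/([0-9]+))?/?$' => 'index.php?thing=$matches[1]&page=$matches[2]',
    '(.?.+?)(?:/([0-9]+))?/?$'                  => 'index.php?pagename=$matches[1]&page=$matches[2]',
];

echo first_matching_rule($rules, '/thing-collection/'), PHP_EOL;   // index.php?pagename=thing-collection&page=
echo first_matching_rule($rules, '/thing-collection/2/'), PHP_EOL; // index.php?thing=2&page=
```

The second request never reaches the page rule: `2` is captured as the name of a single `thing` entry.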


## Todos


- For future flexibility, Custom Post Type definitions will include a per_page query var to enable more flexible collections.

- Collected items viewed in the Block Editor should show an edit-link to get to their individual entry/edit page.

- Existing paged archives will need a redirect rule to point URLs like /thing/page/3 to /thing/3.

- Archive collection blocks should offer controls for number of items and/or date-constraints. Likely two different blocks to keep the interface and internal logic from getting too messy.
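
The redirect todo could start from something like the sketch below. `legacy_archive_redirect()` and the slug list are hypothetical, and the hook assumes WordPress is installed at the site root; `template_redirect`, `home_url()` and `wp_safe_redirect()` are standard WordPress.

```php
<?php
/**
 * Map a legacy archive path like /thing/page/3/ to its paged-page
 * equivalent /thing/3/. Only the given slugs are rewritten; any other
 * path returns null.
 */
function legacy_archive_redirect(string $path, array $slugs): ?string
{
    $alternatives = implode('|', array_map(function ($slug) {
        return preg_quote($slug, '#');
    }, $slugs));
    if ($alternatives !== '' && preg_match('#^/(' . $alternatives . ')/page/([0-9]+)/?$#', $path, $m)) {
        return '/' . $m[1] . '/' . $m[2] . '/';
    }
    return null;
}

// Hooked into WordPress, the redirect could run before templates load.
if (function_exists('add_action')) {
    add_action('template_redirect', function () {
        $path   = (string) parse_url($_SERVER['REQUEST_URI'] ?? '', PHP_URL_PATH);
        $target = legacy_archive_redirect($path, ['thing']);
        if ($target !== null) {
            wp_safe_redirect(home_url($target), 301);
            exit;
        }
    });
}
```

Limiting the match to known slugs keeps ordinary /page/N/ URLs elsewhere on the site untouched.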