@pepopowitz
Last active November 14, 2023 22:41
Camunda 7 search `noindex` experiment definition

(Contents copied with permission from a private GitHub issue.)

The current state of Camunda 7 (C7) search is:

  • Versions 7.15 through 7.18, develop/master, and latest all have canonical URLs pointing at the 7.18 version of the page
  • The 7.18 sitemap contains canonical URLs that point at the 7.18 versions
  • The 7.18 sitemap has been submitted to Google to crawl/index
  • Versions 7.6 through 7.17 are marked to not be indexed by Google (with a <meta name="robots" content="noindex" /> tag)
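To make the two signals above concrete, here is a minimal sketch that reads a page's HTML and reports whether it carries a robots noindex directive and which canonical URL it declares. It uses only the Python standard library. The function and class names, and the sample URL path, are assumptions made for illustration; they are not part of the docs build.

```python
from html.parser import HTMLParser


class IndexingTagParser(HTMLParser):
    """Collects the robots meta directive and the canonical link of a page."""

    def __init__(self):
        super().__init__()
        self.robots = None     # content of <meta name="robots">, if present
        self.canonical = None  # href of <link rel="canonical">, if present

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.robots = attrs.get("content")
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")


def indexing_state(html):
    """Return the noindex flag and declared canonical URL for one page."""
    parser = IndexingTagParser()
    parser.feed(html)
    noindex = "noindex" in (parser.robots or "").lower()
    return {"noindex": noindex, "canonical": parser.canonical}


# Hypothetical page from an older version: noindexed, canonical points at 7.18.
sample = (
    '<html><head>'
    '<meta name="robots" content="noindex" />'
    '<link rel="canonical" href="https://docs.camunda.org/manual/7.18/example/" />'
    '</head><body></body></html>'
)
print(indexing_state(sample))
```

A page in the state described above would report both a noindex flag and a 7.18 canonical, which is the combination the hypothesis below calls into question.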

In this state, we still see Google pointing at a lot of pre-7.18 canonicals. There are many reasons Google might choose a different canonical than the one we've declared, but the noindex tags on older versions appear to contribute. They seem to prevent Google from re-reading the older version pages (and seeing that those pages declare 7.18 to be canonical).

Hypothesis

  • The noindex tags on older versions are unnecessary. If we mark the old versions with a proper canonical tag, Google will not penalize duplicate content, and will be likely to choose the newer version of the page as canonical.
  • The noindex tags are an impediment to Google correcting its canonicals, and cause Google to hang on to old canonicals despite our efforts to convince the Googlebot otherwise.

Experiment 1: Remove noindex for a few pages

  • Find fewer than 5 pages for which Google has chosen the 7.17 version as canonical.
  • Capture their state in Google Search Console, and via a Google search for site:docs.camunda.org <phrase visible on that specific page>.
  • Update the 7.17 site to remove the noindex tag on only those pages (camunda/camunda-docs-manual#1395).
  • Submit the 7.17 pages to be crawled via Google Search Console.
  • Capture any changes in Google Search Console, and repeat the Google search for a phrase on each page. Note that it could take weeks for changes to appear.
  • Analyze results.
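The per-page search in the steps above can be scripted so the same query is run before and after the change. This is a minimal sketch: the helper name is hypothetical, and wrapping the phrase in double quotes (for an exact-phrase match) is an assumption rather than something the steps specify.

```python
from urllib.parse import urlencode


def site_search_url(phrase, site="docs.camunda.org"):
    """Build a Google search URL restricted to one site for a given phrase."""
    # Quoting the phrase asks for an exact match; this is an assumption.
    query = f'site:{site} "{phrase}"'
    return "https://www.google.com/search?" + urlencode({"q": query})


# Hypothetical phrase; replace it with text visible on the page under test.
print(site_search_url("phrase visible on that specific page"))
```

Recording the generated URL next to each captured result makes it easier to repeat exactly the same search weeks later.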

Results: 2 of the 3 test pages became searchable, with version 7.18 chosen as canonical! Conclusion: we should extend the experiment.

Experiment 2: Remove noindex from an entire version

  • Update the 7.15 site to remove the noindex tag from all pages, and deploy it (camunda/camunda-docs-manual#1409)
  • Click the "Validate Fix" button on the "Duplicate, Google chose different canonical than user" dataset
  • Wait for a few weeks
  • Re-analyze "Duplicate, Google chose different canonical than user" dataset
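Re-analyzing the dataset mostly means counting how many of the Google-selected canonicals belong to each docs version. The sketch below does that for a plain list of URLs. It assumes the URLs have already been exported from Search Console into a list, and that versioned pages live under a /manual/<version>/ path; both are assumptions, as are the example URLs.

```python
import re
from collections import Counter

# Assumed URL layout: https://docs.camunda.org/manual/<version>/...
VERSION_RE = re.compile(r"/manual/(\d+\.\d+|latest|develop)/")


def count_by_version(urls):
    """Count URLs per docs version; unmatched URLs are grouped as 'unknown'."""
    counts = Counter()
    for url in urls:
        match = VERSION_RE.search(url)
        counts[match.group(1) if match else "unknown"] += 1
    return counts


# Hypothetical Google-selected canonicals, for illustration only.
google_canonicals = [
    "https://docs.camunda.org/manual/7.15/example-a/",
    "https://docs.camunda.org/manual/7.15/example-b/",
    "https://docs.camunda.org/manual/7.18/example-a/",
    "https://docs.camunda.org/javadoc/example/",
]
print(count_by_version(google_canonicals))
```

Running the same count on the export taken before the change and on the one taken a few weeks later gives the two 7.15 figures to compare.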

Success criteria

  • 7.15 canonicals have declined by at least 50% in the "Duplicate, Google chose different canonical than user" dataset
  • No new 7.15 results appear in Google search
    • i.e. The Google search query site:docs.camunda.org inurl:"15" -inurl:"javadoc" still yields 0 results.
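The two criteria above reduce to a small check: the relative decline in 7.15 canonicals must be at least 50%, and the site: query must still return nothing. The following is a sketch of that arithmetic only. The function name and the example counts are hypothetical, not measured values from the experiment.

```python
def experiment_succeeded(before, after, search_results, min_decline=0.5):
    """Evaluate both success criteria for the 7.15 experiment.

    before / after: number of 7.15 canonicals in the "Duplicate, Google chose
    different canonical than user" dataset, before and after the change.
    search_results: number of results for the site: query on 7.15 pages.
    """
    if before <= 0:
        raise ValueError("need a positive baseline count to measure a decline")
    decline = (before - after) / before  # 0.5 means a 50% decline
    return decline >= min_decline and search_results == 0


# Hypothetical counts, for illustration only.
print(experiment_succeeded(before=200, after=90, search_results=0))
```

A count that stays flat or rises gives a decline of zero or less, so the check fails regardless of the search result count.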

Results: 7.15 canonicals did not decline, and some 7.15 pages became newly canonical. Conclusion: removing noindex from all pages would result in many non-7.18 canonicals.

Experiment cleanup
