@vitorio
Last active December 30, 2021 07:02
DEPRECATED /.well-known/archival-transfer-marker
# A /.well-known/archival-transfer-marker file is a short-lived,
# well-known location meant to act as a boundary marker for web
# archives. It marks the END of a period of time, and SHOULD itself
# exist for no more than 24 hours (a TTL of 86400 seconds).
#
# The capture of a /.well-known/archival-transfer-marker file
# indicates that any captures of any other pages on that domain AFTER
# THE CAPTURE OF THE MARKER may have different owners, rights,
# robots.txt policies, permissions, etc.
#
# The capture of a /.well-known/archival-transfer-marker file on its
# own SHOULD NOT confer any new or different rights or permissions to
# the archiving party for existing captures; its capture only marks a
# potential change in explicit policy, whether due to domain name
# ownership or other reasons.
#
# A /.well-known/archival-transfer-marker file SHOULD NOT exist
# outside of a period of transfer of ownership or change in
# policy. The intended use case is for the INTENTIONAL CREATION of
# the file, the INTENTIONAL SUBMISSION of the file to web archives,
# and then the INTENTIONAL REMOVAL of the file, all within a 24 hour
# (86400 second) period.
#
# A /.well-known/archival-transfer-marker file which persists much
# beyond that MAY be considered equivalent to no file at all.
#
# A /.well-known/archival-transfer-marker file MAY be empty; its
# existence is enough to indicate the boundary. The file MAY also
# include human-readable context for the boundary change, suitable to
# support the decision-making of web archives as to whether to
# continue to support the availability of content after e.g. a
# robots.txt policy change.
#
#
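The 24-hour lifetime rule above can be sketched as a check an archive might run over its own captures. This is only an illustration: the function name `marker_is_bounded` and the strict 86400-second cutoff are assumptions, since the text says only that a marker persisting "much beyond" that MAY be treated as no file at all, and an archive could reasonably allow some slack.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# 24 hours, the TTL stated in the text above.
MARKER_TTL = timedelta(seconds=86400)


def marker_is_bounded(
    marker_seen: datetime,
    gone_seen: Optional[datetime],
    ttl: timedelta = MARKER_TTL,
) -> bool:
    """Return True if the marker was captured and then confirmed absent
    (for example, a captured 404) no later than `ttl` afterwards.

    Hypothetical helper: the strict cutoff is an assumption, not part
    of the draft.
    """
    if gone_seen is None:
        # The removal was never captured, so the lifetime is unknown.
        return False
    elapsed = gone_seen - marker_seen
    return timedelta(0) <= elapsed <= ttl


# Example: marker captured, then its 404 captured three hours later.
captured = datetime(2021, 1, 1, 12, 0, tzinfo=timezone.utc)
print(marker_is_bounded(captured, captured + timedelta(hours=3)))
```

An archive that prefers leniency could pass a larger `ttl` rather than change the default.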
# EXAMPLE:
#
# To support web archives which use robots.txt as a policy for
# presenting archived content, a domain:
# - with a robots.txt which accurately reflects its desired
# archiving policy, and
# - for which a near-future ownership change may result in a change
# in robots.txt,
# SHOULD:
# 1. capture its robots.txt in a web archive,
# 2. create /.well-known/archival-transfer-marker,
# 3. capture /.well-known/archival-transfer-marker in the same web
# archive,
# 4. delete /.well-known/archival-transfer-marker, and
# 5. capture the 404 for /.well-known/archival-transfer-marker in the
# same web archive.
# This indicates that a change in robots.txt AFTER the capture of the
# marker does not reflect the policy for captures made AT OR PRIOR TO
# the time of the marker. As such, an archive MAY choose to use a
# previously captured robots.txt as policy for older captures
# instead.
#
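One way an archive might act on this is to pick the governing robots.txt capture by comparing timestamps against the marker. The sketch below is a guess at such a policy, not something the draft prescribes: the function name `robots_capture_for` and the rule of taking the latest robots.txt capture on the same side of the marker are assumptions.

```python
from datetime import datetime
from typing import Optional, Sequence


def robots_capture_for(
    page_time: datetime,
    marker_time: datetime,
    robots_times: Sequence[datetime],
) -> Optional[datetime]:
    """Pick which robots.txt capture governs a page capture.

    Assumed policy: pages captured at or before the marker use the
    latest robots.txt captured at or before the marker; pages captured
    after it use the latest robots.txt captured after it.
    """
    if page_time <= marker_time:
        candidates = [t for t in robots_times if t <= marker_time]
    else:
        candidates = [t for t in robots_times if t > marker_time]
    # None means no robots.txt capture exists on that side of the marker.
    return max(candidates, default=None)
```

With this rule, a restrictive robots.txt published by a new owner would not retroactively hide captures made under the previous owner.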
#
# EXAMPLE:
#
# To support web archives which can have human-designated exceptions
# for what are otherwise robots.txt-based policies, a domain:
# - with a robots.txt which accurately reflects its desired
# archiving policy, and
# - for which a near-future ownership change may result in a change
# in robots.txt,
# MAY:
# 1. capture its robots.txt in a web archive,
# 2. create /.well-known/archival-transfer-marker with an explicit
# attestation such as:
#
#     This domain is transferring ownership in the near future. The
#     current owners have held this domain since Fri Nov 07 05:52:26
#     2008, and recognize caches, archives, and libraries may have
#     captured content from this domain during that time. The owners
#     retain any and all rights in that preserved content, but grant
#     caches, archives, and libraries a nonexclusive right to present
#     said content as appropriate in their capacity as caches,
#     archives, and/or libraries.
#
# 3. capture /.well-known/archival-transfer-marker in the same web
# archive,
# 4. delete /.well-known/archival-transfer-marker, and
# 5. capture the 404 for /.well-known/archival-transfer-marker in the
# same web archive.
# This indicates that a change in robots.txt AFTER the capture of the
# marker does not reflect the policy for captures made AT OR PRIOR TO
# the time of the marker. In addition, it formally notes the transfer
# of ownership and gives the grant of permissions a start date, with
# the capture of the marker serving as its end date. As
# such, an archive MAY choose to use a previously captured robots.txt
# as policy for older captures instead.
#
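The five steps in the two examples above could be scripted roughly as follows. This is a sketch under stated assumptions: `publish_marker` is a hypothetical name, the site is assumed to be served from a local directory (`webroot`), and `capture` stands in for whatever mechanism a given web archive offers for requesting a capture of a URL, which the draft does not specify.

```python
from pathlib import Path
from typing import Callable

MARKER_PATH = ".well-known/archival-transfer-marker"


def publish_marker(
    webroot: Path,
    base_url: str,
    capture: Callable[[str], None],
    attestation: str = "",
) -> None:
    """Create, capture, and remove the marker in one pass.

    `capture` is a caller-supplied function (an assumption) that asks
    a web archive to capture the given URL.
    """
    base = base_url.rstrip("/")
    marker_url = f"{base}/{MARKER_PATH}"
    marker_file = webroot / MARKER_PATH

    capture(f"{base}/robots.txt")  # step 1: capture current robots.txt

    # Step 2: create the marker; an empty attestation is allowed.
    marker_file.parent.mkdir(parents=True, exist_ok=True)
    marker_file.write_text(attestation, encoding="utf-8")
    try:
        capture(marker_url)  # step 3: capture the marker
    finally:
        marker_file.unlink()  # step 4: always remove the marker
    capture(marker_url)  # step 5: capture the resulting 404
```

Removing the file in a `finally` block keeps a failed capture from leaving the marker in place past its intended lifetime.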
#
# EXAMPLE:
#
# To support web archives which can have human-designated exceptions
# for what are otherwise robots.txt-based policies, an entity which
# has taken control of a domain due to expiration of hosting,
# registration, or other services, and which automatically "parks"
# the domain (placing generic, unrelated, or advertising content on
# it),
# MAY:
# 1. create /.well-known/archival-transfer-marker with an explicit
# attestation such as:
#
#     This domain was automatically parked after a domain registration
#     expiration. The expiration date was 2020-05-16T22:03:38+00:00.
#
# 2. capture /.well-known/archival-transfer-marker in a web archive,
# 3. delete /.well-known/archival-transfer-marker, and
# 4. capture the 404 for /.well-known/archival-transfer-marker in the
# same web archive.
# This indicates that a change in robots.txt AFTER the capture of the
# marker does not reflect the policy for captures made AT OR PRIOR TO
# the time of the marker. In addition, it formally notes the date of
# the ownership change, which is likely to precede the creation of
# the marker; web archives may have captured a changed robots.txt or
# other parked pages in the interim. As such, an archive MAY
# choose to use a previously captured robots.txt as policy for older
# captures instead.
#
#
# Sources considered:
#
# "Digitization of Special Collections and Archives: Legal and
# Contractual Issues," which has model deeds that the first example
# attestation was modeled after:
# https://publications.arl.org/rli279/6
# via Nancy Sims.
#
# "The Oakland Archive Policy," for which this document helps address
# some of the cases in the "re-insertions of web sites based on
# change of ownership" category:
# https://groups.ischool.berkeley.edu/archive/aps/removal-policy
#
# "Robots.txt Files and Archiving .gov and .mil Websites," for which
# this document helps address the example of "a domain name changes
# hands:"
# http://blog.archive.org/2016/12/17/robots-txt-gov-mil-websites/
#
# "Robots.txt meant for search engines don’t work well for web
# archives," for which this document helps address the example of
# "parked domains:"
# http://blog.archive.org/2017/04/17/robots-txt-meant-for-search-engines-dont-work-well-for-web-archives/
# all three via Jonah Edwards.
#
vitorio commented Dec 30, 2021

This draft is deprecated! Please don't use it; a pilot uncovered technical issues. See http://vitor.io/archival-markers instead for an updated archival ownership markers proposal.
