Several specifications define the syntax of identifiers in contexts that are relevant to XML producers.
https://www.w3.org/TR/html4/types.html#type-id
ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").
ID = IDStartChar IDChar*
IDChar = IDStartChar | [0-9] | "-" | "_" | ":" | "."
IDStartChar = [A-Z] | [a-z]
https://www.w3.org/TR/xmlschema-2/#ID
The ·value space· of ID is the set of all strings that ·match· the NCName production in [Namespaces in XML]
NCName = NCNameStartChar NCNameChar*
NCNameChar = NCNameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
NCNameStartChar = [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
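Note: This permits everything that the HTML 4 ID production does except for ':'. It also permits a leading underscore and many non-ASCII characters, which HTML 4 does not.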
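As a rough illustration, the NCName production above can be transcribed into a Python regular expression. This is a minimal sketch, not a reference implementation: the names NCNAME_START, NCNAME_CHAR, NCNAME_RE and is_ncname are invented for this example, and the character ranges are copied directly from the production.

```python
import re

# Ranges copied from the NCNameStartChar production above.
NCNAME_START = (
    "A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)

# NCNameChar adds "-", ".", digits, #xB7 and two further ranges.
NCNAME_CHAR = NCNAME_START + r"\-.0-9" + "\u00B7\u0300-\u036F\u203F-\u2040"

# Hypothetical helper names; fullmatch() is used so the whole string must match.
NCNAME_RE = re.compile(f"[{NCNAME_START}][{NCNAME_CHAR}]*")


def is_ncname(value: str) -> bool:
    """Return True if value matches the NCName production (syntax only)."""
    return NCNAME_RE.fullmatch(value) is not None
```

With this sketch, "sec-1" and "_a" are accepted, while "fig:1" (contains a colon) and "1a" (starts with a digit) are rejected.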
https://www.w3.org/TR/REC-xml/#id
Values of type ID must match the Name production. A name must not appear more than once in an XML document as a value of this type; i.e., ID values must uniquely identify the elements which bear them.
Name = NameStartChar (NameChar)*
NameChar = NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
NameStartChar = ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
Note: This is the NCName production plus ':', so it is more permissive than either HTML 4 or xsd:ID.
https://www.w3.org/TR/html5/dom.html#the-id-attribute
The value must be unique amongst all the IDs in the element's home subtree and must contain at least one character. The value must not contain any space characters. There are no other restrictions on what form an ID can take; in particular, IDs can consist of just digits, start with a digit, start with an underscore, consist of just punctuation, etc.
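The HTML5 syntax rule is loose enough to check without a regular expression. The sketch below assumes the HTML5 definition of "space characters" (space, tab, line feed, form feed, carriage return); the names HTML5_SPACE_CHARS and is_html5_id are hypothetical. It checks syntax only and does not test uniqueness within the subtree.

```python
# Space characters as defined by HTML5: U+0020, U+0009, U+000A, U+000C, U+000D.
HTML5_SPACE_CHARS = " \t\n\f\r"


def is_html5_id(value: str) -> bool:
    """Return True if value is non-empty and contains no HTML5 space character."""
    return len(value) > 0 and not any(ch in HTML5_SPACE_CHARS for ch in value)
```

Under this check, values such as "123", "_" and "fig:1" are all acceptable, which shows how much wider the HTML5 rule is than the other three.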
For maximum compatibility, we recommend that identifiers in JATS documents follow this production:
JATSID = JATSIDStartChar JATSIDChar*
JATSIDChar = JATSIDStartChar | [0-9] | "-" | "_" | "."
JATSIDStartChar = [A-Z] | [a-z]
Or in regex form: [A-Za-z][-_.A-Za-z0-9]*
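This is the HTML 4 production with ':' removed, as the XML NCName production requires. Any identifier that matches it is therefore syntactically valid under all four specifications above.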
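One way a producer might apply the recommended pattern is to scan a document for id attributes and report those that do not conform, along with duplicates. The following is a minimal sketch using Python's standard library; the function name check_ids and the shape of its return value are assumptions made for this example, and it only inspects attributes literally named "id".

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# The recommended pattern in regex form. fullmatch() is used instead of
# match() with "$", because "$" would also accept a trailing newline.
JATS_ID_RE = re.compile(r"[A-Za-z][-_.A-Za-z0-9]*")


def check_ids(xml_text: str) -> dict:
    """Report id values that break the recommended pattern or are repeated."""
    root = ET.fromstring(xml_text)
    ids = [el.get("id") for el in root.iter() if el.get("id") is not None]
    counts = Counter(ids)
    return {
        "nonconforming": sorted({i for i in ids if not JATS_ID_RE.fullmatch(i)}),
        "duplicates": sorted(i for i, n in counts.items() if n > 1),
    }


sample = (
    '<article><sec id="sec-1"/><fig id="1fig"/>'
    '<fig id="fig:2"/><p id="sec-1"/></article>'
)
report = check_ids(sample)
```

For the sample above, the report lists "1fig" and "fig:2" as nonconforming and "sec-1" as a duplicate.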