This algorithm can be used to parse the Link header fields that an HTTP header set contains. Given a `header_set` of (string `field_name`, string `field_value`) pairs, assuming ASCII encoding, it returns a list of link objects.
- Let `field_values` be a list containing the members of `header_set` whose `field_name` is a case-insensitive match for "link".
- Let `links` be an empty list.
- For each `field_value` in `field_values`:
  - Let `value_links` be the result of Parsing A Link Field Value from `field_value`.
  - Append each member of `value_links` to `links`.
- Return `links`.
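The header-set steps above can be sketched in Python as follows. This is only an illustration: `parse_link_headers` and `toy_parser` are names invented here, and the field-value parser is passed in as a callable so that the block stays self-contained. The stand-in parser merely splits on commas and is not the real link-value algorithm.

```python
def parse_link_headers(header_set, parse_link_field_value):
    """Collect link objects from every Link field in header_set."""
    # Field names are matched case-insensitively against "link".
    field_values = [
        value for name, value in header_set if name.lower() == "link"
    ]
    links = []
    for field_value in field_values:
        links.extend(parse_link_field_value(field_value))
    return links


# Hypothetical stand-in parser: treats each comma-separated piece as one link.
def toy_parser(field_value):
    return [piece.strip() for piece in field_value.split(",") if piece.strip()]


headers = [
    ("Content-Type", "text/html"),
    ("LINK", "<a>; rel=next"),
    ("link", "<b>; rel=prev, <c>; rel=up"),
]
print(parse_link_headers(headers, toy_parser))
```

Links from several Link fields end up in one flat list, in the order the fields appear in the header set.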
This algorithm (Parsing a Link Field Value) parses zero or more comma-separated link-values from a Link header field. Given a string `field_value`, assuming ASCII encoding, it returns a list of link objects.
- Let `links` be an empty list.
- While `field_value` has content:
  - Consume any leading OWS.
  - If the first character is not "<", return `links`.
  - Discard the first character ("<").
  - Consume up to but not including the first ">" character or end of `field_value` and let the result be `target_string`.
  - If the next character is not ">", return `links`.
  - Discard the leading ">" character.
  - Let `link_parameters` be the result of Parsing Parameters from `field_value` (consuming zero or more characters of it).
  - Let `target` be the result of relatively resolving (as per {{RFC3986}}, Section 5.2) `target_string`. Note that any base URI carried in the payload body is NOT used.
  - Let `relations_string` be the second item of the first tuple of `link_parameters` whose first item matches the string "rel", or the empty string ("") if it is not present.
  - Split `relations_string` on RWS (removing it in the process) into a list of strings `relation_types`.
  - Let `context_string` be the second item of the first tuple of `link_parameters` whose first item matches the string "anchor". If it is not present, `context_string` is the identity of the representation carrying the Link header ({{RFC7231}}, Section 3.1.4.1), serialised as a URI. Where the identity is "anonymous", `context_string` is null.
  - Let `context` be the result of relatively resolving (as per {{RFC3986}}, Section 5.2) `context_string`, unless `context_string` is null, in which case `context` is null. Note that any base URI carried in the payload body is NOT used.
  - Let `target_attributes` be an empty list.
  - For each tuple (`param_name`, `param_value`) of `link_parameters`:
    - If `param_name` matches "rel" or "anchor", skip this tuple.
    - If `param_name` matches "media", "title", "title*" or "type" and `target_attributes` already contains a tuple whose first element matches the value of `param_name`, skip this tuple.
    - Append (`param_name`, `param_value`) to `target_attributes`.
  - Let `star_param_names` be the set of `param_name`s in the (`param_name`, `param_value`) tuples of `link_parameters` where the last character of `param_name` is an asterisk ("*").
  - For each `star_param_name` in `star_param_names`:
    - Let `base_param_name` be `star_param_name` with the last character removed.
    - If the implementation does not choose to support an internationalised form of a parameter named `base_param_name` for any reason (including, but not limited to, it being prohibited by the parameter's specification), remove all tuples from `link_parameters` whose first member is `star_param_name` and skip to the next `star_param_name`.
    - Remove all tuples from `link_parameters` whose first member is `base_param_name`.
    - Change the first member of all tuples in `link_parameters` whose first member is `star_param_name` to `base_param_name`.
  - For each `relation_type` in `relation_types`:
    - Case-normalise `relation_type` to lowercase.
    - Append a link object to `links` with the target `target`, relation type of `relation_type`, context of `context`, and target attributes `target_attributes`.
- Return `links`.
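The second half of the loop body (from resolving the target to appending link objects) might look roughly like this in Python. It is a sketch under several assumptions: `first_value` and `build_links` are invented names, `link_parameters` is taken as an already parsed list of tuples, a single `base_uri` argument stands in both for the base of relative resolution and for the identity of the representation, and the star-parameter steps are left out entirely. `urllib.parse.urljoin` is used for the relative resolution.

```python
from urllib.parse import urljoin


def first_value(parameters, name, default=None):
    """Return the value of the first (name, value) tuple matching name."""
    for param_name, param_value in parameters:
        if param_name == name:
            return param_value
    return default


def build_links(target_string, link_parameters, base_uri):
    """Build link dicts for one link-value (star parameters not handled)."""
    target = urljoin(base_uri, target_string)
    # A missing "rel" yields an empty list, hence no link objects at all.
    relation_types = first_value(link_parameters, "rel", "").split()
    anchor = first_value(link_parameters, "anchor")
    # Assumption: without "anchor", the context is base_uri itself.
    context = base_uri if anchor is None else urljoin(base_uri, anchor)
    single_use = {"media", "title", "title*", "type"}
    target_attributes = []
    for name, value in link_parameters:
        if name in ("rel", "anchor"):
            continue
        if name in single_use and any(n == name for n, _ in target_attributes):
            continue  # only the first occurrence of these is kept
        target_attributes.append((name, value))
    return [
        {"target": target, "rel": rel.lower(), "context": context,
         "attributes": target_attributes}
        for rel in relation_types
    ]


params = [("rel", "Next prefetch"), ("title", "A"), ("title", "B"),
          ("hreflang", "en"), ("hreflang", "de")]
for link in build_links("next", params, "http://example.com/dir/page"):
    print(link["rel"], link["target"], link["attributes"])
```

One link-value with two relation types produces two link objects that share the same target, context, and attributes. The repeated "title" is dropped while the repeated "hreflang" is kept, matching the skip rule for "media", "title", "title*" and "type".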
This algorithm (Parsing Parameters) parses the parameters from a header field value. Given an ASCII string `input`, it returns a list of (string `parameter_name`, string `parameter_value`) tuples that it contains. `input` is modified to remove the parsed parameters.
- Let `parameters` be an empty list.
- While `input` has content:
  - Consume any leading OWS.
  - If the first character is not ";", return `parameters`.
  - Discard the leading ";" character.
  - Consume any leading OWS.
  - Consume up to but not including the first BWS, "=", ";", or "," character or end of `input` and let the result be `parameter_name`.
  - Consume any leading BWS.
  - If the next character is "=":
    - Discard the leading "=" character.
    - Consume any leading BWS.
    - If the next character is DQUOTE, let `parameter_value` be the result of Parsing a Quoted String from `input` (consuming zero or more characters of it).
    - Else, consume the contents up to but not including the first ";" or "," character or end of `input` and let the result be `parameter_value`.
    - If the last character of `parameter_name` is an asterisk ("*"), decode `parameter_value` according to {{I-D.ietf-httpbis-rfc5987bis}}. Continue processing `input` if an unrecoverable error is encountered.
  - Else:
    - Let `parameter_value` be an empty string.
  - Case-normalise `parameter_name` to lowercase.
  - Append (`parameter_name`, `parameter_value`) to `parameters`.
  - Consume any leading OWS.
  - If the next character is "," or the end of `input`, stop processing `input` and return `parameters`.
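A rough Python rendering of the parameter-parsing steps is shown below. Because Python strings are immutable, the sketch returns the unconsumed remainder alongside the parameters instead of modifying `input` in place. The function names are invented for this example, the decoding of "*" parameter values is omitted, and the quoted-string helper assumes that ordinary characters are appended to the output.

```python
OWS = " \t"  # optional whitespace; BWS uses the same characters


def parse_quoted_string(text):
    """Return (unquoted, rest) for text that starts with DQUOTE."""
    output = []
    i = 1  # skip the opening DQUOTE
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 1
            if i >= len(text):
                break
            output.append(text[i])
        elif ch == '"':
            return "".join(output), text[i + 1:]
        else:
            output.append(ch)
        i += 1
    return "".join(output), ""


def parse_parameters(text):
    """Return (parameters, rest); rest is the unconsumed tail of text."""
    parameters = []
    while text:
        text = text.lstrip(OWS)
        if not text.startswith(";"):
            return parameters, text
        text = text[1:].lstrip(OWS)
        # The parameter name runs up to whitespace, "=", ";" or ",".
        end = 0
        while end < len(text) and text[end] not in OWS + "=;,":
            end += 1
        name, text = text[:end], text[end:].lstrip(OWS)
        if text.startswith("="):
            text = text[1:].lstrip(OWS)
            if text.startswith('"'):
                value, text = parse_quoted_string(text)
            else:
                end = 0
                while end < len(text) and text[end] not in ";,":
                    end += 1
                value, text = text[:end], text[end:]
            # Decoding of "name*" values is intentionally left out here.
        else:
            value = ""
        parameters.append((name.lower(), value))
        text = text.lstrip(OWS)
        if not text or text.startswith(","):
            return parameters, text
    return parameters, text


print(parse_parameters('; REL="next prev";title=Hello, <b>'))
```

The comma that ends the parameter list is left in the returned remainder, so a caller that wants to continue with the next link-value has to deal with it.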
This algorithm (Parsing a Quoted String) parses a quoted string, as per {{RFC7230}}, Section 3.2.6. Given an ASCII string `input`, it returns an unquoted string. `input` is modified to remove the parsed string.
- Let `output` be an empty string.
- If the first character of `input` is not DQUOTE, return `output`.
- Discard the first character.
- While `input` has content:
  - If the first character is a backslash ("\"):
    - Discard the first character.
    - If there is no more `input`, return `output`.
    - Else, consume the first character and append it to `output`.
  - Else, if the first character is DQUOTE, discard it and return `output`.
  - Else, consume the first character and append it to `output`.
- Return `output`.
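As a minimal sketch, the quoted-string steps could be written in Python like this. The function name is made up for the example, and the remainder of the input is returned rather than modified in place. Ordinary characters are appended to the output, which the loop needs in order to make progress.

```python
def parse_quoted_string(text):
    """Return (output, rest): the unquoted string and the unconsumed input."""
    output = []
    if not text.startswith('"'):
        return "", text
    pos = 1  # discard the opening DQUOTE
    while pos < len(text):
        ch = text[pos]
        pos += 1
        if ch == "\\":
            if pos >= len(text):
                break  # trailing backslash: nothing left to unescape
            output.append(text[pos])  # the escaped character, taken literally
            pos += 1
        elif ch == '"':
            break  # closing DQUOTE has been discarded
        else:
            output.append(ch)
    return "".join(output), text[pos:]


# The input "a\"b" followed by more text: the escaped quote does not end it.
print(parse_quoted_string('"a\\"b" rest'))
# The input "\\": the second backslash is escaped, so the last quote closes.
print(parse_quoted_string('"\\\\"'))
```

Handling the backslash first means an escaped backslash is consumed as a pair, so a DQUOTE that follows it is correctly treated as the closing quote.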
I think this goes in the right direction; it's mostly a parser based on a relaxed ABNF transformed to prose (which makes me wonder whether there's a tool to be written here :-).
Other comments:
- `"\\"`: here, the DQUOTE is preceded by a backslash, but it's not part of the quoted-pair.