@hugowetterberg
Last active March 6, 2024 18:28
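A useful function for folding iCalendar content lines at 75 octets, taking multibyte characters into account. See RFC 5545, section 3.1 (https://www.rfc-editor.org/rfc/rfc5545#section-3.1), which obsoletes RFC 2445, section 4.1 (http://www.ietf.org/rfc/rfc2445.txt).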
<?php
mb_internal_encoding("UTF-8");
$desc = <<<TEXT
<p>Lines of text SHOULD NOT be longer than 75 octets, (och hör på den) excluding the line break. Long content lines SHOULD be split into a multiple line representations using a line "folding" technique.</p>
That is, a long line can be split between any two characters by inserting a CRLF
immediately followed by a single linear white space character (i.e.,
SPACE, <b>US-ASCII</b> decimal 32 or HTAB, US-ASCII decimal 9). Any sequence
of CRLF followed immediately by a single linear white space character
is ignored (i.e., removed) when processing the content type.
TEXT;
/**
* Apply folding compliant with RFC 5545
* See https://www.rfc-editor.org/rfc/rfc5545#section-3.1
*
* @param string $preamble The property name, e.g. DESCRIPTION
* @param string $value The value for the property, e.g. a very long string
* @param bool $strip_tags Strip HTML tags from the value
*
* @return string Returns the folded string without the property name
*/
function ical_split($preamble, $value, $strip_tags = true)
{
    $value = trim($value);
    $value = preg_replace('/[\r\n]+/', ' ', $value);
    $value = preg_replace('/\s{2,}/', ' ', $value);
    if ($strip_tags) {
        $value = strip_tags($value);
    }
    // Fold the complete content line so the property name counts towards the limit.
    $value = $preamble . ':' . $value;
    $length = strlen($value);
    $offset = 0;
    $limit = 75; // octets available on the first line
    $lines = [];
    while ($offset < $length) {
        // mb_strcut() counts bytes but never cuts inside a multibyte character.
        $line = mb_strcut($value, $offset, $limit);
        if ($line === '') {
            break; // malformed input; avoid an endless loop
        }
        $lines[] = $line;
        // Advance by the bytes actually consumed so that nothing is skipped.
        $offset += strlen($line);
        $limit = 74; // continuation lines also carry the one-octet tab
    }
    return substr(join("\r\n\t", $lines), strlen($preamble) + 1);
}

// Pass the bare property name; ical_split() adds the colon itself.
$split = ical_split('DESCRIPTION', $desc);
print 'DESCRIPTION:' . $split;

// Test results
$lines = preg_split('/\r\n/', 'DESCRIPTION:' . $split);
print "\n\nTests\n";
foreach ($lines as $i => $line) {
    print "Line {$i}: " . strlen($line) . " octets\n";
}

print "\nAlt desc output:\n";
$split = ical_split('X-ALT-DESC', $desc, false);
print 'X-ALT-DESC:' . $split;
print "\n\n";

DESCRIPTION:Lines of text SHOULD NOT be longer than 75 octets, (och hör p
	å den) excluding the line break. Long content lines SHOULD be split into 
	a multiple line representations using a line "folding" technique. That is,
	 a long line can be split between any two characters by inserting a CRLF i
	mmediately followed by a single linear white space character (i.e., SPACE,
	 US-ASCII decimal 32 or HTAB, US-ASCII decimal 9). Any sequence of CRLF fo
	llowed immediately by a single linear white space character is ignored (i.
	e., removed) when processing the content type.

Tests
Line 0: 74 octets
Line 1: 75 octets
Line 2: 75 octets
Line 3: 75 octets
Line 4: 75 octets
Line 5: 75 octets
Line 6: 75 octets
Line 7: 47 octets

Alt desc output:
X-ALT-DESC:<p>Lines of text SHOULD NOT be longer than 75 octets, (och hör 
	på den) excluding the line break. Long content lines SHOULD be split into
	 a multiple line representations using a line "folding" technique.</p> Tha
	t is, a long line can be split between any two characters by inserting a C
	RLF immediately followed by a single linear white space character (i.e., S
	PACE, <b>US-ASCII</b> decimal 32 or HTAB, US-ASCII decimal 9). Any sequenc
	e of CRLF followed immediately by a single linear white space character is
	 ignored (i.e., removed) when processing the content type.
@hugowetterberg
Author

Huh, 14 years... time flies :)
Your implementation looks nice and elegant @viavario. Stripping out tags should probably have been separate from the folding, but I added an optional param to your implementation that can be used to disable tag stripping, preserving the old behaviour.
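
For anyone who wants tag stripping kept apart from folding, here is a minimal sketch of how that could look. The helper names `ical_clean_text()`, `ical_fold()` and `ical_unfold()` are hypothetical and not part of the gist, and the code assumes valid UTF-8 input and the mbstring extension. `ical_unfold()` is only there to check that folding loses nothing.

```php
<?php
// Hypothetical helpers for illustration; none of these names exist in the gist.

/** Normalise free text: optional tag stripping, then collapse whitespace runs. */
function ical_clean_text(string $value, bool $strip_tags = true): string
{
    if ($strip_tags) {
        $value = strip_tags($value);
    }
    // Every run of whitespace (line breaks included) becomes a single space.
    return trim(preg_replace('/\s+/u', ' ', $value));
}

/** Fold one complete content line ("NAME:value") at $limit octets per line. */
function ical_fold(string $line, int $limit = 75): string
{
    $lines = [];
    $offset = 0;
    $length = strlen($line);
    $room = $limit; // the first line may use the full limit
    while ($offset < $length) {
        // mb_strcut() counts bytes but never cuts through a multibyte character.
        $chunk = mb_strcut($line, $offset, $room, 'UTF-8');
        if ($chunk === '') {
            break; // defensive: malformed input must not loop forever
        }
        $lines[] = $chunk;
        $offset += strlen($chunk);
        $room = $limit - 1; // continuation lines start with a one-octet tab
    }
    return implode("\r\n\t", $lines);
}

/** Reverse folding: drop each CRLF that is followed by one space or tab. */
function ical_unfold(string $folded): string
{
    return preg_replace('/\r\n[ \t]/', '', $folded);
}

$line = 'DESCRIPTION:' . ical_clean_text("<p>Hör på  den</p>\n" . str_repeat('å', 100));
$folded = ical_fold($line);
```

With this split, an `X-ALT-DESC` value would call `ical_clean_text($html, false)` before `ical_fold()`, and the folding routine no longer needs to know about HTML at all.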
