@joshuaadickerson
Last active August 29, 2015 14:27
SMF/Elkarte BBC Parser

This is a very slow and thorough rewrite of the BBC parsing in Elkarte.

I have done this once or twice before with miserable results. This time, I am going to do it much more thoroughly.

Each commit should pass all of the tests (as I write those tests). Not every commit will result in a faster parser, but the end result should be faster and use fewer resources. In the end, I hope to make it much more maintainable and more object-oriented.

Long term, this should use preg_match() with offset capture, and perhaps even an AST.
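
As a rough illustration of the offset-capture idea, the sketch below finds the next tag at or after a given byte offset and reports where it sits in the message. The helper name `findNextTag` and the simplified tag pattern are assumptions made for this example; they are not part of the parser.

```php
<?php
/**
 * Find the next BBC-like tag at or after $offset.
 * Hypothetical helper: the pattern only covers simple [tag] and [/tag] forms.
 *
 * @param string $message
 * @param int $offset byte offset to start searching from
 * @return array|null tag name, whether it is a closing tag, byte position and length
 */
function findNextTag($message, $offset = 0)
{
	// PREG_OFFSET_CAPTURE turns every match into array(text, byte offset).
	if (preg_match('~\[(/?)([a-z]+)[^\]]*\]~i', $message, $m, PREG_OFFSET_CAPTURE, $offset) !== 1)
	{
		return null;
	}

	return array(
		'tag' => strtolower($m[2][0]),
		'closing' => $m[1][0] === '/',
		'pos' => $m[0][1],
		'len' => strlen($m[0][0]),
	);
}

$message = 'Hello [b]world[/b]!';
$first = findNextTag($message);
// Continue the scan right after the first tag instead of re-slicing the string.
$second = findNextTag($message, $first['pos'] + $first['len']);
```

Passing the offset back into the next call lets a scanner walk the message without copying the remainder of the string on every step.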

The idea is to preg_split() on any [$tag, [/$tag (itemcodes included) and ].
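
One way to read that note is to split the message on tag openers and on the closing bracket while keeping the delimiters, so that a later pass can walk the pieces. The pattern below is an illustrative guess (it treats any `[name`, `[/name`, `[*` or `]` as a delimiter), and `splitOnTags` is a hypothetical name.

```php
<?php
// Hypothetical tokenizer sketch; the delimiter pattern is an assumption.
function splitOnTags($message)
{
	// Delimiters: "[/tag", "[tag", the "[*" itemcode opener, or "]".
	$pattern = '~(\[/?[a-z]+|\[\*|\])~i';

	// DELIM_CAPTURE keeps the delimiters in the result;
	// NO_EMPTY drops the empty strings between adjacent delimiters.
	return preg_split($pattern, $message, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
}

$parts = splitOnTags('a [b]bold[/b] z');
// $parts is array('a ', '[b', ']', 'bold', '[/b', ']', ' z')
```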

In index.php, setting SAVE_TOP_RESULTS to true makes it create a CSV file. To parse that CSV file, open TopResults.php.

=== Changes

  • $no_autolink_tags no longer exists. It is now an attribute of the tag, "autolink".
  • You can get the BBC definitions without loading parse_bbc(): loading is separate from parsing.
  • Changed substr() == str to substr_compare(). Don't use strpos() when what you want is a substr_compare() either.
  • Separated construction of the parser from its execution.
  • Replaced substr() . substr() with substr_replace().
  • Removed ftp:// autolinking and the [ftp] tag. It was pointless to keep them.
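
The two substr() changes in the list can be illustrated with a short sketch. The message and offset here are made up for the example.

```php
<?php
$message = 'Some [b]bold[/b] text';
$pos = 5; // byte offset of "[b]" in $message

// Before: substr($message, $pos, 3) === '[b]' copies three bytes just to compare them.
// After: substr_compare() compares in place and returns 0 on a match.
$isBoldTag = substr_compare($message, '[b]', $pos, 3) === 0;

// Before: substr($message, 0, $pos) . '<strong>' . substr($message, $pos + 3)
// After: substr_replace() swaps the three bytes at $pos in a single call.
$replaced = substr_replace($message, '<strong>', $pos, 3);
```

strpos() is a poor substitute for the comparison because it keeps scanning the rest of the string when the tag is not at the expected offset.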
<?php
/**
* Microsoft uses their own character set Code Page 1252 (CP1252), which is a
* superset of ISO 8859-1, defining several characters between DEC 128 and 159
* that are not normally displayable. This converts the popular ones that
* appear from a cut and paste from windows.
*
* @param string|false $string
* @return string $string
*/
function sanitizeMSCutPaste($string)
{
if (empty($string))
return $string;
// UTF-8 occurrences of MS special characters
$findchars_utf8 = array(
"\xe2\x80\x9a", // single low-9 quotation mark
"\xe2\x80\x9e", // double low-9 quotation mark
"\xe2\x80\xa6", // horizontal ellipsis
"\xe2\x80\x98", // left single curly quote
"\xe2\x80\x99", // right single curly quote
"\xe2\x80\x9c", // left double curly quote
"\xe2\x80\x9d", // right double curly quote
"\xe2\x80\x93", // en dash
"\xe2\x80\x94", // em dash
);
// safe replacements
$replacechars = array(
',', // &sbquo;
',,', // &bdquo;
'...', // &hellip;
"'", // &lsquo;
"'", // &rsquo;
'"', // &ldquo;
'"', // &rdquo;
'-', // &ndash;
'--', // &mdash;
);
$string = str_replace($findchars_utf8, $replacechars, $string);
return $string;
}
/**
* Parse smileys in the passed message.
*
* What it does:
* - The smiley parsing function which makes pretty faces appear :).
* - If custom smiley sets are turned off by smiley_enable, the default set of smileys will be used.
* - These are specifically not parsed in code tags [url=mailto:Dad@blah.com]
* - Caches the smileys from the database or array in memory.
* - Doesn't return anything, but rather modifies message directly.
*
* @param string $message
*/
function parsesmileys(&$message)
{
global $modSettings, $txt, $user_info;
static $smileyPregSearch = null, $smileyPregReplacements = array();
// No smiley set at all?!
if ($user_info['smiley_set'] == 'none' || trim($message) == '')
return;
// If smileyPregSearch hasn't been set, do it now.
if (empty($smileyPregSearch))
{
// Use the default smileys if it is disabled. (better for "portability" of smileys.)
if (empty($modSettings['smiley_enable']))
{
$smileysfrom = array('>:D', ':D', '::)', '>:(', ':))', ':)', ';)', ';D', ':(', ':o', '8)', ':P', '???', ':-[', ':-X', ':-*', ':\'(', ':-\\', '^-^', 'O0', 'C:-)', 'O:)');
$smileysto = array('evil.gif', 'cheesy.gif', 'rolleyes.gif', 'angry.gif', 'laugh.gif', 'smiley.gif', 'wink.gif', 'grin.gif', 'sad.gif', 'shocked.gif', 'cool.gif', 'tongue.gif', 'huh.gif', 'embarrassed.gif', 'lipsrsealed.gif', 'kiss.gif', 'cry.gif', 'undecided.gif', 'azn.gif', 'afro.gif', 'police.gif', 'angel.gif');
$smileysdescs = array('', $txt['icon_cheesy'], $txt['icon_rolleyes'], $txt['icon_angry'], $txt['icon_laugh'], $txt['icon_smiley'], $txt['icon_wink'], $txt['icon_grin'], $txt['icon_sad'], $txt['icon_shocked'], $txt['icon_cool'], $txt['icon_tongue'], $txt['icon_huh'], $txt['icon_embarrassed'], $txt['icon_lips'], $txt['icon_kiss'], $txt['icon_cry'], $txt['icon_undecided'], '', '', '', $txt['icon_angel']);
}
else
{
// Load the smileys in reverse order by length so they don't get parsed wrong.
if (($temp = cache_get_data('parsing_smileys', 480)) == null)
{
$smileysfrom = array();
$smileysto = array();
$smileysdescs = array();
// @todo there is no reason $db should be used before this
$db = database();
$db->fetchQueryCallback('
SELECT code, filename, description
FROM {db_prefix}smileys
ORDER BY LENGTH(code) DESC',
array(
),
function($row) use (&$smileysfrom, &$smileysto, &$smileysdescs)
{
$smileysfrom[] = $row['code'];
$smileysto[] = htmlspecialchars($row['filename']);
$smileysdescs[] = $row['description'];
}
);
cache_put_data('parsing_smileys', array($smileysfrom, $smileysto, $smileysdescs), 480);
}
else
list ($smileysfrom, $smileysto, $smileysdescs) = $temp;
}
// The non-breaking-space is a complex thing...
$non_breaking_space = '\x{A0}';
// This smiley regex makes sure it doesn't parse smileys within code tags (so [url=mailto:David@bla.com] doesn't parse the :D smiley)
$smileyPregReplacements = array();
$searchParts = array();
$smileys_path = htmlspecialchars($modSettings['smileys_url'] . '/' . $user_info['smiley_set'] . '/');
for ($i = 0, $n = count($smileysfrom); $i < $n; $i++)
{
$specialChars = htmlspecialchars($smileysfrom[$i], ENT_QUOTES);
$smileyCode = '<img src="' . $smileys_path . $smileysto[$i] . '" alt="' . strtr($specialChars, array(':' => '&#58;', '(' => '&#40;', ')' => '&#41;', '$' => '&#36;', '[' => '&#091;')). '" title="' . strtr(htmlspecialchars($smileysdescs[$i]), array(':' => '&#58;', '(' => '&#40;', ')' => '&#41;', '$' => '&#36;', '[' => '&#091;')) . '" class="smiley" />';
$smileyPregReplacements[$smileysfrom[$i]] = $smileyCode;
$searchParts[] = preg_quote($smileysfrom[$i], '~');
if ($smileysfrom[$i] != $specialChars)
{
$smileyPregReplacements[$specialChars] = $smileyCode;
$searchParts[] = preg_quote($specialChars, '~');
}
}
$smileyPregSearch = '~(?<=[>:\?\.\s' . $non_breaking_space . '[\]()*\\\;]|^)(' . implode('|', $searchParts) . ')(?=[^[:alpha:]0-9]|$)~';
//$smileyPregSearch = '~\n(?<=[>:\?\.\s' . $non_breaking_space . '[\]()*\\\;]|^)(' . implode('|', $searchParts) . ')(?=[^[:alpha:]0-9]|$)\n~';
}
// Replace away!
$message = preg_replace_callback($smileyPregSearch, function ($matches) use ($smileyPregReplacements)
{
return $smileyPregReplacements[$matches[0]];
}, $message);
}
/**
* Calculates all the possible permutations (orders) of an array.
*
* What it does:
* - should not be called on arrays bigger than 10 elements as this function is memory hungry
* - returns an array containing each permutation.
* - e.g. (1,2,3) returns (1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), and (3,2,1)
* - really a permutations without repetition (N!) function, so 3! = 6 and 10! = 3,628,800 orderings
* - Used by parse_bbc to allow bbc tag parameters to be in any order and still be
* parsed properly
*
* @param mixed[] $array index array of values
* @return mixed[] array representing all permutations of the supplied array
*/
function permute($array)
{
	$orders = array($array);
	$n = count($array);
	$p = range(0, $n);

	// $i is advanced inside the loop body, so the for header has no increment step.
	for ($i = 1; $i < $n; null)
	{
		$p[$i]--;
		$j = $i % 2 != 0 ? $p[$i] : 0;

		// Swap elements $i and $j to produce the next ordering.
		$temp = $array[$i];
		$array[$i] = $array[$j];
		$array[$j] = $temp;

		for ($i = 1; $p[$i] == 0; $i++)
			$p[$i] = 1;

		$orders[] = $array;
	}

	return $orders;
}
function pc_next_permutation($p, $size)
{
// If there is only 1, then there can only be 1 permutation... duh.
if ($size < 1)
{
return false;
}
// slide down the array looking for where we're smaller than the next guy
for ($i = $size - 1; isset($p[$i]) && $p[$i] >= $p[$i + 1]; --$i);
// if this doesn't occur, we've finished our permutations
// the array is reversed: (1, 2, 3, 4) => (4, 3, 2, 1)
if ($i < 0)
{
return false;
}
// slide down the array looking for a bigger number than what we found before
for ($j = $size; $p[$j] <= $p[$i]; --$j);
// swap them
$tmp = $p[$i];
$p[$i] = $p[$j];
$p[$j] = $tmp;
// now reverse the elements in between by swapping the ends
for (++$i, $j = $size; $i < $j; ++$i, --$j)
{
$tmp = $p[$i];
$p[$i] = $p[$j];
$p[$j] = $tmp;
}
return $p;
}
// This is just a mock so we don't break anything
function call_integration_hook($hook, $parameters = array())
{
return;
}
function cache_put_data($key, $value, $ttl = 120)
{
return;
}
function cache_get_data($key, $ttl = 120)
{
return;
}
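
The docblock for permute() says the permutations let tag parameters appear in any order. A minimal sketch of that idea follows, using a small recursive stand-in (`allOrders`, a hypothetical name) rather than permute() itself, and two made-up parameter patterns.

```php
<?php
// Hypothetical stand-in for permute(): returns every ordering of a list.
function allOrders(array $items)
{
	if (count($items) <= 1)
	{
		return array($items);
	}

	$orders = array();
	foreach ($items as $i => $item)
	{
		// Take one item out and permute the remainder.
		$rest = $items;
		unset($rest[$i]);
		foreach (allOrders(array_values($rest)) as $tail)
		{
			$orders[] = array_merge(array($item), $tail);
		}
	}

	return $orders;
}

// One regex fragment per parameter; these patterns are illustrative only.
$params = array('width=(\d+)', 'height=(\d+)');

$alternatives = array();
foreach (allOrders($params) as $order)
{
	$alternatives[] = implode('\s+', $order);
}

// A single expression that accepts the parameters in either order.
$regex = '~^(?:' . implode('|', $alternatives) . ')$~';
```

The number of alternatives grows as N!, which is why the docblock warns against calling permute() on large arrays.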
<?php
$total_old_time = 0;
$total_new_time = 0;
$stack = array();
$stack_max_len = 5;
$stack_len = 0;
foreach ($results as $i => $result)
{
if (!is_array($result))
{
continue;
}
$total_old_time += $result['old']['total_time'];
$total_new_time += $result['new']['total_time'];
if (defined('SAVE_TOP_RESULTS') && SAVE_TOP_RESULTS)
{
if (count($stack) < $stack_max_len + 1)
{
$stack_len++;
$stack[$i] = $result['time_diff_perc'];
}
else
{
foreach ($stack as $k => $v)
{
if ($v < $result['time_diff_perc'])
{
unset($stack[$k]);
$stack[$i] = $result['time_diff_perc'];
asort($stack);
break;
}
}
}
}
}
if (defined('SAVE_TOP_RESULTS') && SAVE_TOP_RESULTS)
{
asort($stack);
file_put_contents('top_time_diff_perc.csv', implode(',', array_keys($stack)) . "\n", FILE_APPEND);
}
?>
<div>
Messages: <?= $results['num_messages'] ?><br>
Iterations: <?= $results['iterations'] ?><br>
Total Time In Tests: <?= round($total_old_time + $total_new_time, 2) ?><br>
Total Old Time: <?= round($total_old_time, 2) ?><br>
Total New Time: <?= round($total_new_time, 2) ?><br>
Diff Total Time: <?= round(max($total_old_time, $total_new_time) - min($total_old_time, $total_new_time), 2) ?><br>
Diff Total Time %: <?= round((max($total_old_time, $total_new_time) - min($total_old_time, $total_new_time)) / max($total_old_time, $total_new_time) * 100, 2) ?><br>
</div>
<table class="table table-striped table-bordered table-condensed" data-page-length="1000">
<!--<colgroup>
<col class="col-md-1">
<col class="col-md-3">
<col class="col-md-4">
<col class="col-md-4">
</colgroup>-->
<thead>
<tr>
<th>Test</th>
<th>Order</th>
<th>Pass</th>
<th>Old Time</th>
<th>New Time</th>
<th>Time Diff</th>
<th>Time Diff %</th>
<th>Old Mem</th>
<th>New Mem</th>
<th>Mem Diff</th>
<th>Old Peak Mem</th>
<th>New Peak Mem</th>
<th>Mem Peak Diff</th>
</tr>
</thead>
<tbody>
<?php
foreach ($results as $test => $result)
{
if (!is_array($result))
{
continue;
}
?>
<tr>
<td><?= $test ?></td>
<td><?= $result['order'] ?></td>
<?php
if (isset($result['pass']))
{
echo '<td class="', $result['pass'] ? 'success' : 'danger', '">', $result['pass'] ? 'pass' : 'fail', '</td>';
}
else
{
echo '<td></td>';
}
?>
<td class="<?= $result['time_winner'] === 'old' ? 'success' : ''?>">
<?= $result['old']['total_time'] ?>
</td>
<td class="<?= $result['time_winner'] === 'new' ? 'success' : ''?>">
<?= $result['new']['total_time'] ?>
</td>
<td><?= $result['time_diff'] ?></td>
<td><?= round(($result['time_diff'] / max($result['new']['total_time'], $result['old']['total_time'])) * 100, 2) ?></td>
<td class="<?= $result['mem_winner'] === 'old' ? 'success' : ''?>">
<?= $result['old']['memory_usage'] ?>
</td>
<td class="<?= $result['mem_winner'] === 'new' ? 'success' : ''?>">
<?= $result['new']['memory_usage'] ?>
</td>
<td><?= $result['mem_diff'] ?></td>
<td class="<?= $result['peak_mem_winner'] === 'old' ? 'success' : ''?>"><?= $result['old']['memory_peak_after'] ?></td>
<td class="<?= $result['peak_mem_winner'] === 'new' ? 'success' : ''?>"><?= $result['new']['memory_peak_after'] ?></td>
<td><?= $result['peak_mem_diff'] ?></td>
</tr>
<?php
}
?>
</tbody>
</table>
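
The "Diff Total Time %" figure in the summary is meant to express the gap between the two totals relative to the slower one. A small sketch of that calculation, with a hypothetical helper name and an added guard for the zero case:

```php
<?php
// Hypothetical helper: percentage gap between two timings, relative to the slower one.
function timeDiffPercent($old_time, $new_time)
{
	$slower = max($old_time, $new_time);

	// Avoid dividing by zero when neither run took measurable time.
	if ($slower <= 0)
	{
		return 0.0;
	}

	// The subtraction needs its own parentheses: division binds tighter than minus.
	return round(($slower - min($old_time, $new_time)) / $slower * 100, 2);
}

$percent = timeDiffPercent(2.0, 1.5);
```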
<?php
namespace BBC;
// @todo add attribute for TEST_PARAM_STRING and TEST_CONTENT so people can test the content
// @todo change ATTR_TEST to be able to test the entire message with the current offset
class Codes
{
/** the tag's name - must be lowercase */
const ATTR_TAG = 1;
/** One of self::TYPE_* */
const ATTR_TYPE = 2;
/**
* An optional array of parameters, for the form
* [tag abc=123]content[/tag]. The array is an associative array
* where the keys are the parameter names, and the values are an
* array which *may* contain any of self::PARAM_ATTR_*
*/
const ATTR_PARAM = 3;
/**
* A regular expression to test immediately after the tag's
* '=', ' ' or ']'. Typically, should have a \] at the end.
* Optional.
*/
const ATTR_TEST = 4;
/**
* Only available for unparsed_content, closed, unparsed_commas_content, and unparsed_equals_content.
* $1 is replaced with the content of the tag.
* Parameters are replaced in the form {param}.
* For unparsed_commas_content, $2, $3, ..., $n are replaced.
*/
const ATTR_CONTENT = 5;
/**
* Only when content is not used, to go before any content.
* For unparsed_equals, $1 is replaced with the value.
* For unparsed_commas, $1, $2, ..., $n are replaced.
*/
const ATTR_BEFORE = 6;
/**
* Similar to before in every way, except that it is used when the tag is closed.
*/
const ATTR_AFTER = 7;
/**
* Used in place of content when the tag is disabled.
* For closed, default is '', otherwise it is '$1' if block_level is false, '<div>$1</div>' otherwise.
*/
const ATTR_DISABLED_CONTENT = 8;
/**
* Used in place of before when disabled.
* Defaults to '<div>' if block_level, '' if not.
*/
const ATTR_DISABLED_BEFORE = 9;
/**
* Used in place of after when disabled.
* Defaults to '</div>' if block_level, '' if not.
*/
const ATTR_DISABLED_AFTER = 10;
/**
* Set to true if the tag is a "block level" tag, similar to HTML.
* Block level tags cannot be nested inside tags that are not block level, and will not be implicitly closed as easily.
* One break following a block level tag may also be removed.
*/
const ATTR_BLOCK_LEVEL = 11;
/**
* Trim the whitespace after the opening tag or the closing tag or both.
* One of self::TRIM_*
* Optional
*/
const ATTR_TRIM = 12;
/**
* Except when type is missing or 'closed', a callback to validate the data as $data.
* Depending on the tag's type, $data may be a string or an array of strings (corresponding to the replacement.)
*/
const ATTR_VALIDATE = 13;
/**
* When type is unparsed_equals or parsed_equals only, may be not set,
* 'optional', or 'required' corresponding to if the content may be quoted.
* This allows the parser to read [tag="abc]def[esdf]"] properly.
*/
const ATTR_QUOTED = 14;
/**
* An array of tag names, or not set.
* If set, the enclosing tag *must* be one of the listed tags, or parsing won't occur.
*/
const ATTR_REQUIRE_PARENTS = 15;
/**
* Similar to require_parents: if set, children won't be parsed unless they are in the list.
*/
const ATTR_REQUIRE_CHILDREN = 16;
/**
* Similar to, but very different from, require_parents.
* If it is set, the tag will not be parsed inside any of the listed tags.
*/
const ATTR_DISALLOW_PARENTS = 17;
/**
* Similar to, but very different from, require_children.
* If it is set the listed tags will not be parsed inside the tag.
*/
const ATTR_DISALLOW_CHILDREN = 18;
/**
* When ATTR_DISALLOW_* is used, this gets put before the tag.
*/
const ATTR_DISALLOW_BEFORE = 19;
/**
* When ATTR_DISALLOW_* is used, this gets put after the tag.
*/
const ATTR_DISALLOW_AFTER = 20;
/**
* an array restricting what BBC can be in the parsed_equals parameter, if desired.
*/
const ATTR_PARSED_TAGS_ALLOWED = 21;
/**
* (bool) Turn URIs like http://www.google.com into links
*/
const ATTR_AUTOLINK = 22;
/**
* The length of the tag
*/
const ATTR_LENGTH = 23;
/**
* Whether the tag is disabled
*/
const ATTR_DISABLED = 24;
/** [tag]parsed content[/tag] */
const TYPE_PARSED_CONTENT = 0;
/** [tag=xyz]parsed content[/tag] */
const TYPE_UNPARSED_EQUALS = 1;
/** [tag=parsed data]parsed content[/tag] */
const TYPE_PARSED_EQUALS = 2;
/** [tag]unparsed content[/tag] */
const TYPE_UNPARSED_CONTENT = 3;
/** [tag], [tag/], [tag /] */
const TYPE_CLOSED = 4;
/** [tag=1,2,3]parsed content[/tag] */
const TYPE_UNPARSED_COMMAS = 5;
/** [tag=1,2,3]unparsed content[/tag] */
const TYPE_UNPARSED_COMMAS_CONTENT = 6;
/** [tag=...]unparsed content[/tag] */
const TYPE_UNPARSED_EQUALS_CONTENT = 7;
/** [*] */
const TYPE_ITEMCODE = 8;
/** a regular expression to validate and match the value. */
const PARAM_ATTR_MATCH = 0;
/** true if the value should be quoted. */
const PARAM_ATTR_QUOTED = 1;
/** callback to evaluate on the data, which is $data. */
const PARAM_ATTR_VALIDATE = 2;
/** a string in which to replace $1 with the data. Either it or validate may be used, not both. */
const PARAM_ATTR_VALUE = 3;
/** true if the parameter is optional. */
const PARAM_ATTR_OPTIONAL = 4;
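/** Do not trim any whitespace. */
const TRIM_NONE = 0;
/** Trim the whitespace after the opening tag. */
const TRIM_INSIDE = 1;
/** Trim the whitespace after the closing tag. */
const TRIM_OUTSIDE = 2;
/** Trim the whitespace after both the opening and the closing tag. */
const TRIM_BOTH = 3;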
const OPTIONAL = -1;
/**
* An array of self::ATTR_*
* ATTR_TAG and ATTR_TYPE are required for every tag.
* The rest of the attributes depend on the type and other options.
*/
protected $bbc;
protected $itemcodes;
protected $additional_bbc;
protected $disabled;
	public function __construct(array $tags = array(), array $disabled = array())
	{
		// Set the disabled list first: getDefault() calls isDisabled(), which presumably reads it.
		$this->disabled = $disabled;
		$this->additional_bbc = $tags;
		$this->bbc = $this->getDefault();

		foreach ($disabled as $tag)
		{
			$this->removeTag($tag);
		}

		foreach ($tags as $tag)
		{
			$this->addTag($tag);
		}
	}
	public function addTag(array $tag)
	{
		// Validate and normalize the definition, then store it with the other tags.
		$this->checkNewTag($tag);
		$this->bbc[] = $tag;
	}
	protected function checkNewTag(array &$tag)
	{
		if (!isset($tag[self::ATTR_TAG]) || !is_string($tag[self::ATTR_TAG]))
		{
			throw new \InvalidArgumentException('BBC must have a tag name');
		}

		$tag[self::ATTR_TAG] = trim($tag[self::ATTR_TAG]);
		if ($tag[self::ATTR_TAG] == '')
		{
			throw new \InvalidArgumentException('BBC must have a tag name');
		}

		// TYPE_PARSED_CONTENT is 0, so test with isset() rather than empty().
		// A missing type is treated as parsed content.
		if (!isset($tag[self::ATTR_TYPE]))
		{
			$tag[self::ATTR_TYPE] = self::TYPE_PARSED_CONTENT;
		}

		if (!is_int($tag[self::ATTR_TYPE]) || $tag[self::ATTR_TYPE] < self::TYPE_PARSED_CONTENT || $tag[self::ATTR_TYPE] > self::TYPE_ITEMCODE)
		{
			throw new \InvalidArgumentException('Invalid type for tag: ' . $tag[self::ATTR_TYPE]);
		}

		if (isset($tag[self::ATTR_PARAM]))
		{
			foreach ($tag[self::ATTR_PARAM] as &$parameter)
			{
				$parameter[self::PARAM_ATTR_QUOTED] = !empty($parameter[self::PARAM_ATTR_QUOTED]);
				$parameter[self::PARAM_ATTR_OPTIONAL] = !empty($parameter[self::PARAM_ATTR_OPTIONAL]);

				if (isset($parameter[self::PARAM_ATTR_VALIDATE]) && isset($parameter[self::PARAM_ATTR_VALUE]))
				{
					throw new \InvalidArgumentException('Parameters may only use value or validate, not both');
				}
			}
			// Break the reference left over from the loop.
			unset($parameter);
		}

		if (!isset($tag[self::ATTR_LENGTH]))
		{
			$tag[self::ATTR_LENGTH] = strlen($tag[self::ATTR_TAG]);
		}

		$tag[self::ATTR_AUTOLINK] = !empty($tag[self::ATTR_AUTOLINK]);
		$tag[self::ATTR_BLOCK_LEVEL] = !empty($tag[self::ATTR_BLOCK_LEVEL]);
	}
public function removeTag($tag)
{
foreach ($this->bbc as $k => $v)
{
if ($tag === $v[self::ATTR_TAG])
{
unset($this->bbc[$k]);
}
}
/*
array_filter takes 50% more time
return;
$this->bbc = array_filter($this->bbc, function ($ele) use ($tag) {
return $ele[self::ATTR_TAG] !== $tag;
});*/
}
public function getDefault()
{
global $modSettings, $txt, $scripturl;
return array(
array(
self::ATTR_TAG => 'abbr',
self::ATTR_TYPE => self::TYPE_UNPARSED_EQUALS,
self::ATTR_BEFORE => '<abbr title="$1">',
self::ATTR_AFTER => '</abbr>',
self::ATTR_QUOTED => self::OPTIONAL,
self::ATTR_DISABLED_AFTER => ' ($1)',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 4,
),
array(
self::ATTR_TAG => 'anchor',
self::ATTR_TYPE => self::TYPE_UNPARSED_EQUALS,
// USES CLOSING BRACKET
//self::ATTR_TEST => '[#]?([A-Za-z][A-Za-z0-9_\-]*)\]',
self::ATTR_TEST => '[#]?([A-Za-z][A-Za-z0-9_\-]*)',
self::ATTR_BEFORE => '<span id="post_$1">',
self::ATTR_AFTER => '</span>',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 6,
),
array(
self::ATTR_TAG => 'b',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<strong class="bbc_strong">',
self::ATTR_AFTER => '</strong>',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 1,
),
array(
self::ATTR_TAG => 'br',
self::ATTR_TYPE => self::TYPE_CLOSED,
self::ATTR_CONTENT => '<br />',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => false,
self::ATTR_LENGTH => 2,
),
array(
self::ATTR_TAG => 'center',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<div class="centertext">',
self::ATTR_AFTER => '</div>',
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 6,
),
array(
self::ATTR_TAG => 'code',
self::ATTR_TYPE => self::TYPE_UNPARSED_CONTENT,
self::ATTR_CONTENT => '<div class="codeheader">' . $txt['code'] . ': <a href="javascript:void(0);" onclick="return elkSelectText(this);" class="codeoperation">' . $txt['code_select'] . '</a></div><pre class="bbc_code prettyprint">$1</pre>',
self::ATTR_VALIDATE => $this->isDisabled('code') ? null : function(&$tag, &$data, $disabled) {
$data = str_replace("\t", "<span class=\"tab\">\t</span>", $data);
},
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_AUTOLINK => false,
self::ATTR_LENGTH => 4,
),
array(
self::ATTR_TAG => 'code',
self::ATTR_TYPE => self::TYPE_UNPARSED_EQUALS_CONTENT,
self::ATTR_CONTENT => '<div class="codeheader">' . $txt['code'] . ': ($2) <a href="#" onclick="return elkSelectText(this);" class="codeoperation">' . $txt['code_select'] . '</a></div><pre class="bbc_code prettyprint">$1</pre>',
self::ATTR_VALIDATE => $this->isDisabled('code') ? null : function(&$tag, &$data, $disabled) {
$data[0] = str_replace("\t", "<span class=\"tab\">\t</span>", $data[0]);
},
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_AUTOLINK => false,
self::ATTR_LENGTH => 4,
),
array(
self::ATTR_TAG => 'color',
self::ATTR_TYPE => self::TYPE_UNPARSED_EQUALS,
// USES CLOSING BRACKET
//self::ATTR_TEST => '(#[\da-fA-F]{3}|#[\da-fA-F]{6}|[A-Za-z]{1,20}|rgb\((?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\s?,\s?){2}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\))\]',
self::ATTR_TEST => '(#[\da-fA-F]{3}|#[\da-fA-F]{6}|[A-Za-z]{1,20}|rgb\((?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\s?,\s?){2}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\))',
self::ATTR_BEFORE => '<span style="color: $1;" class="bbc_color">',
self::ATTR_AFTER => '</span>',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 5,
),
array(
self::ATTR_TAG => 'email',
self::ATTR_TYPE => self::TYPE_UNPARSED_CONTENT,
self::ATTR_CONTENT => '<a href="mailto:$1" class="bbc_email">$1</a>',
self::ATTR_VALIDATE => function(&$tag, &$data, $disabled) {
$data = strtr($data, array('<br />' => ''));
},
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => false,
self::ATTR_LENGTH => 5,
),
array(
self::ATTR_TAG => 'email',
self::ATTR_TYPE => self::TYPE_UNPARSED_EQUALS,
self::ATTR_BEFORE => '<a href="mailto:$1" class="bbc_email">',
self::ATTR_AFTER => '</a>',
//self::ATTR_DISALLOW_CHILDREN => array('email', 'ftp', 'url', 'iurl'),
self::ATTR_DISALLOW_CHILDREN => array('email' => 'email', 'url' => 'url', 'iurl' => 'iurl'),
self::ATTR_DISABLED_AFTER => ' ($1)',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => false,
self::ATTR_LENGTH => 5,
),
array(
self::ATTR_TAG => 'footnote',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<sup class="bbc_footnotes">%fn%',
self::ATTR_AFTER => '%fn%</sup>',
//self::ATTR_DISALLOW_PARENTS => array('footnote', 'code', 'anchor', 'url', 'iurl'),
self::ATTR_DISALLOW_PARENTS => array('footnote' => 'footnote', 'code' => 'code', 'anchor' => 'anchor', 'url' => 'url', 'iurl' => 'iurl'),
self::ATTR_DISALLOW_BEFORE => '',
self::ATTR_DISALLOW_AFTER => '',
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 8,
),
array(
self::ATTR_TAG => 'font',
self::ATTR_TYPE => self::TYPE_UNPARSED_EQUALS,
// USES CLOSING BRACKET
//self::ATTR_TEST => '[A-Za-z0-9_,\-\s]+?\]',
self::ATTR_TEST => '[A-Za-z0-9_,\-\s]+?',
self::ATTR_BEFORE => '<span style="font-family: $1;" class="bbc_font">',
self::ATTR_AFTER => '</span>',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 4,
),
/* array(
self::ATTR_TAG => 'ftp',
self::ATTR_TYPE => self::TYPE_UNPARSED_CONTENT,
self::ATTR_CONTENT => '<a href="$1" class="bbc_ftp new_win" target="_blank">$1</a>',
self::ATTR_VALIDATE => function(&$tag, &$data, $disabled) {
$data = strtr($data, array('<br />' => ''));
if (strpos($data, 'ftp://') !== 0 && strpos($data, 'ftps://') !== 0)
{
$data = 'ftp://' . $data;
}
},
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => false,
self::ATTR_LENGTH => 3,
),
array(
self::ATTR_TAG => 'ftp',
self::ATTR_TYPE => self::TYPE_UNPARSED_EQUALS,
self::ATTR_BEFORE => '<a href="$1" class="bbc_ftp new_win" target="_blank">',
self::ATTR_AFTER => '</a>',
self::ATTR_VALIDATE => function(&$tag, &$data, $disabled) {
if (strpos($data, 'ftp://') !== 0 && strpos($data, 'ftps://') !== 0)
{
$data = 'ftp://' . $data;
}
},
self::ATTR_DISALLOW_CHILDREN => array('email', 'ftp', 'url', 'iurl'),
self::ATTR_DISABLED_AFTER => ' ($1)',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => false,
self::ATTR_LENGTH => 3,
),
*/ array(
self::ATTR_TAG => 'hr',
self::ATTR_TYPE => self::TYPE_CLOSED,
self::ATTR_CONTENT => '<hr />',
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_AUTOLINK => false,
self::ATTR_LENGTH => 2,
),
array(
self::ATTR_TAG => 'i',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<em>',
self::ATTR_AFTER => '</em>',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 1,
),
array(
self::ATTR_TAG => 'img',
self::ATTR_TYPE => self::TYPE_UNPARSED_CONTENT,
self::ATTR_PARAM => array(
'alt' => array(self::PARAM_ATTR_OPTIONAL => true),
'width' => array(
self::PARAM_ATTR_OPTIONAL => true,
self::PARAM_ATTR_VALUE => 'width:100%;max-width:$1px;',
self::PARAM_ATTR_MATCH => '(\d+)'
),
'height' => array(
self::PARAM_ATTR_OPTIONAL => true,
self::PARAM_ATTR_VALUE => 'max-height:$1px;',
self::PARAM_ATTR_MATCH => '(\d+)'
),
),
self::ATTR_CONTENT => '<img src="$1" alt="{alt}" style="{width}{height}" class="bbc_img resized" />',
self::ATTR_VALIDATE => function(&$tag, &$data, $disabled) {
$data = strtr($data, array('<br />' => ''));
if (strpos($data, 'http://') !== 0 && strpos($data, 'https://') !== 0)
{
$data = 'http://' . $data;
}
},
self::ATTR_DISABLED_CONTENT => '($1)',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => false,
self::ATTR_LENGTH => 3,
),
array(
self::ATTR_TAG => 'img',
self::ATTR_TYPE => self::TYPE_UNPARSED_CONTENT,
self::ATTR_CONTENT => '<img src="$1" alt="" class="bbc_img" />',
self::ATTR_VALIDATE => function(&$tag, &$data, $disabled) {
$data = strtr($data, array('<br />' => ''));
if (strpos($data, 'http://') !== 0 && strpos($data, 'https://') !== 0)
{
$data = 'http://' . $data;
}
},
self::ATTR_DISABLED_CONTENT => '($1)',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => false,
self::ATTR_LENGTH => 3,
),
array(
self::ATTR_TAG => 'iurl',
self::ATTR_TYPE => self::TYPE_UNPARSED_CONTENT,
self::ATTR_CONTENT => '<a href="$1" class="bbc_link">$1</a>',
self::ATTR_VALIDATE => function(&$tag, &$data, $disabled) {
$data = strtr($data, array('<br />' => ''));
if (strpos($data, 'http://') !== 0 && strpos($data, 'https://') !== 0)
{
$data = 'http://' . $data;
}
},
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => false,
self::ATTR_LENGTH => 4,
),
array(
self::ATTR_TAG => 'iurl',
self::ATTR_TYPE => self::TYPE_UNPARSED_EQUALS,
self::ATTR_BEFORE => '<a href="$1" class="bbc_link">',
self::ATTR_AFTER => '</a>',
self::ATTR_VALIDATE => function(&$tag, &$data, $disabled) {
if ($data[0] === '#')
{
$data = '#post_' . substr($data, 1);
}
elseif (strpos($data, 'http://') !== 0 && strpos($data, 'https://') !== 0)
{
$data = 'http://' . $data;
}
},
//self::ATTR_DISALLOW_CHILDREN => array('email', 'ftp', 'url', 'iurl'),
self::ATTR_DISALLOW_CHILDREN => array('email' => 'email', 'url' => 'url', 'iurl' => 'iurl'),
self::ATTR_DISABLED_AFTER => ' ($1)',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => false,
self::ATTR_LENGTH => 4,
),
array(
self::ATTR_TAG => 'left',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<div style="text-align: left;">',
self::ATTR_AFTER => '</div>',
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 4,
),
array(
self::ATTR_TAG => 'li',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<li>',
self::ATTR_AFTER => '</li>',
self::ATTR_TRIM => self::TRIM_OUTSIDE,
self::ATTR_REQUIRE_PARENTS => array('list'),
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_DISABLED_BEFORE => '',
self::ATTR_DISABLED_AFTER => '<br />',
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 2,
),
array(
self::ATTR_TAG => 'list',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<ul class="bbc_list">',
self::ATTR_AFTER => '</ul>',
self::ATTR_TRIM => self::TRIM_INSIDE,
self::ATTR_REQUIRE_CHILDREN => array('li', 'list'),
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 4,
),
array(
self::ATTR_TAG => 'list',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_PARAM => array(
'type' => array(self::PARAM_ATTR_MATCH => '(none|disc|circle|square|decimal|decimal-leading-zero|lower-roman|upper-roman|lower-alpha|upper-alpha|lower-greek|lower-latin|upper-latin|hebrew|armenian|georgian|cjk-ideographic|hiragana|katakana|hiragana-iroha|katakana-iroha)'),
),
self::ATTR_BEFORE => '<ul class="bbc_list" style="list-style-type: {type};">',
self::ATTR_AFTER => '</ul>',
self::ATTR_TRIM => self::TRIM_INSIDE,
self::ATTR_REQUIRE_CHILDREN => array('li'),
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 4,
),
array(
self::ATTR_TAG => 'me',
self::ATTR_TYPE => self::TYPE_UNPARSED_EQUALS,
self::ATTR_BEFORE => '<div class="meaction">&nbsp;$1 ',
self::ATTR_AFTER => '</div>',
self::ATTR_QUOTED => 'optional',
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_DISABLED_BEFORE => '/me ',
self::ATTR_DISABLED_AFTER => '<br />',
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 2,
),
array(
self::ATTR_TAG => 'member',
self::ATTR_TYPE => self::TYPE_UNPARSED_EQUALS,
// USES CLOSING BRACKET
self::ATTR_TEST => '[\d*]',
self::ATTR_BEFORE => '<span class="bbc_mention"><a href="' . $scripturl . '?action=profile;u=$1">@',
self::ATTR_AFTER => '</a></span>',
self::ATTR_DISABLED_BEFORE => '@',
self::ATTR_DISABLED_AFTER => '',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 6,
),
array(
self::ATTR_TAG => 'nobbc',
self::ATTR_TYPE => self::TYPE_UNPARSED_CONTENT,
self::ATTR_CONTENT => '$1',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 5,
),
array(
self::ATTR_TAG => 'pre',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<pre class="bbc_pre">',
self::ATTR_AFTER => '</pre>',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 3,
),
array(
self::ATTR_TAG => 'quote',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<div class="quoteheader">' . $txt['quote'] . '</div><blockquote>',
self::ATTR_AFTER => '</blockquote>',
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 5,
),
array(
self::ATTR_TAG => 'quote',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_PARAM => array(
'author' => array(
self::PARAM_ATTR_MATCH => '(.{1,192}?)',
self::PARAM_ATTR_QUOTED => true
),
),
self::ATTR_BEFORE => '<div class="quoteheader">' . $txt['quote_from'] . ': {author}</div><blockquote>',
self::ATTR_AFTER => '</blockquote>',
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 5,
),
array(
self::ATTR_TAG => 'quote',
self::ATTR_TYPE => self::TYPE_PARSED_EQUALS,
self::ATTR_BEFORE => '<div class="quoteheader">' . $txt['quote_from'] . ': $1</div><blockquote>',
self::ATTR_AFTER => '</blockquote>',
self::ATTR_QUOTED => 'optional',
// Don't allow everything to be embedded with the author name.
// The [ftp] tag has been removed, so it is no longer listed here.
self::ATTR_PARSED_TAGS_ALLOWED => array('url', 'iurl'),
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 5,
),
array(
self::ATTR_TAG => 'quote',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_PARAM => array(
'author' => array(self::PARAM_ATTR_MATCH => '([^<>]{1,192}?)'),
'link' => array(self::PARAM_ATTR_MATCH => '(?:board=\d+;)?((?:topic|threadid)=[\dmsg#\./]{1,40}(?:;start=[\dmsg#\./]{1,40})?|msg=\d{1,40}|action=profile;u=\d+)'),
'date' => array(self::PARAM_ATTR_MATCH => '(\d+)', self::ATTR_VALIDATE => 'htmlTime'),
),
self::ATTR_BEFORE => '<div class="quoteheader"><a href="' . $scripturl . '?{link}">' . $txt['quote_from'] . ': {author} ' . ($modSettings['todayMod'] == 3 ? ' - ' : $txt['search_on']) . ' {date}</a></div><blockquote>',
self::ATTR_AFTER => '</blockquote>',
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 5,
),
array(
self::ATTR_TAG => 'quote',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_PARAM => array(
'author' => array(self::PARAM_ATTR_MATCH => '(.{1,192}?)'),
),
self::ATTR_BEFORE => '<div class="quoteheader">' . $txt['quote_from'] . ': {author}</div><blockquote>',
self::ATTR_AFTER => '</blockquote>',
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 5,
),
array(
self::ATTR_TAG => 'right',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<div style="text-align: right;">',
self::ATTR_AFTER => '</div>',
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 5,
),
array(
self::ATTR_TAG => 's',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<del>',
self::ATTR_AFTER => '</del>',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 1,
),
array(
self::ATTR_TAG => 'size',
self::ATTR_TYPE => self::TYPE_UNPARSED_EQUALS,
// USES CLOSING BRACKET
//self::ATTR_TEST => '([1-9][\d]?p[xt]|small(?:er)?|large[r]?|x[x]?-(?:small|large)|medium|(0\.[1-9]|[1-9](\.[\d][\d]?)?)?em)\]',
self::ATTR_TEST => '([1-9][\d]?p[xt]|small(?:er)?|large[r]?|x[x]?-(?:small|large)|medium|(0\.[1-9]|[1-9](\.[\d][\d]?)?)?em)',
self::ATTR_BEFORE => '<span style="font-size: $1;" class="bbc_size">',
self::ATTR_AFTER => '</span>',
self::ATTR_DISALLOW_PARENTS => array('size' => 'size'),
self::ATTR_DISALLOW_BEFORE => '<span>',
self::ATTR_DISALLOW_AFTER => '</span>',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 4,
),
array(
self::ATTR_TAG => 'size',
self::ATTR_TYPE => self::TYPE_UNPARSED_EQUALS,
// USES CLOSING BRACKET
//self::ATTR_TEST => '[1-7]\]',
//self::ATTR_TEST => '[1-7]',
self::ATTR_TEST => '[1-7]{1}$',
self::ATTR_BEFORE => '<span style="font-size: $1;" class="bbc_size">',
self::ATTR_AFTER => '</span>',
self::ATTR_VALIDATE => function(&$tag, &$data, $disabled) {
$sizes = array(1 => 0.7, 2 => 1.0, 3 => 1.35, 4 => 1.45, 5 => 2.0, 6 => 2.65, 7 => 3.95);
$data = $sizes[(int) $data] . 'em';
},
self::ATTR_DISALLOW_PARENTS => array('size' => 'size'),
self::ATTR_DISALLOW_BEFORE => '<span>',
self::ATTR_DISALLOW_AFTER => '</span>',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 4,
),
array(
self::ATTR_TAG => 'spoiler',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<span class="spoilerheader">' . $txt['spoiler'] . '</span><div class="spoiler"><div class="bbc_spoiler" style="display: none;">',
self::ATTR_AFTER => '</div></div>',
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 7,
),
array(
self::ATTR_TAG => 'sub',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<sub>',
self::ATTR_AFTER => '</sub>',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 3,
),
array(
self::ATTR_TAG => 'sup',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<sup>',
self::ATTR_AFTER => '</sup>',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 3,
),
array(
self::ATTR_TAG => 'table',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<div class="bbc_table_container"><table class="bbc_table">',
self::ATTR_AFTER => '</table></div>',
self::ATTR_TRIM => self::TRIM_INSIDE,
self::ATTR_REQUIRE_CHILDREN => array('tr'),
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 5,
),
array(
self::ATTR_TAG => 'td',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<td>',
self::ATTR_AFTER => '</td>',
self::ATTR_REQUIRE_PARENTS => array('tr'),
self::ATTR_TRIM => self::TRIM_OUTSIDE,
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_DISABLED_BEFORE => '',
self::ATTR_DISABLED_AFTER => '',
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 2,
),
array(
self::ATTR_TAG => 'th',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<th>',
self::ATTR_AFTER => '</th>',
self::ATTR_REQUIRE_PARENTS => array('tr'),
self::ATTR_TRIM => self::TRIM_OUTSIDE,
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_DISABLED_BEFORE => '',
self::ATTR_DISABLED_AFTER => '',
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 2,
),
array(
self::ATTR_TAG => 'tr',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<tr>',
self::ATTR_AFTER => '</tr>',
self::ATTR_REQUIRE_PARENTS => array('table'),
self::ATTR_REQUIRE_CHILDREN => array('td', 'th'),
self::ATTR_TRIM => self::TRIM_BOTH,
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_DISABLED_BEFORE => '',
self::ATTR_DISABLED_AFTER => '',
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 2,
),
array(
self::ATTR_TAG => 'tt',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<span class="bbc_tt">',
self::ATTR_AFTER => '</span>',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 2,
),
array(
self::ATTR_TAG => 'u',
self::ATTR_TYPE => self::TYPE_PARSED_CONTENT,
self::ATTR_BEFORE => '<span class="bbc_u">',
self::ATTR_AFTER => '</span>',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => true,
self::ATTR_LENGTH => 1,
),
array(
self::ATTR_TAG => 'url',
self::ATTR_TYPE => self::TYPE_UNPARSED_CONTENT,
self::ATTR_CONTENT => '<a href="$1" class="bbc_link" target="_blank">$1</a>',
self::ATTR_VALIDATE => function(&$tag, &$data, $disabled) {
$data = strtr($data, array('<br />' => ''));
if (strpos($data, 'http://') !== 0 && strpos($data, 'https://') !== 0)
{
$data = 'http://' . $data;
}
},
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => false,
self::ATTR_LENGTH => 3,
),
array(
self::ATTR_TAG => 'url',
self::ATTR_TYPE => self::TYPE_UNPARSED_EQUALS,
self::ATTR_BEFORE => '<a href="$1" class="bbc_link" target="_blank">',
self::ATTR_AFTER => '</a>',
self::ATTR_VALIDATE => function(&$tag, &$data, $disabled) {
if (strpos($data, 'http://') !== 0 && strpos($data, 'https://') !== 0)
{
$data = 'http://' . $data;
}
},
//self::ATTR_DISALLOW_CHILDREN => array('email', 'ftp', 'url', 'iurl'),
self::ATTR_DISALLOW_CHILDREN => array('email' => 'email', 'url' => 'url', 'iurl' => 'iurl'),
self::ATTR_DISABLED_AFTER => ' ($1)',
self::ATTR_BLOCK_LEVEL => false,
self::ATTR_AUTOLINK => false,
self::ATTR_LENGTH => 3,
),
);
}
public function getItemCodes()
{
$item_codes = array(
'*' => 'disc',
'@' => 'disc',
'+' => 'square',
'x' => 'square',
'#' => 'decimal',
'0' => 'decimal',
'o' => 'circle',
'O' => 'circle',
);
//call_integration_hook('integrate_item_codes', array(&$item_codes));
return $item_codes;
}
public function getCodes()
{
return $this->bbc;
}
public function getCodesGroupedByTag()
{
$bbc = array();
foreach ($this->bbc as $code)
{
if (!isset($bbc[$code[self::ATTR_TAG]]))
{
$bbc[$code[self::ATTR_TAG]] = array();
}
$bbc[$code[self::ATTR_TAG]][] = $code;
}
return $bbc;
}
public function getTags()
{
$tags = array();
foreach ($this->bbc as $tag)
{
$tags[$tag[self::ATTR_TAG]] = $tag[self::ATTR_TAG];
}
return $tags;
}
// @todo Apart from the item codes (which could be controlled with a $with_itemcodes argument),
// this grouped form should be the standard one and be stored that way.
// Alternatively, strip the item codes when they are not needed.
public function getForParsing()
{
$bbc = $this->bbc;
//call_integration_hook('bbc_codes_parsing', array(&$bbc, &$itemcodes));
if (!$this->isDisabled('li') && !$this->isDisabled('list'))
{
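foreach ($this->getItemCodes() as $c => $dummy)
{
// PHP turns numeric string keys such as '0' into integers, so cast back to a string
$c = (string) $c;
// Skip anything "bad"
if (trim($c) === '')
{
continue;
}
// Append instead of keying by the item code: a key of 0 would overwrite the first tag in $bbc
$bbc[] = $this->getItemCodeTag($c);
}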
}
$return = array();
// Find the first letter of the tag faster
foreach ($bbc as $code)
{
$return[$code[self::ATTR_TAG][0]][] = $code;
}
return $return;
}
public function newGetBBC()
{
$bbc = array();
foreach ($this->bbc as $code)
{
// Group by the first character of the tag, then by the tag itself
$tag = $code[self::ATTR_TAG];
$char = $tag[0];
if (!isset($bbc[$char]))
{
$bbc[$char] = array();
}
if (!isset($bbc[$char][$tag]))
{
$bbc[$char][$tag] = array();
}
$bbc[$char][$tag][] = $code;
}
return $bbc;
}
protected function getItemCodeTag($code)
{
return array(
self::ATTR_TAG => $code,
self::ATTR_TYPE => self::TYPE_ITEMCODE,
self::ATTR_BLOCK_LEVEL => true,
self::ATTR_LENGTH => 1,
'itemcode' => true,
);
}
public function setForPrinting()
{
// Colors can't well be displayed... supposed to be black and white.
$this->disable('color');
$this->disable('me');
// Links are useless on paper... just show the link.
$this->disable('url');
$this->disable('iurl');
$this->disable('email');
// @todo Change maybe?
if (!isset($_GET['images']))
{
$this->disable('img');
}
// @todo Interface/setting to add more?
// call_integration_hook();
return $this;
}
public function isDisabled($tag)
{
return isset($this->disabled[$tag]);
}
public function getDisabled()
{
return $this->disabled;
}
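public function disable($tag, $disable = true)
{
// Nothing to do if the tag is already in the requested state.
if (isset($this->disabled[$tag]) === (bool) $disable)
{
return true;
}
foreach ($this->bbc as &$bbc)
{
if ($bbc[self::ATTR_TAG] === $tag)
{
$bbc[self::ATTR_DISABLED] = $disable;
}
}
// Break the reference left over from the loop
unset($bbc);
// Only track the tag as disabled when it actually is; otherwise re-enable it
if ($disable)
{
$this->disabled[$tag] = $tag;
}
else
{
unset($this->disabled[$tag]);
}
return true;
}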
}
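// --- Illustrative sketch, not part of the original class ---
// getForParsing() buckets the tag definitions by the first character of the tag name,
// so a parser only needs to look at the candidates that can match the character after '['.
// The helper below shows that idea on plain arrays. The function name and the literal
// 'tag' key are assumptions made for this example only.
function sketchGroupByFirstChar(array $codes)
{
$grouped = array();
foreach ($codes as $code)
{
// Cast so that numeric tags such as '0' are handled like any other tag
$tag = (string) $code['tag'];
if ($tag === '')
{
continue;
}
$grouped[$tag[0]][] = $code;
}
return $grouped;
}
// Usage: 'b' and 'br' share a bucket, 'url' gets its own.
// $grouped = sketchGroupByFirstChar(array(array('tag' => 'b'), array('tag' => 'br'), array('tag' => 'url')));
// array_keys($grouped) is then array('b', 'u')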
<?php
namespace BBC;
// Sanitize inputs
$type = isset($_GET['type']) ? $_GET['type'] : false;
if (isset($_GET['msg']))
{
if (is_array($_GET['msg']))
{
$msgs = array();
foreach ($_GET['msg'] as $msg)
{
$msgs[] = (int) $msg;
}
$msgs = array_unique($msgs);
}
else
{
$msgs = (int) $_GET['msg'];
}
}
else
{
$msgs = null;
}
$input = array(
'type' => array(
'test' => $type === 'test' ? ' selected="selected"' : '',
'bench' => $type === 'bench' ? ' selected="selected"' : '',
'codes' => $type === 'codes' ? ' selected="selected"' : '',
),
// Cast to int and clamp to 0..10000 so the value is safe to use and to echo back
'iterations' => isset($_GET['iterations']) ? min(max((int) $_GET['iterations'], 0), 10000) : 0,
'debug' => isset($_GET['debug']) && $_GET['debug'] ? 'checked="checked"' : '',
'fatal' => isset($_GET['fatal']) && $_GET['fatal'] ? 'checked="checked"' : '',
'msg' => $msgs,
);
// Setup those constants for the test file
define('ITERATIONS', $input['iterations']);
define('DEBUG', !empty($input['debug']));
define('FAILED_TEST_IS_FATAL', !empty($input['fatal']));
define('SAVE_TOP_RESULTS', true);
// Include the test file
require_once 'Tester.php';
// Run the test (based on type)
$test_types = array(
'test' => 'tests',
'bench' => 'benchmark',
'codes' => 'codes',
);
if (isset($test_types[$type]))
{
define('TEST_TYPE', $type);
$results = call_user_func('\BBC\\' . $test_types[$type], $input);
}
?><!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>BBC Parser Test</title>
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap-theme.min.css">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.3/jquery.min.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/js/bootstrap.min.js"></script>
<script src="//cdn.datatables.net/1.10.8/js/jquery.dataTables.min.js"></script>
<!--
<link rel="stylesheet" href="//cdnjs.cloudflare.com/ajax/libs/highlight.js/8.7/styles/default.min.css">
<script src="//cdnjs.cloudflare.com/ajax/libs/highlight.js/8.7/highlight.min.js"></script>
<script>hljs.initHighlightingOnLoad();</script>
-->
<style>
.code {
height: auto;
max-height: 10em;
overflow: auto !important;
word-break: normal !important;
word-wrap: normal !important;
width: 30em;
margin-bottom: .5em;
}
</style>
</head>
<body>
<div class="container-fluid">
<div id="top">
<button type="button" class="btn btn-primary btn-lg pull-right" data-toggle="modal" data-target="#controls">Controls</button>
<h1>BBC Parser Test</h1>
</div>
<?php
if (empty($results))
{
?><div>
No results to display. Click the "Controls" button to run tests.
<pre class="well"><?= htmlspecialchars('<html><body>something</body></html>'); ?></pre>
</div><?php
}
// RESULTS TO DISPLAY
else
{
if (isset($test_types[$type]))
{
require_once ucfirst($type) . 'Output.php';
}
} // RESULTS TO DISPLAY
?>
</div>
<div class="modal" id="controls" tabindex="-1" role="dialog" aria-labelledby="controlsLabel">
<form class="modal-dialog" role="document" method="get">
<div class="modal-content">
<div class="modal-header">
<button type="button" class="close" data-dismiss="modal" aria-label="Close"><span aria-hidden="true">&times;</span></button>
<h4 class="modal-title">Controls</h4>
</div>
<div class="modal-body">
<div class="formgroup">
<label for="debug">Enable debug()?</label>
<input name="debug" id="debug" type="checkbox" <?= $input['debug'] ?> class="form-control">
</div>
<div class="formgroup">
<label for="type">Type of test to run</label>
<select name="type" id="type" class="form-control">
<option value="test" <?= $input['type']['test'] ?>>Test</option>
<option value="bench" <?= $input['type']['bench'] ?>>Benchmark</option>
<option value="codes" <?= $input['type']['codes'] ?>>Codes</option>
</select>
</div>
<div class="formgroup">
<label for="fatal">End tests if one fails</label>
<input name="fatal" id="fatal" type="checkbox" <?= $input['fatal'] ?> class="form-control">
</div>
<div class="formgroup">
<label for="iterations">Number of iterations</label>
<input name="iterations" id="iterations" type="text" value="<?= (int) $input['iterations'] ?>" class="form-control">
</div>
</div>
<div class="modal-footer">
<button type="button" class="btn btn-default" data-dismiss="modal">Close</button>
<button type="submit" class="btn btn-primary">Save changes</button>
</div>
</div><!-- /.modal-content -->
</form><!-- /.modal-dialog -->
</div>
<script>
$(document).ready(function(){
$('table').DataTable();
});</script>
</body>
</html>
<?php
/**
* The test messages.
* Generally, they go from less to more complex
*/
return array(
// Nothing. It should just return
'',
// It shouldn't treat these as a bool
'false',
'0',
'array()',
' ',
// Simple message, no BBC
'hello world',
'foo bar',
"Breaker\nbreaker\n1\n9",
// Simple BBC
'[b]Bold[/b]',
'[i]Italics[/i]',
'[u]Underline[/u]',
'[s]Strike through[/s]',
'[b][i][u]Bold, italics, underline[/u][/i][/b]',
'Super[sup]script[/sup]',
'Sub[sub]script[/sub]',
'[sup]Super[/sup]-[sub]sub[/sub]-script',
// A longer message but without bbc
'This is a div with multiple classes and no ID. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec volutpat tellus vulputate dui venenatis quis euismod turpis pellentesque. Suspendisse [sit] amet ipsum eu odio sagittis ultrices at non sapien. Quisque viverra feugiat purus, eu mollis felis condimentum id. In luctus faucibus felis eget viverra. Vivamus et velit orci. In in tellus mauris, at fermentum diam. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Sed a magna nunc, vel tempor magna. Nam dictum, arcu in pretium varius, libero enim hendrerit nisl, et commodo enim sapien eu augue.
This is a div with multiple classes and no ID. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec volutpat tellus vulputate dui venenatis quis euismod turpis pellentesque. Suspendisse [sit] amet ipsum eu odio sagittis ultrices at non sapien. Quisque viverra feugiat purus, eu mollis felis condimentum id. In luctus faucibus felis eget viverra. Vivamus et velit orci. In in tellus mauris, at fermentum diam. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Sed a magna nunc, vel tempor magna. Nam dictum, arcu in pretium varius, libero enim hendrerit nisl, et commodo enim sapien eu augue.
This is a div with multiple classes and no ID. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec volutpat tellus vulputate dui venenatis quis euismod turpis pellentesque. Suspendisse [sit] amet ipsum eu odio sagittis ultrices at non sapien. Quisque viverra feugiat purus, eu mollis felis condimentum id. In luctus faucibus felis eget viverra. Vivamus et velit orci. In in tellus mauris, at fermentum diam. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Sed a magna nunc, vel tempor magna. Nam dictum, arcu in pretium varius, libero enim hendrerit nisl, et commodo enim sapien eu augue.
This is a div with multiple classes and no ID. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec volutpat tellus vulputate dui venenatis quis euismod turpis pellentesque. Suspendisse [sit] amet ipsum eu odio sagittis ultrices at non sapien. Quisque viverra feugiat purus, eu mollis felis condimentum id. In luctus faucibus felis eget viverra. Vivamus et velit orci. In in tellus mauris, at fermentum diam. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Sed a magna nunc, vel tempor magna. Nam dictum, arcu in pretium varius, libero enim hendrerit nisl, et commodo enim sapien eu augue.
This is a div with multiple classes and no ID. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec volutpat tellus vulputate dui venenatis quis euismod turpis pellentesque. Suspendisse [sit] amet ipsum eu odio sagittis ultrices at non sapien. Quisque viverra feugiat purus, eu mollis felis condimentum id. In luctus faucibus felis eget viverra. Vivamus et velit orci. In in tellus mauris, at fermentum diam. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Sed a magna nunc, vel tempor magna. Nam dictum, arcu in pretium varius, libero enim hendrerit nisl, et commodo enim sapien eu augue.',
// A message that might have bbc, but really doesn't
'This message doesn\'t actually have [ bbc',
'Neither does [] this one',
'Nor do[es] this one',
'Not ev[en] this on[/en] has bbc',
'This one is sneaky: [/] [ /] [ /] [ /]',
// Time for smilies
' :) ',
':)',
'Smile :)',
'Smil:)ey face',
'Now for all of the default: :) ;) :D ;D :( :( ::) >:( >:( :o 8) ??? :P :-[ :-X :-\ :-\ :-* :-* :\'( O:-) ',
'and the good old whatzup??? which should not show',
// Time to test BBC
'[b]This statement is bold[/b]',
'[url=http://www.google.com]Google[/url]. Basic unparsed equals',
'[nobbc][b]please do not parse this[/b][/nobbc]',
'[br][hr][br /][hr /]',
"[code][/code]\ne",
// Lists are probably the most complicated part of the parser
'[list][li]short list[/li][/list]',
'[list][li]short list[/li][li]to do[/li][li]growing[/li][/list]',
'[list][li]quick[li]no[li]time[li]for[li]closing[li]tags[/list]',
'[list][li]I[/li][li]feel[list][li]like[/list][li]Santa[/li][/list]',
'[list type=decimal][li]Lorem ipsum dolor sit amet, consectetuer adipiscing elit.[/li][li]Aliquam laoreet pulvinar sem. Aenean at odio.[/li][/list]',
// Tables
'[table][tr][td]remember[/td][td]frontpage?[/td][/tr][/table]',
'[table][tr][td]let me see[/td][td][table][tr][td]if[/td][td]I[/td][/tr][tr][td]can[/td][td]break[/td][/tr][tr][td]the[/td][td]internet[/td][/td][/tr][/table]',
'[table][tr][th][/th][/tr][tr][td][/td][/tr][tr][td][/td][/tr][/table]',
// Images
'[img width=500]http://www.google.com/intl/en_ALL/images/srpr/logo1w.png[/img]',
'[img height=50]http://www.google.com/intl/en_ALL/images/srpr/logo1w.png[/img]',
'[img width=43 alt="google" height=50]http://www.google.com/intl/en_ALL/images/srpr/logo1w.png[/img]',
// Quotes are actually a huge part of the parser
'[quote]If at first you do not succeed; call it version 1.0[/quote]',
'[quote=&quot;Edsger Dijkstra&quot;]If debugging is the process of removing software bugs, then programming must be the process of putting them in[/quote]',
'[quote author=Gates]Measuring programming progress by lines of code is like measuring aircraft building progress by weight.[/quote]',
'[quote]Some[quote]basic[/quote]nesting[/quote]',
'[quote][quote][quote][quote]Some[quote]basic[/quote]nesting[/quote]Still[/quote]not[/quote]deep[/quote]enough',
'[quote author=Mutt & Jeff link=topic=14764.msg87204#msg87204 date=1329175080]Lorem ipsum dolor sit amet, consectetur adipiscing elit.[/quote]',
'[quote link=topic=14764.msg87204#msg87204 author=Mutt & Jeff date=1329175080]I started a band called 999 Megabytes. We don&apos;t have a gig yet.[/quote]',
'[quote=Joe Doe joe@email.com]Here is what Joe said.[/quote]',
// Nested Quotes
'[quote]Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque non sapien a eros tincidunt accumsan. Ut nisl dui, dignissim at posuere quis, facilisis eget lectus. Morbi vitae massa eu metus pharetra rhoncus. Suspendisse potenti. Phasellus laoreet dapibus dapibus. Duis faucibus lacinia diam, nec pharetra est pharetra vitae. Etiam sodales, nulla et ullamcorper mattis, augue nunc sollicitudin risus, nec imperdiet est leo vitae est. Integer ultricies, metus at scelerisque interdum, sapien lorem mollis orci, vel mattis felis augue vitae nunc. Fusce eget sem sed orci interdum commodo sit amet et metus. In ultricies feugiat eleifend. Aliquam erat volutpat.
[quote author=ElkArte]Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque non sapien a eros tincidunt accumsan. Ut nisl dui, dignissim at posuere quis, facilisis eget lectus. Morbi vitae massa eu metus pharetra rhoncus. Suspendisse potenti. Phasellus laoreet dapibus dapibus. Duis faucibus lacinia diam, nec pharetra est pharetra vitae. Etiam sodales, nulla et ullamcorper mattis, augue nunc sollicitudin risus, nec imperdiet est leo vitae est. Integer ultricies, metus at scelerisque interdum, sapien lorem mollis orci, vel mattis felis augue vitae nunc. Fusce eget sem sed orci interdum commodo sit amet et metus. In ultricies feugiat eleifend. Aliquam erat volutpat.[/quote]
[quote link=topic=14764.msg87204#msg87204 date=1329175080 author=Mutt & Jeff]Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque non sapien a eros tincidunt accumsan. Ut nisl dui, dignissim at posuere quis, facilisis eget lectus. Morbi vitae massa eu metus pharetra rhoncus. Suspendisse potenti. Phasellus laoreet dapibus dapibus. Duis faucibus lacinia diam, nec pharetra est pharetra vitae. Etiam sodales, nulla et ullamcorper mattis, augue nunc sollicitudin risus, nec imperdiet est leo vitae est. Integer ultricies, metus at scelerisque interdum, sapien lorem mollis orci, vel mattis felis augue vitae nunc. Fusce eget sem sed orci interdum commodo sit amet et metus. In ultricies feugiat eleifend. Aliquam erat volutpat.
[quote author=Mutt & Jeff link=topic=14764.msg87204#msg87204 date=1329175080]Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque non sapien a eros tincidunt accumsan. Ut nisl dui, dignissim at posuere quis, facilisis eget lectus. Morbi vitae massa eu metus pharetra rhoncus. Suspendisse potenti. Phasellus laoreet dapibus dapibus. Duis faucibus lacinia diam, nec pharetra est pharetra vitae. Etiam sodales, nulla et ullamcorper mattis, augue nunc sollicitudin risus, nec imperdiet est leo vitae est. Integer ultricies, metus at scelerisque interdum, sapien lorem mollis orci, vel mattis felis augue vitae nunc. Fusce eget sem sed orci interdum commodo sit amet et metus. In ultricies feugiat eleifend. Aliquam erat volutpat.[/quote]
[/quote]
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
[/quote]',
// Item codes... suck.
"[*]one dot\n[*]two dots",
"[*]Ahoy!\n[*]Me[@]Matey\n[+]Shiver\n[x]Me\n[#]Timbers\n[!]\n[*]I[*]dunno[*]why",
// Autolinks (specifically avoiding FTP)
'http://www.google.com',
'https://google.com',
'http://google.de',
'www.google.com',
'me@email.com',
'http://www.cool.guy/linked?no&8)',
'http://www.facebook.com/profile.php?id=1439984468#!/group.php?gid=103300379708494&ref=ts',
'www.ñchan.org',
// FTP Autolinks
'[ftp]http://somewhere.com/[/ftp]',
// Autolinks inside of links:
'[url=http://www.google.com/]test www.elkarte.net test[/url]',
'[url=http://www.elkarte.org/community/index.php [^]]ask us for assistance[/url]',
// These shouldn't be autolinked.
'[url=https://www.google.com]http://www.google.com/404[/url]',
'[url=https://www.google.com]www.google.com[/url]',
'[url=https://www.google.com]you@mailed.it[/url]',
// URIs in no autolink areas
'[url=http://www.google.com]www.bing.com[/url]',
'[iurl=http://www.google.com]www.bing.com[/iurl]',
'[email=jack@theripper.com]www.bing.com[/email]',
'[url=http://www.google.com]iam@batman.net[/url]',
// Links inside links:
'[url=http://www.google.com/]this url has [email=someone@someplace.org]an email[/email][/url]',
'[url=http://www.yahoo.com]another URL[/url] in it![/url]',
// Colors
'[color=red]red[/color][color=green]green[/color][color=blue]blue[/color]',
'[color=red]Lorem ipsum dolor sit amet, consectetur adipiscing elit.[/color]',
'[color=blue]Volutpat tellus vulputate dui venenatis quis euismod turpis pellentesque.[/color]',
'[color=#f66]Suspendisse sit amet ipsum eu odio sagittis ultrices at non sapien.[/color]',
'[color=#ff0088]Quisque viverra feugiat purus, in luctus faucibus felis eget viverra.[/color]',
'[color=#cccccc]Suspendisse sit amet ipsum eu odio sagittis ultrices at non sapien.[/color]',
'[color=DarkSlateBlue]this is colored![/color]',
// Fonts
'[size=4]Font Family[/size]',
'[font=Arial]Lorem ipsum dolor sit amet, consectetur adipiscing elit.[/font]',
'[font=Tahoma]Suspendisse sit amet ipsum eu odio sagittis ultrices at non sapien.[/font]',
'[font=Monospace]Quisque viverra feugiat purus, in luctus faucibus felis eget viverra.[/font]',
'[font=Times]Suspendisse sit amet ipsum eu odio sagittis ultrices at non sapien.[/font]',
// Bad BBC
'[i]lets go for italics',
'[u][i]Why do you do this to yourself?[/u][/i]',
'[u][quote]should not get underlined[/quote][/u]',
'[img src=www.here.com/index.php?action=dlattach] this is actually a security issue',
'[quote this=should not=work but=maybe it=will]only a test will tell[/quote]',
// Footnotes
'Footnote[footnote]Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec volutpat tellus vulputate dui venenatis quis euismod turpis pellentesque. Suspendisse sit amet ipsum eu odio sagittis ultrices at non sapien. Quisque viverra feugiat purus, eu mollis felis condimentum id. In luctus faucibus felis eget viverra. Vivamus et velit orci. In in tellus mauris, at fermentum diam. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Sed a magna nunc, vel tempor magna. Nam dictum, arcu in pretium varius, libero enim hendrerit nisl, et commodo enim sapien eu augue. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Suspendisse potenti. Proin tempor porta porttitor. Nullam a malesuada arcu.[/footnote]',
// Spoilers
'[spoiler]Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec volutpat tellus vulputate dui venenatis quis euismod turpis pellentesque. Suspendisse sit amet ipsum eu odio sagittis ultrices at non sapien. Quisque viverra feugiat purus, eu mollis felis condimentum id. In luctus faucibus felis eget viverra. Vivamus et velit orci. In in tellus mauris, at fermentum diam. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Sed a magna nunc, vel tempor magna. Nam dictum, arcu in pretium varius, libero enim hendrerit nisl, et commodo enim sapien eu augue. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Suspendisse potenti. Proin tempor porta porttitor. Nullam a malesuada arcu.[/spoiler]',
// Align
'[center]Center Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aliquam laoreet pulvinar sem. Aenean at odio.[/center]',
'[tt]Teletype Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Donec elit. Fusce eget enim. Nullam tellus felis, sodales nec, sodales ac, commodo eu, ante.[/tt]',
'[right]Right Curabitur tincidunt, lacus eget iaculis tincidunt, elit libero iaculis arcu, eleifend condimentum sem est quis dolor. Curabitur sed tellus. Donec id dolor.[/right]',
'[left]Left Curabitur tincidunt, lacus eget iaculis tincidunt, elit libero iaculis arcu, eleifend condimentum sem est quis dolor. Curabitur sed tellus. Donec id dolor.[/left]',
'[pre]Pre .. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Donec elit. Fusce eget enim. Nullam tellus felis, sodales nec, sodales ac, commodo eu, ante.[/pre]',
// Code
'[code]bee boop bee booo[/code]',
"Everyone\n[code]\ngets a line\n[/code]\nbreak",
"You\n[code=me]\nget 1\n[/code]and [code]\nyou get one[/code]",
'[code]I [b]am[/b] a robot [quote]bee boo bee boop[/quote][/code]',
"[code] this has tabs\n\n\n tab\n tab\n[/code]\neven\tsome\toutside\t THE code",
'[code=php]
<?php
/**
* This controller is the most important and probably most accessed of all.
* It controls topic display, with all related.
*/
class Display_Controller
{
/**
* Default action handler for this controller
*/
public function action_index()
{
// what to do... display things!
$this->action_display();
}?>
[/code]',
'[code][b]Bold[/b]
Italics
Underline
Strike through[/code]',
'[code]email@domain.com
:] :/ >[ :p >_>
:happy: :aw: :cool: :kiss: :meh: :mmf: :heart:
[/code]',
// Just to test unparsed_commas, but you need to add the definition to the parser
'[glow=red,2,50]glow[/glow]',
);
<?php
/**
* Parse bulletin board code in a string, as well as smileys optionally.
*
* What it does:
* - only parses bbc tags which are not disabled in disabledBBC.
* - handles basic HTML, if enablePostHTML is on.
* - caches the from/to replace regular expressions so as not to reload them every time a string is parsed.
* - only parses smileys if smileys is true.
* - does nothing if the enableBBC setting is off.
* - uses the cache_id as a unique identifier to facilitate any caching it may do.
* - returns the modified message.
*
* @param string|false $message if false return list of enabled bbc codes
* @param bool|string $smileys = true
* @param string $cache_id = ''
* @param string[]|null $parse_tags array of tags to parse, null for all
* @return string
*/
function parse_bbc($message, $smileys = true, $cache_id = '', $parse_tags = array())
{
global $txt, $scripturl, $context, $modSettings, $user_info;
static $bbc_codes = array(), $itemcodes = array(), $no_autolink_tags = array();
static $disabled, $default_disabled, $parse_tag_cache;
// Don't waste cycles
if ($message === '')
return '';
// Clean up any cut/paste issues we may have
$message = sanitizeMSCutPaste($message);
// If the load average is too high, don't parse the BBC.
if (!empty($modSettings['bbc']) && $modSettings['current_load'] >= $modSettings['bbc'])
{
$context['disabled_parse_bbc'] = true;
return $message;
}
if ($smileys !== null && ($smileys == '1' || $smileys == '0'))
$smileys = (bool) $smileys;
if (empty($modSettings['enableBBC']) && $message !== false)
{
if ($smileys === true)
parsesmileys($message);
return $message;
}
// Allow addons access before entering the main parse_bbc loop
call_integration_hook('integrate_pre_parsebbc', array(&$message, &$smileys, &$cache_id, &$parse_tags));
// Sift out the bbc for a performance improvement.
if (empty($bbc_codes) || $message === false)
{
if (!empty($modSettings['disabledBBC']))
{
$temp = explode(',', strtolower($modSettings['disabledBBC']));
foreach ($temp as $tag)
$disabled[trim($tag)] = true;
}
/* The following bbc are formatted as an array, with keys as follows:
tag: the tag's name - should be lowercase!
type: one of...
- (missing): [tag]parsed content[/tag]
- unparsed_equals: [tag=xyz]parsed content[/tag]
- parsed_equals: [tag=parsed data]parsed content[/tag]
- unparsed_content: [tag]unparsed content[/tag]
- closed: [tag], [tag/], [tag /]
- unparsed_commas: [tag=1,2,3]parsed content[/tag]
- unparsed_commas_content: [tag=1,2,3]unparsed content[/tag]
- unparsed_equals_content: [tag=...]unparsed content[/tag]
parameters: an optional array of parameters, for the form
[tag abc=123]content[/tag]. The array is an associative array
where the keys are the parameter names, and the values are an
array which may contain the following:
- match: a regular expression to validate and match the value.
- quoted: true if the value should be quoted.
- validate: callback to evaluate on the data, which is $data.
- value: a string in which to replace $1 with the data.
either it or validate may be used, not both.
- optional: true if the parameter is optional.
test: a regular expression to test immediately after the tag's
'=', ' ' or ']'. Typically, should have a \] at the end.
Optional.
content: only available for unparsed_content, closed,
unparsed_commas_content, and unparsed_equals_content.
$1 is replaced with the content of the tag. Parameters
are replaced in the form {param}. For unparsed_commas_content,
$2, $3, ..., $n are replaced.
before: only when content is not used, to go before any
content. For unparsed_equals, $1 is replaced with the value.
For unparsed_commas, $1, $2, ..., $n are replaced.
after: similar to before in every way, except that it is used
when the tag is closed.
disabled_content: used in place of content when the tag is
disabled. For closed, default is '', otherwise it is '$1' if
block_level is false, '<div>$1</div>' otherwise.
disabled_before: used in place of before when disabled. Defaults
to '<div>' if block_level, '' if not.
disabled_after: used in place of after when disabled. Defaults
to '</div>' if block_level, '' if not.
block_level: set to true if the tag is a "block level" tag, similar
to HTML. Block level tags cannot be nested inside tags that are
not block level, and will not be implicitly closed as easily.
One break following a block level tag may also be removed.
trim: if set to 'inside', whitespace after the begin tag will be
removed. If set to 'outside', whitespace after the end tag will
meet the same fate. If set to 'both', both are removed.
validate: except when type is missing or 'closed', a callback to
validate the data as $data. Depending on the tag's type, $data
may be a string or an array of strings (corresponding to the
replacement.)
quoted: when type is 'unparsed_equals' or 'parsed_equals' only,
may be not set, 'optional', or 'required', depending on whether
the value may be quoted. This allows the parser to read
[tag="abc]def[esdf]"] properly.
require_parents: an array of tag names, or not set. If set, the
enclosing tag *must* be one of the listed tags, or parsing won't
occur.
require_children: similar to require_parents, if set children
won't be parsed if they are not in the list.
disallow_children: similar to, but very different from,
require_children, if it is set the listed tags will not be
parsed inside the tag.
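disallow_parents: similar to, but very different from,
require_parents, if it is set the tag will not be parsed when
its enclosing tag is one of the listed tags, unless
disallow_before and disallow_after are both set, in which case
they are used in place of before and after.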
parsed_tags_allowed: an array restricting what BBC can be in the
parsed_equals parameter, if desired.
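Example: a minimal sketch of how the keys above combine. The
'highlight' tag is hypothetical, shown for illustration only, and is
not defined below. An 'unparsed_equals' tag using test, before, after
and disabled_after might be declared as:
array(
'tag' => 'highlight',
'type' => 'unparsed_equals',
'test' => '[A-Za-z]{1,20}\]',
'before' => '<span style="background-color: $1;">',
'after' => '</span>',
'disabled_after' => ' ($1)',
),
With such a definition, [highlight=yellow]text[/highlight] would be
expected to render as
<span style="background-color: yellow;">text</span>.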
*/
$codes = array(
// @todo Just to test unparsed_commas, but remove this atrocity!
array(
'tag' => 'glow',
'type' => 'unparsed_commas',
'test' => '[#0-9a-zA-Z\-]{3,12},([012]\d{1,2}|\d{1,2})(,[^]]+)?\]',
'before' => '<table style="border: 0; border-spacing: 0; padding: 0; display: inline; vertical-align: middle; font: inherit;"><tr><td style="filter: Glow(color=$1, strength=$2); font: inherit;">',
'after' => '</td></tr></table> ',
),
array(
'tag' => 'abbr',
'type' => 'unparsed_equals',
'before' => '<abbr title="$1">',
'after' => '</abbr>',
'quoted' => 'optional',
'disabled_after' => ' ($1)',
),
array(
'tag' => 'anchor',
'type' => 'unparsed_equals',
'test' => '[#]?([A-Za-z][A-Za-z0-9_\-]*)\]',
'before' => '<span id="post_$1">',
'after' => '</span>',
),
array(
'tag' => 'b',
'before' => '<strong class="bbc_strong">',
'after' => '</strong>',
),
array(
'tag' => 'br',
'type' => 'closed',
'content' => '<br />',
),
array(
'tag' => 'center',
'before' => '<div class="centertext">',
'after' => '</div>',
'block_level' => true,
),
array(
'tag' => 'code',
'type' => 'unparsed_content',
'content' => '<div class="codeheader">' . $txt['code'] . ': <a href="javascript:void(0);" onclick="return elkSelectText(this);" class="codeoperation">' . $txt['code_select'] . '</a></div><pre class="bbc_code prettyprint">$1</pre>',
'validate' => isset($disabled['code']) ? null : function(&$tag, &$data, $disabled) {
if (!isset($disabled['code']))
$data = str_replace("\t", "<span class=\"tab\">\t</span>", $data);
},
'block_level' => true,
),
array(
'tag' => 'code',
'type' => 'unparsed_equals_content',
'content' => '<div class="codeheader">' . $txt['code'] . ': ($2) <a href="#" onclick="return elkSelectText(this);" class="codeoperation">' . $txt['code_select'] . '</a></div><pre class="bbc_code prettyprint">$1</pre>',
'validate' => isset($disabled['code']) ? null : function(&$tag, &$data, $disabled) {
if (!isset($disabled['code']))
$data[0] = str_replace("\t", "<span class=\"tab\">\t</span>", $data[0]);
},
'block_level' => true,
),
array(
'tag' => 'color',
'type' => 'unparsed_equals',
'test' => '(#[\da-fA-F]{3}|#[\da-fA-F]{6}|[A-Za-z]{1,20}|rgb\((?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\s?,\s?){2}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\))\]',
'before' => '<span style="color: $1;" class="bbc_color">',
'after' => '</span>',
),
array(
'tag' => 'email',
'type' => 'unparsed_content',
'content' => '<a href="mailto:$1" class="bbc_email">$1</a>',
'validate' => function(&$tag, &$data, $disabled) {
$data = strtr($data, array('<br />' => ''));
},
),
array(
'tag' => 'email',
'type' => 'unparsed_equals',
'before' => '<a href="mailto:$1" class="bbc_email">',
'after' => '</a>',
'disallow_children' => array('email', 'ftp', 'url', 'iurl'),
'disabled_after' => ' ($1)',
),
array(
'tag' => 'footnote',
'before' => '<sup class="bbc_footnotes">%fn%',
'after' => '%fn%</sup>',
'disallow_parents' => array('footnote', 'code', 'anchor', 'url', 'iurl'),
'disallow_before' => '',
'disallow_after' => '',
'block_level' => true,
),
array(
'tag' => 'font',
'type' => 'unparsed_equals',
'test' => '[A-Za-z0-9_,\-\s]+?\]',
'before' => '<span style="font-family: $1;" class="bbc_font">',
'after' => '</span>',
),
array(
'tag' => 'ftp',
'type' => 'unparsed_content',
'content' => '<a href="$1" class="bbc_ftp new_win" target="_blank">$1</a>',
'validate' => function(&$tag, &$data, $disabled) {
$data = strtr($data, array('<br />' => ''));
if (strpos($data, 'ftp://') !== 0 && strpos($data, 'ftps://') !== 0)
$data = 'ftp://' . $data;
},
),
array(
'tag' => 'ftp',
'type' => 'unparsed_equals',
'before' => '<a href="$1" class="bbc_ftp new_win" target="_blank">',
'after' => '</a>',
'validate' => function(&$tag, &$data, $disabled) {
if (strpos($data, 'ftp://') !== 0 && strpos($data, 'ftps://') !== 0)
$data = 'ftp://' . $data;
},
'disallow_children' => array('email', 'ftp', 'url', 'iurl'),
'disabled_after' => ' ($1)',
),
array(
'tag' => 'hr',
'type' => 'closed',
'content' => '<hr />',
'block_level' => true,
),
array(
'tag' => 'i',
'before' => '<em>',
'after' => '</em>',
),
array(
'tag' => 'img',
'type' => 'unparsed_content',
'parameters' => array(
'alt' => array('optional' => true),
'width' => array('optional' => true, 'value' => 'width:100%;max-width:$1px;', 'match' => '(\d+)'),
'height' => array('optional' => true, 'value' => 'max-height:$1px;', 'match' => '(\d+)'),
),
'content' => '<img src="$1" alt="{alt}" style="{width}{height}" class="bbc_img resized" />',
'validate' => function(&$tag, &$data, $disabled) {
$data = strtr($data, array('<br />' => ''));
if (strpos($data, 'http://') !== 0 && strpos($data, 'https://') !== 0)
$data = 'http://' . $data;
},
'disabled_content' => '($1)',
),
array(
'tag' => 'img',
'type' => 'unparsed_content',
'content' => '<img src="$1" alt="" class="bbc_img" />',
'validate' => function(&$tag, &$data, $disabled) {
$data = strtr($data, array('<br />' => ''));
if (strpos($data, 'http://') !== 0 && strpos($data, 'https://') !== 0)
$data = 'http://' . $data;
},
'disabled_content' => '($1)',
),
array(
'tag' => 'iurl',
'type' => 'unparsed_content',
'content' => '<a href="$1" class="bbc_link">$1</a>',
'validate' => function(&$tag, &$data, $disabled) {
$data = strtr($data, array('<br />' => ''));
if (strpos($data, 'http://') !== 0 && strpos($data, 'https://') !== 0)
$data = 'http://' . $data;
},
),
array(
'tag' => 'iurl',
'type' => 'unparsed_equals',
'before' => '<a href="$1" class="bbc_link">',
'after' => '</a>',
'validate' => function(&$tag, &$data, $disabled) {
if ($data[0] === '#')
$data = '#post_' . substr($data, 1);
elseif (strpos($data, 'http://') !== 0 && strpos($data, 'https://') !== 0)
$data = 'http://' . $data;
},
'disallow_children' => array('email', 'ftp', 'url', 'iurl'),
'disabled_after' => ' ($1)',
),
array(
'tag' => 'left',
'before' => '<div style="text-align: left;">',
'after' => '</div>',
'block_level' => true,
),
array(
'tag' => 'li',
'before' => '<li>',
'after' => '</li>',
'trim' => 'outside',
'require_parents' => array('list'),
'block_level' => true,
'disabled_before' => '',
'disabled_after' => '<br />',
),
array(
'tag' => 'list',
'before' => '<ul class="bbc_list">',
'after' => '</ul>',
'trim' => 'inside',
'require_children' => array('li', 'list'),
'block_level' => true,
),
array(
'tag' => 'list',
'parameters' => array(
'type' => array('match' => '(none|disc|circle|square|decimal|decimal-leading-zero|lower-roman|upper-roman|lower-alpha|upper-alpha|lower-greek|lower-latin|upper-latin|hebrew|armenian|georgian|cjk-ideographic|hiragana|katakana|hiragana-iroha|katakana-iroha)'),
),
'before' => '<ul class="bbc_list" style="list-style-type: {type};">',
'after' => '</ul>',
'trim' => 'inside',
'require_children' => array('li'),
'block_level' => true,
),
array(
'tag' => 'me',
'type' => 'unparsed_equals',
'before' => '<div class="meaction">&nbsp;$1 ',
'after' => '</div>',
'quoted' => 'optional',
'block_level' => true,
'disabled_before' => '/me ',
'disabled_after' => '<br />',
),
array(
'tag' => 'member',
'type' => 'unparsed_equals',
'test' => '[\d*]',
'before' => '<span class="bbc_mention"><a href="' . $scripturl . '?action=profile;u=$1">@',
'after' => '</a></span>',
'disabled_before' => '@',
'disabled_after' => '',
),
array(
'tag' => 'nobbc',
'type' => 'unparsed_content',
'content' => '$1',
),
array(
'tag' => 'pre',
'before' => '<pre class="bbc_pre">',
'after' => '</pre>',
),
array(
'tag' => 'quote',
'before' => '<div class="quoteheader">' . $txt['quote'] . '</div><blockquote>',
'after' => '</blockquote>',
'block_level' => true,
),
array(
'tag' => 'quote',
'parameters' => array(
'author' => array('match' => '(.{1,192}?\]?)', 'quoted' => true),
),
'before' => '<div class="quoteheader">' . $txt['quote_from'] . ': {author}</div><blockquote>',
'after' => '</blockquote>',
'block_level' => true,
),
array(
'tag' => 'quote',
'type' => 'parsed_equals',
'before' => '<div class="quoteheader">' . $txt['quote_from'] . ': $1</div><blockquote>',
'after' => '</blockquote>',
'quoted' => 'optional',
// Don't allow everything to be embedded with the author name.
'parsed_tags_allowed' => array('url', 'iurl', 'ftp'),
'block_level' => true,
),
array(
'tag' => 'quote',
'parameters' => array(
'author' => array('match' => '([^<>]{1,192}?)'),
'link' => array('match' => '(?:board=\d+;)?((?:topic|threadid)=[\dmsg#\./]{1,40}(?:;start=[\dmsg#\./]{1,40})?|msg=\d{1,40}|action=profile;u=\d+)'),
'date' => array('match' => '(\d+)', 'validate' => 'htmlTime'),
),
'before' => '<div class="quoteheader"><a href="' . $scripturl . '?{link}">' . $txt['quote_from'] . ': {author} ' . ($modSettings['todayMod'] == 3 ? ' - ' : $txt['search_on']) . ' {date}</a></div><blockquote>',
'after' => '</blockquote>',
'block_level' => true,
),
array(
'tag' => 'quote',
'parameters' => array(
'author' => array('match' => '(.{1,192}?)'),
),
'before' => '<div class="quoteheader">' . $txt['quote_from'] . ': {author}</div><blockquote>',
'after' => '</blockquote>',
'block_level' => true,
),
array(
'tag' => 'right',
'before' => '<div style="text-align: right;">',
'after' => '</div>',
'block_level' => true,
),
array(
'tag' => 's',
'before' => '<del>',
'after' => '</del>',
),
array(
'tag' => 'size',
'type' => 'unparsed_equals',
'test' => '([1-9][\d]?p[xt]|small(?:er)?|large[r]?|x[x]?-(?:small|large)|medium|(0\.[1-9]|[1-9](\.[\d][\d]?)?)?em)\]',
'before' => '<span style="font-size: $1;" class="bbc_size">',
'after' => '</span>',
'disallow_parents' => array('size'),
'disallow_before' => '<span>',
'disallow_after' => '</span>',
),
array(
'tag' => 'size',
'type' => 'unparsed_equals',
'test' => '[1-7]\]',
'before' => '<span style="font-size: $1;" class="bbc_size">',
'after' => '</span>',
'validate' => function(&$tag, &$data, $disabled) {
$sizes = array(1 => 0.7, 2 => 1.0, 3 => 1.35, 4 => 1.45, 5 => 2.0, 6 => 2.65, 7 => 3.95);
$data = $sizes[$data] . 'em';
},
'disallow_parents' => array('size'),
'disallow_before' => '<span>',
'disallow_after' => '</span>',
),
array(
'tag' => 'spoiler',
'before' => '<span class="spoilerheader">' . $txt['spoiler'] . '</span><div class="spoiler"><div class="bbc_spoiler" style="display: none;">',
'after' => '</div></div>',
'block_level' => true,
),
array(
'tag' => 'sub',
'before' => '<sub>',
'after' => '</sub>',
),
array(
'tag' => 'sup',
'before' => '<sup>',
'after' => '</sup>',
),
array(
'tag' => 'table',
'before' => '<div class="bbc_table_container"><table class="bbc_table">',
'after' => '</table></div>',
'trim' => 'inside',
'require_children' => array('tr'),
'block_level' => true,
),
array(
'tag' => 'td',
'before' => '<td>',
'after' => '</td>',
'require_parents' => array('tr'),
'trim' => 'outside',
'block_level' => true,
'disabled_before' => '',
'disabled_after' => '',
),
array(
'tag' => 'th',
'before' => '<th>',
'after' => '</th>',
'require_parents' => array('tr'),
'trim' => 'outside',
'block_level' => true,
'disabled_before' => '',
'disabled_after' => '',
),
array(
'tag' => 'tr',
'before' => '<tr>',
'after' => '</tr>',
'require_parents' => array('table'),
'require_children' => array('td', 'th'),
'trim' => 'both',
'block_level' => true,
'disabled_before' => '',
'disabled_after' => '',
),
array(
'tag' => 'tt',
'before' => '<span class="bbc_tt">',
'after' => '</span>',
),
array(
'tag' => 'u',
'before' => '<span class="bbc_u">',
'after' => '</span>',
),
array(
'tag' => 'url',
'type' => 'unparsed_content',
'content' => '<a href="$1" class="bbc_link" target="_blank">$1</a>',
'validate' => function(&$tag, &$data, $disabled) {
$data = strtr($data, array('<br />' => ''));
if (strpos($data, 'http://') !== 0 && strpos($data, 'https://') !== 0)
$data = 'http://' . $data;
},
),
array(
'tag' => 'url',
'type' => 'unparsed_equals',
'before' => '<a href="$1" class="bbc_link" target="_blank">',
'after' => '</a>',
'validate' => function(&$tag, &$data, $disabled) {
if (strpos($data, 'http://') !== 0 && strpos($data, 'https://') !== 0)
$data = 'http://' . $data;
},
'disallow_children' => array('email', 'ftp', 'url', 'iurl'),
'disabled_after' => ' ($1)',
),
);
// Autolinking is not advisable inside these tags.
$no_autolink_tags = array(
'url',
'iurl',
'ftp',
'email',
);
// So the parser won't skip them.
$itemcodes = array(
'*' => 'disc',
'@' => 'disc',
'+' => 'square',
'x' => 'square',
'#' => 'decimal',
'0' => 'decimal',
'o' => 'circle',
'O' => 'circle',
);
// Let addons add new BBC without hassle.
call_integration_hook('integrate_bbc_codes', array(&$codes, &$no_autolink_tags, &$itemcodes));
// This is mainly for the bbc manager, so it's easy to add tags above. Custom BBC should be added above this line.
if ($message === false)
{
if (isset($temp_bbc))
$bbc_codes = $temp_bbc;
return $codes;
}
if (!isset($disabled['li']) && !isset($disabled['list']))
{
foreach ($itemcodes as $c => $dummy)
$bbc_codes[$c] = array();
}
foreach ($codes as $code)
$bbc_codes[substr($code['tag'], 0, 1)][] = $code;
}
// If we are not doing every enabled tag then create a cache for this parsing group.
if ($parse_tags !== array() && is_array($parse_tags))
{
$temp_bbc = $bbc_codes;
$tags_cache_id = implode(',', $parse_tags);
if (!isset($default_disabled))
$default_disabled = isset($disabled) ? $disabled : array();
// Already cached, use it, otherwise create it
if (isset($parse_tag_cache[$tags_cache_id]))
list ($bbc_codes, $disabled) = $parse_tag_cache[$tags_cache_id];
else
{
foreach ($bbc_codes as $key_bbc => $bbc)
{
foreach ($bbc as $key_code => $code)
{
if (!in_array($code['tag'], $parse_tags))
{
$disabled[$code['tag']] = true;
unset($bbc_codes[$key_bbc][$key_code]);
}
}
}
$parse_tag_cache[$tags_cache_id] = array($bbc_codes, $disabled);
}
}
elseif (isset($default_disabled))
$disabled = $default_disabled;
// Shall we take the time to cache this?
if ($cache_id !== '' && !empty($modSettings['cache_enable']) && (($modSettings['cache_enable'] >= 2 && isset($message[1000])) || isset($message[2400])) && empty($parse_tags))
{
// It's likely this will change if the message is modified.
$cache_key = 'parse:' . $cache_id . '-' . md5(md5($message) . '-' . $smileys . (empty($disabled) ? '' : implode(',', array_keys($disabled))) . serialize($context['browser']) . $txt['lang_locale'] . $user_info['time_offset'] . $user_info['time_format']);
if (($temp = cache_get_data($cache_key, 240)) !== null)
return $temp;
$cache_t = microtime(true);
}
if ($smileys === 'print')
{
// Colors can't be displayed well... print is supposed to be black and white.
$disabled['color'] = true;
$disabled['me'] = true;
// Links are useless on paper... just show the link.
$disabled['url'] = true;
$disabled['iurl'] = true;
$disabled['email'] = true;
// @todo Change maybe?
if (!isset($_GET['images']))
$disabled['img'] = true;
// @todo Interface/setting to add more?
}
$open_tags = array();
$message = strtr($message, array("\n" => '<br />'));
// The non-breaking-space looks a bit different each time.
$non_breaking_space = '\x{A0}';
$pos = -1;
while ($pos !== false)
{
$last_pos = isset($last_pos) ? max($pos, $last_pos) : $pos;
$pos = strpos($message, '[', $pos + 1);
// Failsafe.
if ($pos === false || $last_pos > $pos)
$pos = strlen($message) + 1;
// Can't have a one letter smiley, URL, or email! (sorry.)
if ($last_pos < $pos - 1)
{
// Make sure the $last_pos is not negative.
$last_pos = max($last_pos, 0);
// Pick a block of data to do some raw fixing on.
$data = substr($message, $last_pos, $pos - $last_pos);
// Take care of some HTML!
if (!empty($modSettings['enablePostHTML']) && strpos($data, '&lt;') !== false)
{
$data = preg_replace('~&lt;a\s+href=((?:&quot;)?)((?:https?://|ftps?://|mailto:)\S+?)\\1&gt;~i', '[url=$2]', $data);
$data = preg_replace('~&lt;/a&gt;~i', '[/url]', $data);
// <br /> should be empty.
$empty_tags = array('br', 'hr');
foreach ($empty_tags as $tag)
$data = str_replace(array('&lt;' . $tag . '&gt;', '&lt;' . $tag . '/&gt;', '&lt;' . $tag . ' /&gt;'), '[' . $tag . ' /]', $data);
// b, u, i, s, pre... basic tags.
$closable_tags = array('b', 'u', 'i', 's', 'em', 'ins', 'del', 'pre', 'blockquote');
foreach ($closable_tags as $tag)
{
$diff = substr_count($data, '&lt;' . $tag . '&gt;') - substr_count($data, '&lt;/' . $tag . '&gt;');
$data = strtr($data, array('&lt;' . $tag . '&gt;' => '<' . $tag . '>', '&lt;/' . $tag . '&gt;' => '</' . $tag . '>'));
if ($diff > 0)
$data = substr($data, 0, -1) . str_repeat('</' . $tag . '>', $diff) . substr($data, -1);
}
// Do <img ... /> - with security... action= -> action-.
preg_match_all('~&lt;img\s+src=((?:&quot;)?)((?:https?://|ftps?://)\S+?)\\1(?:\s+alt=(&quot;.*?&quot;|\S*?))?(?:\s?/)?&gt;~i', $data, $matches, PREG_PATTERN_ORDER);
if (!empty($matches[0]))
{
$replaces = array();
foreach ($matches[2] as $match => $imgtag)
{
$alt = empty($matches[3][$match]) ? '' : ' alt=' . preg_replace('~^&quot;|&quot;$~', '', $matches[3][$match]);
// Remove action= from the URL - no funny business, now.
if (preg_match('~action(=|%3d)(?!dlattach)~i', $imgtag) !== 0)
$imgtag = preg_replace('~action(?:=|%3d)(?!dlattach)~i', 'action-', $imgtag);
// Check if the image is larger than allowed.
// @todo - We should seriously look at deprecating some of this in favour of CSS resizing.
if (!empty($modSettings['max_image_width']) && !empty($modSettings['max_image_height']))
{
// For images, we'll want this.
require_once(SUBSDIR . '/Attachments.subs.php');
list ($width, $height) = url_image_size($imgtag);
if (!empty($modSettings['max_image_width']) && $width > $modSettings['max_image_width'])
{
$height = (int) (($modSettings['max_image_width'] * $height) / $width);
$width = $modSettings['max_image_width'];
}
if (!empty($modSettings['max_image_height']) && $height > $modSettings['max_image_height'])
{
$width = (int) (($modSettings['max_image_height'] * $width) / $height);
$height = $modSettings['max_image_height'];
}
// Set the new image tag.
$replaces[$matches[0][$match]] = '[img width=' . $width . ' height=' . $height . $alt . ']' . $imgtag . '[/img]';
}
else
$replaces[$matches[0][$match]] = '[img' . $alt . ']' . $imgtag . '[/img]';
}
$data = strtr($data, $replaces);
}
}
if (!empty($modSettings['autoLinkUrls']))
{
// Are we inside tags that should not be auto linked?
$no_autolink_area = false;
if (!empty($open_tags))
{
foreach ($open_tags as $open_tag)
if (in_array($open_tag['tag'], $no_autolink_tags))
$no_autolink_area = true;
}
// Don't go backwards.
// @todo Don't think this is the real solution....
$lastAutoPos = isset($lastAutoPos) ? $lastAutoPos : 0;
if ($pos < $lastAutoPos)
$no_autolink_area = true;
$lastAutoPos = $pos;
if (!$no_autolink_area)
{
// Parse any URLs.... have to get rid of the @ problems some things cause... stupid email addresses.
if (!isset($disabled['url']) && (strpos($data, '://') !== false || strpos($data, 'www.') !== false) && strpos($data, '[url') === false)
{
// Switch out quotes really quick because they can cause problems.
$data = strtr($data, array('&#039;' => '\'', '&nbsp;' => "\xC2\xA0", '&quot;' => '>">', '"' => '<"<', '&lt;' => '<lt<'));
// Only do this if the preg survives.
if (is_string($result = preg_replace(array(
'~(?<=[\s>\.(;\'"]|^)((?:http|https)://[\w\-_%@:|]+(?:\.[\w\-_%]+)*(?::\d+)?(?:/[\p{L}\p{N}\-_\~%\.@!,\?&;=#(){}+:\'\\\\]*)*[/\p{L}\p{N}\-_\~%@\?;=#}\\\\])~ui',
'~(?<=[\s>\.(;\'"]|^)((?:ftp|ftps)://[\w\-_%@:|]+(?:\.[\w\-_%]+)*(?::\d+)?(?:/[\w\-_\~%\.@,\?&;=#(){}+:\'\\\\]*)*[/\w\-_\~%@\?;=#}\\\\])~i',
'~(?<=[\s>(\'<]|^)(www(?:\.[\w\-_]+)+(?::\d+)?(?:/[\p{L}\p{N}\-_\~%\.@!,\?&;=#(){}+:\'\\\\]*)*[/\p{L}\p{N}\-_\~%@\?;=#}\\\\])~ui'
), array(
'[url]$1[/url]',
'[ftp]$1[/ftp]',
'[url=http://$1]$1[/url]'
), $data)))
$data = $result;
$data = strtr($data, array('\'' => '&#039;', "\xC2\xA0" => '&nbsp;', '>">' => '&quot;', '<"<' => '"', '<lt<' => '&lt;'));
}
// Next, emails...
if (!isset($disabled['email']) && strpos($data, '@') !== false && strpos($data, '[email') === false)
{
$data = preg_replace('~(?<=[\?\s' . $non_breaking_space . '\[\]()*\\\;>]|^)([\w\-\.]{1,80}@[\w\-]+\.[\w\-\.]+[\w\-])(?=[?,\s' . $non_breaking_space . '\[\]()*\\\]|$|<br />|&nbsp;|&gt;|&lt;|&quot;|&#039;|\.(?:\.|;|&nbsp;|\s|$|<br />))~u', '[email]$1[/email]', $data);
$data = preg_replace('~(?<=<br />)([\w\-\.]{1,80}@[\w\-]+\.[\w\-\.]+[\w\-])(?=[?\.,;\s' . $non_breaking_space . '\[\]()*\\\]|$|<br />|&nbsp;|&gt;|&lt;|&quot;|&#039;)~u', '[email]$1[/email]', $data);
}
}
}
$data = strtr($data, array("\t" => '&nbsp;&nbsp;&nbsp;'));
// If it wasn't changed, no copying or other boring stuff has to happen!
if ($data !== substr($message, $last_pos, $pos - $last_pos))
{
$message = substr($message, 0, $last_pos) . $data . substr($message, $pos);
// Since we changed it, look again in case we added or removed a tag. But we don't want to skip any.
$old_pos = strlen($data) + $last_pos;
$pos = strpos($message, '[', $last_pos);
$pos = $pos === false ? $old_pos : min($pos, $old_pos);
}
}
// Are we there yet? Are we there yet?
if ($pos >= strlen($message) - 1)
break;
$tags = strtolower($message[$pos + 1]);
if ($tags === '/' && !empty($open_tags))
{
$pos2 = strpos($message, ']', $pos + 1);
if ($pos2 === $pos + 2)
continue;
$look_for = strtolower(substr($message, $pos + 2, $pos2 - $pos - 2));
$to_close = array();
$block_level = null;
do
{
$tag = array_pop($open_tags);
if (!$tag)
break;
if (!empty($tag['block_level']))
{
// Only find out if we need to.
if ($block_level === false)
{
array_push($open_tags, $tag);
break;
}
// The idea is, if we are LOOKING for a block level tag, we can close them on the way.
if (strlen($look_for) > 0 && isset($bbc_codes[$look_for[0]]))
{
foreach ($bbc_codes[$look_for[0]] as $temp)
if ($temp['tag'] === $look_for)
{
$block_level = !empty($temp['block_level']);
break;
}
}
if ($block_level !== true)
{
$block_level = false;
array_push($open_tags, $tag);
break;
}
}
$to_close[] = $tag;
}
while ($tag['tag'] !== $look_for);
// Did we just eat through everything and not find it?
if ((empty($open_tags) && (empty($tag) || $tag['tag'] !== $look_for)))
{
$open_tags = $to_close;
continue;
}
elseif (!empty($to_close) && $tag['tag'] !== $look_for)
{
if ($block_level === null && isset($look_for[0], $bbc_codes[$look_for[0]]))
{
foreach ($bbc_codes[$look_for[0]] as $temp)
if ($temp['tag'] === $look_for)
{
$block_level = !empty($temp['block_level']);
break;
}
}
// We're not looking for a block level tag (or maybe even a tag that exists...)
if (!$block_level)
{
foreach ($to_close as $tag)
array_push($open_tags, $tag);
continue;
}
}
foreach ($to_close as $tag)
{
$message = substr_replace($message, "\n" . $tag['after'] . "\n", $pos, $pos2 + 1 - $pos);
$pos += strlen($tag['after']) + 2;
$pos2 = $pos - 1;
// See the comment at the end of the big loop - just eating whitespace ;).
if (!empty($tag['block_level']) && substr($message, $pos, 6) === '<br />')
$message = substr_replace($message, '', $pos, 6);
if (!empty($tag['trim']) && $tag['trim'] !== 'inside' && preg_match('~(<br />|&nbsp;|\s)*~', substr($message, $pos), $matches) !== 0)
$message = substr($message, 0, $pos) . substr($message, $pos + strlen($matches[0]));
}
if (!empty($to_close))
{
$to_close = array();
$pos--;
}
continue;
}
// No tags for this character, so just keep going (fastest possible course.)
if (!isset($bbc_codes[$tags]))
{
continue;
}
$inside = empty($open_tags) ? null : $open_tags[count($open_tags) - 1];
$tag = null;
foreach ($bbc_codes[$tags] as $possible)
{
$pt_strlen = strlen($possible['tag']);
// Not a match?
if (substr_compare($message, $possible['tag'], $pos + 1, $pt_strlen, true) !== 0)
continue;
$next_c = isset($message[$pos + 1 + $pt_strlen]) ? $message[$pos + 1 + $pt_strlen] : '';
// A test validation?
if (isset($possible['test']) && preg_match('~^' . $possible['test'] . '~', substr($message, $pos + 1 + $pt_strlen + 1)) === 0)
continue;
// Do we want parameters?
elseif (!empty($possible['parameters']))
{
if ($next_c !== ' ')
continue;
}
elseif (isset($possible['type']))
{
// Do we need an equal sign?
if ($next_c !== '=' && in_array($possible['type'], array('unparsed_equals', 'unparsed_commas', 'unparsed_commas_content', 'unparsed_equals_content', 'parsed_equals')))
continue;
// Maybe we just want a /...
if ($possible['type'] === 'closed' && $next_c !== ']' && substr($message, $pos + 1 + $pt_strlen, 2) !== '/]' && substr($message, $pos + 1 + $pt_strlen, 3) !== ' /]')
continue;
// An immediate ]?
if ($possible['type'] === 'unparsed_content' && $next_c !== ']')
continue;
}
// No type means 'parsed_content', which demands an immediate ] without parameters!
elseif ($next_c !== ']')
continue;
// Check allowed tree?
if (isset($possible['require_parents']) && ($inside === null || !in_array($inside['tag'], $possible['require_parents'])))
continue;
elseif (isset($inside['require_children']) && !in_array($possible['tag'], $inside['require_children']))
continue;
// If this is in the list of disallowed child tags, don't parse it.
elseif (isset($inside['disallow_children']) && in_array($possible['tag'], $inside['disallow_children']))
continue;
// Not allowed in this parent, replace the tags or show it like regular text
elseif (isset($possible['disallow_parents']) && ($inside !== null && in_array($inside['tag'], $possible['disallow_parents'])))
{
if (!isset($possible['disallow_before'], $possible['disallow_after']))
continue;
// $tag is still null at this point, so read the replacements from $possible.
$possible['before'] = isset($possible['disallow_before']) ? $possible['disallow_before'] : $possible['before'];
$possible['after'] = isset($possible['disallow_after']) ? $possible['disallow_after'] : $possible['after'];
}
$pos1 = $pos + 1 + $pt_strlen + 1;
// Quotes can have alternate styling, we do this php-side due to all the permutations of quotes.
if ($possible['tag'] === 'quote')
{
// Start with standard
$quote_alt = false;
foreach ($open_tags as $open_quote)
{
// Every parent quote this quote has flips the styling
if ($open_quote['tag'] === 'quote')
$quote_alt = !$quote_alt;
}
// Add a class to the quote to style alternating blockquotes
// @todo - Frankly it makes little sense to allow alternate blockquote
// styling without also catering for alternate quoteheader styling.
// I do remember coding that some time back, but it seems to have gotten
// lost somewhere in the Elk processes.
// Come to think of it, it may be better to append a second class rather
// than alter the standard one.
// - Example: class="bbc_quote" and class="bbc_quote alt_quote".
// This would mean simpler CSS for themes (like default) which do not use the alternate styling,
// but would still allow it for themes that want it.
$possible['before'] = strtr($possible['before'], array('<blockquote>' => '<blockquote class="bbc_' . ($quote_alt ? 'alternate' : 'standard') . '_quote">'));
}
// This is long, but it makes things much easier and cleaner.
if (!empty($possible['parameters']))
{
$preg = array();
foreach ($possible['parameters'] as $p => $info)
$preg[] = '(\s+' . $p . '=' . (empty($info['quoted']) ? '' : '&quot;') . (isset($info['match']) ? $info['match'] : '(.+?)') . (empty($info['quoted']) ? '' : '&quot;') . ')' . (empty($info['optional']) ? '' : '?');
// Okay, this may look ugly and it is, but it's not going to happen much and it is the best way
// of allowing any order of parameters but still parsing them right.
$param_size = count($preg) - 1;
$preg_keys = range(0, $param_size);
$message_stub = substr($message, $pos1 - 1);
// If an addon adds many parameters we can exceed max_execution_time, so cap the number of permutations tried.
// N parameters have N! orderings: 7! = 5040, 8! = 40,320, etc.
$max_iterations = 5040;
// Step, one by one, through all possible permutations of the parameters until we have a match
do {
$match_preg = '~^';
foreach ($preg_keys as $key)
$match_preg .= $preg[$key];
$match_preg .= '\]~i';
// Check if this combination of parameters matches the user input
$match = preg_match($match_preg, $message_stub, $matches) !== 0;
} while (!$match && --$max_iterations && ($preg_keys = pc_next_permutation($preg_keys, $param_size)));
// Didn't match our parameter list, try the next possible.
if (!$match)
continue;
$params = array();
for ($i = 1, $n = count($matches); $i < $n; $i += 2)
{
$key = strtok(ltrim($matches[$i]), '=');
if (isset($possible['parameters'][$key]['value']))
$params['{' . $key . '}'] = strtr($possible['parameters'][$key]['value'], array('$1' => $matches[$i + 1]));
elseif (isset($possible['parameters'][$key]['validate']))
$params['{' . $key . '}'] = $possible['parameters'][$key]['validate']($matches[$i + 1]);
else
$params['{' . $key . '}'] = $matches[$i + 1];
// Just to make sure: replace any $ or { so they can't interpolate wrongly.
$params['{' . $key . '}'] = strtr($params['{' . $key . '}'], array('$' => '&#036;', '{' => '&#123;'));
}
foreach ($possible['parameters'] as $p => $info)
{
if (!isset($params['{' . $p . '}']))
$params['{' . $p . '}'] = '';
}
$tag = $possible;
// Put the parameters into the string.
if (isset($tag['before']))
$tag['before'] = strtr($tag['before'], $params);
if (isset($tag['after']))
$tag['after'] = strtr($tag['after'], $params);
if (isset($tag['content']))
$tag['content'] = strtr($tag['content'], $params);
$pos1 += strlen($matches[0]) - 1;
}
else
$tag = $possible;
break;
}
// Item codes are complicated buggers... they are implicit [li]s and can make [list]s!
if ($smileys !== false && $tag === null && isset($message[$pos + 2]) && isset($itemcodes[$message[$pos + 1]]) && $message[$pos + 2] === ']' && !isset($disabled['list']) && !isset($disabled['li']))
{
if ($message[$pos + 1] == '0' && !in_array($message[$pos - 1], array(';', ' ', "\t", "\n", '>')))
{
continue;
}
$tag = $itemcodes[$message[$pos + 1]];
// First let's set up the tree: it needs to be in a list, or after an li.
if ($inside === null || ($inside['tag'] !== 'list' && $inside['tag'] !== 'li'))
{
$open_tags[] = array(
'tag' => 'list',
'after' => '</ul>',
'block_level' => true,
'require_children' => array('li'),
'disallow_children' => isset($inside['disallow_children']) ? $inside['disallow_children'] : null,
);
$code = '<ul' . ($tag === '' ? '' : ' style="list-style-type: ' . $tag . '"') . ' class="bbc_list">';
}
// We're in a list item already: another itemcode? Close it first.
elseif ($inside['tag'] === 'li')
{
array_pop($open_tags);
$code = '</li>';
}
else
$code = '';
// Now we open a new tag.
$open_tags[] = array(
'tag' => 'li',
'after' => '</li>',
'trim' => 'outside',
'block_level' => true,
'disallow_children' => isset($inside['disallow_children']) ? $inside['disallow_children'] : null,
);
// First, open the tag...
$code .= '<li>';
$message = substr($message, 0, $pos) . "\n" . $code . "\n" . substr($message, $pos + 3);
$pos += strlen($code) - 1 + 2;
// Next, find the next break (if any.) If there's more itemcode after it, keep it going - otherwise close!
$pos2 = strpos($message, '<br />', $pos);
$pos3 = strpos($message, '[/', $pos);
if ($pos2 !== false && ($pos2 <= $pos3 || $pos3 === false))
{
preg_match('~^(<br />|&nbsp;|\s|\[)+~', substr($message, $pos2 + 6), $matches);
$message = substr_replace($message, (!empty($matches[0]) && substr($matches[0], -1) === '[' ? '[/li]' : '[/li][/list]'), $pos2, 0);
$open_tags[count($open_tags) - 2]['after'] = '</ul>';
}
// Tell the [list] that it needs to close specially.
else
{
// Move the li over, because we're not sure what we'll hit.
$open_tags[count($open_tags) - 1]['after'] = '';
$open_tags[count($open_tags) - 2]['after'] = '</li></ul>';
}
continue;
}
// Implicitly close lists and tables if something other than what's required is in them. This is needed for itemcode.
if ($tag === null && $inside !== null && !empty($inside['require_children']))
{
array_pop($open_tags);
$message = substr_replace($message, "\n" . $inside['after'] . "\n", $pos, 0);
$pos += strlen($inside['after']) - 1 + 2;
}
// No tag? Keep looking, then. Silly people using brackets without actual tags.
if ($tag === null)
continue;
// Propagate the list to the child (so wrapping the disallowed tag won't work either.)
if (isset($inside['disallow_children']))
$tag['disallow_children'] = isset($tag['disallow_children']) ? array_unique(array_merge($tag['disallow_children'], $inside['disallow_children'])) : $inside['disallow_children'];
// Is this tag disabled?
if (isset($disabled[$tag['tag']]))
{
if (!isset($tag['disabled_before']) && !isset($tag['disabled_after']) && !isset($tag['disabled_content']))
{
$tag['before'] = !empty($tag['block_level']) ? '<div>' : '';
$tag['after'] = !empty($tag['block_level']) ? '</div>' : '';
$tag['content'] = isset($tag['type']) && $tag['type'] === 'closed' ? '' : (!empty($tag['block_level']) ? '<div>$1</div>' : '$1');
}
elseif (isset($tag['disabled_before']) || isset($tag['disabled_after']))
{
$tag['before'] = isset($tag['disabled_before']) ? $tag['disabled_before'] : (!empty($tag['block_level']) ? '<div>' : '');
$tag['after'] = isset($tag['disabled_after']) ? $tag['disabled_after'] : (!empty($tag['block_level']) ? '</div>' : '');
}
else
$tag['content'] = $tag['disabled_content'];
}
// We use this a lot
$tag_strlen = strlen($tag['tag']);
// The only special case is 'html', which doesn't need to close things.
if (!empty($tag['block_level']) && $tag['tag'] !== 'html' && empty($inside['block_level']))
{
$n = count($open_tags) - 1;
while ($n >= 0 && empty($open_tags[$n]['block_level']))
$n--;
// Close all the non block level tags so this tag isn't surrounded by them.
for ($i = count($open_tags) - 1; $i > $n; $i--)
{
$message = substr_replace($message, "\n" . $open_tags[$i]['after'] . "\n", $pos, 0);
$ot_strlen = strlen($open_tags[$i]['after']);
$pos += $ot_strlen + 2;
$pos1 += $ot_strlen + 2;
// Trim or eat trailing stuff... see comment at the end of the big loop.
if (!empty($open_tags[$i]['block_level']) && substr_compare($message, '<br />', $pos, 6) === 0)
$message = substr_replace($message, '', $pos, 6);
if (!empty($open_tags[$i]['trim']) && $tag['trim'] !== 'inside' && preg_match('~(<br />|&nbsp;|\s)*~', substr($message, $pos), $matches) !== 0)
$message = substr($message, 0, $pos) . substr($message, $pos + strlen($matches[0]));
array_pop($open_tags);
}
}
// No type means 'parsed_content'.
if (!isset($tag['type']))
{
// @todo Check for end tag first, so people can say "I like that [i] tag"?
$open_tags[] = $tag;
$message = substr($message, 0, $pos) . "\n" . $tag['before'] . "\n" . substr($message, $pos1);
$pos += strlen($tag['before']) - 1 + 2;
}
// Don't parse the content, just skip it.
elseif ($tag['type'] === 'unparsed_content')
{
$pos2 = stripos($message, '[/' . substr($message, $pos + 1, $tag_strlen) . ']', $pos1);
if ($pos2 === false)
continue;
$data = substr($message, $pos1, $pos2 - $pos1);
if (!empty($tag['block_level']) && substr($data, 0, 6) === '<br />')
{
$data = substr($data, 6);
}
if (isset($tag['validate']))
$tag['validate']($tag, $data, $disabled);
$code = strtr($tag['content'], array('$1' => $data));
$message = substr_replace($message, "\n" . $code . "\n", $pos, $pos2 + 3 + $tag_strlen - $pos);
$pos += strlen($code) - 1 + 2;
$last_pos = $pos + 1;
}
// Don't parse the content, just skip it.
elseif ($tag['type'] === 'unparsed_equals_content')
{
// The value may be quoted for some tags - check.
if (isset($tag['quoted']))
{
$quoted = substr($message, $pos1, 6) === '&quot;';
if ($tag['quoted'] !== 'optional' && !$quoted)
continue;
if ($quoted)
$pos1 += 6;
}
else
$quoted = false;
$pos2 = strpos($message, $quoted === false ? ']' : '&quot;]', $pos1);
if ($pos2 === false)
continue;
$pos3 = stripos($message, '[/' . substr($message, $pos + 1, $tag_strlen) . ']', $pos2);
if ($pos3 === false)
continue;
$data = array(
substr($message, $pos2 + ($quoted === false ? 1 : 7), $pos3 - ($pos2 + ($quoted === false ? 1 : 7))),
substr($message, $pos1, $pos2 - $pos1)
);
if (!empty($tag['block_level']) && substr($data[0], 0, 6) === '<br />')
$data[0] = substr($data[0], 6);
// Validation for my parking, please!
if (isset($tag['validate']))
$tag['validate']($tag, $data, $disabled);
$code = strtr($tag['content'], array('$1' => $data[0], '$2' => $data[1]));
$message = substr_replace($message, "\n" . $code . "\n", $pos, $pos3 + 3 + $tag_strlen - $pos);
$pos += strlen($code) - 1 + 2;
}
// A closed tag, with no content or value.
elseif ($tag['type'] === 'closed')
{
$pos2 = strpos($message, ']', $pos);
$message = substr($message, 0, $pos) . "\n" . $tag['content'] . "\n" . substr($message, $pos2 + 1);
$pos += strlen($tag['content']) - 1 + 2;
}
// This one is sorta ugly... :/
elseif ($tag['type'] === 'unparsed_commas_content')
{
$pos2 = strpos($message, ']', $pos1);
if ($pos2 === false)
continue;
$pos3 = stripos($message, '[/' . substr($message, $pos + 1, $tag_strlen) . ']', $pos2);
if ($pos3 === false)
continue;
// We want $1 to be the content, and the rest to be csv.
$data = explode(',', ',' . substr($message, $pos1, $pos2 - $pos1));
$data[0] = substr($message, $pos2 + 1, $pos3 - $pos2 - 1);
if (isset($tag['validate']))
$tag['validate']($tag, $data, $disabled);
$code = $tag['content'];
foreach ($data as $k => $d)
$code = strtr($code, array('$' . ($k + 1) => trim($d)));
$message = substr($message, 0, $pos) . "\n" . $code . "\n" . substr($message, $pos3 + 3 + $tag_strlen);
$pos += strlen($code) - 1 + 2;
}
// This has parsed content, and a csv value which is unparsed.
elseif ($tag['type'] === 'unparsed_commas')
{
$pos2 = strpos($message, ']', $pos1);
if ($pos2 === false)
continue;
$data = explode(',', substr($message, $pos1, $pos2 - $pos1));
if (isset($tag['validate']))
$tag['validate']($tag, $data, $disabled);
// Fix after, for disabled code mainly.
foreach ($data as $k => $d)
$tag['after'] = strtr($tag['after'], array('$' . ($k + 1) => trim($d)));
$open_tags[] = $tag;
// Replace them out, $1, $2, $3, $4, etc.
$code = $tag['before'];
foreach ($data as $k => $d)
$code = strtr($code, array('$' . ($k + 1) => trim($d)));
$message = substr_replace($message, "\n" . $code . "\n", $pos, $pos2 + 1 - $pos);
$pos += strlen($code) - 1 + 2;
}
// A tag set to a value, parsed or not.
elseif ($tag['type'] === 'unparsed_equals' || $tag['type'] === 'parsed_equals')
{
// The value may be quoted for some tags - check.
if (isset($tag['quoted']))
{
$quoted = substr($message, $pos1, 6) === '&quot;';
if ($tag['quoted'] !== 'optional' && !$quoted)
continue;
if ($quoted)
$pos1 += 6;
}
else
$quoted = false;
$pos2 = strpos($message, $quoted === false ? ']' : '&quot;]', $pos1);
if ($pos2 === false)
continue;
$data = substr($message, $pos1, $pos2 - $pos1);
// Validation for my parking, please!
if (isset($tag['validate']))
$tag['validate']($tag, $data, $disabled);
// For parsed content, we must recurse to avoid security problems.
if ($tag['type'] !== 'unparsed_equals')
{
$data = parse_bbc($data, !empty($tag['parsed_tags_allowed']) ? false : true, '', !empty($tag['parsed_tags_allowed']) ? $tag['parsed_tags_allowed'] : array());
// Unfortunately after we recurse, we must manually reset the static disabled tags to what they were
parse_bbc('dummy');
}
$tag['after'] = strtr($tag['after'], array('$1' => $data));
$open_tags[] = $tag;
$code = strtr($tag['before'], array('$1' => $data));
$message = substr_replace($message, "\n" . $code . "\n", $pos, $pos2 + ($quoted === false ? 1 : 7) - $pos);
$pos += strlen($code) - 1 + 2;
}
// If this is block level, eat any breaks after it.
if (!empty($tag['block_level']) && isset($message[$pos + 1]) && substr_compare($message, '<br />', $pos + 1, 6) === 0)
$message = substr_replace($message, '', $pos + 1, 6);
// Are we trimming outside this tag?
if (!empty($tag['trim']) && $tag['trim'] !== 'outside' && preg_match('~(<br />|&nbsp;|\s)*~', substr($message, $pos + 1), $matches) !== 0)
$message = substr($message, 0, $pos + 1) . substr($message, $pos + 1 + strlen($matches[0]));
}
// Close any remaining tags.
while ($tag = array_pop($open_tags))
$message .= "\n" . $tag['after'] . "\n";
// Parse the smileys within the parts where it can be done safely.
if ($smileys === true)
{
$message_parts = explode("\n", $message);
for ($i = 0, $n = count($message_parts); $i < $n; $i += 2)
parsesmileys($message_parts[$i]);
$message = implode('', $message_parts);
}
// No smileys, just get rid of the markers.
else
$message = strtr($message, array("\n" => ''));
if (isset($message[0]) && $message[0] === ' ')
$message = '&nbsp;' . substr($message, 1);
// Cleanup whitespace.
$message = strtr($message, array(' ' => '&nbsp; ', "\r" => '', "\n" => '<br />', '<br /> ' => '<br />&nbsp;', '&#13;' => "\n"));
// Finish footnotes if we have any.
if (strpos($message, '<sup class="bbc_footnotes">') !== false)
{
global $fn_num, $fn_content, $fn_count;
static $fn_total;
// @todo temporary until we have nesting
$message = str_replace(array('[footnote]', '[/footnote]'), '', $message);
$fn_num = 0;
$fn_content = array();
$fn_count = isset($fn_total) ? $fn_total : 0;
// Replace our footnote text with a [1] link, save the text for use at the end of the message
$message = preg_replace_callback('~(%fn%(.*?)%fn%)~is', 'footnote_callback', $message);
$fn_total += $fn_num;
// If we have footnotes, add them in at the end of the message
if (!empty($fn_num))
$message .= '<div class="bbc_footnotes">' . implode('', $fn_content) . '</div>';
}
// Allow addons access to what parse_bbc created
call_integration_hook('integrate_post_parsebbc', array(&$message, &$smileys, &$cache_id, &$parse_tags));
// Cache the output if it took some time...
if (isset($cache_key, $cache_t) && microtime(true) - $cache_t > 0.05)
cache_put_data($cache_key, $message, 240);
// If this was a force parse revert if needed.
if (!empty($parse_tags))
{
if (empty($temp_bbc))
$bbc_codes = array();
else
{
$bbc_codes = $temp_bbc;
unset($temp_bbc);
}
}
return $message;
}
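/**
 * Illustrative sketch only, not part of the original parser. It shows why
 * parse_bbc() caps its parameter matching at 5040 attempts: trying every
 * ordering of N parameters can take N! regex builds, so 7 parameters
 * already reach the cap. The function name is hypothetical.
 *
 * @param int $param_count number of parameters a tag defines
 * @return int number of orderings (N!) the permutation loop could try
 */
function example_bbc_permutation_count($param_count)
{
	$orderings = 1;
	// Multiply 2 * 3 * ... * N; 0 and 1 parameters both have a single ordering.
	for ($i = 2; $i <= $param_count; $i++)
		$orderings *= $i;
	return $orderings;
}
// Usage, assuming the helper above: example_bbc_permutation_count(7) equals the
// $max_iterations cap used in parse_bbc(), while 8 parameters would need more
// attempts than the cap allows.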
<?php
// @todo change to \StringParser\BBC
namespace BBC;
use \BBC\Codes;
// Anywhere you see - 1 + 2 it's because you get rid of the ] and add 2 \n
class Parser
{
protected $message;
protected $bbc;
protected $bbc_codes;
protected $item_codes;
protected $tags;
protected $pos;
protected $pos1;
protected $pos2;
protected $pos3;
protected $last_pos;
protected $do_smileys = true;
// This is just the name of the tags that are open, by key
protected $open_tags = array();
// This is the actual tag that's open
// @todo implement as SplStack
protected $open_bbc = array();
protected $do_autolink = true;
protected $inside_tag;
protected $lastAutoPos;
private $original_msg;
public function __construct(Codes $bbc)
{
$this->bbc = $bbc;
$this->bbc_codes = $this->bbc->getForParsing();
$this->item_codes = $this->bbc->getItemCodes();
//$this->tags = $this->bbc->getTags();
}
public function resetParser()
{
//$this->tags = null;
$this->pos = null;
$this->pos1 = null;
$this->pos2 = null;
$this->pos3 = null;
$this->last_pos = null;
$this->open_tags = array();
//$this->open_bbc = new \SplStack;
$this->do_autolink = true;
$this->inside_tag = null;
$this->lastAutoPos = 0;
}
public function parse($message)
{
$this->message = $message;
// Don't waste cycles
if ($this->message === '')
{
return '';
}
// Clean up any cut/paste issues we may have
$this->message = sanitizeMSCutPaste($this->message);
// Unfortunately, this has to be done here because smileys are parsed as blocks between BBC
// @todo remove from here and make the caller figure it out
if (!$this->parsingEnabled())
{
if ($this->do_smileys)
{
parsesmileys($this->message);
}
return $this->message;
}
$this->resetParser();
// Get the BBC
$bbc_codes = $this->bbc_codes;
// @todo change this to <br> (it will break tests)
$this->message = str_replace("\n", '<br />', $this->message);
//$this->tokenize($this->message);
$this->pos = -1;
while ($this->pos !== false)
{
$this->last_pos = isset($this->last_pos) ? max($this->pos, $this->last_pos) : $this->pos;
$this->pos = strpos($this->message, '[', $this->pos + 1);
// Failsafe.
if ($this->pos === false || $this->last_pos > $this->pos)
{
$this->pos = strlen($this->message) + 1;
}
// Can't have a one letter smiley, URL, or email! (sorry.)
if ($this->last_pos < $this->pos - 1)
{
$this->betweenTags();
}
// Are we there yet? Are we there yet?
if ($this->pos >= strlen($this->message) - 1)
{
break;
}
$tags = strtolower($this->message[$this->pos + 1]);
// Possibly a closer?
if ($tags === '/')
{
if ($this->hasOpenTags())
{
// Next closing bracket after the first character
$this->pos2 = strpos($this->message, ']', $this->pos + 1);
// No closing bracket at all, or playing games with [/]? Either way this is not a closing tag.
if ($this->pos2 === false || $this->pos2 === $this->pos + 2)
{
continue;
}
// Get everything between [/ and ]
$look_for = strtolower(substr($this->message, $this->pos + 2, $this->pos2 - $this->pos - 2));
$to_close = array();
$block_level = null;
do
{
// Get the last opened tag
$tag = $this->closeOpenedTag(false);
// No open tags
if (!$tag)
{
break;
}
if ($tag[Codes::ATTR_BLOCK_LEVEL])
{
// Only find out if we need to.
if ($block_level === false)
{
$this->addOpenTag($tag);
break;
}
// The idea is, if we are LOOKING for a block level tag, we can close them on the way.
if (isset($look_for[0], $bbc_codes[$look_for[0]]))
{
foreach ($bbc_codes[$look_for[0]] as $temp)
{
if ($temp[Codes::ATTR_TAG] === $look_for)
{
$block_level = $temp[Codes::ATTR_BLOCK_LEVEL];
break;
}
}
}
if ($block_level !== true)
{
$block_level = false;
$this->addOpenTag($tag);
break;
}
}
$to_close[] = $tag;
} while ($tag[Codes::ATTR_TAG] !== $look_for);
// Did we just eat through everything and not find it?
if (!$this->hasOpenTags() && (empty($tag) || $tag[Codes::ATTR_TAG] !== $look_for))
{
$this->open_tags = $to_close;
continue;
}
elseif (!empty($to_close) && $tag[Codes::ATTR_TAG] !== $look_for)
{
if ($block_level === null && isset($look_for[0], $bbc_codes[$look_for[0]]))
{
foreach ($bbc_codes[$look_for[0]] as $temp)
{
if ($temp[Codes::ATTR_TAG] === $look_for)
{
$block_level = !empty($temp[Codes::ATTR_BLOCK_LEVEL]);
break;
}
}
}
// We're not looking for a block level tag (or maybe even a tag that exists...)
if (!$block_level)
{
foreach ($to_close as $tag)
{
$this->addOpenTag($tag);
}
continue;
}
}
foreach ($to_close as $tag)
{
//$this->message = substr($this->message, 0, $this->pos) . "\n" . $tag[Codes::ATTR_AFTER] . "\n" . substr($this->message, $this->pos2 + 1);
//$this->message = substr_replace($this->message, "\n" . $tag[Codes::ATTR_AFTER] . "\n", $this->pos, $this->pos2 + 1 - $this->pos);
$tmp = $this->noSmileys($tag[Codes::ATTR_AFTER]);
$this->message = substr_replace($this->message, $tmp, $this->pos, $this->pos2 + 1 - $this->pos);
//$this->pos += strlen($tag[Codes::ATTR_AFTER]) + 2;
$this->pos += strlen($tmp);
$this->pos2 = $this->pos - 1;
// See the comment at the end of the big loop - just eating whitespace ;).
if ($tag[Codes::ATTR_BLOCK_LEVEL] && isset($this->message[$this->pos]) && substr_compare($this->message, '<br />', $this->pos, 6) === 0)
//if (!empty($tag[Codes::ATTR_BLOCK_LEVEL]) && substr($this->message, $this->pos, 6) === '<br />')
{
// $this->message = substr($this->message, 0, $this->pos) . substr($this->message, $this->pos + 6);
$this->message = substr_replace($this->message, '', $this->pos, 6);
}
// Trim whitespace after the closing tag, unless the tag only trims inside
if (!empty($tag[Codes::ATTR_TRIM]) && $tag[Codes::ATTR_TRIM] !== Codes::TRIM_INSIDE)
{
$this->trimWhiteSpace($this->message, $this->pos + 1);
}
}
if (!empty($to_close))
{
$to_close = array();
$this->pos--;
}
}
// We don't allow / to be used for anything but the closing character, so this can't be a tag
continue;
}
// No tags for this character, so just keep going (fastest possible course.)
if (!isset($bbc_codes[$tags]))
{
continue;
}
$this->inside_tag = !$this->hasOpenTags() ? null : $this->getLastOpenedTag();
// @todo figure out if this is an itemcode first
$tag = $this->isItemCode($tags) ? null : $this->findTag($bbc_codes[$tags]);
//if (!empty($tag['itemcode'])
if ($tag === null
// Why do smileys being on/off affect item codes?
// && $this->do_smileys
&& isset($this->message[$this->pos + 2])
&& $this->message[$this->pos + 2] === ']'
&& $this->isItemCode($this->message[$this->pos + 1])
&& !$this->bbc->isDisabled('list')
&& !$this->bbc->isDisabled('li')
)
{
// A [0] item code only counts when it is preceded by a semicolon, space, tab, new line, or greater-than sign
if (!($this->message[$this->pos + 1] === '0' && !in_array($this->message[$this->pos - 1], array(';', ' ', "\t", "\n", '>'))))
{
// Item codes are complicated buggers... they are implicit [li]s and can make [list]s!
$this->handleItemCode();
}
continue;
}
// Implicitly close lists and tables if something other than what's required is in them. This is needed for itemcode.
if ($tag === null && $this->inside_tag !== null && !empty($this->inside_tag[Codes::ATTR_REQUIRE_CHILDREN]))
{
$this->closeOpenedTag();
//$this->message = substr_replace($this->message, "\n" . $this->inside_tag[Codes::ATTR_AFTER] . "\n", $this->pos, 0);
$tmp = $this->noSmileys($this->inside_tag[Codes::ATTR_AFTER]);
$this->message = substr_replace($this->message, $tmp, $this->pos, 0);
//$this->pos += strlen($this->inside_tag[Codes::ATTR_AFTER]) - 1 + 2;
$this->pos += strlen($tmp) - 1;
}
// No tag? Keep looking, then. Silly people using brackets without actual tags.
if ($tag === null)
{
continue;
}
// Propagate the list to the child (so wrapping the disallowed tag won't work either.)
if (isset($this->inside_tag[Codes::ATTR_DISALLOW_CHILDREN]))
{
//$tag[Codes::ATTR_DISALLOW_CHILDREN] = isset($tag[Codes::ATTR_DISALLOW_CHILDREN]) ? array_unique(array_merge($tag[Codes::ATTR_DISALLOW_CHILDREN], $this->inside_tag[Codes::ATTR_DISALLOW_CHILDREN])) : $this->inside_tag[Codes::ATTR_DISALLOW_CHILDREN];
$tag[Codes::ATTR_DISALLOW_CHILDREN] = isset($tag[Codes::ATTR_DISALLOW_CHILDREN]) ? $tag[Codes::ATTR_DISALLOW_CHILDREN] + $this->inside_tag[Codes::ATTR_DISALLOW_CHILDREN] : $this->inside_tag[Codes::ATTR_DISALLOW_CHILDREN];
}
// Is this tag disabled?
if ($this->bbc->isDisabled($tag[Codes::ATTR_TAG]))
{
$this->handleDisabled($tag);
}
// The only special case is 'html', which doesn't need to close things.
if (!empty($tag[Codes::ATTR_BLOCK_LEVEL]) && $tag[Codes::ATTR_TAG] !== 'html' && empty($this->inside_tag[Codes::ATTR_BLOCK_LEVEL]))
{
$this->closeNonBlockLevel();
}
// This is the part where we actually handle the tags. I know, crazy how long it took.
if ($this->handleTag($tag))
{
continue;
}
// If this is block level, eat any breaks after it.
if ($tag[Codes::ATTR_BLOCK_LEVEL] && isset($this->message[$this->pos + 1]) && substr_compare($this->message, '<br />', $this->pos + 1, 6) === 0)
//if (!empty($tag[Codes::ATTR_BLOCK_LEVEL]) && substr($this->message, $this->pos + 1, 6) === '<br />')
{
$this->message = substr_replace($this->message, '', $this->pos + 1, 6);
//$this->message = substr($this->message, 0, $this->pos + 1) . substr($this->message, $this->pos + 7);
}
// Are we trimming outside this tag?
if (!empty($tag[Codes::ATTR_TRIM]) && $tag[Codes::ATTR_TRIM] !== Codes::TRIM_OUTSIDE)
{
$this->trimWhiteSpace($this->message, $this->pos + 1);
}
}
// Close any remaining tags.
while ($tag = $this->closeOpenedTag())
{
//$this->message .= "\n" . $tag[Codes::ATTR_AFTER] . "\n";
$this->message .= $this->noSmileys($tag[Codes::ATTR_AFTER]);
}
// Parse the smileys within the parts where it can be done safely.
if ($this->do_smileys === true)
{
$message_parts = explode("\n", $this->message);
for ($i = 0, $n = count($message_parts); $i < $n; $i += 2)
{
parsesmileys($message_parts[$i]);
//parsesmileys($this->message);
}
$this->message = implode('', $message_parts);
}
// No smileys, just get rid of the markers.
else
{
$this->message = str_replace("\n", '', $this->message);
}
if (isset($this->message[0]) && $this->message[0] === ' ')
{
$this->message = '&nbsp;' . substr($this->message, 1);
}
// Cleanup whitespace.
// @todo remove \n because it should never happen after the explode/str_replace. Replace with str_replace
$this->message = strtr($this->message, array(' ' => '&nbsp; ', "\r" => '', "\n" => '<br />', '<br /> ' => '<br />&nbsp;', '&#13;' => "\n"));
// Finish footnotes if we have any.
if (strpos($this->message, '<sup class="bbc_footnotes">') !== false)
{
$this->handleFootnotes();
}
// Allow addons access to what the parser created
$message = $this->message;
call_integration_hook('integrate_post_parsebbc', array(&$message));
$this->message = $message;
return $this->message;
}
/**
* Turn smiley parsing on/off
* @param bool $toggle
* @return \BBC\Parser
*/
public function doSmileys($toggle)
{
$this->do_smileys = (bool) $toggle;
return $this;
}
public function parsingEnabled()
{
return !empty($GLOBALS['modSettings']['enableBBC']);
}
protected function parseHTML(&$data)
{
global $modSettings;
//$data = preg_replace('~&lt;a\s+href=((?:&quot;)?)((?:https?://|ftps?://|mailto:)\S+?)\\1&gt;~i', '[url=$2]', $data);
$data = preg_replace('~&lt;a\s+href=((?:&quot;)?)((?:https?://|mailto:)\S+?)\\1&gt;~i', '[url=$2]', $data);
$data = preg_replace('~&lt;/a&gt;~i', '[/url]', $data);
// <br /> should be empty.
$empty_tags = array('br', 'hr');
foreach ($empty_tags as $tag)
{
$data = str_replace(array('&lt;' . $tag . '&gt;', '&lt;' . $tag . '/&gt;', '&lt;' . $tag . ' /&gt;'), '[' . $tag . ' /]', $data);
}
// b, u, i, s, pre... basic tags.
$closable_tags = array('b', 'u', 'i', 's', 'em', 'ins', 'del', 'pre', 'blockquote');
foreach ($closable_tags as $tag)
{
$diff = substr_count($data, '&lt;' . $tag . '&gt;') - substr_count($data, '&lt;/' . $tag . '&gt;');
$data = strtr($data, array('&lt;' . $tag . '&gt;' => '<' . $tag . '>', '&lt;/' . $tag . '&gt;' => '</' . $tag . '>'));
if ($diff > 0)
{
$data = substr($data, 0, -1) . str_repeat('</' . $tag . '>', $diff) . substr($data, -1);
}
}
// Do <img ... /> - with security... action= -> action-.
//preg_match_all('~&lt;img\s+src=((?:&quot;)?)((?:https?://|ftps?://)\S+?)\\1(?:\s+alt=(&quot;.*?&quot;|\S*?))?(?:\s?/)?&gt;~i', $data, $matches, PREG_PATTERN_ORDER);
preg_match_all('~&lt;img\s+src=((?:&quot;)?)((?:https?://)\S+?)\\1(?:\s+alt=(&quot;.*?&quot;|\S*?))?(?:\s?/)?&gt;~i', $data, $matches, PREG_PATTERN_ORDER);
if (!empty($matches[0]))
{
$replaces = array();
foreach ($matches[2] as $match => $imgtag)
{
$alt = empty($matches[3][$match]) ? '' : ' alt=' . preg_replace('~^&quot;|&quot;$~', '', $matches[3][$match]);
// Remove action= from the URL - no funny business, now.
if (preg_match('~action(=|%3d)(?!dlattach)~i', $imgtag) !== 0)
{
$imgtag = preg_replace('~action(?:=|%3d)(?!dlattach)~i', 'action-', $imgtag);
}
// Check if the image is larger than allowed.
// @todo - We should seriously look at deprecating some of this in favour of CSS resizing.
if (!empty($modSettings['max_image_width']) && !empty($modSettings['max_image_height']))
{
// For images, we'll want this.
require_once(SUBSDIR . '/Attachments.subs.php');
list ($width, $height) = url_image_size($imgtag);
if (!empty($modSettings['max_image_width']) && $width > $modSettings['max_image_width'])
{
$height = (int) (($modSettings['max_image_width'] * $height) / $width);
$width = $modSettings['max_image_width'];
}
if (!empty($modSettings['max_image_height']) && $height > $modSettings['max_image_height'])
{
$width = (int) (($modSettings['max_image_height'] * $width) / $height);
$height = $modSettings['max_image_height'];
}
// Set the new image tag.
$replaces[$matches[0][$match]] = '[img width=' . $width . ' height=' . $height . $alt . ']' . $imgtag . '[/img]';
}
else
{
$replaces[$matches[0][$match]] = '[img' . $alt . ']' . $imgtag . '[/img]';
}
}
$data = strtr($data, $replaces);
}
}
protected function autoLink(&$data)
{
static $search, $replacements;
// Are we inside tags that should be auto linked?
$autolink_area = true;
if ($this->hasOpenTags())
{
foreach ($this->open_tags as $open_tag)
{
if (!$open_tag[Codes::ATTR_AUTOLINK])
{
$autolink_area = false;
// One non-autolinking parent is enough, no need to check the rest.
break;
}
}
}
// Don't go backwards.
// @todo Don't think this is the real solution....
$this->lastAutoPos = isset($this->lastAutoPos) ? $this->lastAutoPos : 0;
if ($this->pos < $this->lastAutoPos)
{
$autolink_area = false;
}
$this->lastAutoPos = $this->pos;
if ($autolink_area)
{
// Parse any URLs.... have to get rid of the @ problems some things cause... stupid email addresses.
if (!$this->bbc->isDisabled('url') && (strpos($data, '://') !== false || strpos($data, 'www.') !== false) && strpos($data, '[url') === false)
{
// Switch out quotes really quick because they can cause problems.
$data = strtr($data, array('&#039;' => '\'', '&nbsp;' => "\xC2\xA0", '&quot;' => '>">', '"' => '<"<', '&lt;' => '<lt<'));
if ($search === null)
{
// @todo get rid of the FTP, nobody uses it
$search = array(
'~(?<=[\s>\.(;\'"]|^)((?:http|https)://[\w\-_%@:|]+(?:\.[\w\-_%]+)*(?::\d+)?(?:/[\p{L}\p{N}\-_\~%\.@!,\?&;=#(){}+:\'\\\\]*)*[/\p{L}\p{N}\-_\~%@\?;=#}\\\\])~ui',
//'~(?<=[\s>\.(;\'"]|^)((?:ftp|ftps)://[\w\-_%@:|]+(?:\.[\w\-_%]+)*(?::\d+)?(?:/[\w\-_\~%\.@,\?&;=#(){}+:\'\\\\]*)*[/\w\-_\~%@\?;=#}\\\\])~i',
'~(?<=[\s>(\'<]|^)(www(?:\.[\w\-_]+)+(?::\d+)?(?:/[\p{L}\p{N}\-_\~%\.@!,\?&;=#(){}+:\'\\\\]*)*[/\p{L}\p{N}\-_\~%@\?;=#}\\\\])~ui'
);
$replacements = array(
'[url]$1[/url]',
//'[ftp]$1[/ftp]',
'[url=http://$1]$1[/url]'
);
call_integration_hook('integrate_autolink', array(&$search, &$replacements, $this->bbc));
}
$result = preg_replace($search, $replacements, $data);
// Only do this if the preg survives.
if (is_string($result))
{
$data = $result;
}
// Switch those quotes back
$data = strtr($data, array('\'' => '&#039;', "\xC2\xA0" => '&nbsp;', '>">' => '&quot;', '<"<' => '"', '<lt<' => '&lt;'));
}
// Next, emails...
if (!$this->bbc->isDisabled('email') && strpos($data, '@') !== false && strpos($data, '[email') === false)
{
$data = preg_replace('~(?<=[\?\s\x{A0}\[\]()*\\\;>]|^)([\w\-\.]{1,80}@[\w\-]+\.[\w\-\.]+[\w\-])(?=[?,\s\x{A0}\[\]()*\\\]|$|<br />|&nbsp;|&gt;|&lt;|&quot;|&#039;|\.(?:\.|;|&nbsp;|\s|$|<br />))~u', '[email]$1[/email]', $data);
$data = preg_replace('~(?<=<br />)([\w\-\.]{1,80}@[\w\-]+\.[\w\-\.]+[\w\-])(?=[?\.,;\s\x{A0}\[\]()*\\\]|$|<br />|&nbsp;|&gt;|&lt;|&quot;|&#039;)~u', '[email]$1[/email]', $data);
}
}
}
protected function findTag(array $possible_codes)
{
$tag = null;
$last_check = null;
foreach ($possible_codes as $possible)
{
// Skip tags that didn't match the next X characters
if ($possible[Codes::ATTR_TAG] === $last_check)
{
continue;
}
// Not a match?
if (substr_compare($this->message, $possible[Codes::ATTR_TAG], $this->pos + 1, $possible[Codes::ATTR_LENGTH], true) !== 0)
{
$last_check = $possible[Codes::ATTR_TAG];
continue;
}
// The character after the possible tag or nothing
// @todo shouldn't this return if empty since there needs to be a ]?
$next_c = isset($this->message[$this->pos + 1 + $possible[Codes::ATTR_LENGTH]]) ? $this->message[$this->pos + 1 + $possible[Codes::ATTR_LENGTH]] : '';
// A test validation?
// @todo figure out if the regex can use an offset instead;
// substr() creates a copy of the message from this point up to the next ]!
// @todo where do we know if the next char is ]?
//if (isset($possible[Codes::ATTR_TEST]) && preg_match('~^' . $possible[Codes::ATTR_TEST] . '~', substr($this->message, $this->pos + 1 + $possible[Codes::ATTR_LENGTH] + 1)) === 0)
if (isset($possible[Codes::ATTR_TEST]) && preg_match('~^' . $possible[Codes::ATTR_TEST] . '~', substr($this->message, $this->pos + 2 + $possible[Codes::ATTR_LENGTH], strpos($this->message, ']', $this->pos) - ($this->pos + 2 + $possible[Codes::ATTR_LENGTH]))) === 0)
{
continue;
}
// Do we want parameters?
elseif (!empty($possible[Codes::ATTR_PARAM]))
{
if ($next_c !== ' ')
{
continue;
}
}
elseif ($possible[Codes::ATTR_TYPE] !== Codes::TYPE_PARSED_CONTENT)
{
// Do we need an equal sign?
if ($next_c !== '=' && in_array($possible[Codes::ATTR_TYPE], array(Codes::TYPE_UNPARSED_EQUALS, Codes::TYPE_UNPARSED_COMMAS, Codes::TYPE_UNPARSED_COMMAS_CONTENT, Codes::TYPE_UNPARSED_EQUALS_CONTENT, Codes::TYPE_PARSED_EQUALS)))
{
continue;
}
if ($next_c !== ']')
{
// An immediate ]?
if ($possible[Codes::ATTR_TYPE] === Codes::TYPE_UNPARSED_CONTENT)
{
continue;
}
// Maybe we just want a /...
elseif ($possible[Codes::ATTR_TYPE] === Codes::TYPE_CLOSED && substr_compare($this->message, '/]', $this->pos + 1 + $possible[Codes::ATTR_LENGTH], 2) !== 0 && substr_compare($this->message, ' /]', $this->pos + 1 + $possible[Codes::ATTR_LENGTH], 3) !== 0)
{
continue;
}
}
}
// parsed_content demands an immediate ] without parameters!
elseif ($possible[Codes::ATTR_TYPE] === Codes::TYPE_PARSED_CONTENT)
{
if ($next_c !== ']')
{
continue;
}
}
// Check allowed tree?
if (isset($possible[Codes::ATTR_REQUIRE_PARENTS]) && ($this->inside_tag === null || !in_array($this->inside_tag[Codes::ATTR_TAG], $possible[Codes::ATTR_REQUIRE_PARENTS])))
{
continue;
}
if (isset($this->inside_tag[Codes::ATTR_REQUIRE_CHILDREN]) && !in_array($possible[Codes::ATTR_TAG], $this->inside_tag[Codes::ATTR_REQUIRE_CHILDREN]))
{
continue;
}
// If this is in the list of disallowed child tags, don't parse it.
//elseif (isset($this->inside_tag[Codes::ATTR_DISALLOW_CHILDREN]) && in_array($possible[Codes::ATTR_TAG], $this->inside_tag[Codes::ATTR_DISALLOW_CHILDREN]))
if (isset($this->inside_tag[Codes::ATTR_DISALLOW_CHILDREN]) && isset($this->inside_tag[Codes::ATTR_DISALLOW_CHILDREN][$possible[Codes::ATTR_TAG]]))
{
continue;
}
// Not allowed in this parent, replace the tags or show it like regular text
if (isset($possible[Codes::ATTR_DISALLOW_PARENTS]) && ($this->inside_tag !== null && in_array($this->inside_tag[Codes::ATTR_TAG], $possible[Codes::ATTR_DISALLOW_PARENTS])))
{
if (!isset($possible[Codes::ATTR_DISALLOW_BEFORE], $possible[Codes::ATTR_DISALLOW_AFTER]))
{
continue;
}
// $tag is not set yet at this point; the disallow_* replacements live on $possible.
$possible[Codes::ATTR_BEFORE] = isset($possible[Codes::ATTR_DISALLOW_BEFORE]) ? $possible[Codes::ATTR_DISALLOW_BEFORE] : $possible[Codes::ATTR_BEFORE];
$possible[Codes::ATTR_AFTER] = isset($possible[Codes::ATTR_DISALLOW_AFTER]) ? $possible[Codes::ATTR_DISALLOW_AFTER] : $possible[Codes::ATTR_AFTER];
}
$this->pos1 = $this->pos + 1 + $possible[Codes::ATTR_LENGTH] + 1;
// This is long, but it makes things much easier and cleaner.
if (!empty($possible[Codes::ATTR_PARAM]))
{
$match = $this->matchParameters($possible, $matches);
// Didn't match our parameter list, try the next possible.
if (!$match)
{
continue;
}
$tag = $this->setupTagParameters($possible, $matches);
}
else
{
$tag = $possible;
}
// Quotes can have alternate styling, we do this php-side due to all the permutations of quotes.
if ($tag[Codes::ATTR_TAG] === 'quote')
{
// Start with standard
$quote_alt = false;
foreach ($this->open_tags as $open_quote)
{
// Every parent quote this quote has flips the styling
if ($open_quote[Codes::ATTR_TAG] === 'quote')
{
$quote_alt = !$quote_alt;
}
}
// Add a class to the quote to style alternating blockquotes
// @todo - Frankly it makes little sense to allow alternate blockquote
// styling without also catering for alternate quoteheader styling.
// I do remember coding that some time back, but it seems to have gotten
// lost somewhere in the Elk processes.
// Come to think of it, it may be better to append a second class rather
// than alter the standard one.
// - Example: class="bbc_quote" and class="bbc_quote alt_quote".
// This would mean simpler CSS for themes (like default) which do not use the alternate styling,
// but would still allow it for themes that want it.
$tag[Codes::ATTR_BEFORE] = str_replace('<blockquote>', '<blockquote class="bbc_' . ($quote_alt ? 'alternate' : 'standard') . '_quote">', $tag[Codes::ATTR_BEFORE]);
}
break;
}
return $tag;
}
protected function handleItemCode()
{
$tag = $this->item_codes[$this->message[$this->pos + 1]];
// First let's set up the tree: it needs to be in a list, or after an li.
if ($this->inside_tag === null || ($this->inside_tag[Codes::ATTR_TAG] !== 'list' && $this->inside_tag[Codes::ATTR_TAG] !== 'li'))
{
$this->addOpenTag(array(
Codes::ATTR_TAG => 'list',
Codes::ATTR_TYPE => Codes::TYPE_PARSED_CONTENT,
Codes::ATTR_AFTER => '</ul>',
Codes::ATTR_BLOCK_LEVEL => true,
Codes::ATTR_REQUIRE_CHILDREN => array('li'),
Codes::ATTR_DISALLOW_CHILDREN => isset($this->inside_tag[Codes::ATTR_DISALLOW_CHILDREN]) ? $this->inside_tag[Codes::ATTR_DISALLOW_CHILDREN] : null,
Codes::ATTR_LENGTH => 4,
Codes::ATTR_AUTOLINK => true,
));
$code = '<ul' . ($tag === '' ? '' : ' style="list-style-type: ' . $tag . '"') . ' class="bbc_list">';
}
// We're in a list item already: another itemcode? Close it first.
elseif ($this->inside_tag[Codes::ATTR_TAG] === 'li')
{
$this->closeOpenedTag();
$code = '</li>';
}
else
{
$code = '';
}
// Now we open a new tag.
$this->addOpenTag(array(
Codes::ATTR_TAG => 'li',
Codes::ATTR_TYPE => Codes::TYPE_PARSED_CONTENT,
Codes::ATTR_AFTER => '</li>',
Codes::ATTR_TRIM => Codes::TRIM_OUTSIDE,
Codes::ATTR_BLOCK_LEVEL => true,
Codes::ATTR_DISALLOW_CHILDREN => isset($this->inside_tag[Codes::ATTR_DISALLOW_CHILDREN]) ? $this->inside_tag[Codes::ATTR_DISALLOW_CHILDREN] : null,
Codes::ATTR_AUTOLINK => true,
Codes::ATTR_LENGTH => 2,
));
// First, open the tag...
$code .= '<li>';
//$this->message = substr($this->message, 0, $this->pos) . "\n" . $code . "\n" . substr($this->message, $this->pos + 3);
//$this->message = substr_replace($this->message, "\n" . $code . "\n", $this->pos, 3);
$tmp = $this->noSmileys($code);
$this->message = substr_replace($this->message, $tmp, $this->pos, 3);
//$this->pos += strlen($code) + 1;
$this->pos += strlen($tmp) - 1;
// Next, find the next break (if any.) If there's more itemcode after it, keep it going - otherwise close!
$this->pos2 = strpos($this->message, '<br />', $this->pos);
$this->pos3 = strpos($this->message, '[/', $this->pos);
$num_open_tags = count($this->open_tags);
if ($this->pos2 !== false && ($this->pos3 === false || $this->pos2 <= $this->pos3))
{
// Can't use offset because of the ^
preg_match('~^(<br />|&nbsp;|\s|\[)+~', substr($this->message, $this->pos2 + 6), $matches);
// Keep the list open if the next character after the break is a [. Otherwise, close it.
$replacement = (!empty($matches[0]) && substr_compare($matches[0], '[', -1, 1) === 0 ? '[/li]' : '[/li][/list]');
//$this->message = substr($this->message, 0, $this->pos2) . $replacement . substr($this->message, $this->pos2);
$this->message = substr_replace($this->message, $replacement, $this->pos2, 0);
$this->open_tags[$num_open_tags - 2][Codes::ATTR_AFTER] = '</ul>';
}
// Tell the [list] that it needs to close specially.
else
{
// Move the li over, because we're not sure what we'll hit.
$this->open_tags[$num_open_tags - 1][Codes::ATTR_AFTER] = '';
$this->open_tags[$num_open_tags - 2][Codes::ATTR_AFTER] = '</li></ul>';
}
}
protected function handleTag($tag)
{
switch ($tag[Codes::ATTR_TYPE])
{
case Codes::TYPE_PARSED_CONTENT:
// @todo Check for end tag first, so people can say "I like that [i] tag"?
$this->addOpenTag($tag);
//$this->message = substr($this->message, 0, $this->pos) . "\n" . $tag[Codes::ATTR_BEFORE] . "\n" . substr($this->message, $this->pos1);
//$this->message = substr_replace($this->message, "\n" . $tag[Codes::ATTR_BEFORE] . "\n", $this->pos, $this->pos1 - $this->pos);
$tmp = $this->noSmileys($tag[Codes::ATTR_BEFORE]);
$this->message = substr_replace($this->message, $tmp, $this->pos, $this->pos1 - $this->pos);
//$this->pos += strlen($tag[Codes::ATTR_BEFORE]) + 1;
$this->pos += strlen($tmp) - 1;
break;
// Don't parse the content, just skip it.
case Codes::TYPE_UNPARSED_CONTENT:
// Find the next closer
$this->pos2 = stripos($this->message, '[/' . $tag[Codes::ATTR_TAG] . ']', $this->pos1);
// No closer
if ($this->pos2 === false)
{
return true;
}
// @todo figure out how to make this move to the validate part
$data = substr($this->message, $this->pos1, $this->pos2 - $this->pos1);
//if (!empty($tag[Codes::ATTR_BLOCK_LEVEL]) && substr_compare($this->message, '<br />', $this->pos, 6) === 0)
//if (!empty($tag[Codes::ATTR_BLOCK_LEVEL]) && substr($data, 0, 6) === '<br />')
if (!empty($tag[Codes::ATTR_BLOCK_LEVEL]) && isset($data[0]) && substr_compare($data, '<br />', 0, 6) === 0)
{
$data = substr($data, 6);
//$this->message = substr_replace($this->message, '', $this->pos, 6);
}
if (isset($tag[Codes::ATTR_VALIDATE]))
{
$tag[Codes::ATTR_VALIDATE]($tag, $data, $this->bbc->getDisabled());
}
$code = strtr($tag[Codes::ATTR_CONTENT], array('$1' => $data));
//$this->message = substr($this->message, 0, $this->pos) . "\n" . $code . "\n" . substr($this->message, $this->pos2 + 3 + $tag[Codes::ATTR_LENGTH]);
//$this->message = substr_replace($this->message, "\n" . $code . "\n", $this->pos, $this->pos2 + 3 + $tag[Codes::ATTR_LENGTH] - $this->pos);
$tmp = $this->noSmileys($code);
$this->message = substr_replace($this->message, $tmp, $this->pos, $this->pos2 + 3 + $tag[Codes::ATTR_LENGTH] - $this->pos);
//$this->pos += strlen($code) - 1 + 2;
$this->pos += strlen($tmp) - 1;
$this->last_pos = $this->pos + 1;
break;
// Don't parse the content, just skip it.
case Codes::TYPE_UNPARSED_EQUALS_CONTENT:
// The value may be quoted for some tags - check.
if (isset($tag[Codes::ATTR_QUOTED]))
{
$quoted = substr_compare($this->message, '&quot;', $this->pos1, 6) === 0;
if ($tag[Codes::ATTR_QUOTED] !== Codes::OPTIONAL && !$quoted)
{
return true;
}
if ($quoted)
{
$this->pos1 += 6;
}
}
else
{
$quoted = false;
}
$this->pos2 = strpos($this->message, $quoted === false ? ']' : '&quot;]', $this->pos1);
if ($this->pos2 === false)
{
return true;
}
$this->pos3 = stripos($this->message, '[/' . $tag[Codes::ATTR_TAG] . ']', $this->pos2);
if ($this->pos3 === false)
{
return true;
}
$data = array(
substr($this->message, $this->pos2 + ($quoted === false ? 1 : 7), $this->pos3 - ($this->pos2 + ($quoted === false ? 1 : 7))),
substr($this->message, $this->pos1, $this->pos2 - $this->pos1)
);
if (!empty($tag[Codes::ATTR_BLOCK_LEVEL]) && substr_compare($data[0], '<br />', 0, 6) === 0)
{
$data[0] = substr($data[0], 6);
}
// Validation for my parking, please!
if (isset($tag[Codes::ATTR_VALIDATE]))
{
$tag[Codes::ATTR_VALIDATE]($tag, $data, $this->bbc->getDisabled());
}
$code = strtr($tag[Codes::ATTR_CONTENT], array('$1' => $data[0], '$2' => $data[1]));
//$this->message = substr($this->message, 0, $this->pos) . "\n" . $code . "\n" . substr($this->message, $this->pos3 + 3 + $tag[Codes::ATTR_LENGTH]);
//$this->message = substr_replace($this->message, "\n" . $code . "\n", $this->pos, $this->pos3 + 3 + $tag[Codes::ATTR_LENGTH] - $this->pos);
$tmp = $this->noSmileys($code);
$this->message = substr_replace($this->message, $tmp, $this->pos, $this->pos3 + 3 + $tag[Codes::ATTR_LENGTH] - $this->pos);
//$this->pos += strlen($code) - 1 + 2;
$this->pos += strlen($tmp) - 1;
break;
// A closed tag, with no content or value.
case Codes::TYPE_CLOSED:
$this->pos2 = strpos($this->message, ']', $this->pos);
//$this->message = substr($this->message, 0, $this->pos) . "\n" . $tag[Codes::ATTR_CONTENT] . "\n" . substr($this->message, $this->pos2 + 1);
//$this->message = substr_replace($this->message, "\n" . $tag[Codes::ATTR_CONTENT] . "\n", $this->pos, $this->pos2 + 1 - $this->pos);
$tmp = $this->noSmileys($tag[Codes::ATTR_CONTENT]);
$this->message = substr_replace($this->message, $tmp, $this->pos, $this->pos2 + 1 - $this->pos);
//$this->pos += strlen($tag[Codes::ATTR_CONTENT]) - 1 + 2;
$this->pos += strlen($tmp) - 1;
break;
// This one is sorta ugly... :/
case Codes::TYPE_UNPARSED_COMMAS_CONTENT:
$this->pos2 = strpos($this->message, ']', $this->pos1);
if ($this->pos2 === false)
{
return true;
}
$this->pos3 = stripos($this->message, '[/' . $tag[Codes::ATTR_TAG] . ']', $this->pos2);
if ($this->pos3 === false)
{
return true;
}
// We want $1 to be the content, and the rest to be csv.
$data = explode(',', ',' . substr($this->message, $this->pos1, $this->pos2 - $this->pos1));
$data[0] = substr($this->message, $this->pos2 + 1, $this->pos3 - $this->pos2 - 1);
if (isset($tag[Codes::ATTR_VALIDATE]))
{
$tag[Codes::ATTR_VALIDATE]($tag, $data, $this->bbc->getDisabled());
}
$code = $tag[Codes::ATTR_CONTENT];
foreach ($data as $k => $d)
{
$code = strtr($code, array('$' . ($k + 1) => trim($d)));
}
//$this->message = substr($this->message, 0, $this->pos) . "\n" . $code . "\n" . substr($this->message, $this->pos3 + 3 + $tag[Codes::ATTR_LENGTH]);
//$this->message = substr_replace($this->message, "\n" . $code . "\n", $this->pos, $this->pos3 + 3 + $tag[Codes::ATTR_LENGTH] - $this->pos);
$tmp = $this->noSmileys($code);
$this->message = substr_replace($this->message, $tmp, $this->pos, $this->pos3 + 3 + $tag[Codes::ATTR_LENGTH] - $this->pos);
//$this->pos += strlen($code) - 1 + 2;
$this->pos += strlen($tmp) - 1;
break;
// This has parsed content, and a csv value which is unparsed.
case Codes::TYPE_UNPARSED_COMMAS:
$this->pos2 = strpos($this->message, ']', $this->pos1);
if ($this->pos2 === false)
{
return true;
}
$data = explode(',', substr($this->message, $this->pos1, $this->pos2 - $this->pos1));
if (isset($tag[Codes::ATTR_VALIDATE]))
{
$tag[Codes::ATTR_VALIDATE]($tag, $data, $this->bbc->getDisabled());
}
// Fix after, for disabled code mainly.
foreach ($data as $k => $d)
{
$tag[Codes::ATTR_AFTER] = strtr($tag[Codes::ATTR_AFTER], array('$' . ($k + 1) => trim($d)));
}
$this->addOpenTag($tag);
// Replace them out, $1, $2, $3, $4, etc.
$code = $tag[Codes::ATTR_BEFORE];
foreach ($data as $k => $d)
{
$code = strtr($code, array('$' . ($k + 1) => trim($d)));
}
//$this->message = substr($this->message, 0, $this->pos) . "\n" . $code . "\n" . substr($this->message, $this->pos2 + 1);
//$this->message = substr_replace($this->message, "\n" . $code . "\n", $this->pos, $this->pos2 + 1 - $this->pos);
$tmp = $this->noSmileys($code);
$this->message = substr_replace($this->message, $tmp, $this->pos, $this->pos2 + 1 - $this->pos);
//$this->pos += strlen($code) - 1 + 2;
$this->pos += strlen($tmp) - 1;
break;
// A tag set to a value, parsed or not.
case Codes::TYPE_PARSED_EQUALS:
case Codes::TYPE_UNPARSED_EQUALS:
// The value may be quoted for some tags - check.
if (isset($tag[Codes::ATTR_QUOTED]))
{
//$quoted = substr($this->message, $this->pos1, 6) === '&quot;';
$quoted = substr_compare($this->message, '&quot;', $this->pos1, 6) === 0;
if ($tag[Codes::ATTR_QUOTED] !== Codes::OPTIONAL && !$quoted)
{
return true;
}
if ($quoted)
{
$this->pos1 += 6;
}
}
else
{
$quoted = false;
}
$this->pos2 = strpos($this->message, $quoted === false ? ']' : '&quot;]', $this->pos1);
if ($this->pos2 === false)
{
return true;
}
$data = substr($this->message, $this->pos1, $this->pos2 - $this->pos1);
// Validation for my parking, please!
if (isset($tag[Codes::ATTR_VALIDATE]))
{
$tag[Codes::ATTR_VALIDATE]($tag, $data, $this->bbc->getDisabled());
}
// For parsed content, we must recurse to avoid security problems.
if ($tag[Codes::ATTR_TYPE] !== Codes::TYPE_UNPARSED_EQUALS)
{
//var_dump($this->message, $tag, $data);
$this->recursiveParser($data, $tag);
}
$tag[Codes::ATTR_AFTER] = strtr($tag[Codes::ATTR_AFTER], array('$1' => $data));
$this->addOpenTag($tag);
$code = strtr($tag[Codes::ATTR_BEFORE], array('$1' => $data));
//$this->message = substr($this->message, 0, $this->pos) . "\n" . $code . "\n" . substr($this->message, $this->pos2 + ($quoted === false ? 1 : 7));
//$this->message = substr_replace($this->message, "\n" . $code . "\n", $this->pos, $this->pos2 + ($quoted === false ? 1 : 7) - $this->pos);
$tmp = $this->noSmileys($code);
$this->message = substr_replace($this->message, $tmp, $this->pos, $this->pos2 + ($quoted === false ? 1 : 7) - $this->pos);
//$this->pos += strlen($code) - 1 + 2;
$this->pos += strlen($tmp) - 1;
break;
}
return false;
}
// @todo I don't know what else to call this. It's the area that isn't a tag.
protected function betweenTags()
{
// Make sure the $this->last_pos is not negative.
$this->last_pos = max($this->last_pos, 0);
// Pick a block of data to do some raw fixing on.
$data = substr($this->message, $this->last_pos, $this->pos - $this->last_pos);
// Take care of some HTML!
if (!empty($GLOBALS['modSettings']['enablePostHTML']) && strpos($data, '&lt;') !== false)
{
// @todo new \Parser\BBC\HTML;
$this->parseHTML($data);
}
// @todo is this sending tags like [/b] here?
if (!empty($GLOBALS['modSettings']['autoLinkUrls']))
{
$this->autoLink($data);
}
// @todo can this be moved much earlier?
$data = str_replace("\t", '&nbsp;&nbsp;&nbsp;', $data);
// If it wasn't changed, no copying or other boring stuff has to happen!
//if ($data !== substr($this->message, $this->last_pos, $this->pos - $this->last_pos))
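// substr_compare() only looks at the first N bytes, so a length change has to be checked separately.
if (strlen($data) !== $this->pos - $this->last_pos || ($this->pos > $this->last_pos && substr_compare($this->message, $data, $this->last_pos, $this->pos - $this->last_pos) !== 0))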
{
//$this->message = substr($this->message, 0, $this->last_pos) . $data . substr($this->message, $this->pos);
$this->message = substr_replace($this->message, $data, $this->last_pos, $this->pos - $this->last_pos);
// Since we changed it, look again in case we added or removed a tag. But we don't want to skip any.
$old_pos = strlen($data) + $this->last_pos;
$this->pos = strpos($this->message, '[', $this->last_pos);
$this->pos = $this->pos === false ? $old_pos : min($this->pos, $old_pos);
}
}
protected function handleFootnotes()
{
global $fn_num, $fn_content, $fn_count;
static $fn_total;
// @todo temporary until we have nesting
$this->message = str_replace(array('[footnote]', '[/footnote]'), '', $this->message);
$fn_num = 0;
$fn_content = array();
$fn_count = isset($fn_total) ? $fn_total : 0;
// Replace our footnote text with a [1] link, save the text for use at the end of the message
$this->message = preg_replace_callback('~(%fn%(.*?)%fn%)~is', 'footnote_callback', $this->message);
$fn_total += $fn_num;
// If we have footnotes, add them in at the end of the message
if (!empty($fn_num))
{
$this->message .= '<div class="bbc_footnotes">' . implode('', $fn_content) . '</div>';
}
}
protected function handleDisabled(&$tag)
{
if (!isset($tag[Codes::ATTR_DISABLED_BEFORE]) && !isset($tag[Codes::ATTR_DISABLED_AFTER]) && !isset($tag[Codes::ATTR_DISABLED_CONTENT]))
{
$tag[Codes::ATTR_BEFORE] = !empty($tag[Codes::ATTR_BLOCK_LEVEL]) ? '<div>' : '';
$tag[Codes::ATTR_AFTER] = !empty($tag[Codes::ATTR_BLOCK_LEVEL]) ? '</div>' : '';
$tag[Codes::ATTR_CONTENT] = $tag[Codes::ATTR_TYPE] === Codes::TYPE_CLOSED ? '' : (!empty($tag[Codes::ATTR_BLOCK_LEVEL]) ? '<div>$1</div>' : '$1');
}
elseif (isset($tag[Codes::ATTR_DISABLED_BEFORE]) || isset($tag[Codes::ATTR_DISABLED_AFTER]))
{
$tag[Codes::ATTR_BEFORE] = isset($tag[Codes::ATTR_DISABLED_BEFORE]) ? $tag[Codes::ATTR_DISABLED_BEFORE] : (!empty($tag[Codes::ATTR_BLOCK_LEVEL]) ? '<div>' : '');
$tag[Codes::ATTR_AFTER] = isset($tag[Codes::ATTR_DISABLED_AFTER]) ? $tag[Codes::ATTR_DISABLED_AFTER] : (!empty($tag[Codes::ATTR_BLOCK_LEVEL]) ? '</div>' : '');
}
else
{
$tag[Codes::ATTR_CONTENT] = $tag[Codes::ATTR_DISABLED_CONTENT];
}
}
protected function oldMatchParameters($possible, &$matches)
{
$preg = array();
foreach ($possible[Codes::ATTR_PARAM] as $p => $info)
{
$preg[] = '(\s+' . $p . '=' . (empty($info[Codes::PARAM_ATTR_QUOTED]) ? '' : '&quot;') . (isset($info[Codes::PARAM_ATTR_MATCH]) ? $info[Codes::PARAM_ATTR_MATCH] : '(.+?)') . (empty($info[Codes::PARAM_ATTR_QUOTED]) ? '' : '&quot;') . ')' . (empty($info[Codes::PARAM_ATTR_OPTIONAL]) ? '' : '?');
}
// Okay, this may look ugly and it is, but it's not going to happen much and it is the best way
// of allowing any order of parameters but still parsing them right.
$match = false;
$orders = permute($preg);
$message_stub = substr($this->message, $this->pos1 - 1);
foreach ($orders as $p)
{
if (preg_match('~^' . implode('', $p) . '\]~i', $message_stub, $matches) != 0)
{
$match = true;
break;
}
}
return $match;
}
// @todo change to returning matches. If array() continue
protected function matchParameters(array &$possible, &$matches)
{
if (!isset($possible['preg_cache']))
{
$possible['preg_cache'] = array();
foreach ($possible[Codes::ATTR_PARAM] as $p => $info)
{
$possible['preg_cache'][] = '(\s+' . $p . '=' . (empty($info[Codes::PARAM_ATTR_QUOTED]) ? '' : '&quot;') . (isset($info[Codes::PARAM_ATTR_MATCH]) ? $info[Codes::PARAM_ATTR_MATCH] : '(.+?)') . (empty($info[Codes::PARAM_ATTR_QUOTED]) ? '' : '&quot;') . ')' . (empty($info[Codes::PARAM_ATTR_OPTIONAL]) ? '' : '?');
}
$possible['preg_size'] = count($possible['preg_cache']) - 1;
$possible['preg_keys'] = range(0, $possible['preg_size']);
}
$preg = $possible['preg_cache'];
$param_size = $possible['preg_size'];
$preg_keys = $possible['preg_keys'];
// Okay, this may look ugly and it is, but it's not going to happen much and it is the best way
// of allowing any order of parameters but still parsing them right.
//$param_size = count($possible['preg_cache']) - 1;
//$preg_keys = range(0, $param_size);
$message_stub = substr($this->message, $this->pos1 - 1);
// If an addon adds many parameters we can exceed max_execution_time, so cap the number of permutations tried.
// N parameters give N! orderings: 7! = 5,040, 8! = 40,320, etc.
$max_iterations = 5040;
// Step, one by one, through all possible permutations of the parameters until we have a match
do {
$match_preg = '~^';
foreach ($preg_keys as $key)
{
$match_preg .= $possible['preg_cache'][$key];
}
$match_preg .= '\]~i';
// Check if this combination of parameters matches the user input
$match = preg_match($match_preg, $message_stub, $matches) !== 0;
} while (!$match && --$max_iterations && ($preg_keys = pc_next_permutation($preg_keys, $param_size)));
return $match;
}
// This allows BBC to be parsed inside parameters, like [quote author="[url]www.quotes.com[/url]"]Something famous.[/quote]
protected function recursiveParser(&$data, $tag)
{
// @todo if parsed tags allowed is empty, return?
//var_dump('handleParsedEquals', $this->message);
//$data = parse_bbc($data, !empty($tag[Codes::ATTR_PARSED_TAGS_ALLOWED]) ? false : true, '', !empty($tag[Codes::ATTR_PARSED_TAGS_ALLOWED]) ? $tag[Codes::ATTR_PARSED_TAGS_ALLOWED] : array());
//$data = parse_bbc($data);
//parse_bbc('dummy');
//return $data;
//var_dump($tag[Codes::ATTR_PARSED_TAGS_ALLOWED], $this->bbc->getTags());die;
$bbc = clone $this->bbc;
//$old_bbc = $this->bbc->getForParsing();
if (!empty($tag[Codes::ATTR_PARSED_TAGS_ALLOWED]))
{
// Restrict the clone only; removing tags from $this->bbc would affect the outer parse as well.
foreach ($bbc->getTags() as $code)
{
if (!in_array($code, $tag[Codes::ATTR_PARSED_TAGS_ALLOWED]))
{
$bbc->removeTag($code);
}
}
}
//$this->bbc_codes = $this->bbc->getForParsing();
$parser = new \BBC\Parser($bbc);
$data = $parser->parse($data);
//$data = $this->parse($data);
// set it back
//$this->bbc_codes = $old_bbc;
}
protected function addOpenTag($tag)
{
$this->open_tags[] = $tag;
}
// if false, close the last one
protected function closeOpenedTag($tag = false)
{
if ($tag === false)
{
//$return = end($this->open_tags);
//unset($this->open_tags[key($this->open_tags)]);
//return $return;
return array_pop($this->open_tags);
}
elseif (isset($this->open_tags[$tag]))
{
$return = $this->open_tags[$tag];
unset($this->open_tags[$tag]);
return $return;
}
}
protected function hasOpenTags()
{
return !empty($this->open_tags);
}
protected function getLastOpenedTag()
{
return end($this->open_tags);
}
protected function getOpenedTags($tags_only = false)
{
if (!$tags_only)
{
return $this->open_tags;
}
$tags = array();
foreach ($this->open_tags as $tag)
{
$tags[] = $tag[Codes::ATTR_TAG];
}
return $tags;
}
protected function trimWhiteSpace(&$message, $offset = null)
{
/*
OUTSIDE
if ($tag[Codes::ATTR_TRIM] != 'inside' && preg_match('~(<br />|&nbsp;|\s)*~', substr($message, $pos), $matches) != 0)
$message = substr($message, 0, $pos) . substr($message, $pos + strlen($matches[0]));
if ($tag[Codes::ATTR_TRIM] != 'inside' && preg_match('~(<br />|&nbsp;|\s)*~', substr($message, $pos), $matches) != 0)
$message = substr($message, 0, $pos) . substr($message, $pos + strlen($matches[0]));
INSIDE
if ($tag[Codes::ATTR_TRIM] != 'outside' && preg_match('~(<br />|&nbsp;|\s)*~', substr($message, $pos), $matches) != 0)
$message = substr($message, 0, $pos) . substr($message, $pos + strlen($matches[0]));
*/
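// Default to the current cursor when no offset is given.
$offset = $offset === null ? $this->pos : $offset;
// \G anchors the match at $offset, so only whitespace starting exactly there is removed.
if (preg_match('~\G(?:<br />|&nbsp;|\s)+~', $this->message, $matches, 0, $offset) === 1)
{
//$this->message = substr($this->message, 0, $this->pos) . substr($this->message, $this->pos + strlen($matches[0]));
$this->message = substr_replace($this->message, '', $offset, strlen($matches[0]));
}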
}
protected function insertAtCursor($string, $offset)
{
$this->message = substr_replace($this->message, $string, $offset, 0);
}
protected function removeChars($offset, $length)
{
$this->message = substr_replace($this->message, '', $offset, $length);
}
protected function setupTagParameters($possible, $matches)
{
$params = array();
for ($i = 1, $n = count($matches); $i < $n; $i += 2)
{
$key = strtok(ltrim($matches[$i]), '=');
if (isset($possible[Codes::ATTR_PARAM][$key][Codes::PARAM_ATTR_VALUE]))
{
$params['{' . $key . '}'] = strtr($possible[Codes::ATTR_PARAM][$key][Codes::PARAM_ATTR_VALUE], array('$1' => $matches[$i + 1]));
}
// @todo it's not validating it. it is filtering it
elseif (isset($possible[Codes::ATTR_PARAM][$key][Codes::ATTR_VALIDATE]))
{
$params['{' . $key . '}'] = $possible[Codes::ATTR_PARAM][$key][Codes::ATTR_VALIDATE]($matches[$i + 1]);
}
else
{
$params['{' . $key . '}'] = $matches[$i + 1];
}
// Just to make sure: replace any $ or { so they can't interpolate wrongly.
$params['{' . $key . '}'] = str_replace(array('$', '{'), array('&#036;', '&#123;'), $params['{' . $key . '}']);
}
foreach ($possible[Codes::ATTR_PARAM] as $p => $info)
{
if (!isset($params['{' . $p . '}']))
{
$params['{' . $p . '}'] = '';
}
}
// We found our tag
$tag = $possible;
// Put the parameters into the string.
if (isset($tag[Codes::ATTR_BEFORE]))
{
$tag[Codes::ATTR_BEFORE] = strtr($tag[Codes::ATTR_BEFORE], $params);
}
if (isset($tag[Codes::ATTR_AFTER]))
{
$tag[Codes::ATTR_AFTER] = strtr($tag[Codes::ATTR_AFTER], $params);
}
if (isset($tag[Codes::ATTR_CONTENT]))
{
$tag[Codes::ATTR_CONTENT] = strtr($tag[Codes::ATTR_CONTENT], $params);
}
$this->pos1 += strlen($matches[0]) - 1;
return $tag;
}
protected function isOpen($tag)
{
foreach ($this->open_tags as $open)
{
if ($open[Codes::ATTR_TAG] === $tag)
{
return true;
}
}
return false;
}
protected function isItemCode($char)
{
return isset($this->item_codes[$char]);
}
protected function closeNonBlockLevel()
{
$n = count($this->open_tags) - 1;
while ($n >= 0 && empty($this->open_tags[$n][Codes::ATTR_BLOCK_LEVEL]))
{
$n--;
}
// Close all the non block level tags so this tag isn't surrounded by them.
for ($i = count($this->open_tags) - 1; $i > $n; $i--)
{
//$this->message = substr_replace($this->message, "\n" . $this->open_tags[$i][Codes::ATTR_AFTER] . "\n", $this->pos, 0);
$tmp = $this->noSmileys($this->open_tags[$i][Codes::ATTR_AFTER]);
$this->message = substr_replace($this->message, $tmp, $this->pos, 0);
//$ot_strlen = strlen($this->open_tags[$i][Codes::ATTR_AFTER]);
$ot_strlen = strlen($tmp);
//$this->pos += $ot_strlen + 2;
$this->pos += $ot_strlen;
//$this->pos1 += $ot_strlen + 2;
$this->pos1 += $ot_strlen;
// Trim or eat trailing stuff... see comment at the end of the big loop.
if (!empty($this->open_tags[$i][Codes::ATTR_BLOCK_LEVEL]) && substr_compare($this->message, '<br />', $this->pos, 6) === 0)
{
$this->message = substr_replace($this->message, '', $this->pos, 6);
}
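// $tag does not exist in this method; use the tag that is being closed.
if (isset($this->open_tags[$i][Codes::ATTR_TRIM]) && $this->open_tags[$i][Codes::ATTR_TRIM] !== Codes::TRIM_INSIDE)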
{
$this->trimWhiteSpace($this->message, $this->pos);
}
$this->closeOpenedTag();
}
}
protected function noSmileys($string)
{
return "\n" . $string . "\n";
}
protected function parseSmileys($string)
{
if ($this->do_smileys === true)
{
// Calls the global parseSmileys() helper, which modifies $string by reference.
parseSmileys($string);
}
//$string = "\n" . $string . "\n";
return $string;
}
protected function tokenize($message)
{
$split_string = $this->getTokenRegex();
// A limit of -1 means "no limit"; each part comes back as array(text, offset).
$msg_parts = preg_split($split_string, $message, -1, PREG_SPLIT_OFFSET_CAPTURE | PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
return $msg_parts;
}
protected function getTokenRegex()
{
// @todo itemcodes should be ([\n \t;>][itemcode])
$split_chars = array('(' . preg_quote(']') . ')');
// Get a list of just tags
$tags = $this->bbc->getTags();
// Sort the tags by their length
usort($tags, function ($a, $b) {
// @todo micro-optimization but we could store the lengths of the tags as the val and make the tag the key. Then sort on the key
return strlen($b) - strlen($a);
});
foreach ($tags as $bbc)
{
$split_chars[] = '(' . preg_quote('[' . $bbc) . ')';
// Closing tags are easy. They must have [/.*]
$split_chars[] = '(' . preg_quote('[/' . $bbc) . '])';
}
return '~' . implode('|', $split_chars) . '~';
}
}
<?php
// @todo change to \StringParser\BBC
namespace BBC;
use \BBC\Codes;
/**
* A tag is the name of a code. Like [url]. The tag is "url".
* A code is the instructions, including the name (tag).
* Each tag can have many codes
*/
class PregParser
{
protected $message;
/**
* \BBC\Codes
*/
protected $bbc;
/**
* An array of BBC.
* [$tag]* => [
* [code],
* [code]
* ]*
*/
protected $bbc_codes;
protected $item_codes;
protected $next_closing_bracket = 0;
public function __construct(Codes $bbc)
{
$this->bbc = $bbc;
$this->bbc_codes = $this->bbc->getForParsing();
$this->item_codes = $this->bbc->getItemCodes();
//$this->tags = $this->bbc->getTags();
}
public function resetParser()
{
//$this->tags = null;
$this->pos = null;
$this->pos1 = null;
$this->pos2 = null;
$this->last_pos = null;
$this->open_tags = array();
$this->open_bbc = new \SplStack;
$this->do_autolink = true;
$this->inside_tag = null;
$this->lastAutoPos = 0;
}
public function parse($message)
{
$this->message = &$message;
// Don't waste cycles
if ($this->message === '')
{
return '';
}
// Clean up any cut/paste issues we may have
$this->message = sanitizeMSCutPaste($this->message);
// Unfortunately, this has to be done here because smileys are parsed as blocks between BBC
// @todo remove from here and make the caller figure it out
if (!$this->parsingEnabled())
{
if ($this->do_smileys)
{
parsesmileys($this->message);
}
return $this->message;
}
$this->resetParser();
// Get the BBC
$bbc_codes = $this->bbc_codes;
$bbc = $this->bbc->getCodesGroupedByTag();
// @todo change this to <br> (it will break tests)
$this->message = str_replace("\n", '<br />', $this->message);
$msg_parts = $this->tokenize($message);
// The old way was very complex. Here's how it should be done:
// * split the message in to parts
// * loop through each part
// * check if there is a closing bracket (might just be a smiley or another character, but check anyway)
// * check if the part is a tag
// * figure out which code it belongs to
// *
$break = false;
$next_closing_bracket = 0;
// don't use a foreach so we can jump around the array without worrying about a cursor
for ($num_parts = count($msg_parts), $pos = 0; $pos < $num_parts && !$break; $pos++)
{
list($part, $offset) = $msg_parts[$pos];
// @todo this needs to get rid of the substr. Better to just make the array account for [ and [/
// What if we just searched for [ and then checked if the next element is a code?
$possible_tag = substr($part, 1);
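// A one-character part such as "]" leaves nothing after substr(), so guard the index.
$possible_closer = isset($possible_tag[0]) && $possible_tag[0] === '/';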
if ($possible_closer)
{
$possible_tag = substr($possible_tag, 1);
}
if ($this->isTag($bbc, $possible_tag))
{
// Don't open the tag yet, we need to look ahead and see if there is a ]
// We might even already know where it's at
$next_closing_bracket = $next_closing_bracket > $pos ? $next_closing_bracket : $this->lookAhead($msg_parts, $pos, ']');
if ($next_closing_bracket !== -1)
{
// Starts with a /, has an open tag, that tag matches our possible tag, and the closing bracket is next...
// We have a closer!
if ($possible_closer && !empty($last_open_tag) && $last_open_tag[Codes::ATTR_TAG] === $possible_tag && $next_closing_bracket === $pos + 1)
{
// When we close is actually when we handle all of the parsing like autolink, smilies, before/after/content
// Then we put it in to a message string
// Since we don't need to go backwards, we can pop off the previous array elements to save space
$this->closeCode();
continue;
}
// Okay, open the tag now.
$tag = $this->findTag($bbc[$possible_tag], $msg_parts, $pos, $next_closing_bracket);
// No tag found
if ($tag === null)
{
continue;
}
// Itemcodes are tags too. Just very different
if (!empty($tag[Codes::ATTR_ITEMCODE]))
{
$this->handleItemCode($tag);
}
// If this is block level and the last tag isn't, we need to close it.
// Non-block level tags can't wrap block level tags
if (!empty($tag[Codes::ATTR_BLOCK_LEVEL]) && empty($last_open_tag[Codes::ATTR_BLOCK_LEVEL]))
{
$this->closeNonBlockLevelTags();
}
// Open the code.
// Also sets the disallowed children
$this->openCode($tag);
}
}
}
// If there is anything remaining, close it
$this->closeRemainingCodes();
return $this->message;
}
protected function tokenize($message)
{
$split_string = $this->getTokenRegex();
// A limit of -1 means "no limit"; each part comes back as array(text, offset).
$msg_parts = preg_split($split_string, $message, -1, PREG_SPLIT_OFFSET_CAPTURE | PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
return $msg_parts;
}
protected function getTokenRegex()
{
// @todo itemcodes should be ([\n \t;>][itemcode])
$split_chars = array('(' . preg_quote(']') . ')');
foreach ($this->bbc->getTags() as $bbc)
{
$split_chars[] = '(' . preg_quote('[' . $bbc) . ')';
// Closing tags are easy. They must have [/.*]
$split_chars[] = '(' . preg_quote('[/' . $bbc) . '])';
}
return '~' . implode('|', $split_chars) . '~';
}
protected function isTag($bbc, $possible_tag)
{
return isset($bbc[$possible_tag]);
}
// This is used when opening a code. If there is no closing bracket, it's not actually a code.
protected function getNextClosingBracket()
{
}
protected function lookAhead($array, $pos, $look_for, $count = false)
{
$len = $count === false ? count($array) : $count;
for ($i = $pos; $i < $len; $i++)
{
if ($array[$i] === $look_for)
{
return $i;
}
}
// A valid index is never negative, so -1 means "not found"
return -1;
}
// This is pretty much the same as the old parser's findTag()
protected function findTag($possible_codes, $msg_parts, $pos, $next_closing_bracket)
{
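$tag = null;
$next_pos = $pos + 1;
// Nothing after the opening token means it can't be a tag
if (!isset($msg_parts[$next_pos]))
{
return null;
}
// Each token is array(text, offset) because of PREG_SPLIT_OFFSET_CAPTURE
list($next_val, $next_offset) = $msg_parts[$next_pos];
$next_char = $next_val[0];
$is_equals = $next_char === '=';
$is_closing_bracket = $next_char === ']';
$is_forward_slash = $next_char === '/';
$is_space = $next_char === ' ';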
foreach ($possible_codes as $possible)
{
// Let's start by checking the types
// Check if it's an itemcode
if ($is_closing_bracket && !empty($possible[Codes::ATTR_ITEMCODE]))
{
return $possible;
}
// Parameters require a space
if ($next_char !== ' ' && !empty($possible[Codes::ATTR_PARAM]))
{
continue;
}
// Any type with COMMAS or EQUALS in it must have an equal sign as the next character
if ($next_char !== '=' && in_array($possible[Codes::ATTR_TYPE], array(Codes::TYPE_UNPARSED_EQUALS, Codes::TYPE_UNPARSED_COMMAS, Codes::TYPE_UNPARSED_COMMAS_CONTENT, Codes::TYPE_UNPARSED_EQUALS_CONTENT, Codes::TYPE_PARSED_EQUALS)))
{
continue;
}
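// Closed tag? It must be followed by "]", "/]" or " /]"
// @todo this only checks the first character; it might actually need substr_compare here
if ($possible[Codes::ATTR_TYPE] === Codes::TYPE_CLOSED && !$is_closing_bracket && !$is_forward_slash && !$is_space)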
{
continue;
}
// @todo the checks below still use the old offset-based logic ($this->message, $this->pos)
// A test validation?
if (isset($possible[Codes::ATTR_TEST]) && preg_match('~^' . $possible[Codes::ATTR_TEST] . '~', substr($this->message, $this->pos + 1 + $possible[Codes::ATTR_LENGTH] + 1)) === 0)
{
continue;
}
// Do we want parameters?
elseif (!empty($possible[Codes::ATTR_PARAM]))
{
if ($next_char !== ' ')
{
continue;
}
}
elseif ($possible[Codes::ATTR_TYPE] !== Codes::TYPE_PARSED_CONTENT)
{
// Do we need an equal sign?
if ($next_char !== '=' && in_array($possible[Codes::ATTR_TYPE], array(Codes::TYPE_UNPARSED_EQUALS, Codes::TYPE_UNPARSED_COMMAS, Codes::TYPE_UNPARSED_COMMAS_CONTENT, Codes::TYPE_UNPARSED_EQUALS_CONTENT, Codes::TYPE_PARSED_EQUALS)))
{
continue;
}
// Maybe we just want a /...
if ($next_char !== ']' && $possible[Codes::ATTR_TYPE] === Codes::TYPE_CLOSED && substr_compare($this->message, '/]', $this->pos + 1 + $possible[Codes::ATTR_LENGTH], 2) !== 0 && substr_compare($this->message, ' /]', $this->pos + 1 + $possible[Codes::ATTR_LENGTH], 3) !== 0)
{
continue;
}
// An immediate ]?
if ($next_char !== ']' && $possible[Codes::ATTR_TYPE] == Codes::TYPE_UNPARSED_CONTENT)
{
continue;
}
}
// parsed_content demands an immediate ] without parameters!
elseif ($possible[Codes::ATTR_TYPE] === Codes::TYPE_PARSED_CONTENT)
{
if ($next_char !== ']')
{
continue;
}
}
// Check allowed tree?
if (isset($possible[Codes::ATTR_REQUIRE_PARENTS]) && ($this->inside_tag === null || !in_array($this->inside_tag[Codes::ATTR_TAG], $possible[Codes::ATTR_REQUIRE_PARENTS])))
{
continue;
}
elseif (isset($this->inside_tag[Codes::ATTR_REQUIRE_CHILDREN]) && !in_array($possible[Codes::ATTR_TAG], $this->inside_tag[Codes::ATTR_REQUIRE_CHILDREN]))
{
continue;
}
// If this is in the list of disallowed child tags, don't parse it.
elseif (isset($this->inside_tag[Codes::ATTR_DISALLOW_CHILDREN]) && in_array($possible[Codes::ATTR_TAG], $this->inside_tag[Codes::ATTR_DISALLOW_CHILDREN]))
{
continue;
}
// Not allowed in this parent, replace the tags or show it like regular text
elseif (isset($possible[Codes::ATTR_DISALLOW_PARENTS]) && ($this->inside_tag !== null && in_array($this->inside_tag[Codes::ATTR_TAG], $possible[Codes::ATTR_DISALLOW_PARENTS])))
{
if (!isset($possible[Codes::ATTR_DISALLOW_BEFORE], $possible[Codes::ATTR_DISALLOW_AFTER]))
{
continue;
}
$possible[Codes::ATTR_BEFORE] = isset($possible[Codes::ATTR_DISALLOW_BEFORE]) ? $possible[Codes::ATTR_DISALLOW_BEFORE] : $possible[Codes::ATTR_BEFORE];
$possible[Codes::ATTR_AFTER] = isset($possible[Codes::ATTR_DISALLOW_AFTER]) ? $possible[Codes::ATTR_DISALLOW_AFTER] : $possible[Codes::ATTR_AFTER];
}
$this->pos1 = $this->pos + 1 + $possible[Codes::ATTR_LENGTH] + 1;
// This is long, but it makes things much easier and cleaner.
if (!empty($possible[Codes::ATTR_PARAM]))
{
$match = $this->oldMatchParameters($possible, $matches);
// Didn't match our parameter list, try the next possible.
if (!$match)
{
continue;
}
$tag = $this->setupTagParameters($possible, $matches);
}
else
{
$tag = $possible;
}
// Quotes can have alternate styling, we do this php-side due to all the permutations of quotes.
if ($tag[Codes::ATTR_TAG] === 'quote')
{
// Start with standard
$quote_alt = false;
foreach ($this->open_tags as $open_quote)
{
// Every parent quote this quote has flips the styling
if ($open_quote[Codes::ATTR_TAG] === 'quote')
{
$quote_alt = !$quote_alt;
}
}
// Add a class to the quote to style alternating blockquotes
// @todo - Frankly it makes little sense to allow alternate blockquote
// styling without also catering for alternate quoteheader styling.
// I do remember coding that some time back, but it seems to have gotten
// lost somewhere in the Elk processes.
// Come to think of it, it may be better to append a second class rather
// than alter the standard one.
// - Example: class="bbc_quote" and class="bbc_quote alt_quote".
// This would mean simpler CSS for themes (like default) which do not use the alternate styling,
// but would still allow it for themes that want it.
$tag[Codes::ATTR_BEFORE] = str_replace('<blockquote>', '<blockquote class="bbc_' . ($quote_alt ? 'alternate' : 'standard') . '_quote">', $tag[Codes::ATTR_BEFORE]);
}
break;
}
return $tag;
}
protected function implodeChunk($array, $glue, $start, $end, $len = false)
{
$string = '';
$len = $len === false ? count($array) : $len;
for ($i = $start; $i < $end && $i < $len; $i++)
{
$string .= $array[$i] . $glue;
}
return $string;
}
}
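To make the token stream produced by tokenize() concrete, here is a minimal standalone sketch. It assumes a hypothetical two-tag list (`b` and `url`) in place of `$this->bbc->getTags()` and builds the pattern in the same shape as getTokenRegex(); it is an illustration of the `preg_split()` flags, not the parser itself.

```php
<?php
// Hypothetical tag list for illustration; the parser itself asks $this->bbc->getTags().
$tags = array('b', 'url');

// Same shape as getTokenRegex(): "]" plus an opener and a closer per tag.
$split_chars = array('(' . preg_quote(']', '~') . ')');
foreach ($tags as $tag)
{
	$split_chars[] = '(' . preg_quote('[' . $tag, '~') . ')';
	$split_chars[] = '(' . preg_quote('[/' . $tag . ']', '~') . ')';
}
$regex = '~' . implode('|', $split_chars) . '~';

// DELIM_CAPTURE keeps the delimiters as tokens, OFFSET_CAPTURE pairs each
// token with its byte offset, and NO_EMPTY drops empty pieces.
$flags = PREG_SPLIT_OFFSET_CAPTURE | PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY;
$msg_parts = preg_split($regex, '[b]hi[/b]', -1, $flags);

foreach ($msg_parts as $part)
{
	list($text, $offset) = $part;
	echo $offset, ': ', $text, "\n";
}
```

For the input `[b]hi[/b]` this yields four tokens: `[b` at offset 0, `]` at 2, `hi` at 3 and `[/b]` at 5. Each token is an `array(text, offset)` pair, which is why findTag() has to unpack a single element of `$msg_parts` rather than the whole array.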
<?php
/**
* I realize this can and should be done with PHPUnit or similar,
* but this is a simple test script that I am mocking up in Gist editor
* because my computer is in Florida. Once it gets here, this might change.
*
* One thing that won't change is that PHPUnit won't measure the time and memory taken,
* which is one of the main reasons for the rewrite.
*/
namespace BBC;
// Some constants
if (!defined('ITERATIONS'))
{
define('ITERATIONS', 100);
}
if (!defined('DEBUG'))
{
define('DEBUG', true);
}
if (!defined('FAILED_TEST_IS_FATAL'))
{
define('FAILED_TEST_IS_FATAL', false);
}
// Necessary files
require_once 'ParseBBC.php';
require_once 'Codes.php';
require_once 'Parser.php';
require_once 'BBCHelpers.php';
globalSettings();
function globalSettings()
{
global $txt, $scripturl, $context, $modSettings, $user_info;
error_reporting(E_ALL);
ini_set('display_errors', 1);
$scripturl = 'http://localhost';
$txt = array(
'code' => 'code',
'code_select' => 'select',
'quote' => 'quote',
'quote_from' => 'quote from',
'search_on' => 'search on',
'spoiler' => 'spoiler',
// For the smilies
'icon_cheesy' => 'cheesy',
'icon_rolleyes' => 'rolleyes',
'icon_angry' => 'angry',
'icon_laugh' => 'laugh',
'icon_smiley' => 'smile',
'icon_wink' => 'wink',
'icon_grin' => 'grin',
'icon_sad' => 'sad',
'icon_shocked' => 'shocked',
'icon_cool' => 'cool',
'icon_tongue' => 'tongue',
'icon_huh' => 'huh',
'icon_embarrassed' => 'embarrassed',
'icon_lips' => 'lips',
'icon_kiss' => 'kiss',
'icon_cry' => 'cry',
'icon_undecided' => 'undecided',
'icon_angel' => 'angel',
);
$modSettings = array(
'smiley_enable' => false,
'enableBBC' => true,
'todayMod' => '3',
'cache_enable' => false,
'autoLinkUrls' => true,
// These will have to be set to test that block, but that is for later
'max_image_width' => false,
'max_image_height' => false,
'smileys_url' => 'http://www.google.com/smileys',
);
$user_info = array(
'smiley_set' => false,
);
define('SUBSDIR', __DIR__);
}
function tests($input)
{
$bbc = new Codes;
$parser = new Parser($bbc);
setupOldParseBBCGlobals();
parse_bbc(false);
$old_method = getOldMethod();
$new_method = getNewMethod($parser);
$messages = getMessages(isset($input['msg']) ? $input['msg'] : null);
$results = array();
foreach ($messages as $k => $message)
{
$old_result = $old_method($message);
$new_result = $new_method($message);
$pass = $old_result === $new_result;
$results[$k] = array(
'pass' => $pass,
'message' => $message,
// I hate wasting memory like this, but datatables complains about colspan
//'return' => $pass ? array('old' => $old_result) : array('old' => $old_result, 'new' => $new_result),
'return' => array('old' => $old_result, 'new' => $new_result),
);
if (!$pass && $input['fatal'])
{
return $results;
}
}
return $results;
}
// @todo randomize which goes first
function benchmark($input)
{
$messages = getMessages(isset($input['msg']) ? $input['msg'] : null);
$iterations = $input['iterations'];
$results = array(
'num_messages' => count($messages),
'iterations' => $iterations,
//'total_time' => array('old' => 0, 'new' => 0),
);
setupOldParseBBCGlobals();
// This needs to run first to even the playing field.
// Of course old will always win here.
$results['codes'] = array(
'old' => runBenchmark(function (){
parse_bbc(false);
}, $iterations),
'new' => runBenchmark(function (){
new Codes;
}, $iterations),
);
// Setup the BBC for the new method
$parser = new Parser(new Codes);
$methods = array(
'old' => function () use($messages, $iterations) {
return runBenchmark(function () use ($messages) {
foreach($messages as $message)
{
parse_bbc($message);
}
}, $iterations);
},
'new' => function () use ($messages, $parser, $iterations) {
return runBenchmark(function () use ($messages, $parser) {
foreach($messages as $message)
{
$parser->parse($message);
}
}, $iterations);
},
);
shuffle_assoc($methods);
// Now the messages
foreach ($methods as $name => $method)
{
$results['all'][$name] = $method();
}
$methods = array(
'old' => function ($message) use($iterations) {
return runBenchmark(function () use ($message) {
return parse_bbc($message);
}, $iterations, true);
},
'new' => function ($message) use ($parser, $iterations) {
return runBenchmark(function () use (&$message, $parser) {
return $parser->parse($message);
}, $iterations, true);
},
);
// Individual messages to see if there is one that is screwing things up
foreach ($messages as $i => $message)
{
// Every message is a new test
shuffle_assoc($methods);
foreach ($methods as $name => $method)
{
$results[$i][$name] = $method($message);
}
$results[$i]['pass'] = $results[$i]['old']['result'] === $results[$i]['new']['result'];
$results[$i]['message'] = $message;
}
// Setup the diffs
foreach ($results as &$result)
{
if (!is_array($result))
{
continue;
}
// Figure out the order of the test
$order = array();
foreach ($result as $attr => $dummy)
{
if (in_array($attr, array('new', 'old')))
{
$order[] = $attr;
}
}
$result['order'] = implode(',', $order);
$result['time_diff'] = $result['old']['total_time'] - $result['new']['total_time'];
$result['time_winner'] = $result['old']['total_time'] > $result['new']['total_time'] ? 'new' : 'old';
if ($result['old']['total_time'] == 0)
{
$result['time_diff_percent'] = 0;
}
else
{
$result['time_diff_percent'] = round(($result['time_diff'] / $result['old']['total_time']) * 100, 2);
}
$result['mem_diff'] = max($result['old']['memory_usage'], $result['new']['memory_usage']) - min($result['old']['memory_usage'], $result['new']['memory_usage']);
$result['mem_winner'] = $result['old']['memory_usage'] > $result['new']['memory_usage'] ? 'new' : 'old';
$result['peak_mem_diff'] = max($result['old']['memory_peak_after'], $result['new']['memory_peak_after']) - min($result['old']['memory_peak_after'], $result['new']['memory_peak_after']);
$result['peak_mem_winner'] = $result['old']['memory_peak_after'] > $result['new']['memory_peak_after'] ? 'new' : 'old';
}
// Break the reference so later uses of $result can't clobber the last entry
unset($result);
return $results;
}
function setupOldParseBBCGlobals()
{
global $bbc_codes, $itemcodes, $no_autolink_tags;
global $disabled, $default_disabled, $parse_tag_cache;
$bbc_codes = array();
$itemcodes = array();
$no_autolink_tags = array();
$disabled = null;
$default_disabled = null;
$parse_tag_cache = null;
}
function resetParseTagCache()
{
$GLOBALS['parse_tag_cache'] = null;
}
function getOldMethod()
{
return function ($message) {
return parse_bbc($message);
};
}
function getNewMethod($parser)
{
return function ($message) use ($parser){
return $parser->parse($message);
};
}
function getMessages($msg_id = null)
{
$messages = require 'Messages.php';
if ($msg_id !== null)
{
// Get a list of messages
if (is_array($msg_id))
{
foreach ($messages as $k => $v)
{
if (!in_array($k, $msg_id))
{
unset($messages[$k]);
}
}
}
// Get a single message
elseif (isset($messages[$msg_id]))
{
$messages = array($msg_id => $messages[$msg_id]);
}
else
{
$messages = array();
}
}
return $messages;
}
function runVSBenchmark($name, callable $old, callable $new)
{
$results = array(
'old' => runBenchmark($old),
'new' => runBenchmark($new),
);
return $results;
}
function runBenchmark($callback, $iterations = ITERATIONS, $save_result = false)
{
$diagnostics = array(
'iterations' => $iterations,
'memory_before' => memory_get_usage(),
'memory_peak_before' => memory_get_peak_usage(),
'time_before' => microtime(true),
);
for ($i = 0; $i < $iterations; $i++)
{
// This is here because parse_bbc() has a $parse_tag_cache which is normally static
// but we made it global for the purpose of this test. So because it will attempt
// to cache the $parse_tags, it will artificially be faster than the new method.
resetParseTagCache();
if ($save_result)
{
if (!isset($result))
{
$result = $callback();
}
else
{
$callback();
}
}
else
{
$callback();
}
}
$diagnostics['result'] = isset($result) ? $result : null;
$diagnostics['time_after'] = microtime(true);
$diagnostics['memory_after'] = memory_get_usage();
$diagnostics['memory_peak_after'] = memory_get_peak_usage();
// @todo make sure this isn't less
$diagnostics['memory_usage'] = $diagnostics['memory_after'] - $diagnostics['memory_before'];
$diagnostics['total_time'] = round($diagnostics['time_after'] - $diagnostics['time_before'], 6);
return $diagnostics;
}
function debug()
{
if (!DEBUG)
{
return;
}
$args = func_get_args();
foreach ($args as $arg)
{
var_dump($arg);
}
}
// PHP's shuffle() drops the keys, so shuffle the keys and rebuild the array
function shuffle_assoc(&$array)
{
$keys = array_keys($array);
shuffle($keys);
$new = array();
foreach($keys as $key)
{
$new[$key] = $array[$key];
}
$array = $new;
return true;
}
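As a quick check on how benchmark() turns two timings into a winner and a percentage, here is a small worked sketch. The timings are made up for illustration; real values would come from the `total_time` entries returned by runBenchmark().

```php
<?php
// Made-up timings in seconds; real values come from runBenchmark()['total_time'].
$old_time = 0.50;
$new_time = 0.40;

// Same arithmetic as the "Setup the diffs" loop in benchmark().
$time_diff = $old_time - $new_time;
$time_winner = $old_time > $new_time ? 'new' : 'old';
$time_diff_percent = $old_time == 0 ? 0 : round(($time_diff / $old_time) * 100, 2);

echo $time_winner, ' wins by ', $time_diff_percent, "%\n";
```

With these numbers the new parser wins by 20%. A positive `time_diff_percent` means the new parser was faster; a negative one means the old parser was.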
<?php
$num_tests = count($results);
$num_pass = 0;
foreach ($results as $result)
{
if ($result['pass'])
{
$num_pass++;
}
}
$num_fail = $num_tests - $num_pass;
?>
Tests: <?= $num_tests ?><br>
Pass: <?= $num_pass ?><br>
Fail: <?= $num_fail ?><br>
<form method="get" action="index.php?type=test">
<input type="hidden" name="type" value="test">
<table class="table table-striped table-bordered table-condensed" data-page-length="1000">
<colgroup>
<col class="col-md-1">
<col class="col-md-3">
<col class="col-md-4">
<col class="col-md-4">
</colgroup>
<thead>
<tr>
<th>#</th>
<th>Message</th>
<th>Old Result</th>
<th>New Result</th>
</tr>
</thead>
<tbody>
<?php
foreach ($results as $test_num => $result)
{
echo '<!-- TEST #', $test_num, ' -->';
echo $result['pass'] ? '<tr>' : '<tr class="danger">';
echo '
<th scope="row" class="form-group"><input type="checkbox" id="msg_', $test_num, '" name="msg[]" value="', $test_num, '">&nbsp;<label for="msg_', $test_num, '">', $test_num, '</label></th>
<td>
<div class="code">', htmlspecialchars($result['message']), '</div>
</td>';
// I really hate outputting both results since they are the same, but datatables complains about colspan
/*if ($result['pass'])
{
echo '
<td colspan="2">
<div class="code">', htmlspecialchars($result['return']['old']), '</div>
</td>';
}
else
{
echo '
<td>
<div class="code">', htmlspecialchars($result['return']['old']), '</div>
</td>
<td>
<div class="code">', htmlspecialchars($result['return']['new']), '</div>
</td>';
*/
echo '
<td>
<div class="code">', htmlspecialchars($result['return']['old']), '</div>
<div class="code">', $result['return']['old'], '</div>
</td>
<td>
<div class="code">', htmlspecialchars($result['return']['new']), '</div>
<div class="code">', $result['return']['new'], '</div>
</td>';
echo '</tr>';
echo '<!-- // END TEST #', $test_num, ' -->';
}
?>
</tbody>
</table>
<button type="submit">Submit</button>
</form>
<?php
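// Strip line endings and skip blank lines so they aren't counted as test 0
$lines = file('top_time_diff_perc.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);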
$tests = array();
foreach ($lines as $line)
{
$result = explode(',', $line);
$result_size = count($result);
foreach ($result as $pos => $test)
{
$val = $result_size - $pos;
$tests[(int) $test] = !isset($tests[(int) $test]) ? $val : $val + $tests[(int) $test];
}
}
arsort($tests);
echo '<pre>
Num tests: ' . count($lines);
foreach ($tests as $num => $test)
{
echo "\nTest num: $num with a value of $test";
}
echo "\n</pre>";
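The scoring in this script is easier to follow with a tiny worked example. The sketch below uses made-up rankings in place of the CSV file and assumes each line lists test numbers from highest-ranked to lowest, so an earlier position earns more points.

```php
<?php
// Made-up rankings; each inner array stands for one CSV line of test numbers.
$runs = array(
	array(7, 3, 12),
	array(3, 7, 12),
	array(7, 12, 3),
);

$scores = array();
foreach ($runs as $ranking)
{
	$size = count($ranking);
	foreach ($ranking as $pos => $test)
	{
		// First place earns $size points, last place earns 1.
		$scores[$test] = (isset($scores[$test]) ? $scores[$test] : 0) + ($size - $pos);
	}
}

// Highest total first, keeping the test numbers as keys.
arsort($scores);
print_r($scores);
```

Here test 7 ends up with 8 points, test 3 with 6 and test 12 with 4, so a test that keeps appearing near the front of the lists rises to the top of the output.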