Created
August 10, 2011 05:11
-
-
Save BrynM/1136176 to your computer and use it in GitHub Desktop.
inBuckets() - A bucketizer function for PHP
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
<?php | |
/** | |
* Find which buckets a range of records belongs in. | |
* | |
* This function is for using bucketed queries and/or memcache keys. Let's say | |
* you want to have some sort of paginated list displaying 10 records per page. | |
* Which pages do entries 23-31 live on? | |
* | |
* <code> | |
* $recordMarkers = inBuckets( 23, 31, 10 ); | |
* var_export( $recordMarkers ); | |
* | |
* // Result: | |
* // array ( | |
* // 2 => | |
* // array ( | |
* // 0 => 20, | |
* // 1 => 29, | |
* // ), | |
* // 3 => | |
* // array ( | |
* // 0 => 30, | |
* // 1 => 39, | |
* // ), | |
* // ) | |
* </code> | |
* | |
* Obviously, you can see where we plugged our numbers in as params to | |
* inBuckets(). Let's look at the result. What we got are the starting and ending | |
* points for each record set we need to look up. So what can we do with that? | |
* | |
* The primary thing we can do is construct predictable cache keys. If you've ever | |
* had to chase down odd cache keys that concatinated the parent function's | |
* parameters, you know that having oddball calls to such functions can lead to | |
* stray cache keys that never get purged. | |
* | |
* For example, say a method called `$blog->getRecords($RECA, $RECB)` creates | |
* keys in the format "blog_records_[$RECA]_[$RECB]". The normal | |
* pagination code is calling calling that method in groups of 100, but the AJAX | |
* code wants groups of 10. Using inBuckets() would let you have the same set of | |
* keys control both calls seamlessly. | |
* | |
* Secondly, we can construct predictable SQL queries. This has a couple of | |
* benefits. Since all of the queries are uniform, we will take greater | |
* advantage of any query caching. Secondly, if queries start landing in a slow | |
* query log, we can spot if the data itself is the issue and not the | |
* construction of the table. | |
* | |
* At this point, you've probably realized that the $size parameter given to | |
* inBuckets() should be somewhat standard. To help this along, the function will | |
* look for a declared constant named DEFAULT_RECORD_BUCKET_SIZE. With set to a | |
* reasonable value (depending on your needs), the third parameter of inBuckets() | |
* can be ommitted for the most part. | |
* | |
* @param int $first The first record you want to grab | |
* @param int $last The last record you want to grab | |
* @param int $size [optional] The size of each bucket. See below for default. | |
* @param int $raw [optional] Return the result as the straight output of PHP's | |
* range() function call. This is generally unweildy as you would need to do the | |
* end reocrd bounds math yourself. | |
* The default bucket size can be controlled by declaring the constant | |
* DEFAULT_RECORD_BUCKET_SIZE with an integer value above 0. | |
* @return array By default, this will return an array of record ranges. | |
* Each record range is itself an array of starting record number and | |
* ending record number. | |
*/ | |
function inBuckets( $first, $last, $size = NULL, $raw = false ) { | |
$ret = false; | |
$first = (int)$first; | |
$last = (int)$last; | |
$size = (int)$size; | |
if ( $size < 1 ) { // size not at least 1? probably asking for the defuault | |
// you can control the default size by defining this before declaring inBuckets() | |
// this is so you can tune the default to your convenience | |
if ( !defined( 'DEFAULT_RECORD_BUCKET_SIZE' ) ) { | |
define( 'DEFAULT_RECORD_BUCKET_SIZE', 10 ); | |
} | |
if ( !is_numeric( DEFAULT_RECORD_BUCKET_SIZE ) || ( DEFAULT_RECORD_BUCKET_SIZE < 1 ) ) { | |
trigger_error( 'DEFAULT_RECORD_BUCKET_SIZE is a non-positive integer! Current value is "' . DEFAULT_RECORD_BUCKET_SIZE . '"! inBuckets() will not work!', E_USER_NOTICE ); | |
return $ret; | |
} | |
$size = DEFAULT_RECORD_BUCKET_SIZE; | |
} | |
// now on to some corrections of values... | |
if ( $first < $size ) { | |
// the first record falls into the first bucket, so the starting record is 0 | |
$first = 0; | |
} | |
// time to look at where the last record falls... | |
if ( $first > $last ) { // last before first? correct to just a single record and throw a notice | |
$last = $first + 1; | |
trigger_error( 'inBuckets() First record (' . $first . ') comes after last record (' . $last . ')?!?!', E_USER_NOTICE ); | |
} | |
// find the bucket start for the bucket that the first record belongs to | |
$bS = ( $first > 0 ) ? floor( $first / $size ) * $size : 0; | |
// find the bucket end for the bucket that the | |
$bE = ( $last > $size ) ? ( ceil( $last / $size ) * $size ) - 1: $size; // last record falls within the first bucket? | |
// get the range set using PHP's built in range() as our initial return | |
$ret = range( $bS, $bE, $size ); | |
// set up our output if desired and available | |
if ( ( $raw == false ) && !empty( $ret ) ) { | |
$bucket = 0; | |
// skip ahead to the start bucket | |
while( ( $bucket * $size ) < $bS ) { | |
$bucket++; | |
} | |
$ran = $ret; | |
$ret = array(); | |
foreach ( $ran as $s ) { | |
if ( ( count( $ret ) > 0 ) && ( $s > ( $last - 1 ) ) ) { | |
break; | |
} | |
$ret[$bucket] = array( $s, ( $s + $size - 1 ) ); | |
$bucket++; | |
} | |
} | |
return $ret; | |
} | |
/* | |
// -------------------------------------------- | |
// SAMPLE CODE | |
// -------------------------------------------- | |
$recF = 23; | |
$recL = 31; | |
$bSize = 10; | |
$recordMarkers = inBuckets( $recF, $recL, $bSize ); | |
print '<h1>Bucket Ranges</h1><pre>inBuckets( ' . $recF . ', ' . $recL . ', ' . $bSize . ' ) =' . PHP_EOL . htmlspecialchars( var_export( $recordMarkers, 1 ) ) . '</pre>'; | |
$queries = array(); | |
$cacheKeys = array(); | |
if ( count( $recordMarkers ) > 0 ) { | |
foreach ( $recordMarkers as $buck => $records ) { | |
$queries[$buck] = 'SELECT things FROM foo LIMIT( ' . implode( ', ', $records ) . ' );'; | |
$cacheKeys[$buck] = 'FOO_THINGS_' . implode( '_TO_', $records ); | |
} | |
} | |
print '<h1>SQL Queries</h1><pre>' . htmlspecialchars( var_export( $queries, 1 ) ) . '</pre>'; | |
print '<h1>Cache Keys</h1><pre>' . htmlspecialchars( var_export( $cacheKeys, 1 ) ) . '</pre>'; | |
*/ |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment