Skip to content

Instantly share code, notes, and snippets.

@BrynM
Created August 10, 2011 05:11
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save BrynM/1136176 to your computer and use it in GitHub Desktop.
Save BrynM/1136176 to your computer and use it in GitHub Desktop.
inBuckets() - A bucketizer function for PHP
<?php
/**
* Find which buckets a range of records belongs in.
*
* This function is for using bucketed queries and/or memcache keys. Let's say
* you want to have some sort of paginated list displaying 10 records per page.
* Which pages do entries 23-31 live on?
*
* <code>
* $recordMarkers = inBuckets( 23, 31, 10 );
* var_export( $recordMarkers );
*
* // Result:
* // array (
* // 2 =&gt;
* // array (
* // 0 =&gt; 20,
* // 1 =&gt; 29,
* // ),
* // 3 =&gt;
* // array (
* // 0 =&gt; 30,
* // 1 =&gt; 39,
* // ),
* // )
* </code>
*
* Obviously, you can see where we plugged our numbers in as params to
* inBuckets(). Let's look at the result. What we got are the starting and ending
* points for each record set we need to look up. So what can we do with that?
*
* The primary thing we can do is construct predictable cache keys. If you've ever
* had to chase down odd cache keys that concatinated the parent function's
* parameters, you know that having oddball calls to such functions can lead to
* stray cache keys that never get purged.
*
* For example, say a method called `$blog->getRecords($RECA, $RECB)` creates
* keys in the format &quot;blog_records_[$RECA]_[$RECB]&quot;. The normal
* pagination code is calling calling that method in groups of 100, but the AJAX
* code wants groups of 10. Using inBuckets() would let you have the same set of
* keys control both calls seamlessly.
*
* Secondly, we can construct predictable SQL queries. This has a couple of
* benefits. Since all of the queries are uniform, we will take greater
* advantage of any query caching. Secondly, if queries start landing in a slow
* query log, we can spot if the data itself is the issue and not the
* construction of the table.
*
* At this point, you've probably realized that the $size parameter given to
* inBuckets() should be somewhat standard. To help this along, the function will
* look for a declared constant named DEFAULT_RECORD_BUCKET_SIZE. With set to a
* reasonable value (depending on your needs), the third parameter of inBuckets()
* can be ommitted for the most part.
*
* @param int $first The first record you want to grab
* @param int $last The last record you want to grab
* @param int $size [optional] The size of each bucket. See below for default.
* @param int $raw [optional] Return the result as the straight output of PHP's
* range() function call. This is generally unweildy as you would need to do the
* end reocrd bounds math yourself.
* The default bucket size can be controlled by declaring the constant
* DEFAULT_RECORD_BUCKET_SIZE with an integer value above 0.
* @return array By default, this will return an array of record ranges.
* Each record range is itself an array of starting record number and
* ending record number.
*/
function inBuckets( $first, $last, $size = NULL, $raw = false ) {
$ret = false;
$first = (int)$first;
$last = (int)$last;
$size = (int)$size;
if ( $size < 1 ) { // size not at least 1? probably asking for the defuault
// you can control the default size by defining this before declaring inBuckets()
// this is so you can tune the default to your convenience
if ( !defined( 'DEFAULT_RECORD_BUCKET_SIZE' ) ) {
define( 'DEFAULT_RECORD_BUCKET_SIZE', 10 );
}
if ( !is_numeric( DEFAULT_RECORD_BUCKET_SIZE ) || ( DEFAULT_RECORD_BUCKET_SIZE < 1 ) ) {
trigger_error( 'DEFAULT_RECORD_BUCKET_SIZE is a non-positive integer! Current value is "' . DEFAULT_RECORD_BUCKET_SIZE . '"! inBuckets() will not work!', E_USER_NOTICE );
return $ret;
}
$size = DEFAULT_RECORD_BUCKET_SIZE;
}
// now on to some corrections of values...
if ( $first < $size ) {
// the first record falls into the first bucket, so the starting record is 0
$first = 0;
}
// time to look at where the last record falls...
if ( $first > $last ) { // last before first? correct to just a single record and throw a notice
$last = $first + 1;
trigger_error( 'inBuckets() First record (' . $first . ') comes after last record (' . $last . ')?!?!', E_USER_NOTICE );
}
// find the bucket start for the bucket that the first record belongs to
$bS = ( $first > 0 ) ? floor( $first / $size ) * $size : 0;
// find the bucket end for the bucket that the
$bE = ( $last > $size ) ? ( ceil( $last / $size ) * $size ) - 1: $size; // last record falls within the first bucket?
// get the range set using PHP's built in range() as our initial return
$ret = range( $bS, $bE, $size );
// set up our output if desired and available
if ( ( $raw == false ) && !empty( $ret ) ) {
$bucket = 0;
// skip ahead to the start bucket
while( ( $bucket * $size ) < $bS ) {
$bucket++;
}
$ran = $ret;
$ret = array();
foreach ( $ran as $s ) {
if ( ( count( $ret ) > 0 ) && ( $s > ( $last - 1 ) ) ) {
break;
}
$ret[$bucket] = array( $s, ( $s + $size - 1 ) );
$bucket++;
}
}
return $ret;
}
/*
// --------------------------------------------
// SAMPLE CODE
// --------------------------------------------
$recF = 23;
$recL = 31;
$bSize = 10;
$recordMarkers = inBuckets( $recF, $recL, $bSize );
print '<h1>Bucket Ranges</h1><pre>inBuckets( ' . $recF . ', ' . $recL . ', ' . $bSize . ' ) =' . PHP_EOL . htmlspecialchars( var_export( $recordMarkers, 1 ) ) . '</pre>';
$queries = array();
$cacheKeys = array();
if ( count( $recordMarkers ) > 0 ) {
foreach ( $recordMarkers as $buck => $records ) {
$queries[$buck] = 'SELECT things FROM foo LIMIT( ' . implode( ', ', $records ) . ' );';
$cacheKeys[$buck] = 'FOO_THINGS_' . implode( '_TO_', $records );
}
}
print '<h1>SQL Queries</h1><pre>' . htmlspecialchars( var_export( $queries, 1 ) ) . '</pre>';
print '<h1>Cache Keys</h1><pre>' . htmlspecialchars( var_export( $cacheKeys, 1 ) ) . '</pre>';
*/
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment