@UziTech
Last active May 25, 2023 18:32
Recursive glob search in php
/*
* License: DWTFYW
*/
/**
* Search recursively for files in a base directory matching a glob pattern.
* The `GLOB_NOCHECK` flag has no effect.
*
* @param string $base Directory to search
* @param string $pattern Glob pattern to match files
* @param int $flags Glob flags from https://www.php.net/manual/function.glob.php
* @return string[] Array of files matching the pattern
*/
function glob_recursive($base, $pattern, $flags = 0) {
	$flags = $flags & ~GLOB_NOCHECK;
	if (substr($base, -1) !== DIRECTORY_SEPARATOR) {
		$base .= DIRECTORY_SEPARATOR;
	}
	$files = glob($base.$pattern, $flags);
	if (!is_array($files)) {
		$files = [];
	}
	$dirs = glob($base.'*', GLOB_ONLYDIR|GLOB_NOSORT|GLOB_MARK);
	if (!is_array($dirs)) {
		return $files;
	}
	foreach ($dirs as $dir) {
		$dirFiles = glob_recursive($dir, $pattern, $flags);
		$files = array_merge($files, $dirFiles);
	}
	return $files;
}
@markparnaby

I added an extra line after line 10, since glob() can return false and your later array_merge() should be fed an array. You catch the possible false return later on line 14, so that's good.

	$files = glob($base.$pattern, $flags);
	$files = $files !== false ? $files : [];

@UziTech
Author

UziTech commented Jul 9, 2020

I updated it, thanks! 💯

@WinterSilence

WinterSilence commented Jul 28, 2020

/**
 * Recursive `glob()`.
 *
 * @author info@ensostudio.ru
 *
 * @param string $baseDir Base directory to search
 * @param string $pattern Glob pattern
 * @param int $flags Behavior bitmask
 * @return array|string|bool
 */
function glob_recursive(string $baseDir, string $pattern, int $flags = GLOB_NOSORT | GLOB_BRACE)
{
    $paths = glob(rtrim($baseDir, '\/') . DIRECTORY_SEPARATOR . $pattern, $flags);
    if (is_array($paths)) {
        foreach ($paths as $path) {
            if (is_dir($path)) {
                $subPaths = (__FUNCTION__)($path, $pattern, $flags);
                if ($subPaths !== false) {
                    $subPaths = (array) $subPaths;
                    array_push($paths, ...$subPaths);
                }
            }
        }
    }
    return $paths;
}

@haydenk

haydenk commented Feb 3, 2021

function rglob(string $patterns, $flags = null): array {
    $result = glob($patterns, $flags);
    foreach ($result as $item) {
        if (is_dir($item)) {
            array_push($result, ...rglob($item . '/*', $flags));
        }
    }

    return $result;
}

My ¢¢

@nimbus2300

nimbus2300 commented Mar 25, 2021

function rglob($dir, $flags=null, &$results=[]) {
    $ls = glob($dir, $flags);

    if (is_array($ls)) {
        foreach ($ls as $item) {
            if (is_dir($item)) {
                $this->rglob($item . '/*', $flags, $results);
            }
            if (is_file($item)) {
                $results[] = $item;
            }
        }
    }

    return $results;
}

My ¢¢, this one returns just a simple array of all the files (full path) under the top level directory.

@WinterSilence

@nimbus2300 trash

@WinterSilence

@haydenk glob() doesn't only return an array, and you ignore/replace $patterns (and why is it plural?) in array_push($result, ...rglob($item . '/*', $flags));

@WinterSilence

@markparnaby @UziTech glob(): array|string|bool, you miss the case when $flags contains GLOB_NOCHECK

@UziTech
Author

UziTech commented Mar 26, 2021

Fixed it for GLOB_NOCHECK

@WinterSilence

WinterSilence commented Mar 26, 2021

@UziTech sorry, but it's too wrong - recursive version of glob() must return same results. check my version - it's already fixed.

@UziTech
Author

UziTech commented Mar 26, 2021

recursive version of glob() must return same results

Says who? I would rather it be consistent and always return an array.

@WinterSilence

@UziTech established BC practice - check native *_recursive() functions

@nimbus2300

nimbus2300 commented Mar 26, 2021

@UziTech I agree - the reason I put my (slightly flawed) function in the mix is I just wanted a simple flat array of all the files (recursed) under a parent dir.

@WinterSilence

@UziTech the GLOB_NOCHECK flag makes glob() return the pattern if no paths match the pattern

@WinterSilence

WinterSilence commented Mar 26, 2021

@nimbus2300

  • $this->rglob(
  • if is_dir($path) is false then it's a file - no need for an additional check
  • change pattern for search in sub-directories
  • ignore flags

@UziTech
Author

UziTech commented Mar 26, 2021

established BC practice - check native *_recursive() functions

I was going for usefulness not consistency with native functions.

Especially as far as GLOB_NOCHECK goes. I don't get why that would be useful. The user is sending the pattern why would they need it back? They can just check if the array is empty and use the pattern they sent to the function.
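
A minimal sketch of that caller-side check, using plain glob() and a hypothetical pattern that is assumed to match nothing. The empty result is swapped for the pattern, which should mirror what glob() returns when GLOB_NOCHECK is set:

```php
<?php
// Hypothetical pattern; uniqid() makes it very unlikely to match a real file.
$pattern = sys_get_temp_dir() . '/no-such-' . uniqid() . '-*.txt';

// glob() may return false on error, so normalise to an array first.
$files = glob($pattern) ?: [];

// Caller-side equivalent of GLOB_NOCHECK: fall back to the pattern itself.
$emulated = count($files) === 0 ? [$pattern] : $files;

// What glob() itself returns with the flag set, for comparison.
$native = glob($pattern, GLOB_NOCHECK);
```

Both $emulated and $native end up as a one-element array holding the pattern, so the flag adds nothing the caller cannot do in one line.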

@WinterSilence

WinterSilence commented Mar 26, 2021

@UziTech "useful for me" is not the same as "useful for anybody". If the function name is glob_recursive() then it's should be glob() called recursively, without any other changes.

@UziTech
Author

UziTech commented Mar 26, 2021

If that is what people want they can use your implementation.

@UziTech
Author

UziTech commented Mar 26, 2021

then it's should be glob() called recursively, without any other changes.

That isn't always the most useful. For instance, if glob fails it returns false, so consistency would say that if glob_recursive fails in any of the subfolders it should return false. Your implementation returns the current list of files instead of false. (Which I would argue is better anyway.)

Also yours will return the string pattern many times when using GLOB_NOCHECK if the pattern is not found in a subfolder. Which is consistent but not very useful.

// dir1
//   file1
// dir2
//   file2
// dir3
//   file3
glob_recursive("/", "file1", GLOB_NOCHECK);
// Your return:
//  ["dir1/file1", "file1", "file1"]
// Most likely expected return:
//  ["dir1/file1"]

@UziTech
Author

UziTech commented Mar 26, 2021

@WinterSilence actually it looks like yours would just return ["/file1"] for the above example because it doesn't look through directories that don't match the pattern.
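
A rough sketch of the difference, assuming the three-directory layout from the example is recreated in a temp directory. The helper name find_in_all_dirs() is hypothetical and is not either author's function; it only illustrates descending into every sub-directory rather than only into paths that matched the pattern:

```php
<?php
// Recreate the example layout (dir1/file1, dir2/file2, dir3/file3) in a temp dir.
$root = sys_get_temp_dir() . '/glob_demo_' . uniqid();
foreach (['dir1' => 'file1', 'dir2' => 'file2', 'dir3' => 'file3'] as $dir => $file) {
    mkdir($root . '/' . $dir, 0777, true);
    touch($root . '/' . $dir . '/' . $file);
}

// Recursing only into matched paths: nothing at the top level is named
// "file1", so GLOB_NOCHECK hands back the pattern and no directory is visited.
$matchedOnly = glob($root . '/file1', GLOB_NOCHECK);

// Hypothetical helper that descends into every sub-directory instead.
function find_in_all_dirs(string $base, string $pattern): array
{
    $found = glob($base . '/' . $pattern) ?: [];
    $dirs = glob($base . '/*', GLOB_ONLYDIR) ?: [];
    foreach ($dirs as $dir) {
        $found = array_merge($found, find_in_all_dirs($dir, $pattern));
    }
    return $found;
}

$everywhere = find_in_all_dirs($root, 'file1');
```

Here $matchedOnly holds only the unmatched pattern, while $everywhere holds the real path under dir1.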

@UziTech
Author

UziTech commented Mar 26, 2021

Also GLOB_NOCHECK still returns an array with the pattern as a string not just the string pattern.

glob("*.txt", GLOB_NOCHECK)
// returns ["*.txt"] if no files found not "*.txt"

@UziTech
Author

UziTech commented Mar 26, 2021

Here is a version that works with GLOB_NOCHECK

function glob_recursive($base, $pattern, $flags = 0) {
	$glob_nocheck = $flags & GLOB_NOCHECK;
	$flags = $flags & ~GLOB_NOCHECK;

	function check_folder($base, $pattern, $flags) {
		if (substr($base, -1) !== DIRECTORY_SEPARATOR) {
			$base .= DIRECTORY_SEPARATOR;
		}

		$files = glob($base.$pattern, $flags);
		if (!is_array($files)) {
			$files = [];
		}

		$dirs = glob($base.'*', GLOB_ONLYDIR|GLOB_NOSORT|GLOB_MARK);
		if (!is_array($dirs)) {
			return $files;
		}

		foreach ($dirs as $dir) {
			$dirFiles = check_folder($dir, $pattern, $flags);
			$files = array_merge($files, $dirFiles);
		}
		
		return $files;
	}
	$files = check_folder($base, $pattern, $flags);

	if ($glob_nocheck && count($files) === 0) {
		return [$pattern];
	}
	
	return $files;
}

@Alys-dev

Hi there, I used your marvelous second glob_recursive function, but I need to say there is an issue when calling glob_recursive more than once in the same PHP run.

that one

The issue leads to a fatal error: Fatal error: Cannot redeclare check_folder() (previously declared in ....

So, I found where the issue is:
Your second version of glob_recursive() declares a sub-function. Unlike class methods, functions (including sub-functions) are exported to the global namespace (or the current custom namespace).
A sub-function (I call it that, but it's just a function that gets declared once the parent function is called) causes an issue because calling glob_recursive() ONE time registers (declares) it for the rest of the PHP execution.
Calling glob_recursive() a second time then tries to re-declare check_folder(), because the declaration is part of glob_recursive().

The fix is easy (and I think you just forgot it, as your function is amazing):

Before line:
function check_folder($base, $pattern, $flags) {

Add just:
if (!function_exists('check_folder')) {

And we now must close the newly added "if" statement, so ...

Before line:
$files = check_folder($base, $pattern, $flags);

Add just another:
}
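
The behaviour and the guard described above can be sketched in isolation. The names outer_demo() and inner_demo() are hypothetical stand-ins for glob_recursive() and check_folder():

```php
<?php
function outer_demo(): string
{
    // Without this guard, the second call to outer_demo() would fail with
    // "Cannot redeclare inner_demo()".
    if (!function_exists('inner_demo')) {
        function inner_demo(): string
        {
            return 'inner';
        }
    }
    return inner_demo();
}

$before = function_exists('inner_demo'); // not declared until outer_demo() runs
$first  = outer_demo();
$after  = function_exists('inner_demo'); // now declared for the rest of the run
$second = outer_demo();                  // safe because of the guard
```

The nested function only comes into existence when the outer function body executes, which is why the error shows up on the second call and not when the file is loaded.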

And now the glob_recursive() function can be called any number of times!

By the way, thanks for the function!!
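
An alternative to the function_exists() guard, offered only as a sketch: replace the nested named function with a closure that captures itself by reference, so nothing is declared in the global namespace. The name glob_recursive_safe() is hypothetical and this is not the gist author's version; the body otherwise follows the GLOB_NOCHECK-aware function posted above.

```php
<?php
function glob_recursive_safe(string $base, string $pattern, int $flags = 0): array
{
    $globNoCheck = ($flags & GLOB_NOCHECK) !== 0;
    $flags &= ~GLOB_NOCHECK;

    // The closure exists only for this call, so repeated calls cannot
    // trigger a "Cannot redeclare" error. It is captured by reference
    // because it is assigned after the closure is created.
    $checkFolder = function (string $base) use (&$checkFolder, $pattern, $flags): array {
        if (substr($base, -1) !== DIRECTORY_SEPARATOR) {
            $base .= DIRECTORY_SEPARATOR;
        }

        $files = glob($base . $pattern, $flags);
        if (!is_array($files)) {
            $files = [];
        }

        $dirs = glob($base . '*', GLOB_ONLYDIR | GLOB_NOSORT | GLOB_MARK);
        if (!is_array($dirs)) {
            return $files;
        }

        foreach ($dirs as $dir) {
            $files = array_merge($files, $checkFolder($dir));
        }

        return $files;
    };

    $files = $checkFolder($base);

    // Emulate GLOB_NOCHECK once, for the whole tree.
    if ($globNoCheck && count($files) === 0) {
        return [$pattern];
    }

    return $files;
}
```

Calling glob_recursive_safe() twice in the same run should return the same array both times, and with GLOB_NOCHECK set and no matches it returns an array containing only the pattern.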
