@UziTech
Last active Mar 26, 2021
Recursive glob search in php
/*
* License: DWTFYW
*/
/**
* Search recursively for files in a base directory matching a glob pattern.
* The `GLOB_NOCHECK` flag has no effect.
*
* @param string $base Directory to search
* @param string $pattern Glob pattern to match files
* @param int $flags Glob flags from https://www.php.net/manual/function.glob.php
* @return string[] Array of files matching the pattern
*/
function glob_recursive($base, $pattern, $flags = 0) {
	$flags = $flags & ~GLOB_NOCHECK;
	if (substr($base, -1) !== DIRECTORY_SEPARATOR) {
		$base .= DIRECTORY_SEPARATOR;
	}
	$files = glob($base.$pattern, $flags);
	if (!is_array($files)) {
		$files = [];
	}
	$dirs = glob($base.'*', GLOB_ONLYDIR|GLOB_NOSORT|GLOB_MARK);
	if (!is_array($dirs)) {
		return $files;
	}
	foreach ($dirs as $dir) {
		$dirFiles = glob_recursive($dir, $pattern, $flags);
		$files = array_merge($files, $dirFiles);
	}
	return $files;
}

@markparnaby markparnaby commented Jul 9, 2020

I add an extra line after line 10 as glob can return false, and your later array_merge should be fed an array. You catch the possible false return later on line 14, so that's good.

	$files = glob($base.$pattern, $flags);
	$files = $files !== false ? $files : [];

@UziTech UziTech commented Jul 9, 2020

I updated it, thanks! 💯

@WinterSilence WinterSilence commented Jul 28, 2020

/**
 * Recursive `glob()`.
 *
 * @author info@ensostudio.ru
 *
 * @param string $baseDir Base directory to search
 * @param string $pattern Glob pattern
 * @param int $flags Behavior bitmask
 * @return array|string|bool
 */
function glob_recursive(string $baseDir, string $pattern, int $flags = GLOB_NOSORT | GLOB_BRACE)
{
    $paths = glob(rtrim($baseDir, '\/') . DIRECTORY_SEPARATOR . $pattern, $flags);
    if (is_array($paths)) {
        foreach ($paths as $path) {
            if (is_dir($path)) {
                $subPaths = (__FUNCTION__)($path, $pattern, $flags);
                if ($subPaths !== false) {
                    $subPaths = (array) $subPaths;
                    array_push($paths, ...$subPaths);
                }
            }
        }
    }
    return $paths;
}

@haydenk haydenk commented Feb 3, 2021

function rglob(string $patterns, $flags = null): array {
    $result = glob($patterns, $flags);
    foreach ($result as $item) {
        if (is_dir($item)) {
            array_push($result, ...rglob($item . '/*', $flags));
        }
    }

    return $result;
}

My ¢¢

@nimbus2300 nimbus2300 commented Mar 25, 2021

function rglob($dir, $flags=null, &$results=[]) {
    $ls = glob($dir, $flags);

    if (is_array($ls)) {
        foreach ($ls as $item) {
            if (is_dir($item)) {
                $this->rglob($item . '/*', $flags, $results);
            }
            if (is_file($item)) {
                $results[] = $item;
            }
        }
    }

    return $results;
}

My ¢¢, this one returns just a simple array of all the files (full path) under the top level directory.

@WinterSilence WinterSilence commented Mar 26, 2021

@nimbus2300 trash

@WinterSilence WinterSilence commented Mar 26, 2021

@haydenk glob() does not only return an array, and you ignore/replace $patterns (and why is it plural?) in array_push($result, ...rglob($item . '/*', $flags));

@WinterSilence WinterSilence commented Mar 26, 2021

@markparnaby @UziTech glob() returns array|string|bool; you miss the case when $flags contains GLOB_NOCHECK

@UziTech UziTech commented Mar 26, 2021

Fixed it for GLOB_NOCHECK

@WinterSilence WinterSilence commented Mar 26, 2021

@UziTech sorry, but it's too wrong - recursive version of glob() must return same results. check my version - it's already fixed.

@UziTech UziTech commented Mar 26, 2021

recursive version of glob() must return same results

Says who? I would rather it be consistent and always return an array.

@WinterSilence WinterSilence commented Mar 26, 2021

@UziTech established BC practice - check native *_recursive() functions

@nimbus2300 nimbus2300 commented Mar 26, 2021

@UziTech I agree - the reason I put my (slightly flawed) function in the mix is I just wanted a simple flat array of all the files (recursed) under a parent dir.

@WinterSilence WinterSilence commented Mar 26, 2021

@UziTech the GLOB_NOCHECK flag guarantees that the pattern is returned if no paths match it

@WinterSilence WinterSilence commented Mar 26, 2021

@nimbus2300

  • $this->rglob(
  • if not is_dir($path) then it's a file - no need for an additional check
  • change pattern for search in sub-directories
  • ignore flags

@UziTech UziTech commented Mar 26, 2021

established BC practice - check native *_recursive() functions

I was going for usefulness not consistency with native functions.

Especially as far as GLOB_NOCHECK goes. I don't get why that would be useful. The user is sending the pattern, so why would they need it back? They can just check if the array is empty and use the pattern they sent to the function.
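
The call-site check described above can be sketched roughly as follows. This is only an illustration: `find_files()` and `find_files_or_pattern()` are hypothetical names, and `find_files()` is a stand-in for any search function that always returns an array (it is not the gist's function).

```php
<?php
// Hypothetical stand-in for an always-array search function (not from the gist).
function find_files(string $pattern): array {
    $files = glob($pattern);
    // glob() can return false on error, so normalize to an array.
    return is_array($files) ? $files : [];
}

// Emulate GLOB_NOCHECK at the call site: fall back to the pattern
// that was passed in when nothing matched.
function find_files_or_pattern(string $pattern): array {
    $files = find_files($pattern);
    return count($files) === 0 ? [$pattern] : $files;
}

// A pattern inside a directory that should not exist, so nothing matches.
$pattern = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'no-such-dir-' . uniqid() . DIRECTORY_SEPARATOR . '*.txt';
$result = find_files_or_pattern($pattern);
var_dump($result);
```

The caller already holds `$pattern`, so the fallback needs no help from the search function itself.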

@WinterSilence WinterSilence commented Mar 26, 2021

@UziTech "useful for me" is not the same as "useful for anybody". If the function name is glob_recursive() then it should be glob() called recursively, without any other changes.

@UziTech UziTech commented Mar 26, 2021

If that is what people want they can use your implementation.

@UziTech UziTech commented Mar 26, 2021

then it should be glob() called recursively, without any other changes.

That isn't always the most useful. For instance, if glob fails it returns false, so consistency would say that if glob_recursive fails in any of the subfolders it should return false. Your implementation returns the current list of files instead of false. (Which I would argue is better anyway.)

Also yours will return the string pattern many times when using GLOB_NOCHECK if the pattern is not found in a subfolder. Which is consistent but not very useful.

// dir1
//   file1
// dir2
//   file2
// dir3
//   file3
glob_recursive("/", "file1", GLOB_NOCHECK);
// Your return:
//  ["dir1/file1", "file1", "file1"]
// Most likely expected return:
//  ["dir1/file1"]

@UziTech UziTech commented Mar 26, 2021

@WinterSilence actually it looks like yours would just return ["/file1"] for the above example because it doesn't look through directories that don't match the pattern.
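
To illustrate the difference between the two traversal strategies, here is a rough sketch on a throwaway directory tree. The function names `search_matches_only()` and `search_all_dirs()` are hypothetical and are simplified versions of the two approaches in this thread, not the posted code itself; no flags are passed.

```php
<?php
// Build a scratch tree: <root>/dir1/file1 and <root>/dir2/file2.
$ds = DIRECTORY_SEPARATOR;
$root = sys_get_temp_dir() . $ds . 'rglob-demo-' . uniqid();
mkdir($root . $ds . 'dir1', 0777, true);
mkdir($root . $ds . 'dir2', 0777, true);
touch($root . $ds . 'dir1' . $ds . 'file1');
touch($root . $ds . 'dir2' . $ds . 'file2');

// Strategy A: recurse only into paths that themselves matched the pattern.
function search_matches_only(string $base, string $pattern): array {
    $paths = glob($base . DIRECTORY_SEPARATOR . $pattern);
    $paths = is_array($paths) ? $paths : [];
    foreach ($paths as $path) {
        if (is_dir($path)) {
            $paths = array_merge($paths, search_matches_only($path, $pattern));
        }
    }
    return $paths;
}

// Strategy B: match the pattern, then list every sub-directory separately
// and recurse into each one regardless of the pattern.
function search_all_dirs(string $base, string $pattern): array {
    $files = glob($base . DIRECTORY_SEPARATOR . $pattern);
    $files = is_array($files) ? $files : [];
    $dirs = glob($base . DIRECTORY_SEPARATOR . '*', GLOB_ONLYDIR);
    $dirs = is_array($dirs) ? $dirs : [];
    foreach ($dirs as $dir) {
        $files = array_merge($files, search_all_dirs($dir, $pattern));
    }
    return $files;
}

$a = search_matches_only($root, 'file1'); // never enters dir1, finds nothing
$b = search_all_dirs($root, 'file1');     // finds <root>/dir1/file1

var_dump($a, $b);

// Clean up the scratch tree.
unlink($root . $ds . 'dir1' . $ds . 'file1');
unlink($root . $ds . 'dir2' . $ds . 'file2');
rmdir($root . $ds . 'dir1');
rmdir($root . $ds . 'dir2');
rmdir($root);
```

Strategy A only descends when a directory name happens to match the file pattern, which is why a pattern such as `file1` never reaches `dir1`.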

@UziTech UziTech commented Mar 26, 2021

Also GLOB_NOCHECK still returns an array with the pattern as a string not just the string pattern.

glob("*.txt", GLOB_NOCHECK);
// returns ["*.txt"] (not "*.txt") if no files are found
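
A small check of that behaviour, sketched under the assumption that a freshly created scratch directory is empty (the directory name is made up for the example):

```php
<?php
// Create an empty scratch directory so the pattern is guaranteed to match nothing.
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'nocheck-demo-' . uniqid();
mkdir($dir);
$pattern = $dir . DIRECTORY_SEPARATOR . '*.txt';

$plain   = glob($pattern);               // without GLOB_NOCHECK
$nocheck = glob($pattern, GLOB_NOCHECK); // with GLOB_NOCHECK

var_dump($plain);   // no matches: an empty array on most systems
var_dump($nocheck); // a one-element array containing $pattern

rmdir($dir);
```

Because the pattern comes back wrapped in an array, code that iterates over the result does not need a separate string case.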

@UziTech UziTech commented Mar 26, 2021

Here is a version that works with GLOB_NOCHECK

function glob_recursive($base, $pattern, $flags = 0) {
	// Remember whether GLOB_NOCHECK was requested, then strip it so it
	// is not applied separately in every sub-directory.
	$glob_nocheck = $flags & GLOB_NOCHECK;
	$flags = $flags & ~GLOB_NOCHECK;

	// A closure replaces the nested named function, which PHP would try
	// to redeclare (a fatal error) on the second call to glob_recursive().
	$check_folder = function ($base) use (&$check_folder, $pattern, $flags) {
		if (substr($base, -1) !== DIRECTORY_SEPARATOR) {
			$base .= DIRECTORY_SEPARATOR;
		}

		$files = glob($base.$pattern, $flags);
		if (!is_array($files)) {
			$files = [];
		}

		$dirs = glob($base.'*', GLOB_ONLYDIR|GLOB_NOSORT|GLOB_MARK);
		if (!is_array($dirs)) {
			return $files;
		}

		foreach ($dirs as $dir) {
			$dirFiles = $check_folder($dir);
			$files = array_merge($files, $dirFiles);
		}

		return $files;
	};
	$files = $check_folder($base);

	// Emulate GLOB_NOCHECK once, for the search as a whole.
	if ($glob_nocheck && count($files) === 0) {
		return [$pattern];
	}

	return $files;
}