@BionicClick
Created October 1, 2014 00:12

Locating Files:

The find command is used to locate files on a Unix or Linux system. find will search any set of directories you specify for files that match the supplied search criteria. You can search for files by name, owner, group, type, permissions, date, and other criteria. The search is recursive in that it will search all subdirectories too. The syntax looks like this:

find where-to-look criteria what-to-do

All arguments to find are optional, and there are defaults for all parts. (This may depend on which version of find is used. Here we discuss the freely available GNU version of find, which is the version available on YborStudent.) For example, where-to-look defaults to . (that is, the current working directory), criteria defaults to none (that is, select all files), and what-to-do (known as the find action) defaults to -print (that is, display the names of found files to standard output). Technically, the criteria and actions are all known as find primaries.

For example:

find

will display the pathnames of all files in the current directory and all subdirectories. The commands

find . -print
find -print
find .

do the exact same thing. Here's an example find command using a search criterion and the default action:

find / -name foo

This will search the whole system for any files named foo and display their pathnames. Here we are using the criterion -name with the argument foo to tell find to perform a name search for the filename foo. The output might look like this:

/home/wpollock/foo
/home/ua02/foo
/tmp/foo

If find doesn't locate any matching files, it produces no output.

The above example said to search the whole system, by specifying the root directory ("/") to search. If you don't run this command as root, find will display an error message for each directory on which you don't have read permission. This can be a lot of messages, and the matching files that are found may scroll right off your screen. A good way to deal with this problem is to redirect the error messages so you don't have to see them at all:

find / -name foo 2>/dev/null

You can specify as many places to search as you wish:

find /tmp /var/tmp . $HOME -name foo
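The two ideas above (several starting points, and discarding error messages) can be tried safely in a scratch directory. This is only a sketch: the directory layout and the variable names demo and found are made up for illustration, and it assumes mktemp -d is available (it is not required by POSIX, though nearly every system has it).

```shell
#!/bin/sh
# Build a throwaway tree so the search is safe to run anywhere.
demo=$(mktemp -d)
mkdir -p "$demo/a/deep" "$demo/b"
touch "$demo/a/deep/foo" "$demo/b/foo" "$demo/b/bar"

# Search two starting points in one command; 2>/dev/null hides any
# "Permission denied" messages without affecting the list of matches.
found=$(find "$demo/a" "$demo/b" -name foo 2>/dev/null | LC_ALL=C sort)
printf '%s\n' "$found"

rm -rf "$demo"
```

Only the two files named foo are listed; bar is skipped because it fails the -name test.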

Advanced Features and Applications:

The "-print" action lists the names of files separated by a newline. But it is common to pipe the output of find into xargs, which splits its input on spaces as well as newlines. This can lead to problems if any found files contain spaces in their names, as the output doesn't use any quoting. In such cases, when the output of find contains a file name such as "foo bar" and is piped into another command, that command "sees" two file names, not one file name containing a space. Even without using xargs, you could have a problem if the file name contains a newline character, as most utilities expect one file name per line.

In such cases, you can specify the action "-print0" instead. This lists the found files separated not with a newline but with a null (or "NUL") character, which is not a legal character in Unix or Linux file names. Of course the command that reads the output of find must be able to handle such a list of file names. Many commands commonly used with find (such as tar or cpio) have special options to read in file names separated with NULs instead of spaces.
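The difference is easy to see by counting how many arguments the receiving command gets. The sketch below assumes a find and xargs that support -print0 and -0 (GNU and BSD versions do); the file names and the variables unsafe and safe are invented for the demonstration.

```shell
#!/bin/sh
demo=$(mktemp -d)
touch "$demo/foo bar" "$demo/plain"

# Newline-separated output: xargs splits "foo bar" into two words.
# sh -c 'echo $#' X prints how many arguments it was handed.
unsafe=$(find "$demo" -type f -print | xargs sh -c 'echo $#' X)

# NUL-separated output: each pathname arrives as exactly one argument.
safe=$(find "$demo" -type f -print0 | xargs -0 sh -c 'echo $#' X)

echo "with -print: $unsafe arguments; with -print0: $safe arguments"
rm -rf "$demo"
```

Two files exist, yet the first pipeline delivers three arguments because the name containing a space is broken in two.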

Instead of having find list the files, it can run some command for each file found, using the "-exec" action. The -exec is followed by some shell command line, ended with a semicolon (";"). (The semicolon must be quoted from the shell, so find can see it!) Within that command line, the word "{}" will expand out to the name of the found file. See below for some examples.

You can use shell-style wildcards in the -name search argument:

find . -name foo\*bar

This will search from the current directory down for foo*bar (that is, any filename that begins with foo and ends with bar). Note that wildcards in the name argument must be quoted so the shell doesn't expand them before passing them to find. Also, unlike regular shell wildcards, these will match leading periods in filenames. (For example "find -name \*.txt" would match ".foo.txt".)
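Here is a small sketch of both points: the quoted pattern reaches find intact, and it also matches a name with a leading period. The file names and the variable matches are hypothetical.

```shell
#!/bin/sh
demo=$(mktemp -d)
touch "$demo/notes.txt" "$demo/.hidden.txt" "$demo/readme.md"

# The quotes keep the shell from expanding *.txt before find sees it.
matches=$(find "$demo" -name '*.txt' | LC_ALL=C sort)
printf '%s\n' "$matches"

rm -rf "$demo"
```

Both notes.txt and .hidden.txt are reported, whereas a shell glob such as *.txt would have skipped the hidden file.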

You can search for other criteria beside the name. Also you can list multiple search criteria. When you have multiple criteria, any found files must match all listed criteria. That is, there is an implied Boolean AND operator between the listed search criteria. find also allows OR and NOT Boolean operators, as well as grouping, to combine search criteria in powerful ways (not shown here).
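For readers who want a taste of the OR, NOT, and grouping operators, the following sketch runs them against a scratch directory. The file names and the variables either and not_txt are made up for illustration.

```shell
#!/bin/sh
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.log" "$demo/c.md"
mkdir "$demo/d.txt"            # a directory, so -type f must reject it

# Regular files named *.txt OR *.log. The escaped parentheses group the OR
# so the implied AND with -type f applies to both alternatives.
either=$(find "$demo" -type f \( -name '*.txt' -o -name '*.log' \) | LC_ALL=C sort)

# Regular files whose names do NOT end in .txt.
not_txt=$(find "$demo" -type f ! -name '*.txt' | LC_ALL=C sort)

printf 'either:\n%s\nnot_txt:\n%s\n' "$either" "$not_txt"
rm -rf "$demo"
```

Without the parentheses, -type f would bind only to the first -name test, and the directory d.txt would still be excluded but for the wrong reason: AND binds more tightly than OR.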

Here's an example using two search criteria:

find / -type f -mtime -7 | xargs tar -rf weekly_incremental.tar
gzip weekly_incremental.tar

This will find any regular files (i.e., not directories or other special files) with the criterion "-type f", and only those modified seven or fewer days ago ("-mtime -7"). Note the use of xargs, a handy utility that converts a stream of input (in this case the output of find) into command line arguments for the supplied command (in this case tar, used to create a backup archive).

Using the tar option "-c" is dangerous here; xargs may invoke tar several times if there are many files found, and each "-c" will cause tar to over-write the previous invocation. The "-r" option appends files to an archive. Other options such as those that would permit filenames containing spaces would be useful in a "production quality" backup script.

Another use of xargs is illustrated below. This command will efficiently remove all files named core from your system (provided you run the command as root of course):

find / -name core | xargs /bin/rm -f
find / -name core -exec /bin/rm -f '{}' \;  # same thing
find / -name core -delete                   # same if using GNU find

The second form runs the rm command once per file and is not as efficient as the first form, while the third removes each file itself without running rm at all; both are safer than the first form if file names contain spaces or newlines. The first form can be made safer if rewritten to use "-print0" instead of (the default) "-print". "-exec" can be used more efficiently (see Using -exec Efficiently below), but doing so means running the command once with many file names passed as arguments, and so has the same safety issues as with xargs.

One of my favorite find criteria is used to locate files modified less than 10 minutes ago. I use this right after using some system administration tool, to learn which files got changed by that tool:

find / -mmin -10

(This search is also useful when I've downloaded some file but can't locate it, only in that case "-cmin" may work better. Keep in mind neither of these criteria is standard; "-mtime" and "-ctime" are standard, but use days and not minutes.)

Another common use is to locate all files owned by a given user ("-user username"). This is useful when deleting user accounts.

You can also find files with various permissions set. "-perm /permissions" means to find files with any of the specified permissions on, "-perm -permissions" means to find files with all of the specified permissions on, and "-perm permissions" means to find files with exactly permissions. Permissions can be specified either symbolically (preferred) or with an octal number. The following will locate files that are writable by "others" (including symlinks, which should be writable by all):

find . -perm -o=w

(Using -perm is more complex than this example shows. You should check both the POSIX documentation for find (which explains how the symbolic modes work) and the GNU find man page (which describes the GNU extensions).)
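As a quick illustration of the "all of these bits" and "exactly these bits" forms, the sketch below sets known modes on two scratch files and searches for each. The file names and variables are hypothetical; only the two POSIX forms of -perm are used.

```shell
#!/bin/sh
demo=$(mktemp -d)
touch "$demo/private" "$demo/shared"
chmod 644 "$demo/private"      # rw-r--r--
chmod 666 "$demo/shared"       # rw-rw-rw-

# "-perm -o=w": the write bit for others must be on (other bits ignored).
world_writable=$(find "$demo" -type f -perm -o=w)

# "-perm 644": the mode must be exactly rw-r--r--.
exact_644=$(find "$demo" -type f -perm 644)

echo "world writable: $world_writable"
echo "exactly 644:    $exact_644"
rm -rf "$demo"
```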

When using find to locate files for backups, it often pays to use the "-depth" option (really a criterion that is always true), which forces the output to be depth-first: that is, files first and then the directories containing them. This helps when the directories have restrictive permissions, and restoring the directory first could prevent the files from restoring at all (and would change the time stamp on the directory in any case). Normally, find returns the directory first, before any of the files in that directory. This default behavior is useful when using the "-prune" action to prevent find from examining any files you want to ignore:

find / -path /dev -prune -o ...other criteria -print | xargs tar ...

(Note that "-name" compares only the last component of a pathname, so a pattern containing a slash, such as "-name /dev", never matches; use "-path" to match whole pathnames.)

Using just "find / -path /dev -prune | xargs tar ..." won't work as most people might expect. This says to only find the pathname "/dev", and then (if a directory) don't descend into it. So you only get the single directory name "/dev"! A better plan is to use the following:

find / ! -path /dev\* |xargs ...

which says find everything except pathnames that start with "/dev". The "!" means Boolean NOT.

When specifying time with find options such as -mmin (minutes) or -mtime (24 hour periods, starting from now), you can specify a number "n" to mean exactly n, "-n" to mean less than n, and "+n" to mean more than n.

Fractional 24-hour periods are truncated! That means that "find -mtime +1" says to match files modified two or more days ago.
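The usual way to combine -prune with other criteria is "prune this OR print that". A minimal sketch in a scratch directory follows; the directory names skip and keep and the variable kept are invented, and -path is assumed to be available (it is in POSIX since 2008 and in GNU and BSD find).

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir "$demo/skip" "$demo/keep"
touch "$demo/skip/ignored" "$demo/keep/wanted"

# If the pathname is the skip directory, prune it (do not descend);
# otherwise (-o) print regular files. The explicit -print matters here.
kept=$(find "$demo" -path "$demo/skip" -prune -o -type f -print)
printf '%s\n' "$kept"

rm -rf "$demo"
```

Only keep/wanted is listed; find never looks inside the pruned directory.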

For example:

find . -mtime 0   # find files modified between now and 1 day ago
                  # (i.e., within the past 24 hours)
find . -mtime -1  # find files modified less than 1 day ago
                  # (i.e., within the past 24 hours, as before)
find . -mtime 1   # find files modified between 24 and 48 hours ago
find . -mtime +1  # find files modified more than 48 hours ago
find . -mmin +5 -mmin -10  # find files modified between
                           # 6 and 9 minutes ago
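The truncation rule can be checked with files whose ages are set by hand. This sketch assumes GNU touch, whose -d option accepts relative dates such as '36 hours ago' (other versions of touch may not); the file names and variables are made up.

```shell
#!/bin/sh
demo=$(mktemp -d)
touch "$demo/fresh"
touch -d '36 hours ago' "$demo/day_and_a_half"   # assumes GNU touch
touch -d '60 hours ago' "$demo/two_and_a_half"   # assumes GNU touch

# 1.5 days truncates to 1, so "-mtime 1" matches it but "-mtime +1" does not.
exactly_one=$(find "$demo" -type f -mtime 1)
# 2.5 days truncates to 2, which is the first value greater than 1.
more_than_one=$(find "$demo" -type f -mtime +1)
# Anything under 24 hours old truncates to 0.
within_a_day=$(find "$demo" -type f -mtime -1)

printf '%s\n' "-mtime 1:  $exactly_one" "-mtime +1: $more_than_one" "-mtime -1: $within_a_day"
rm -rf "$demo"
```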

Using the (non-standard) "-printf" action instead of the default "-print" is useful to control the output format better than you can with the ls or dir utilities. You can use find with the -printf action to produce output that can easily be parsed by other utilities or imported into spreadsheets or databases. See the GNU find man page for the dozens of possibilities with the -printf action. (In fact, find with -printf is more versatile than ls; it is the preferred tool for forensic examiners even on Windows systems, to list file information.) For example the following displays non-hidden (no leading dot) files in the current directory only (no subdirectories), with a custom output format:

find . -maxdepth 1 -name '[!.]*' -printf 'Name: %16f Size: %6s\n'
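To show how such output can feed another tool, here is a sketch that produces simple "name,size" lines. It assumes GNU find (for -maxdepth and -printf); the file names and the variable report are hypothetical, and -type f is added so that directory sizes, which vary by filesystem, do not appear.

```shell
#!/bin/sh
demo=$(mktemp -d)
printf 'hello' > "$demo/five.txt"      # 5 bytes
: > "$demo/empty.txt"                  # 0 bytes
touch "$demo/.hidden"                  # excluded by the '[!.]*' pattern
mkdir "$demo/sub"
touch "$demo/sub/nested.txt"           # excluded by -maxdepth 1

# %f is the base name and %s the size in bytes (GNU find only).
report=$(cd "$demo" && find . -maxdepth 1 -type f -name '[!.]*' \
    -printf '%f,%s\n' | LC_ALL=C sort)
printf '%s\n' "$report"

rm -rf "$demo"
```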

"-maxdepth" is a GNU extension. On a modern, POSIX version of find you could use this:

find . -path './*' -prune ...

On any version of find you can use this more complex (but portable) code:

find . ! -name . -prune ...

which says to "prune" (don't descend into) any directories except ".".

Note that "-maxdepth 1" will include "." unless you also specify "-mindepth 1". A portable way to include "." is:

find . \( -name . -o -prune \) ...

The "\(" and "\)" are just parentheses used for grouping, and escaped from the shell. The "-o" means Boolean OR.

[This information posted by Stephane Chazelas, on Mar 10 2009, in newsgroup comp.unix.shell.]
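The two portable forms can be compared side by side in a scratch directory. This is a sketch only; the directory layout and the variables without_dot and with_dot are invented for the demonstration.

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir -p "$demo/sub/deeper"
touch "$demo/top.txt" "$demo/sub/inner.txt"

# Everything except "." is pruned (and printed); "." itself is not printed.
without_dot=$(cd "$demo" && find . ! -name . -prune | LC_ALL=C sort)

# "." matches -name . and is printed; everything else is pruned and printed.
with_dot=$(cd "$demo" && find . \( -name . -o -prune \) | LC_ALL=C sort)

printf 'without dot:\n%s\nwith dot:\n%s\n' "$without_dot" "$with_dot"
rm -rf "$demo"
```

Neither form lists sub/inner.txt or sub/deeper, since the search never descends below the first level.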

As a system administrator, you can use find to locate suspicious files (e.g., world writable files, files with no valid owner and/or group, SetUID files, files with unusual permissions, sizes, names, or dates). Here's a more complex example (which I saved as a shell script so I can run it often):

find / -noleaf -wholename '/proc' -prune \
     -o -wholename '/sys' -prune \
     -o -wholename '/dev' -prune \
     -o -wholename '/windows-C-Drive' -prune \
     -o -perm -2 ! -type l ! -type s \
     ! \( -type d -perm -1000 \) -print

This says to search the whole system, skipping the directories /proc, /sys, /dev, and /windows-C-Drive (presumably a Windows partition on a dual-booted computer). The GNU -noleaf option tells find not to assume all remaining mounted filesystems are Unix file systems (you might have a mounted CD for instance). The "-o" is the Boolean OR operator, and "!" is the Boolean NOT operator (applies to the following criteria).

So these criteria say to locate files that are world writable ("-perm -2", same as "-o=w") and NOT symlinks ("! -type l") and NOT sockets ("! -type s") and NOT directories with the sticky (or text) bit set ("! \( -type d -perm -1000 \)"). (Symlinks, sockets, and directories with the sticky bit set, are often world-writable and generally not suspicious.)

A common request is a way to find all the hard links to some file. Using "ls -li file" will tell you how many hard links the file has, and the inode number. You can locate all pathnames to this file with:
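The same tests can be tried on a small scale without scanning the whole system. In this sketch the scratch files (normal, open, spool, link) and the variable suspicious are hypothetical, and the GNU-only -noleaf and -wholename options are left out so only the permission logic is exercised.

```shell
#!/bin/sh
demo=$(mktemp -d)
touch "$demo/normal" "$demo/open"
chmod 644 "$demo/normal"
chmod 666 "$demo/open"              # world writable: should be reported
mkdir "$demo/spool"
chmod 1777 "$demo/spool"            # world writable but sticky: ignored
ln -s "$demo/normal" "$demo/link"   # symlink: ignored

suspicious=$(find "$demo" -perm -2 ! -type l ! -type s \
    ! \( -type d -perm -1000 \) -print)
printf '%s\n' "$suspicious"

rm -rf "$demo"
```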

find mount-point -xdev -inum inode-number

Since hard links are restricted to a single filesystem, you need to search that whole filesystem so you start the search at the filesystem's mount point. (This is likely to be either "/home" or "/" for files in your home directory.) The "-xdev" option tells find to not search (descend into) any other filesystems.

(While most Unix and all Linux systems have a find command that supports the "-inum" criterion, this isn't POSIX standard. Older Unix systems provided the "ncheck" utility instead that could be used for this.)
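A small, self-contained sketch of the hard-link search follows. It uses a scratch directory rather than a real mount point, and the names original, alias, inode, and links are made up; it assumes a find that supports -inum and -xdev.

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir "$demo/dir"
echo data > "$demo/original"
ln "$demo/original" "$demo/dir/alias"   # second name for the same inode
touch "$demo/unrelated"

# "ls -i" prints the inode number first; awk keeps just that field.
inode=$(ls -i "$demo/original" | awk '{print $1}')

links=$(find "$demo" -xdev -inum "$inode" | LC_ALL=C sort)
printf '%s\n' "$links"

rm -rf "$demo"
```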

Using -exec Efficiently:

The -exec action takes a command (along with its options) as an argument. The arguments should contain {} (usually quoted), which is replaced in the command with the name of the currently found file. The command is terminated by a semicolon, which must be quoted ("escaped") so the shell will pass it literally to the find command.
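As a minimal sketch of this basic form, the following runs basename once for each found file. The file names and the variable names_found are hypothetical.

```shell
#!/bin/sh
demo=$(mktemp -d)
touch "$demo/one" "$demo/two" "$demo/three"

# '{}' is replaced by each pathname in turn; \; ends the command.
# basename is therefore started three separate times.
names_found=$(find "$demo" -type f -exec basename '{}' \; | LC_ALL=C sort)
printf '%s\n' "$names_found"

rm -rf "$demo"
```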

To use a more complex action with -exec, you can use "sh -c complex-command" as the Unix command. Here's a somewhat contrived example, that for each found file replaces "Mr." with "Mr. or Ms.", and also converts the file to uppercase:

find whatever... -exec sh -c 'sed "s/Mr\./Mr. or Ms./g" "{}" \
     | tr "[:lower:]" "[:upper:]" >"{}.new"' \;

The -exec action in find is very useful. But since it runs the command listed for every found file, it isn't very efficient. On a large system this makes a difference! One solution is to combine find with xargs as discussed above:

find whatever... | xargs command

However this approach has two limitations. Firstly not all commands accept the list of files at the end of the command. A good example is cp:

find . -name \*.txt | xargs cp /tmp # This won't work!

(Note the GNU version of cp has a non-POSIX option "-t" for this, and GNU xargs has options to handle this too.)

Secondly, filenames may contain spaces or newlines, which would confuse the command used with xargs. (Again GNU tools have options for that, "find ... -print0 | xargs -0 ...".)

There are standard POSIX (but non-obvious) solutions to both problems. An alternate form of -exec ends with a plus-sign, not a semi-colon. This form collects the filenames into groups or sets, and runs the command once per set. (This is exactly what xargs does, to prevent argument lists from becoming too long for the system to handle.) In this form, the {} argument expands to the set of filenames. For example:

find / -name core -exec /bin/rm -f '{}' +

This command is equivalent to using find with xargs, only a bit shorter and more efficient. But this form of -exec can be combined with a shell feature to solve the other problem (names with spaces). The POSIX shell allows us to use:

sh -c 'command-line' [command-name [ args...] ]

(We don't usually care about the command-name, so "X", "dummy", "sh", or "'inline cmd'" is often used.) Here's an example of efficiently copying found files to /tmp, in a POSIX-compliant way (posted on the comp.unix.shell netnews newsgroup on Oct. 28 2007 by Stephane CHAZELAS):

find . -name '*.txt' -type f \
     -exec sh -c 'exec cp -f "$@" /tmp' X '{}' +

(Obvious, simple, and readable, isn't it? Perhaps not, but worth knowing since it is safe, portable, and efficient.)
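The same technique can be tried without touching /tmp directly. In this sketch the source and destination are scratch directories, the destination is passed as an extra argument before the file list, and the inline script prints how many pathnames it received; the names src, dest, batch_size, and copied are all invented for illustration.

```shell
#!/bin/sh
src=$(mktemp -d)
dest=$(mktemp -d)
touch "$src/a.txt" "$src/b c.txt" "$src/skip.md"

# X becomes $0, "$dest" becomes $1, and find appends the pathnames after it.
# After the shift, "$@" holds only the found files, each one intact.
batch_size=$(find "$src" -name '*.txt' -type f \
    -exec sh -c 'dest=$1; shift; echo "$#"; cp -f "$@" "$dest"' X "$dest" '{}' +)

copied=$(ls "$dest" | LC_ALL=C sort)
echo "files passed to one sh invocation: $batch_size"
printf '%s\n' "$copied"

rm -rf "$src" "$dest"
```

Both matching files arrive in a single invocation, and the name containing a space is copied as one file.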

Common "Gotcha":

If the given expression to find does not contain any of the "action" primaries -exec, -ok, or -print, the given expression is effectively replaced by:

find ( expression ) -print

The implied parentheses can cause unexpected results. For example, consider these two similar commands:

$ find -name tmp -prune -o -name \*.txt
./bin/data/secret.txt
./tmp
./missingEOL.txt
./public_html/graphics/README.txt
./datafile.txt

$ find -name tmp -prune -o -name \*.txt -print
./bin/data/secret.txt
./missingEOL.txt
./public_html/graphics/README.txt
./datafile.txt

The lack of an action in the first command means it is equivalent to:

find . \( -name tmp -prune -o -name \*.txt \) -print

This causes tmp to be included in the output. However for the second find command the normal rules of Boolean operator precedence apply, so the pruned directory does not appear in the output.

A related issue is the precedence of the Boolean operators. OR has lower precedence than AND, and NOT has the highest precedence. When in any doubt, add parentheses to your expressions.

The find command can be amazingly useful. See the man page to learn all the criteria and actions you can use.
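The gotcha is easy to reproduce in a scratch directory. In this sketch the directory names tmp and docs and the variables implicit and explicit are made up; the only difference between the two commands is the trailing -print.

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir "$demo/tmp" "$demo/docs"
touch "$demo/tmp/scratch.txt" "$demo/docs/keep.txt"

# No action given: find wraps the whole expression in ( ... ) -print,
# so the pruned directory itself is printed too.
implicit=$(cd "$demo" && find . -name tmp -prune -o -name '*.txt' | LC_ALL=C sort)

# Explicit -print binds only to the right-hand side of -o.
explicit=$(cd "$demo" && find . -name tmp -prune -o -name '*.txt' -print | LC_ALL=C sort)

printf 'implicit:\n%s\nexplicit:\n%s\n' "$implicit" "$explicit"
rm -rf "$demo"
```

In both cases tmp/scratch.txt is skipped, but only the first command lists ./tmp.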
