I often need to modify filenames, or replace or delete parts of them, across a huge number of files. I used to use the sed command for this, but I recently found out that all of these tasks can be done directly in bash (or zsh), because both support so-called string manipulation operations (parameter expansion).
Define some example string:
$ file=some_file_name.txt
$ echo $file
some_file_name.txt
${#string}
- Get string length
$ echo ${#file}
18
${string%.*}
- Strip out (delete) the part matching the pattern from the back of the string using % (shortest match) or %% (longest match)
$ echo ${file%.*}
some_file_name
$ echo ${file%_*}
some_file
$ echo ${file%%_*}
some
${string#*_}
- Strip out (delete) the part matching the pattern from the front of the string using # (shortest match) or ## (longest match)
$ echo ${file#*_}
file_name.txt
$ echo ${file##*_}
name.txt
$ echo ${file#*.}
txt
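A common use of these operators is swapping a file extension. Below is a minimal sketch; swap_ext is a hypothetical helper name I made up for illustration, it is not a bash builtin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace the last extension of a file name.
swap_ext() {
  local name=$1 new_ext=$2
  # ${name%.*} drops the shortest trailing match of ".*", i.e. the last extension
  printf '%s.%s\n' "${name%.*}" "$new_ext"
}

swap_ext some_file_name.txt md   # prints some_file_name.md
swap_ext archive.tar.gz bz2      # prints archive.tar.bz2
```

Only the last extension is replaced because % takes the shortest match; using %% would cut at the first dot instead.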
Keep only the filename from a whole path
$ path_to_file="/home/someuser/sub-01/anat/sub-01_T1w.nii.gz"
$ echo ${path_to_file##*/}
sub-01_T1w.nii.gz
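The same operators can split a path into all of its parts. A small sketch using the example path from above; the variable names dir, name, stem and ext are my own choice.

```shell
#!/usr/bin/env bash
path_to_file="/home/someuser/sub-01/anat/sub-01_T1w.nii.gz"

dir=${path_to_file%/*}     # drop shortest "/*" from the back  -> directory
name=${path_to_file##*/}   # drop longest "*/" from the front  -> file name
stem=${name%%.*}           # drop longest ".*" from the back   -> name without any extension
ext=${name#*.}             # drop shortest "*." from the front -> full extension

printf '%s\n' "$dir" "$name" "$stem" "$ext"
# /home/someuser/sub-01/anat
# sub-01_T1w.nii.gz
# sub-01_T1w
# nii.gz
```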
${string%% *}
- Keep the first word from a string of words separated by spaces - https://unix.stackexchange.com/a/201744
$ string="one two three"
$ echo ${string%% *}
one
${string:(POSITION)}
- Extract part of the string starting at POSITION; a negative POSITION counts from the end (here the last 4 characters)
$ echo ${file:(-4)}
.txt
${string:POSITION:LENGTH}
- Extract part of the string from a given position with a certain length (here 5 characters starting at offset 4; offsets are counted from 0)
$ echo ${file:4:5}
_file
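A few more substring cases, as a sketch on a made-up filename. A negative offset needs either parentheses or a space before the minus sign, because ${name:-3} (no space) is a different expansion: "use 3 as the default if name is unset or empty".

```shell
#!/usr/bin/env bash
name="sub-01_T1w.nii.gz"

subject=${name:0:6}     # 6 characters starting at offset 0
last3=${name: -3}       # last 3 characters, same as ${name:(-3)}
contrast=${name: -10:3} # 3 characters starting 10 from the end

printf '%s\n' "$subject" "$last3" "$contrast"
# sub-01
# .gz
# T1w
```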
${string/SUBSTRING/REPLACEMENT}
- String replacement (e.g., some_file_name.txt --> some_fname.txt)
$ echo ${file/file_name/fname}
some_fname.txt
Replace all occurrences using //
$ filename=sub-001/anat/sub-001_T2w.nii.gz
$ echo ${filename//001/002}
sub-002/anat/sub-002_T2w.nii.gz
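Combined with a loop and mv, this becomes a batch rename. The sketch below works in a throwaway directory created with mktemp so it is safe to run; the filenames are made up for illustration.

```shell
#!/usr/bin/env bash
# Create a scratch directory with two example files.
workdir=$(mktemp -d)
touch "$workdir/sub-001_T1w.nii.gz" "$workdir/sub-001_T2w.nii.gz"

for f in "$workdir"/sub-001_*; do
  name=${f##*/}                            # file name without the directory
  mv -- "$f" "$workdir/${name//001/002}"   # replace every "001" in the name only
done

ls "$workdir"
```

The replacement is applied to the file name only, not to the whole path, so a directory that happens to contain 001 is left alone.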
Strip out the .nii.gz suffix
$ FILE="sub-01_T1.nii.gz"
$ FILE=${FILE/.nii.gz/}
$ echo $FILE
sub-01_T1
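Note that / removes the first occurrence anywhere in the string, not necessarily the one at the end. A short sketch with a contrived filename (my own, chosen so that .nii.gz occurs twice) shows the difference from the end-anchored forms:

```shell
#!/usr/bin/env bash
FILE="sub-01.nii.gz_copy.nii.gz"   # contrived: ".nii.gz" appears twice

first=${FILE/.nii.gz/}      # first occurrence, wherever it is
suffix=${FILE%.nii.gz}      # only a match at the very end
anchored=${FILE/%.nii.gz/}  # "/%" also anchors the pattern at the end

printf '%s\n' "$first" "$suffix" "$anchored"
# sub-01_copy.nii.gz
# sub-01.nii.gz_copy
# sub-01.nii.gz_copy
```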
sed 's/\([.*$]\)/_orig\1/g'
- Add _orig into the filename before the file suffix (e.g., file.txt --> file_orig.txt)
$ echo $file
some_file_name.txt
$ echo $file | sed 's/\([.*$]\)/_orig\1/g'
some_file_name_orig.txt
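Syntax:
\( and \)
- capture a block
\1
- in the replacement, stands for the block captured between the \( and the \) above
[.*$]
- a bracket expression that matches a single literal ., * or $ character (inside brackets these characters lose their special meaning, so this is not "everything up to the end of the line"); here it matches the dot before the suffix. Because of the g flag, _orig is inserted before every dot, so the command gives the intended result only for filenames with a single dot.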
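The same _orig insertion can be sketched without the per-dot behaviour, either in pure bash or with a sed pattern anchored to the last dot. This is one possible variant, not the only way to do it; the second filename is made up to show a name with two dots.

```shell
#!/usr/bin/env bash
add_orig_bash() {   # hypothetical helper: pure parameter expansion
  printf '%s\n' "${1%.*}_orig.${1##*.}"
}
add_orig_sed() {    # hypothetical helper: match only the last ".suffix"
  # \.[^.]*$ is a dot followed by non-dot characters up to the end of the line;
  # & in the replacement stands for the whole match.
  printf '%s\n' "$1" | sed 's/\.[^.]*$/_orig&/'
}

add_orig_bash some_file_name.txt   # prints some_file_name_orig.txt
add_orig_sed  some_file_name.txt   # prints some_file_name_orig.txt
add_orig_sed  sub-01_T1w.nii.gz    # prints sub-01_T1w.nii_orig.gz
```

Both variants assume the name contains at least one dot; treating .nii.gz as a single suffix would need the suffix to be named explicitly.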
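sed -i -e '/<my_line>/ s/^#*/#/g' some_file.txt
- comment out line (add #) in some_file.txt. The flags are written as -i -e because GNU sed reads -ie as -i with the backup suffix e and leaves a some_file.txte copy behind.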
sed -i -e '/<my_line>/ s/^#*//g' some_file.txt
- uncomment line (remove #) in some_file.txt
# Insert some line into some_file.txt file using echo
$ echo "BIN_PATH=/usr/local/bin" >> some_file.txt
$ cat some_file.txt
BIN_PATH=/usr/local/bin
# Comment out every line containing BIN_PATH
$ sed -i -e '/BIN_PATH/ s/^#*/#/g' some_file.txt
$ cat some_file.txt
#BIN_PATH=/usr/local/bin
# Uncomment every line containing BIN_PATH
$ sed -i -e '/BIN_PATH/ s/^#*//g' some_file.txt
$ cat some_file.txt
BIN_PATH=/usr/local/bin
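Since the -i option behaves differently between GNU and BSD/macOS sed, here is a sketch of the same comment/uncomment steps that avoids -i and rewrites the file through a temporary copy. The function names comment_out and uncomment and the LOG_LEVEL line are my own, for illustration only; the pattern is used as a sed regex, so it should not contain a /.

```shell
#!/usr/bin/env bash
comment_out() {   # usage: comment_out PATTERN FILE
  local tmp
  tmp=$(mktemp)
  # Replace any run of leading # (possibly empty) with exactly one #.
  sed "/$1/ s/^#*/#/" "$2" > "$tmp" && cat "$tmp" > "$2"
  rm -f "$tmp"
}
uncomment() {     # usage: uncomment PATTERN FILE
  local tmp
  tmp=$(mktemp)
  # Remove all leading # characters from matching lines.
  sed "/$1/ s/^#*//" "$2" > "$tmp" && cat "$tmp" > "$2"
  rm -f "$tmp"
}

conf=$(mktemp)
printf '%s\n' 'BIN_PATH=/usr/local/bin' 'LOG_LEVEL=info' > "$conf"
comment_out BIN_PATH "$conf"
cat "$conf"       # first line is now #BIN_PATH=/usr/local/bin
uncomment BIN_PATH "$conf"
```

Writing back with cat instead of mv keeps the original file's permissions and ownership.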
Remove part of a string from each line (only the first occurrence on each line is removed; ! is used as the s delimiter instead of /)
sed -i 's!<string_to_remove>!!' <file>.csv