valosekj/string_manipulation.md

Useful string manipulation examples for bash and zsh

I often need to modify filenames, or replace or delete parts of them, across a large number of files. I used to use the sed command for this, but I recently found out that all of these tasks can be done directly in bash (or zsh), because both shells support so-called string manipulation operations (parameter expansion).
String manipulations

Define some example string:
$ file=some_file_name.txt
$ echo $file
some_file_name.txt

${#string} - Get string length
$ echo ${#file}
18

${string%.*} - Delete the part matching a pattern from the end of the string: % removes the shortest match, %% the longest match
$ echo ${file%.*}
some_file_name

$ echo ${file%_*}
some_file

$ echo ${file%%_*}
some

${string#*_} - Delete the part matching a pattern from the front of the string: # removes the shortest match, ## the longest match
$ echo ${file#*_}
file_name.txt

$ echo ${file##*_}
name.txt

$ echo ${file#*.}
txt

Keep only the filename from a full path
$ path_to_file="/home/someuser/sub-01/anat/sub-01_T1w.nii.gz"
$ echo ${path_to_file##*/}
sub-01_T1w.nii.gz
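
The # and % operators above can be combined to take a path apart without calling basename, dirname, or sed. The following is only a sketch: split_path is a hypothetical helper name, and it assumes the path contains at least one slash and the filename at least one dot.

```shell
#!/usr/bin/env bash
# split_path is a hypothetical helper, not part of bash itself.
# Assumes the path contains a "/" and the filename contains a ".".
split_path() {
    local path=$1
    local dir=${path%/*}     # remove the shortest "/*" match from the end
    local name=${path##*/}   # remove the longest "*/" match from the front
    local stem=${name%%.*}   # remove the longest ".*" match from the end
    local ext=${name#*.}     # remove the shortest "*." match from the front
    printf '%s\n' "$dir" "$name" "$stem" "$ext"
}

split_path "/home/someuser/sub-01/anat/sub-01_T1w.nii.gz"
# prints:
# /home/someuser/sub-01/anat
# sub-01_T1w.nii.gz
# sub-01_T1w
# nii.gz
```

Using %% and # here keeps the double extension nii.gz together; with % and ## the split would instead happen at the last dot.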

${string%% *} - Keep the first word of a string of space-separated words - https://unix.stackexchange.com/a/201744
$ string="one two three"
$ echo ${string%% *}
one

${string:(POSITION)} - Extract the part of the string starting at POSITION; a negative position in parentheses counts from the end (here the last 4 characters)
$ echo ${file:(-4)}
.txt

${string:POSITION:LENGTH} - Extract part of the string from a given position with a given length. Positions are zero-based, so this example extracts 5 characters starting at offset 4 (the 5th character)
$ echo ${file:4:5}
_file
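
Offset-based extraction is handy for fixed-width fields. As a sketch, assuming a hypothetical 8-digit date stamp in YYYYMMDD form (split_date is an illustrative name, not an existing command):

```shell
#!/usr/bin/env bash
# split_date is a hypothetical helper; it assumes an 8-character YYYYMMDD input.
split_date() {
    local stamp=$1
    local year=${stamp:0:4}    # 4 characters from offset 0
    local month=${stamp:4:2}   # 2 characters from offset 4
    local day=${stamp:6:2}     # 2 characters from offset 6
    echo "$year-$month-$day"
}

split_date 20240131   # prints 2024-01-31
```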

${string/SUBSTRING/REPLACEMENT} - Replace the first occurrence of SUBSTRING (e.g., some_file_name.txt --> some_fname.txt)
$ echo ${file/file_name/fname}
some_fname.txt

Replace all occurrences using //
$ filename=sub-001/anat/sub-001_T2w.nii.gz
$ echo ${filename//001/002}
sub-002/anat/sub-002_T2w.nii.gz
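
For renaming many files at once, the // replacement can be wrapped in a loop. The sketch below only prints the mv commands instead of running them, so the result can be checked first. rename_preview is a hypothetical helper name, and it assumes the text to replace is a literal string rather than a glob pattern.

```shell
#!/usr/bin/env bash
# rename_preview is a hypothetical helper: it prints, but does not execute,
# the mv commands that would replace every occurrence of $1 with $2.
rename_preview() {
    local from=$1 to=$2 f
    shift 2
    for f in "$@"; do
        # Quoting "$from" makes bash treat it as a literal string, not a pattern.
        printf 'mv -- %s %s\n' "$f" "${f//"$from"/$to}"
    done
}

rename_preview 001 002 sub-001_T1w.nii.gz sub-001_T2w.nii.gz
# prints:
# mv -- sub-001_T1w.nii.gz sub-002_T1w.nii.gz
# mv -- sub-001_T2w.nii.gz sub-002_T2w.nii.gz
```

Once the printed commands look right, the printf line can be swapped for a real mv call.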

Strip the .nii.gz suffix (% only matches at the end of the string)
$ FILE="sub-01_T1.nii.gz"
$ FILE=${FILE%.nii.gz}
$ echo $FILE
sub-01_T1

sed

sed 's/\(\.[^.]*\)$/_orig\1/' - Add _orig to a filename before the file suffix (e.g., file.txt --> file_orig.txt)
$ echo $file
some_file_name.txt
$ echo $file | sed 's/\(\.[^.]*\)$/_orig\1/'
some_file_name_orig.txt

Syntax:

\( and \) - capture a block (group)
\1 - in the replacement, insert the block captured between the \( and the \) above
\.[^.]*$ - match a literal dot (\.) followed by any characters other than a dot ([^.]*) up to the end of the line ($), i.e., the last suffix
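
For comparison, the same _orig insertion can be sketched without sed, using the % and ## operators from the first part of this page. add_orig is a hypothetical helper name, and the sketch assumes the filename contains at least one dot.

```shell
#!/usr/bin/env bash
# add_orig is a hypothetical helper; it assumes the name contains a dot.
add_orig() {
    local name=$1
    # ${name%.*}  -> everything before the last dot
    # ${name##*.} -> everything after the last dot
    printf '%s\n' "${name%.*}_orig.${name##*.}"
}

add_orig some_file_name.txt   # prints some_file_name_orig.txt
```

With a double extension such as archive.tar.gz, this inserts _orig before the last suffix only (archive.tar_orig.gz).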

sed -i '/<my_line>/ s/^#*/#/' some_file.txt - comment out a line (add #) in some_file.txt
sed -i '/<my_line>/ s/^#*//' some_file.txt - uncomment a line (remove #) in some_file.txt

These commands use plain -i as understood by GNU sed. Writing -ie instead makes sed treat e as a backup suffix and leaves a some_file.txte copy behind; BSD/macOS sed needs -i '' for in-place editing without a backup.
# Append a line to some_file.txt using echo
$ echo "BIN_PATH=/usr/local/bin" >> some_file.txt
$ cat some_file.txt
BIN_PATH=/usr/local/bin

# Comment out the line containing BIN_PATH
$ sed -i '/BIN_PATH/ s/^#*/#/' some_file.txt
$ cat some_file.txt
#BIN_PATH=/usr/local/bin

# Uncomment the line containing BIN_PATH
$ sed -i '/BIN_PATH/ s/^#*//' some_file.txt
$ cat some_file.txt
BIN_PATH=/usr/local/bin
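
Because the -i option behaves differently in GNU and BSD sed, a portable variant can write to a temporary file instead. This is a sketch under stated assumptions: comment_line and uncomment_line are hypothetical helper names, and the pattern must not contain a / character, since it is placed directly into the sed address.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the pattern is assumed to contain no "/" characters.

# Put a single # in front of every line matching the pattern.
comment_line() {
    local pattern=$1 file=$2 tmp
    tmp=$(mktemp) || return 1
    # Write the edited text to a temporary file, then copy it back,
    # which keeps the original file's permissions.
    sed "/$pattern/ s/^#*/#/" "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Remove all leading # characters from every line matching the pattern.
uncomment_line() {
    local pattern=$1 file=$2 tmp
    tmp=$(mktemp) || return 1
    sed "/$pattern/ s/^#*//" "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

Usage would look like comment_line BIN_PATH some_file.txt followed by uncomment_line BIN_PATH some_file.txt.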

Remove part of a string from each line (the first occurrence on each line; ! is used as the delimiter of the s command instead of /)
sed -i 's!<string_to_remove>!!' <file>.csv