@sepastian
Created September 18, 2020 08:25
Unzip with correct encoding in filenames
#!/bin/bash
set -euo pipefail
IFS=$'\n\t'
# Unzip a zipfile, then fix encoding in filenames.
#
# Assume filenames have been encoded with CP1250 (Win);
# convert encoding in filenames to UTF-8.
#
# CP1250 works for German versions of Windows;
# other locales may need a different code page.
# For a full list, see https://en.wikipedia.org/wiki/Windows_code_page.
if [ $# -ne 1 ]; then
    echo "Usage: $(basename "$0") ZIPFILE" >&2
    exit 2
fi
zipfile="$1"
# Unzip.
unzip "$zipfile"
# Fix encoding in filenames.
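# Read the entry names into an array so that names containing spaces
# or glob characters are passed to convmv unchanged.
mapfile -t entries < <(zipinfo -1 "$zipfile")
for f in "${entries[@]}"
do
    convmv -f CP1250 -t UTF-8 --nfc --replace --notest "$f"
done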
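
The script's comments note that another code page may be needed. One way to pick it is to decode a sample of the raw filename bytes with several candidates and see which result reads correctly. The following is a minimal sketch, not part of the original script: the helper name `try_codepages` and the sample filename are hypothetical, and it assumes an `iconv` that knows the listed code pages.

```shell
#!/bin/bash
# Hypothetical helper (an illustration, not part of the gist):
# decode the raw bytes of a filename with each candidate code page
# and print one result per line.
try_codepages() {
    local raw="$1"; shift
    local cp
    for cp in "$@"; do
        printf '%s: ' "$cp"
        # iconv exits non-zero on bytes that are undefined in a code page
        # or on an unknown code page name; report that instead of aborting.
        printf '%s' "$raw" | iconv -f "$cp" -t UTF-8 2>/dev/null || printf '(cannot decode)'
        printf '\n'
    done
}

# Sample name with two non-ASCII bytes: 0xF6 and 0xDF.
# Both CP1250 and CP1252 map these bytes to "ö" and "ß".
try_codepages $'Gr\xf6\xdfe.txt' CP1250 CP1252 CP850
```

Whichever candidate prints a readable name is the value to pass to `convmv -f` in the script above.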