@redstreet
Last active October 9, 2024 03:06
├── documents
│   ├── Assets
│   │   ├── Banks
│   │   │   ├── Bank-of-USA <--- *.pdf, *.jpg, etc. (documents)
│   ├── Expenses
├── ingest
│   ├── ingest_output
│   ├── filingcabinet <--- bean-file's root directory
│   │   ├── Assets
│   │   │   ├── Banks
│   │   │   │   ├── Bank-of-USA <--- *.ofx, *.csv, *.xlsx, etc.
│   │   │   ├── ...
│   │   ├── Income
│   │   │   └── Employment
│   │   └── Liabilities
│   │       └── Credit-Cards
│   │           └── ...
├── prices <--- prices.bc
├── main.bc <--- include "source/Assets.Banks.xyz.bc", etc.
├── source <--- Assets.Banks.[...].bc, ...
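
A minimal sketch for creating the skeleton of the layout above under a new root. The function name `make_bean_layout` is hypothetical and not part of the gist; the account-specific directories (such as Bank-of-USA) are left out because they depend on your own accounts.

```shell
# make_bean_layout ROOT  (hypothetical helper, for illustration only)
# Creates the top-level directories from the layout above, plus empty
# main.bc and prices/prices.bc files if they do not exist yet.
make_bean_layout() {
    local root="$1"
    mkdir -p \
        "$root/documents/Assets/Banks" \
        "$root/documents/Expenses" \
        "$root/ingest/ingest_output" \
        "$root/ingest/filingcabinet" \
        "$root/prices" \
        "$root/source" || return 1
    # Create the ledger files only when missing, so existing data is never truncated.
    [ -e "$root/main.bc" ] || : > "$root/main.bc"
    [ -e "$root/prices/prices.bc" ] || : > "$root/prices/prices.bc"
}
```

It could be called as `make_bean_layout "$BEAN_ROOT"` before the first import.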
#!/bin/zsh
setopt extendedglob
# exit when any command fails
# set -e
# keep track of the last executed command
trap 'last_command=$current_command; current_command=$ZSH_DEBUG_CMD' DEBUG
# echo an error message before exiting
trap 'echo "\"${last_command}\" command failed with exit code $?."' EXIT
# https://stackoverflow.com/questions/192249/how-do-i-parse-command-line-arguments-in-bash
while [[ "$#" -gt 0 ]]; do
    case $1 in
        -x|--unimplemented) target="$2"; shift ;; # just for documentation
        -n|--nofile) nofile=1 ;;
        *) echo "Unknown parameter passed: $1"; exit 1 ;;
    esac
    shift
done
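# Abort early if BEAN_ROOT is unset, so the rm -frv below cannot hit an unintended path
INGEST_ROOT=${BEAN_ROOT:?BEAN_ROOT must be set}/ingest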
INGEST_OUTPUT=$INGEST_ROOT/ingest_output
ROOT_DIR=~/Downloads/
SCRIPT_DIR=$(dirname -- "$0")
alias bi="bean-identify ${INGEST_ROOT}/my.import "
alias be="bean-extract -o --dir ${INGEST_OUTPUT} ${INGEST_ROOT}/my.import "
alias bf="bean-file -o ${INGEST_ROOT}/filingcabinet ${INGEST_ROOT}/my.import "
# Collect candidate downloads, one per line; when find prints nothing the array stays empty
files=(${(f)"$(find "$ROOT_DIR" -maxdepth 1 \( -iname "*.qfx" -o -iname "*.ofx" -o -iname "*.xml" -o -iname "*.csv" -o -iname "*.xlsx" \))"})
echo "Ingesting ${#files[@]} files."
rm -frv ${INGEST_OUTPUT}
mkdir -p ${INGEST_OUTPUT}
# So zerosum doesn't run: both for performance and correctness (smart_importer)
if be -f <(echo 'plugin "beancount.plugins.auto_accounts"'; cat ${INGEST_ROOT}/../source/* ) $files ; then
    echo "Return value of bean-extract: $?"
    if [[ "$nofile"x != "1x" ]]; then
        bf $files
    fi
fi
echo "-----------------------------------"
echo "${#files[@]} files processed. Updating prices.bc."
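# Optional: Collect all price entries in a separate prices.bc file
touch ${INGEST_OUTPUT}/noaccount.bc
# Append every line containing "price" to prices.bc (|| keeps going when grep matches nothing)
grep price ${INGEST_OUTPUT}/noaccount.bc >> ${BEAN_ROOT}/prices/prices.bc || echo ''
# Append a dummy price line and a blank line, then delete all price lines and blank lines
echo 'price' >> ${INGEST_OUTPUT}/noaccount.bc
echo '' >> ${INGEST_OUTPUT}/noaccount.bc
sed -i '/price/d' ${INGEST_OUTPUT}/noaccount.bc
sed -i '/^$/d' ${INGEST_OUTPUT}/noaccount.bc
# Remove noaccount.bc if nothing is left in it
[[ ! -s ${INGEST_OUTPUT}/noaccount.bc ]] && rm -fv ${INGEST_OUTPUT}/noaccount.bc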
echo "Import done. Editing all imported files."
vim -p ${INGEST_OUTPUT}/**/*.bc
echo "Append to master? (Ctrl+c to exit)"
read
${INGEST_ROOT}/append_to_master.sh
rmdir --ignore-fail-on-non-empty ~/Downloads/skipped
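
Read literally, the price-handling lines in the script above move every line containing "price" from noaccount.bc into prices.bc, drop blank lines, and delete noaccount.bc if nothing remains. The sketch below shows one way to get the same effect with two grep passes. The function name `split_prices` is hypothetical and not part of the gist, and the pattern is kept as loose as the original (any line containing the word "price").

```shell
# split_prices SRC DEST  (hypothetical helper, for illustration only)
# Appends every line of SRC containing "price" to DEST, then rewrites SRC
# without those lines and without blank lines. SRC is removed if it ends up empty.
split_prices() {
    local src="$1" dest="$2" tmp
    [ -f "$src" ] || return 0
    tmp=$(mktemp) || return 1
    # grep exits with status 1 when nothing matches; "|| true" treats that as fine.
    grep -e 'price' "$src" >> "$dest" || true
    grep -v -e 'price' -e '^$' "$src" > "$tmp" || true
    if [ -s "$tmp" ]; then
        # Write back through redirection so SRC keeps its original permissions.
        cat "$tmp" > "$src"
        rm -f "$tmp"
    else
        rm -f "$tmp" "$src"
    fi
}
```

In the script's terms, this would be called as `split_prices "${INGEST_OUTPUT}/noaccount.bc" "${BEAN_ROOT}/prices/prices.bc"`.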

richban commented Jan 9, 2022

Works like a charm! 👍

However, I am kinda confused about these lines:

grep price ${INGEST_OUTPUT}/noaccount.bc >> ${BEAN_ROOT}/prices/prices.bc || echo ''
echo 'price' >> ${INGEST_OUTPUT}/noaccount.bc
echo '' >> ${INGEST_OUTPUT}/noaccount.bc
sed -i '/price/d' ${INGEST_OUTPUT}/noaccount.bc
sed -i '/^$/d' ${INGEST_OUTPUT}/noaccount.bc
[[ ! -s ${INGEST_OUTPUT}/noaccount.bc ]] && rm -fv ${INGEST_OUTPUT}/noaccount.bc

What are you trying to achieve here? What's the purpose of the prices.db ?


richban commented Jan 9, 2022

Based on https://reds-rants.netlify.app/personal-finance/automatically-categorizing-postings/

should this line: https://gist.github.com/redstreet/6f1addb87c667826fb79b509d5d88a51#file-process_all_files-zsh-L35

correspond to:

BEAN_SRC=$(bean-identify my.import $file | grep "^Account:" | sed 's/Account: *//' | sed 's#:#.#g')
BEAN_SRC="${INGEST_ROOT}/../source/${BEAN_SRC}.bc"
bean-extract my.import -f $BEAN_SRC $file
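
The account-to-filename mapping in the snippet above can be sketched as a small filter. The function name `account_to_source` is hypothetical, and the sketch assumes, as the snippet does, that bean-identify prints a line starting with `Account:`.

```shell
# account_to_source SOURCE_DIR  (hypothetical helper, for illustration only)
# Reads bean-identify output on stdin and prints the matching source file path,
# e.g. "Account: Assets:Banks:Bank-of-USA" becomes SOURCE_DIR/Assets.Banks.Bank-of-USA.bc.
# Returns 1 when the input has no "Account:" line.
account_to_source() {
    local source_dir="$1" account
    account=$(grep '^Account:' | head -n 1 | sed -e 's/^Account: *//' -e 's#:#.#g')
    [ -n "$account" ] || return 1
    printf '%s/%s.bc\n' "$source_dir" "$account"
}
```

With it, the snippet's first two lines would collapse to `BEAN_SRC=$(bean-identify my.import "$file" | account_to_source "${INGEST_ROOT}/../source")`.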

@redstreet
Author

Responses to the questions above are here.


tavlima commented Sep 17, 2022

Hey @redstreet, it seems there are some leftovers from your latest updates, like this dangling $processed variable here. Would you mind updating this gist to reflect your latest version of the script?

Also, a sketch of your directory layout would go a long way toward helping others understand your logic here.

Thanks for the blog posts!

@redstreet
Author

redstreet commented Sep 17, 2022

@tavlima good points.

  • script updated
  • added directory layout

Hope that helps!

Glad to hear the articles are useful!

@kantskernel

I've been a bit confused about ingest_output. Is this a directory that would be in directory-layout next to filingcabinet? Is this where the main.beancount would be?

The -o and --dir options of bean-extract are erroring out for me, so I've been trying something like be -f $files > ${INGEST_OUTPUT}/test.beancount

@redstreet
Author

redstreet commented Dec 25, 2022

ingest_output is a sibling of filingcabinet. I've updated the gist to illustrate this, and to show where main.bc lives.

-o and --dir need a patch to be applied. I have the patch in another gist. Search for patch in this article.

Hope that helps!
