Skip to content

Instantly share code, notes, and snippets.

@lcpz
Last active August 4, 2023 17:04
Show Gist options
  • Star 10 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save lcpz/be49462beef56a63f5dbaa1c8bd27f16 to your computer and use it in GitHub Desktop.
Save lcpz/be49462beef56a63f5dbaa1c8bd27f16 to your computer and use it in GitHub Desktop.
Giacenza Media Annua for Revolut accounts
#!/bin/bash
# Compute the Giacenza Media Annua (GMA) of a Revolut account, required by the Italian INPS.
# Dependencies: awk, GNU date, getopts, xargs
# Computation method: https://bit.ly/3avDLu3
# Thread on Revolut forum: https://bit.ly/3511e5h
# Assumptions:
# 1. Dates are in YYYY-MM-DD format. The Revolut app uses your locale, so if this is set
# to it-IT, then this assumption is already satisfied. Otherwise, convert the dates in
# your CSV to ISO 8601 before using the script.
# 2. The time interval considered is between January 1 and December 31 of the same year
# (since it is an annual value).
# 3. Transactions are in ascending date order.
# 4. The actual balance of day D is the last registered during D. This is because the GMA
# is a weighted arithmetic mean, and the entries of the CSV are sorted by date. Hence,
# if T is an array of transactions registered during D, with |T| > 1, then all entries
# in T except the last one have null weight, because they have been on the account for
# less than one day. Thus, they do not affect the GMA, and can be safely skipped.
version="2022.05"
completed_date=4
balance=10
separator=","
use_decimal_point=false
print_usage() {
printf "Usage: `basename "$0"` [options]\n%-2sOptions:\n"
printf "%-4s-c \n%-6s'Completed Date' column number\n"
printf "%-6sDefault: $completed_date\n"
printf "%-4s-a \n%-6s'Balance' column number\n%-6sDefault: $balance\n"
printf "%-4s-s \n%-6sSeparator character of the input CSV\n"
printf "%-6sDefault: '$separator'\n"
printf "%-4s-d \n%-6sUse point instead of comma as decimal separator\n"
printf "%-6sDefault: $use_decimal_point\n"
}
if [[ $# -lt 1 ]]
then
print_usage
exit 1
fi
while getopts "cbs:dvh" flag; do
case "${flag}" in
c) completed_date="${OPTARG}" ;;
b) balance="${OPTARG}" ;;
s) separator="${OPTARG}" ;;
d) use_decimal_point=true ;;
v) printf "Version: $version\n"
exit 0 ;;
*) print_usage
exit 0 ;;
esac
done
result=0
prev_date=""
year=""
get_days() {
# input values are in seconds, and there are 60 * 60 * 24 = 86400 seconds in a day
days=$((($1 - $2) / 86400))
days=${days#-}
days=$(($days + 1))
printf $days
}
mycmd="awk -F '$separator' '(!unique[\$$completed_date]++ && NR > 1) {print \$$completed_date,\$$balance}' '$1'"
while read curr_date
do
d1=""
d2s=""
if [[ ! $prev_date ]]
then
year=$(date -d "${curr_date% *}" "+%Y")
d1=$(date -d "$year-01-01" +%s)
d1s="1 Jan $year"
prev_date="$year-01-01"
printf "Calculating the Giacenza Media Annua of the year $year\n\n"
fi
if [[ ! $d1 ]]
then
d1=$(date -d "${prev_date%% *}" +%s)
fi
d2=$(date -d "${curr_date%% *}" +%s)
days="$(get_days $d1 $d2)"
money=`echo ${curr_date##* } | xargs`
# approximated to 2 decimal values for ease of reading, comment to see exact values
money=$(awk "BEGIN {printf(\"%.2f\", $money); exit}")
p=$(awk "BEGIN {print $money*$days; exit}")
result=$(awk "BEGIN {print $result+$p; exit}")
d1s=`echo ${prev_date%% *} | xargs`
if [[ ! $d2s ]]
then
d2s=`echo ${curr_date%% *} | xargs`
fi
printf "$d1s to $d2s: $money x $days = $p\n"
prev_date=$curr_date
done <<< `eval $mycmd`
if [[ ! "$d2s" = "$year-12-31" ]] # last date is not the last day of the year
then
days="$(get_days $(date -d "$year-12-31" +%s) $d2)"
p=$(awk "BEGIN {print $money*$days; exit}")
result=$(awk "BEGIN {print $result+$p; exit}")
printf "$d2s to $year-12-31: $money x $days days = $p\n"
fi
result=`awk "BEGIN {print $result/365; exit}"`
if !$use_decimal_point
then
result=${result/\./\,}
fi
if [[ "$result" -lt 0 ]] # if the GMA is negative, the value to declare must be 0
then
result_ceil=0
else
result_ceil=${result%.*}
result_ceil=$((result_ceil + 1))
fi
printf "\nResult: $result, to be rounded up to $result_ceil\n"
exit 0
@lorcalhost
Copy link

Grazie mille!

@ildede
Copy link

ildede commented Jun 2, 2020

Seriously?
5. If there are duplicate dates, only the first (from top) is picked.

In this way the result is like a random number that seems correct.

@lcpz
Copy link
Author

lcpz commented Jun 2, 2020

@ildede It seems that you are against that assumption. I will elaborate on my rationale.

The entries of the input CSV are sorted by date, in ascending order, from top to bottom.

Given that the GMA is a weighted arithmetic mean on 365 points, the actual balance of day x is the last registered for x.

The first entry from top in a group of duplicates G is exactly that.

Following entries in G have a weight of 0, because they have been on the account for less than one day.

Hence, they do not affect the GMA and can be safely skipped.


If you think that this FLOSS is flawed, I am happy to dedicate time to your polite and constructive criticism.

@mastino21
Copy link

mastino21 commented Dec 21, 2021

Does it run with the new Statement format that Revolut provides or is this working with the old format only?
I have some problems getting this to work with the current CSV format, but maybe I am doing something wrong. Thanks.

@lcpz
Copy link
Author

lcpz commented Dec 21, 2021

@mastino21

Try to reformat the CSV as explained here, before using the script.

If Revolut changed the column order as well, I will give it a look myself ASAP.

@mastino21
Copy link

Yeah I did it, but still...maybe check if the new format is different. This is how it looks now.
Screenshot_20211221_185704

@mastino21
Copy link

mastino21 commented Dec 22, 2021

Not an expert but I think the error generates from the new date format of Revolut which consists of day and time. From the error message I get, I guess the script does not recognize time.

os2@localhost:/media/sf_OSK1_shared> bash revolut.sh Revolut-Statement.csv
date: invalid date ‘TOPUP,Current,2021-04-04 06:45:39,2021-04-04 06:45:44,Apple Pay Top-Up by’
Calculating the Giacenza Media Annua of the year 

date: invalid date ‘TOPUP,Current,2021-04-04 06:45:39,2021-04-04 06:45:44,Apple Pay Top-Up by’
awk: cmd. line:1: BEGIN {print *18261; exit}
awk: cmd. line:1:              ^ syntax error
awk: cmd. line:1: BEGIN {print 0+; exit}
awk: cmd. line:1:                ^ syntax error
From TOPUP,Current,2021-04-04 06:45:39,2021-04-04 06:45:44,Apple Pay Top-Up by to 31 Dec:  x 18261 days = 
date: invalid date ‘TOPUP,Current,2021-04-04 06:45:39,2021-04-04 06:45:44,Apple Pay Top-Up by’
date: invalid date ‘TRANSFER,Current,2021-04-04 06:46:50,2021-04-04 06:46:52,To Mark’
awk: cmd. line:1: BEGIN {print *1; exit}
awk: cmd. line:1:              ^ syntax error
awk: cmd. line:1: BEGIN {print +; exit}
awk: cmd. line:1:               ^ syntax error
```
`

@lcpz
Copy link
Author

lcpz commented Dec 23, 2021

@mastino21 Please try the latest revision. A couple of points:

  • Ensure that all dates are in YYYY-MM-DD format. If needed, you can make the conversion both in Excel and LibreOffice.
  • Specify the numbers of the columns Completed Date and Balance, as well as the separator character, as follows:
    ./revolut-gma.bash Revolut-Statement.csv -c 3 -b 13 -s ','

In case it still does not work, please paste me an example statement.

@mastino21
Copy link

mastino21 commented Dec 23, 2021

Sei un grande. I think you nailed it. It works with the Revolut csv out of the box, meaning you do not need to change dates format. Also it seems specifying number of the columns is not really needed.

@atrebla
Copy link

atrebla commented Feb 15, 2022

Hi, thx for your work. In my case he doesn't seem to notice that there are no more records.

awk: cmd. line:1: BEGIN {printf("%.2f", ); exit}
awk: cmd. line:1:                       ^ syntax error
awk: cmd. line:1: BEGIN {print *415; exit}
awk: cmd. line:1:              ^ syntax error
awk: cmd. line:1: BEGIN {print 9858.27+; exit}
awk: cmd. line:1:                      ^ syntax error
2020-12-28 to :  x 415 days = 
awk: cmd. line:1: BEGIN {print *412; exit}
awk: cmd. line:1:              ^ syntax error
awk: cmd. line:1: BEGIN {print +; exit}
awk: cmd. line:1:               ^ syntax error
 to 2020-12-31:  x 412 days = 
awk: cmd. line:1: BEGIN {print /365; exit}
awk: cmd. line:1:               ^ unterminated regexp
awk: cmd. line:1: BEGIN {print /365; exit}
awk: cmd. line:1:                         ^ unexpected newline or end of string

@lcpz
Copy link
Author

lcpz commented Feb 20, 2022

@atrebla I don't understand what you mean. Can you provide an example input? Just a few lines would suffice.

@Mte90
Copy link

Mte90 commented May 3, 2022

Su linux crasha, questa correzione risolve il problema:

result=${result/\./\,}
if [[ "$result" -lt 0 ]] # if the GMA is negative, the value to declare must be 0

A quanto pare il punto invece della virgola da dei problemi.

@lcpz
Copy link
Author

lcpz commented May 3, 2022

@Mte90 Please check if this patch works.

--- revolut-gma.bash	2022-05-03 13:09:54.134159031 +0200
+++ revolut-gma.bash.new	2022-05-03 13:12:38.554667320 +0200
@@ -20,11 +20,12 @@
 #    in T except the last one have null weight, because they have been on the account for
 #    less than one day. Thus, they do not affect the GMA, and can be safely skipped.
 
-version="2021.12"
+version="2022.05"
 
 completed_date=4
 balance=10
 separator=","
+use_decimal_point=false
 
 print_usage() {
     printf "Usage: `basename "$0"`  [options]\n%-2sOptions:\n"
@@ -33,6 +34,8 @@
     printf "%-4s-a \n%-6s'Balance' column number\n%-6sDefault: $balance\n"
     printf "%-4s-s \n%-6sSeparator character of the input CSV\n"
     printf "%-6sDefault: '$separator'\n"
+    printf "%-4s-d \n%-6sUse point instead of comma as decimal separator\n"
+    printf "%-6sDefault: $use_decimal_point\n"
 }
 
 if [[ $# -lt 1 ]]
@@ -41,11 +44,12 @@
     exit 1
 fi
 
-while getopts "cbs:vh" flag; do
+while getopts "cbs:dvh" flag; do
     case "${flag}" in
         c) completed_date="${OPTARG}" ;;
         b) balance="${OPTARG}" ;;
         s) separator="${OPTARG}" ;;
+        d) use_decimal_point=true ;;
         v) printf "Version: $version\n"
            exit 0 ;;
         *) print_usage
@@ -116,7 +120,12 @@
 
 result=`awk "BEGIN {print $result/365; exit}"`
 
-if [[ result -lt 0 ]] # if the GMA is negative, the value to declare must be 0
+if !$use_decimal_point
+then
+    result=${result/\./\,}
+fi
+
+if [[ "$result" -lt 0 ]] # if the GMA is negative, the value to declare must be 0
 then
     result_ceil=0
 else

@Mte90
Copy link

Mte90 commented May 3, 2022

I think that should work :-)

@lcpz
Copy link
Author

lcpz commented May 3, 2022

Thanks, patch applied.

@ccomploj
Copy link

ccomploj commented Aug 4, 2023

Copying relevant info for users from the author, from
https://community.revolut.com/t/average-daily-balance-giacenza-media-necessary-for-italian-taxes/74290/56:

Hi all, I’ve made a little script in bash for calculating the Italian Giacenza Media Annua. Download it here 226.
It’s a bash script, you can run it on any major operating system (check this page 56 for Windows).
It is based on 2 assumptions:

Dates are in English and have one of the following formats: ISO 8601; RFC 5322; RFC 3339. For instance, valid dates are: 24 Jul 2019, 2020-07-24.
Numbers only have the decimal dot. For instance, “3.120,49” is not valid, while “3120.49” is valid.

These requirements are met by the UK version of the Revolut app. I don’t know if switching to English language produces them. Otherwise, just edit your CSV accordingly.
I hope this helps.
I would suggest Revolut to only use ISO 8601 dates and only one decimal point for numbers with a fractional part.

@ccomploj
Copy link

ccomploj commented Aug 4, 2023

@lcpz I think this file is dangerous as it may make serious errors.

First, note that:
I get the error revolut-gma.bash: line 123: !false: command not found revolut-gma.bash: line 128: [[: 895.307: syntax error: invalid arithmetic operator (error token is ".307")
at the end of the file. Can this be ignored?

Second, I get the error awk: cmd. line:1: BEGIN {printf("%.2f", ); exit} awk: cmd. line:1: ^ syntax error awk: cmd. line:1: BEGIN {print *2; exit} awk: cmd. line:1: ^ syntax error awk: cmd. line:1: BEGIN {print 200708+; exit} awk: cmd. line:1: ^ syntax error
This is because part of the transactions are in Product==Savings instead of Product==Current. In this account, I had some 0s, which led the program to use these 0s for calculation (as it uses the first entry on a day). I do not know how to interpret the error (I do not know bash).

Third, I used 3 methods and get 3 results (if i ignore the above errors anyway). For each csv file, I note what the first two lines look like:

  • using the original EUR statement.csv, I see the dates in excel are not sorted. The Ubuntu reader reads dates as 2022-01-01 to 2022-06-13: 100.00 x 163 = 16300 || 2022-06-13 to 2022-01-02: 7.79 x 162 = 1261.98.
  • after I sort this statement.csv using Excel in ascending order, the file reads dates as 2022-01-01 to 1/2/2022: 7.79 x 2 = 15.58 || 1/2/2022 to 1/3/2022: 107.79 x 2 = 215.58
  • due to this date format change (which appears to happen automatically in Excel when sorting), I change the format again to YYYY-MM-DD, and get 2022-01-01 to 2022-06-13: 100.00 x 163 = 16300 || 2022-06-13 to 2022-01-02: 7.79 x 162 = 1261.98.
    These three methods give amounts 896, 660 and 551, respectively.

Thanks for the effort of writing this file, but I think caution is needed or patches should be applied. I am sorry I cannot help further, as I have no idea about bash or this coding language.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment