- Follow the instructions given in the simple sample `.yml` in the example section. It has self-explanatory inline comments.
- Validate the final `.yml` file in a YAML validator.
- Use an online RegEx tool to create and test the regular expressions.
- Follow the file name convention `vendorname_templatename_NN.yml`.
- For the fields you want to extract, use the `bp_` prefix; other fields should be treated as dummy (a workaround to mitigate known issues in the underlying library).
- During development, keep the `.yml` file in a designated S3 bucket folder. The pattern is always `[bucketname]/templates/[vendorname]/your.yml`.
- Once the `.yml` file is tested and certified as working, keep the file in source control. The folder structure should match the S3 bucket.
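The naming and folder conventions above can be checked before uploading a template. The following is only a sketch: it assumes that `NN` is a two-digit sequence number and that vendor and template names use lowercase letters and digits, neither of which is stated here. The function `template_key` and the file name `breb_ebbill_01.yml` are hypothetical.

```python
import re

# Assumed reading of vendorname_templatename_NN.yml:
# lowercase letters/digits for both names, NN = two digits.
TEMPLATE_NAME = re.compile(r"^[a-z0-9]+_[a-z0-9]+_\d{2}\.yml$")


def template_key(vendor_folder, file_name):
    """Return the relative bucket path templates/<vendor>/<file>.

    Raises ValueError if the file name does not follow the convention.
    """
    if TEMPLATE_NAME.match(file_name) is None:
        raise ValueError(
            f"{file_name!r} does not follow vendorname_templatename_NN.yml"
        )
    return f"templates/{vendor_folder}/{file_name}"


key = template_key("BREB", "breb_ebbill_01.yml")
print(key)
```

The returned value is the relative path inside the bucket, so the same string can be reused for the folder layout in source control.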
This lambda function
The same is available in the root of this Git repository
"\"DHDHMI6\\nBangladesh Rural\\nElectrification Board.\\nBREB\\nDate 2020-01-05 16:34:48\\nNo 0000000132028173\\nMetar No.: 040530001029\\nCustomer\\n010513063921627\\nNo.\\nYedot-Co\\nCustomer\\nName:\\nBangladesh Co.\\nDepartment: Salma\\nOperator\\nSalma\\nSequence\\n9\\nLtd.\\n37859.95 TK\\n3675.72kWh\\n250 TK\\nEnergy\\nCost:\\nEnergy\\n(10.3/kWh):\\nMeter Rent-\\n3P\\n(250/month):\\nDemand\\nCharge\\n(30/kW):\\nVAT(5%):\\nRebate{ 1%):\\n\\u0410\\u0442\\u0435\\u0430\\u0433\\nRecovery:\\n360 TK\\n1904.76 TK\\n-374.71 TK\\nOTK\\n40000 TK\\nGross\\nAmount:\\n5313-7505-7686-7027-7399\\nPlease press Enter after each 20-0GILS\\nToken chan continue to another new Token\\n\" [*****->Dynamic content (start)] GrandTotal:319.00 InvoiceDate:20-10-2015 InvoiceNo:#BLR_WFLD20151000982590 [<-*****Dynamic content (end)]"
# Mandatory field 1: hard-code the name of the vendor / supplier
issuer: BREB
fields:
  # Mandatory field 2: always hard-code the RegEx pattern below
  amount: GrandTotal:(\d+\.\d+)
  # Mandatory field 3: always hard-code the RegEx pattern below
  date: InvoiceDate:(\d{1,4}\-\d{1,2}\-\d{1,4})
  # Mandatory field 4: always hard-code the RegEx pattern below
  invoice_number: InvoiceNo:(\S+)
  # The actual field extraction regular expressions start from here.
  # Tips & key points:
  # Prefix with "bp_"
  # If there is no match, you will not get this attribute in the result
  bp_department: (?s)\\nDepartment:(.*)\\nOperator
  bp_token: \d{4}\-\d{4}\-\d{4}\-\d{4}
# "keywords" is mandatory. Add one or more unique keyword(s) from the bills or documents in question.
keywords:
  - BREB
options:
  remove_whitespace: false
#---------------------------------------------
# Important: for this template to work, the dummy one-line string below must be present in the "Text" to which the RegEx is going to be applied
#[*****->Dynamic content (start)] GrandTotal:319.00 InvoiceDate:20-10-2015 InvoiceNo:#BLR_WFLD20151000982590 [<-*****Dynamic content (end)]
#---------------------------------------------
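The patterns in the template can be tried locally before calling the service. The sketch below uses only Python's standard `re` module on a shortened excerpt of the sample text. It is not the extraction logic of the underlying library, and the helper `extract` is hypothetical. The `\n` sequences in the sample are a literal backslash followed by `n`, which is why the template escapes them as `\\n`.

```python
import re

# Shortened excerpt of the sample text. The raw-string parts keep "\n" as
# two literal characters (backslash + n), as in the raw data.
raw_text = (
    r"Name:\nBangladesh Co.\nDepartment: Salma\nOperator\nSalma\nSequence\n9"
    r"\nGross\nAmount:\n5313-7505-7686-7027-7399\n"
    " [*****->Dynamic content (start)] GrandTotal:319.00 "
    "InvoiceDate:20-10-2015 InvoiceNo:#BLR_WFLD20151000982590 "
    "[<-*****Dynamic content (end)]"
)

# Patterns copied from the template's "fields" section.
patterns = {
    "amount": r"GrandTotal:(\d+\.\d+)",
    "date": r"InvoiceDate:(\d{1,4}\-\d{1,2}\-\d{1,4})",
    "invoice_number": r"InvoiceNo:(\S+)",
    "bp_department": r"(?s)\\nDepartment:(.*)\\nOperator",
    "bp_token": r"\d{4}\-\d{4}\-\d{4}\-\d{4}",
}


def extract(text, field_patterns):
    """Return the first match per field; fields without a match are omitted."""
    result = {}
    for name, pattern in field_patterns.items():
        match = re.search(pattern, text)
        if match is None:
            continue
        # Use the capture group when the pattern has one, else the whole match.
        result[name] = match.group(1) if match.groups() else match.group(0)
    return result


fields = extract(raw_text, patterns)
for name, value in fields.items():
    print(f"{name}: {value!r}")
```

Note that `bp_department` keeps the leading space after `Department:`, because the capture group starts immediately after the colon.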
Method
POST
End-point
https://h2iwgv44ul.execute-api.us-east-2.amazonaws.com/dev/regex
Request Body
`templateFile` can be any valid public URL or relative bucket path, as shown in the example below.
{
"bucketName": "toji.docs.bills-dev",
"templateFile": "templates/BREB/{templatename.yml}",
"rawData": "{copy the raw text against which the RegEx pattern to be applied}",
"documentCode": "{{Any dummy valid numeric number}}"
}
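A request with this body could be prepared from Python roughly as follows. This is a sketch using only the standard library: the `Content-Type` header is an assumption (not stated here), `build_request` is a hypothetical helper, and the template file name and raw text are placeholders. The request is only constructed, not sent.

```python
import json
import urllib.request

ENDPOINT = "https://h2iwgv44ul.execute-api.us-east-2.amazonaws.com/dev/regex"


def build_request(bucket_name, template_file, raw_data, document_code):
    """Build (but do not send) the POST request for the RegEx end-point."""
    payload = {
        "bucketName": bucket_name,
        "templateFile": template_file,
        "rawData": raw_data,
        # Sent as a string, matching the example request body.
        "documentCode": str(document_code),
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


request = build_request(
    "toji.docs.bills-dev",
    "templates/BREB/breb_ebbill_01.yml",  # placeholder template name
    "GrandTotal:319.00 InvoiceDate:20-10-2015 InvoiceNo:#X1",  # placeholder text
    23,
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```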
Response Body
- If everything goes well, you will get the response as a valid JSON string.
- In the response, `amount`, `date`, and `invoice_number` are dummy values, as noted above.
- In the response, the attributes that start with `bp_` are the valid data attributes.
[
{
"message": "success",
"data": {
"issuer": "BREB",
"amount": 321.0,
"date": "2015-10-20 00:00:00",
"invoice_number": "#BLR_WFLD20151000982590",
"bp_department": " Salma",
"bp_token": "5313-7505-7686-7027",
"currency": "EUR",
"desc": "Invoice from BREB"
},
"documentCode": "23",
"templateFile": "templates/BREB/breb_ebbill_senthil_test.yml",
"execution_time": "0:00:02"
},
200
]
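On the client side, the real attributes can be separated from the dummy values by their prefix. The sketch below assumes the response always has the two-element shape shown in the example (a result object followed by a status code) and that the prefix is `bp_`, as in the sample data. The function name `extracted_attributes` is hypothetical.

```python
import json

# Trimmed copy of the example response.
response_text = """
[
  {
    "message": "success",
    "data": {
      "issuer": "BREB",
      "amount": 321.0,
      "date": "2015-10-20 00:00:00",
      "invoice_number": "#BLR_WFLD20151000982590",
      "bp_department": " Salma",
      "bp_token": "5313-7505-7686-7027",
      "currency": "EUR",
      "desc": "Invoice from BREB"
    },
    "documentCode": "23"
  },
  200
]
"""


def extracted_attributes(text, prefix="bp_"):
    """Return only the attributes in "data" whose names start with the prefix."""
    body, status = json.loads(text)
    if status != 200 or body.get("message") != "success":
        raise RuntimeError(f"extraction failed with status {status}")
    return {k: v for k, v in body["data"].items() if k.startswith(prefix)}


attributes = extracted_attributes(response_text)
print(attributes)
```

Values are returned exactly as extracted, so callers may want to strip surrounding whitespace (for example, the leading space in `bp_department`).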
Follow these online blog articles and video tutorials on how to use invoice2data, the underlying open-source library, which has been forked to suit our requirements.