Skip to content

Instantly share code, notes, and snippets.

@Alkarex
Last active February 7, 2022 03:58
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save Alkarex/687918dd5cb66a3bfc2a661e15ef4237 to your computer and use it in GitHub Desktop.
Save Alkarex/687918dd5cb66a3bfc2a661e15ef4237 to your computer and use it in GitHub Desktop.
Upload large CSV file to SQL database

While default nodes are sufficient to upload small datasets to a database via Node-RED, they do not suffice when handling files of tens of megabytes, let alone gigabytes, which cannot fit in memory.

We take advantage of two special nodes:

For this example, we assume the input file is in CSV format – but it could be something else – and we use the default CSV node to parse the text lines.

For this example, we use a PostgreSQL database (with optional Timescale extension for better handling of time-series) – but it could be another database – thanks to the node-red-contrib-postgresql node.

Node-RED flow

Configuration details

  • The ui-upload node is set to chunks of 256kB in binary mode. Our tests show that this provides good speed and stability.
  • The chunks-to-lines is set to CSV node (for compatibility with the default CSV node), UTF-8 encoding (should be the same than the uploaded file), and outputting 4096 lines at a time. Inserting many lines at a time in database is much faster than one at a time.
  • The CSV node is set to “first row contains column names”, and “output a single message (array)”. Obviously, the separator (comma by default) must be the same than your uploaded file.
  • The postgres node is set to “receive query output”, in order to know when the insertion of a batch of lines in database is done and to trigger the next one.

Usage

  1. Adapt to your database parameters (e.g. database name, column names)
  2. Browse to the Node-RED Dashboard
  3. Click on the Prepare SQL table button to create a blank table in database
  4. In the Upload to SQL section, select a local CSV file on your computer (with columns: time, sensor, data). See sample.csv for an example of mini dataset.
  5. Wait for upload
  6. The number of uploaded lines appears in the Dashboard
  7. You can inspect your database using another tool

Screenshots

Dashboard during upload

Node-RED flow

Dashboard when upload is complete

Node-RED flow

[
{
"id": "e6ec0f0a.388198",
"type": "chunks-to-lines",
"z": "d9a661f4.ef966",
"name": "",
"nbLines": "4096",
"linesFormat": "csv",
"decoder": "UTF-8",
"x": 280,
"y": 940,
"wires": [
[
"436dce51.065f6"
]
]
},
{
"id": "a2a16dc6.fbd32",
"type": "function",
"z": "d9a661f4.ef966",
"name": "Ready for next lines",
"func": "return [\n msg.complete ? msg : null,\n { tick: true },\n];\n",
"outputs": 2,
"noerr": 0,
"initialize": "",
"finalize": "",
"libs": [],
"x": 940,
"y": 880,
"wires": [
[
"87fbc09b.6deeb",
"35615002.196b78"
],
[
"e6ec0f0a.388198"
]
]
},
{
"id": "436dce51.065f6",
"type": "csv",
"z": "d9a661f4.ef966",
"name": "",
"sep": ",",
"hdrin": true,
"hdrout": "none",
"multi": "mult",
"ret": "\\n",
"temp": "",
"skip": "0",
"strings": false,
"include_empty_strings": "",
"include_null_values": "",
"x": 430,
"y": 940,
"wires": [
[
"4127dd44.2d49d4"
]
]
},
{
"id": "4127dd44.2d49d4",
"type": "function",
"z": "d9a661f4.ef966",
"name": "Format SQL",
"func": "if (msg.payload && msg.payload.length > 0) {\n let payload = 'INSERT INTO your_table(time, sensor, data) VALUES ';\n \n for (const line of msg.payload) {\n payload += `('${line.time}', '${line.sensor}', '${line.data}'),`;\n }\n \n msg.payload = payload.slice(0, - 1) + ';';\n}\nreturn msg;\n",
"outputs": 1,
"noerr": 0,
"initialize": "",
"finalize": "",
"libs": [],
"x": 570,
"y": 940,
"wires": [
[
"65c0779b0f924c8b"
]
]
},
{
"id": "d5504079.651a68",
"type": "comment",
"z": "d9a661f4.ef966",
"name": "Upload to SQL",
"info": "",
"x": 120,
"y": 780,
"wires": []
},
{
"id": "715380a8.42ec7",
"type": "debug",
"z": "d9a661f4.ef966",
"name": "",
"active": true,
"tosidebar": true,
"console": false,
"tostatus": false,
"complete": "true",
"targetType": "full",
"statusVal": "",
"statusType": "auto",
"x": 530,
"y": 840,
"wires": []
},
{
"id": "87fbc09b.6deeb",
"type": "debug",
"z": "d9a661f4.ef966",
"name": "Done",
"active": true,
"tosidebar": true,
"console": false,
"tostatus": true,
"complete": "true",
"targetType": "full",
"statusVal": "_parts.lines",
"statusType": "msg",
"x": 1170,
"y": 940,
"wires": []
},
{
"id": "4b31cec2.dad2c8",
"type": "ui_upload",
"z": "d9a661f4.ef966",
"group": "593aaad02ba9e607",
"title": "Generic upload to SQL",
"name": "uploadSql",
"order": 3,
"width": 6,
"height": 6,
"chunk": "256",
"transfer": "binary",
"x": 100,
"y": 940,
"wires": [
[
"e6ec0f0a.388198"
]
]
},
{
"id": "35615002.196b78",
"type": "ui_text",
"z": "d9a661f4.ef966",
"group": "6426525.5bfe1ac",
"order": 3,
"width": 0,
"height": 0,
"name": "",
"label": "Lines uploaded:",
"format": "{{msg._parts.lines}}",
"layout": "row-spread",
"x": 1200,
"y": 880,
"wires": []
},
{
"id": "65d2e9d6.5ca5",
"type": "ui_button",
"z": "d9a661f4.ef966",
"name": "",
"group": "593aaad02ba9e607",
"order": 1,
"width": 0,
"height": 0,
"passthru": false,
"label": "Prepare generic SQL table",
"tooltip": "",
"color": "",
"bgcolor": "",
"icon": "",
"payload": "",
"payloadType": "str",
"topic": "",
"topicType": "str",
"x": 160,
"y": 840,
"wires": [
[
"6b76c340a580b0a4"
]
]
},
{
"id": "6b76c340a580b0a4",
"type": "postgresql",
"z": "d9a661f4.ef966",
"name": "",
"query": "DROP TABLE IF EXISTS your_table;\nCREATE TABLE your_table (\n\tid\t\tSERIAL\tNOT NULL,\n\ttime\tTIMESTAMP WITH TIME ZONE\tNOT NULL,\n\tsensor\tTEXT,\n\tdata\tJSONB\n);\n\nSELECT create_hypertable('your_table', 'time');\nCREATE INDEX IF NOT EXISTS index_sensor ON your_table(\"sensor\");\n",
"postgreSQLConfig": "20ae1e52d1eef983",
"split": false,
"rowsPerMsg": 1,
"outputs": 1,
"x": 370,
"y": 840,
"wires": [
[
"715380a8.42ec7"
]
]
},
{
"id": "65c0779b0f924c8b",
"type": "postgresql",
"z": "d9a661f4.ef966",
"name": "",
"query": "{{{ msg.payload }}}",
"postgreSQLConfig": "20ae1e52d1eef983",
"split": false,
"rowsPerMsg": 1,
"outputs": 1,
"x": 730,
"y": 940,
"wires": [
[
"a2a16dc6.fbd32"
]
]
},
{
"id": "593aaad02ba9e607",
"type": "ui_group",
"name": "Default",
"tab": "5f17e3a3.4c85ac",
"order": 1,
"disp": true,
"width": "6",
"collapse": false
},
{
"id": "6426525.5bfe1ac",
"type": "ui_group",
"name": "Default",
"tab": "cb135bc2.cc3d9",
"order": 1,
"disp": true,
"width": "6",
"collapse": true
},
{
"id": "20ae1e52d1eef983",
"type": "postgreSQLConfig",
"name": "admin@timescale:5432/iot",
"host": "timescale",
"hostFieldType": "str",
"port": "5432",
"portFieldType": "num",
"database": "iot",
"databaseFieldType": "str",
"ssl": "false",
"sslFieldType": "bool",
"max": "10",
"maxFieldType": "num",
"min": "1",
"minFieldType": "num",
"idle": "1000",
"idleFieldType": "num",
"connectionTimeout": "10000",
"connectionTimeoutFieldType": "num",
"user": "admin",
"userFieldType": "str",
"password": "RTffY87xVqGA",
"passwordFieldType": "str"
},
{
"id": "5f17e3a3.4c85ac",
"type": "ui_tab",
"name": "Generic demo",
"icon": "dashboard",
"disabled": false,
"hidden": false
},
{
"id": "cb135bc2.cc3d9",
"type": "ui_tab",
"name": "Anomaly",
"icon": "dashboard",
"disabled": false,
"hidden": false
}
]
time sensor data
2020-08-27 14:43:21.561 sensor1 {"humidity": 60.16676,"temperature": 22.0}
2020-08-27 14:43:22.567 sensor1 {"humidity": 60.11966,"temperature": 22.1}
2020-08-27 14:43:23.575 sensor1 {"humidity": 60.15106,"temperature": 22.2}
2020-08-27 14:43:24.583 sensor1 {"humidity": 60.17304,"temperature": 22.3}
2020-08-27 14:43:25.591 sensor1 {"humidity": 60.16676,"temperature": 22.4}
2020-08-27 14:43:26.603 sensor1 {"humidity": 60.17304,"temperature": 22.5}
2020-08-27 14:43:27.607 sensor1 {"humidity": 60.17304,"temperature": 22.6}
2020-08-27 14:43:28.615 sensor1 {"humidity": 60.19188,"temperature": 22.7}
2020-08-27 14:43:29.623 sensor1 {"humidity": 60.18874,"temperature": 22.8}
2020-08-27 14:43:30.631 sensor1 {"humidity": 60.20129,"temperature": 22.9}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment