Skip to content

Instantly share code, notes, and snippets.

@iperdomo
Last active June 15, 2020 06:06
Show Gist options
  • Save iperdomo/06725216071d995e3b4ad2c63c98571b to your computer and use it in GitHub Desktop.
Save iperdomo/06725216071d995e3b4ad2c63c98571b to your computer and use it in GitHub Desktop.

RQG as JSON blob

The current implementation (behind feature flag rqg=true) imports all the iterations of a group in a single JSON blob.

While this was a good experiment it leads to some complexities, e.g. question type handling.

Right now there is code that understand the value behind a question type.

Consider this 2 imports:

RQG without feature flag

\d ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7
Table “public.ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7”
ColumnTypeCollationNullableDefault
rnumintegernot nullnextval(‘ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7_rnum_seq’::regclass)
identifiertext
instance_idtextnot null
display_nametext
submittertext
submitted_attimestamp with time zone
surveyal_timedouble precision
device_idtext
c583119147text
c594979148double precision
c609479145text
c617319146text
c601879162text
c601889144geometry(Point,4326)
c601899165text
c601879163text
c601899166timestamp with time zone
c599649166text
c615289146geometry(Geometry,4326)
c615289146_0text
c597909145text
c615289147text
c617309149text
c588869155geometry(Point,4326)
Indexes:
“ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7_pkey” PRIMARY KEY, btree (rnum)
“ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7_instance_id_key” UNIQUE CONSTRAINT, btree (instance_id)
“ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7_c588869155_idx” gist (c588869155)
“ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7_c601889144_idx” gist (c601889144)
“ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7_c615289146_idx” gist (c615289146)

And some of the values

select * from ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7 limit 5;
rnumidentifierinstance_iddisplay_namesubmittersubmitted_atsurveyal_timedevice_idc583119147c594979148c609479145c617319146c601879162c601889144c601899165c601879163c601899166c599649166c615289146c615289146_0c597909145c615289147c617309149c588869155
1y8r0-sj5w-fnga579079115GombitovaJana bluestacks2020-06-11 15:15:34+00148jana bluestacksJana301:yesyes20101000020E610000078EE3D5C72DC13405436ACA92C644C40https://akvoflow-uat1.s3.amazonaws.com/images/5e4305e9-57a9-46e8-b045-e0f37b2ff7c5.jpghttps://akvoflow-uat1.s3.amazonaws.com/images/90d5af97-a52a-4363-849f-c45f969be158.mp42020-06-11 15:10:56.09+0012345678900103000020E6100000010000000D000000E646EC6EF9BB2640F3AD0FEB8DFE3F401A23C8F77B6C2340AB4203B16C343C403A32A0281A292D40CF0E5DF525BE374013E3EB21D0AB21409D1CF45B2AD936408F52094FE815FA3F6B0F7BA1802F3A4065DEAAEB504DBFBFDFFAB0DEA8044140AB49A6F9197510C0E148FB7A1915444098D41A947AC0F9BFC29B7F057A134840B350210D5D2F0E4051A1BAB9F8D24A40DF8C9AAF92732240FE7B952133814A4094F029A50C812040227E0FA8486C46406E4617E5E30C18407C702D4954AE4340E646EC6EF9BB2640F3AD0FEB8DFE3F40{“pointCount”:”12”,”length”:”7756726.5”,”area”:”2.7898780828596826E12”}Gombitova0101000020E6100000274EEE7728CA13404EB9C2BB5C244C40
2v1ex-928r-bh9g587119117bla ble blu bliJana bluestacks2020-06-11 15:20:40+00135jana bluestacksbla bla251:yesno10101000020E610000000000000000018400000000000004E40https://akvoflow-uat1.s3.amazonaws.com/images/af5a5426-94f0-4547-8409-9e648c42278a.jpghttps://akvoflow-uat1.s3.amazonaws.com/images/98eb72ad-3084-4062-bc8e-c0dadbd42230.mp42020-06-03 15:15:42.535+00212121212121210103000020E610000001000000080000007080F4A8530122404497811486F64C40BB438A0112E13240C6B656C565D94D405A958E835C353C40E7A15B5430DB4E400A7CEAFDEBDD3C409412279C82954C4054B8D4F7D0A03A4062A1D634EFAB4A40C4DD6AE7455B304011FBA99619714840128F1F85463B27402F185C73476F4A407080F4A8530122404497811486F64C40{“pointCount”:”7”,”length”:”3542964.2”,”area”:”1.0478411372043157E12”}bla ble blu bli0101000020E610000000000000000018400000000000004E40
3vbqk-k700-7gya597079115One Repetitionvaleria_acer2020-06-09 15:10:54+00213valeria_acerPete22:nono20101000020E6100000000000000000F03F000000000000F03Fhttps://akvoflow-uat1.s3.amazonaws.com/images/00a2a5e7-03f4-493c-9711-202b72569be9.jpghttps://akvoflow-uat1.s3.amazonaws.com/images/26691c4c-c7b7-48f0-b8c6-415fa9d351ef.mp42014-06-09 15:09:28.062+0012345670103000020E610000001000000070000009C1AC3F759643F4048F0D0C144FB4840521B8B5C26653F40D594641D8EF64840A4B6C31506713F407C58B96125F84840D305065ACB563F4001B8FE02E6F04840287513C660633F40C9C452C9A5EB48403B0D62B1974F3F4017EAFA4F48EA48409C1AC3F759643F4048F0D0C144FB4840{“pointCount”:”6”,”length”:”44719.12”,”area”:”2.7101229755859375E7”}One Repetition0101000020E6100000000000000000F03F000000000000F03F
4ryvu-0gxm-j69b601909115Poppinsvaleria_acer2020-06-09 14:42:45+00719valeria_acerMary361:yesyesno10101000020E6100000D090A88D3E5105400019A453C6CE4440https://akvoflow-uat1.s3.amazonaws.com/images/825057a7-42c8-490f-906b-8215e7666af1.jpghttps://akvoflow-uat1.s3.amazonaws.com/images/3978e56b-0b91-4d80-bd55-f68a49144214.mp41986-06-09 14:06:48.048+00123456780103000020E6100000010000000600000004C1882249352840AB77B81D1A6D4540CE4B6A0D4A3528402F00E88B186D4540552E54FEB5342840982ABDDB616B454005E33B8CA42C2840B000A60C1C6C454083520F2CA23D2840D4957439CA6D454004C1882249352840AB77B81D1A6D4540{“pointCount”:”5”,”length”:”7469.8857”,”area”:”827588.20703125”}{“app”:{“appVersion”:”1.0”,”language”:”en”},”device”:{“country”:”US”,”language”:”en”,”manufacturer”:”Acer”,”model”:”A1-840FHD”,”os”:”Android - 4.4.2 (19)”,”product”:”a1840fhd_ww_gen1”},”name”:”Water - Chlorine, pH”,”result”:[{“value”:”3.0”,”name”:”pH”,”id”:1},{“value”:”3.0”,”name”:”Chlorine”,”unit”:”mg/l”,”id”:2}],”testDate”:”2020-06-09 16:34”,”type”:”caddisfly”,”user”:{“language”:”en”,”isBackDropDetection”:true},”uuid”:”520ba67c-233f-4dc7-a2ad-17d86047d7c4”}Poppins0101000020E61000004943A6E1CA50054079B8C312C0CE4440

You’ll find that we detect the question type and translate that to a proper column type, e.g. timestamp with time zone, geometry, double precision, etc

select columns from dataset_version where table_name = 'ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7';
columns
[{“key”: false, “sort”: null, “type”: “text”, “title”: “Identifier”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “identifier”, “multipleId”: null, “multipleType”: null}, {“key”: true, “sort”: null, “type”: “text”, “title”: “Instance id”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “instance_id”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Display name”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “display_name”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Submitter”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “submitter”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “date”, “title”: “Submitted at”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “submitted_at”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “number”, “title”: “Surveyal time”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “surveyal_time”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Device Id”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “device_id”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Name”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c583119147”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “number”, “title”: “Age”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c594979148”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Likes Pizza”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c609479145”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Likes pasta”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c617319146”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Region”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c601879162”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “geopoint”, “title”: “Current location”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c601889144”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Photo”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c601899165”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Video”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c601879163”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “date”, “title”: “DOB”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c601899166”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Barcode”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c599649166”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “geoshape”, “title”: “Shape”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c615289146”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “multiple”, “title”: “Shape Features”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c615289146_0”, “multipleId”: “615289146”, “multipleType”: “geo-shape-features”}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Signature”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c597909145”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “multiple”, “title”: “Caddisfly”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c615289147”, “multipleId”: “520ba67c-233f-4dc7-a2ad-17d86047d7c4”, “multipleType”: “caddisfly”}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Family name”, “hidden”: false, “groupId”: “617319144”, “metadata”: null, “direction”: null, “groupName”: “Non repeatable”, “columnName”: “c617309149”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “geopoint”, “title”: “Location”, “hidden”: false, “groupId”: “617319144”, “metadata”: null, “direction”: null, “groupName”: “Non repeatable”, “columnName”: “c588869155”, “multipleId”: null, “multipleType”: null}]
[
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "identifier",
        "groupName": null,
        "direction": null,
        "metadata": null,
        "groupId": null,
        "hidden": false,
        "title": "Identifier",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "instance_id",
        "groupName": null,
        "direction": null,
        "metadata": null,
        "groupId": null,
        "hidden": false,
        "title": "Instance id",
        "type": "text",
        "sort": null,
        "key": true
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "display_name",
        "groupName": null,
        "direction": null,
        "metadata": null,
        "groupId": null,
        "hidden": false,
        "title": "Display name",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "submitter",
        "groupName": null,
        "direction": null,
        "metadata": null,
        "groupId": null,
        "hidden": false,
        "title": "Submitter",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "submitted_at",
        "groupName": null,
        "direction": null,
        "metadata": null,
        "groupId": null,
        "hidden": false,
        "title": "Submitted at",
        "type": "date",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "surveyal_time",
        "groupName": null,
        "direction": null,
        "metadata": null,
        "groupId": null,
        "hidden": false,
        "title": "Surveyal time",
        "type": "number",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "device_id",
        "groupName": null,
        "direction": null,
        "metadata": null,
        "groupId": null,
        "hidden": false,
        "title": "Device Id",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c583119147",
        "groupName": "Repeated",
        "direction": null,
        "metadata": null,
        "groupId": "597899156",
        "hidden": false,
        "title": "Name",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c594979148",
        "groupName": "Repeated",
        "direction": null,
        "metadata": null,
        "groupId": "597899156",
        "hidden": false,
        "title": "Age",
        "type": "number",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c609479145",
        "groupName": "Repeated",
        "direction": null,
        "metadata": null,
        "groupId": "597899156",
        "hidden": false,
        "title": "Likes Pizza",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c617319146",
        "groupName": "Repeated",
        "direction": null,
        "metadata": null,
        "groupId": "597899156",
        "hidden": false,
        "title": "Likes pasta",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c601879162",
        "groupName": "Repeated",
        "direction": null,
        "metadata": null,
        "groupId": "597899156",
        "hidden": false,
        "title": "Region",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c601889144",
        "groupName": "Repeated",
        "direction": null,
        "metadata": null,
        "groupId": "597899156",
        "hidden": false,
        "title": "Current location",
        "type": "geopoint",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c601899165",
        "groupName": "Repeated",
        "direction": null,
        "metadata": null,
        "groupId": "597899156",
        "hidden": false,
        "title": "Photo",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c601879163",
        "groupName": "Repeated",
        "direction": null,
        "metadata": null,
        "groupId": "597899156",
        "hidden": false,
        "title": "Video",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c601899166",
        "groupName": "Repeated",
        "direction": null,
        "metadata": null,
        "groupId": "597899156",
        "hidden": false,
        "title": "DOB",
        "type": "date",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c599649166",
        "groupName": "Repeated",
        "direction": null,
        "metadata": null,
        "groupId": "597899156",
        "hidden": false,
        "title": "Barcode",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c615289146",
        "groupName": "Repeated",
        "direction": null,
        "metadata": null,
        "groupId": "597899156",
        "hidden": false,
        "title": "Shape",
        "type": "geoshape",
        "sort": null,
        "key": false
    },
    {
        "multipleType": "geo-shape-features",
        "multipleId": "615289146",
        "columnName": "c615289146_0",
        "groupName": "Repeated",
        "direction": null,
        "metadata": null,
        "groupId": "597899156",
        "hidden": false,
        "title": "Shape Features",
        "type": "multiple",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c597909145",
        "groupName": "Repeated",
        "direction": null,
        "metadata": null,
        "groupId": "597899156",
        "hidden": false,
        "title": "Signature",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": "caddisfly",
        "multipleId": "520ba67c-233f-4dc7-a2ad-17d86047d7c4",
        "columnName": "c615289147",
        "groupName": "Repeated",
        "direction": null,
        "metadata": null,
        "groupId": "597899156",
        "hidden": false,
        "title": "Caddisfly",
        "type": "multiple",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c617309149",
        "groupName": "Non repeatable",
        "direction": null,
        "metadata": null,
        "groupId": "617319144",
        "hidden": false,
        "title": "Family name",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c588869155",
        "groupName": "Non repeatable",
        "direction": null,
        "metadata": null,
        "groupId": "617319144",
        "hidden": false,
        "title": "Location",
        "type": "geopoint",
        "sort": null,
        "key": false
    }
]

RQG with flag

The same dataset this time with the new import code

\d ds_92712882_698d_4fe6_ab20_98c5b9c253b3
Table “public.ds_92712882_698d_4fe6_ab20_98c5b9c253b3”
ColumnTypeCollationNullableDefault
rnumintegernot nullnextval(‘ds_92712882_698d_4fe6_ab20_98c5b9c253b3_rnum_seq’::regclass)
identifiertext
instance_idtextnot null
display_nametext
submittertext
submitted_attimestamp with time zone
surveyal_timedouble precision
device_idtext
c597899156text
c617309149text
c588869155geometry(Point,4326)
Indexes:
“ds_92712882_698d_4fe6_ab20_98c5b9c253b3_pkey” PRIMARY KEY, btree (rnum)
“ds_92712882_698d_4fe6_ab20_98c5b9c253b3_instance_id_key” UNIQUE CONSTRAINT, btree (instance_id)
“ds_92712882_698d_4fe6_ab20_98c5b9c253b3_c588869155_idx” gist (c588869155)

We have fewer columns because there is one RGQ that was transformed from several columns into a single one.

select columns from dataset_version where table_name = 'ds_92712882_698d_4fe6_ab20_98c5b9c253b3'
columns
[{“key”: false, “sort”: null, “type”: “text”, “title”: “Identifier”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “identifier”, “multipleId”: null, “multipleType”: null}, {“key”: true, “sort”: null, “type”: “text”, “title”: “Instance id”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “instance_id”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Display name”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “display_name”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Submitter”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “submitter”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “date”, “title”: “Submitted at”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “submitted_at”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “number”, “title”: “Surveyal time”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “surveyal_time”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Device Id”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “device_id”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “rqg”, “title”: “Repeated”, “hidden”: false, “groupId”: “597899156”, “metadata”: {“columns”: [{“id”: “c583119147”, “type”: “text”, “title”: “Name”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c594979148”, “type”: “number”, “title”: “Age”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c609479145”, “type”: “text”, “title”: “Likes Pizza”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c617319146”, “type”: “text”, “title”: “Likes pasta”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c601879162”, “type”: “text”, “title”: “Region”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c601889144”, “type”: “geopoint”, “title”: “Current location”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c601899165”, “type”: “text”, “title”: “Photo”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c601879163”, “type”: “text”, “title”: “Video”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c601899166”, “type”: “date”, “title”: “DOB”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c599649166”, “type”: “text”, “title”: “Barcode”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c615289146”, “type”: “geoshape”, “title”: “Shape”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c597909145”, “type”: “text”, “title”: “Signature”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c615289147”, “type”: “multiple”, “title”: “Caddisfly”, “groupId”: null, “metadata”: null, “groupName”: null, “multipleId”: “520ba67c-233f-4dc7-a2ad-17d86047d7c4”, “multipleType”: “caddisfly”}]}, “direction”: null, “groupName”: “Repeated”, “columnName”: “c597899156”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Family name”, “hidden”: false, “groupId”: “617319144”, “metadata”: null, “direction”: null, “groupName”: “Non repeatable”, “columnName”: “c617309149”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “geopoint”, “title”: “Location”, “hidden”: false, “groupId”: “617319144”, “metadata”: null, “direction”: null, “groupName”: “Non repeatable”, “columnName”: “c588869155”, “multipleId”: null, “multipleType”: null}]
[
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "identifier",
        "groupName": null,
        "direction": null,
        "metadata": null,
        "groupId": null,
        "hidden": false,
        "title": "Identifier",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "instance_id",
        "groupName": null,
        "direction": null,
        "metadata": null,
        "groupId": null,
        "hidden": false,
        "title": "Instance id",
        "type": "text",
        "sort": null,
        "key": true
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "display_name",
        "groupName": null,
        "direction": null,
        "metadata": null,
        "groupId": null,
        "hidden": false,
        "title": "Display name",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "submitter",
        "groupName": null,
        "direction": null,
        "metadata": null,
        "groupId": null,
        "hidden": false,
        "title": "Submitter",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "submitted_at",
        "groupName": null,
        "direction": null,
        "metadata": null,
        "groupId": null,
        "hidden": false,
        "title": "Submitted at",
        "type": "date",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "surveyal_time",
        "groupName": null,
        "direction": null,
        "metadata": null,
        "groupId": null,
        "hidden": false,
        "title": "Surveyal time",
        "type": "number",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "device_id",
        "groupName": null,
        "direction": null,
        "metadata": null,
        "groupId": null,
        "hidden": false,
        "title": "Device Id",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c597899156",
        "groupName": "Repeated",
        "direction": null,
        "metadata": {
            "columns": [
                {
                    "groupName": null,
                    "metadata": null,
                    "groupId": null,
                    "title": "Name",
                    "type": "text",
                    "id": "c583119147"
                },
                {
                    "groupName": null,
                    "metadata": null,
                    "groupId": null,
                    "title": "Age",
                    "type": "number",
                    "id": "c594979148"
                },
                {
                    "groupName": null,
                    "metadata": null,
                    "groupId": null,
                    "title": "Likes Pizza",
                    "type": "text",
                    "id": "c609479145"
                },
                {
                    "groupName": null,
                    "metadata": null,
                    "groupId": null,
                    "title": "Likes pasta",
                    "type": "text",
                    "id": "c617319146"
                },
                {
                    "groupName": null,
                    "metadata": null,
                    "groupId": null,
                    "title": "Region",
                    "type": "text",
                    "id": "c601879162"
                },
                {
                    "groupName": null,
                    "metadata": null,
                    "groupId": null,
                    "title": "Current location",
                    "type": "geopoint",
                    "id": "c601889144"
                },
                {
                    "groupName": null,
                    "metadata": null,
                    "groupId": null,
                    "title": "Photo",
                    "type": "text",
                    "id": "c601899165"
                },
                {
                    "groupName": null,
                    "metadata": null,
                    "groupId": null,
                    "title": "Video",
                    "type": "text",
                    "id": "c601879163"
                },
                {
                    "groupName": null,
                    "metadata": null,
                    "groupId": null,
                    "title": "DOB",
                    "type": "date",
                    "id": "c601899166"
                },
                {
                    "groupName": null,
                    "metadata": null,
                    "groupId": null,
                    "title": "Barcode",
                    "type": "text",
                    "id": "c599649166"
                },
                {
                    "groupName": null,
                    "metadata": null,
                    "groupId": null,
                    "title": "Shape",
                    "type": "geoshape",
                    "id": "c615289146"
                },
                {
                    "groupName": null,
                    "metadata": null,
                    "groupId": null,
                    "title": "Signature",
                    "type": "text",
                    "id": "c597909145"
                },
                {
                    "multipleType": "caddisfly",
                    "multipleId": "520ba67c-233f-4dc7-a2ad-17d86047d7c4",
                    "groupName": null,
                    "metadata": null,
                    "groupId": null,
                    "title": "Caddisfly",
                    "type": "multiple",
                    "id": "c615289147"
                }
            ]
        },
        "groupId": "597899156",
        "hidden": false,
        "title": "Repeated",
        "type": "rqg",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c617309149",
        "groupName": "Non repeatable",
        "direction": null,
        "metadata": null,
        "groupId": "617319144",
        "hidden": false,
        "title": "Family name",
        "type": "text",
        "sort": null,
        "key": false
    },
    {
        "multipleType": null,
        "multipleId": null,
        "columnName": "c588869155",
        "groupName": "Non repeatable",
        "direction": null,
        "metadata": null,
        "groupId": "617319144",
        "hidden": false,
        "title": "Location",
        "type": "geopoint",
        "sort": null,
        "key": false
    }
]

In the case of the RQG the definition of the columns gets embedded in the metadata key of the column holding all the group.

RQG as dataset

We want to reuse the existing infrastructure to deal with question types, transformation, versioning, etc.

It all starts with a data_source that defines the the source of data for a particular dataset.

A job_execution keeps track of long running processes (import, transformation) and links the data_source with the resulting dataset and dataset_version.

https://user-images.githubusercontent.com/178474/84621913-0ea5b380-aedc-11ea-89aa-7c52916a4092.png

At the end the data is stored in a ds_* table with the raw data in a import_* table, e.g.

select dataset_id, version, table_name, imported_table_name from dataset_version limit 1
dataset_idversiontable_nameimported_table_name
5ee6f025-e96b-4530-8d44-7f8edbd89b6c1ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7imported_e51ef10b_22e4_425f_a809_4b15f3234054

The table imported_e51ef10b_22e4_425f_a809_4b15f3234054 is always used as starting point for running transformations, and producing a fresh data table (e.g. ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7)

New data_source (a.k.a Follow current model strictly)

The current hierarchy is

data_source -> job_execution -> dataset -> dataset_version

In this model we have one of each (after import). One way of following the same mental model is to define a new type of data_source and this spec definition we store the required information of parent/child relationship.

Current spec for FLOW source

select spec from data_source limit 1
spec
{“name”: “RQG - 1”, “source”: {“kind”: “AKVO_FLOW”, “email”: “***@akvo.org”, “formId”: “601879159”, “version”: 3, “instance”: “uat1”, “surveyId”: “615289140”}}
{
    "source": {
        "surveyId": "615289140",
        "instance": "uat1",
        "version": 3,
        "formId": "601879159",
        "email": "***@akvo.org",
        "kind": "AKVO_FLOW"
    },
    "name": "RQG - 1"
}

The spec of a dataset for a RQG will use a different kind and some extra properties in source

{
    "source": {
        "surveyId": "615289140",
        "instance": "uat1",
        "version": 3,
        "formId": "601879159",
        "groupId": "597899156",   // <-- RQG id
        "email": "***@akvo.org",
        "kind": "AKVO_FLOW_RQG"   // <-- new kind
    },
    "name": "Repeated"            // <- RQG name
}

With this approach we don’t need to touch too much the current data model. There are some open questions that need some thinking and decision.

Open questions

  • In order to filter datasets for library view, it will be easier to just have a new flag in dataset that help us filter out RQG datasets. e.g. visible=false Let’s remember that RQG as dataset is an implementation detail.
  • An import job_execution produces one dataset and one dataset_version. What happens now? Do we want a different job_execution to produce the dataset for a RQG? Or change the behavior: One job_execution produces 1 + n datasets and dataset_versions (where n is the number of RQG).
    • The same applies for already imported datasets on Update and Transformation processes.

One dataset multiple dataset_version

We don’t introduce a new data_source. When importing we detect the RQG and create one dataset and dataset_version for each RQG group.

The difference between this and the previous is:

  • Don’t introduce a new data_source
  • A dataset can have multiple dataset_version with table_name pointing to the different parts of the dataset (main + RQGs)
select version, table_name, imported_table_name from dataset_version where dataset_id = '5ee6f658-5ca1-4b2a-bc4f-1b13cb095147' order by created desc;
versiontable_nameimported_table_name
2ds_92712882_698d_4fe6_ab20_98c5b9c253b3imported_8597e45a_fbe7_42d7_bf6d_3f003bf97cc7
1imported_8597e45a_fbe7_42d7_bf6d_3f003bf97cc7
  • The relationship parent/child is based on dataset -> dataset_version. We need a way to know that a dataset_version is a RQG so we can fetch the right data when requested.
  • We need to decide if we want one job_execution for all dataset_version or one for each
  • Although this is an implementation detail it feels weird that the distinction between a RQG and main dataset happens at the dataset_version level

Multiple dataset multiple dataset_version

  • This is similar to the previous but we don’t break the current assumption that at one point in time a dataset_version only has one data table.
  • We need to introduce a flag at dataset level to define it as RQG
  • We need to introduce a parent/child relationship between dataset. This is to know to which main dataset a particular RQG belongs to.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment