@OriHoch
Last active March 4, 2018 14:05
One-to-many resources: split a single input resource into several output resources.
build_positions:
  pipeline:
    - run: load_resource
      parameters:
        url: data/datapackage.json
        resource: input_resource
    - run: split_resource
    - run: dump.to_path
      parameters:
        out-path: data/splitted_resource
from datapackage_pipelines.wrapper import ingest, spew
from datapackage_pipelines.utilities.resources import PROP_STREAMING
import logging

# Unpack the processor inputs and start with an empty stats dict
parameters, datapackage, resources, stats = ingest() + ({},)

output_resource_a = []
output_resource_b = []
output_resource_c = []

# Read the first (only) input resource and route each row by its "field" value
for i, row in enumerate(next(resources)):
    if row["field"] == "a":
        output_resource_a.append(row)
    elif row["field"] == "b":
        output_resource_b.append(row)
    else:
        # Any other row is replaced by a placeholder record holding its index
        output_resource_c.append({"foo": "FOO", "bar": i})

output_resources = [output_resource_a, output_resource_b, output_resource_c]

# Resources a and b reuse the input descriptor (and schema) with a new name and path
resource_a_descriptor = dict(datapackage["resources"][0], name="a", path="a.csv")
resource_b_descriptor = dict(datapackage["resources"][0], name="b", path="b.csv")

# Resource c has a different row shape, so it gets its own descriptor and schema
resource_c_descriptor = {PROP_STREAMING: True, "name": "c", "path": "c.csv",
                         "schema": {"fields": [{"name": "foo", "type": "string"},
                                               {"name": "bar", "type": "integer"}]}}

output_datapackage = dict(datapackage, resources=[resource_a_descriptor,
                                                  resource_b_descriptor,
                                                  resource_c_descriptor])

# Emit the updated datapackage, the three row lists, and the stats
spew(output_datapackage, output_resources, stats)
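
The routing logic in the processor can be tried out without installing datapackage_pipelines. The following is a minimal sketch only: `split_rows` and the `sample` rows are hypothetical names introduced for illustration and are not part of the gist or of the library. It mirrors the loop above on plain Python dicts.

```python
def split_rows(rows):
    """Route each row into one of three lists, mirroring the processor's loop."""
    rows_a, rows_b, rows_c = [], [], []
    for i, row in enumerate(rows):
        if row["field"] == "a":
            rows_a.append(row)
        elif row["field"] == "b":
            rows_b.append(row)
        else:
            # Rows matching neither value become a placeholder record
            # that keeps the index of the original row.
            rows_c.append({"foo": "FOO", "bar": i})
    return rows_a, rows_b, rows_c


# Hypothetical sample input; real rows would come from the loaded resource.
sample = [{"field": "a"}, {"field": "x"}, {"field": "b"}]
rows_a, rows_b, rows_c = split_rows(sample)
print(rows_a, rows_b, rows_c)
```

Like the processor itself, this sketch collects every row in memory before returning, which is fine for small resources but may not suit large ones.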