@datajoely
Created June 28, 2021 18:23
| Kedro layer | Comment |
| --- | --- |
| raw | In this situation three data sources are described: an Excel file, a multi-part CSV export from a database, and a single CSV export from a personnel management system. |
| intermediate | The intermediate layer is a typed mirror of the raw layer, with one minor transformation applied to the equipment extract: the multi-part data received has been concatenated into a single Parquet dataset. |
| primary | Two domain-level datasets have been constructed from the intermediate layer, which model equipment shutdowns and operator actions. |
| feature | Several features have been constructed from the primary layer. They represent variables we think may be predictors of equipment shutdowns, such as the maintenance schedule and recent shutdowns. |
| model_input | Two model inputs have been created because we are experimenting with two modeling approaches: one time-series based, and another equipment-centric without a temporal element. |
| models | The trained models have been serialised as pickle files for safekeeping. |
| model_output | The two modeling approaches output recommendations and scored results in different formats, which are consumed downstream, outside of our pipeline. |
| reference | A serial number lookup table is held in the reference layer because it is maintained manually by the maintenance teams while the organisation is in the midst of migrating to a new inventory system that will eventually expose an API endpoint. |
| reporting | Connecting the various data sources for this ML use case has made it possible to provide a helicopter view of the maintenance activities that was not available before. It is provided as an Excel spreadsheet which the shift supervisor can review at their convenience. |
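
The layer assignments described above can be sketched as a simple mapping from dataset to layer. This is a minimal illustration in plain Python, not Kedro's own API: the layer names come from the table, but every dataset name below is a hypothetical placeholder, since the source describes the datasets without naming them.

```python
# Layer names exactly as listed in the table, in the order the table presents them.
LAYERS = [
    "raw",
    "intermediate",
    "primary",
    "feature",
    "model_input",
    "models",
    "model_output",
    "reference",
    "reporting",
]

# Hypothetical dataset names (assumptions for illustration only),
# each assigned to the layer the table describes it in.
DATASET_LAYERS = {
    "excel_source": "raw",
    "equipment_extract_parts": "raw",
    "personnel_export": "raw",
    "int_excel_source": "intermediate",
    "int_equipment_extract": "intermediate",
    "int_personnel_export": "intermediate",
    "equipment_shutdowns": "primary",
    "operator_actions": "primary",
    "maintenance_schedule_features": "feature",
    "recent_shutdown_features": "feature",
    "time_series_model_input": "model_input",
    "equipment_model_input": "model_input",
    "time_series_model": "models",
    "equipment_model": "models",
    "recommendations": "model_output",
    "scored_results": "model_output",
    "serial_number_lookup": "reference",
    "maintenance_overview": "reporting",
}


def group_by_layer(dataset_layers):
    """Group dataset names by layer, rejecting layers not in LAYERS."""
    unknown = sorted(set(dataset_layers.values()) - set(LAYERS))
    if unknown:
        raise ValueError(f"Unknown layer(s): {unknown}")
    grouped = {layer: [] for layer in LAYERS}
    for name, layer in dataset_layers.items():
        grouped[layer].append(name)
    return grouped


if __name__ == "__main__":
    for layer, names in group_by_layer(DATASET_LAYERS).items():
        print(f"{layer}: {', '.join(names)}")
```

Keeping the mapping in one place makes it easy to check that every dataset belongs to a known layer before it is wired into a pipeline. A real project would more likely record the layer alongside each dataset's catalog entry.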