Short description of what the project is for.
Stakeholders: for external ones, list departments; for internal ones, list the projects or people who depend on it. (Used to evaluate how critical a production outage would be and whom to inform about problems.)
Input Data (if applicable):
- name, hdfs/external path;
- ...
Output Data (if applicable):
- hdfs path;
- someDB
If it is a data pipeline, describe its main stages as a diagram or as text, for example:
- Drop previously created tables in Impala.
- Ingest data from Oracle <name_db> to the Hive metastore as PARQUET and store it as table_name.
- Associate it with external Impala tables with corresponding names.
- Write a SUCCESS_ file to <flag_path>.
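The stages described above could be driven by a small runner that writes the flag file only after every stage has finished. The following is a minimal sketch under stated assumptions, not the project's actual code: `run_pipeline`, the stage names, and the no-op `placeholder` stage are hypothetical, and a real project would call Impala, Oracle, and Hive where the placeholders are.

```python
from pathlib import Path
from typing import Callable, Sequence, Tuple

# A stage is a (name, callable) pair; the callable raises an exception on failure.
Stage = Tuple[str, Callable[[], None]]


def run_pipeline(stages: Sequence[Stage], flag_dir: Path) -> Path:
    """Run stages in order and write the SUCCESS_ flag only if all of them finish."""
    flag = flag_dir / "SUCCESS_"
    # Remove a stale flag first so a failed rerun is not mistaken for a success.
    if flag.exists():
        flag.unlink()
    for name, stage in stages:
        print(f"running stage: {name}")
        stage()  # an exception here aborts the run before the flag is written
    flag_dir.mkdir(parents=True, exist_ok=True)
    flag.touch()
    return flag


if __name__ == "__main__":
    import tempfile

    def placeholder() -> None:
        """Hypothetical stand-in for a real stage."""

    stages = [
        ("drop_impala_tables", placeholder),
        ("ingest_from_oracle", placeholder),
        ("link_external_tables", placeholder),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        flag = run_pipeline(stages, Path(tmp))
        print(f"flag written: {flag.exists()}")
```

Deleting the old flag before the first stage runs means downstream consumers never read a flag left over from a previous run.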
Scheduled to run: daily at 22:00 / once a week / every 5 days / etc.
- logs are sent to Kibana <link_to_kibana_index>
- failure alerts are sent to <email_address>
- project owner: name here
- substitute person: name here
- people with the most domain knowledge: names
- Java 8, Python 3.x on $PATH
- sbt/mvn version etc
- system variables: list all system variables introduced by the project, if applicable
- something else
- link to the requirements file (for Python projects)
- conda env, if applicable.
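A requirements list like the one above can be checked automatically before development starts. The sketch below is one possible approach, assuming the tools are discoverable on `PATH`; the function name `missing_prerequisites` and the variable `PROJECT_HOME` are hypothetical and stand in for whatever the project actually introduces. It does not verify tool versions.

```python
import os
import shutil
from typing import Iterable, List


def missing_prerequisites(tools: Iterable[str], env_vars: Iterable[str]) -> List[str]:
    """Return a human-readable list of missing tools and unset variables."""
    problems = []
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    for var in env_vars:
        # Treat an empty value the same as an unset variable.
        if not os.environ.get(var):
            problems.append(f"environment variable {var} is not set")
    return problems


if __name__ == "__main__":
    # PROJECT_HOME is a hypothetical example of a project-specific variable.
    for problem in missing_prerequisites(["java", "python3", "sbt"], ["PROJECT_HOME"]):
        print(problem)
```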
--step-by-step guide on how to prepare for development
cd your_path/to_projects_dir
git clone <project link>
- instructions for installing dependencies, if applicable, or the requirements file for Python projects
- commands to build and run unit tests, depending on the project's build system
sbt clean compile test
git checkout -b feature/<new_branch_name>
- develop your thing, with local commits
- when done, create a PR against the master branch
--how to run unit tests if applicable, otherwise state that there are no unit tests
--step-by-step guide on how to run integration tests, or state that there are no integration tests
--step-by-step instructions, or state that no functional tests are available
--step-by-step guide to deploying to different environments
- deploy with IDEA on <server_address> under <user_name>
- ssh to <server_name>
- cd <path_to_project>
- run in the CLI:
command_to_perform
or run the script.
Flags for a successful run:
- creates new directory <path_to_dir>
- success files appear at <path_to_success_flag_file>
- on <http_link> something will appear
- the query
select * from <table_name> limit 1;
will return a result
- something else
Else:
- review logs <path_to_logs>
- rerun in debug mode:
command --debug
- something else
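The file-based flags listed above (a new output directory and a success flag file) can be verified with a short script before falling back to the logs. This is a hedged sketch: `run_problems` is a hypothetical helper, and the paths passed to it are stand-ins for <path_to_dir> and <path_to_success_flag_file>.

```python
from pathlib import Path
from typing import List


def run_problems(output_dir: Path, flag_file: Path) -> List[str]:
    """Return the list of missing success indicators; empty means the run looks good."""
    problems = []
    if not output_dir.is_dir():
        problems.append(f"output directory missing: {output_dir}")
    if not flag_file.is_file():
        problems.append(f"success flag missing: {flag_file}")
    return problems


if __name__ == "__main__":
    import tempfile

    # Temporary paths stand in for the real <path_to_dir> and flag file.
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "output"
        flag = out / "SUCCESS_"
        print(run_problems(out, flag))  # both indicators are missing at this point
        out.mkdir()
        flag.touch()
        print(run_problems(out, flag))  # empty list once both exist
```

An empty list means both indicators are present; any entry points to the next step in the list above (review the logs, rerun in debug mode).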
- create PR, choose reviewers
- after approval, merge into master
- follow the Jenkins <project_name> build progress
Flags for a successful run:
- creates new directory <path_to_dir>
- success files appear at <path_to_success_flag_file>
- on <http_link> something will appear
- something else
Else:
- description of the alert system (failure emails are sent to that address)
- logs for production are kept
- work scheduler is
- something else
- Error scenarios: describe all known and possible error scenarios and their resolution
- error or misbehavior that has been observed (detailed description with error messages)
- solution (step-by-step)
- Link to page with credentials;
- Path to all logs and monitoring available, what to look for;
- Link to postmortems;
- Links to other useful things;
- Anything that is not listed in the readme but is essential to add.