@nulconaux
Created July 7, 2022 07:12
DevOps in Data Science
import os


# Example of secrets as environment variables
def access_secrets_env():
    # Returns None when the variable is not set.
    secrets = os.environ.get('secret_key', None)
    return secrets
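
Returning None for a missing variable defers the failure to whatever code uses the secret later. A minimal sketch of a stricter variant, assuming the caller would rather fail immediately; the name `require_secret_env` is hypothetical and not part of the gist:

```python
import os


def require_secret_env(name='secret_key'):
    """Return the environment variable `name`, or raise if it is not set."""
    value = os.environ.get(name)
    if value is None:
        # Fail at the lookup so the error points at the missing configuration.
        raise RuntimeError(f"environment variable {name!r} is not set")
    return value
```

Whether to raise or to return None is a design choice; raising tends to suit batch jobs where a missing secret makes the whole run pointless.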
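# Example of secrets from AWS Secrets Manager using the "default" profile.
# In reality, developers typically use specific profiles for specific projects.
# For RBAC, the profile has to be "default"
import boto3


def access_secrets_aws(region_name=None):
    # region_name=None lets boto3 fall back to the region configured for the profile.
    session = boto3.session.Session()
    client = session.client(service_name='secretsmanager',
                            region_name=region_name)
    # The response is a dict; the secret itself is under 'SecretString' or 'SecretBinary'.
    secrets = client.get_secret_value(SecretId='secret_key')
    return secrets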
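
The value returned by `get_secret_value` is a response dictionary rather than the secret itself: text secrets arrive under `SecretString` and binary ones under `SecretBinary`. A minimal sketch of unpacking it, assuming text secrets are often stored as a JSON object of key/value pairs; the helper name `parse_secret_response` is hypothetical:

```python
import json


def parse_secret_response(response):
    """Extract the secret from a get_secret_value response dict."""
    if 'SecretString' in response:
        text = response['SecretString']
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            return text  # plain-text secret, not JSON
        # Only treat the secret as structured when it is a JSON object.
        return parsed if isinstance(parsed, dict) else text
    # Binary secrets are returned unchanged.
    return response['SecretBinary']
```

Because the helper only takes a dictionary, it can be exercised without AWS credentials, for example with `parse_secret_response({'SecretString': '{"user": "a"}'})`.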
# A "typical" stubbed data science workflow.
def extract_data():
    """
    Extract data from the source and make it available for downstream steps.
    """
    pass


def transform_data():
    """
    Scale the columns, impute missing values, encode the data as required.
    """
    pass


def feature_engineering():
    """
    Transform the raw data into features that the modelling algorithm can use.
    """
    pass


def modelling():
    """
    Perform the model fitting using the generated features.
    """
    pass


def validate():
    """
    Use cross-validation to estimate the accuracy of the model.
    """
    pass


def main():
    extract_data()
    transform_data()
    feature_engineering()
    modelling()
    validate()


if __name__ == '__main__':
    main()
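
The stubs above take no arguments and return nothing, so they do not show how data moves from one step to the next. A minimal sketch of the first three steps with explicit inputs and outputs, assuming a toy in-memory dataset; the data, the imputation rule, the squared feature, and the name `run_pipeline` are all illustrative and not from the gist:

```python
def extract_data():
    # Hypothetical in-memory source; None marks a missing value.
    return [1.0, None, 3.0]


def transform_data(rows):
    # Impute missing values with the mean of the observed values.
    observed = [r for r in rows if r is not None]
    mean = sum(observed) / len(observed)
    return [mean if r is None else r for r in rows]


def feature_engineering(rows):
    # One toy feature pair per row: (value, value squared).
    return [(r, r * r) for r in rows]


def run_pipeline():
    # Each step consumes the output of the previous one.
    rows = extract_data()
    clean = transform_data(rows)
    return feature_engineering(clean)
```

Modelling and validation would follow the same pattern, each taking the previous step's output as an argument, which keeps every step testable on its own.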