{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# OCI Object Store Examples\n",
"\n",
"The following are a series of examples showing the loading of data into the Oracle Object Store. For these to work with your own data you'll need to have your own Oracle Cloud account and uploaded a key. You can find details on how to achieve this [here](https://docs.cloud.oracle.com/iaas/Content/API/Concepts/apisigningkey.htm)\n",
"\n",
"I'll be using the Oracle OCI Python SDK which wrappers the REST API. You can find details on the API [here](https://oracle-cloud-infrastructure-python-sdk.readthedocs.io/en/latest/api/landing.html)\n",
"\n",
"Before we do anything we'll need to load the required needed Python modules."
]
},
{
"cell_type": "code",
"execution_count": 127,
"metadata": {},
"outputs": [],
"source": [
"import oci\n",
"import keyring\n",
"import ast\n",
"import os"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Configuration needed to connect \n",
"\n",
"I'm using the \"keyring\" Python module to hold the config for my connection to OCI (to avoid needlessly exposing sensitive information). It's of the form\n",
"```json\n",
"{\n",
" \"user\": \"your user ocid\",\n",
" \"key_file\": \"the path to your private key file\",\n",
" \"fingerprint\": \"the fingerprint of your public key\",\n",
" \"tenancy\": \"your tenancy ocid\",\n",
" \"region\": \"the region you are working with\"\n",
"}\n",
"```\n",
"After retrieving it from my keyring store I then need to convert it into a dictionary before using it. You can also validate the config you are using as well. Handy if this is the first time you've configured it."
]
},
{
"cell_type": "code",
"execution_count": 128,
"metadata": {},
"outputs": [],
"source": [
"my_config = ast.literal_eval(keyring.get_password('oci_opj','doms'))\n",
"oci.config.validate_config(my_config)"
]
},
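{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alternatively, a minimal sketch assuming you've already created the standard `~/.oci/config` file with a `DEFAULT` profile as part of the key setup linked above: the SDK can read that file directly instead of keyring."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A sketch of the alternative: read the config from the default OCI config file\n",
"# (~/.oci/config, DEFAULT profile) instead of keyring, then validate it the same way.\n",
"file_config = oci.config.from_file()\n",
"oci.config.validate_config(file_config)"
]
},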
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Create object storage client\n",
"\n",
"Then I just need to retireve a Object Storage client to start working with data"
]
},
{
"cell_type": "code",
"execution_count": 129,
"metadata": {},
"outputs": [],
"source": [
"object_storage_client = oci.object_storage.ObjectStorageClient(my_config)"
]
},
{
"cell_type": "code",
"execution_count": 130,
"metadata": {},
"outputs": [],
"source": [
"namespace = object_storage_client.get_namespace().data\n",
"bucket_name = \"doms_object_store\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Upload the contents of user directory to a bucket\n",
"\n",
"I'll create a bucket and then select all of the files from a user defined directory and upload them to the newly created bucket"
]
},
{
"cell_type": "code",
"execution_count": 131,
"metadata": {},
"outputs": [],
"source": [
"import os, io\n",
"\n",
"directory = '/Users/dgiles/datagenerator/bin/generateddata'\n",
"files_to_process = [file for file in os.listdir(directory) if file.endswith('csv')]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a bucket named \"Sales_Data\" and give it the tenancy ocid from your config."
]
},
{
"cell_type": "code",
"execution_count": 132,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" create_bucket_response = object_storage_client.create_bucket(\n",
" namespace,\n",
" oci.object_storage.models.CreateBucketDetails(\n",
" name='Sales_Data',\n",
" compartment_id=my_config['tenancy']\n",
" )\n",
" )\n",
"except Exception as e:\n",
" print(e.message)"
]
},
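{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you rerun the notebook the bucket may already exist. A minimal sketch of checking for it first, assuming `head_bucket` raises a `ServiceError` with a 404 status when the bucket isn't there."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A sketch: check whether the bucket already exists before trying to create it.\n",
"# Assumes head_bucket raises oci.exceptions.ServiceError with status 404 if it doesn't.\n",
"try:\n",
"    object_storage_client.head_bucket(namespace, 'Sales_Data')\n",
"    print('Bucket Sales_Data already exists')\n",
"except oci.exceptions.ServiceError as e:\n",
"    if e.status == 404:\n",
"        print('Bucket Sales_Data does not exist yet')\n",
"    else:\n",
"        raise"
]
},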
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then we just need to loop through the list of files in the directory specified and upload them to the newly created bucket"
]
},
{
"cell_type": "code",
"execution_count": 133,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Uploading file CUSTOMERS.csv\n",
"Uploading file PRODUCTS.csv\n",
"Uploading file COUNTRIES.csv\n",
"Uploading file PROMOTIONS.csv\n",
"Uploading file CHANNELS.csv\n",
"Uploading file SUPPLEMENTARY_DEMOGRAPHICS.csv\n",
"Uploading file SALES.csv\n"
]
}
],
"source": [
"bucket_name = 'Sales_Data'\n",
"for upload_file in files_to_process:\n",
" print('Uploading file {}'.format(upload_file))\n",
" object_storage_client.put_object(namespace, bucket_name, upload_file, io.open(os.path.join(directory,upload_file),'r'))"
]
},
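{
"cell_type": "markdown",
"metadata": {},
"source": [
"For larger files it may be worth using the SDK's `UploadManager`, which can split an upload into parts. A minimal sketch, assuming `oci.object_storage.UploadManager` and its `upload_file` method are available in the installed SDK version."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A sketch using UploadManager, which can perform multipart uploads for large files.\n",
"# UploadManager and upload_file are assumed to be available in this SDK version.\n",
"upload_manager = oci.object_storage.UploadManager(object_storage_client)\n",
"for upload_file in files_to_process:\n",
"    print('Uploading file {} via UploadManager'.format(upload_file))\n",
"    upload_manager.upload_file(namespace, bucket_name, upload_file, os.path.join(directory, upload_file))"
]
},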
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Retrieve a list of objects in a bucket\n",
"\n",
"The folowing retrieves a bucket and gets a list of objects in the bucket"
]
},
{
"cell_type": "code",
"execution_count": 134,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CHANNELS.csv\n",
"COUNTRIES.csv\n",
"CUSTOMERS.csv\n",
"PRODUCTS.csv\n",
"PROMOTIONS.csv\n",
"SALES.csv\n",
"SUPPLEMENTARY_DEMOGRAPHICS.csv\n"
]
}
],
"source": [
"bucket = object_storage_client.get_bucket(namespace, bucket_name)\n",
"object_list = object_storage_client.list_objects(namespace, bucket_name)\n",
"\n",
"for o in object_list.data.objects:\n",
" print(o.name)"
]
},
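{
"cell_type": "markdown",
"metadata": {},
"source": [
"`list_objects` returns at most a single page of results. A minimal sketch of paging through a larger bucket, assuming the call accepts `start` and `limit` parameters and the response exposes `next_start_with` when more pages remain."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A sketch of paging through a bucket with more objects than one response returns.\n",
"# Assumes list_objects accepts start/limit and the response model exposes next_start_with.\n",
"all_object_names = []\n",
"next_start = None\n",
"while True:\n",
"    if next_start:\n",
"        response = object_storage_client.list_objects(namespace, bucket_name, start=next_start, limit=100)\n",
"    else:\n",
"        response = object_storage_client.list_objects(namespace, bucket_name, limit=100)\n",
"    all_object_names.extend(o.name for o in response.data.objects)\n",
"    next_start = response.data.next_start_with\n",
"    if not next_start:\n",
"        break\n",
"print('{} objects in bucket {}'.format(len(all_object_names), bucket_name))"
]
},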
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Download the contents of an object\n",
"\n",
"The following downloads a file from a named bucket in chunks and writes it to user defined directory on the client"
]
},
{
"cell_type": "code",
"execution_count": 135,
"metadata": {},
"outputs": [],
"source": [
"# Attempt to download a file\n",
"\n",
"object_name = \"CUSTOMERS.csv\"\n",
"destination_dir = '/Users/dgiles/Downloads'.format(object_name) \n",
"get_obj = object_storage_client.get_object(namespace, bucket_name, object_name)\n",
"with open(os.path.join(destination_dir,object_name), 'wb') as f:\n",
" for chunk in get_obj.data.raw.stream(1024 * 1024, decode_content=False):\n",
" f.write(chunk)"
]
},
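{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before downloading it can be useful to check an object's metadata. A minimal sketch using `head_object`; the `Content-Length` and `ETag` header names are assumptions based on standard HTTP responses."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A sketch: fetch an object's metadata without downloading its contents.\n",
"# Assumes the response headers include Content-Length and ETag (standard HTTP headers).\n",
"head_response = object_storage_client.head_object(namespace, bucket_name, object_name)\n",
"print('Size in bytes: {}'.format(head_response.headers.get('Content-Length')))\n",
"print('ETag: {}'.format(head_response.headers.get('ETag')))"
]
},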
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Delete a bucket\n",
"We can just as simply delete the bucket we've just created but first we'll need to delete all of the objects inside of it."
]
},
{
"cell_type": "code",
"execution_count": 136,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Deleting object CHANNELS.csv\n",
"Deleting object COUNTRIES.csv\n",
"Deleting object CUSTOMERS.csv\n",
"Deleting object PRODUCTS.csv\n",
"Deleting object PROMOTIONS.csv\n",
"Deleting object SALES.csv\n",
"Deleting object SUPPLEMENTARY_DEMOGRAPHICS.csv\n",
"Deleting bucket\n"
]
}
],
"source": [
"object_list = object_storage_client.list_objects(namespace, bucket_name)\n",
"\n",
"for o in object_list.data.objects:\n",
" print('Deleting object {}'.format(o.name))\n",
" object_storage_client.delete_object(namespace, bucket_name, o.name)\n",
"\n",
"print('Deleting bucket') \n",
"response = object_storage_client.delete_bucket(namespace, bucket_name)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}