@Bakhtiyar-Garashov
Last active March 9, 2022 21:04
How to unit test AWS S3 integration with Python

Hi, reader. I hope you'll find this short practical article helpful (and be honest: we all hate writing unit tests, I know).

Today, while developing a service at my day job, I ran into an issue when trying to write simple unit tests for a few functions. First, let me introduce the actual code I want to cover with tests.

Imagine we have three functions, as shown below. While writing them, we need to keep in mind that we are going to write unit tests for them; it is really important to write testable code (I'd suggest reading this book). If a function violates SOLID's single-responsibility principle and does more than one thing (e.g., creating a session and downloading a file from the bucket), it will be cumbersome to test. With that in mind, I am going to parameterize the important dependencies so that in the unit tests we can mock them and pass them to the functions easily.
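To make the idea concrete, here is a minimal sketch (not part of the article's code; all names are hypothetical) of why injecting the bucket makes a function trivially testable: any object with the right method can stand in for the real thing.

```python
class FakeBucket:
    """A hypothetical stand-in for an S3 Bucket, used only for illustration."""

    def __init__(self):
        self.calls = []

    def download_file(self, key, destination):
        # Record the call instead of touching the network.
        self.calls.append((key, destination))


def download_report(bucket, name):
    # Because the bucket is a parameter, tests can pass in a fake.
    bucket.download_file(f"{name}.csv", f"/tmp/{name}.csv")


fake = FakeBucket()
download_report(fake, "sales")
print(fake.calls)  # [('sales.csv', '/tmp/sales.csv')]
```

Had `download_report` created its own `boto3` session internally, there would be no seam for the test to slip a fake through.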

Also, please note that I am not going to cover every detail of each function, such as handling all the different kinds of possible exceptions. You can easily figure those out.

  1. A function which generates a connection session with the S3 bucket in the AWS environment:
from typing import Union
import boto3
import botocore
from mypy_boto3_s3.service_resource import Bucket
from logger import log, LogLevel

def bucket_session_generator(region: str, bucket_name: str) -> Union[Bucket, None]:
    """Generates a bucket connection"""

    try:
        session = boto3.Session(region_name=region)
        s3 = session.resource("s3")
        bucket = s3.Bucket(bucket_name)
        return bucket
    except Exception as e:
        log(LogLevel.ERROR, f"Error: {e}")
        
    return None
  2. The second function takes the file name and the S3 bucket object (returned from function number 1) as parameters, downloads the file, and saves it to the local filesystem:
def download_file(file_name: str, bucket: Bucket) -> None:
    """Downloads a single file from the bucket"""

    try:
        bucket.download_file(f"{file_name}.csv", f"/tmp/{file_name}.csv")
    except botocore.exceptions.ClientError as e:
        if e.response["Error"]["Code"] == "404":
            log(LogLevel.ERROR, f"The object {file_name} does not exist.")   
  3. The last function again takes the file name and bucket object as parameters and uploads the file to the S3 bucket:
def upload_file(file_name: str, bucket: Bucket) -> None:
    """Uploads a file to the s3 bucket"""

    try:
        bucket_object = bucket.Object(file_name)
        with open(f"/tmp/{file_name}", "rb") as file:
            bucket_object.upload_fileobj(file)
    except Exception as e:
        log(LogLevel.ERROR, f"Something went wrong when trying to upload file. Error: {e}")

Okay. Now that we have all the actual functions in place, it is time to move to the important step: writing tests for them. To mock the S3 bucket we are going to use a library called moto. It is not limited to S3; it can mock many other AWS services as well.
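If the test dependencies are not installed yet, a typical setup looks like this (package names inferred from the imports used in this article):

```shell
# moto provides the AWS mocks; pytest runs the tests;
# mypy-boto3-s3 supplies the Bucket type annotation used above.
pip install boto3 "moto[s3]" pytest mypy-boto3-s3
```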

  1. As a preliminary step we need to create the fixtures. Create a file named conftest.py and add the code below:
import boto3
import pytest

@pytest.fixture()
def mock_s3_client():
    client = boto3.client('s3', region_name='us-east-1', aws_access_key_id='test', aws_secret_access_key='test')
    return client


@pytest.fixture()
def mock_s3_resource():
    s3 = boto3.resource('s3', region_name='us-east-1', aws_access_key_id='test', aws_secret_access_key='test')
    return s3

As each function's name suggests, they mock an S3 client and an S3 resource object which can be used within the tests. It is important to note that if you don't use us-east-1 as the region while mocking, you can run into some odd issues.

  2. Let's have a look at the test for the bucket_session_generator function. It is pretty simple and nothing is hard to understand.
from typing import Final

S3_BUCKET: Final[str] = "mock-bucket"
FILENAME: Final[str] = "test"

def test_bucket_session_generator(self, mock_s3_resource):
    bucket = bucket_session_generator("eu-west-1", S3_BUCKET)
    assert bucket
    assert bucket.name == S3_BUCKET
  3. Next, the test for the upload_file function:
def test_upload_file(self, mock_s3_resource, mock_s3_client):
    with open(f"/tmp/{FILENAME}.csv", "w") as f:
        f.write("test content")

    mock_s3_client.create_bucket(Bucket=S3_BUCKET)
    upload_file(f"{FILENAME}.csv", mock_s3_resource.Bucket(S3_BUCKET))
    assert (
        mock_s3_resource.Object(S3_BUCKET, f"{FILENAME}.csv").get()["Body"].read().decode("utf-8")
        == "test content"
    )

After creating a CSV file, writing the text "test content" to it, and uploading it to the bucket, we can verify that the object in the bucket has the expected content, and the test passes.

  4. The test for the download_file function:
def test_download_file(self, mock_s3_resource, mock_s3_client):
    # Create the local file first, so this test does not depend on
    # test_upload_file having run before it.
    with open(f"/tmp/{FILENAME}.csv", "w") as f:
        f.write("test content")

    mock_s3_client.create_bucket(Bucket=S3_BUCKET)
    upload_file(f"{FILENAME}.csv", mock_s3_resource.Bucket(S3_BUCKET))
    download_file(f"{FILENAME}", mock_s3_resource.Bucket(S3_BUCKET))
    assert os.path.exists(f"/tmp/{FILENAME}.csv")
    with open(f"/tmp/{FILENAME}.csv", "r") as f:
        assert f.read() == "test content"

Here we call the previously tested upload_file function just before the function under test, so there is a file in the bucket and we can easily verify the behaviour of download_file.

  5. This is the final step: to run the tests successfully we need to put the functions in a class (as you might have noticed, each function's first parameter is self) and add an important decorator so that the bucket is mocked. Long story short, here is the full test class:
import os
from typing import Final

from moto import mock_s3

from utils.s3_file_handling import bucket_session_generator, download_file, upload_file

S3_BUCKET: Final[str] = "mock-bucket"
FILENAME: Final[str] = "test"


@mock_s3
class TestS3Storage:

    def test_bucket_session_generator(self, mock_s3_resource):
        bucket = bucket_session_generator("eu-west-1", S3_BUCKET)
        assert bucket
        assert bucket.name == S3_BUCKET

    def test_upload_file(self, mock_s3_resource, mock_s3_client):
        with open(f"/tmp/{FILENAME}.csv", "w") as f:
            f.write("test content")

        mock_s3_client.create_bucket(Bucket=S3_BUCKET)
        upload_file(f"{FILENAME}.csv", mock_s3_resource.Bucket(S3_BUCKET))
        assert (
            mock_s3_resource.Object(S3_BUCKET, f"{FILENAME}.csv").get()["Body"].read().decode("utf-8")
            == "test content"
        )

    def test_download_file(self, mock_s3_resource, mock_s3_client):
        # Create the local file first, so this test does not depend on
        # test_upload_file having run before it.
        with open(f"/tmp/{FILENAME}.csv", "w") as f:
            f.write("test content")

        mock_s3_client.create_bucket(Bucket=S3_BUCKET)
        upload_file(f"{FILENAME}.csv", mock_s3_resource.Bucket(S3_BUCKET))
        download_file(f"{FILENAME}", mock_s3_resource.Bucket(S3_BUCKET))
        assert os.path.exists(f"/tmp/{FILENAME}.csv")
        with open(f"/tmp/{FILENAME}.csv", "r") as f:
            assert f.read() == "test content"
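With conftest.py and the test class in place, the suite can be run with pytest (the file path here is an assumption about your project layout):

```shell
# -v prints one line per test; fixtures in conftest.py are picked up automatically
pytest -v tests/test_s3_storage.py
```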

P.S. I am relatively new to writing technical articles and I usually mess things up while writing. So, if you have any feedback or opinions, don't hesitate to drop a comment below.

Thanks for reading!!!
