Algorithm Parallelism and Threading Tutorial

This tutorial is primarily targeted at an advanced audience looking to improve the runtime performance of existing algorithms. If you'd like the introductory tutorials, please check this page first.

You might even be making requests to multiple downstream algorithms from within your current algorithm, reusing building blocks to make the most of the platform. For example:

import Algorithmia
import numpy as np
from PIL import Image


def get_image_as_numpy(url, client):
    """Uses the Smart Image Downloader algorithm to format and download images from the web or other places."""
    input = {"image": url}
    result = client.algo("util/SmartImageDownloader/0.2.x").pipe(input).result
    local_path = client.file(result['savePath'][0]).getFile()
    with Image.open(local_path) as f:
        image = np.asarray(f)
    return image


def do_work(image_array):
    """Does some computer vision work and needs a numpy array to function"""
    ...

def apply(input):
    client = Algorithmia.client()
    image_data = get_image_as_numpy(input, client)
    results = do_work(image_data)
    return results

This works perfectly for situations where you just want to add image support to an algorithm that does serial processing: some monolithic work on a GPU or other resource to produce a final result, one image at a time.
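For instance, a call into this version of the algorithm might look like the following (a minimal sketch; the URL is a hypothetical placeholder):

if __name__ == "__main__":
    # Hypothetical single-image invocation of the apply() entrypoint above.
    output = apply("https://example.com/some_image.png")
    print(output)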

However, if you're dealing with a production system, you're most likely using more than one downstream algorithm, and you'll also want to improve performance as much as possible, especially in batch. In that case, you're in luck! This tutorial goes over the different mechanisms you can use, both inside your algorithm and outside of it, to get the results you're looking for.

Algorithm solutions

Async and Futures

Your algorithm above works great, but now you need to improve performance and enable batch processing. Your algorithm executes its code very quickly (which is great!), but the HTTP requests to actually download images take a variable amount of time and eat up a huge chunk of the total compute time.

This is where an async function (and Futures) may be of value: this structure allows you to make a series of requests in parallel and simply wait for them all to finish, regardless of which image downloads first or second.

import Algorithmia
import numpy as np
from PIL import Image
import asyncio

# Same example, but this time with async and Futures

def get_image_as_numpy(url, client):
    """Uses the Smart Image Downloader algorithm to format and download images from the web or other places."""
    input = {"image": url}
    result = client.algo("util/SmartImageDownloader/0.2.x").pipe(input).result
    local_path = client.file(result['savePath'][0]).getFile()
    with Image.open(local_path) as f:
        image = np.asarray(f)
    return image


def do_work(image_array):
    """Does some computer vision work and needs a numpy array to function"""
    ...

# We've added a processor function that gets and processes an image, but is prefixed with 'async'.
# We did this because, when dealing with batches in image processing algorithms, the bottleneck is
# commonly HTTP: getting the images from a remote resource into your system.


# You can read more about 'asyncio' here: https://docs.python.org/3/library/asyncio.html
# Bear in mind that asyncio.run and the async/await syntax used below require Python 3.7 or newer.

async def process_url(url, client):
    # The Algorithmia client call is blocking, so hand it off to a worker thread;
    # awaiting it lets several downloads overlap instead of running back to back.
    loop = asyncio.get_running_loop()
    image_data = await loop.run_in_executor(None, get_image_as_numpy, url, client)
    result = do_work(image_data)
    return result


async def process_batch(urls, client):
    # Schedule one task per URL, then gather them all; gather takes the futures
    # as separate arguments, hence the * unpacking.
    tasks = [asyncio.ensure_future(process_url(url, client)) for url in urls]
    return await asyncio.gather(*tasks)


def apply(input):
    client = Algorithmia.client()
    # We have a list of inputs that we're going to want to loop over
    if isinstance(input, list):
        # asyncio.run starts an event loop, waits for every image to finish,
        # and returns the results in the same order as the input URLs.
        results = asyncio.run(process_batch(input, client))
        return results
    elif isinstance(input, str):
        # And if we're only processing one image at a time, let's keep the old functionality as well
        image_data = get_image_as_numpy(input, client)
        result = do_work(image_data)
        return result
    else:
        raise Exception("Invalid input, expecting a list of URLs or a single URL string.")

This structure can be used in many languages, including Scala, Java and Rust.
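If you'd rather stay with plain threads in Python, the same fan-out-and-gather structure is available through concurrent.futures. Here's a minimal sketch, assuming the same get_image_as_numpy and do_work helpers from above; the worker count is an arbitrary choice:

import Algorithmia
from concurrent.futures import ThreadPoolExecutor


def apply(input):
    client = Algorithmia.client()
    if isinstance(input, list):
        # Each thread blocks on its own HTTP download, so the downloads overlap;
        # executor.map returns results in the same order as the input URLs.
        with ThreadPoolExecutor(max_workers=8) as executor:
            results = list(executor.map(
                lambda url: do_work(get_image_as_numpy(url, client)), input))
        return results
    return do_work(get_image_as_numpy(input, client))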

But what if the algorithm has a different story: it takes time to compute a result, sometimes upwards of 10-30 seconds. That isn't terrible for a demo, but if you're tasked with turning it into a high-performance pipeline that can process many requests per second, such as a streaming pipeline, how would you do that?

Multi-Request solutions

These next examples will help you grasp this and give you the tools you need to build the right algorithm.

Algorithm Orchestration
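As a rough illustration of the idea, an orchestrator can fan a batch of requests out to a slow downstream algorithm so that its 10-30 second compute times overlap rather than stack up. The sketch below is only one way to do this, and the downstream algorithm name is hypothetical:

import Algorithmia
from concurrent.futures import ThreadPoolExecutor

# Hypothetical downstream algorithm that takes 10-30 seconds per request.
SLOW_ALGO = "your_username/slow_algorithm/1.0.x"


def apply(input):
    """Fans a batch of inputs out to the slow downstream algorithm in parallel."""
    client = Algorithmia.client()
    with ThreadPoolExecutor(max_workers=10) as executor:
        # Each .pipe() call blocks in its own thread, so the slow downstream
        # requests run concurrently instead of back to back.
        results = list(executor.map(
            lambda item: client.algo(SLOW_ALGO).pipe(item).result, input))
    return results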
