@gearbox · Created August 13, 2020 09:08
Use concurrent I/O jobs for a speedup

A typical approach is to put the I/O-heavy part, like fetching data over the internet, and the data processing into the same function:

import random
import threading
import time
from concurrent.futures import ThreadPoolExecutor

import requests


def fetch_and_process_file(url):
    # current_thread() is the non-deprecated spelling of currentThread()
    thread_name = threading.current_thread().name

    print(thread_name, "fetch", url)
    data = requests.get(url).text

    # "process" result
    time.sleep(random.random() / 4)  # simulate work
    print(thread_name, "process data from", url)

    result = len(data) ** 2
    return result


threads = 2
urls = ["https://google.com", "https://python.org", "https://pypi.org"]

with ThreadPoolExecutor(max_workers=threads) as executor:
    results = executor.map(fetch_and_process_file, urls)

print()
print("results:", list(results))

outputs (the exact interleaving varies from run to run):

ThreadPoolExecutor-0_0 fetch https://google.com
ThreadPoolExecutor-0_1 fetch https://python.org
ThreadPoolExecutor-0_0 process data from https://google.com
ThreadPoolExecutor-0_0 fetch https://pypi.org
ThreadPoolExecutor-0_0 process data from https://pypi.org
ThreadPoolExecutor-0_1 process data from https://python.org
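
The interleaved lines show the point of the thread pool: while one worker waits on a network response, the other can fetch or process, so total wall-clock time approaches that of the slowest requests rather than the sum of all of them.

As a rough check on that speedup, the threaded run can be timed against a sequential baseline. This is a minimal sketch, not part of the original gist; it reuses fetch_and_process_file, urls, and threads from the snippet above and assumes the network round trips dominate the runtime:

import time
from concurrent.futures import ThreadPoolExecutor

# Sequential baseline: one URL after another on the main thread.
start = time.perf_counter()
sequential_results = [fetch_and_process_file(url) for url in urls]
print("sequential:", time.perf_counter() - start, "s")

# Threaded run: up to `threads` fetches overlap while waiting on the network.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=threads) as executor:
    threaded_results = list(executor.map(fetch_and_process_file, urls))
print("threaded:", time.perf_counter() - start, "s")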