To measure a function's execution time, start IPython, import the code without
running its "main" part, and time the function. You can also do it directly by
importing and calling the timeit.timeit function.
$ ipython
In [1]: %run -n file.py
In [2]: %timeit some_function_in_file(arg1, arg2)
<output>
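Outside IPython, the same measurement can be done with the standard library's timeit.timeit; a minimal sketch (the function and its workload are placeholders):

```python
import timeit

def some_function(n):
    # Placeholder workload: sum the first n integers
    return sum(range(n))

# Run the statement 1,000 times and report the total time in seconds
total = timeit.timeit("some_function(10_000)", globals=globals(), number=1_000)
print(f"{total / 1_000:.6f} s per call")
```

Passing globals=globals() lets the timed statement see names defined in the script.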
If you want to profile a file as a whole, use:
$ python -m cProfile some_file.py
If you want to target specific parts, use:
# some_file.py
import cProfile
def some_function_in_file(arg1, arg2):
<code>
# Print to stdout
cProfile.run("some_function_in_file(arg1, arg2)")
# Save to file for exploration
cProfile.run("some_function_in_file(arg1, arg2)", filename="prof.out")
and execute:
$ python some_file.py
If you used an output file, read it with:
$ python -m pstats prof.out
prof.out% stats 10
<output>
prof.out% sort cumtime
prof.out% stats 10
<output>
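The same exploration can be scripted with the pstats module instead of the interactive browser; a sketch using a placeholder workload to generate prof.out:

```python
import cProfile
import pstats

# Generate a profile file first (placeholder workload)
cProfile.run("sum(i * i for i in range(100_000))", filename="prof.out")

# Load it, sort by cumulative time, and print the top 10 entries,
# mirroring the "sort cumtime" / "stats 10" commands above
stats = pstats.Stats("prof.out")
stats.sort_stats("cumtime").print_stats(10)
```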
First, install line_profiler with pip. Then apply the @profile decorator to any
function you want to profile line-by-line, and run:
$ kernprof -l some_file.py
$ python -m line_profiler some_file.py.lprof
<output>
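For reference, the decorator goes directly in the source file; kernprof injects profile into builtins, so no import is needed. A minimal sketch (the function is a placeholder; the try/except fallback just lets the file also run without kernprof):

```python
# some_file.py
try:
    profile  # injected into builtins by kernprof
except NameError:
    def profile(func):
        # No-op fallback so the file also runs under plain python
        return func

@profile
def slow_sum(n):
    # Each line's hit count and time will appear in the report
    total = 0
    for i in range(n):
        total += i
    return total

if __name__ == "__main__":
    slow_sum(100_000)
```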
First, install memory_profiler with pip. Then apply the @profile decorator to any
function you want to profile for memory usage, and run:
$ python -m memory_profiler some_file.py
<output>
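As with line_profiler, the decorator is placed in the source file, and running via python -m memory_profiler makes profile available. A sketch with a placeholder allocation-heavy function (the fallback keeps the file runnable on its own):

```python
# some_file.py
try:
    profile  # injected when run via python -m memory_profiler
except NameError:
    def profile(func):
        # No-op fallback for plain runs
        return func

@profile
def build_list(n):
    # The allocation shows up as a memory increment on this line
    data = [0] * n
    return len(data)

if __name__ == "__main__":
    build_list(1_000_000)
```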
With memory_profiler, we also get mprof to track memory usage over time. To use
it, remove the @profile decorators from your code, and run:
$ mprof run some_file.py
<output to an mprofile_*.dat file>
$ mprof plot <output_file>.dat
<output as graph>
Execution time is affected, in order, by:
- Overall system architecture
- Algorithms and data structures
- Algorithmic implementations
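The second point can be made concrete: swapping a data structure changes the complexity class of an operation. Membership tests are O(n) on a list but O(1) on average for a set:

```python
import timeit

items_list = list(range(100_000))
items_set = set(items_list)

# Membership test: O(n) scan on a list vs O(1) hash lookup on a set
t_list = timeit.timeit(lambda: 99_999 in items_list, number=100)
t_set = timeit.timeit(lambda: 99_999 in items_set, number=100)
print(f"list: {t_list:.4f}s  set: {t_set:.4f}s")
```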
Techniques to increase speed:
- Caching
  - LRU Cache (memory cache)
  - Joblib (disk cache)
- Pre-calculating
- Approximations
- Parallelization
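The memory-cache variant can be sketched with the standard library's functools.lru_cache (Joblib's Memory.cache is the disk-backed counterpart):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Without the cache this recursion is exponential; with it, linear
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(100))  # completes instantly thanks to memoization
```

fib.cache_info() reports hits and misses, which is useful for checking that the cache is actually being used.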
- Concurrency: manage many things at once (can be single threaded)
- Parallelization: do many things at once (requires multiple cores/processes)
- Amdahl's Law: how much you can improve (bounded by sequential time)
- IO-Bound: most time is spent on reading/writing inputs/outputs
- CPU-Bound: most time is spent on calculations
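Amdahl's Law can be made concrete: if a fraction p of the runtime is parallelizable across n workers, the best possible speedup is 1 / ((1 - p) + p / n), so the sequential part bounds the gain. A quick sketch:

```python
def amdahl_speedup(p, n):
    # p: parallelizable fraction of runtime, n: number of workers
    return 1 / ((1 - p) + p / n)

# Even with many workers, 90% parallel code caps out at 10x
print(amdahl_speedup(0.9, 4))     # ~3.08
print(amdahl_speedup(0.9, 1000))  # approaches 10
```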
Technique | Good for | Bad for | Notes |
---|---|---|---|
Threads | IO-Bound | CPU-Bound | Shared memory; can corrupt objects |
Processes | CPU-Bound | IO-Bound | No shared memory; inter-process comms are harder |
AsyncIO | Many connections | CPU-Bound | Requires async drivers all throughout |
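The thread row can be sketched with concurrent.futures (the fetch function is a placeholder that simulates blocking IO with sleep; swapping ThreadPoolExecutor for ProcessPoolExecutor is the usual move for CPU-bound work):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(i):
    time.sleep(0.1)  # stands in for a blocking IO call
    return i * 2

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(fetch, range(5)))
elapsed = time.perf_counter() - start

# The five 0.1s "IO calls" overlap, so total time is ~0.1s, not 0.5s
print(results, f"{elapsed:.2f}s")
```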
Name | Abbreviation | Per second |
---|---|---|
Millisecond | ms | 1,000 |
Microsecond | µs | 1,000,000 |
Nanosecond | ns | 1,000,000,000 |