#!/usr/bin/env bash
# memusg -- Measure memory usage of processes
# Usage: memusg COMMAND [ARGS]...
#
# Author: Jaeho Shin <netj@sparcs.org>
# Created: 2010-08-16
############################################################################
# Copyright 2010 Jaeho Shin.                                               #
#                                                                          #
# Licensed under the Apache License, Version 2.0 (the "License");          #
# you may not use this file except in compliance with the License.         #
# You may obtain a copy of the License at                                  #
#                                                                          #
# http://www.apache.org/licenses/LICENSE-2.0                               #
#                                                                          #
# Unless required by applicable law or agreed to in writing, software      #
# distributed under the License is distributed on an "AS IS" BASIS,        #
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #
# See the License for the specific language governing permissions and      #
# limitations under the License.                                           #
############################################################################
set -um
# check input
[[ $# -gt 0 ]] || { sed -n '2,/^#$/ s/^# //p' <"$0"; exit 1; }
# TODO support more options: peak, footprint, sampling rate, etc.
pgid=$(ps -o pgid= $$)
# make sure we're in a separate process group
if [[ "$pgid" == "$(ps -o pgid= $(ps -o ppid= $$))" ]]; then
    cmd=
    set -- "$0" "$@"
    for a; do cmd+="'${a//"'"/"'\\''"}' "; done
    exec bash -i -c "$cmd"
fi
# detect operating system and prepare measurement
case $(uname) in
    Darwin|*BSD) sizes() { /bin/ps -o rss= -g $1; } ;;
    Linux)       sizes() { /bin/ps -o rss= -$1;   } ;;
    *) echo "$(uname): unsupported operating system" >&2; exit 2 ;;
esac
# monitor the memory usage in the background.
(
    peak=0
    while sizes=$(sizes $pgid)
    do
        set -- $sizes
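        # sum the samples: prefixing each RSS figure with '+' builds an arithmetic expression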
        sample=$((${@/#/+}))
        let peak="sample > peak ? sample : peak"
        sleep 0.1
    done
    echo "memusg: peak=$peak" >&2
) &
monpid=$!
# run the given command
exec "$@"
+1 Just used this for some quick profiling. Thanks.
Nice work! I used this script to do some quick profiling of some PHP scripts.
Just a heads up...
User Jonathan Clark has a revised and improved version of this memusg script here:
https://github.com/jhclark/memusg
It seems to have fixed all the problems with child processes, threads, and nohup.
The only disadvantage is that it measures the virtual size instead.
But you only need to change the expression 'vsize=' to 'rssize=' to get the peak (high-water-mark) RSS memory usage.
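For example, an edit along these lines should do it (a sketch; it assumes 'vsize=' appears in that script as described, and note that BSD/macOS sed wants -i '' instead of -i):

sed -i 's/vsize=/rssize=/' memusg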
Just FYI, the version linked by @XICO2KX does not work on OSX/BSD (due to differences in the ps binary).
Does anyone have an idea what I can do if I have to monitor a program that can only be invoked like this:
executable parameter parameter < input
I could adapt the script so it only works with that tool: pass the input as a normal parameter and, instead of "just" calling $@, split the arguments and construct the correct call. But I'd like these kinds of monitoring tools to stay generic.
For example, if I type
/usr/bin/time -v sort < unsorted_file
I get the sorted output as expected. If I do the same with your script, it does not work.
I found the answer:
exec "$@" <&0
should pass stdin through to the command.
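That is, the last line of the script becomes:

# run the given command, forwarding our stdin to it
exec "$@" <&0

and memusg sort < unsorted_file then behaves like /usr/bin/time -v sort < unsorted_file.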
Hi, @netj! I'm a fan of your script and have been using it for a long time. I want to include it in my repository, https://github.com/tclssg/benchmark-data, to be run from a benchmarking script. Could you state what license it is distributed under?
Hi, I'm trying to use this script to monitor the peak RAM usage of scripts I'm running within an SGE environment on a shared server (using qsub). I've found your script useful and easy to use, so I decided to use it here.
However, the SGE jobs always fail when I try to run memusg, with the following error message:
bash: no job control in this shell '': unknown terminal type.
memusg only works if I run it in a local shell (NOT submitting it via qsub), which sadly is not an option for the jobs I want to run.
Is there a way to fix this?
@dbohdan @holtgrewe: Thanks for asking! Didn't know this was being used by so many people. I just added the license (Apache 2.0) to the script. You are free to use/include/modify it. Looks like I should try to iron out the issues raised here and there.
Fantastic tool! I use it quite frequently to profile programs for parallelization. I noticed that sending it to the background with a terminating "&" seems to cause problems - in particular, it always closes my shell session. Would it be too difficult to make memusg compatible with background execution? :P
Can I suggest this is moved to an actual repository instead of just a gist? Would make it possible to send pull requests and create issues properly.
+1 to @EmilStenstrom
FWIW, though it's nearly six years since @brassel asked his first question, just yesterday I wrestled with an issue whereby memusg was hanging in a driver script. I have an extensive comment in my script explaining the issue with bash -i -c "$cmd" causing my script to hang, and how job control (set -m) solves the problem. An answer to "Why can't I use job control in a bash script?" helped me zero in on the set -m solution.
A reproduction, per my comment:
First create the following script; I'll call it foo.sh:
memusg ls >"foo0.out" 2>&1
memusg ls >"foo1.out" 2>&1
Then:
- Run it as bash foo.sh and it will hang; run fg to continue.
- Run it as bash -m foo.sh and it will complete.
Also try editing the output redirections per the following and running it with bash foo.sh:
- Remove the first 2>&1 redirection and it will hang.
- Remove the second 2>&1 redirection and it will complete.
- Remove both 2>&1 redirections and it will complete.
- Remove the first >"foo0.out" redirection and it will hang.
- Remove the second >"foo0.out" redirection and it will complete.
I also validated the fact that set -m obviated the need for a new process group by adding the following to memusg before the exec bash -i -c "$cmd" call, then watching when it appeared (bash) and when it didn't (bash -m or set -m):
echo "EXECING FOR NEW PROCESS GROUP" >&2
What is the unit of the memusg output, bits or bytes? Thanks.
really cool script, thanks!
Thanks for the cool tool! :-)
@bbsunchen it appears to be using the value of the RSS column from ps which, according to the man page, is measured "in 1024 byte units". But that can't be right, as I'm getting usage values like 138487884 and I don't, sadly, have 138 GB of RAM.
The man page linked above can no longer be found. The unit of RSS from ps is now documented as kilobytes: https://man7.org/linux/man-pages/man1/ps.1.html
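To capture the peak programmatically and convert it, something like this should work (a sketch; it assumes the "memusg: peak=" stderr format from the script above and an interactive shell, given the job-control caveats discussed earlier):

peak_kb=$(memusg sort /etc/services 2>&1 >/dev/null | awk -F= '/^memusg: peak/ {print $2}')
echo "peak: ${peak_kb} KiB (~$((peak_kb / 1024)) MiB)"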
I tried out this script, but it didn't produce the info I wanted to capture on my M2 Max MBP. Since Apple Silicon uses unified memory and treats RAM and VRAM as the same thing, the output of ps does not match what is shown in Activity Monitor for graphically intensive apps -- for me, this is Stable Diffusion running in Python. After some experimenting, I came up with the following one-liner that works well enough for me:
footprint --noCategories --sample 0.1 -f bytes -p [PID] | awk '/[process name]/ {print $5; fflush()}' >> /[path]/memout.txt
The footprint command more accurately accounts for the full memory use of a process. --noCategories prevents the output of a breakdown of various categories of memory use that isn't of interest to me. --sample 0.1 polls the memory use every 0.1s, which is overkill for most uses and often not attainable -- footprint will dial back the polling rate automatically if it finds it can't keep up. -f bytes outputs the exact memory usage to the byte, which is important because the default formatting often only outputs two significant figures for large memory usage. And of course you need to supply the PID to monitor with -p.
The awk command further refines the output of footprint, plucking out just the line (though you need to provide the process name that footprint outputs in order for this to work; in my case it was "Python") and then the field containing the memory use number. The fflush() statement is important so that awk will output continuously rather than buffering the output until footprint quits, since if you terminate the monitoring with Ctrl-C, awk will terminate too before it can write from its buffer.
I then redirect the filtered output to a file so I can take data for as long as I want without filling up the Terminal output. Note that the output is a full history of memory use rather than just a maximum, so it works a bit differently compared to netj's script. However, you can easily import the data into Numbers or Excel for graphing and picking out the maximum yourself. You could probably easily modify the command into a short bash script that could keep track of the maximum for you (just define a variable to hold the last seen maximum value and compare its contents with each new output from footprint/awk), but I like being able to see when during process execution certain levels of memory use occurred.
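A minimal sketch of that idea (the PID and the process name are placeholders, and it assumes the awk filter above emits one byte count per line):

#!/usr/bin/env bash
pid=12345   # placeholder: the PID of the process to monitor
max=0
footprint --noCategories --sample 0.1 -f bytes -p "$pid" |
awk '/Python/ {print $5; fflush()}' |
while read -r bytes; do
    # report each new high-water mark as it happens
    (( bytes > max )) && { max=$bytes; echo "max so far: $max bytes"; }
done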
Hopefully this will be helpful to someone.
@dbolser It's kB, perhaps. See also the comments on my answer to the Stack Overflow question.