@netj
Last active November 5, 2024 15:18
memusg -- Measure memory usage of processes
#!/usr/bin/env bash
# memusg -- Measure memory usage of processes
# Usage: memusg COMMAND [ARGS]...
#
# Author: Jaeho Shin <netj@sparcs.org>
# Created: 2010-08-16
############################################################################
# Copyright 2010 Jaeho Shin. #
# #
# Licensed under the Apache License, Version 2.0 (the "License"); #
# you may not use this file except in compliance with the License. #
# You may obtain a copy of the License at #
# #
# http://www.apache.org/licenses/LICENSE-2.0 #
# #
# Unless required by applicable law or agreed to in writing, software #
# distributed under the License is distributed on an "AS IS" BASIS, #
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #
# See the License for the specific language governing permissions and #
# limitations under the License. #
############################################################################
set -um
# check input
[[ $# -gt 0 ]] || { sed -n '2,/^#$/ s/^# //p' <"$0"; exit 1; }
# TODO support more options: peak, footprint, sampling rate, etc.
pgid=$(ps -o pgid= $$)
# make sure we're in a separate process group
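if [[ "$pgid" == "$(ps -o pgid= $(ps -o ppid= $$))" ]]; then
    # still in the parent's process group: re-run this script under an
    # interactive bash, which places it in a process group of its own
    cmd=
    set -- "$0" "$@"
    # single-quote every argument so it survives the trip through bash -c
    for a; do cmd+="'${a//"'"/"'\\''"}' "; done
    exec bash -i -c "$cmd"
fi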
# detect operating system and prepare measurement
case $(uname) in
    Darwin|*BSD) sizes() { /bin/ps -o rss= -g $1; } ;;
    Linux) sizes() { /bin/ps -o rss= -$1; } ;;
    *) echo "$(uname): unsupported operating system" >&2; exit 2 ;;
esac
# monitor the memory usage in the background.
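(
peak=0
# sample until ps fails, i.e. until the process group has no members left
while sizes=$(sizes $pgid)
do
    # split the ps output into one positional parameter per process
    set -- $sizes
    # prefix each value with + and evaluate the result as a sum
    sample=$((${@/#/+}))
    let peak="sample > peak ? sample : peak"
    sleep 0.1
done
echo "memusg: peak=$peak" >&2
) &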
monpid=$!
# run the given command
exec "$@"
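
The sampling loop above relies on a compact bash idiom: `set -- $sizes` splits the `ps` output into positional parameters, and `${@/#/+}` prefixes each of them with `+` so that the whole list can be evaluated as one arithmetic expression. A minimal sketch of that idiom in isolation, using made-up sample values and a hypothetical helper name `sum_sizes` (neither is part of memusg):

```shell
#!/usr/bin/env bash
# sum_sizes is a hypothetical helper, not part of memusg itself.
# It sums whitespace-separated integers the same way the monitor loop does.
sum_sizes() {
    # Word-split the single argument into positional parameters ($1, $2, ...).
    set -- $1
    # ${@/#/+} prepends "+" to every parameter: "10 20 30" becomes "+10 +20 +30".
    # The leading 0 keeps the expression valid when there are no values at all.
    echo $(( 0 ${@/#/+} ))
}

# Made-up RSS samples for three processes.
sum_sizes "10 20 30"
```

The leading `0` is an addition for this sketch; the script itself does not need it because the loop only runs while `ps` still reports at least one process.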
@XICO2KX

XICO2KX commented Mar 13, 2011

Hello!
I have a problem using your tool with a multi-threaded program...

When started, this program launches several copies of itself.
(Different processes with different PIDs but with the same name appear on "top".)

I am using a shell script to launch "memusg". Something like:
[script.sh] memusg.sh command

And I run it through "nohup":
nohup script.sh &

But by watching "top", the command itself isn't executed!
(Or it shows up and quickly disappears constantly!)

And the output file shows multiple times:
[nohup.out] bash: no job control in this shell

The only processes that I can see running are:
/bin/bash script.sh
bash memusg.sh command

But if I kill the process (the one initiated by "nohup"):
/bin/bash script.sh
the command gets executed!

Do you know what the problem is and how I can fix it?
Thank you very much!

@dbolser

dbolser commented Oct 28, 2011

Sorry for being dumb... what is the output? kb? Cheers,
Dan.

@netj
Author

netj commented Oct 31, 2011

@brassel and @XICO2KX I know it's very late, but I'll look into this problem. It seems there're quite a few folks who want to use this in various ways.

@gbluma

gbluma commented Feb 28, 2013

+1 Just used this for some quick profiling. Thanks.

@nderevj

nderevj commented Mar 21, 2013

Nice work! I used this script to do some quick profiling of some PHP scripts.

@XICO2KX

XICO2KX commented Jul 12, 2013

Just a heads up...
User Jonathan Clark has a reviewed and improved version of this memusg script here:
https://github.com/jhclark/memusg
It seems to have fixed all the problems with child processes, threads and nohup.
The only disadvantage is that it measures the Virtual Size instead.
But you only need to change the expression 'vsize=' to 'rssize=' to get the Peak (High Water Mark) RSS Memory Usage.

@kilburn

kilburn commented Jan 16, 2014

Just FYI, the version linked by @XICO2KX does not work on OSX/BSD (due to differences in the ps binary).

@Garonenur

Does anyone have an idea what I can do if I have to measure a program that can only be invoked like this:

executable parameter parameter < input

I could adapt the script so that it only works with that tool: pass the input as a normal parameter and, instead of just calling $@, split the arguments and construct the correct call. But I'd like this kind of monitoring tool to work more generically.

for example if I type

/usr/bin/time -v sort < unsorted_file

I get the sorted output as expected. If I do the same with your script, it does not work.

@Garonenur

I found the answer:

exec "$@" <&0

should pass the stdin to the command

@dbohdan

dbohdan commented Feb 13, 2015

Hi, @netj! I'm a fan of your script and have been using it for a long time. I want to include it in my repository, https://github.com/tclssg/benchmark-data, to be run from a benchmarking script. Could you state what license it is distributed under?

@holtgrewe

I second the question of @dbohdan. @netj, what is the license here? MIT? Public domain?

@jvollme

jvollme commented Aug 6, 2015

Hi, I'm trying to use this script to monitor peak RAM usage of scripts I'm running within an SGE environment on a shared server (using qsub). I've found your script useful and easy to use, so I decided to use it here.
However, the SGE-jobs always fail when I try to run "memusg" with the following error message:
bash: no job control in this shell '': unknown terminal type.
memusg only works if I run it in a local shell (NOT submitting it via qsub), which sadly is not an option for the jobs I want to run.
Is there a way to fix this?

@netj
Author

netj commented Nov 10, 2015

@dbohdan @holtgrewe: Thanks for asking! Didn't know this was being used by so many people. I just added the license (Apache 2.0) to the script. You are free to use/include/modify it. Looks like I should try to iron out the issues raised here and there.

@tantrev

tantrev commented Mar 29, 2016

Fantastic tool! Use it quite frequently to profile programs for parallelization. I noticed that sending it to background with a terminating "&" seems to cause problems - in particular, it always closes my shell session. Would it be too difficult to make memusg compatible with background execution? :P

@EmilStenstrom

EmilStenstrom commented May 15, 2016

Can I suggest this is moved to an actual repository instead of just a gist? Would make it possible to send pull requests and create issues properly.

@mbland

mbland commented Aug 20, 2016

FWIW, though it's nearly six years since @brassel asked his first question, just yesterday I wrestled with an issue whereby memusg was hanging in a driver script. I have an extensive comment in my script explaining the issue with `bash -i -c "$cmd"` causing my script to hang, and how job control (`set -m`) solves the problem. An answer to "Why can't I use job control in a bash script?" helped me zero in on the `set -m` solution.

A reproduction, per my comment:

First create the following script; I'll call it foo.sh:

memusg ls >"foo0.out" 2>&1
memusg ls >"foo1.out" 2>&1

Then:

  • Run it as bash foo.sh and it will hang; run fg to continue.
  • Run it as bash -m foo.sh and it will complete.

Also try editing the output redirections per the following and running it with bash foo.sh:

  • Remove the first 2>&1 redirection and it will hang.
  • Remove the second 2>&1 redirection and it will complete.
  • Remove both 2>&1 redirections and it will complete.
  • Remove the first >"foo0.out" redirection and it will hang.
  • Remove the second >"foo1.out" redirection and it will complete.

I also validated the fact that set -m obviated the need for a new process group by adding the following to memusg before the exec bash -i -c "$cmd" call, then watched when it appeared (bash) and when it didn't (bash -m or set -m):

echo "EXECING FOR NEW PROCESS GROUP" >&2

@bbsunchen

What is the unit of memusg output, bit or byte? Thanks.

@cristiprg

really cool script, thanks!

@gogothegreen

Thanks for the cool tool! :-)

@tamlyn

tamlyn commented Sep 14, 2017

@bbsunchen it appears to be using the value of the RSS column from ps which, according to the man page, is measured "in 1024 byte units". But that can't be right as I'm getting usage values like 138487884 and I don't, sadly, have 138 GB of RAM.

@lolrenceH

The man page linked above can no longer be found. The current ps man page gives the unit of RSS as kilobytes: https://man7.org/linux/man-pages/man1/ps.1.html
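
Taking the unit at face value (RSS in kilobytes, counted as 1024-byte units per the earlier quote), the peak printed by memusg can be converted with plain shell arithmetic. The sketch below uses a hypothetical helper `kib_to_mib` and the figure quoted earlier in the thread. Keep in mind that memusg adds up the RSS of every process in the group, so pages shared between processes may be counted more than once, which could be one reason a total looks larger than physical RAM.

```shell
#!/usr/bin/env bash
# kib_to_mib is a hypothetical helper, not part of memusg.
# Assumes the input is a whole number of 1024-byte units, as ps reports RSS.
kib_to_mib() {
    echo $(( $1 / 1024 ))   # integer division, rounds down
}

# Value quoted earlier in the thread.
kib_to_mib 138487884
```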

@Adreitz

Adreitz commented Apr 19, 2023

I tried out this script, but it didn't produce the info I wanted to capture on my M2 Max MBP. Since Apple Silicon uses unified memory and treats RAM and VRAM as the same thing, the output of ps does not match what is shown in Activity Monitor for graphically intensive apps -- for me, this is Stable Diffusion running in Python. After some experimenting, I came up with the following one-liner that works well enough for me:

footprint --noCategories --sample 0.1 -f bytes -p [PID] | awk '/[process name]/ {print $5; fflush()}' >> /[path]/memout.txt

The footprint command more accurately accounts for the full memory use of a process. --noCategories prevents the output of a breakdown of various categories of memory use that isn't of interest to me. --sample 0.1 polls the memory use every 0.1s, which is overkill for most uses and often not attainable -- footprint will dial back the polling rate automatically if it finds it can't keep up. -f bytes outputs the exact memory usage to the byte, which is important because the default formatting often only outputs two significant figures for large memory usage. And of course you need to supply the PID to monitor with -p.

The awk command further refines the output of footprint, plucking out just the line (though you need to provide the process name that footprint outputs in order for this to work; in my case it was "Python") and then the field containing the memory use number. The fflush() statement is important so that awk will output continuously rather than buffering the output until footprint quits, since if you terminate the monitoring with Ctrl-C, awk will terminate too before it can write from its buffer.

I then redirect the filtered output to a file so I can take data for as long as I want without filling up the Terminal output. Note that the output is a full history of memory use rather than just a maximum, so it works a bit differently compared to netj's script. However, you can easily import the data into Numbers or Excel for graphing and picking out the maximum yourself. You could probably easily modify the command into a short bash script that could keep track of the maximum for you (just define a variable to hold the last seen maximum value and compare its contents with each new output from footprint/awk), but I like being able to see when during process execution certain levels of memory use occurred.
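
The "keep track of the maximum" idea mentioned above could look roughly like the following. This is only a sketch under stated assumptions: `track_max` is a hypothetical name, and the input is assumed to be one non-negative integer per line, as the awk filter above would write when footprint is run with -f bytes.

```shell
#!/usr/bin/env bash
# track_max is a hypothetical filter: it reads one integer per line on stdin
# and prints the largest value seen once the input ends.
track_max() {
    local max=0 value
    while read -r value; do
        # Skip blank or non-numeric lines rather than failing on them.
        [[ $value =~ ^(0|[1-9][0-9]*)$ ]] || continue
        if (( value > max )); then
            max=$value
        fi
    done
    echo "$max"
}

# Made-up samples standing in for the recorded memory values.
printf '%s\n' 120 512 300 | track_max
```

Interrupting a live pipeline with Ctrl-C would stop this filter before it prints anything, so it is probably more practical to run it over the saved file afterwards, e.g. `track_max < /[path]/memout.txt`.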

Hopefully this will be helpful to someone.
