SUFFIXES = {1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],
            1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}

def approximate_size(size, a_kilobyte_is_1024_bytes=True):
    """Convert a file size to human-readable form.

    Keyword arguments:
    size -- file size in bytes
    a_kilobyte_is_1024_bytes -- if True (default), use multiples of 1024,
                                if False, use multiples of 1000

    Returns: string
    """
    if size < 0:
        raise ValueError("Number must be non-negative")

    multiple = 1024 if a_kilobyte_is_1024_bytes else 1000
    for suffix in SUFFIXES[multiple]:
        size /= multiple
        if size < multiple:
            return "{0:.1f} {1}".format(size, suffix)

    """
    Can someone tell me how the above for loop selects the correct suffix? It does it as if by magic.
    I understand how everything else works, but don't understand how the right suffix within the list
    is selected.
    """

    raise ValueError("Number too large")

if __name__ == "__main__":
    print(approximate_size(1000000000000, False))
    print(approximate_size(1000000000000))
You can think of size /= multiple as size = size / multiple. On every iteration of the loop, the size is divided by the multiple and then compared with it. If the size is still at least as large as the multiple (and so can be divided again), the loop moves on to the next suffix, continuing until it reaches the correct one.
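To make the step-by-step behaviour visible, here is a small sketch that records the suffix and the divided size on each pass of the same loop. The helper name trace_suffix_selection is made up for this illustration and is not part of the gist.

```python
SUFFIXES = {1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],
            1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}

def trace_suffix_selection(size, multiple=1000):
    """Return a list of (suffix, divided size) pairs, one per loop iteration.

    Hypothetical helper: it runs the same loop as approximate_size but
    records every step instead of returning only the final result.
    """
    steps = []
    for suffix in SUFFIXES[multiple]:
        size /= multiple               # same as: size = size / multiple
        steps.append((suffix, size))
        if size < multiple:            # small enough: this suffix is the answer
            break
    return steps

if __name__ == "__main__":
    for suffix, value in trace_suffix_selection(1000000000000):
        print(suffix, value)
```

For 1000000000000 bytes with multiples of 1000, the recorded steps are KB, MB, GB and finally TB, where the value has dropped to 1.0. Note that the GB step leaves exactly 1000.0, which is not less than 1000, so the loop goes around once more.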
You are basically checking whether you are "there yet", so you try each unit in turn:
Divide the size by 1000 (or 1024) to express it in KB and see if you have less than 1000 (or 1024) KB left. If so, it's something.something KB.
Otherwise you're at least in the next bracket for units.
Divide by 1000 (or 1024) again to express it in MB and see if you have less than 1000 (or 1024) MB left. If so, it's something.something MB.
And so on.
If it helps (sometimes seeing the same thing done a different way is useful), here is my version of the same function:
units = ["bytes", "Kb", "Mb", "Gb", "Tb", "Pb"]

def _nicesize(v, uidx):
    # Divide by 1024 and recurse while more than one whole next unit remains.
    k = float(v) / 1024
    if k > 1:
        return _nicesize(k, uidx + 1)
    else:
        return (v, uidx)

def nicesize(v):
    (v, uidx) = _nicesize(v, 0)
    return "%#.2f %s" % (v, units[uidx])
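Interestingly, my version has a bug: it doesn't test whether it has 'run off the end' of the units list. Yours does guard against that, raising ValueError("Number too large") once the suffixes are exhausted.
I guess I've never used mine with petabytes, and yours at least has a pretty high top end.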
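One possible way to keep the nicesize idea from running off the end of its units list is to stop dividing once the last unit is reached. This is only a sketch under a few assumptions: the name nicesize_bounded is invented here, the recursion is replaced with a loop, and the comparison uses >= so that exactly 1024 bytes is reported as one unit up, which differs slightly from the k > 1 test above.

```python
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB"]

def nicesize_bounded(v):
    """Format a byte count, never indexing past the last entry in UNITS.

    Hypothetical variant of nicesize: values too large for the list are
    reported as a big number in the largest unit instead of raising
    IndexError.
    """
    value = float(v)
    uidx = 0
    # Keep dividing while a whole next unit remains AND a next unit exists.
    while value >= 1024 and uidx < len(UNITS) - 1:
        value /= 1024
        uidx += 1
    return "%.2f %s" % (value, UNITS[uidx])

if __name__ == "__main__":
    print(nicesize_bounded(512))
    print(nicesize_bounded(1024 ** 7))
```

Clamping to the largest unit is a design choice; raising an error, as approximate_size does, is just as reasonable if oversized inputs should be treated as mistakes.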
The loop keeps looping until the size has been divided enough times to be less than the multiple (1024 or 1000). The suffix it returns is simply the one the loop is on at that point, so the number of divisions made matches the position in the suffixes list: after four divisions, for example, the loop is on the fourth element of the 1024 suffix list, which is 'TiB'. That suffix is added on to the now-divided size. Voilà.
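The link between the number of divisions and the position in the list can be made explicit with enumerate. This is a sketch for illustration only; suffix_position is a made-up name, and the original function never needs the index because the for loop hands it each suffix directly.

```python
SUFFIXES_1024 = ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']

def suffix_position(size, multiple=1024):
    """Return (zero-based index, suffix) of the suffix the loop stops on.

    Hypothetical helper: index + 1 equals the number of divisions performed.
    """
    for index, suffix in enumerate(SUFFIXES_1024):
        size /= multiple
        if size < multiple:
            return index, suffix
    raise ValueError("Number too large")

if __name__ == "__main__":
    print(suffix_position(1000000000000))
    print(suffix_position(1024 ** 4))
```

With 1000000000000 bytes the loop stops after three divisions, on index 2 ('GiB'), which matches the 931.3 GiB that approximate_size reports for the same input. With 1024 ** 4 bytes it takes four divisions and stops on index 3 ('TiB').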