SUFFIXES = {1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],
            1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}

def approximate_size(size, a_kilobyte_is_1024_bytes=True):
    """Convert a file size to human-readable form.

    Keyword arguments:
    size -- file size in bytes
    a_kilobyte_is_1024_bytes -- if True (default), use multiples of 1024,
                                if False, use multiples of 1000

    Returns: string

    """
    if size < 0:
        raise ValueError("Number must be non-negative")

    multiple = 1024 if a_kilobyte_is_1024_bytes else 1000
    for suffix in SUFFIXES[multiple]:
        size /= multiple
        if size < multiple:
            return "{0:.1f} {1}".format(size, suffix)

    """
    Can someone tell me how the above for loop selects the correct suffix? It does it as if by magic.
    I understand how everything else works, but don't understand how the right suffix within the list
    is selected.
    """

    raise ValueError("Number too large")

if __name__ == "__main__":
    print(approximate_size(1000000000000, False))
    print(approximate_size(1000000000000))
You are basically checking whether "you are there yet", so you try each unit in turn.
Divide the size by the multiple to get KB and see if you have less than 1000 (or 1024) KB left. If so, the answer is something.something KB.
Otherwise you're at least in the next bracket of units.
Divide again to get MB and see if you have less than 1000 (or 1024) MB left. If so, the answer is something.something MB.
And so on.
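To make the "try each unit in turn" idea concrete, here is a small sketch that records what the loop sees on each pass. The helper name trace_size and the flat SUFFIXES list are made up for illustration and are not part of the gist; the loop body is the same divide-then-compare step.

```python
# Hypothetical helper for illustration; not part of the original gist.
SUFFIXES = ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB']

def trace_size(size, multiple=1000):
    """Return the (suffix, size) pairs the loop visits, in order."""
    steps = []
    for suffix in SUFFIXES:
        size /= multiple               # express the size in the current unit
        steps.append((suffix, size))
        if size < multiple:            # small enough: this unit is the answer
            break
    return steps

for suffix, value in trace_size(1000000000000):
    print(suffix, value)
# KB 1000000000.0
# MB 1000000.0
# GB 1000.0
# TB 1.0
```

The suffix that is current when the size first drops below the multiple is the one that gets returned, which is why no explicit lookup is needed.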
If it helps (sometimes seeing the same thing done a different way is useful), my version of the same function is:

units = ["bytes", "Kb", "Mb", "Gb", "Tb", "Pb"]

def _nicesize(v, uidx):
    # Scale down by 1024 and move to the next unit while at least one whole unit remains.
    k = float(v) / 1024
    if k >= 1:
        return _nicesize(k, uidx + 1)
    else:
        return (v, uidx)

def nicesize(v):
    (v, uidx) = _nicesize(v, 0)
    return "%#.2f %s" % (v, units[uidx])
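Interestingly, my version has a bug: it doesn't test to see if it has 'run off the end' of the units list. Yours does, since it raises ValueError("Number too large") once the suffixes are exhausted.
I guess I've never used mine with anything beyond petabytes, and yours has a pretty high top end anyway.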
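One possible way to guard against running off the end of the units list is sketched below. The names nicesize_bounded and UNITS are hypothetical, and clamping at the largest unit is only one option; raising an error, as the gist's final raise ValueError("Number too large") does, is the other obvious choice.

```python
# Hypothetical bounded variant for illustration; not the commenter's code.
UNITS = ["bytes", "Kb", "Mb", "Gb", "Tb", "Pb"]

def nicesize_bounded(v):
    """Scale v by 1024 until it is below 1024 or the last unit is reached."""
    if v < 0:
        raise ValueError("size must be non-negative")
    value = float(v)
    uidx = 0
    # Stop climbing at the last unit so UNITS[uidx] can never go out of range.
    while value >= 1024 and uidx < len(UNITS) - 1:
        value /= 1024
        uidx += 1
    return "%.2f %s" % (value, UNITS[uidx])

print(nicesize_bounded(500))        # 500.00 bytes
print(nicesize_bounded(1024))       # 1.00 Kb
print(nicesize_bounded(1024 ** 7))  # 1048576.00 Pb
```

With clamping, anything larger than the top unit is simply reported as a large number of petabytes instead of raising an IndexError.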
You can think of size /= multiple as size = size / multiple. On every iteration of the loop, the size is divided by the multiple and then compared with it. If the size is still at least as large as the multiple (and so can be divided again), the loop moves on to the next suffix, continuing until it reaches the correct suffix.
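If the for loop still feels like magic, it may help to see the same logic with the list position written out by hand. This is only a sketch: approximate_size_indexed and the single 1024-based SUFFIXES list are assumptions made for the example, and it uses the long form of the division.

```python
# Hypothetical rewrite for illustration; behaves like the gist's loop for 1024-based sizes.
SUFFIXES = ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']

def approximate_size_indexed(size, multiple=1024):
    """Same logic as the for loop, with the list position made explicit."""
    index = 0
    while index < len(SUFFIXES):
        size = size / multiple         # the long form of: size /= multiple
        if size < multiple:
            return "{0:.1f} {1}".format(size, SUFFIXES[index])
        index += 1                     # one more division means one suffix further along
    raise ValueError("Number too large")

print(approximate_size_indexed(1000000000000))  # 931.3 GiB
```

The index and the number of divisions always advance together, so the suffix in hand when the size drops below the multiple is the right one.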