@stgleb
Created November 7, 2016 12:09
SEEK_BEG = 0  # seek from the start of the file (same value as os.SEEK_SET)
SEEK_CUR = 1  # seek from the current position (unused here)
SEEK_END = 2  # seek from the end of the file
BLOCK_SIZE = 10
FILE_NAME = "console.log"


def get_console_output(length, offset, origin=SEEK_END):
    # Read the `length` bytes that end `offset` bytes away from `origin`.
    if origin == SEEK_END:
        offset *= -1

    with open(FILE_NAME, "rb") as f:
        offset -= length
        f.seek(offset, origin)
        bytes_read = f.read(length)
        pos = f.tell()

    # Return the bytes and the position of the last unread byte, counted
    # from the beginning of the file. Subtract length from pos because the
    # next read continues in the opposite direction.
    return bytes_read, pos - length


if __name__ == "__main__":
    data = []
    bytes_array, pos = get_console_output(BLOCK_SIZE, 20, SEEK_END)
    data.append(bytes_array)
    bytes_array, pos = get_console_output(BLOCK_SIZE, pos, origin=SEEK_BEG)
    data.append(bytes_array)
    bytes_array, pos = get_console_output(BLOCK_SIZE, pos, origin=SEEK_BEG)
    data.append(bytes_array)
    # Blocks were read back to front; join them as bytes so this also runs
    # on Python 3.
    print(b"".join(data[::-1]).decode())
@markuszoeller

My assumption is that the content of the file FILE_NAME is constantly changing once log rotation (with virtlogd) is in place. This means the problem we face here is the time frame between lines 28 and 29. In this specific example the time frame is so small that it is probably not an issue. In a real-life setup it can be large enough that this logic "misses" some lines/bytes.
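
One way to narrow that time frame, as a minimal sketch: open the file once and read all blocks through the same handle, deriving every offset from a single size snapshot. The helper name read_tail_blocks and its skip parameter are hypothetical and not part of the gist. On POSIX an open handle keeps referring to the same file after a rename, so this only helps under the assumption that rotation renames the old file; if the log is truncated or rewritten in place, the reads can still see changed content.

```python
import os


def read_tail_blocks(path, block_size, count, skip=0):
    """Read up to `count` blocks backwards from the end via one file handle.

    `skip` is the number of trailing bytes to leave out (hypothetical
    parameter, mirroring the offset of 20 used in the gist).
    """
    blocks = []
    with open(path, "rb") as f:
        # Take one size snapshot; every offset below derives from it.
        end = os.fstat(f.fileno()).st_size - skip
        for _ in range(count):
            if end <= 0:
                break  # reached the start of the file
            start = max(0, end - block_size)
            f.seek(start)
            blocks.append(f.read(end - start))
            end = start
    # Blocks were collected newest first; reverse to restore file order.
    return b"".join(reversed(blocks))
```

For the gist's example this would be called as read_tail_blocks(FILE_NAME, BLOCK_SIZE, 3, skip=20), which returns the same 30 bytes as the three separate get_console_output calls when the file does not change.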

@markuszoeller

So, finally my concerns in code:

if __name__ == "__main__":
    data = []

    # markus_z: create some test data -->
    with open(FILE_NAME, "wb"):
        # empty the content from a possible previous run
        pass
    with open(FILE_NAME, "ab") as f:
        for i in range(1, 101):
            f.write(str(i) + "\n")
    # <--- created some test data

    bytes_array, pos = get_console_output(BLOCK_SIZE, 20, SEEK_END)
    data.append(bytes_array)
    # markus_z: in the mean time, the file content changes -->
    content_added = ""
    with open(FILE_NAME, "ab") as f:
        for i in range(101, 111):
            content_added += str(i) + "\n"
        f.write(content_added)
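    import sys # yeah, that's ugly...
    # Caveat: sys.getsizeof() returns the in-memory size of the str object
    # (payload plus object header), not len(content_added), so more bytes
    # are cut from the front than were appended.
    bytes_to_truncate = sys.getsizeof(content_added)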
    with open(FILE_NAME, "rb") as f_in:
        updated_content = f_in.read()
    with open(FILE_NAME, "wb") as f_out:
        rotated_content = updated_content[bytes_to_truncate:-1]
        f_out.write(rotated_content)
    # <--- file content changed
    bytes_array, pos = get_console_output(BLOCK_SIZE, pos, origin=SEEK_BEG)
    data.append(bytes_array)
    bytes_array, pos = get_console_output(BLOCK_SIZE, pos, origin=SEEK_BEG)
    data.append(bytes_array)
    print("".join(data[::-1]))

@markuszoeller

The code above returns:

106
107
1101
92
93
94

This is not the expected data: the three blocks no longer form a contiguous slice of the log.
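
To see where each piece of that output comes from, here is a rough in-memory trace of the same three reads. It is a sketch under assumptions: read_block and numbered_lines are hypothetical helpers that mimic the gist on a bytes object, and BYTES_TRUNCATED is hard-coded to 77, the value sys.getsizeof would presumably report for the 40-character string on 64-bit CPython 2.

```python
def read_block(content, length, offset, from_end=False):
    """Mimic get_console_output on an in-memory bytes object."""
    start = (len(content) - offset if from_end else offset) - length
    chunk = content[start:start + length]
    # Like the gist: report the position after the read, minus length.
    return chunk, start + len(chunk) - length


def numbered_lines(first, last):
    """Build b"first\\n...last\\n", the same test data as in the comment."""
    return b"".join(str(i).encode() + b"\n" for i in range(first, last + 1))


# Assumed value, hard-coded for illustration (not taken from the gist).
BYTES_TRUNCATED = 77

original = numbered_lines(1, 100)
first, pos = read_block(original, 10, 20, from_end=True)

# The file changes between reads: ten lines are appended, then the front
# and the final byte are cut off.
rotated = (original + numbered_lines(101, 110))[BYTES_TRUNCATED:-1]
second, pos = read_block(rotated, 10, pos)  # short read near the new end
third, pos = read_block(rotated, 10, pos)

result = third + second + first
print(result.decode())
```

Under these assumptions the odd line 1101 is stitched together from three unrelated numbers: the leading 1 of 108 (end of the third block), the 10 of 110 (the short second block), and the trailing 1 of 91 (start of the first block).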

@markuszoeller

The issue is that the content of the file is under constant change when log rotation is enabled, so each read sees a shifted window of lines:

1st      2nd      3rd
read     read     read
------------------------
  1        2        3        
  2        3        4
  3        4        5
  4        5        6
  5        6        7           
  6        7        8
  7        8        9
  8        9       10
  9       10       11
 10       11       12
------------------------
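
The table above can be reproduced with a small simulation. This is only an illustrative sketch: rotating_reads is a hypothetical helper, and a bounded deque stands in for the rotated log, which is an assumption about how the window moves (one new line in, one old line out per read).

```python
from collections import deque


def rotating_reads(window_size=10, reads=3):
    """Return what each successive read sees in a rotating window of lines."""
    window = deque(range(1, window_size + 1), maxlen=window_size)
    snapshots = []
    for _ in range(reads):
        snapshots.append(list(window))
        # A new line arrives; maxlen drops the oldest line from the front.
        window.append(window[-1] + 1)
    return snapshots


snapshots = rotating_reads()
# The same position (index 4) holds a different line in every read.
same_position = [snapshot[4] for snapshot in snapshots]
```

Any position remembered from an earlier read therefore points at different content on the next read, which is why the saved pos value cannot be reused safely.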
