chatGPT assisted log analysis

@bryanchriswhite · Last active May 22, 2023

Background

While debugging (after refactoring some tests and test helpers), I found myself struggling to make use of the debug log output I was adding: it easily gets lost among the structured logging and the myriad other debug lines printed in the context of integration tests.

Summary

For me, the threshold for writing a script to solve a one-off problem like this would typically be measured in many hours, or perhaps even days. Using LLMs to generate and test such a script can greatly reduce the time and directed attention required before you can actually use the script to solve your real problem (rather than debugging it). So much so that I would argue it's worth reconsidering that threshold and giving LLMs a shot much earlier than one otherwise might.

Prompting for the solution directly

Asking chatGPT to accomplish the task directly, using the following prompt, failed for both GPT3.5 and GPT4:

split the following log lines into 3 groups containing consecutive lines containing a peerID, then compare the peerIDs contained within each group to those in every other group and explain any discrepancies you see. Let's work step by step:

... (contents of log_lines.txt)

GPT3.5

GPT3.5 quickly ran out of tokens, so it had to be prompted to continue where it left off, and it introduced errors by the time it finally reached the comparison step.

GPT4

GPT4 took a very long time just to produce the groups. It also eventually ran out of tokens and had to be prompted to continue.

Prompting for solution code (& tests)

Continuing from GPT3.5's incorrect discrepancy summary, I had the thought to prompt for code which would produce the solution, rather than the solution itself:

write a python script which can produce a similar summary of the discrepancies given a text file which contains the log lines

The results are the contents of analyze_log.py. To ensure correctness and speed up my understanding of the code, I asked it to write a test for the script which it produced; the contents of which are in analyze_logs_test.py.

analyze_log.py:

```python
import os


def extract_peer_ids(log_lines):
    peer_ids = []
    for line in log_lines:
        if "peerID" in line:
            peer_id = line.split('"peerID":"')[1].split('"')[0]
            peer_ids.append(peer_id)
    return peer_ids


def find_discrepancies(groups):
    discrepancies = []
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            group1 = groups[i]
            group2 = groups[j]
            common_ids = set(group1) & set(group2)
            if len(common_ids) < len(group1) or len(common_ids) < len(group2):
                discrepancy = {
                    'Group 1': group1,
                    'Group 2': group2,
                    'Common IDs': list(common_ids)
                }
                discrepancies.append(discrepancy)
    return discrepancies


def summarize_discrepancies(discrepancies):
    for i, discrepancy in enumerate(discrepancies, start=1):
        print(f"Discrepancy {i}:")
        print("Group 1:", discrepancy['Group 1'])
        print("Group 2:", discrepancy['Group 2'])
        print("Common IDs:", discrepancy['Common IDs'])
        print()


# Read log lines from file
current_dir = os.path.dirname(os.path.abspath(__file__))
filename = os.path.join(current_dir, 'log_lines.txt')  # Update with your file name
with open(filename, 'r') as file:
    log_lines = file.readlines()

# Split the log lines into groups of consecutive lines containing a peerID
groups = []
current_group = []
for line in log_lines:
    if "peerID" in line:
        current_group.append(line)
    elif current_group:
        groups.append(extract_peer_ids(current_group))
        current_group = []
if current_group:
    groups.append(extract_peer_ids(current_group))

# Find and summarize the discrepancies in peerIDs
discrepancies = find_discrepancies(groups)
summarize_discrepancies(discrepancies)
```
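To make the grouping pass easier to follow, here is a miniature, standalone version of the same logic; the log lines and peer IDs (`peerA`, `peerB`) are made-up stand-ins, not taken from the real logs:

```python
# A miniature version of the grouping pass: consecutive lines containing a
# peerID form a group, and any non-peerID line closes the current group.
log_lines = [
    '{"peerID":"peerA","message":"peerstore"}',
    '{"peerID":"peerB","message":"peerstore"}',
    '{"level":"error","message":"broadcast failed"}',  # separator line
    '{"peerID":"peerB","message":"peerstore"}',
]

groups, current = [], []
for line in log_lines:
    if '"peerID":"' in line:
        # Same parsing step as extract_peer_ids: split on the JSON key.
        current.append(line.split('"peerID":"')[1].split('"')[0])
    elif current:
        groups.append(current)
        current = []
if current:
    groups.append(current)

print(groups)  # → [['peerA', 'peerB'], ['peerB']]
```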
analyze_logs_test.py:

```python
import unittest
from io import StringIO
from unittest.mock import patch

from compare_peerstores_from_logs import extract_peer_ids, find_discrepancies, summarize_discrepancies


class TestLogAnalysis(unittest.TestCase):
    def test_extract_peer_ids(self):
        log_lines = [
            '{"level":"debug","peerID":"peer1","message":"peerstore"}\n',
            '{"level":"debug","peerID":"peer2","message":"peerstore"}\n',
            '{"level":"debug","peerID":"peer1","message":"peerstore"}\n',
            '{"level":"debug","peerID":"peer3","message":"peerstore"}\n'
        ]
        expected_peer_ids = ['peer1', 'peer2', 'peer1', 'peer3']
        peer_ids = extract_peer_ids(log_lines)
        self.assertEqual(peer_ids, expected_peer_ids)

    def test_find_discrepancies(self):
        groups = [['peer1', 'peer2', 'peer3'], ['peer2', 'peer3'], ['peer1', 'peer3']]
        expected_discrepancies = [
            {'Group 1': ['peer1', 'peer2', 'peer3'], 'Group 2': ['peer2', 'peer3'], 'Common IDs': ['peer2', 'peer3']},
            {'Group 1': ['peer1', 'peer2', 'peer3'], 'Group 2': ['peer1', 'peer3'], 'Common IDs': ['peer1', 'peer3']},
            {'Group 1': ['peer2', 'peer3'], 'Group 2': ['peer1', 'peer3'], 'Common IDs': ['peer3']}
        ]
        discrepancies = find_discrepancies(groups)
        self.assertEqual(discrepancies, expected_discrepancies)

    @patch('sys.stdout', new_callable=StringIO)
    def test_summarize_discrepancies(self, mock_stdout):
        discrepancies = [
            {'Group 1': ['peer1', 'peer2', 'peer3'], 'Group 2': ['peer2', 'peer3'], 'Common IDs': ['peer2', 'peer3']},
            {'Group 1': ['peer1', 'peer2', 'peer3'], 'Group 2': ['peer1', 'peer3'], 'Common IDs': ['peer1', 'peer3']}
        ]
        expected_output = "Discrepancy 1:\nGroup 1: ['peer1', 'peer2', 'peer3']\nGroup 2: ['peer2', 'peer3']\nCommon IDs: ['peer2', 'peer3']\n\nDiscrepancy 2:\nGroup 1: ['peer1', 'peer2', 'peer3']\nGroup 2: ['peer1', 'peer3']\nCommon IDs: ['peer1', 'peer3']\n\n"
        summarize_discrepancies(discrepancies)
        self.assertEqual(mock_stdout.getvalue(), expected_output)


if __name__ == '__main__':
    unittest.main()
```
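The discrepancy check itself reduces to plain set operations between pairs of groups. A standalone illustration, using hypothetical short IDs rather than the real libp2p peer IDs:

```python
# find_discrepancies compares each pair of groups via set intersection;
# 'peerA'..'peerC' are hypothetical stand-in IDs.
group1 = ['peerA', 'peerB', 'peerC']
group2 = ['peerB', 'peerC']

common = set(group1) & set(group2)            # IDs present in both groups
missing_from_2 = set(group1) - set(group2)    # IDs only in the first group

print(sorted(common))          # → ['peerB', 'peerC']
print(sorted(missing_from_2))  # → ['peerA']
```

Note that a group is flagged whenever the intersection is smaller than either group, i.e. whenever either group contains an ID the other lacks.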
log_lines.txt:

```
{"level":"debug","peerID":"12D3KooWN6kuiA2kjCNtqp3RCMRP7jaR8m7PtxcGQBu4CNCpcFUY","addr":"/ip4/10.0.0.1/tcp/0","time":"2023-05-12T10:36:01+02:00","message":"peerstore"}
{"level":"debug","peerID":"12D3KooWCPq8audTqV5k7W76JuNNSdpvU3fsMs42PkJY5hz3mu5T","addr":"/ip4/10.0.0.1/tcp/42069","time":"2023-05-12T10:36:01+02:00","message":"peerstore"}
{"level":"debug","peerID":"12D3KooWDibYY1NF7KN3ubYWe3nr7y3iYDmjD1HCJXwJwEHBLETF","addr":"/ip4/10.0.0.1/tcp/42069","time":"2023-05-12T10:36:01+02:00","message":"peerstore"}
{"level":"debug","peerID":"12D3KooWR8Ve6aQQRRnvfP9XzAYBL1fCybKc2eMmbiKY4eY9Bhzf","addr":"/ip4/10.0.0.1/tcp/42069","time":"2023-05-12T10:36:01+02:00","message":"peerstore"}
{"level":"debug","peerID":"12D3KooWR2KSRQWyanR1dPvnZkXt296xgf3FFn8135szya3zYYwY","addr":"/ip4/10.0.0.1/tcp/42069","time":"2023-05-12T10:36:01+02:00","message":"peerstore"}
{"level":"error","module":"router","error":"opening stream: 12D3KooWN6kuiA2kjCNtqp3RCMRP7jaR8m7PtxcGQBu4CNCpcFUY cannot connect to 12D3KooWDibYY1NF7KN3ubYWe3nr7y3iYDmjD1HCJXwJwEHBLETF","time":"2023-05-12T10:36:01+02:00","message":"sending to peer during broadcast"}
{"level":"debug","module":"router","local_hostname":"node1.consensus","remote_hostname":"node3.consensus","time":"2023-05-12T10:36:01+02:00","message":"OUTGOING MSG"}
{"level":"debug","InboundConns":0,"InboundStreams":0,"OutboundConns":0,"OutboundStreams":0,"time":"2023-05-12T10:36:01+02:00","message":"host transient resource scope"}
{"level":"debug","peerID":"12D3KooWR8Ve6aQQRRnvfP9XzAYBL1fCybKc2eMmbiKY4eY9Bhzf","addr":"/ip4/10.0.0.1/tcp/42069","time":"2023-05-12T10:36:01+02:00","message":"peerstore"}
{"level":"debug","peerID":"12D3KooWR2KSRQWyanR1dPvnZkXt296xgf3FFn8135szya3zYYwY","addr":"/ip4/10.0.0.1/tcp/42069","time":"2023-05-12T10:36:01+02:00","message":"peerstore"}
{"level":"debug","peerID":"12D3KooWN6kuiA2kjCNtqp3RCMRP7jaR8m7PtxcGQBu4CNCpcFUY","addr":"/ip4/10.0.0.1/tcp/0","time":"2023-05-12T10:36:01+02:00","message":"peerstore"}
{"level":"debug","peerID":"12D3KooWCPq8audTqV5k7W76JuNNSdpvU3fsMs42PkJY5hz3mu5T","addr":"/ip4/10.0.0.1/tcp/42069","time":"2023-05-12T10:36:01+02:00","message":"peerstore"}
{"level":"debug","peerID":"12D3KooWDibYY1NF7KN3ubYWe3nr7y3iYDmjD1HCJXwJwEHBLETF","addr":"/ip4/10.0.0.1/tcp/42069","time":"2023-05-12T10:36:01+02:00","message":"peerstore"}
{"level":"error","module":"router","error":"opening stream: 12D3KooWN6kuiA2kjCNtqp3RCMRP7jaR8m7PtxcGQBu4CNCpcFUY cannot connect to 12D3KooWR8Ve6aQQRRnvfP9XzAYBL1fCybKc2eMmbiKY4eY9Bhzf","time":"2023-05-12T10:36:01+02:00","message":"sending to peer during broadcast"}
{"level":"debug","module":"router","firstTarget":"node1.consensus:42069","height":1,"level":1,"pstoreSize":2,"secondTarget":"node2.consensus:42069","time":"2023-05-12T10:36:01+02:00","message":"Targets at height"}
{"level":"debug","module":"router","local_hostname":"node1.consensus","remote_hostname":"node2.consensus","time":"2023-05-12T10:36:01+02:00","message":"OUTGOING MSG"}
{"level":"debug","InboundConns":0,"InboundStreams":0,"OutboundConns":0,"OutboundStreams":0,"time":"2023-05-12T10:36:01+02:00","message":"host transient resource scope"}
{"level":"debug","peerID":"12D3KooWCPq8audTqV5k7W76JuNNSdpvU3fsMs42PkJY5hz3mu5T","addr":"/ip4/10.0.0.1/tcp/42069","time":"2023-05-12T10:36:01+02:00","message":"peerstore"}
{"level":"debug","peerID":"12D3KooWDibYY1NF7KN3ubYWe3nr7y3iYDmjD1HCJXwJwEHBLETF","addr":"/ip4/10.0.0.1/tcp/42069","time":"2023-05-12T10:36:01+02:00","message":"peerstore"}
{"level":"debug","peerID":"12D3KooWR8Ve6aQQRRnvfP9XzAYBL1fCybKc2eMmbiKY4eY9Bhzf","addr":"/ip4/10.0.0.1/tcp/42069","time":"2023-05-12T10:36:01+02:00","message":"peerstore"}
{"level":"debug","peerID":"12D3KooWR2KSRQWyanR1dPvnZkXt296xgf3FFn8135szya3zYYwY","addr":"/ip4/10.0.0.1/tcp/42069","time":"2023-05-12T10:36:01+02:00","message":"peerstore"}
{"level":"debug","peerID":"12D3KooWN6kuiA2kjCNtqp3RCMRP7jaR8m7PtxcGQBu4CNCpcFUY","addr":"/ip4/10.0.0.1/tcp/0","time":"2023-05-12T10:36:01+02:00","message":"peerstore"}
```
@Olshansk commented:

> I had the thought to prompt for code which would produce the solution, rather than the solution itself:

Love it!