@juffaz
Created February 23, 2024 14:17
python-pandas-parse-group-logs.py
import pandas as pd
import re
# Build the input data for the DataFrame
data = {
    'key': [
        "/loans-clone/11111111-124221101/accounts",
        "/loans-clone/11111111-124221101/schedule",
        "/loans-clone/22222222-1162201111/accounts",
        "/loans-clone/123456-102228111/accounts",
        "/loans-clone/1234567-116700111/schedule",
        "/loans-clone/by-cif/1844111/active",
        "/loans-clone/collaterals/1899111"
    ],
    'doc_count': [2, 2, 2, 2, 2, 2, 3],  # Replace with the actual doc_count values
    'api': [None, None, None, None, None, None, None]  # Replace None with the actual api values
}
df = pd.DataFrame(data)
# Extract the endpoint by stripping numeric ID segments (e.g. "/11111111-124221101") from the key
df['endpoint'] = df['key'].apply(lambda x: re.sub(r'/\d+[-]?\d*', '', x))
# Group the rows by endpoint and sum doc_count
grouped_df = df.groupby('endpoint')['doc_count'].sum().reset_index()
# Print the resulting DataFrame
print(grouped_df)
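
The script above deletes numeric ID segments outright, so the grouped endpoint no longer shows where the ID sat in the path. The pattern would also strip the leading digits of a mixed segment such as "/2fa". A minimal sketch of one possible variant follows. It assumes IDs always occupy a whole path segment and replaces them with a placeholder instead of removing them. The names `ID_SEGMENT`, `normalize_endpoint`, `group_by_endpoint` and the `{id}` placeholder are hypothetical and not part of the original gist.

```python
import re

import pandas as pd

# Hypothetical pattern (an assumption, not from the original script): a whole
# path segment made of digits, optionally followed by "-" and more digits.
# The lookahead requires the segment to end at "/" or at the end of the string.
ID_SEGMENT = re.compile(r"/\d+(?:-\d+)?(?=/|$)")


def normalize_endpoint(path: str, placeholder: str = "{id}") -> str:
    """Replace each numeric ID segment of `path` with "/<placeholder>"."""
    return ID_SEGMENT.sub("/" + placeholder, path)


def group_by_endpoint(frame: pd.DataFrame) -> pd.DataFrame:
    """Sum doc_count per normalized endpoint; expects 'key' and 'doc_count' columns."""
    endpoints = frame["key"].map(normalize_endpoint)
    return (
        frame.assign(endpoint=endpoints)
        .groupby("endpoint", as_index=False)["doc_count"]
        .sum()
    )


# Small sample taken from the keys used in the script above.
sample = pd.DataFrame(
    {
        "key": [
            "/loans-clone/11111111-124221101/accounts",
            "/loans-clone/22222222-1162201111/accounts",
            "/loans-clone/collaterals/1899111",
        ],
        "doc_count": [2, 2, 3],
    }
)
result = group_by_endpoint(sample)
print(result)
```

With a placeholder, "/loans-clone/{id}/accounts" and a hypothetical "/loans-clone/accounts" route stay distinct after grouping, which may or may not matter depending on the API being analyzed.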