@hiczlf
Last active July 28, 2020 02:22
import pprint
# Sample of the text to parse (overwritten below by the contents of the input file)
s = """
2020/7/28 9:03:01,姓名:科创,班级:Y1系_1班,年龄:15
2020/7/28 9:03:01,姓名:新三,班级:Y2系_2班,年龄:16
2020/7/28 9:03:01,姓名:创业,班级:Y3系_3班,年龄:17
"""
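# Parse command-line arguments
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('file', help='path to the txt file to parse')
args = parser.parse_args()
# Read the whole file; the with-block closes it afterwards
with open(args.file, 'r') as f:
    s = f.read()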
# Replace full-width punctuation with its ASCII equivalent so the splitting below is simpler
s = s.replace(",", ",")
s = s.replace(":", ":")
result = []
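for line in s.splitlines():
    row = line.split(',')
    item = {}
    # Skip blank or malformed lines that lack the four comma-separated fields
    if len(row) < 4:
        continue
    # Each field is "label:value"; keep the value after the colon.
    # The class field is "department_class", so it is split again on "_".
    item['姓名'] = row[1].split(':')[1]  # name
    item['院系'] = row[2].split(':')[1].split('_')[0]  # department
    item['班级'] = row[2].split(':')[1].split('_')[1]  # class
    item['年龄'] = row[3].split(':')[1]  # age
    result.append(item)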
pprint.pprint(result)

hiczlf commented Jul 28, 2020

Run it:

 (learn_python) ➜  learn_python python parse.py f.txt
[{'姓名': '科创', '年龄': '15', '班级': '1班', '院系': 'Y1系'},
 {'姓名': '新三', '年龄': '16', '班级': '2班', '院系': 'Y2系'},
 {'姓名': '创业', '年龄': '17', '班级': '3班', '院系': 'Y3系'}]
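
The same records could also be matched with a single regular expression instead of chained `split` calls. The following is only a minimal sketch, not what the script above does: `RECORD` and `parse_line` are hypothetical names, and the pattern assumes every line has exactly the timestamp, 姓名, 班级 and 年龄 fields in that order. It accepts both ASCII and full-width commas and colons, so the `replace` pass is not needed, and it returns `None` for lines that do not fit.

```python
import re

# Hypothetical pattern (not part of the gist): one regex for a whole record.
# Character classes such as [,,] accept either the ASCII or the full-width mark.
RECORD = re.compile(
    r'^(?P<time>[^,,]+)[,,]'
    r'姓名[::](?P<name>[^,,]+)[,,]'
    r'班级[::](?P<dept>[^_,,]+)_(?P<cls>[^,,]+)[,,]'
    r'年龄[::](?P<age>\d+)\s*$'
)


def parse_line(line):
    """Return a dict for one record line, or None if the line does not match."""
    m = RECORD.match(line.strip())
    if m is None:
        return None
    return {
        '姓名': m.group('name'),   # name
        '院系': m.group('dept'),   # department
        '班级': m.group('cls'),    # class
        '年龄': m.group('age'),    # age, kept as a string like the original script
    }


sample = "2020/7/28 9:03:01,姓名:科创,班级:Y1系_1班,年龄:15"
print(parse_line(sample))
```

Compared with splitting, a malformed line yields `None` rather than an `IndexError`, at the cost of a pattern that has to be updated whenever the field layout changes.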
