Last active
September 23, 2021 21:43
Gist: jimrollenhagen/4453083
Cool little utility to build Django model field definitions from a CSV file. It outputs one CharField definition per column, with blank=True for columns that contain blank cells, and max_length set to the smallest power of 2 that is at least that column's actual max length.
Sample output:
processed 65534 lines
[(False, 10), (False, 12), (True, 908)]
field_one = models.CharField(max_length=16)
field_two = models.CharField(max_length=16)
field_three = models.CharField(max_length=1024, blank=True)
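To make the max_length rounding concrete, here is a minimal sketch of that step on its own, run against the three column lengths from the sample output. The function name `round_up_to_power_of_two` is a hypothetical helper introduced for illustration; it is not part of the gist.

```python
def round_up_to_power_of_two(length):
    """Return the smallest power of 2 that is >= length (never less than 1).

    Hypothetical helper: it isolates the rounding loop used by the script.
    """
    power = 0
    while 2 ** power < length:
        power += 1
    return 2 ** power


# Longest observed value per column, taken from the sample output.
for observed in (10, 12, 908):
    print(observed, '->', round_up_to_power_of_two(observed))
# prints:
# 10 -> 16
# 12 -> 16
# 908 -> 1024
```

Note that a length that is already a power of 2 is kept as-is (16 stays 16), so the result is "at least" the observed length rather than strictly greater.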
import csv

with open('my.csv', 'rb') as f:
    reader = csv.reader(f)
    header = reader.next()  # burn off the header
    props = [(False, 0)] * 3  # 3 is number of columns
    for n, line in enumerate(reader):
        for i, item in enumerate(line):
            item = item.strip()
            if not item:
                props[i] = (True, props[i][1])
            if len(item) > props[i][1]:
                props[i] = (props[i][0], len(item))

print 'processed %s lines' % n
print props

for i, p in enumerate(props):
    power = 0
    while 2 ** power < p[1]:
        power += 1
    max_length = 2 ** power
    if p[0]:
        print '%s = models.CharField(max_length=%s, blank=True)' % (header[i], max_length)
    else:
        print '%s = models.CharField(max_length=%s)' % (header[i], max_length)
Here's a version of the script that's compatible with Python >=3.6. It also takes the number of columns from the header row instead of hard-coding it:
import csv

# newline='' is what the csv module docs recommend when opening files for csv.reader
with open('data.csv', newline='') as f:
    reader = csv.reader(f)
    header = next(reader)  # burn off the header
    # one (has_blank, max_len) pair per column
    props = [(False, 0)] * len(header)
    n = 0  # stays 0 if the file has no data rows
    # count data rows from 1 so the total printed below is accurate
    for n, line in enumerate(reader, start=1):
        for i, item in enumerate(line):
            item = item.strip()
            if not item:
                props[i] = (True, props[i][1])
            if len(item) > props[i][1]:
                props[i] = (props[i][0], len(item))

print(f'processed {n} lines')
print(props)

for i, p in enumerate(props):
    # round the longest value up to a power of 2
    power = 0
    while 2 ** power < p[1]:
        power += 1
    max_length = 2 ** power
    field_name = header[i]
    if p[0]:
        print(f'{field_name} = models.CharField(max_length={max_length}, blank=True)')
    else:
        print(f'{field_name} = models.CharField(max_length={max_length})')
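If you want to try the logic without a file on disk, one option is to wrap it in a function that accepts any file-like object. This is only a sketch under a few assumptions: the function name `infer_char_fields` and the in-memory sample data are made up for illustration, and cells beyond the header width are ignored rather than raising an IndexError (a choice of this sketch, not the script's behavior).

```python
import csv
import io


def infer_char_fields(csv_file):
    """Return one CharField definition string per CSV column.

    Hypothetical wrapper around the script's logic; csv_file is any
    iterable of lines, such as an open file or an io.StringIO.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    # One [has_blank, max_len] pair per column.
    props = [[False, 0] for _ in header]
    for row in reader:
        # Assumption of this sketch: extra cells past the header are skipped.
        for i, item in enumerate(row[:len(header)]):
            item = item.strip()
            if not item:
                props[i][0] = True
            props[i][1] = max(props[i][1], len(item))

    fields = []
    for name, (has_blank, max_len) in zip(header, props):
        # Round up to the smallest power of 2 that is >= max_len.
        max_length = 1
        while max_length < max_len:
            max_length *= 2
        extra = ', blank=True' if has_blank else ''
        fields.append(f'{name} = models.CharField(max_length={max_length}{extra})')
    return fields


# Made-up sample data: two columns, one blank cell in the second.
sample = io.StringIO('field_one,field_two\nabcdefghij,\nabc,xy\n')
for line in infer_char_fields(sample):
    print(line)
# prints:
# field_one = models.CharField(max_length=16)
# field_two = models.CharField(max_length=2, blank=True)
```

With a real file you would pass the handle instead, e.g. `with open('data.csv', newline='') as f: infer_char_fields(f)`.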
lol hero. found this because i'm bouta have to make model based on a csv. thanks.