@jimrollenhagen
Last active September 23, 2021 21:43
Cool little utility to build Django model definitions from a CSV file. Outputs CharField definitions, with blank=True for columns that have blank cells, and max_length set to the smallest power of 2 that is at least that column's actual max length.
Sample output:
processed 65534 lines
[(False, 10), (False, 12), (True, 908)]
field_one = models.CharField(max_length=16)
field_two = models.CharField(max_length=16)
field_three = models.CharField(max_length=1024, blank=True)
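The rounding rule behind those max_length values can be sketched on its own. This is only an illustration, not part of the gist; the helper name `round_up_to_power_of_two` is made up here, and the loop is the same one the script uses.

```python
def round_up_to_power_of_two(length):
    """Return the smallest power of 2 that is >= length (hypothetical helper)."""
    power = 0
    while 2 ** power < length:
        power += 1
    return 2 ** power


# Observed max lengths taken from the sample output.
for observed in (10, 12, 908):
    print(observed, '->', round_up_to_power_of_two(observed))
# 10 -> 16
# 12 -> 16
# 908 -> 1024
```

Note that a column whose longest value is already an exact power of 2 (say 16 characters) keeps that value as its max_length, so there is no headroom in that case.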
import csv

with open('my.csv', 'rb') as f:
    reader = csv.reader(f)
    header = reader.next() # burn off the header
    props = [(False, 0)] * 3 # 3 is number of columns
    for n, line in enumerate(reader):
        for i, item in enumerate(line):
            item = item.strip()
            if not item:
                props[i] = (True, props[i][1])
            if len(item) > props[i][1]:
                props[i] = (props[i][0], len(item))
    print 'processed %s lines' % n
    print props

    for i, p in enumerate(props):
        power = 0
        while 2 ** power < p[1]:
            power += 1
        max_length = 2 ** power
        if p[0]:
            print '%s = models.CharField(max_length=%s, blank=True)' % (header[i], max_length)
        else:
            print '%s = models.CharField(max_length=%s)' % (header[i], max_length)

@smcalilly

lol hero. found this because i'm bouta have to make model based on a csv. thanks.

@smcalilly

smcalilly commented Sep 23, 2021

Here's the script that's compatible with Python >=3.6. It also has a dynamic number of headers:

import csv

with open('data.csv', newline='') as f:
    reader = csv.reader(f)
    header = next(reader)  # first row holds the column names
    props = [(False, 0)] * len(header)  # one (has_blank, max_len) pair per column
    n = 0  # stays 0 if the file has no data rows
    for n, line in enumerate(reader, start=1):
        for i, item in enumerate(line):
            item = item.strip()
            if not item:
                props[i] = (True, props[i][1])
            if len(item) > props[i][1]:
                props[i] = (props[i][0], len(item))
    print(f'processed {n} lines')
    print(props)

    for i, p in enumerate(props):
        power = 0
        while 2 ** power < p[1]:
            power += 1
        max_length = 2 ** power
        field_name = header[i]
        if p[0]:
            # Column had at least one blank cell, so allow blank values.
            print(f'{field_name} = models.CharField(max_length={max_length}, blank=True)')
        else:
            print(f'{field_name} = models.CharField(max_length={max_length})')
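
To try the logic without a file on disk, the same steps can be wrapped in a function and fed an in-memory CSV. This is a rough sketch only: `char_fields_from_csv` and the sample data are made up for illustration and are not part of the original gist.

```python
import csv
import io


def char_fields_from_csv(f):
    """Return CharField definition lines for an open CSV file object.

    Hypothetical wrapper around the script's logic; not part of the gist.
    """
    reader = csv.reader(f)
    header = next(reader)
    props = [(False, 0)] * len(header)  # (has_blank, max_len) per column
    for line in reader:
        for i, item in enumerate(line):
            item = item.strip()
            has_blank, max_len = props[i]
            props[i] = (has_blank or not item, max(max_len, len(item)))

    lines = []
    for name, (has_blank, max_len) in zip(header, props):
        # Round the observed max length up to a power of 2.
        power = 0
        while 2 ** power < max_len:
            power += 1
        extra = ', blank=True' if has_blank else ''
        lines.append(f'{name} = models.CharField(max_length={2 ** power}{extra})')
    return lines


# Made-up sample data: the "city" column has one blank cell.
sample = io.StringIO('name,city\nAlice,Portland\nBob,\n')
for line in char_fields_from_csv(sample):
    print(line)
# name = models.CharField(max_length=8)
# city = models.CharField(max_length=8, blank=True)
```

Returning the lines instead of printing them makes it easier to check the result or write it to a models file. Like the script, this assumes no row has more cells than the header.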
