Skip to content

Instantly share code, notes, and snippets.

@manichabba
Created August 23, 2016 20:45
Show Gist options
  • Save manichabba/4564251b5ba18b2de0c6bec72c715238 to your computer and use it in GitHub Desktop.
Save manichabba/4564251b5ba18b2de0c6bec72c715238 to your computer and use it in GitHub Desktop.
Counting Organizations This application will read the mailbox data (mbox.txt) count up the number email messages per organization (i.e. domain name of the email address) using a database with the following schema to maintain the counts.
import re
import sqlite3
conn = sqlite3.connect('emaildb.sqlite')
cur = conn.cursor()
cur.execute('''
DROP TABLE IF EXISTS Counts''')
cur.execute('''
CREATE TABLE Counts (org TEXT, count INTEGER)''')
fname = raw_input('Enter file name: ')
if ( len(fname) < 1 ) : fname = 'mbox.txt'
fh = open(fname)
for line in fh:
if not line.startswith('From: ') : continue
pieces = line.split()[1]
org = pieces.split('@')[1]
print org
cur.execute('SELECT count FROM Counts WHERE org = ? ', (org, ))
row = cur.fetchone()
if row is None:
cur.execute('''INSERT INTO Counts (org, count)
VALUES ( ?, 1 )''', ( org, ) )
else :
cur.execute('UPDATE Counts SET count=count+1 WHERE org = ?',
(org, ))
conn.commit()
sqlstr = 'SELECT org, count FROM Counts ORDER BY count DESC LIMIT 10'
print
print "Counts:"
for row in cur.execute(sqlstr) :
print str(row[0]), row[1]
cur.close()
@Shankar1901
Copy link

Shankar1901 commented Jul 16, 2021

Everything is fine, except that org should be like this:
org = pieces[1].split("@")[1] on line 21

Because split() is performed only on strings and pieces is a list. You will get an attribute error

@giriprasathd
Copy link

@rish codes
https://github.com/rish-codes/py4e/blob/master/Course-4/emaildb.py
open the above link
copy the code and execute it
or copy the below code also
import sqlite3

conn = sqlite3.connect('emaildb.sqlite')
cur = conn.cursor()

cur.execute('DROP TABLE IF EXISTS Counts')

cur.execute('CREATE TABLE Counts(org TEXT, count INTEGER)')

fname = 'mbox.txt'
fh = open(fname)
for line in fh:
if not line.startswith('From: '):
continue
pieces = line.split()
email = pieces[1]
org = email.split('@')[1]
cur.execute('SELECT count FROM Counts WHERE org = ? ', (org,))
row = cur.fetchone()
if row is None:
cur.execute('''INSERT INTO Counts (org, count)
VALUES (?, 1)''', (org,))
else:
cur.execute('UPDATE Counts SET count = count + 1 WHERE org = ?',
(org,))
conn.commit()

sqlstr = 'SELECT org, count FROM Counts ORDER BY count DESC LIMIT 10'

for row in cur.execute(sqlstr):
print(str(row[0]), row[1])

cur.close()

make sure you downloaded the mbox text
once you execute the python code the emaildb.SQLite is generated .you need to apply the uploaded that code
image
image

image

@MhamedBendenia
Copy link

Here is the solution that works for me :

import re

conn = sqlite3.connect('orgdb.sqlite')
cur = conn.cursor()

cur.execute('DROP TABLE IF EXISTS Counts')

cur.execute('''
CREATE TABLE Counts (org TEXT, count INTEGER)''')

fname = 'mbox.txt'
fh = open(fname)

for line in fh:

    if not line.startswith('From: '): continue
    org = re.findall("@(.*)", line)
    cur.execute('SELECT count FROM Counts WHERE org = ? ', (org[0],))
    row = cur.fetchone()
    if row is None:
        cur.execute('''INSERT INTO Counts (org, count)
                VALUES (?, 1)''', (org[0],))
    else:
        cur.execute('UPDATE Counts SET count = count + 1 WHERE org = ?',
                    (org[0],))

conn.commit()

# https://www.sqlite.org/lang_select.html
sqlstr = 'SELECT org, count FROM Counts ORDER BY count DESC LIMIT 10'

for row in cur.execute(sqlstr):
    print(str(row[0]), row[1])

cur.close()

@niparkinson
Copy link

Help why does it keep saying my data is incorrect even though it is correct!
Screen Shot 2022-03-09 at 8 43 20 PM
Screen Shot 2022-03-09 at 8 43 42 PM

@cnydo
Copy link

cnydo commented Apr 1, 2022

Help why does it keep saying my data is incorrect even though it is correct! Screen Shot 2022-03-09 at 8 43 20 PM Screen Shot 2022-03-09 at 8 43 42 PM

I think the result should be like this

iupui.edu 536
umich.edu 491
indiana.edu 178
caret.cam.ac.uk 157
vt.edu 110
uct.ac.za 96
media.berkeley.edu 56
ufp.pt 28
gmail.com 25
et.gatech.edu 17

@niparkinson
Copy link

Thank you

Help why does it keep saying my data is incorrect even though it is correct! Screen Shot 2022-03-09 at 8 43 20 PM Screen Shot 2022-03-09 at 8 43 42 PM

I think the result should be like this

iupui.edu 536
umich.edu 491
indiana.edu 178
caret.cam.ac.uk 157
vt.edu 110
uct.ac.za 96
media.berkeley.edu 56
ufp.pt 28
gmail.com 25
et.gatech.edu 17

@Aszxsert
Copy link

2
How do I run this code? I look at YouTube videos and I can get the answer based on this code, why not mine? Beg for help!

@Iryna1103
Copy link

Non of the codes for me is actually counting orgs, maximum i get is 1? what am i doing wrong , i already tried all the codes here ans mine which were similar and trying to correct where i think mistake can be , and yes i do have mbox.txt saved in same directory and i do have created db.sqlite files but they either empty or counting 1, please help

@sshaya1
Copy link

sshaya1 commented Mar 12, 2024

I'm getting the right result, but I also get this "Data is incorrect: emaildb.sqlite"
Uploading Screen Shot 2024-03-12 at 11.06.31 AM.png…

@Bssmmn
Copy link

Bssmmn commented Jun 17, 2024

me too and i can't pass this task did u pass it?

@kartik-nv
Copy link

kartik-nv commented Jun 17, 2024 via email

@Iryna1103
Copy link

Iryna1103 commented Jun 17, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment