PostgreSQL's COPY command is used to load a large data set into a table.

The basic usage, `COPY country FROM '/usr1/proj/bray/sql/country_data';`, works only for a file that already exists on the Postgres server, which can be a problem for a Heroku web app.

To do that from a Django project, you can use the psycopg2 method `cursor.copy_from`: https://pythonhosted.org/psycopg2/cursor.html#cursor.copy_from
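
As a concrete illustration, here is a minimal sketch of that approach. The `country(id, name)` table and the sample rows are assumptions made up for the example, and it relies on psycopg2 being the database driver:

```python
import io

from django.db import connection


def bulk_load_countries(rows):
    # Build an in-memory tab-separated buffer; copy_from reads any
    # file-like object, so no file has to exist on the Postgres server.
    buf = io.StringIO("".join(f"{i}\t{name}\n" for i, name in rows))
    with connection.cursor() as cursor:
        # Django's cursor wrapper delegates unknown attributes to the
        # underlying psycopg2 cursor, so copy_from is reachable directly.
        cursor.copy_from(buf, 'country', columns=('id', 'name'))


bulk_load_countries([(1, 'Vietnam'), (2, 'Japan')])
```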

With RoR, this may help: http://stackoverflow.com/questions/14526489/using-copy-from-in-a-rails-app-on-heroku-with-the-postgresql-backend

====

The copy process is faster if the table has no relations. On my local machine, copying 1 million rows into a simple table takes just 8 seconds, compared to 35 seconds for a complex table with 2 foreign keys.
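
A rough timing harness for that comparison might look like the sketch below; the connection string and the `simple_table`/`complex_table` schemas are placeholders, and the generated rows are assumed to satisfy whatever foreign keys the complex table has:

```python
import io
import time

import psycopg2

conn = psycopg2.connect("dbname=mydb user=myuser")  # placeholder DSN

# One million tab-separated rows, generated in memory.
rows = "".join(f"{i}\tname-{i}\n" for i in range(1_000_000))

for table in ("simple_table", "complex_table"):
    with conn.cursor() as cur:
        start = time.time()
        cur.copy_from(io.StringIO(rows), table, columns=("id", "name"))
        conn.commit()
        print(f"{table}: {time.time() - start:.1f}s")
```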

I have not tested on Heroku yet, but I think the difference will not be that large, because Heroku Postgres and the web app run on powerful machines. The copy itself took only 5 seconds in Postgres.

At first I tried to "insert a million rows into Postgres within 10 seconds" and failed. The same question has been asked many times on the internet, and every answer I found said "impossible": there is a limit to how many rows can be inserted into a database per second.

Then I thought, "What if it's not an insert?" Why can we write a 100 MB CSV file in 3 seconds while Postgres takes so much time to write just 24 MB of data? Maybe it's not "inserting"; maybe it's just "loading". So I searched for "load data into postgres from file" and got the answer in the very first Google result.
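
To make the "inserting vs. loading" distinction concrete, here is a hedged sketch (plain psycopg2, a hypothetical `country(id, name)` table and placeholder DSN) that times row-by-row INSERTs against one COPY of the same data:

```python
import io
import time

import psycopg2

conn = psycopg2.connect("dbname=mydb user=myuser")  # placeholder DSN
data = [(i, f"name-{i}") for i in range(100_000)]

with conn.cursor() as cur:
    # INSERT path: every row is sent, parsed, and executed as its own statement.
    start = time.time()
    cur.executemany("INSERT INTO country (id, name) VALUES (%s, %s)", data)
    conn.rollback()  # discard, so the COPY below starts from the same state
    print(f"executemany INSERT: {time.time() - start:.1f}s")

    # COPY path: the same rows streamed once as raw text through the COPY protocol.
    buf = io.StringIO("".join(f"{i}\t{name}\n" for i, name in data))
    start = time.time()
    cur.copy_from(buf, "country", columns=("id", "name"))
    conn.commit()
    print(f"copy_from: {time.time() - start:.1f}s")
```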

I was lucky, so there is not much more to share with you.
