Tools used
- https://npmjs.com/package/pg-json-import
- MySQL Workbench
- pgAdmin III
Since projects take up about 750Mib of disk space, loading them all into a node process is not possible. Therefore, I broke the projects up into 50000 project chunks. You're probably also going to want to manuall increase the Heap size available to V8 using node --max_old_space_size=2000000
Query and save user ids, emails and usernames, save them to json using MySQL Workbench:
SELECT
id, username, email
FROM
wmlogin.Users
WHERE
deletedAt IS NULL;
Query and save thimble project data as json using MySQL Workbench:
SELECT
userid AS email,
rawData AS html,
title,
createdAt,
updatedAt
FROM
thimble.ThimbleProjects
WHERE
rawData != ''
LIMIT 50000
OFFSET 0;
The html data might (read: probably will) have control characters embedded in it. These can be remove like so:
$ tr -cd '\11\12\15\40-\176' < thimble_projects.json > thimble_projects_fixed.json
Create the database and tables you need (see sql files in this gist):\
$ createdb migrate_thimble
$ psql -d migrate_thimble -f create_tables.sql
import data into a local pg database using the pg-json-import tool:
$ pji --file login_data.json --connection postgres://localhost:5432/migrate_thimble --table users
$ pji --file thimble_projects_fixed.json --connection postgres://localhost:5432/migrate_thimble --table projects
run import-thimble-users.js
node --max_old_space_size=2000000 import-thimble-projects.js <project_migration_db_connection_string> <publish_db_connection_string>
run import-thimble-projects.js
node --max_old_space_size=2000000 import-thimble-projects.js <project_migration_db_connection_string> <publish_db_connection_string> 50000 0