Skip to content

Instantly share code, notes, and snippets.

@ocharles
Created November 30, 2010 18:06
Show Gist options
  • Save ocharles/722090 to your computer and use it in GitHub Desktop.
Save ocharles/722090 to your computer and use it in GitHub Desktop.
use strictures;
=pod
WHAT ON EARTH IS THIS?
Well if you calm down, I'll explain.
This is vision for a new edit schema, and is very very roughly mocked up in a
working implementation here. First off all, lets take a look at the schema.
Currently, we have a single table for edits, and a table for votes and
edit_notes. This does not scale well, and is a massive head ache to work with -
we all know this. The new schema will look a little bit like this:
edit_group
edit
edit_revision
data_edit_*
edit_note
vote
Edits are grouped together. For a lot of edits, there might only be 1 edit in
the group, but for more complex edits, we might make a lot of changes. Next, an
edit can have multiple revisions. Again, a lot of edits will hopefully only have
1 revision, but being able to support multiple revisions is helpful. Multiple
revisions are a solution to the "this is great, but you got {some field} wrong."
In this case the user can click "new revision" (new version, whatever) and move
back into an editor (edit artist/edit release, etc) and submit a new version of
the edit.
For the edit themselves, each edit is represented by 2 rows - a revision, and a
set of data assossciated with the row. The revision row contains the date, the
editor who made the revision, etc - metadata, effectively. The data row is a
little bit more interesting. In here we store the data that is being edited, but
also a reference to the data at the point of the revision. This is interesting,
so lets look into this in a bit more detail.
Say I'm editing an artist. I need to store all the information about my edit
(the new name, the new sort name, etc). But to view the edit in history, I need
to know how the artist looked at that point in time. We use KiokuDB to flatten
the object graph into a set of JSON entries, and store these in a larger table.
For edit artist, this handles the case of artists being deleted, but is
significantly more useful for tasks such as relationships.
When a relationship is added, we want to see how it looked at the time of
addition. There is a lot of data required to accurately reconstruct a
relationship though - the current edits don't even manage to do so. However,
with KiokuDB we throw it the relationship object, and it does the work for us.
We never miss data.
Building on the object graph, each entity now consumes a content ID role - this
means that the ID we use to store it for edits is generated based on a hash of
interesting data. This way, we never store data twice (for example, dates) we
keeps the size of the table down.
=cut
{
package ContentID;
use Moose::Role;
use Digest::SHA1 qw( sha1_hex );
with 'KiokuDB::Role::ID::Content';
requires 'content';
sub kiokudb_object_id {
my $self = shift;
return sha1_hex(join(', ', $self->content));
}
};
{
package Artist;
use Moose;
with 'ContentID';
has $_ => ( is => 'ro' )
for qw( name sort_name );
sub content {
my $self = shift;
return (
$self->name,
$self->sort_name
);
}
}
{
package Edit;
use Moose::Role;
has 'id' => ( is => 'ro' );
has 'revision' => ( is => 'ro', default => 1 );
sub new_revision {
my $self = shift;
}
}
{
package Edit::Artist;
use Moose;
with 'Edit';
has $_ => ( is => 'ro' )
for qw( name sort_name );
has 'artist' => (
is => 'ro',
required => 1
);
sub to_row {
my ($self, $c) = @_;
return 'data_edit_artist' => {
name => $self->name,
sort_name => $self->sort_name,
artist => $c->kiokudb->store($self->artist)
}
}
}
{
package Data::Edit;
use Moose;
use DBI;
use KiokuDB;
has 'dbh' => (
is => 'ro',
lazy => 1,
default => sub {
my $dbh = DBI->connect(
'dbi:Pg:dbname=musicbrainz_db_2', 'musicbrainz_user', ''
);
$dbh->do('
DROP TABLE IF EXISTS edit;
DROP TABLE IF EXISTS edit_revision;
DROP TABLE IF EXISTS edit_group;
DROP TABLE IF EXISTS data_edit_artist;
CREATE TABLE edit_group ( id SERIAL );
CREATE TABLE edit (
id SERIAL,
edit_group INTEGER NOT NULL
);
CREATE TABLE edit_revision (
edit INTEGER NOT NULL,
revision INTEGER
);
CREATE TABLE data_edit_artist (
edit INTEGER NOT NULL,
revision INTEGER NOT NULL,
name TEXT, sort_name TEXT, artist TEXT
);
');
return $dbh;
}
);
has 'kiokudb' => (
is => 'ro',
lazy => 1,
default => sub {
KiokuDB->connect(
'dbi:Pg:dbname=musicbrainz_db_2',
user => 'musicbrainz_user',
password => '',
create => 1
);
},
);
sub store {
my ($self, $edit) = @_;
my $sth;
my $group = do {
my $sth = $self->dbh
->prepare('INSERT INTO edit_group DEFAULT VALUES RETURNING id');
$sth->execute;
(@{ $sth->fetchrow_arrayref })[0]
};
my $edit_id = do {
my $sth = $self->dbh
->prepare('INSERT INTO edit (edit_group) VALUES (?) RETURNING id');
$sth->execute($group);
(@{ $sth->fetchrow_arrayref })[0]
};
{
my $sth = $self->dbh
->prepare('INSERT INTO edit_revision (edit, revision) VALUES (?, ?)');
$sth->execute($edit_id, $edit->revision);
};
my $scope = $self->kiokudb->new_scope;
my ($table, $row) = $edit->to_row($self);
$row->{revision} = $edit->revision;
$row->{edit} = $edit_id;
my $query = "INSERT INTO $table ";
my @cols = keys %$row;
$query .= '(' . join(', ', @cols) . ') VALUES (' .
join(', ', ('?') x @cols) . ')';
$sth = $self->dbh->prepare($query);
$sth->execute(map { $row->{$_} } @cols);
}
}
# ----
my $artist = Artist->new(
name => 'June MilLer',
sort_name => 'Miller, June',
);
my $edit = Edit::Artist->new(
artist => $artist,
name => 'June Miller',
sort_name => 'june miller'
);
my $final_edit = $edit->new_revision(
sort_name => 'June Miller'
);
my $data = Data::Edit->new;
$data->store($edit);
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment