@simong
Created October 30, 2012 13:52

The content module for Sakai OAE

Cassandra model

Column families

1. Content

Holds the actual metadata for a piece of content

2. LibraryByPrincipal

Three rows per principal, each holding the IDs of the content items that are visible in that library.

  • principalId:PUBLIC
  • principalId:LOGGEDIN
  • principalId
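
As a rough sketch, the three row keys could be derived from a principal ID as follows. The helper name `getLibraryRowKeys` and the property names are hypothetical, and treating the bare `principalId` row as the principal's own view is an assumption.

```javascript
// Hypothetical helper: builds the three LibraryByPrincipal row keys for a principal.
// The property names ('public', 'loggedin', 'self') are illustrative only.
var getLibraryRowKeys = function(principalId) {
    return {
        'public': principalId + ':PUBLIC',
        'loggedin': principalId + ':LOGGEDIN',
        'self': principalId
    };
};

var keys = getLibraryRowKeys('u:cam:simong');
console.log(keys);
```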

3. Revisions

A content item that has a file body (or a page in a sakaidoc?) will have 1+ revisions associated with it. Each revision object consists of:

  • created - time in millis since epoch
  • createdBy - user id.
  • storage - A URI that allows the storage backend to decide where the file is stored.

I don't think there is value in having IDs for our revisions other than the created timestamps. The UI doesn't seem to expose any version numbers (neither does Google Docs, for example).

There will be one row of revisions per content item. Assuming local storage:

contentId      |  1351604415674                              | 1351604565231
c:cam:F1a3cd   |  u:cam:nico#local:/2012/10/30/13/41/shortid | u:cam:mrvisser#local:/2012/10/30/13/43/shortid

Assuming Amazon S3 storage

contentId      |  1351604415674                                   | 1351604565231
c:cam:CD1f7c   |  u:cam:simong#amazons3:/2012/10/30/13/41/shortid | u:cam:bert#amazons3:/2012/10/30/13/43/shortid
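
To make the row layout concrete, here is a minimal sketch that turns one column (name and value) back into a revision object. It assumes the value is always `<createdBy>#<storage URI>`, split on the first `#`, as the example rows suggest; `parseRevisionColumn` is a made-up name.

```javascript
// Sketch only: assumes each column value is "<createdBy>#<storage URI>",
// split on the first '#', as in the example rows.
var parseRevisionColumn = function(created, value) {
    var separator = value.indexOf('#');
    if (separator === -1) {
        throw new Error('Unexpected revision value: ' + value);
    }
    return {
        'created': parseInt(created, 10),           // column name: millis since epoch
        'createdBy': value.slice(0, separator),     // user id
        'storage': value.slice(separator + 1)       // URI for the storage backend
    };
};

var revision = parseRevisionColumn('1351604415674', 'u:cam:nico#local:/2012/10/30/13/41/shortid');
console.log(revision.createdBy, revision.storage);
```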

Storage backends

The idea behind the different storage backends is that institutions can choose where to store their files. Potential backends:

  • Local disk / NFS
  • Amazon S3
  • Google Cloud Storage
  • ..

Each storage backend implements the same simple 'interface', which allows for backend-agnostic REST/Service APIs.
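
One way the service layer could stay backend-agnostic is to pick the backend from the scheme of the storage URI. The sketch below is an assumption, not part of the proposal: `registerBackend` and `getBackendForUri` are hypothetical names, and the placeholder backend objects stand in for modules implementing the interface.

```javascript
// Hypothetical registry, keyed by the URI scheme used in the example rows
// ("local", "amazons3").
var backends = {};

var registerBackend = function(scheme, backend) {
    backends[scheme] = backend;
};

// Returns the backend responsible for a storage URI such as
// "local:/2012/10/30/13/41/shortid".
var getBackendForUri = function(uri) {
    var scheme = uri.split(':')[0];
    if (!Object.prototype.hasOwnProperty.call(backends, scheme)) {
        throw new Error('No storage backend registered for "' + scheme + '"');
    }
    return backends[scheme];
};

// Placeholder objects; real backends would expose store, update, remove, etc.
registerBackend('local', {'name': 'local'});
registerBackend('amazons3', {'name': 'amazons3'});

console.log(getBackendForUri('amazons3:/2012/10/30/13/41/shortid').name);
```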

Potential Storage Backend interface

    /*!
     * Copyright 2012 Sakai Foundation (SF) Licensed under the
     * Educational Community License, Version 2.0 (the "License"); you may
     * not use this file except in compliance with the License. You may
     * obtain a copy of the License at
     *
     *     http://www.osedu.org/licenses/ECL-2.0
     *
     * Unless required by applicable law or agreed to in writing,
     * software distributed under the License is distributed on an "AS IS"
     * BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
     * or implied. See the License for the specific language governing
     * permissions and limitations under the License.
     */

    /**
     * The interface that each Storage Backend should implement.
     */

    var Revision = require('oae-content/lib/model').Revision;

    /**
     * Apply any REST endpoints that are required for this storage backend.
     * @param  {Server} server The express server object.
     */
    module.exports.applyRestEndpoints = function(server) {
        
    };

    /**
     * Stores a file body on the storage backend.
     * It's assumed that on executing this method it's the first time
     * you're storing a file body against this piece of content.
     * This method will create the initial revision in the DB and will return that to the caller.
     *
     * @param {Context}     ctx                 The current execution context.
     * @param {Content}     contentItem         The content item that this file body is associated with.
     * @param {Buffer}      body                The file body.
     * @param {Function}    [callback]          An optional callback method
     * @param {Object}      callback.err        An error object (if any)
     * @param {Revision}    callback.revision   A revision object.
     */
    var store = module.exports.store = function(ctx, contentItem, body, callback) {};

    /**
     * Updates the file body on a piece of content by creating a new revision.
     * The old file body will **not** be overwritten.
     *
     * @param {Context}     ctx                 The current execution context.
     * @param {Content}     contentItem         The content item that this file body is associated with.
     * @param {Buffer}      body                The file body.
     * @param {Function}    [callback]          An optional callback method
     * @param {Object}      callback.err        An error object (if any)
     * @param {Revision}    callback.revision   A revision object.
     */
    var update = module.exports.update = function(ctx, contentItem, body, callback) {};

    /**
     * Delete a file body.
     *
     * @param {Context}     ctx                 The current execution context.
     * @param {Content}     contentItem         The content item that this file body is associated with.
     * @param {Revision}    [revision]          An optional revision. If the revision is omitted, the most current revision will be removed.
     * @param {Function}    [callback]          An optional callback method
     * @param {Object}      callback.err        An error object (if any)
     */
    var remove = module.exports.remove = function(ctx, contentItem, revision, callback) {};

    /**
     * In case the storage backend is not hosted on this app server,
     * the user will be redirected to a URL that gets generated by this method.
     * If the file bodies do exist on this app server, this method will return null
     * and the caller should use the `getBody` method.
     *
     * @param {Context}     ctx                 The current execution context.
     * @param {Content}     contentItem         The content item that this file body is associated with.
     * @param {Revision}    [revision]          An optional revision. If the revision is omitted, the url for the most current revision will be retrieved.
     * @param {Function}    callback            A callback method
     * @param {Object}      callback.err        An error object (if any)
     * @param {String}      callback.url        A URL that points to the external file, or null if the file sits locally.
     */
    var getDownloadLink = module.exports.getDownloadLink = function(ctx, contentItem, revision, callback) {};

    /**
     * Gets the content of a file body.
     * Warning: It usually doesn't make sense to retrieve the filebody
     * on the app server. Handle with care.
     *
     * @param {Context}     ctx                 The current execution context.
     * @param {Content}     contentItem         The content item that this file body is associated with.
     * @param {Revision}    [revision]          An optional revision. If the revision is omitted, the most current revision will be retrieved.
     * @param {Function}    callback            A callback method
     * @param {Object}      callback.err        An error object (if any)
     * @param {Buffer}      callback.body       The file body.
     */
    var getBody = module.exports.getBody = function(ctx, contentItem, revision, callback) {};
@mrvisser

The API seems to suggest that the body will be passed around as byte arrays; we should avoid this. The parameter and method names should indicate that we're passing around streams, not "bodies". Those streams would be event-driven references that emit events such as 'data', 'end', 'error' and 'close'.

Node.js specifies a streaming API, currently marked unstable: http://nodejs.org/api/stream.html. If we don't use it directly, we should at least follow its stream events as closely as possible.
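
A minimal sketch of what a stream-based `store` might look like, using Node's `stream` module. The `storeStream` name and its `destination` parameter are hypothetical; a real backend would create the destination itself (a file, an S3 upload, and so on). It relies on the writable's 'finish' event, which current Node versions emit once all data has been flushed.

```javascript
var stream = require('stream');

// Hypothetical stream-based variant of `store`: instead of a Buffer it takes a
// readable stream and pipes it into a writable destination owned by the backend.
var storeStream = function(input, destination, callback) {
    var finished = false;
    var finish = function(err) {
        // Make sure the callback only fires once, whichever event arrives first.
        if (!finished) {
            finished = true;
            callback(err || null);
        }
    };
    input.on('error', finish);
    destination.on('error', finish);
    destination.on('finish', function() { finish(); });
    input.pipe(destination);
};

// Usage: a PassThrough stands in for an upload, and a small in-memory
// Writable stands in for local disk or S3.
var received = [];
var upload = new stream.PassThrough();
var destination = new stream.Writable({
    'write': function(chunk, encoding, done) {
        received.push(chunk);
        done();
    }
});

storeStream(upload, destination, function(err) {
    console.log(err ? 'failed' : 'stored ' + Buffer.concat(received).length + ' bytes');
});
upload.end('hello world');
```

The file body never has to sit in memory as a whole; each chunk is handed to the destination as it arrives.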

@mrvisser

For the URI references (e.g. u:cam:mrvisser#local:/2012/10/30/13/43/shortid): rather than generating a shortid, we can probably just append the timestamp to the content item's ID. That way, navigating the filesystem gives some indication of what each content body is.
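
Roughly, that could look like the sketch below. Everything here is an assumption for illustration: the `buildStorageUri` name, the `<contentId>-<created>` file name, and deriving the date path from the timestamp in UTC.

```javascript
// Zero-pads single-digit date parts so paths sort correctly.
var pad = function(n) {
    return (n < 10 ? '0' : '') + n;
};

// Hypothetical helper: the path layout mirrors the examples
// (year/month/day/hour/minute), with "<contentId>-<created>" replacing the
// short id. Using UTC for the date parts is an assumption.
var buildStorageUri = function(backend, contentId, created) {
    var date = new Date(created);
    var parts = [
        date.getUTCFullYear(),
        pad(date.getUTCMonth() + 1),
        pad(date.getUTCDate()),
        pad(date.getUTCHours()),
        pad(date.getUTCMinutes()),
        contentId + '-' + created
    ];
    return backend + ':/' + parts.join('/');
};

console.log(buildStorageUri('local', 'c:cam:F1a3cd', 1351604565231));
```

Note that the colons in the content ID are not valid in file names on every filesystem, so a local backend may need to escape them.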

@mrvisser

Couple little nitpicks: getDownloadLink might be better off named getDownloadUrl, and contentItem might be better off referred to as contentId.
