@alcaeus
Last active April 24, 2018 18:33
GridFS in ODM 2.0


Problems in the implementation in 1.0

  • A GridFS document is a regular document with a single property mapped as file. This suggests that mapping fields beyond the default GridFS fields is allowed, which it isn't according to the GridFS spec.
  • The file (normally) starts out as a path to a file on the local disk, but after hydration it is an instance of MongoGridFSFile. This type inconsistency is impossible to deal with in strictly typed systems.
  • A file was mutable, which it (currently) shouldn't be. Changes to a file should be pushed to GridFS as a new document with a newer uploadDate.

Changes necessitated by the new driver

  • The new MongoDB driver forbids storing additional fields outside the metadata embedded document
  • The MongoGridFSFile class was dropped without replacement: the driver handles file streams

Implementation requirements

  • Users will want to map their metadata documents
  • Users will want to map references to a file, requiring mapping a file to an object
  • The API should be consistent: if a user provides a file path to upload a document, they should receive a file path from which to read the file contents after fetching the document from the database

File mapping examples

Instead of mapping a file as a document and marking it as a file through the presence of a property mapped with file, the document is now mapped as a file directly. This prohibits mapping any fields other than the GridFS fields. According to the GridFS specification, these are:

  • length
  • chunkSize
  • uploadDate
  • filename
  • metadata

The deprecated md5, aliases and contentType fields cannot be mapped in the files document directly. The mapping also specifies an additional @FilePath annotation that ODM uses to determine where to read the file from when the document is persisted. A document that is loaded from the database will expose a file path from which its content can be read via this property. Changes to this file will not be persisted to the database.

The only field required to be mapped is the @FilePath field (as it's needed to read the file contents). The other fields are optional.
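
The field restriction described above can be sketched as a small validation step. This is only an illustration: the function validateFileMapping and the simplified array format (property name => field type) are hypothetical and not part of ODM. Allowing an id field is an assumption based on the sample File class.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: checks a simplified mapping (property name => field type)
 * and returns a list of problems. An empty list means the mapping is acceptable.
 *
 * @param array<string, string> $mapping
 * @return string[]
 */
function validateFileMapping(array $mapping): array
{
    // GridFS fields listed above, plus the id and file path mappings (assumed).
    $allowed = ['id', 'filePath', 'length', 'chunkSize', 'uploadDate', 'filename', 'metadata'];
    $errors = [];

    foreach ($mapping as $property => $type) {
        if (!in_array($type, $allowed, true)) {
            $errors[] = sprintf('Property "%s" maps unsupported field "%s".', $property, $type);
        }
    }

    // The file path is the only required mapping: it is needed to read the file contents.
    if (!in_array('filePath', $mapping, true)) {
        $errors[] = 'A file document must map a filePath field.';
    }

    return $errors;
}

// Usage: the deprecated contentType field is reported as unsupported.
print_r(validateFileMapping([
    'id' => 'id',
    'filePath' => 'filePath',
    'filename' => 'filename',
    'contentType' => 'contentType',
]));
```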

Metadata

The @Metadata annotation is equivalent to an @EmbedOne annotation and requires a document class.

Discriminators

The File document itself may not be discriminated, as there's no place to store the discriminator value in the document itself. The metadata document can be discriminated like any other embed relationship. Users are encouraged to use separate buckets or only discriminate the metadata property.

Code samples

The File.php and FileMetadata.php files show a sample implementation for documents that map to files. At this time, this implementation is impossible because the MongoDB PHP Library does not expose its stream wrapper outside of the Bucket class.

"Classic" object mapping similar to 1.0

Creating and storing a new file would look like this:

$file = new File('/tmp/path/to/file.txt', $owner);
$documentManager->persist($file);
$documentManager->flush();

Reading from the database:

$file = $documentManager->find(File::class, $fileId);
file_get_contents($file->getFilePath());

The advantages of this API:

  • A file is treated like any other document, with the only special property being the one mapped with @FilePath
  • There is no resource overhead between hydration and the file being read - no cursor is being kept open, no resource created to help read the file

This option is not easily implementable, since the MongoDB library does not expose its stream wrapper and only exposes resources for files.
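
One conceivable workaround, sketched here with plain PHP streams only, is to copy a download resource into a temporary file and hand out that path. This is an assumption for illustration and not part of the proposal: copyStreamToTemporaryFile is a hypothetical helper, and the in-memory stream stands in for a GridFS download stream. It also gives up the "no resource overhead" advantage, since the whole file is written to local disk.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: writes the remaining contents of a readable stream
 * to a temporary file and returns that file's path.
 *
 * @param resource $stream
 */
function copyStreamToTemporaryFile($stream): string
{
    $path = tempnam(sys_get_temp_dir(), 'gridfs');
    if ($path === false) {
        throw new RuntimeException('Could not create a temporary file.');
    }

    $target = fopen($path, 'wb');
    if ($target === false) {
        throw new RuntimeException('Could not open the temporary file for writing.');
    }

    try {
        if (stream_copy_to_stream($stream, $target) === false) {
            throw new RuntimeException('Could not copy the stream contents.');
        }
    } finally {
        // Close the target even when copying fails.
        fclose($target);
    }

    return $path;
}

// Stand-in for a GridFS download stream.
$source = fopen('php://memory', 'w+b');
fwrite($source, 'file contents');
rewind($source);

$path = copyStreamToTemporaryFile($source);
fclose($source);

echo file_get_contents($path), PHP_EOL;
unlink($path); // The caller is responsible for cleaning up the temporary file.
```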

Not exposing file paths but working with resources

With resources, the mapping would stay largely the same, except that the @FilePath mapping would have to be renamed. Creating and storing a new file would look like this:

$file = new File(fopen('/tmp/path/to/file.txt', 'r'), $owner);
$documentManager->persist($file);
$documentManager->flush();

Reading from the database:

$file = $documentManager->find(File::class, $fileId);
stream_get_contents($file->getResource());

When creating a new file, the resource would be opened when the document is created and must remain open until the document has been written to the database. Similarly, when reading data, the resource is opened on hydration and has to remain open until the document is detached from the document manager. When reading from a stream, users have to make sure to rewind the stream first (by default, stream_get_contents reads from the current position and does not rewind) or be aware that other code may have moved the current position in the stream.
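
The stream position caveat can be illustrated with an in-memory stream standing in for the resource held by the document. Nothing in this sketch is ODM-specific.

```php
<?php
declare(strict_types=1);

// In-memory stream standing in for the resource held by the document.
$resource = fopen('php://memory', 'w+b');
fwrite($resource, 'Hello GridFS');
rewind($resource);

// Some other code consumes part of the stream and moves the position.
fread($resource, 6);

// Without an explicit offset, reading continues from the current position.
$partial = stream_get_contents($resource);

// Rewinding first returns the full contents.
rewind($resource);
$full = stream_get_contents($resource);

var_dump($partial); // string(6) "GridFS"
var_dump($full);    // string(12) "Hello GridFS"
```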

Not exposing the file at all

This approach is radically different from the current approach. Users would pass a stream to a special repository method, along with an optional metadata document:

$file = $documentManager->getRepository(File::class)->uploadFromStream(
    fopen('/tmp/path/to/file.txt', 'r'),
    new FileMetadata($owner)
);

To read the file, the repository would be able to return a download stream:

stream_get_contents($documentManager->getRepository(File::class)->getDownloadStream($file));

This approach requires going through the repository for both uploads and downloads. For read operations, this could be avoided by implementing proxy logic to return a fresh download stream via a special method.
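
The proxy idea could look roughly like the sketch below. All names (LazyFile, getDownloadStream) are hypothetical, and the closure stands in for whatever the repository would do to open a GridFS download stream.

```php
<?php
declare(strict_types=1);

/** Hypothetical file proxy that opens a new stream on every call. */
final class LazyFile
{
    /** @var callable */
    private $streamFactory;

    public function __construct(callable $streamFactory)
    {
        $this->streamFactory = $streamFactory;
    }

    /**
     * Returns a fresh stream positioned at the start of the file.
     *
     * @return resource
     */
    public function getDownloadStream()
    {
        return ($this->streamFactory)();
    }
}

// Stand-in factory; ODM would inject one that asks the bucket for a download stream.
$file = new LazyFile(static function () {
    $stream = fopen('php://memory', 'w+b');
    fwrite($stream, 'file contents');
    rewind($stream);

    return $stream;
});

// Reading part of one stream moves only that stream's position.
$first = $file->getDownloadStream();
fread($first, 5);

// A second call returns an independent stream that starts at the beginning.
echo stream_get_contents($file->getDownloadStream()), PHP_EOL;
```

Because every call opens a new stream, callers never see a position moved by other code, at the cost of one open resource per call that the caller has to close.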

<?php
// File.php

declare(strict_types = 1);

namespace Documents;

use Doctrine\ODM\MongoDB\Mapping\Annotations as ODM;

/** @ODM\File */
class File
{
    /** @ODM\Id */
    private $id;

    /** @ODM\File\FilePath */
    private $filePath;

    /** @ODM\File\Filename */
    private $filename;

    /** @ODM\File\Length */
    private $length;

    /** @ODM\File\UploadDate */
    private $uploadDate;

    /** @ODM\File\Metadata(targetDocument=FileMetadata::class) */
    private $metadata;

    public function __construct(string $path, User $owner)
    {
        $this->filePath = $path;
        $this->metadata = new FileMetadata($owner);
    }

    public function getId(): ?string
    {
        return $this->id;
    }

    public function getFilePath(): string
    {
        return $this->filePath;
    }

    public function setFilename(string $filename): void
    {
        $this->filename = $filename;
    }

    public function getFilename(): ?string
    {
        return $this->filename;
    }

    // Nullable: the constructor does not set the length, so it is null on a new file.
    public function getLength(): ?int
    {
        return $this->length;
    }

    // Nullable: the constructor does not set the upload date, so it is null on a new file.
    public function getUploadDate(): ?\DateTimeInterface
    {
        return $this->uploadDate;
    }

    public function getMetadata(): FileMetadata
    {
        return $this->metadata;
    }
}

<?php
// FileMetadata.php

declare(strict_types = 1);

namespace Documents;

use Doctrine\ODM\MongoDB\Mapping\Annotations as ODM;

/** @ODM\EmbeddedDocument */
final class FileMetadata
{
    /**
     * @ODM\ReferenceOne(targetDocument=User::class)
     * @var User
     */
    private $owner;

    public function __construct(User $owner)
    {
        $this->setOwner($owner);
    }

    public function getOwner(): User
    {
        return $this->owner;
    }

    public function setOwner(User $owner): void
    {
        $this->owner = $owner;
    }
}