One of the objectives of GITenberg is to provide a GitHub-flavored pathway for improving the metadata of Project Gutenberg (PG) ebooks. This runs in two directions:

. Improving the accessibility and usability of PG metadata
. Improving the quality and completeness of PG metadata

The first step in this effort is to figure out what metadata already exists in Project Gutenberg.
Project Gutenberg provides periodic dumps of its metadata as RDF. This is the same metadata used to generate the "bibrec" pages and to build ebook files (an EPUB package, for example, stores this metadata in its "OPF" file). The dump is a zipped tarfile containing one RDF file per PG text. In the second tranche of repos moved to GitHub (roughly those above 10,000), Seth added the RDF file for each text to the corresponding archive. This saves you from having to deal with opening the archive and letting your operating system deal with 50,000 directori