@Paxa
Created August 26, 2011 06:26
% bundle exec mida http://lawrencewoodman.github.com/mida/news/
Parsing: http://lawrencewoodman.github.com/mida/news/
---
:type: http://schema.org/Blog
% bundle exec mida -v http://lawrencewoodman.github.com/mida/news/
Parsing: http://lawrencewoodman.github.com/mida/news/
---
:type: http://schema.org/Blog
:properties:
blogPosts:
- :type: http://schema.org/BlogPosting
:properties:
name:
- Mida v0.3.3 Released
author:
- :type: http://schema.org/Person
:properties:
url:
- http://techtinkering.com
name:
- Lawrence Woodman
datePublished:
- '2011-08-01'
articleBody:
- ! "This version now contains all the schema.org datatypes, enumerations and
vocabularies. This was achieved by scraping the site using schema.org_scraper
to create a JSON and YAML representation of the types. These are hosted on
github so that the changes can easily be tracked.\n\nThe main problem that
I did encounter was that the Enumeration type conflicts with the explanation
in the Getting started Guide. In the type list, it says that Enumeration is
a vocabulary derived from Thing, whereas in the explanation it says to use
the <link> tag to specify the value of the enumeration. The latter makes a
lot more sense as otherwise it is unclear which property should be used to
specify the value. I have therefore created an Enumeration DataType, which
can be subclassed to create an Enumeration with specified valid values, as
in the following:\n# The publication format of the book.\nclass BookFormatType
< Mida::DataType::Enumeration\n VALID_VALUES = [\n [::Mida::DataType::URL,
%r{http://schema.org/EBook}i],\n [::Mida::DataType::URL, %r{http://schema.org/Hardcover}i],\n
\ [::Mida::DataType::URL, %r{http://schema.org/Paperback}i]\n ]\nend\n\n\nTo
see the complete list of changes, please have a look at the CHANGELOG."
- :type: http://schema.org/BlogPosting
:properties:
name:
- Mida v0.3.2 Released
author:
- :type: http://schema.org/Person
:properties:
url:
- http://techtinkering.com
name:
- Lawrence Woodman
datePublished:
- '2011-07-09'
articleBody:
- ! "After looking at the best way to implement the recently released schema.org
Vocabularies, I decided to allow vocabularies to be included into other vocabularies.
This would not only reduce repetition when specifying the same properties
repeatedly, but would also allow vocabularies that include another vocabulary
to be used in place of that vocabulary. It is used as follows:\nclass Thing
< Mida::Vocabulary\n itemtype %r{http://example.com/vocab/thing}i\n has_one
'name', 'description'\nend\n\nclass Book < Mida::Vocabulary\n itemtype %r{http://example.com/vocab/book}i\n
\ include_vocabulary Thing\n has_one 'title', 'author'\nend\n\nclass Collection
< Mida::Vocabulary\n itemtype %r{http://example.com/vocab/collection}i\n
\ has_many 'item' do\n extract Thing\n end\nend\n\n\nIn the above if you
gave a Book as an item of Collection this would be accepted because it includes
the Thing vocabulary. When examining the item you would find #vocabulary set
to Book and you would have access to all the properties of Thing and all the
properties of Book.\n\nThis release also has a small bug fix so that mida
no longer defaults to searching for %r{} and will only search if a type
regexp is given."
- :type: http://schema.org/BlogPosting
:properties:
name:
- Mida v0.3.1 Released
author:
- :type: http://schema.org/Person
:properties:
url:
- http://techtinkering.com
name:
- Lawrence Woodman
datePublished:
- '2011-07-05'
articleBody:
- ! "The big addition for this release is a command-line tool, mida.\n\nTo use
the tool, supply it with the urls or filenames that you would like to be parsed
(by default each item is output as yaml):\n mida http://lawrencewoodman.github.com/mida/news/\n\n\nIf
you want to search for specific types you can use the -t switch followed by
a Regular Expression:\n mida -t /person/i http://lawrencewoodman.github.com/mida/news/\n\n\nFor
more information look at mida’s help:\n mida -h\n\n\nTo see the complete
list of changes, please have a look at the CHANGELOG."
- :type: http://schema.org/BlogPosting
:properties:
name:
- Mida v0.3.0 Released
author:
- :type: http://schema.org/Person
:properties:
url:
- http://techtinkering.com
name:
- Lawrence Woodman
datePublished:
- '2011-06-29'
articleBody:
- ! 'The new release includes quite a few refinements such as auto registering
the vocabularies via the inherited hook and adding various DataTypes such
as Boolean, Float, ISO8601Date, etc. To improve output when using pp or y
on Item#to_h the vocabulary has been removed.
There are a couple of changes to watch out for:
When describing a vocabulary types has been deprecated in favour of extract.
This was done in case greater flexibility was needed at a later date by providing
extract with a block. If this happens extract would make more sense as a name.
String can no longer be used as a type when describing a Vocabulary, instead
use DataType::Text
Properties marked as has_one now output a single value instead of an Array
Document#search now only uses a Regexp to search with as this greatly simplified
the code
The project would benefit from greater collaboration; to aid this, Bundler
support has been added.
To see the complete list of changes, please have a look at the CHANGELOG.'
- :type: http://schema.org/BlogPosting
:properties:
name:
- Mida v0.2.0 Released
author:
- :type: http://schema.org/Person
:properties:
url:
- http://techtinkering.com
name:
- Lawrence Woodman
datePublished:
- '2011-05-03'
articleBody:
- ! "The main change for this release is that you can now describe vocabularies
to conform to. These are set out by subclassing VocabularyDesc as in the following:\nclass
Rating < Mida::VocabularyDesc\n itemtype %r{http://example\\.com.*?rating$}i\n
\ has_one 'best', 'value'\nend\n\nclass Comment < Mida::VocabularyDesc\n itemtype
%r{http://example\\.com.*?comment$}i\n has_one 'commentor', 'comment'\nend\n\nclass
Review < Mida::VocabularyDesc\n itemtype %r{http://example\\.com.*?review$}i\n
\ has_one 'itemreviewed'\n has_one 'rating' do\n types Rating, String\n
\ end\n has_many 'comments' do\n types Comment\n end\nend\n\n\nThere
were also a few implementation changes. To see the complete list of changes,
please have a look at the CHANGELOG."
- :type: http://schema.org/BlogPosting
:properties:
name:
- Mida v0.1.3 Released
author:
- :type: http://schema.org/Person
:properties:
url:
- http://techtinkering.com
name:
- Lawrence Woodman
datePublished:
- '2011-04-18'
articleBody:
- This release fixes a few bugs where itemprops were not being parsed properly
if they contained non-microdata elements. It also will now recognize itemprops
nested within other itemprops.
- :type: http://schema.org/BlogPosting
:properties:
name:
- Mida v0.1.0 Released
author:
- :type: http://schema.org/Person
:properties:
url:
- http://techtinkering.com
name:
- Lawrence Woodman
datePublished:
- '2011-04-12'
articleBody:
- This is the first release of Mida. It was written after hearing rumours of
Google making increasing use of Microdata; if this turns out to be true then
more sites will in turn make use of it and a Microdata parser will become
very useful.
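
The verbose output above is YAML in which each item is a hash with the symbol keys :type and :properties, and every property value is a list. As a rough sketch of how such output could be read back into Ruby, the snippet below loads a shortened copy of the first post using only the standard library. The indentation of the sample and the variable names (output, item, posts, titles) are assumptions made for illustration, since the listing above does not show nesting; this is not code taken from mida itself.

```ruby
require 'yaml'

# Shortened sample of the verbose output. The indentation is assumed;
# the pasted listing above does not preserve it.
output = <<~YAML
  ---
  :type: http://schema.org/Blog
  :properties:
    blogPosts:
    - :type: http://schema.org/BlogPosting
      :properties:
        name:
        - Mida v0.3.3 Released
        datePublished:
        - '2011-08-01'
YAML

# Keys such as ":type" load as Ruby symbols, so Symbol has to be
# permitted explicitly when using safe_load.
item = YAML.safe_load(output, permitted_classes: [Symbol])

# Each property value is an array, hence the call to #first.
posts  = item[:properties]['blogPosts']
titles = posts.map { |post| post[:properties]['name'].first }

puts titles
```

Note that item-level keys (:type, :properties) are symbols while property names (blogPosts, name) are plain strings, so the two kinds of lookup differ.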
------------
ruby-1.9.2-p136 :001 > Nokogiri::LIBXML_VERSION
=> "2.7.6"
ruby-1.9.2-p136 :002 > Nokogiri::LIBXML_PARSER_VERSION
=> "20706"
ruby-1.9.2-p136 :003 >