Created
August 26, 2011 06:26
% bundle exec mida http://lawrencewoodman.github.com/mida/news/
Parsing: http://lawrencewoodman.github.com/mida/news/
---
:type: http://schema.org/Blog
% bundle exec mida -v http://lawrencewoodman.github.com/mida/news/
Parsing: http://lawrencewoodman.github.com/mida/news/
---
:type: http://schema.org/Blog
:properties:
  blogPosts:
  - :type: http://schema.org/BlogPosting
    :properties:
      name:
      - Mida v0.3.3 Released
      author:
      - :type: http://schema.org/Person
        :properties:
          url:
          - http://techtinkering.com
          name:
          - Lawrence Woodman
      datePublished:
      - '2011-08-01'
      articleBody:
      - ! "This version now contains all the schema.org datatypes, enumerations and
        vocabularies. This was achieved by scraping the site using schema.org_scraper
        to create a JSON and YAML representation of the types. These are hosted on
        github so that the changes can easily be tracked.\n\nThe main problem that
        I did encounter was that the Enumeration type conflicts with the explanation
        in the Getting started Guide. In the type list, it says that Enumeration is
        a vocabulary derived from Thing, whereas in the explanation it says to uses
        the <link> tag to specify the value of the enumeration. The latter makes a
        lot more sense as otherwise it is unclear which property should be used to
        specify the value. I have therefore created an Enumeration DataType, which
        can be subclassed to create an Enumeration with specified valid values, as
        in the following:\n# The publication format of the book.\nclass BookFormatType
        < Mida::DataType::Enumeration\n VALID_VALUES = [\n [::Mida::DataType::URL,
        %r{http://schema.org/EBook}i],\n [::Mida::DataType::URL, %r{http://schema.org/Hardcover}i],\n
        \ [::Mida::DataType::URL, %r{http://schema.org/Paperback}i]\n ]\nend\n\n\nTo
        see the complete list of changes, please have a look at the CHANGELOG."
  - :type: http://schema.org/BlogPosting
    :properties:
      name:
      - Mida v0.3.2 Released
      author:
      - :type: http://schema.org/Person
        :properties:
          url:
          - http://techtinkering.com
          name:
          - Lawrence Woodman
      datePublished:
      - '2011-07-09'
      articleBody:
      - ! "After looking at the best way to implement the recently released schema.org
        Vocabularies, I decided to allow vocabularies to be included into other vocabularies.
        This would not only reduce repetition when specifying the same properties
        repeatedly, but would also allow vocabularies that include another vocabulary
        to be used in place of that vocabulary. It is used as follows:\nclass Thing
        < Mida::Vocabulary\n itemtype %r{http://example.com/vocab/thing}i\n has_one
        'name', 'description'\nend\n\nclass Book < Mida::Vocabulary\n itemtype %r{http://example.com/vocab/book}i\n
        \ include_vocabulary Thing\n has_one 'title', 'author'\nend\n\nclass Collection
        < Mida::Vocabulary\n itemtype %r{http://example.com/vocab/collection}i\n
        \ has_many 'item' do\n extract Thing\n end\nend\n\n\nIn the above if you
        gave a Book as an item of Collection this would be accepted because it includes
        the Thing vocabulary. When examining the item you would find #vocabulary set
        to Book and you would have access to all the properties of Thing and all the
        properties of Book.\n\nThis release also has a small bug fix so that mida
        no longer defaults to searches for %r{} and will only make searches if a type
        regexp is given."
  - :type: http://schema.org/BlogPosting
    :properties:
      name:
      - Mida v0.3.1 Released
      author:
      - :type: http://schema.org/Person
        :properties:
          url:
          - http://techtinkering.com
          name:
          - Lawrence Woodman
      datePublished:
      - '2011-07-05'
      articleBody:
      - ! "The big addition for this release is a command-line tool, mida.\n\nTo use
        the tool, supply it with the urls or filenames that you would like to be parsed
        (by default each item is output as yaml):\n mida http://lawrencewoodman.github.com/mida/news/\n\n\nIf
        you want to search for specific types you can use the -t switch followed by
        a Regular Expression:\n mida -t /person/i http://lawrencewoodman.github.com/mida/news/\n\n\nFor
        more information look at mida’s help:\n mida -h\n\n\nTo see the complete
        list of changes, please have a look at the CHANGELOG."
  - :type: http://schema.org/BlogPosting
    :properties:
      name:
      - Mida v0.3.0 Released
      author:
      - :type: http://schema.org/Person
        :properties:
          url:
          - http://techtinkering.com
          name:
          - Lawrence Woodman
      datePublished:
      - '2011-06-29'
      articleBody:
      - ! 'The new release includes quite a few refinements such as auto registering
        the vocabularies via the inherited hook and adding various DataTypes such
        as Boolean, Float, ISO8601Date, etc. To improve output when using pp or y
        on Item#to_h the vocabulary has been removed.
        There are a couple of changes to watch out for:
        When describing a vocabulary types has been deprecated in favour of extract.
        This was done in case greater flexibility was needed at a later date by providing
        extract with a block. If this happens extract would make more sense as a name.
        String can no longer be used as a type when describing a Vocabulary, instead
        use DataType::Text
        Properties marked as has_one now output a single value instead of an Array
        Document#search now only uses a Regexp to search with as this greatly simplified
        the code
        The project would benefit from greater collaboration, to aid this Bundler
        support has been added.
        To see the complete list of changes, please have a look at the CHANGELOG.'
  - :type: http://schema.org/BlogPosting
    :properties:
      name:
      - Mida v0.2.0 Released
      author:
      - :type: http://schema.org/Person
        :properties:
          url:
          - http://techtinkering.com
          name:
          - Lawrence Woodman
      datePublished:
      - '2011-05-03'
      articleBody:
      - ! "The main change for this release is that you can now describe vocabularies
        to conform to. These are set out by subclassing VocabularyDesc as in the following:\nclass
        Rating < Mida::VocabularyDesc\n itemtype %r{http://example\\.com.*?rating$}i\n
        \ has_one 'best', 'value'\nend\n\nclass Comment < Mida::VocabularyDesc\n itemtype
        %r{http://example\\.com.*?comment$}i\n has_one 'commentor', 'comment'\nend\n\nclass
        Review < Mida::VocabularyDesc\n itemtype %r{http://example\\.com.*?review$}i\n
        \ has_one 'itemreviewed'\n has_one 'rating' do\n types Rating, String\n
        \ end\n has_many 'comments' do\n types Comment\n end\nend\n\n\nThere
        were also a few implementation changes. To see the complete list of changes,
        please have a look at the CHANGELOG."
  - :type: http://schema.org/BlogPosting
    :properties:
      name:
      - Mida v0.1.3 Released
      author:
      - :type: http://schema.org/Person
        :properties:
          url:
          - http://techtinkering.com
          name:
          - Lawrence Woodman
      datePublished:
      - '2011-04-18'
      articleBody:
      - This release fixes a few bugs where itemprops were not being parsed properly
        if they contained non-microdata elements. It also will now recognize itemprops
        nested within other itemprops.
  - :type: http://schema.org/BlogPosting
    :properties:
      name:
      - Mida v0.1.0 Released
      author:
      - :type: http://schema.org/Person
        :properties:
          url:
          - http://techtinkering.com
          name:
          - Lawrence Woodman
      datePublished:
      - '2011-04-12'
      articleBody:
      - This is the first release of Mida. It was written after hearing rumours of
        Google making increasing use of Microdata; if this turns out to be true then
        more sites will in-turn make use of it and a Microdata parser will become
        very useful.
------------
ruby-1.9.2-p136 :001 > Nokogiri::LIBXML_VERSION
=> "2.7.6"
ruby-1.9.2-p136 :002 > Nokogiri::LIBXML_PARSER_VERSION
=> "20706"
ruby-1.9.2-p136 :003 >