@SemanticBeeng
Last active October 22, 2020 07:08
I tried to write my thoughts down in a "one-page proposal" style.
Have succeeded only moderately.
Please advise if this makes sense.
In my work designing big data management and analytics products I often make the case that "knowledge science" has to come before "data science".
Unless the meaning of the data is under governance, the numbers produced by data/ML analyses will be less useful than they could be.
Instead, semantic data governance enables:
* better use of the raw data from both business and engineering POV
* data produced by feature engineering & ML models that is more reusable and accessible to all data citizens
* a smaller gap between model predictions and business decisions
To act on these insights one needs the actual domain knowledge to produce the domain-specific semantic models.
And this is often sorely missing in typical enterprises!
There are a number of trends and technologies for combining "AI" and "blockchain".
Two of the most prominent are @origin_trail ("Google for supply chains") and @oceanprotocol (data marketplaces and AI on blockchain).
Their shared insight is that sharing and trading data would enable a "data economy".
The vision they develop around the "data economy" is quite plausible and rich (it even considers activities such as data curation): different kinds of parties can collaborate around proprietary "data", "create value" and be rewarded based on well-defined electronic "smart contracts".
All such developments are very promising because they envision emerging new markets for data and knowledge, beyond the restrictive boundaries of enterprises and research groups.
But they also severely underestimate the importance of "knowledge science" for publishing, searching, finding and curating the data, and focus more on protecting the interests of the parties involved.
Some large organizations have publicly described their challenges in managing the complex "tribal knowledge" around their data.
And some have created tools to address the challenges: https://twitter.com/semanticbeeng/status/1236929013618270208
Still, at this time, "state of the art" data management practice, especially around data marketplaces, is sorely missing semantic data governance.
On the basis of this realization, my market research has led me to the amazing work you and others do under CGIAR to extract and curate domain knowledge into semantic models.
Have studied a few public materials and collected a lot more to study in depth.
Just studying and using these models is of great value to me - thank you for making them available.
And, given the market context above, am wondering if you would be open to exploring ways to apply your work commercially.
Some ideas:
1. Explore data marketplaces (as per above) for private datasets your group (or others) might have available and be willing to offer under a license or for a fee
2. Help other organizations that want to use your semantic models but do not know how or do not have the resources to do it
3. Combine ontologies with word embedding methods to add statistical relevance to concepts in the ontology
4. Create domain-specific search engines for parties like farmers or food suppliers that could benefit from reusing the semantic models you created
Hope this is interesting enough to warrant some exploration.