Vladimir Alexiev VladimirAlexiev

## README.md

      
Last active: April 26, 2024 12:51

This is a fixed representation of sec. 7, "Real ECLASS Content Example", from ECLASS Serialization as RDF, Part 1, ECLASS Technical Specification 110, 22 April 2024.
I collected all the text, added comments into puml:label, and had to make some fixes (marked with FIXED):

- Fixed 8 prefixes from eclass: (used for ECLASS "metadata" terms) to eclass13-0: (used for ECLASS content terms)
- Elided (commented out) 4 statements because they don't add clarity
- Changed or added 10 URLs to make the whole graph connected

Then I used the rdfpuml tool to make a diagram:


## CHIN-restructure.md

      
Last active: March 17, 2024 19:04

Examples of complex SPARQL queries that I've written

Prefixes

At the beginning, the query defines all the prefixes it uses, including those for individuals such as nom:, nomBib:, nomLang:.
The ontology nomo.ttl defines an even wider set of prefixes: when loaded into GraphDB,
these become repository namespaces, so they are used in result display and export,
which is very useful for the end user.
Output


## Makefile
all: sparql-anything-test-xml.ttl sparql-anything-test-html.ttl

sparql-anything-test-xml.ttl: sparql-anything.sparql test.xml
	sparql-anything.bat -q sparql-anything.sparql -v type=application/xml -v file=test.xml > sparql-anything-test-xml.ttl

sparql-anything-test-html.ttl: sparql-anything.sparql test.xml
	sparql-anything.bat -q sparql-anything.sparql -v type=text/html       -v file=test.xml > sparql-anything-test-html.ttl


## DIGIN10-30-LV1_EQ-fixed.jsonld
{
  "@context": {
    "rdf"                                    : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "cim"                                    : "http://ucaiug.org/ns/CIM#",
    "eu"                                     : "http://iec.ch/TC57/CIM100-European#",
    "dct"                                    : "http://purl.org/dc/terms/",
    "dcat"                                   : "http://www.w3.org/ns/dcat#",
    "prov"                                   : "http://www.w3.org/ns/prov#",
    "xsd"                                    : "http://www.w3.org/2001/XMLSchema#",
    "cim:Bay.VoltageLevel"                   : {"@type" : "@id"},

## README.md

      
Last active: November 22, 2023 15:23

Migrating J. Paul Getty Museum Agent ID from P2432 to P12040

https://www.wikidata.org/wiki/Property:P12040

- Renamed P2432 to "J. Paul Getty Museum agent DOR ID (old)"
- New prop P12040 "J. Paul Getty Museum agent ID"

URLs with the old IDs (e.g. https://www.getty.edu/art/collection/artists/377) redirect to URLs with the new IDs (e.g. https://www.getty.edu/art/collection/person/103JV9).
These pages include human-readable info and an "APIs & other identifiers" section at the bottom that lists:

- Permalink: the new prop
- DOR ID (internal digital object repository): the old prop. WD has 1054 values (about 9% of total)


## Berkshire-describe.ttl
@prefix dc:            <http://purl.org/dc/elements/1.1/> .
@prefix dct:           <http://purl.org/dc/terms/> .
@prefix fn:            <http://www.w3.org/2005/xpath-functions#> .
@prefix foaf:          <http://xmlns.com/foaf/0.1/> .
@prefix gleif-L1:      <https://www.gleif.org/ontology/L1/> .
@prefix gleif-L2:      <https://www.gleif.org/ontology/L2/> .
@prefix gleif-base:    <https://www.gleif.org/ontology/Base/> .
@prefix gleif-data-L1: <https://linked.opendata.gleif.org/L1/> .
@prefix gleif-data-L2: <https://linked.opendata.gleif.org/L2/> .
@prefix gleif-data-ra: <https://linked.opendata.gleif.org/RegistrationAuthority/> .

## README.md

      
Created: March 30, 2017 06:54

How to use Google Sheets to Manage Wikidata Coreferencing

A previous post, "How to Add Museum IDs to Wikidata", explained how to use SPARQL to find missing data on Wikidata (Getty Museum IDs), how to create such values (from museum webpage URLs), and how to format them properly for QuickStatements.
Here I explain how to use Google Sheets to manage a more advanced task. The sheet AAT-Wikidata matches about 3k AAT concepts to Wikipedia, WordNet30 and BabelNet (it restored an old mapping to WordNet, retrieved it from BabelNet, and mapped it to Wikipedia).

For each row, it uses the following Google Sheets formula (column C) to query the Wikipedia API and get the corresponding Wikidata ID (wikibase_item); the formula is split across two lines for readability:

=ImportXml(concat("https://en.wikipedia.o
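The same kind of lookup can be sketched outside of a spreadsheet. The Python below is only an illustration, not the sheet's formula: the query parameters (`prop=pageprops`, `ppprop=wikibase_item`, `redirects`, `format=xml`) are my assumption about a reasonable MediaWiki API call, and the helper names and the sample response are hypothetical. It builds the request URL for a page title and pulls `wikibase_item` out of an XML response, without making any network call.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# English Wikipedia API endpoint (the parameters below are an assumption,
# not a copy of the spreadsheet formula).
API = "https://en.wikipedia.org/w/api.php"


def wikidata_lookup_url(title):
    """Build a URL asking Wikipedia for the page properties of one title."""
    params = {
        "action": "query",
        "prop": "pageprops",
        "ppprop": "wikibase_item",
        "redirects": "1",
        "format": "xml",
        "titles": title,
    }
    return API + "?" + urlencode(params)


def extract_wikibase_item(xml_text):
    """Return the wikibase_item attribute from an XML response, or None."""
    root = ET.fromstring(xml_text)
    node = root.find(".//pageprops")
    return node.get("wikibase_item") if node is not None else None


# Hypothetical response, shaped like the API's XML output; "Q0" is a placeholder.
SAMPLE = (
    '<api><query><pages><page pageid="1" ns="0" title="Example page">'
    '<pageprops wikibase_item="Q0" /></page></pages></query></api>'
)

print(wikidata_lookup_url("Example page"))
print(extract_wikibase_item(SAMPLE))
```

A page that has no Wikidata item yields no `pageprops` element, so the extractor returns `None` and the row can be flagged for manual matching.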


## ePO_owl_core-outline.txt
epo:AccessTerm
epo:AcquiringCentralPurchasingBody
epo:AgentInRole
epo:AwardCriterion
epo:AwardDecision
epo:AwardEvaluationTerm
epo:Awarder
epo:AwardingCentralPurchasingBody
epo:BudgetProvider
epo:Business

## README.org

      
Created: September 23, 2022 11:11

CrunchBase permalinks including uppercase

Most CB permalinks are lowercase, but a tiny percentage (about 0.013%, per the counts below) include uppercase letters:
grep "[A-Z]" permalink.txt|sort>permalink-uppercase.txt
wc -l permalink.txt permalink-uppercase.txt
 2050775 permalink.txt
     272 permalink-uppercase.txt

I attach the file so it can be added as exceptions to Wikidata.
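
For readers without grep at hand, the filter above can be sketched in Python. This is a minimal equivalent under stated assumptions: the function names and the sample permalinks are hypothetical, and `sorted` orders by code point, which may differ from a locale-aware `sort`.

```python
import re

# Same character class as the grep pattern: any ASCII uppercase letter.
UPPER = re.compile(r"[A-Z]")


def uppercase_permalinks(permalinks):
    """Return, sorted, the permalinks that contain an uppercase letter."""
    return sorted(p for p in permalinks if UPPER.search(p))


def uppercase_share(total, with_upper):
    """Percentage of permalinks that contain an uppercase letter."""
    return 100.0 * with_upper / total


# Hypothetical sample, not taken from the Crunchbase data.
sample = ["acme-corp", "Zeta", "foo-bar", "Example-Inc"]
print(uppercase_permalinks(sample))  # ['Example-Inc', 'Zeta']

# Counts reported by wc -l above.
print(f"{uppercase_share(2050775, 272):.4f}%")  # 0.0133%
```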

  
## README.md

      
Last active: August 18, 2022 09:55

Crunchbase Semantic Model and Challenge: https://github.com/kg-construct/best-practices/issues/7

Crunchbase Challenge

Here's a challenge to the KG Construction CG:

- Take Crunchbase: 10.5M rows across 18 tables, served as CSV, updated daily.
- The data of some nodes comes from multiple tables (e.g. Organization from organizations, org_parents, org_descriptions).
- RDFize and store the whole dataset in under 1-2 hours.
  - Using the approach described here, GraphDB 9.11 with OntoRefine takes 76-119 minutes (1.3-2 hours), depending on hardware, to produce and load 138M triples (19-30k triples per second).
- Update the data daily, replacing the data of recently updated rows.
  - Using the approach described here, it takes about 15 minutes to update all of Crunchbase.
- Do it with your favorite RDFization toolkit, and preferably do it declaratively.
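
The throughput figures quoted in the challenge can be checked with a few lines of arithmetic. This sketch uses only the numbers stated above (138M triples, 76-119 minutes); the function name is mine, chosen for illustration.

```python
TRIPLES = 138_000_000  # triples produced and loaded, as stated above


def triples_per_second(triples, minutes):
    """Average load rate for a run that takes the given number of minutes."""
    return triples / (minutes * 60)


fast = triples_per_second(TRIPLES, 76)   # fastest reported run
slow = triples_per_second(TRIPLES, 119)  # slowest reported run

print(f"{slow / 1000:.1f}k to {fast / 1000:.1f}k triples per second")
# 19.3k to 30.3k triples per second
```

This agrees with the quoted range of 19-30k triples per second, so the same function can serve as a yardstick when timing another toolkit against the challenge.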