@practicalparticipation
Created May 31, 2013 11:09
SPARQL queries for working with IATI Linked Data.
The query below retrieves URLs for all the transactions that have GB-1 as the providing organisation (currently 26,000 or so). It runs against the store at http://eculture.cs.vu.nl:1987/iati/user/query, which uses the data model from http://iati2lod.appspot.com/model/activities.
```sparql
PREFIX iati:<http://purl.org/collections/iati/>
SELECT ?transaction WHERE {
?transaction a iati:transaction.
?transaction iati:provider-org <http://purl.org/collections/iati/codelist/OrganisationIdentifier/GB-1> .
}
```
Provenance information on when this data was last updated is available in the triple store (details in the data model information).
@tetrahedra

The equivalent XPath is:

```xpath
//iati-activity/transaction/provider-org[contains(@provider-activity-id,"GB-1-")]
```

So that should return all transactions where the provider-activity-id is set to a DFID project.

@practicalparticipation
Author

Ok - there are two ways I think we might see the provider noted:

The query below identifies Incoming Funds (IF) transactions that are related to DFID via provider-org:

```sparql
PREFIX iati:<http://purl.org/collections/iati/>

SELECT * WHERE {
  ?transaction iati:provider-org <http://purl.org/collections/iati/codelist/OrganisationIdentifier/GB-1> .
  ?transaction iati:transaction-type <http://purl.org/collections/iati/codelist/TransactionType/IF> .
  ?activity iati:activity-transaction ?transaction .
}
```

but this is different from provider-activity-id which I'll take a look at now...

@practicalparticipation
Author

I think what you want is:

```sparql
PREFIX iati:<http://purl.org/collections/iati/>

SELECT * WHERE {
  ?transaction iati:provider-org-activity-id ?providerId .
  ?activity iati:activity-transaction ?transaction .
  FILTER(regex(str(?providerId),"GB-1"))
}
```

(Or add `?transaction iati:transaction-type <http://purl.org/collections/iati/codelist/TransactionType/IF> .` as well to get just those marked as Incoming Funds.)

Results are here.
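
For reference, a minimal sketch of the combined query, with the transaction-type restriction written out in full. It uses only predicates that already appear in this thread. The one change is an assumption about intent: the filter string is "GB-1-" (with the trailing hyphen) so that it mirrors the XPath contains() test above and does not also match identifiers such as "GB-10".

```sparql
PREFIX iati: <http://purl.org/collections/iati/>

SELECT ?activity ?transaction ?providerId WHERE {
  # Transactions that name a providing activity
  ?transaction iati:provider-org-activity-id ?providerId .
  # Keep only transactions of type IF
  ?transaction iati:transaction-type <http://purl.org/collections/iati/codelist/TransactionType/IF> .
  # The activity that reports the transaction
  ?activity iati:activity-transaction ?transaction .
  # Unanchored match on "GB-1-", as in the XPath contains() call
  FILTER(regex(str(?providerId), "GB-1-"))
}
```

Drop the transaction-type line to get every transaction that points back to a DFID activity, regardless of type.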

@tetrahedra

How up-to-date is the repository? I'm getting interestingly different results from our XPath query results (data cut 17 Apr).

Effectively I'm trying to identify the data publishers who link back to funding DFID projects via provider-activity-id, so that we can link them from Aid Info Platform.

Can I add the publisher name to the query results?

@practicalparticipation
Author

Yes - if you run

```sparql
PREFIX iati:<http://purl.org/collections/iati/>

SELECT * WHERE {
  GRAPH ?graph {
    ?transaction iati:provider-org-activity-id ?providerId .
    ?activity iati:activity-transaction ?transaction .
    FILTER(regex(str(?providerId),"GB-1"))
  }
  # Same variable as the GRAPH clause, so the date belongs to that package
  ?graph iati:extras-data-updated ?updated .
}
```

You will get details of the IATI-Registry package the data was drawn from, and the updated-date given by the registry for that package.
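
If only the list of source packages is needed (for example, to see which publishers link back to DFID projects), a sketch like the following may be enough. It assumes, as in the query above, that each named graph corresponds to one IATI-Registry package and that iati:extras-data-updated is attached to the graph IRI itself; neither assumption has been checked against the live store here.

```sparql
PREFIX iati: <http://purl.org/collections/iati/>

# One row per source package rather than one row per transaction
SELECT DISTINCT ?graph ?updated WHERE {
  GRAPH ?graph {
    ?transaction iati:provider-org-activity-id ?providerId .
    ?activity iati:activity-transaction ?transaction .
    FILTER(regex(str(?providerId), "GB-1"))
  }
  ?graph iati:extras-data-updated ?updated .
}
ORDER BY ?graph
```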

In terms of the different results you are getting - are you getting many more or fewer results (or just different ones)?

This repository is still in development, but it will have been refreshed within the last month at most.

@tetrahedra

I'm getting a lot fewer hits than I got with XPath on the 17-Apr data cut. I'll run wget and get the data into BaseX to compare.

This has been very useful, thanks.

@KasperBrandt

Hi all,

The IATI data currently in the data store was last crawled on the 22nd of May, so it should be quite recent.

However, I can't be sure that all activities have a proper provider-org-activity-id specified, so that might be causing the fewer hits. edit: only noticed now that this shouldn't matter, since the query is the same as the XPath. It would be interesting to find out which activities are missing, so I can see why they're missing.
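
One possible way to narrow down what is missing is to export the distinct identifier values from the store and diff them against the values the XPath query returns. The sketch below uses only predicates already seen in this thread; the sorted output is just a convenience for a line-by-line comparison.

```sparql
PREFIX iati: <http://purl.org/collections/iati/>

# Distinct provider-activity-id values that mention GB-1, sorted for diffing
SELECT DISTINCT ?providerId WHERE {
  ?transaction iati:provider-org-activity-id ?providerId .
  FILTER(regex(str(?providerId), "GB-1"))
}
ORDER BY ?providerId
```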

Is it possible to query on the activity-id alone, do they all contain GB-1? In that case we could use a simple query like this:

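```sparql
PREFIX iati:<http://purl.org/collections/iati/>

SELECT * WHERE {
  GRAPH ?graph {
    ?activity a iati:activity .
    ?activity iati:activity-id ?id .
    FILTER(regex(str(?id),"GB-1"))
  }
  # Same variable as the GRAPH clause, so the date belongs to that package
  ?graph iati:extras-data-updated ?updated .
}
```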

I'm currently at work, so I'm not sure if this works, but I'll be able to take a more extensive look later this weekend.

@KasperBrandt

Also, we got quite a few errors when uploading triples to the triple store. We only ran the first upload test last Monday, so it's all quite fresh ;).

I'll be fixing some errors this weekend and we'll try to have a fresh dataset in the triple store somewhere next week.
