The Semantic Web (or linked data) makes it possible to publish structured data so that it can be interlinked. A structured document is a set of triples: subject, predicate, object. The subject and the object can themselves be identified by a URL that describes another document. RDF is the data model used to describe web resources. Several serialization formats are in common use, including RDF/XML, N3, N-Triples and JSON-LD.
Many data endpoints can be queried freely with SPARQL, the language used to express such queries. It is the subject of this blog post. I wanted to see what can be requested, and how, from the BnF (the National Library of France) and from Wikidata, with the aim of enriching a bibliographic record in an OPAC, for example.
Below you will find only a very small part of what can be done. This is not a lesson that explains how it all works, but a set of examples to test and extend. That way, you will discover what you can do with it.
If you want to go further, there are plenty of well-documented resources on the web.
I am really interested in learning what you would like to do in your own software. For example, could we use some of this information to enhance an OPAC or an ILS (SIGB) and offer a better experience to borrowers and librarians? I have published a small patch for the Koha community (experimental work).
A basic SPARQL query needs a SELECT clause and a WHERE clause. You can declare PREFIXes to make the query shorter.
You can test the BnF queries on the SPARQL endpoint: http://data.bnf.fr/sparql/
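If you would rather run a query from your own software than from the web form, the request can be built by hand. The Python sketch below is an assumption on my side, not something documented by the BnF: it relies on the standard SPARQL protocol, where the query travels in a "query" URL parameter and the result format is negotiated with the Accept header. The helper name build_sparql_request is hypothetical, and the code only builds the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint taken from this post; other endpoints should behave the same way
# as long as they follow the SPARQL protocol (assumption).
BNF_ENDPOINT = "http://data.bnf.fr/sparql/"

QUERY = """
SELECT DISTINCT ?p ?o
WHERE {
  <http://data.bnf.fr/ark:/12148/cb13173759h#about> ?p ?o .
}
"""


def build_sparql_request(endpoint, query):
    """Build (but do not send) a GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(BNF_ENDPOINT, QUERY)
print(request.full_url)
```

Passing the returned object to urllib.request.urlopen would actually send it; I left that out so the sketch can be read and run without network access.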
Let’s go!
For a given ark, if I want to know the available predicates and their related objects, I can write (remove “?o” from the SELECT to get only the predicates):
SELECT DISTINCT ?p ?o
WHERE {
<http://data.bnf.fr/ark:/12148/cb13173759h#about> ?p ?o .
}
Try it
“http://data.bnf.fr/ark:/12148/cb13173759h#about” describes a Work. Its creator is available through “http://purl.org/dc/terms/creator”.
So, if I want to find the creator of the Work, I write:
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT DISTINCT ?creator
WHERE {
<http://data.bnf.fr/ark:/12148/cb13173759h#about> dcterms:creator ?creator .
}
Try it
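But if I want to find the author of an Expression, I first list the predicates available on it, then query the role predicate that carries the author (here bnfroles:r70):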
SELECT DISTINCT ?p
WHERE {
<http://data.bnf.fr/ark:/12148/cb34336868b#Expression> ?p ?o .
}
Try it
PREFIX bnfroles: <http://data.bnf.fr/vocabulary/roles/>
SELECT DISTINCT ?author
WHERE {
<http://data.bnf.fr/ark:/12148/cb34336868b#Expression> bnfroles:r70 ?author .
}
Try it
You will find more information about the BnF model on the “opendata” page.
The entity type is recorded in “http://www.w3.org/1999/02/22-rdf-syntax-ns#type”, with possible values such as:
- http://rdvocab.info/uri/schema/FRBRentitiesRDA/Work
- http://rdvocab.info/uri/schema/FRBRentitiesRDA/Expression
- http://rdvocab.info/uri/schema/FRBRentitiesRDA/Manifestation etc.
We can then look up information about the author:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX bio: <http://vocab.org/bio/0.1/>
SELECT ?name ?birth ?death ?page ?bio ?country ?image
WHERE {
<http://data.bnf.fr/ark:/12148/cb11888092r#about> foaf:name ?name .
<http://data.bnf.fr/ark:/12148/cb11888092r#about> bio:birth ?birth .
<http://data.bnf.fr/ark:/12148/cb11888092r#about> bio:death ?death .
<http://data.bnf.fr/ark:/12148/cb11888092r#about> foaf:page ?page .
<http://data.bnf.fr/ark:/12148/cb11888092r#about> foaf:depiction ?image .
<http://data.bnf.fr/ark:/12148/cb11888092r#about> <http://rdvocab.info/ElementsGr2/biographicalInformation> ?bio .
<http://data.bnf.fr/ark:/12148/cb11888092r#about> <http://rdvocab.info/ElementsGr2/countryAssociatedWithThePerson> ?country .
}
Try it
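To reuse such a result in an OPAC, the response has to be turned into something simple to display. Here is a minimal Python sketch, assuming the endpoint returns the W3C "SPARQL Query Results JSON" format. The sample response is made up for illustration (placeholder values, not real data from data.bnf.fr), and rows_from_results is a hypothetical helper name.

```python
import json

# Made-up response in the W3C "SPARQL Query Results JSON" shape.
# The values are placeholders, not real data from data.bnf.fr.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["name", "birth", "death"]},
  "results": {"bindings": [
    {"name": {"type": "literal", "value": "Example Author"},
     "birth": {"type": "literal", "value": "1900"}}
  ]}
}
"""


def rows_from_results(payload):
    """Turn SPARQL JSON results into a list of plain dicts.

    A variable that is missing from a binding is mapped to None.
    """
    data = json.loads(payload)
    variables = data["head"]["vars"]
    rows = []
    for binding in data["results"]["bindings"]:
        rows.append({var: binding.get(var, {}).get("value") for var in variables})
    return rows


print(rows_from_results(SAMPLE_RESPONSE))
```

A variable can only be missing from a row when the query uses OPTIONAL patterns. In the query above every pattern is required, so an author lacking one of the properties returns no row at all.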
Have a look at the how-to published by Thomas Francart (written in French). It shows other SPARQL query examples.
I also played with the Wikidata endpoint. “Wikidata is a free and open knowledge base that can be read and edited by both humans and machines. Wikidata acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia, Wikivoyage, Wikisource, and others.”
In Wikidata, predicates are called “properties”. You can find a list of them, or look them up with this browser or this one. Every item (subject or object) is identified by a Q number (Q*).
The query user interface is very usable and gives efficient assistance: http://query.wikidata.org/
Going back to the same example, I start by searching for the Wikidata identifier that matches part of the ark:
SELECT ?wdwork
WHERE {
?wdwork wdt:P268 ?idbnf .
FILTER CONTAINS(?idbnf, "13173759h") .
}
Try it
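In software, the part of the ark to search for has to be extracted automatically. The Python sketch below assumes, based only on the example in this post, that the value stored in P268 is the ark name without its "cb" prefix. Both function names are hypothetical.

```python
import re


def bnf_id_from_ark(ark_url):
    """Extract the identifier that follows 'cb' in a BnF ark URL."""
    match = re.search(r"ark:/12148/cb([0-9a-z]+)", ark_url)
    if match is None:
        raise ValueError("not a BnF ark: %r" % ark_url)
    return match.group(1)


def wikidata_lookup_query(ark_url):
    """Build the Wikidata query that searches P268 for the ark identifier."""
    bnf_id = bnf_id_from_ark(ark_url)
    return (
        "SELECT ?wdwork\n"
        "WHERE {\n"
        "  ?wdwork wdt:P268 ?idbnf .\n"
        '  FILTER CONTAINS(?idbnf, "%s") .\n'
        "}" % bnf_id
    )


print(wikidata_lookup_query("http://data.bnf.fr/ark:/12148/cb13173759h#about"))
```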
I can then find all the predicates and objects for this entity:
SELECT DISTINCT ?p ?o
WHERE {
wd:Q25169 ?p ?o .
}
Try it
or find other information about the author:
SELECT ?wdauthor ?birthdate ?deathdate ?image ?workperiod ?name
WHERE {
wd:Q25169 wdt:P50 ?wdauthor.
OPTIONAL { ?wdauthor wdt:P18 ?image. }
OPTIONAL { ?wdauthor wdt:P2031 ?workperiod. }
OPTIONAL { ?wdauthor wdt:P570 ?deathdate. }
OPTIONAL { ?wdauthor wdt:P1477 ?name. }
OPTIONAL { ?wdauthor wdt:P569 ?birthdate. }
}
Try it
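Each OPTIONAL block keeps the author in the result even when that particular property is missing, which is useful because Wikidata items are unevenly filled in. Since all the blocks follow the same pattern, they can be generated from a list of properties. This is a small Python sketch with a hypothetical helper name, author_details_query; the property numbers are the ones used in the query above.

```python
def author_details_query(work_item, optional_properties):
    """Build a query that fetches a work's author plus optional properties.

    optional_properties maps a variable name to a Wikidata property id.
    """
    variables = " ".join("?" + name for name in optional_properties)
    lines = [
        "SELECT ?wdauthor " + variables,
        "WHERE {",
        "  wd:%s wdt:P50 ?wdauthor ." % work_item,
    ]
    for name, prop in optional_properties.items():
        # One OPTIONAL per property, so a missing value does not drop the row.
        lines.append("  OPTIONAL { ?wdauthor wdt:%s ?%s . }" % (prop, name))
    lines.append("}")
    return "\n".join(lines)


query = author_details_query(
    "Q25169",
    {"image": "P18", "birthdate": "P569", "deathdate": "P570"},
)
print(query)
```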
and his occupations:
SELECT ?occupation ?occupationLabel
WHERE {
wd:Q43361 wdt:P50 ?auteur.
OPTIONAL { ?auteur wdt:P106 ?occupation. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Try it
or information about the Work:
SELECT ?wdwork ?wdworkDescription ?wdauthor ?wdauthorLabel ?countryoriginLabel ?serieLabel ?pubdate ?genreLabel
WHERE
{
?wdwork wdt:P268 ?idbnf .
FILTER CONTAINS(?idbnf, "13173759h") .
?wdwork wdt:P50 ?wdauthor .
OPTIONAL { ?wdwork wdt:P495 ?countryorigin .}
OPTIONAL { ?wdwork wdt:P577 ?pubdate .}
OPTIONAL { ?wdwork wdt:P179 ?serie .}
OPTIONAL { ?wdwork wdt:P136 ?genre .}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],fr". }
}
Try it
To go further:
- Project Books on Wikidata
- Other librarians' queries in a Wikimedia talk
- There are many pages on Wikidata to learn more. You can find general queries, and user pages like this one from Hélène Sarrazin.
There is no shortage of data sets: I could cite the BnF, Wikidata, Geonames, Europeana, etc. Have a look at the LOD Cloud when it comes back online.
Don’t hesitate to share your experiences, wishes, problems building a query, or dreams of software in the comments. I would be glad to think about them 🙂
Enjoy!