At IWMW 2010 last week, a lot of discussion centred on how, in an increasingly austere atmosphere, we can make more use of free stuff. One category of free stuff is linked data. In particular, I was intrigued by the presentation from Thom Bunting (UKOLN) about extracting information from Wikipedia. It has inspired me to start experimenting with data about UK universities.
Let’s get some terminology out of the way. DBpedia is a service that extracts machine-readable data from Wikipedia articles. You can look at, for example, everything DBpedia knows about the University of Bristol. SPARQL is an SQL-like language for querying triples: effectively, all the data is in a single table with three columns (subject, predicate and object). SNORQL is a front-end to DBpedia that allows you to enter SPARQL queries directly. It’s possible to ask SNORQL for “All soccer players, who played as goalkeeper for a club that has a stadium with more than 40.000 seats and who are born in a country with more than 10 million inhabitants” and get results in a variety of machine-readable formats.
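To make the "single table with three columns" idea concrete, here is a toy sketch in Python. The rows are invented for illustration and are not real DBpedia output, and the `match` helper is a hypothetical stand-in for what a SPARQL engine does, not how any real triple store is implemented.

```python
# Toy data: each row is one (subject, predicate, object) triple.
# These values are made up for illustration, not fetched from DBpedia.
TRIPLES = [
    ("University_of_Bristol", "rdf:type", "University"),
    ("University_of_Bristol", "city", "Bristol"),
    ("University_of_Bath", "rdf:type", "University"),
    ("University_of_Bath", "city", "Bath"),
    ("Bristol", "rdf:type", "City"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the rows that fit the pattern; None behaves like a SPARQL variable."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Roughly equivalent to: SELECT ?uni WHERE { ?uni rdf:type University }
universities = [s for (s, _, _) in match(TRIPLES, predicate="rdf:type", obj="University")]
print(universities)
```

A real SPARQL query joins several such patterns on shared variables, but each pattern on its own is just a filter over the three columns.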
Sadly, when you look for ways to use DBpedia data, some of the links are broken, which was initially off-putting. SNORQL is great fun though. SPARQL is something I’m only just learning, but to anyone familiar with SQL and the basics of RDF it’s straightforward.
List the members of the 1994 Group of universities
SELECT ?uni
WHERE {
?uni rdf:type <http://dbpedia.org/ontology/University> .
?uni skos:subject <http://dbpedia.org/resource/Category:1994_Group>
}
ORDER BY ?uni
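If you would rather run a query like this from a script than paste it into SNORQL, a minimal sketch of building the request URL might look like the following. It assumes a public endpoint at `http://dbpedia.org/sparql` that accepts the standard `query` parameter plus a `format` parameter for choosing the result type; both the URL and the `format` parameter are assumptions here, so check the endpoint's own documentation. The `PREFIX` lines are spelled out because SNORQL appears to supply them for you, and `build_query_url` is a hypothetical helper name. The sketch only builds the URL and does not contact the network.

```python
from urllib.parse import urlencode

# Assumed endpoint address; verify before relying on it.
ENDPOINT = "http://dbpedia.org/sparql"

# The 1994 Group query, with the standard rdf: and skos: namespaces declared.
QUERY = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?uni
WHERE {
  ?uni rdf:type <http://dbpedia.org/ontology/University> .
  ?uni skos:subject <http://dbpedia.org/resource/Category:1994_Group>
}
ORDER BY ?uni
"""


def build_query_url(query, endpoint=ENDPOINT,
                    result_format="application/sparql-results+json"):
    """Return a GET URL carrying the URL-encoded query (no request is made)."""
    # "format" is an assumed, endpoint-specific parameter name.
    params = urlencode({"query": query.strip(), "format": result_format})
    return f"{endpoint}?{params}"


url = build_query_url(QUERY)
print(url)
```

The resulting URL can then be fetched with any HTTP client, or pasted into a browser to check that the endpoint answers as expected.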