http://wifo5-03.informatik.uni-mannheim.de/pubby/
Pubby
A Linked Data Frontend for SPARQL Endpoints
Richard Cyganiak
Chris Bizer
Much Semantic Web data lives inside triple stores and can be accessed only by sending SPARQL queries to a SPARQL endpoint. It is hard to
connect information in these stores with other external data sources.
Linked Data is a style of publishing data on the Semantic Web that makes it easy to interlink, discover and consume data on the Semantic
Web. It allows a wide variety of existing RDF browsers (e.g. Disco, Tabulator, OpenLink Browser), RDF crawlers (e.g. SWSE, Swoogle), and
query agents (e.g. SemWeb Client Library, SWIC) to access the data.
Pubby makes it easy to turn a SPARQL endpoint into a Linked Data server. It is implemented as a Java web application.
News
2011-01-26: Pubby 0.3.3 released. This version switches Pubby from using N3 syntax to the (almost identical) Turtle syntax. Configuration files now
use the .ttl extension instead of .n3. The source code was also moved to a GitHub repository.
2011-01-25: Pubby 0.3.2 released. This version fixes a bug in the metadata extension. This bug caused problems generating RDF/XML output.
2011-01-20: Alternative tool released. Epimorphics has released Elda, a Linked Data publishing tool which can be used as an alternative to Pubby.
2010-07-27: Pubby 0.3.1 released. The default metadata template in this version is updated to release v0.5 of the Provenance Vocabulary.
2009-09-26: Pubby 0.3 released. This version adds a metadata extension that, by default, provides provenance information.
2007-10-22: Pubby 0.2 released. This version adds multi-dataset support, has improved content negotiation, adds the conf:datasetURIPattern
Features
Provides a Linked Data interface to local or remote SPARQL protocol servers
Provides dereferenceable URIs by rewriting URIs found in the SPARQL-exposed dataset into the
Pubby server's namespace
Provides a simple HTML interface showing the data available about each resource
Takes care of handling 303 redirects and content negotiation
Compatible with Tomcat and Jetty servlet containers
Includes a metadata extension to add metadata to the provided data
How It Works
Many triple stores and other SPARQL endpoints can be accessed only by SPARQL client applications that use the SPARQL protocol. They
cannot be accessed by the growing variety of Linked Data clients. Pubby is designed to provide a Linked Data interface to those RDF data
sources.
6/23/2016 5:39 PM
In RDF, resources are identified by URIs. The URIs used in most SPARQL datasets are not dereferenceable, meaning they cannot be accessed
by Linked Data clients. An example is the URI tag:dbpedia.org,2007:Berlin .
When setting up a Pubby server for a SPARQL endpoint, you will configure a mapping that translates those URIs to dereferenceable URIs
handled by Pubby. If your server is running at http://myserver.org:8080/pubby/, then the Berlin URI above might be mapped to
http://myserver.org:8080/pubby/Berlin .
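The mapping described above can be sketched as a simple prefix swap. This is a minimal illustration, not Pubby's actual implementation: the function names are hypothetical, the dataset prefix tag:dbpedia.org,2007: is an assumption chosen to fit the Berlin example, and character escaping is ignored.

```python
def to_web_uri(dataset_uri, dataset_base, web_base):
    """Map a dataset URI to a dereferenceable web URI by swapping prefixes."""
    if not dataset_uri.startswith(dataset_base):
        raise ValueError("URI is outside the configured dataset base")
    return web_base + dataset_uri[len(dataset_base):]


def to_dataset_uri(web_uri, dataset_base, web_base):
    """Reverse mapping: recover the original dataset URI from a web URI."""
    if not web_uri.startswith(web_base):
        raise ValueError("URI is not served by this installation")
    return dataset_base + web_uri[len(web_base):]


# Assumed values for illustration only.
DATASET_BASE = "tag:dbpedia.org,2007:"
WEB_BASE = "http://myserver.org:8080/pubby/"

web = to_web_uri("tag:dbpedia.org,2007:Berlin", DATASET_BASE, WEB_BASE)
original = to_dataset_uri(web, DATASET_BASE, WEB_BASE)
```

Here web is http://myserver.org:8080/pubby/Berlin, and the reverse mapping yields the original tag: URI again, which is the URI that would be used when asking the SPARQL endpoint for data.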
Pubby will handle requests to the mapped URIs by connecting to the SPARQL endpoint, asking it for information about the original URI, and
passing back the results to the client. It also handles various details of the HTTP interaction, such as the 303 redirect required by Web
Architecture, and content negotiation between HTML, RDF/XML and Turtle descriptions of the same resource.
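The request handling described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Pubby's code: all function names are hypothetical, the page/ and data/ path segments are an assumed layout for the redirect targets, and the Accept header handling ignores the rule that more specific media ranges take precedence over wildcards.

```python
SUPPORTED = ["text/html", "application/rdf+xml", "text/turtle"]
DEFAULT_TYPE = "text/html"


def describe_query(dataset_uri):
    """SPARQL query asking the endpoint about the original dataset URI."""
    return "DESCRIBE <%s>" % dataset_uri


def parse_accept(header):
    """Parse an Accept header into (media_range, q) pairs."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_range, q))
    return entries


def quality(media_type, entries):
    """Highest q among the ranges that cover media_type (simplified)."""
    main = media_type.split("/")[0]
    best = 0.0
    for media_range, q in entries:
        if media_range in (media_type, main + "/*", "*/*"):
            best = max(best, q)
    return best


def negotiate(accept_header):
    """Pick the preferred supported type; fall back to HTML."""
    entries = parse_accept(accept_header or "")
    best_type, best_q = DEFAULT_TYPE, 0.0
    for media_type in SUPPORTED:
        q = quality(media_type, entries)
        if q > best_q:  # ties keep the earlier entry in SUPPORTED
            best_type, best_q = media_type, q
    return best_type


def see_other(resource_uri, web_base, accept_header):
    """Return (status, Location) for the 303 redirect to a document URI."""
    local = resource_uri[len(web_base):]
    # 'page/' and 'data/' are assumed path segments for this sketch.
    kind = "page" if negotiate(accept_header) == "text/html" else "data"
    return 303, web_base + kind + "/" + local
```

For example, a client sending Accept: text/turtle for http://myserver.org:8080/pubby/Berlin would, in this sketch, be redirected to a data/ document, while a browser asking for HTML would be sent to a page/ document.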
webapp directory into the servlet container's webapps folder. If Pubby is the only web application you want to run in the container, then
rename the webapp directory to root. Otherwise, rename it to something like mydataset.
to http://myserver/mydataset/.
4. Modify the configuration file to suit your needs. It is located within Pubby's webapp directory, at /WEB-INF/config.ttl. See the next section for a list
of supported configuration directives.
Configuration
The Pubby configuration file uses Turtle syntax. It typically starts with some boilerplate prefix declarations, followed by a server configuration
section, and one or more dataset configuration sections:
<> a conf:Configuration;
    conf:option1 value1;
    conf:option2 value2;
    (...)
    conf:dataset [
        conf:option1 value1;
        conf:option2 value2;
    ];
    .
conf:projectHomepage <project_homepage_url.html>;
A project homepage or similar URL, for linking in page titles.
conf:webBase <server_base_uri>;
Required. The root URL where the Pubby web application is installed, e.g. http://myserver/mydataset/ .
dc:title, foaf:name .
dc:description .
conf:usePrefixesFrom <file.rdf>;
Links to an RDF document whose prefix declarations will be used in output. Defaults to the empty URL, which means the prefixes from the configuration file
will be used.
conf:defaultLanguage "en";
If labels and comments in multiple languages are present (using different language tags on RDF literals), then this language will be preferred. Defaults to "en" .
conf:indexResource <dataset_uri>;
The URI of a resource whose description will be displayed as the home page of the Pubby installation. Note that you have to specify a dataset URI, not a
mapped web URI.
conf:dataset [ ... ];
Required. Introduces a dataset configuration section. There can be one or more dataset sections.
conf:sparqlDefaultGraph <sparql_default_graph_name>;
If the data of interest is not located in the SPARQL dataset's default graph, but within a named graph, then its name must be specified here.
conf:datasetBase <dataset_uri_prefix>;
Required. The common URI prefix of the resource identifiers in the SPARQL dataset; only resources with this prefix will be mapped and made available by
Pubby.
conf:datasetBase <http://example.org/>;
conf:datasetURIPattern "(users|documents)/.*";
This example configuration will publish the dataset URI http://example.org/users/alice , but not
http://example.org/invoices/5395842 , because the URI part invoices/5395842 does not match the pattern.
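The effect of combining conf:datasetBase with conf:datasetURIPattern can be sketched as below. This is a hedged illustration: the function name is hypothetical, and it assumes the regular expression has to match the entire part of the URI that follows the dataset base.

```python
import re


def is_published(uri, dataset_base, uri_pattern=None):
    """Return True if a dataset URI would be mapped under these assumptions."""
    if not uri.startswith(dataset_base):
        return False  # outside the common URI prefix
    if uri_pattern is None:
        return True   # no pattern configured: every URI under the base qualifies
    local_part = uri[len(dataset_base):]
    # Assumption: the pattern must match the whole local part.
    return re.fullmatch(uri_pattern, local_part) is not None


BASE = "http://example.org/"
PATTERN = "(users|documents)/.*"

alice = is_published("http://example.org/users/alice", BASE, PATTERN)
invoice = is_published("http://example.org/invoices/5395842", BASE, PATTERN)
```

With the example values, alice is True and invoice is False, matching the behaviour described for the example configuration.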
conf:addSameAsStatements "true"/"false";
If set to "true" , an owl:sameAs statement of the form
The statements in the conf:rdfDocumentMetadata block will be added as document metadata to the RDF documents published for this dataset.
This feature can be used, for instance, to add licensing information to your published documents.
conf:rdfDocumentMetadata [
    dc:publisher <http://richard.cyganiak.de/foaf.rdf#cygri>;
];
conf:metadataTemplate "metadata.ttl";
Refers to a metadata template that is used by the metadata extension. This file is expected in the directory ./WEB-INF/templates/ .
conf:webResourcePrefix "uri_prefix/";
If present, this string will be prefixed to the mapped web URIs. This is useful if you have to avoid potential name clashes with URIs already used by the server
itself. For example, if the dataset includes a URI http://mydataset/page , and the conf:datasetBase is http://mydataset/, then there is a clash after
mapping because Pubby reserves the mapped URI http://myserver/mydataset/page for its own use. In this case, you may specify a prefix like
"resource/" .
conf:fixUnescapedCharacters "abc";
(Only needed if you have problems with funny characters in the URIs when running Pubby behind an Apache proxy)
conf:redirectRDFRequestsToEndpoint "true"/"false";
Instead of serving RDF documents, Pubby will redirect requests for RDF to DESCRIBE queries on the SPARQL endpoint, and will itself only be
serving HTML descriptions of resources. All features that affect the RDF output will have no effect, e.g. URI rewriting and adding of owl:sameAs statements won't
work. This is useful to improve performance in cases where the SPARQL dataset has been designed with Pubby publication in mind.
Limitations
Only works for SPARQL endpoints that can answer DESCRIBE queries
Multiple dataset support may not work as expected: If a requested URI is matched by the conf:datasetURIPattern of more than one dataset (or one
doesn't have a conf:datasetURIPattern), then only one of the possible endpoints will be queried at a time. Pubby will never try to query multiple
endpoints in order to create a single response. In most cases, it is recommended to simply set up a separate Pubby instance for each dataset.
Hash URIs on the web side are not supported.
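The multi-dataset limitation listed above can be illustrated with a small sketch. The class and function names are hypothetical, the endpoint URLs are made up, and picking the first matching dataset in configuration order is an assumption made for illustration; the text only says that one of the possible endpoints is queried.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Dataset:
    endpoint: str                      # SPARQL endpoint URL (made-up values below)
    dataset_base: str                  # common URI prefix
    uri_pattern: Optional[str] = None  # optional regex for the part after the base

    def matches(self, uri: str) -> bool:
        if not uri.startswith(self.dataset_base):
            return False
        if self.uri_pattern is None:
            return True
        local_part = uri[len(self.dataset_base):]
        return re.fullmatch(self.uri_pattern, local_part) is not None


def select_endpoint(uri: str, datasets: List[Dataset]) -> Optional[str]:
    """Return a single endpoint for the URI; results are never merged."""
    for dataset in datasets:  # assumption: first match in configuration order
        if dataset.matches(uri):
            return dataset.endpoint
    return None


DATASETS = [
    Dataset("http://a.example/sparql", "http://example.org/", "(users|documents)/.*"),
    Dataset("http://b.example/sparql", "http://example.org/"),
]

chosen = select_endpoint("http://example.org/users/alice", DATASETS)
```

Both datasets match the alice URI here, yet only one endpoint is returned, so any data about that resource held in the other endpoint would be missing from the response. This is why a separate instance per dataset is the safer setup.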
Acknowledgements
This project has received contributions from Olaf Hartig and Boris Villazón-Terrazas.