Introduction
The SWAD-E portal is a prototype implementation of a semantic web portal
which supports one demonstration portal service - the Semantic Web
Environmental Directory. The design is intended to be quite flexible
and easy to customize. This document outlines the different ways of approaching
this customization. Note that this is prototype software built as a demonstration
so there are likely to be weaknesses and limitations to the current implementation
that will be discovered during use. A similar warning applies to this
documentation - it is a guide to the state of the software at the time
of writing, not a detailed developer's manual.
Level of customization
There are many different levels of customization possible. We'll first outline
the different customization points and then look at some in more detail.
- DataSource definition
- The simplest customization is done through a single RDF configuration
file which defines one or more DataSources. A single portal
web application can provide an interface onto multiple DataSources. Each
DataSource can in turn be built from multiple data files and ontologies
and can harvest information from multiple external sources of data. The DataSource
properties in this configuration file define what ontologies and instance
data to use, whether and how to use a database for storage, what search
facets to use and what display templates to use for different displayable
objects.
- Styles
-
The existing templates use style sheet information to define the look and feel
of the web pages. The DataSource configuration specifies what style sheet
the templates should link to. By configuring a new style sheet and/or editing
the existing one you can change the basic graphical look and feel of the
existing templates.
- Templates
- All of the portal display pages are generated from templates using
the Jakarta Velocity
template engine. All of the browsing, navigation and viewing capabilities
of the portal can be modified by changing these templates. These templates
in turn make use of macros and included subtemplates to make them more
modular and easier to adapt.
- Extensions
- Any of the Java code that implements the portal functionality can
be modified. A few extension points have been built in to make some
such extensions easier. In particular it should be possible to define
new sorts of Facet, to provide alternative store implementations and
to add additional Java functions to help with rendering.
- Web.xml
-
Like most Java web applications the portal includes a
web.xml
configuration file. The primary configuration that might be necessary here
is defining the security constraint that controls access to the administration
servlets. It is also possible to change where the log4j package picks up
its initialization file by changing the log4j-init-file
parameter value on the initServlet.
- Logging
- Portal actions are logged using the Jakarta log4j
package. The configuration file which defines how log4j operates is
in the webapp at path
WEB-INF/config/log4j.properties though
this can be changed (see above). By modifying the log4j.properties
file you can increase or decrease the level of log messages for different
parts of the portal and can change where log messages are sent. Note
that there is one service provided by the initServlet which is relevant
here. It is convenient to be able to specify log files relative to the
current webapp. The initServlet helps with this by modifying every log
file appender whose name starts with WEBINF to prepend the directory
path {context}/WEB-INF/logs to the specified log file name.
- A note on URIs
-
In several places in configuring the portal (especially in the RDF
configuration file) we need to be able to refer
to files which are part of the portal installation. We achieve this by
using a pseudo URI scheme called
portal:. Thus the URI
portal://data/swed/swedClosure.rules refers to the rules file
in the data/swed directory of the enclosing web application.
This will work irrespective of what the web application is called or where
the enclosing servlet container is located.
In the case of template files there is one more pseudo URI scheme supported.
A URI of the form template:someInlineText specifies a template
inline (in much the same way that a javascript: URI specifies
an inline javascript call).
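As a rough illustration of the two pseudo schemes, the following Java sketch resolves a portal: URI against a web application root directory and extracts the inline text of a template: URI. The class PortalUriSketch, its method names and the example root directory are hypothetical and are not taken from the portal code base; the portal's own resolution logic may differ.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of the portal code base.
final class PortalUriSketch {
    static final String PORTAL_PREFIX = "portal://";
    static final String TEMPLATE_PREFIX = "template:";

    /** Maps portal://a/b to {webappRoot}/a/b; any other URI is rejected. */
    static Path resolve(String uri, String webappRoot) {
        if (!uri.startsWith(PORTAL_PREFIX)) {
            throw new IllegalArgumentException("Not a portal: URI: " + uri);
        }
        String relative = uri.substring(PORTAL_PREFIX.length());
        // Split on '/' so the path is rebuilt with the platform separator.
        return Paths.get(webappRoot, relative.split("/"));
    }

    /** Returns the inline text of a template: URI, or null if it is not one. */
    static String inlineTemplate(String uri) {
        return uri.startsWith(TEMPLATE_PREFIX)
                ? uri.substring(TEMPLATE_PREFIX.length())
                : null;
    }
}
```

Because the path is always built relative to the supplied root, the same configuration file keeps working when the web application is renamed or moved.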
DataSource configuration
The main configuration file which defines the DataSources to be viewed by the portal is
WEB-INF/config/sources.n3 which is an RDF file in
N3 syntax.
This filename is hardwired into the system; if it needs to be changed, modify
the CONFIG_FILE constant defined in GlobalSources.java.
The properties that may be used to describe a DataSource in this file are
defined using an RDFS vocabulary portal-config-vocab.n3.
We shall use the prefix pcv to refer to this vocabulary for convenience.
Basic properties
Each DataSource should be defined as a resource of type pcv:DataSource
with a number of defining properties, some of which in turn might be structured values
described using bNodes.
Each DataSource needs an encoding string which will be used to identify it in
HTTP requests to the portal. It also needs an integer "order number" which is used to
sort the DataSources into order. The lowest numbered DataSource is treated as the
default. It is also convenient to give a DataSource a descriptive label, so the start
of a configuration file might look something like:
[] rdf:type pcv:DataSource ;
rdfs:label "Who's who in the environment" ;
pcv:encoding "wwite";
pcv:order "10"^^xsd:integer ;
dc:description "Environment example based on: Who's who in the environment" ;
In addition a useful DataSource also needs some data. The data is broken
into two groups - ontologies and instance data. These are not necessarily
clearly separated but the files treated as ontologies will be loaded into
memory and will be consulted when following the hierarchical relationships
between concepts in facets. The instance data can be loaded into memory
or found in a database. All searches and queries will check the union of
the ontology and the instance data. Thus a simple DataSource with one ontology
and two data files would also have config entries such as:
pcv:sourceURL <portal://data/swed/wwite_tidy.n3> ;
pcv:sourceURL <portal://data/swed/wwite_project.n3> ;
pcv:ontologySourceURL <portal://data/swed/organisation_v1.2.owl> ;
If the instance data should be stored in a database then the database can be specified
using the properties:
pcv:databaseURI "jdbc:mysql://localhost/swed?autoReconnect=true" ;
pcv:databaseModelName "SwedMain" ;
pcv:databaseUser "user" ;
pcv:databasePassword "password" ;
pcv:databaseType "mysql";
pcv:databaseDriver "com.mysql.jdbc.Driver" ;
If there is a database specified then the pcv:sourceURL entries
will be ignored by the portal. They are, however, still useful. The portal code includes
a Java application tools.DBInit which will preload the database specified
in the config file with the data from the files specified by the sourceURL entries.
There is an entry in the ant script to run this tool for initializing a database.
The final basic (and required) property pcv:styleSheet
is used to define the CSS style sheet
that should be used by the viewing templates.
Templates can be written with fixed style sheet
references but the built-in templates use the scripting support
to link to the style sheet specified here. The value of the styleSheet
property can either be an absolute URL or a string literal (in which case
it is taken to be the name of a file in the webapp's styles
directory). A typical declaration is:
pcv:styleSheet "swed_style.css" ;
Defining facets
Now that the portal data is specified, the second major area of configuration
is defining which facets to use to navigate the data. The default portal
templates provide a faceted browse interface where the facets are specified
as part of the DataSource configuration. Facets are specified using the
pcv:facet property to point to a resource specifying an
instance of the class pcv:Facet. There are currently
two concrete subclasses of pcv:Facet supported -
pcv:HierarchicalFacet (indexed resources are classified
by concepts which are arranged in some hierarchical structure) and
pcv:AlphaRangeFacet (resources are grouped by the first letter
of a literal-valued property). A facet declared as just a pcv:Facet
will be treated as a flat facet with a simple set of allowed
literal or resource values. Other facet types could be defined by
extending the appropriate Java classes, in which case new
subclasses of pcv:Facet should be added to the portal configuration
vocabulary.
Several attributes are needed to define a facet:
pcv:linkProp
-
This specifies the RDF property which links an indexed resource to
a classification value. The facet organizes the resources according
to the value of this property. If the classification values are subject
to some form of inheritance then the closure rules (see below) should
be used to add the inherited values.
pcv:linkUpdateProp
-
This specifies the RDF property which links an indexed resource to
a classification value, without any inheritance processing. This is
the property that will be updated if a change is made to an object's
classification. It need not be different from the value of
pcv:linkProp but in the case of hierarchical facets it is
often useful to have two different properties - one for the direct
classification and one for the closure of the classification.
pcv:facetBase, pcv:narrowP and pcv:widenP
- For hierarchical facets these properties define the concept hierarchy.
The
pcv:facetBase defines the top level concept which is
not normally explicitly displayed. The other two properties specify
how to move from a concept to the concepts which are more specific
(pcv:narrowP) or more general (pcv:widenP). Only one of them needs to be
defined; either will do.
pcv:valueClass, pcv:valueList
- For flat facets (the default if no more specific type is specified)
the list of possible facet values is, by default, taken from the set of values
currently used in the data set. For a big dataset this may be expensive, and it will miss
values which are not currently used but that you'd like the user to be aware of.
To cater for this it is possible to specify a fixed set of legal values for the facet.
This can be done either by providing an RDF list of values using
pcv:valueList
or by providing a class, using pcv:valueClass, all of whose instances
are legal values for the facet.
pcv:order
-
An integer ordering for the facet. Facets will be sorted into a fixed
sequence for display in ascending value of the order property.
rdfs:label
-
Gives the name the facet will be known by in the interface.
rdfs:comment
-
Gives a description of the nature of the facet which will be included
at the top of the "what's in this facet" page.
For example two facets from the SWED application are shown below:
pcv:facet [
a pcv:AlphaRangeFacet;
rdfs:label "Name" ;
rdfs:comment "This facet groups entries under the first letter of their names. There is no tree of categories to display." ;
pcv:linkProp swed:has_primary_prorg_name ;
pcv:linkUpdateProp swed:has_primary_prorg_name ;
pcv:order "9"^^xsd:integer;
];
pcv:facet [
a pcv:HierarchicalFacet;
rdfs:label "Topic of interest" ;
rdfs:comment "This facet classifies entries according to environmental topic that the organisation are interested in." ;
pcv:linkProp swed:has_topic_cl ;
pcv:linkUpdateProp swed:has_topic ;
pcv:widenP skos:broader ;
pcv:facetBase swed_toi:topics_of_interest ;
pcv:order "1"^^xsd:integer;
];
The first (which comes last in the displayed ordering) defines a facet which
indexes organizations by the first letter of their primary name. The second
indexes organizations by the topics they are interested in. The topics are
specified by a concept thesaurus defined using the skos
vocabulary.
Linking to templates
As already mentioned, we use the Velocity template engine
to build the web views from the RDF data. The templates which are used
for this are mostly defined in this configuration file, which makes it possible
for each DataSource to use different templates for its actions.
There are two primary top level templates, the "view" template which is used
to display a set of filtered search results (including the top level page
which corresponds to a search using an empty filter) and the "pageview" template
which is used to display a single resource which has been selected from
the match list. These are specified using pcv:viewTemplate and
pcv:pageviewTemplate respectively.
Calls made to the portal controller servlet (context/servlet/Entry?action=foo)
whose action parameter is not known are translated into calls
to the template {templateBase}/fooAction.vm. The value of
{templateBase} can be set using the configuration property
pcv:templateBase but defaults to the templates
directory of the webapp.
The default view and pageview templates show the RDF descriptions of
the matching resources using the embedded template mechanism discussed
in the structure documentation. Recall
that the embedded template to use is determined by a "context" parameter
(an arbitrary string) and the type of the object to be displayed. The
default view template displays the short form of matching results using
a template context "default". The default pageview template displays the
full form of an individual resource using template context "page". It
will switch to the alternative context "pageRaw" if the view raw
menu option is selected.
The mapping from [template context] plus [type of the displayed resource] to
the [template to be used] is described in the configuration file using the
pcv:template property to point to instances of the
pcv:Template class. The template definition requires a context
string and a path to the template, and accepts an optional rdfs:Class for which the template
should be used and an optional weight. The weight is only relevant when
a resource has multiple classes to which different templates might apply.
For example the following:
pcv:template [a pcv:Template;
pcv:templateContext "page" ;
pcv:templateClass swed:prorg ;
pcv:templatePath <portal://templates/prorgPageview.vm> ;
];
pcv:template [a pcv:Template;
pcv:templateContext "page" ;
pcv:templatePath <portal://templates/pageDefault.vm> ;
];
specifies that the prorgPageview.vm template should be used
to display resources of type swed:prorg when in full page mode.
All other types of resource will be displayed using the pageDefault.vm
template (which is a vanilla table view).
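The lookup from template context and resource type to a template path can be sketched in Java as follows. This is only an illustration under stated assumptions: the class TemplateLookupSketch and its members are hypothetical, and the tie-breaking policy (class-specific templates beat catch-all templates, and a larger weight wins among class-specific candidates) is an assumption, since the way the portal interprets the weight is not described here.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from the portal code base.
final class TemplateLookupSketch {

    /** One pcv:template entry. templateClass is null for a catch-all template. */
    static final class Entry {
        final String context;
        final String templateClass;
        final String path;
        final int weight;

        Entry(String context, String templateClass, String path, int weight) {
            this.context = context;
            this.templateClass = templateClass;
            this.path = path;
            this.weight = weight;
        }
    }

    /** Returns the path of the chosen template, or null if nothing applies. */
    static String choose(List<Entry> entries, String context, Set<String> resourceTypes) {
        Entry best = null;
        for (Entry e : entries) {
            if (!e.context.equals(context)) {
                continue; // declared for a different template context
            }
            boolean catchAll = e.templateClass == null;
            if (!catchAll && !resourceTypes.contains(e.templateClass)) {
                continue; // class-specific template for some other class
            }
            if (best == null || rank(e) > rank(best)) {
                best = e;
            }
        }
        return best == null ? null : best.path;
    }

    // Assumed policy: catch-all templates rank below every class-specific one;
    // among class-specific templates the larger weight wins.
    private static long rank(Entry e) {
        return e.templateClass == null ? Long.MIN_VALUE : (long) e.weight;
    }
}
```

With the two declarations above loaded as entries, a resource typed swed:prorg in context "page" would select prorgPageview.vm, and any other resource type would fall back to pageDefault.vm.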
Harvester controls
The operation of the harvester, which regularly scans a set of known
sites for updated data, can also be adjusted here. The relevant
properties are:
pcv:harvesterInterval
-
The time in seconds between successive harvester scans.
pcv:harvesterPoolSize
-
The number of concurrent threads the harvester should use
while polling sites for updates.
pcv:harvesterBootstrap
-
An RDF file which should be preloaded into the harvester
database to define the location and status of sites which
should be scanned. This can include references to
files which are marked as "trusted" and in turn contain
rdfs:seeAlso links to other sources.
pcv:harvesterModelName
-
This is the name of the database model which the harvester
should use to keep its data in. This assumes that the DataSource
has been configured to use a database. The harvester can use
a distinct persistent Jena Model from the data model but
it currently always uses the same physical database.
Inference controls
When data files are loaded they can be processed using a set of inference rules.
These rules can be used to implement relevant parts of RDFS and OWL processing,
to compute the transitive closure of thesaurus properties and to normalize data
in application-specific ways. The inference rules are specified using the configuration
property pcv:closureRulesURL which should point to a text file
containing rules in Jena's
rule syntax.
These rules will be used to process each set of instance data as it is
added to the portal. They will be run in a context where the ontology data
is also visible to the rule set. For a memory DataSource this means the rules
will process each source file as it is loaded. For a database DataSource the
rules will have been applied at the time the dbinit operation was
done and the forward deductions will have been stored in the database. No additional
application of the closure rules is done when the database is opened. In either case
when new data is incrementally loaded by the harvester the closure rules will be
run on the new data.
Note there is a problem with this scheme. It is not possible
to use backward chaining (or hybrid) rules on data in the database because the
database initialization only preserves the forward deductive closure. There are some
hooks in the form of a separate pcv:transformRulesURL designed to
overcome this but they are not yet complete and this property should be avoided
for now.
In all cases a default "smushing" operation is also performed which
will map together nodes which are known to refer to the
same entity by virtue of any properties of type owl:InverseFunctionalProperty
declared in the ontology files. No explicit entry in the rules file is needed to
enable this functionality; it is built into the portal.
As an example here is the part of the SWED rule file which calculates
the closure of the "topics of interest" of the organizations to implement
the hierarchical facet defined in the facet section above:
(?P swed:has_topic ?T) -> (?P swed:has_topic_CL ?T) .
(?P swed:has_topic_CL ?T) (?T skos:broader ?B)
-> (?P swed:has_topic_CL ?B) .
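To make the effect of these two rules concrete, the following Java sketch computes the same closure for a single resource using plain collections rather than the Jena rule engine. The class TopicClosureSketch and the topic names used with it are hypothetical; this only illustrates the forward-chaining idea and is not how the portal evaluates rules.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of what the two closure rules derive for one resource.
final class TopicClosureSketch {

    /**
     * directTopics: the topics asserted directly for a resource.
     * broader: maps a topic to its immediately broader topics (skos:broader).
     * Returns every topic the closure property would hold for the resource.
     */
    static Set<String> closure(Set<String> directTopics, Map<String, Set<String>> broader) {
        // First rule: every direct topic is also a closure topic.
        Set<String> closed = new LinkedHashSet<>(directTopics);
        Deque<String> pending = new ArrayDeque<>(directTopics);
        // Second rule, applied until no new facts appear: if T is a closure
        // topic and T has broader topic B, then B is a closure topic too.
        while (!pending.isEmpty()) {
            String topic = pending.pop();
            for (String b : broader.getOrDefault(topic, Collections.<String>emptySet())) {
                if (closed.add(b)) {
                    pending.push(b);
                }
            }
        }
        return closed;
    }
}
```

For example, a resource tagged only with a leaf topic ends up indexed under that topic and under each of its ancestors, which is what lets the hierarchical facet count it at every level.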
Subsource definitions
In some applications it can be useful to change the interface (look and feel,
page layout, search facets) according to the current set of facet selections.
This is supported through the notion of a sub source.
A sub source is defined
like a data source but is linked to the real datasource through a pcv:subSourceSpec
which also defines a set of conditions under which the sub source should be used.
The sub source can define alternative style sheets, templates and template
base settings which override or add to the top level style and template settings.
The sub source can also define facets which should be used and these replace the
top level settings (to allow facets to be dropped) so that all facets in the sub
source must be declared even if they are the same as the top source.
All of the facets which might be used in subsources should be known to the
top level DataSource - to make this possible while still allowing subsources
to introduce new facets into the UI it is possible for the top level DataSource
to define hidden facets using the property pcv:hiddenFacet.
All other data source details such as the data and harvester are inherited from
the parent data source.
The SubSourceSpec also defines a set of facet/value pairs which form the
conditions under which the sub source will be used. Only if all of the facets
given in the condition have exactly the given filter values will the sub source
be triggered. The facets are referred to as RDF resources, so whereas the earlier examples
used bNodes for facet definitions, when defining sub sources it becomes
necessary to give internal URIs for the facet definitions.
As an example suppose that we define a set of facets for some portal:
eg:sexFacet a pcv:Facet;
rdfs:label "Sex" ;
pcv:linkProp eg:sex;
pcv:order "3"^^xsd:integer .
eg:nameFacet a pcv:AlphaRangeFacet; ... .
eg:speciesFacet a pcv:HierarchicalFacet; ... .
eg:statusFacet a pcv:Facet; ... .
eg:colourFacet a pcv:Facet; ... .
Then suppose that the main DataSource defines data files, templates and so forth as normal
but includes the following facet declarations:
[] rdf:type pcv:DataSource ;
...
pcv:facet eg:sexFacet ;
pcv:facet eg:nameFacet ;
pcv:facet eg:speciesFacet ;
pcv:facet eg:statusFacet ;
pcv:hiddenFacet eg:colourFacet ;
In that case the top level interface will only show the first four facets
but predeclares the colourFacet so that it can be used in a sub source. We
can then declare a sub source by:
[] rdf:type pcv:DataSource ;
...
pcv:subSourceSpec [
pcv:subSource [
pcv:facet eg:sexFacet ;
pcv:facet eg:nameFacet ;
pcv:facet eg:speciesFacet ;
pcv:facet eg:statusFacet ;
pcv:facet eg:colourFacet ;
pcv:styleSheet "site-pet.css" ;
pcv:viewTemplate ;
] ;
pcv:condition [
pcv:conditionFacet eg:statusFacet;
pcv:conditionValue eg:pet;
]
] .
This declares that if the statusFacet is selected to have the value eg:pet
then the sub source configuration will be used which changes the style sheet,
the format of the normal viewing template and adds in the previously hidden
colour facet. If several conditions are specified in the subSourceSpec then
all of the conditions must be true for the sub source configuration to be triggered.
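The "all conditions must match" rule can be sketched in Java as below. The class SubSourceConditionSketch is hypothetical, and it simplifies a facet selection to a single string value per facet; the portal's own FilterState and FacetState objects are richer than this.

```java
import java.util.Map;

// Hypothetical sketch of sub source triggering; not the portal's implementation.
final class SubSourceConditionSketch {

    /**
     * conditions: facet URI -> value required by a pcv:condition entry.
     * selection:  facet URI -> value currently selected in the filter.
     * Returns true only if every condition is met exactly.
     */
    static boolean triggers(Map<String, String> conditions, Map<String, String> selection) {
        for (Map.Entry<String, String> condition : conditions.entrySet()) {
            String selected = selection.get(condition.getKey());
            if (!condition.getValue().equals(selected)) {
                return false; // facet unselected or set to a different value
            }
        }
        return true;
    }
}
```

In this sketch a sub source with no conditions would always trigger; whether the portal treats that case the same way is not stated here.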
Miscellaneous properties
There are a few other configuration properties which don't fall into
any of the above groups.
pcv:filterOnType
- If this property is defined it gives the rdfs:Class of resources
that should be shown by the portal views. If resources are found in
a search which don't match one of these filter types they will not appear.
For example, in the SWED demonstration a text search might find a vcard:Address
bNode as well as the organisation which has this address. We avoid showing
the bNode in the match list by setting this filter property appropriately.
pcv:baseRelationProperty
- The portal has a notion of "relational links" between displayable
resources. Of course, any RDF property between two RDF resources is
a form of relation. However, it is useful to be able to distinguish
a special subset of such properties. For example, in the SWED demonstrator
we define a group of relationships between projects and organisations
such as "department_of". This notion is used in the display templates
to show relationships in a defined part of the display page and in the
graph visualization support to define the links which should be shown
on the graph. The
pcv:baseRelationProperty indicates a
property which forms the root of a subproperty tree whose members should
all be considered as "relational" properties in this sense.
pcv:primaryProperty
-
When a resource is asked for its "primarySource" then
the answer is found by looking for the value of a "primary property"
and tracking back the provenance of that value. The RDF
property which should be treated as "primary" for this purpose
is the one indicated by this configuration entry.
pcv:preferredLanguage
-
Defines the preferred language to use for labels. The value should be a string in
RFC 3066 format (e.g. "en" or "en-US"). If the facet names, values or object names
have multiple labels then the one with the language tag closest to the value defined
here will be used. It is possible to write templates which change their display
depending on the language setting by using the
getPropertyValueLang method in place of
getPropertyValue when getting text values of literals.
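One plausible reading of "closest language tag" is the lookup scheme of RFC 4647, which the Java standard library exposes through Locale.lookupTag. The sketch below uses that as a stand-in; it is an assumption that the portal's matching behaves similarly, and the class PreferredLanguageSketch is hypothetical.

```java
import java.util.Collection;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: picks a label language using RFC 4647 lookup,
// which may differ from the matching the portal actually performs.
final class PreferredLanguageSketch {

    /**
     * preferred: the configured pcv:preferredLanguage value, e.g. "en-US".
     * availableTags: language tags found on the candidate labels.
     * Returns the best matching tag, or null if none matches.
     */
    static String pick(String preferred, Collection<String> availableTags) {
        List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(preferred);
        // Lookup progressively truncates the range ("en-US", then "en").
        return Locale.lookupTag(ranges, availableTags);
    }
}
```

A caller would typically fall back to an untagged label when pick returns null.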
Styles
The default templates link to the style sheet specified by the
pcv:styleSheet property to provide all of the CSS-based styling
of the content.
The typical pattern used in the default templates is to use an enclosing
div tag with an appropriate ID or CLASS to define a region of the
display and use normal tags A/P/etc within the div. The style just selects
on the sequence of tags to style each tag in context.
The easiest way to generate a new style sheet is to start from the
SWED demonstration sheet (swed_style.css) and copy and edit that.
As we gain more experience with use of the portal code base
the viewing templates and associated style sheets may become more streamlined
at which point some more detailed documentation might be added here.
Data export
The portal includes a Joseki servlet
to enable RDF clients to remotely access both the stored data and the harvester metadata.
The web.xml file defines the binding between the Joseki servlet and a web address.
In the SWED instance it is bound to <host>/swed/rdf/*. It also
defines where the Joseki configuration file may be found; in the case of SWED this defaults
to WEB-INF/config/joseki.n3.
The Joseki configuration can be set up as normal. The only additional feature is a
special source controller that can be used to directly export the live data or harvester
metadata from a given portal data source. To use this you must first define
a binding for this controller in the joseki.n3 file:
joseki:portalSource
a joseki:SourceController ;
module:interface joseki:SourceController ;
module:implementation
[ module:className
"com.hp.hpl.swade.portal.joseki.SourceControllerPortal" ] ;
rdfs:comment "A portal-specific controller that just recognizes two model names data and harvester" .
This source can then be used to define a Joseki data source, for example:
<http://server/harvester>
a joseki:AttachedModel ;
joseki:sourceController joseki:portalSource ;
joseki:attachedModel <portal:harvester> ;
joseki:hasQueryOperation joseki:BindingGET ;
joseki:hasQueryOperation joseki:BindingFetchClosure ;
joseki:hasQueryOperation joseki:BindingRDQL ;
joseki:hasOperation joseki:BindingQueryModel ;
joseki:isImmutable "true" ;
rdfs:comment "Harvester metadata" ;
.
The SourceControllerPortal treats the URI for the attachedModel in a special way.
The attachedModel should be of the form portal:source/model where
source is the encoding name of the data source to be accessed
(it can be omitted, along with the /, to indicate the default source) and
model is one of data (the data stored in the portal),
fulldata (data plus all ontologies plus configuration data) or harvester (the
harvester metadata).
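The shape of these attachedModel URIs can be illustrated with a small Java parser. The class AttachedModelUriSketch is hypothetical and is not the logic inside SourceControllerPortal; it only mirrors the form described above.

```java
// Hypothetical sketch of parsing portal:source/model URIs.
final class AttachedModelUriSketch {
    static final String PREFIX = "portal:";

    /**
     * Returns a two-element array {source, model}. The source element is
     * null when the URI omits it, meaning the default data source.
     */
    static String[] parse(String uri) {
        if (!uri.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a portal: URI: " + uri);
        }
        String rest = uri.substring(PREFIX.length());
        int slash = rest.indexOf('/');
        String source = slash < 0 ? null : rest.substring(0, slash);
        String model = slash < 0 ? rest : rest.substring(slash + 1);
        // Only the three model names listed in the text are accepted.
        if (!model.equals("data") && !model.equals("fulldata") && !model.equals("harvester")) {
            throw new IllegalArgumentException("Unknown model name: " + model);
        }
        return new String[] { source, model };
    }
}
```

Under this reading, portal:harvester selects the harvester metadata of the default source, and portal:wwite/data selects the stored data of the source whose encoding is "wwite".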
Templates
Virtually all of the portal look and feel and behaviour can be
adjusted by writing appropriate viewing templates.
In order to write new viewing templates you need to understand the
Velocity scripting language (see http://jakarta.apache.org/velocity/
for documentation) and the methods available via the context parameter data objects (introduced
in the structure documentation and
described in more detail in the appendix
and the javadoc).
The templates are structured so that applications can change the look
and feel of the interface without necessarily having to rewrite large
parts of the template from scratch. You need to write new templates from
scratch if you are building a custom view of some object type or if you
are adding some new actions to the portal. We'll outline the structure
of the default interface templates and the customization points. Beyond
that the easiest thing to do is to look at the demonstration templates
and try to modify them. For example add a link to a new "action", create
a template to implement that action and start experimenting.
Note: when developing new templates and macros it is convenient to
have the velocity engine dynamically reload any template changes.
This can be done by modifying the file templates/velocity.properties;
see templates/velocity.properties.dev for an example. The ant build
script will copy the velocity.properties.dist in place of velocity.properties
when building a deployable war file.
View template
The top level portal browse page and the pages which show lists of matches to a
set of facet selections are all generated by a single view template which is invoked
by action=v requests to the controller. The default view template is
templates/view.vm but this can be changed in the DataSource configuration file.
The default view template is broken into a set of separate subtemplates to
make modifications easier. These are:
templates/header.vm
-
to generate the header bar across the top of the page, just change this
to modify the header;
templates/search.vm
-
to generate the free text search box, use the style sheet to change the
look and feel of this form;
templates/footerStart.vm and templates/footerEnd.vm
-
to generate the logo and footer bar at the bottom of the page, these are in
two parts to make it easy to insert footer text such as provenance messages;
templates/viewFacets.vm
- to generate the block of top level views of the available facets,
use the style sheet to change the look and feel of this area; this script
is pulled out into a separate template to make it easier to reuse in
your own view.vm template;
templates/viewMatches.vm
-
to generate the list of all portal resources which match the current filter,
each resource is displayed using view context "default";
Additionally a few macros are used to generate the list of recent visits
(#recentVisits), the current filter selection (#trail) and
the set of remaining facet options (#facetStates). These macros are
defined in the file templates/stdmacros.vm. The choice of when to
use macros and when to use included templates is a little arbitrary.
Pageview template
When a link to an individual portal resource is selected from a view it is displayed in
its own page using the pageview template. By default this is templates/pageview.vm
but again that can be changed in the DataSource configuration file.
The default pageview template uses the same subtemplates to display the header, search form and footer
areas of the page as the view template. So changing these once will change them across the portal.
The resource to be displayed is shown using a recursive view using context "page" so that
whatever embedded templates are registered in the DataSource configuration for the
resource type will be located and used.
The only nontrivial operation of the pageview template itself is that
it displays page provenance information. This uses the configured "primaryProperty"
to obtain the primary provenance for the page and uses the harvester_vocab:banner
and rdfs:label values for that source to create the provenance
sidebar and footer, respectively. Applications that don't want to display
provenance information using these conventions can easily remove or modify
that part of the pageview script.
Extensions
If you want to add functionality to the portal which can't be done
by scripting the templates then you will need to extend the Java code.
At that point you are largely on your own - read the javadoc and get started.
However, there are a couple of hooks for extensions that are worth pointing out.
The most common extension needed is probably to add some new processing
similar in kind to that used in the current views. This would typically be done by
adding methods to the VMRenderManager object ($rm in the scripts).
The support for graph visualization is an example of such an extension we
added late in the development of the portal code base.
A second extension point to bear in mind is that new DataStores, which
load and process data in different ways, can be implemented
by extending AbstractDataStoreImpl.
Thirdly, note that new facet types can be defined by implementing the
Facet and FacetState interfaces. The implementation
of text search is an example of doing this in non trivial ways
(see QueryFacetImpl and QueryFacetStateImpl).
Appendix: Scripting objects
In this appendix we provide more details of the specific "model" objects typically
used from the velocity scripts. We cross-reference to the javadoc which is the definitive
source for all such details.
When the controller invokes the velocity processing engine it places some
data objects into the "context" of the engine so that they are available
as variables within the velocity scripts. These variables are:
request
-
The HttpServletRequest which prompted the action which led to this
view being called. This can be used to access additional parameters
and session state.
datasource
- The DataSource object through which all the configuration information
and data can be accessed. Note that in the template this variable is
called
datasource whereas in the HTTP request the parameter
which specifies this is Ds. The difference is purely historical:
there is no strong reason they should have the same name, but
no good reason for them to differ either.
filter
-
A FilterState object which defines the current search filter. If there
is no search in progress then this will be the root filter state.
rm
-
A VMRenderManager object which provides services to assist with
rendering views. In particular it has some methods to simplify
generating URLs to access various portal services and support
for recursively calling the velocity engine in order to render
embedded object values.
resource
-
A NodeWrapper object which wraps up the RDF resource, if any, which is being
processed.
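As a rough illustration of how these context variables fit together, a template fragment along the following lines could produce a simple page heading. It is only a sketch: it uses methods documented later in this appendix and assumes that $resource is simply unset when no resource is being processed.

```velocity
## Page heading built from the standard context variables.
<h1>$datasource.name</h1>
#if( $resource )
  ## A resource is being viewed: show its display name.
  <h2>$resource.name</h2>
#elseif( $filter.isRoot() )
  <p>No search in progress.</p>
#else
  <p>$filter.size matching entries.</p>
#end
```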
a. Model - filters and facets
The core idea of search in the portal is to divide the search space into
a set of dimensions, called facets. Each facet specifies some
property of the objects being searched over. This might be a simple keyword
value or a hierarchical classification. The Facet Java interface is
used to represent the definition of a single facet, the FacetState interface
defines a search constraint for a single facet and a FilterState is a
collection of FacetStates (one for each Facet configured for the portal).
The default viewing templates use these interfaces to generate a faceted
browsing user interface for navigating the portal data.
Facet
Used to represent an individual browsable facet for navigating the portal.
This is an abstract interface and new Facet types and implementations can be added.
The Facet definitions are found by iterating over $datasource.facets.
Important methods:
getDisplayName() velocity short form displayName
- The display label to use to describe the facet.
getParameterName() velocity short form parameterName
- The parameter name to use to reference this facet
when constructing a portal access URL.
FilterState
Represents the state of a portal search. It comprises a set
of FacetStates which describe the current state of each available
search facet. It is available via the context variable $filter.
Important methods:
isRoot()
- Returns true if this is the root filter state in which no search
has yet been performed.
getFacetStates() velocity short form facetStates
- Return a list of the FacetStates in this filter.
getMatchList() velocity short form matchList
- Return an ordered list of wrapped resources which match the filter.
The list will be ordered on the rdfs:label property of the resources.
getMatchList(start, length)
- Return length elements from the ordered list of
elements which match this filter, starting at the element numbered start.
getSize() velocity short form size
- The number of resources which match this filter, i.e. the length of the matchList.
getEncoding() velocity short form encoding
- Return a URL parameter string which can be appended to a portal
request URL to pass it this filter state.
getRefinements(Facet)
- Return a list of the ways in which this FilterState can be further
refined. This will be a list of narrower FacetStates for the given facet.
getRefinementCount(FacetState)
- Return the number of resources that would match a refined filter, based
on the current filter state but refined by substituting the given FacetState (e.g. as
generated by getRefinements) as the value for the corresponding facet.
refine(FacetState)
- Return a new FilterState which is based on this one but with the
refined FacetState in place of the current state for the corresponding Facet.
removeFacet(Facet)
- Return a new FilterState which is based on this one but with the nominated
search Facet cleared (reset to its root state).
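To show how these FilterState methods combine in a template, here is a minimal sketch of a faceted browse panel followed by one page of results. It relies only on the methods listed above; the variable names $start and $pageSize are chosen for the example, zero-based numbering for getMatchList(start, length) is an assumption, and the text search pseudo facet may need different handling from the real facets.

```velocity
## For each facet, list the available refinements with result counts.
#foreach( $facet in $datasource.facets )
  <h3>$facet.displayName</h3>
  <ul>
  #foreach( $state in $filter.getRefinements($facet) )
    <li>${state.displayName}: $filter.getRefinementCount($state)</li>
  #end
  </ul>
#end

## Show one page of results. $start and $pageSize are names chosen
## for this example; zero-based numbering is an assumption.
#set( $start = 0 )
#set( $pageSize = 10 )
<p>$filter.size matches</p>
<ul>
#foreach( $match in $filter.getMatchList($start, $pageSize) )
  <li>$match.render("leaf", $request)</li>
#end
</ul>
```

Rendering each match in the leaf context produces a link to the pageview for that resource, so the fragment does not need to build any URLs itself.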
FacetState
Used to represent a single dimension of a search corresponding to a
defined Facet.
Important methods:
getDisplayName() velocity short form displayName
- The display label to use to describe the facet state.
getFacet() velocity short form facet
- Return the Facet which this is a state of.
getParent() velocity short form parent
- Return the parent facet state, or null if this is already the root state.
Used to walk back up the refinement tree for a hierarchical facet.
isRoot()
- Return true if this is the root state for this Facet, i.e. no filter
constraints have been specified for this Facet.
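The FacetState methods can be used to summarise which constraints are currently in force. The fragment below is a sketch using only the methods listed above; it assumes that any non-root state has a non-null parent, which follows from the description of getParent() but has not been checked against the code.

```velocity
## Summarise the constraints currently applied, one line per facet.
#foreach( $state in $filter.facetStates )
  #if( !$state.isRoot() )
    <p>${state.facet.displayName}: ${state.displayName}
    ## A non-root state is assumed to have a parent state.
    (narrower than ${state.parent.displayName})</p>
  #end
#end
```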
b. Model - datasources and stores
DataSource
This object encapsulates all the configuration information for a given
data source and provides access to all of the information associated with
it. It is provided in the context variable $datasource.
Important methods include:
getName() velocity short form name
- The name of the data source, used in display titles.
getEncoding() velocity short form encoding
- The String that can be used to encode this datasource in a parameter
in a URL request string.
getFacets() velocity short form facets
- Return a list of Facets defined for this data source. This includes
the textSearch pseudo facet. A list of real, user defined, facets can
be found using getRealFacets.
getDataStore() velocity short form dataStore
- Return the store abstraction through which the data and associated
ontologies can be accessed.
getSourceURLDescription(source)
- Part of the provenance support. Takes a resource which specifies a
place from which data has been harvested and returns the descriptive text
associated with that source, displayed in the page footnote by default.
getSourceURLBanner(source)
- Part of the provenance support. Takes a resource which specifies a
place from which data has been harvested and returns the banner text
associated with that source, displayed in the page provenance side-bar by default.
DataStore
The DataStore provides an abstraction of the different possible ways in which
the data associated with a DataSource may be held. There are two implementations
provided: one where all data and ontologies are loaded from files into
memory and one where all the data is held in a Jena database. Almost all
of the operations on DataStores are performed via other interfaces such
as FilterState, and DataStores are not often accessed directly from viewing
template scripts. However, a couple of operations which are sometimes
used are:
listCompatibleRelations(source, target)
- Return a list of all the properties in the DataSource ontologies which
can be used to link the given two resources. The entries in the list
will be wrapped resources.
getDataModel() velocity short form dataModel
- Return the Jena Model which holds all of the instance data
accessible to this portal. Used to access raw, unwrapped, RDF data.
getOntologyModel() velocity short form ontologyModel
- Return the Jena OntModel which holds all of the ontology information
accessible to this portal. Used to access raw, unwrapped, RDF data.
c. Model - RDF wrappers
The SWAD-E portal is built on top of the Jena semantic web toolkit, which
in turn provides a very rich interface for manipulating RDF and OWL data.
There is nothing to prevent you from accessing the Jena Models which represent
the data using the DataStore access functions. However, it is generally
more convenient to access the RDF data via a set of wrapper objects which
are slightly easier to script. In particular we support access to RDF property
values just using a qname string rather than having to construct a Property object.
NodeWrapper
A wrapped up version of an RDFNode, i.e. any RDF value including
a Resource or a Literal value.
Important methods:
getName() velocity short form name
- Return a displayable name for the node. If it is a literal
then this will be the lexical form. If it is a resource then
this will be its rdfs:label if it has one, otherwise it will be
its localname.
getEncoding() velocity short form encoding
- Return the String that can be used to encode this node in a URL
parameter string.
render(context, request)
- Return an embeddable piece of HTML that can be used to display and
link to this node. For a literal this will just be the same as name.
For a resource, a suitable display template will be used. The context
is an arbitrary string that template designers can use to group display templates
which might be used in different situations. The only fixed context name is
leaf: a resource rendered in a leaf context
will be embedded as an HTML link to a pageview onto that resource. Any
other context will be dereferenced, using the template settings in
the portal configuration file, to a template name. That template will be used
to create an embedded rendering of the resource and the resulting HTML will
be returned.
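The difference between the leaf context and any other context can be sketched as follows. The context name "summary" is hypothetical: it stands for whatever context name has been mapped to a template in the portal configuration file.

```velocity
## Link to the resource: "leaf" is the one fixed context name.
<p>See also: $resource.render("leaf", $request)</p>

## Embed a fuller rendering. "summary" is a hypothetical context name
## that would need a matching template in the portal configuration file.
<div>$resource.render("summary", $request)</div>
```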
ResourceWrapper
This is a subclass of NodeWrapper for the case where
the wrapped up RDFNode is a resource.
In addition to the NodeWrapper methods this supports a number of methods
to access the properties of the resource. These properties are generally
specified using strings which can use qname format (using any namespace
prefix known to the datasource). Important methods are:
getPropertyValue(propString)
- Return a single value for the given property of this resource as a simple
string. If there is more than one value an arbitrary one will be chosen.
getPropertyValueLang(propString)
- Return a single value for the given property of this resource as a simple
string. If there is more than one then the value with the language tag closest
to the datasource's preferredLanguage setting will be used.
getProperty(propString)
- Return all the values for the given property of this resource.
The result will be a MultiStatementWrapper which can be used to
iterate over the values.
getTruncatedTextPropertyValue(propString, n)
- Return a string containing the first n words
of the given property (whose value is assumed to be a literal).
getProperties() velocity short form properties
- Return a list of all the values for all properties of this resource.
The result will be a collection of MultiStatementWrappers, one for
each property.
hasProperty(propString)
- Return true if this resource has at least one value for the
given property.
hasType(typeString)
- Return true if one of the rdf:types of this resource is the
given resource URI.
getUri() velocity short form uri
- Return the URI for this resource as a string.
getPrimarySource() velocity short form primarySource
- Return the primary source for the properties of this "object".
The datasource configuration defines a "primaryProperty". We check whether
this resource has a value for that property and if so return the source
of that value as the primary source for this resource.
getLinkAddress(request)
- Return an href value that can be used to link to a display of
this resource by the portal.
findProperties(propString)
- Locates a subset of the properties of a resource as a collection suitable
for traversal and rendering. In this case the properties are identified
as any subProperty of the named property. For example, a portal might
define the notion of a relational link between different displayable
objects and provide a number of particular relations arranged in a subProperty
hierarchy. This method can be used to list all the relations of a resource
by passing it the root relation property.
findPropertiesByClass(markerTypeURIString)
- Locates a subset of the properties of a resource as a collection
suitable for traversal and rendering. In this case the properties
are identified as any property which is also marked as having rdf:type
of the given class.
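A generic page view for a resource might use these methods roughly as follows. This is a sketch rather than the template shipped with the portal; it assumes that the rdfs prefix is known to the datasource and picks a 30 word limit arbitrarily.

```velocity
<h2>$resource.name</h2>
<p>URI: $resource.uri</p>

## Short description, if present. The rdfs prefix is assumed to be
## known to the datasource; the 30 word limit is arbitrary.
#if( $resource.hasProperty("rdfs:comment") )
  <p>$resource.getTruncatedTextPropertyValue("rdfs:comment", 30)</p>
#end

## Generic property table: one row per property, values rendered as links.
<table>
#foreach( $prop in $resource.properties )
  <tr>
    <td>$prop.propName</td>
    <td>
    #foreach( $value in $prop.values )
      $value.render("leaf", $request)<br>
    #end
    </td>
  </tr>
#end
</table>
```

Each $prop here is one of the MultiStatementWrapper objects returned by getProperties(), which is why the inner loop iterates over its values.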
MultiStatementWrapper
Used to represent a set of property values for a resource.
In Jena terms it's really a collection of Jena Statements which
are all expected (though not actually required) to have the
same subject and predicate values. The values are accessible as ordered collections
rather than in Jena iterator style, to simplify scripting.
Important methods:
getValues() velocity short form values
- Return the collection of all the values for this property, sorted
(by rdfs:label). The entries will be NodeWrappers.
getValue() velocity short form value
- Return an arbitrary selection from values.
getPropName() velocity short form propName
- Return the displayable name for the property in this collection (arbitrary
if there is more than one property represented).
getStatementSource(object)
- Return, as a ResourceWrapper, the source URI from which the given
value of this property was obtained (without the object argument it
will pick an arbitrary value to check for sources).
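As an illustration of the provenance support, the fragment below lists the values of one property together with the source each was obtained from. It is a sketch under two assumptions that are not confirmed by the text above: that getStatementSource accepts the NodeWrapper entries returned by getValues(), and that it returns null when no source is recorded.

```velocity
## List each value of one property with the source it was harvested from.
#set( $labels = $resource.getProperty("rdfs:label") )
<p>${labels.propName}:</p>
<ul>
#foreach( $value in $labels.values )
  <li>$value.name
  ## Test the call directly rather than via #set, since some Velocity
  ## versions leave a variable unchanged when the right-hand side is null.
  #if( $labels.getStatementSource($value) )
    (source: $labels.getStatementSource($value).name)
  #end
  </li>
#end
</ul>
```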
d. Model - rendering support
VMRenderManager
This provides a miscellaneous collection of utilities to
help with generating view pages from the data. Most such work
is done via the specific Model objects above and the render manager
(context variable $rm) is just a convenient place
to put anything which doesn't fit elsewhere.
Important methods:
getDataSources() velocity short form dataSources
- Return a collection of all available data sources. Useful to provide
a top level index page into all sources supported by a portal instance.
getRelations() velocity short form relations
- Return a list of all the (wrapped) properties which are to be
treated as relations between displayable portal objects. By convention
there is a base Property used for such relations (defined in the config file)
and any subproperties of that base will be returned by this call.
getStyleSheet() velocity short form styleSheet
- Return a reference to the style sheet defined for use by this data source.
This allows the same templates to be used for different data sources but
change their look-and-feel by modifying the style sheet.
recordVisit(name, link)
- Record that a visit was made to a given location. The name is a
display name to use to describe the location and the link is an
href which can be used to jump back to that location. The visit record
is stored in the session state for the servlet connection.
getVisits()
- Return a list of objects describing the recently visited places.
Each object in the list has a name and a link field.
renderFacetBrowse(facet, annotation)
- Return a javascript fragment which initializes a tree-structured view onto
a hierarchical facet. The facet is specified as a ResourceWrapper (not
a Facet object). The second argument defines what data should
be associated with each node in the tree, it can be one of
"enc" (an encoding of the node resource),
"comment" (a scope note associated with the facet node) or
"both".
- Viewing relation graphs
- The VMRenderManager also has support for rendering a server-side
image map depicting the resource and all relational links connecting
it to other displayable resources.
createRelationGraph(resource, depthLimit) creates the graph.
renderRelationGraph(graph) takes a created graph, creates a temporary
server-side image depicting the graph and returns a string that can be used as the src
attribute of an IMG tag to show the image.
getVertices(graph) returns the collection of resources indexed
in the created graph. Each of these can then be processed using getMapRectFor(resource, graph)
to obtain a coordinate string that can be used to create the client-side
portion of an image map, which allows the graph to be used to jump to the
depicted resources.
The set of relations shown in the graph will be a subset of those returned by
getRelations(). The subset is stored as session attributes using
the relation's display name as the attribute name and true or false
to denote whether the relation should be displayed (no value implies the relation
should be displayed). The relations to be displayed given the current session
attributes can be found using getVisibleRelations() and each
relation can be tested for display using shouldShowRelation(relation).
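Putting the graph methods together, a template could embed a clickable relation graph along the following lines. This is a sketch under stated assumptions: the depth limit of 2 and the map name relgraph are arbitrary, the vertices are assumed to be wrapped resources supporting getLinkAddress, and the coordinate string is assumed to suit a rectangular area element.

```velocity
## Build the graph around the current resource. The depth limit of 2 is
## an arbitrary choice for this example.
#set( $graph = $rm.createRelationGraph($resource, 2) )
<img src="$rm.renderRelationGraph($graph)" usemap="#relgraph" alt="Relation graph">
<map name="relgraph">
#foreach( $vertex in $rm.getVertices($graph) )
  ## Assumes vertices are wrapped resources and that the coordinate
  ## string suits a rectangular area.
  <area shape="rect" coords="$rm.getMapRectFor($vertex, $graph)"
        href="$vertex.getLinkAddress($request)" alt="$vertex.name">
#end
</map>

## Report which relations are currently shown in the graph.
<ul>
#foreach( $relation in $rm.relations )
  <li>$relation.name:
  #if( $rm.shouldShowRelation($relation) ) shown #else hidden #end
  </li>
#end
</ul>
```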