Atlantic Workshop on Semantics and Services - Day 1

Stephen Downes


Jun 14, 2010

Originally posted on Half an Hour, June 14, 2010.

Introduction to Semantic Web Technologies
Harold Boley, NRC

Presented a sample of RDF being used on the BBC music site, which uses external, public data sets such as Wikipedia. In short, it is using the web of data as a CMS. More and more data is on the web, and more and more applications rely on that data.

We need to break out of database silos. We need the infrastructure for a web of data, so data can be interlinked and integrated on the web, and so we can start making queries on the whole, not just on specific databases.

Example: a book. There may be a dataset with ID, author, title, etc., and separate records for the author and home page, and the publisher and home page.

We could export this data as a set of relations. Some of these resources might not have globally visible URLs, though, like the author and publisher information. Still, the relations form a graph. One can export the data, generating relations on the fly, and can export just part of the data.

We could take this data and merge it with other data, provided we have the same resource ID. (Example of English and French metadata being merged.) E.g., combine the 'person' metadata from the book with other sources, such as Wikipedia data (dbpedia.org).

This can be done automatically. This is where ontologies, extra rules, etc., come in. These can be relatively simple and small, or huge, or anything in between. And we can ask even more powerful queries of the web of data.
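To make the merge-and-query idea concrete, here is a minimal sketch, assuming Python's rdflib library; all the URIs and property names are hypothetical stand-ins for real published data.

from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import FOAF, RDF

EX = Namespace("http://example.org/books/")
author = URIRef("http://example.org/people/jane-doe")

# Graph 1: the publisher's book record.
books = Graph()
books.add((EX.book1, RDF.type, EX.Book))
books.add((EX.book1, EX.hasAuthor, author))
books.add((EX.book1, EX.title, Literal("Semantic Web Basics")))

# Graph 2: independently published data about the same person.
people = Graph()
people.add((author, RDF.type, FOAF.Person))
people.add((author, FOAF.name, Literal("Jane Doe")))
people.add((author, FOAF.homepage, URIRef("http://example.org/~jdoe")))

# Because both graphs use the same URI for the author, merging is just a
# graph union, and one query can then span both sources.
merged = books + people
q = """
SELECT ?title ?name WHERE {
  ?book <http://example.org/books/hasAuthor> ?person ;
        <http://example.org/books/title> ?title .
  ?person <http://xmlns.com/foaf/0.1/name> ?name .
}
"""
for row in merged.query(q):
    print(row.title, "-", row.name)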


Semantic Web Services
Omair Shafiq, University of Calgary

Conceptual Model

The next generation of the web moves from static to dynamic (WWW to WS) and from syntactic to semantic (WWW to semantic web). The final step combines the two: Semantic Web Services.

Web services are programs accessible over the web. The service oriented architecture (SOA) uses web services as basic building blocks. The current technology includes WSDL, SOAP and UDDI.

The web service usage process involves a discovery process where people find the services they need. But consumers can't use the descriptions to find services - the description is syntactic, which means nothing to the user. Also, there's no semantically marked-up content, no support for the semantic web.

To fix this, people have introduced semantic web services. These use ontologies as the data model, and invoke web services as an integral part. This gives us 'Semantically Enabled Service Oriented Architecture' (SESA). We semantically describe client requests, employ mediation techniques, and semantically enhance the complete SOA lifecycle.

Goal-Driven Web Service Usage

There is currently no layer that describes user objectives formally. This requires that service providers lift semantic descriptions of their services into the web service discovery layer, so that the objective, described semantically, can be matched against them.

This leads to a definition of WSMO (the Web Service Modeling Ontology). It combines:
- Goals: objectives that a client wants to achieve
- Ontologies: formally specified terminology used by all other components
- Web Services: semantic descriptions of capabilities and interfaces
- Mediators: connectors between components with mediation facilities

Ontologies are the data model used throughout. Ontology elements include concepts, attributes, relations, functions, instances, and axioms. These specify the knowledge domain of the service.

The description of the service includes Dublin Core metadata, versioning, quality of service, and financial information, plus functionality, such as choreography and orchestration. (This is *very* high level and - frankly - looks like vaporware - SD).

The capability specification includes pre-conditions, assumptions, post-conditions and effects. Pre-conditions might require, for example, a departure and destination point. The post-condition might be the issuing of a ticket. Choreography tells the service in what order it should perform steps. Orchestration allows service providers to invoke other service providers.
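As a rough illustration (in Python rather than WSMO's own WSML syntax, and with field names invented for the purpose), a capability description for the ticketing example might be structured like this:

from dataclasses import dataclass

@dataclass
class Capability:
    preconditions: list   # required information state before invocation
    assumptions: list     # conditions on the world state, not checked against inputs
    postconditions: list  # information state after invocation
    effects: list         # changes to the world state

ticketing = Capability(
    preconditions=["departure point provided", "destination point provided"],
    assumptions=["the requested route is actually serviced"],
    postconditions=["a ticket description is returned"],
    effects=["a ticket is issued and the seat is reserved"],
)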

In the objectives section, the client can formally specify their requests. The purpose is to facilitate problem-oriented WS usage, dynamic WS usage, and to provide a central client element. The goal description has three parts: requested capability, requested choreography, and requested orchestration.

The idea is to accept a goal and decompose it into services. The client has different goal templates, or can write a complete description. First, selection is carried out; then, in operation, the service is actually invoked.

The mediators handle heterogeneity between the data and the processes. This is needed because, e.g., clients will write goals in isolation from service providers, leading to the possibility of mismatch. Mediators use a declarative approach, providing a semantic description of resources. Mediation cannot be fully automated.

(There are different approaches to creating semantically-enabled web services).


Semantic Matching
Virendra C. Bhavsar, UNB

Syntactic matching is basically exact string matching. The result is 'yes' or 'no' (1 or 0). This also includes matching permutations of strings. For example, you can get a measure of similarity between two strings by dividing the number of words in common by the total number of words in the strings. But this sort of similarity does not have much meaning. We would match, e.g., 'electric chair' and 'committee chair'.
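A minimal sketch of that word-overlap measure (reading 'words in common over total words' as a Jaccard-style ratio) shows the problem: 'electric chair' and 'committee chair' come out as a third similar, even though they have nothing to do with each other.

def word_overlap(a: str, b: str) -> float:
    # words in common divided by total distinct words across both strings
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    total = words_a | words_b
    return len(words_a & words_b) / len(total) if total else 0.0

print(word_overlap("electric chair", "committee chair"))  # 0.333...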

Semantic matching would be useful in a variety of applications, for example, e-business, e-learning, matchmaking portals, web services, etc.

Semantic similarity is an identity relation, valued between 0 and 1.
Semantic distance is a matching relationship, valued between 1 and infinity. Matching can be done on words, short texts, documents, schemas, taxonomies, etc.

Concept similarity in a taxonomy: we can find the similarity of two nodes in a taxonomy tree by thinking about how 'far' apart they are, as defined (say) by how many nodes we have to traverse. Or by the amount of the path length they have in common. Or you might compare the contents of the nodes. Etc. There are different ways to do this.
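A small sketch of the path-length approach: the 'distance' between two concepts is the number of edges between them in the tree, and similarity is taken here as 1 / (1 + distance). Both the toy taxonomy and the exact formula are illustrative; the talk listed several alternative measures.

TAXONOMY = {  # child -> parent
    "dog": "mammal", "cat": "mammal", "mammal": "animal",
    "trout": "fish", "fish": "animal",
}

def path_to_root(node):
    path = [node]
    while node in TAXONOMY:
        node = TAXONOMY[node]
        path.append(node)
    return path

def similarity(a, b):
    pa, pb = path_to_root(a), path_to_root(b)
    common = set(pa) & set(pb)
    # distance = edges from a and from b up to their lowest common ancestor
    distance = min(pa.index(c) for c in common) + min(pb.index(c) for c in common)
    return 1 / (1 + distance)

print(similarity("dog", "cat"))    # 0.33... (close together in the tree)
print(similarity("dog", "trout"))  # 0.2     (farther apart)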

The motivation for this is to, e.g., match online buyers and sellers in a marketplace. The buyer may define a specific set of requests. You need more than flat representations of resources. You need to be able to identify 'part-of' relations, etc. You also need to be able to match properties of buyers and sellers.

(SD - it seems ridiculous to be talking about the 'similarity' of concepts absent any sort of context - similarity depends entirely on salience. Because Bhavsar, by using taxonomy trees to measure similarity, is using a 'distance' measure of similarity, he is unable to account for salience - see Amos Tversky, 'Features of Similarity'.)

Description of a partonomy tree similarity engine (based on eduSource work!). See the Teclantic portal, http://teclantic.ca, which uses this type of similarity matching. (This portal is not active now, however.)

Current work includes weighted semantic tree similarity engines, semantic searching, weighted graph similarity engines, multi-core and cluster implementations, and matchmaking portals.



Resource Description Markup Language
Stephen Downes and Luc Belliveau, NRC

My presentation page is http://www.downes.ca/presentation/256

Using Semantic Web Technologies To Unwind Humanities Citations
Robertson, MAU.

Discussing how semantic web technologies are used to manage citations from texts, e.g., Latin texts. How do we cite this work? When you're dealing with Latin literature, there are international standards.

Early citations were non-actionable - they would point to text, but in a different work or volume. E.g., John 3:16 - the advantage, though, is that they will be the same everywhere. John 3:16 will be the same no matter what the language (the 'John' bit will change, but 3:16 won't).

In the Humanities, we use various abbreviations (e.g., 'Ar' = Aristophanes in English, Aristotle in French). These were used just to save print. So we want to expand this, so that it's beautiful and intelligible.

We have, for example, Canonical Text Service URNs. E.g., urn:cts:greekLit:tlg0532:tlg001:1:10
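A hedged sketch of pulling such a URN apart, following the shape of the example above; the component labels are mine, not the CTS specification's.

def parse_cts_urn(urn: str) -> dict:
    parts = urn.split(":")
    return {
        "scheme": parts[0],              # 'urn'
        "protocol": parts[1],            # 'cts'
        "namespace": parts[2],           # e.g. 'greekLit'
        "text_group": parts[3],          # e.g. 'tlg0532'
        "work": parts[4],                # e.g. 'tlg001'
        "passage": ":".join(parts[5:]),  # e.g. '1:10'
    }

print(parse_cts_urn("urn:cts:greekLit:tlg0532:tlg001:1:10"))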

In the browser, we will:
- reject the abbreviation wholly
- exploit dbpedia, etc., to provide internationalized (or partially internationalized) versions of the references
- make references dereferenceable - i.e., you have the ability to click on the reference and see the text immediately.

To integrate CTS URNs and dbpedia, we are using SwiftOWLIM. We generate the record. Then we generate HTML5 data divs to display the information, so we can extract it from the database with SPARQL queries.
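The project's actual schema wasn't shown, but the kind of SPARQL lookup involved might look something like this sketch, assuming rdflib and a wholly invented citation vocabulary.

from rdflib import Graph, Literal, Namespace, URIRef

CITE = Namespace("http://example.org/citation#")  # hypothetical vocabulary
g = Graph()
passage = URIRef("urn:cts:greekLit:tlg0532:tlg001:1:10")
g.add((passage, CITE.citedBy, URIRef("http://example.org/articles/42")))
g.add((passage, CITE.label, Literal("Author, Work 1.10")))

q = """
PREFIX cite: <http://example.org/citation#>
SELECT ?article ?label WHERE {
  ?passage cite:citedBy ?article ;
           cite:label   ?label .
}
"""
for row in g.query(q):
    print(row.article, row.label)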

It also supports a taxonomy of citation (evidence, refutation, etc.). This actually allows the citation to be included inline inside the div.

This will also enable the system to replace old-style scanning systems (JSTOR) with the live citation system. And we will be able to geo- and temporally reference secondary sources by reference to the primary sources they cite. Etc.

(Described how to extract ontologies from text. Turns out the method uses pre-supplied lists of terms.)


OWL-based Execution of Clinical Practice Guidelines
Borna Jafarpour, Dalhousie


Clinical practice guidelines (CPGs) have a great deal of information. They support decisions, and place constraints on ordering, among other things. (Sample diagram, with flow of tasks).

The motivation behind OWL-based execution is to provide the inference power of OWL to execute CPGs, and to integrate everything together. Thus we can develop a workflow ontology in OWL. We can plug in any reasoner that we want.

Problems in OWL

Handling preconditions: 'any' and 'all' are easily handled in OWL. 'Any k' preconditions (where k is a number) are harder - you have to use a combination of all and some.
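To pin down what that 'any k' condition actually means, here it is in plain Python rather than OWL; the workflow ontology has to express the same semantics with combinations of 'all' and 'some' restrictions.

def any_k_satisfied(preconditions, satisfied, k):
    # True if at least k of the listed preconditions appear in the satisfied set
    return sum(1 for p in preconditions if p in satisfied) >= k

print(any_k_satisfied(["t1", "t2", "t3"], {"t1", "t3"}, 2))  # True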

If t2 has a precondition t1, OWL can't handle the 'uncle problem': where t11 is a subtask of t1, it needs to know that t1 satisfies the precondition.

Also, another problem is restrictions on a subset of data values in a range.

Also, there is the 'open world assumption' - from the absence of a statement, a reasoner should not assume that the statement is false.

Thus there needs to be preprocessing.

(more and more stuff to make it work)

Question: if you have to do all this in OWL, is it worth it?


Platform for Ocean Knowledge Management
Ashraf M. Abusharekh, Dalhousie

We need to mix information from oceanography and marine biology on, say, leatherback turtles. The leatherback turtle community of interest has members from both these groups.

We developed an SOA + knowledge management approach that supports multiple knowledge domains, and multiple disparate and distributed users. This is a framework for publishing new scientific capabilities.

(Yes, the presentations really are this vague)

(Big picture with boxes and layers)

POKM relies mostly on the Ocean Tracking Network (OTN). The POKM ontology is much needed because of the different and separate nature of the users.

(More boxes and layers) defining a federated system, composed of communities of interest from NCEs (networks of centres of excellence). Each COI publishes a service description that is available within the NCE.

The problem in POKM is that it is a single SOA with multiple different users. So, e.g., the animal tracking COI consists of a migration service and a detection service. The detection service, meanwhile, is based on two databases, POST and OBIS (one for the Pacific, one for the Atlantic). The idea here is that the detection service would query the repository to determine which database to use.
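A hedged sketch of that routing decision: the detection service looks up which repository covers a given region and queries POST or OBIS accordingly. The function and the region keys are invented for illustration.

REGION_TO_DATABASE = {"pacific": "POST", "atlantic": "OBIS"}

def detection_database(region: str) -> str:
    # Ask the registry which repository holds detections for this region.
    try:
        return REGION_TO_DATABASE[region.lower()]
    except KeyError:
        raise ValueError(f"No detection database registered for region {region!r}")

print(detection_database("Atlantic"))  # OBIS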

(Diagrams of scripts for adding new services - experts need to define mappings)

(Neat diagram of a Yahoo Pipes-like composition of multiple services.) The tool being used mostly is GlassFish.

Question: how many man-years did it take to develop something like this? Answer: more than 10.

Interface Between Modular Ontology and Applications
Faezeh Ensan, UNB

OWL-DL has challenges for representing complex ontologies. For example, it has difficulty with efficient reasoning. Also, evolution management is a challenge. Finally, there is an issue with the integration of different developed parts.

Two approaches to address this:
- decompose large ontology into smaller parts
- develop modular formalisms

The purpose of modular ontology formalisms is to make integration and reasoning using the ontology feasible.

The knowledge is 'encapsulated': the knowledge in the ontology is represented through interfaces. So ontology modules can be efficiently interchanged. You can get different perspectives on a module. And it supports polymorphism.

Example: tourism ontology. It could use either 'Canada destinations' or 'North America destinations'. Or, a 'sightseeing' subsystem could link to an ontology related to 'scientific attractions', or to an ontology related to 'natural attractions'.

The knowledge bases of the 'utilizer' modules are augmented by sets of nominals from the 'realizer' modules. After augmentation, the knowledge engines are not expected to take the realizer modules into account any more. (Diagram makes it look like data from the realizer module is integrated into the utilizer module.)
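A plain-Python sketch of the encapsulation idea, using the tourism example above: the utilizer module declares what it needs through an interface, and either destinations module can stand behind it. Class and method names are invented for illustration.

from typing import Protocol

class DestinationsInterface(Protocol):
    def destinations(self) -> set: ...

class CanadaDestinations:
    def destinations(self) -> set:
        return {"Halifax", "Banff", "Quebec City"}

class NorthAmericaDestinations:
    def destinations(self) -> set:
        return {"Halifax", "Banff", "Quebec City", "New Orleans", "Oaxaca"}

class TourismModule:
    def __init__(self, realizer: DestinationsInterface):
        self.realizer = realizer  # any module satisfying the interface will do

    def suggest(self) -> set:
        return self.realizer.destinations()

print(TourismModule(CanadaDestinations()).suggest())
print(TourismModule(NorthAmericaDestinations()).suggest())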

(Screenshot of application)

We applied the modular ontology in Service New Brunswick. We needed to define some sort of common meaning of the terms related to 'research' for SNB, to precisely define research. The modular ontology has modules related to funding, governance, working with existing processes, and reporting activities.

The main module was 'research project'; other modules were 'research funding', 'research performers', 'corporation partners', etc.

(Diagram of 'requirements' module). A requirement is a construct that comes from a concept called a 'requirement', which can be a 'gap', say, or a 'new idea', or an 'opportunity', or 'issues', etc. There will be a 'priority', and different 'approaches', such as 'development' or 'research', each of which will have a 'risk' and a 'cost'. Etc.

Elements of this ontology can come from other ontologies. For example, the 'requirements' module may have a 'project' or 'SNB Unit'. These are defined by other modules.

(Diagram of Research Project module). A research project needs a requirement. This requirement is defined by the requirements module. It needs a funding source, defined by another module. Etc.

(SD - I like the concept of ontology modules.)

(Boxes and layers diagram - query execution.) Kind of a neat example where a 'query dispatcher' breaks a query into two separate queries, one for each module. Then an aggregator unit joins the results from these queries and displays the results.
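A rough sketch of that dispatch-and-aggregate pattern; the module query functions and the join key are invented for illustration.

def dispatch(query_terms, modules):
    # Send the query to each module and collect the partial results.
    return {name: module(query_terms) for name, module in modules.items()}

def aggregate(partials, key):
    # Join partial results on a shared key (here, the project identifier).
    joined = {}
    for rows in partials.values():
        for row in rows:
            joined.setdefault(row[key], {}).update(row)
    return list(joined.values())

def projects_module(terms):
    return [{"project": "P1", "title": "Records modernization"}]

def funding_module(terms):
    return [{"project": "P1", "funding": "internal"}]

partials = dispatch(["P1"], {"projects": projects_module, "funding": funding_module})
print(aggregate(partials, key="project"))  # one joined row for project P1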

Question about whether it would be better to use a meta-ontology. Answer: we can do this without the need for a meta-ontology.


NICHE Research Group
Sajjad Hussain, Dalhousie
(Not on agenda)

Stands for kNowledge Intensive Computing for Healthcare Enterprises. Includes about 20 researchers. Mostly we deal with health care technology. Mostly proof-of-concepts.

(Big boxes and layers diagram)  - everything from personal health records to health team collaboration to evidence organization to clinical practice guidelines.

(Even bigger boxes and layers diagram - more boxes than they have people). Everything from policy to risk assessment to social networks, etc.

(List of 9 projects).

Sample project: CarePlan - a "rich temporal process-specific clinical pathway that manages the evolving dynamics of a patient to meet the patient's needs, institutional workflows and medical knowledge". (Not making this up - SD)

The idea is to use a 'logic-based proof engine' to generate a 'personalized care plan'.

Another project - "knowledge morphing for clinical decision support". 'Knowledge morphing' is defined as 'the intelligent and autonomous fusion/integration of contextually, conceptually and functionally related knowledge objects'.

(Nonsense. It's all nonsense. -SD)

(Diagram of 'knowledge  morphing' by combining ontology fragments, to eg. generate contextualized subontologies, merge contextualized subontologies, and detect contradictory knowledge in merged knowledge components.)

Another project - structural-semantic matching, done at the proof level. Interesting if you like formal proofs. The proofs are based on 'similarities' (but again these similarities are based on 'distances').


Detecting and Resolving Inconsistencies in Ontologies Using Contradiction Derivations
Sajjad Hussain

Ontologies are the foundational representation formalism for web based information sharing. Ontology engineers develop domain ontologies by either making them from scratch or adapting existing ontologies. During this process, inconsistencies can occur.

Inconsistencies can occur either during original ontology production, or as an effect of an ontology reconciliation project. There is potential harm to the ontology as a result.

Formal definitions (I won't repeat the squiggles here) of ontologies (as a description logic, DL) with axioms, and of sub-ontologies that talk about some concepts of the main ontologies. Then we define 'ontology triples', which are created from a logic program defined by a set of rules.

(ick, I have always hated formalism - SD)

Next we define 'constraint rules' and 'contradictory triple sets'. A conjunction of such triples leads to a contradiction. This allows us to define an inconsistent ontology (as one containing a contradiction, duh) and a maximally consistent sub-ontology (being one such that any addition will produce a contradiction).

Example: 'whale is a big fish' contradiction detection.
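A toy rendering of how that case would be caught: two subclass assertions plus a disjointness constraint jointly derive a contradiction. The triple format and the rule are simplified stand-ins for the paper's formal definitions.

triples = {
    ("Whale", "subClassOf", "Fish"),     # the asserted (incorrect) claim
    ("Whale", "subClassOf", "Mammal"),   # background knowledge
    ("Fish", "disjointWith", "Mammal"),  # constraint rule
}

def contradictions(triples):
    found = []
    for c, rel, d in triples:
        if rel == "disjointWith":
            subs_c = {s for s, r, o in triples if r == "subClassOf" and o == c}
            subs_d = {s for s, r, o in triples if r == "subClassOf" and o == d}
            found.extend((s, c, d) for s in subs_c & subs_d)
    return found

print(contradictions(triples))  # [('Whale', 'Fish', 'Mammal')]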

(I can't bear to take more notes - he's standing there facing his slides and reading them, and his slides are sets of equations and formulae. "Every one of them.... must have.... triples... each of them.... the triple of Mi.... must have.... at least.... one of them.")

