About

The goal of Service-Detective is thus to overcome shortcomings of current Web Service search engines by:

The Service-Detective project will deliver a search engine that enables users to find up-to-date information on available Web Services. It will employ automated crawling, information retrieval methods and analysis techniques, and it is designed to scale with the increasing number of services: the approach does not rely on a central editorial team, which would necessarily become a bottleneck once the number of deployed services reaches Web scale. Consequently, the approaches developed by Service-Detective can adapt quickly to changes in the set of available services. The search engine will leverage the information exposed by current technologies and extend it with semantic annotations to allow for more accurate retrieval. It will use this service information for efficient clustering and matchmaking of services, with the goal of providing efficient discovery and clustered search for Web Services.

The concept

Service-Detective approaches the discovery problem in two orthogonal dimensions:

We illustrate the high-level architecture of Service-Detective in Figure 1. We start from a set of services crawled from the Web, together with related information. Given the information in the interface descriptions, we can identify invokable endpoints, and around those endpoints we gather further information. For instance, given the underlying transport protocol, we can assess the liveness of endpoints as well as their geographic location. Aggregating this with other information sources, we can obtain information about the hosting conditions (shared, dedicated, dynamic DNS, etc.) and infer properties about service reliability.
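The endpoint identification step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual crawler: it assumes WSDL 1.1 documents with SOAP bindings, the sample document and the function names extract_endpoints and is_alive are hypothetical, and liveness is approximated by a plain TCP connection attempt.

```python
import socket
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespaces defined by WSDL 1.1 and its SOAP binding.
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
SOAP_NS = "http://schemas.xmlsoap.org/wsdl/soap/"

# Hypothetical sample document, used for illustration only.
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
             name="StockQuote">
  <service name="StockQuoteService">
    <port name="StockQuotePort" binding="tns:StockQuoteBinding">
      <soap:address location="http://example.com/stockquote"/>
    </port>
  </service>
</definitions>"""


def extract_endpoints(wsdl_text):
    """Return one record per soap:address found under service/port."""
    root = ET.fromstring(wsdl_text)
    endpoints = []
    for service in root.iter(f"{{{WSDL_NS}}}service"):
        for port in service.iter(f"{{{WSDL_NS}}}port"):
            for address in port.iter(f"{{{SOAP_NS}}}address"):
                location = address.get("location")
                if not location:
                    continue
                endpoints.append({
                    "service": service.get("name"),
                    "port": port.get("name"),
                    "url": location,
                    "host": urlparse(location).hostname,
                })
    return endpoints


def is_alive(url, timeout=3.0):
    """Assumed liveness probe: can a TCP connection be opened to the endpoint?"""
    parsed = urlparse(url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for endpoint in extract_endpoints(SAMPLE_WSDL):
        print(endpoint["service"], endpoint["url"])
```

The host name extracted here is also the natural starting point for the geographic and hosting lookups mentioned above, which would require external data sources and are not shown.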

The sources of this information are diverse. In particular, they include the provider’s service definition, as well as documents pointing to that definition and vice versa. Other resources related via multiple Web links or through keyword similarity might also be considered. In order to provide a general infrastructure, it is necessary to design methods for the identification and retrieval of such heterogeneous information and to investigate how to exploit it. This first task is performed by our smart Service Crawler, which is able to gather both services and information related to services.

With the data gathered by the crawler, e.g. information on the service terms (Web Service API) and documentation, we are able to create semantic descriptions of service functionalities. To this end, we analyze the data and enrich it based on the results of the analysis.

Figure 1: Service-Detective structure

Semantic descriptions of Web Services can be added in two different ways: using generic Web Service ontologies, such as the Web Service Modeling Ontology (WSMO), or using domain-specific ontologies. The innovative approach in Service-Detective is to facilitate service discovery by automatically adding semantic service descriptions for different forms of Web Services, using both information from the service provider and information from data sources independent of the service provider. Different forms of Web Service descriptions include, for example, descriptions in WSDL (Web Services Description Language), a W3C-recommended XML-based language that provides a model to describe Web Services, and descriptions of REST (Representational State Transfer) services. REST is an architectural style that, unlike WSDL, has no standardized interface description, which makes REST service descriptions harder to detect on the Web than WSDL descriptions. REST input parameters are usually submitted in the query part of a URL, and the response data is often described by a mixture of XML Schema and textual information.
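To make the remark about REST input parameters concrete, the following minimal sketch splits an example request URL into its endpoint and the names of its query parameters. The URL and the function name describe_rest_call are hypothetical; how Service-Detective actually detects REST services is not specified here.

```python
from urllib.parse import parse_qs, urlsplit


def describe_rest_call(url):
    """Split a request URL into its endpoint and its query parameter names."""
    parts = urlsplit(url)
    # keep_blank_values retains parameters given without a value, e.g. "?debug=".
    params = parse_qs(parts.query, keep_blank_values=True)
    return {
        "endpoint": f"{parts.scheme}://{parts.netloc}{parts.path}",
        "parameters": sorted(params),
    }


if __name__ == "__main__":
    # Hypothetical example URL, not a real service.
    print(describe_rest_call("http://api.example.com/weather?city=Vienna&units=metric"))
```

Because no machine-readable contract accompanies such a URL, the parameter names recovered this way are only hints; their meaning still has to come from the surrounding documentation.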

The information from these various sources will then be integrated into an initial version of one coherent semantic model (conceptual index) that allows for effective retrieval. The semantic model will then be passed to a clustering and matchmaking component that uses term similarity approaches to build clusters out of the results.
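As a rough illustration of clustering by term similarity, the sketch below groups textual service descriptions by the Jaccard similarity of their term sets. The choice of Jaccard similarity, the greedy single-pass strategy, the threshold value, and the sample descriptions are all assumptions made for this example; the text does not state which similarity measure or clustering algorithm Service-Detective uses.

```python
import re


def terms(text):
    """Lower-case a description and return its set of alphabetic terms."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two term sets; 0.0 if both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_services(descriptions, threshold=0.5):
    """Greedy single-pass clustering.

    Each service joins the first cluster whose representative (the first
    member's term set) is at least `threshold` similar; otherwise it
    starts a new cluster.
    """
    clusters = []
    for name, text in descriptions.items():
        term_set = terms(text)
        for cluster in clusters:
            if jaccard(term_set, cluster["terms"]) >= threshold:
                cluster["members"].append(name)
                break
        else:
            clusters.append({"terms": term_set, "members": [name]})
    return [cluster["members"] for cluster in clusters]


# Hypothetical service descriptions, used for illustration only.
SAMPLE = {
    "WeatherA": "returns weather forecast for a city",
    "WeatherB": "weather forecast service for a given city",
    "StockQ": "returns stock quote for a ticker symbol",
}

if __name__ == "__main__":
    print(cluster_services(SAMPLE))
```

With the sample data, the two weather services share most of their terms and end up in one cluster, while the stock quote service starts its own. A production system would more likely weight terms (for example by frequency) and remove stop words such as "for" and "a" before comparing.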

In summary, the objectives of Service-Detective are to: