Federated Search


1. Introduction: In a federation, multiple platforms that provide searchable resources to a target user base participate together. A user submits a single query to the federation, which distributes it in real time to all the participating platforms, using the appropriate syntax for each resource. The federated search then aggregates the results received from the platforms and presents them to the user with minimal duplication. One application of federated searching is the meta search engine; however, meta search engines are unable to index the many documents that make up the deep Web, or invisible Web.

2. Definition: Péter Jacsó defines federated search as, “Transforming a query and broadcasting it to a group of disparate databases with the appropriate syntax, merging the results collected from the databases, presenting them in a succinct and unified format with minimal duplication, and allowing the library patron to sort the merged result set by various criteria”. The definition of federated search generally includes the following aspects.
a) Search Scope: The federated search inspects multiple resources simultaneously for relevant resources.
b) Software: A federated search engine is software that uses a protocol standard such as Z39.50.
c) Presentation: The results are presented in a uniform interface, that of the federated search engine. Depending on the particular program’s capabilities, the results can be ranked and de-duped.
In simple terms, federated search is an information retrieval technology that allows the simultaneous search of multiple searchable resources in real time and presents the resulting list in a unified way. Because the sources are searched in real time, federated searches are inherently as current as the individual information sources.
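The "transforming a query ... with the appropriate syntax" step of the definition can be illustrated with a minimal Python sketch. It assumes two kinds of targets: a Z39.50 catalogue addressed in Prefix Query Format (PQF) with Bib-1 use attributes, and a web database with a query-string interface. The target names and the `field`/`q` URL parameters are hypothetical, and escaping of special characters is left out for brevity.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

# Bib-1 "use" attribute numbers for a few common access points.
BIB1_USE = {"title": 4, "author": 1003, "any": 1016}


@dataclass(frozen=True)
class Query:
    """A source-neutral query as typed into the single search box."""
    field: str  # "title", "author" or "any"
    text: str   # the words the user entered


def to_pqf(query: Query) -> str:
    """Render the query in Prefix Query Format for a Z39.50 target."""
    return f'@attr 1={BIB1_USE[query.field]} "{query.text}"'


def to_url_params(query: Query) -> str:
    """Render the query for a hypothetical web API (?field=...&q=...)."""
    return urlencode({"field": query.field, "q": query.text})


# Hypothetical targets, each paired with the translator it needs.
TRANSLATORS = {"z3950_catalogue": to_pqf, "web_database": to_url_params}


def translate_for_all(query: Query) -> dict[str, str]:
    """Produce one native-syntax query string per target."""
    return {name: translate(query) for name, translate in TRANSLATORS.items()}


if __name__ == "__main__":
    for target, native in translate_for_all(Query("title", "federated search")).items():
        print(target, "->", native)
```

Each target receives the same user intent in its own syntax; whatever the native engine cannot express is simply not available to the federated layer.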

3. Approaches to Federated Search: Federated search works in the following ways-
a) Search-Time Merging: A query federator intercepts the query and passes it to multiple platforms or databases. The federator then waits for replies from the platforms and, once they are received, merges them into a single results list. This model relies on the data repositories to provide a search function.
            The primary advantage of this approach is ease of implementation, because no additional indexing of content is necessary. The query federation system simply taps into existing systems and extracts results, which are then merged.
            On the other hand, merging search results into a sensible hit list is difficult if it is based on relevancy, because each search engine called scores relevancy in a different way. Performance issues can also occur if the federator waits for the slowest remote search engine to respond.
b) Index-Time Merging: This approach requires content to be acquired into a central index, and it is typical of traditional enterprise search systems.
Because all data is acquired into a central index, sophisticated query enhancement and relevancy algorithms can be applied, and the user can be provided with excellent search results.
The effort needed to acquire the content from the various repositories can be substantial. The indexing process must read each item, and re-read it every time a change occurs in the databases. In some cases, for example where private content behind a paywall is involved, this is not possible.
c) Hybrid Federated Search: In the hybrid approach, content is indexed centrally wherever that is feasible. Repositories for which central indexing is not cost effective (or simply not possible) are federated at query time.
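The search-time merging approach described in a) can be sketched in Python as follows. This is only an illustration under stated assumptions: the three source functions are stand-ins for real remote searches, the 0.2-second timeout is arbitrary, and round-robin interleaving is just one simple way to merge lists whose relevancy scores are not comparable.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait
from itertools import zip_longest


# Stand-in sources (hypothetical); real ones would call remote search APIs.
def search_catalogue(query):
    return [f"catalogue hit {i} for {query!r}" for i in (1, 2, 3)]


def search_journals(query):
    return [f"journal hit {i} for {query!r}" for i in (1, 2)]


def search_slow_archive(query):
    time.sleep(1.0)  # simulates a slow remote search engine
    return [f"archive hit 1 for {query!r}"]


def federated_search(query, sources, timeout=0.2):
    """Send the query to every source in parallel and merge what arrives in time."""
    pool = ThreadPoolExecutor(max_workers=len(sources))
    futures = {pool.submit(search, query): name for name, search in sources.items()}
    done, _pending = wait(futures, timeout=timeout)
    pool.shutdown(wait=False)  # do not block on sources that are still running

    per_source, skipped = [], []
    for future, name in futures.items():
        if future in done and future.exception() is None:
            per_source.append(future.result())
        else:
            skipped.append(name)  # too slow, or the source raised an error

    # Round-robin merge: first hit of each source, then second hit, and so on.
    merged = []
    for row in zip_longest(*per_source):
        merged.extend(hit for hit in row if hit is not None)
    return merged, skipped


SOURCES = {
    "catalogue": search_catalogue,
    "journals": search_journals,
    "archive": search_slow_archive,
}

if __name__ == "__main__":
    hits, skipped = federated_search("deep web", SOURCES)
    print("\n".join(hits))
    print("skipped:", skipped)
```

Capping the wait keeps the slowest source from holding up the whole result list, at the cost of dropping its hits. In a hybrid setup, a lookup against the central index could be registered as one more entry in `SOURCES`.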

4. Federated Search Interfaces: A few federated search interfaces provided by commercial vendors to the libraries are mentioned below-
a) SEARCHit: SEARCHit (https://www.auto-graphics.com/researchit-a-robust-federated-search-application/) is a federated search tool from Auto-Graphics, Inc. that saves time by enabling users to simultaneously search across multiple content resources and view a combined results set.
b) Encore Discovery Solution: Encore Discovery Solution (https://www.iii.com/resources/product-overview-encore-discovery-solution) is a product from Innovative Interfaces, Inc. that fully integrates the discovery process and gives users the types of self-service capabilities they have come to expect through the web, by placing the library and its unique offerings at the forefront.
c) Primo®: Primo (https://www.exlibrisgroup.com/products/primo-discovery-service), a product from Ex Libris, is a one-stop solution for the discovery and delivery of local and remote resources, such as books, journal articles, and digital objects.
d) MasterKey Connect (MKC): MasterKey Connect (https://software.indexdata.com/mkc/mkc-profile.html), a product from Index Data, is a service that provides a simple network Application Programming Interface (API) to thousands of online databases, journals, library catalogues, and other resources. The service allows you to use a simple, XML-based API to access practically any searchable site, whether open access or subscription-based. It consists of open source software based on international standards and communication protocols such as Z39.50, but it also supports non-Z39.50 searching.

5. Examples of Federated Search Engines: In the following, a few examples of federated search engines are given-
a) WorldWideScience (http://www.worldwidescience.org): WorldWideScience is hosted by the U.S. Department of Energy’s Office of Scientific and Technical Information. WorldWideScience.org is a global federated science search engine designed to accelerate scientific discovery and progress by accelerating the sharing of scientific knowledge. Through a multilateral partnership, WorldWideScience.org enables anyone with internet access to launch a single-query search of national scientific databases and portals in more than 70 countries, covering all of the world’s inhabited continents and over three-quarters of the world’s population. From a user’s perspective, WorldWideScience.org makes the databases act as if they were a unified whole.
b) Science.gov (http://www.science.gov): Science.gov is a federated search engine that serves as a gateway to United States government scientific and technical information and research, drawing on information sources that represent most of the government’s R&D output. Science.gov searches over 60 databases and over 2200 selected websites from 15 federal agencies, offering 200 million pages of authoritative U.S. government science information including research and development results.

6. Usefulness of Federated Search: The basic idea of federated search is to improve the accuracy and relevance of individual searches as well as to reduce the amount of time required to search for resources. The main benefits of federated search are the following-
a) Time Saving: Federated search allows a user to search multiple databases at once in real time, arranges the results from the various databases into a useful form, and then presents them to the user. So, the user does not need to visit each database individually.
b) Single Searching Platform for Multiple Resources: Federated search lets the user enter a query into a single platform and search multiple disparate content sources.
c) Familiar Interface: Searching for information using electronic databases can be tedious and time-consuming, as each database uses a different interface and query language with which the user may not be familiar. Federated search is the answer here, as it provides a single interface to the user.

7. Problems with Federated Search Engines: A challenge faced in the implementation of federated search engines is scalability; in other words, the performance of the site decreases as the number of information sources comprising the federated search engine increases.
a) Problem in Indexing of Subscription Databases: Not all federated search engines can search all databases, although most can search Z39.50 and free databases. Most vendors that claim to offer federated search engines cannot currently search all licensed databases because of problems with authentication for subscription databases. Before buying a federated platform for your library, ask vendors to demonstrate that they can search all of your library’s databases using your library’s own authentication, both locally and remotely.
b) Duplication cannot be Avoided: To completely de-dupe search results, it is necessary to download all results from all databases, which is impossible in practice. So, for federated search engines, true de-duplication is virtually impossible. Vendors that claim to do true de-duping are usually just de-duping the first results set returned by the search.
c) Relevancy is Questionable: The abstract and full-text data, as well as the indexing that content providers use to relevancy-rank their content, are unavailable to federated search engines. The content providers have the full article and indexing to work with, but the federated search engines have only the citation to search on, so relevancy ranking in a federated search engine is questionable.
d) Native is Better than Federated: You can’t get better results with a federated search engine than you can with the native database search. A federated search platform searches the same content that the native search engine already covers, and it does not enhance the native database’s search interface. All federated search does is translate a search into something the native database’s engine can understand, so it is restricted to the capabilities of the native database’s search function. Federated searching cannot improve on the native database’s search capabilities; it can only use them.
e) Time Consuming: In a federated search interface, the user needs to wait some time to get all the results. After getting the results, the user needs to click on the relevant resources in the federated result list, which leads to the individual databases. So, ultimately, the user needs to deal with both the federated search platform and the native interfaces, which is again time consuming. General-purpose search interfaces, by contrast, generally return results almost instantly.
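The limited de-duplication mentioned in point b), where only the first results set from each database is de-duped, might look like the following Python sketch. The record fields (`source`, `title`, `year`, `doi`) and the sample records are made up for illustration, and the matching rule (DOI if present, otherwise normalised title plus year) is an assumption, not a description of any vendor's method.

```python
import re


def record_key(record):
    """Build a matching key: the DOI if present, else normalised title plus year."""
    doi = record.get("doi")
    if doi:
        return ("doi", doi.strip().lower())
    # Lower-case the title and collapse punctuation and spacing differences.
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    return ("title", title, record.get("year"))


def dedupe(records):
    """Keep the first record for each key and note every source that returned it."""
    seen = {}
    for record in records:
        key = record_key(record)
        if key in seen:
            seen[key]["sources"].append(record["source"])
        else:
            seen[key] = {**record, "sources": [record["source"]]}
    return list(seen.values())


# Made-up first-page results from three hypothetical databases A, B and C.
FIRST_PAGE = [
    {"source": "A", "title": "Federated Searching: An Overview", "year": 2004, "doi": "10.1000/XYZ123"},
    {"source": "B", "title": "Federated searching - an overview", "year": 2004, "doi": "10.1000/xyz123"},
    {"source": "B", "title": "The Invisible Web", "year": 2001},
    {"source": "C", "title": "The invisible web.", "year": 2001},
]

if __name__ == "__main__":
    for rec in dedupe(FIRST_PAGE):
        print(rec["title"], rec["sources"])
```

Because only the records already fetched are compared, duplicates that sit deeper in each database's result list are never seen, which is why this falls short of true de-duplication. A record that carries a DOI in one database and none in another would also slip through this simple rule.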

8. Conclusion: Federated searching and information retrieval services collect descriptive metadata from multiple, diverse target resources, including but not limited to commercial or licensed electronic resources, databases, Web pages, and library catalogues, and present a single interface to the user. Users can search all the databases from that single interface and thus do not need to remember every web resource the library has access to. It should be noted, however, that the user may have to deal with the native interface (e.g., Sage, Emerald, Elsevier) once he or she clicks an item in the federated search results list, so using the search results may ultimately require dealing with multiple interfaces.



How to Cite this Article?
APA Citation, 7th Ed.:  Barman, B. (2020). A comprehensive book on Library and Information Science. New Publications.
Chicago 16th Ed.:  Barman, Badan. A Comprehensive Book on Library and Information Science. Guwahati: New Publications, 2020.
MLA Citation 8th Ed:  Barman, Badan. A Comprehensive Book on Library and Information Science. New Publications, 2020.

Badan Barman is at present working as an Assistant Professor in the Department of Library and Information Science, Gauhati University, Guwahati-781014, Assam, India. He is the creator of LIS Links (http://www.lislinks.com), India’s most popular social networking website for Library and Information Science professionals. He also created the UGC NET Guide (http://www.netugc.com) and LIS Study (http://www.lisstudy.com) websites.
