Querying IBM Watson Discovery with Salesforce Federated Search

As explained in a separate blog post, federated search lets you search third-party data repositories from the global search box of the standard Salesforce user interface. You can take this functionality one step further by creating your own custom search provider that connects to a REST API such as IBM Watson Discovery. This post shows you how.

Benefits of federated search

Compared to querying via Apex HTTP callouts, using the Federated Search API has several benefits:

  • A search can be executed by a user from within the global search box.
  • The search results are represented through External Objects. This means you have customization options like page layouts, object security, or SOSL queries automatically available.

See it here in action.

IBM Watson Discovery capabilities

The Watson Discovery Service is an IBM offering that enables companies to build cognitive apps that extract value from large amounts of structured and unstructured data. This diagram gives an overview of the service’s capabilities.

Typical use cases include ingesting and normalizing large amounts of unstructured proprietary data or exploiting pre-enriched third-party news. Check out the Watson Discovery Service documentation to learn more about the service and how to set up your first environment and data collection.

DIY custom search provider

Federated search is based on the OpenSearch specification and expects XML responses. Because the Watson Discovery Service offers a REST API that returns JSON, we need to add a basic middle tier, in this case a node.js app, that acts as a proxy. The node.js app takes the federated search request, queries data from Watson Discovery, and transforms its JSON response to XML.

When you set up federated search, Salesforce parses the OpenSearch definition file provided by the node.js app and automatically creates the corresponding External Objects.

By using Salesforce-specific extensions you have full control over the created object and field names. This lets you create dedicated External Objects for different search patterns if needed.

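<Url type="application/atom+xml" sfdc:maxTotalResults="500"
    template="https://your-proxy.example.com/search?q={searchTerms}">
    <!-- The template URL above is a placeholder; point it at your own node.js app. -->
    <sfdc:RecordType name="Watson Discovery">
        <sfdc:Field name="Author" type="string" sortable="true"/>
        <sfdc:Field name="Source" type="string" sortable="true"/>
        <sfdc:Field name="Date" type="date" sortable="true"/>
    </sfdc:RecordType>
</Url>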

XML explanation:

  • Line 2: The template attribute defines which parameters federated search should send to the node.js app in a query request.
  • Line 4: The sfdc:RecordType tag specifies the name of the External Object to be created.
  • Lines 5–7: The sfdc:Field tags define the field names and types for the External Object.
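
To make the role of the template attribute more concrete, here is a minimal sketch of how a search term travels through it. The host name and the exact template string are placeholders of ours, and expandTemplate and readSearchText are hypothetical helper names; only the q parameter comes from the proxy code later in this post.

```javascript
// Hypothetical template: the host is a placeholder, only q is taken from this post.
const template = "https://your-proxy.example.com/search?q={searchTerms}";

// Expand the template roughly the way an OpenSearch client would:
// the search text replaces {searchTerms} in URL-encoded form.
function expandTemplate(tpl, searchTerms) {
  return tpl.replace("{searchTerms}", encodeURIComponent(searchTerms));
}

// On the proxy side, read the search text back out of the request URL.
function readSearchText(requestUrl) {
  return new URL(requestUrl).searchParams.get("q");
}

const requestUrl = expandTemplate(template, "acme & sons");
console.log(requestUrl);
console.log(readSearchText(requestUrl));
```

Encoding the term matters here: without it, the ampersand in the sample search text would be read as the start of a second query parameter.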

Creating OpenSearch output for the Watson Discovery API

IBM delivers SDKs for various programming languages. For our scenario we’re using their node.js SDK to connect from the proxy app to the Watson Discovery Service.

var DiscoveryV1 = require("watson-developer-cloud/discovery/v1");

var discovery = new DiscoveryV1({
  username: process.env.WATSON_USERNAME,
  password: process.env.WATSON_PASSWORD,
  version_date: DiscoveryV1.VERSION_DATE_2017_04_27
});

exports.runQuery = (queryText, callback) => {
  discovery.query(
    {
      environment_id: process.env.ENVIRONMENT_ID,
      collection_id: process.env.COLLECTION_ID,
      query: queryText
    },
    function(err, response) {
      if (err) {
        console.error(err); // log the failure; the callback is not invoked
      } else {
        callback(response); // hand the unmodified Watson response to the caller
      }
    }
  );
};

Code explanation:

  • Line 1: The Watson Discovery module from the node.js SDK gets imported as a variable.
  • Line 3: The variable gets set up with the security credentials of the service as well as the used API version.
  • Line 9: A basic method is exported that takes a query parameter. This is used later from the federated search query.
  • Line 10: The query method of the node.js SDK queries the Watson service with the given environment, collection, and query parameter.
  • Line 20: The Watson response gets returned without any modification as callback response for the runQuery method.

By using Mustache as a templating engine, it’s only a matter of minutes to create an OpenSearch compliant XML output for the REST API without being an XML hero. If you’re not familiar with Mustache (or alternatively Handlebars), you can see in this image how the mapping works.
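
Since the mapping is easier to follow in code, here is a rough sketch of the same idea in plain JavaScript, with template literals standing in for the Mustache template. It assumes the Watson response carries a matching_results count and a results array whose documents have id and title properties; your collection’s fields may differ. The output is deliberately minimal and leaves out feed-level metadata as well as the Salesforce-specific fields.

```javascript
// Escape the characters that must not appear verbatim in XML text.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Map one result document to an Atom <entry>.
// The id and title properties are assumptions about the collection's documents.
function renderEntry(doc) {
  return [
    "  <entry>",
    `    <id>${escapeXml(doc.id)}</id>`,
    `    <title>${escapeXml(doc.title)}</title>`,
    "  </entry>"
  ].join("\n");
}

// Map the whole response to an Atom feed that carries the OpenSearch result count.
function renderFeed(response) {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<feed xmlns="http://www.w3.org/2005/Atom"',
    '      xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">',
    `  <opensearch:totalResults>${response.matching_results}</opensearch:totalResults>`,
    ...response.results.map(renderEntry),
    "</feed>"
  ].join("\n");
}

// Made-up sample data in the assumed response shape.
const sample = {
  matching_results: 1,
  results: [{ id: "doc-1", title: "Q&A <draft>" }]
};
console.log(renderFeed(sample));
```

A Mustache template expresses the same mapping declaratively: a section iterates over results, and each placeholder pulls one property from the current document.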

Since we now have a method that queries the Watson Discovery API, we can add another method that responds to the federated search query request.

var request = require("request");
var watson = require("./watsonDiscovery");
var Mustache = require("mustache");
var fs = require("fs");

let mustacheTemplate = "./resources/response.mustache";

exports.processGet = (req, res) => {
  watson.runQuery(req.query.q, function(response) {
    fs.readFile(mustacheTemplate, function(err, data) {
      if (err) throw err;
      var output = Mustache.render(data.toString(), response);
      res
        .header("content-type", "application/xml")
        .status(200)
        .send(output);
    });
  });
};

Code explanation:

  • Line 6: The mustache template is stored as a file in the node.js app.
  • Line 9: We’re passing the value of the query’s q parameter to our custom Watson query method.
  • Line 10: If the query is successful the mustache file gets read from the local file system.
  • Line 12: Once the file is read, mustache is used to map the JSON data from the Watson call to the XML template.
  • Line 13: The XML gets returned as a response to the federated search request.
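
The flow described above can also be sketched in a form that runs without Express or Watson credentials. makeSearchHandler and fakeResponse are hypothetical names, the injected functions stand in for watson.runQuery and the Mustache rendering, and the check for a missing q parameter is our addition rather than part of the sample app.

```javascript
// Build an Express-style handler from two injected functions (both hypothetical):
//   runQuery(queryText, callback) has the same shape as watson.runQuery
//   render(response) turns the JSON response into an XML string
function makeSearchHandler(runQuery, render) {
  return (req, res) => {
    const queryText = req.query.q;
    if (!queryText) {
      // Reject requests without a search term instead of querying with undefined.
      res.status(400).send("Missing q parameter");
      return;
    }
    runQuery(queryText, response => {
      res
        .header("content-type", "application/xml")
        .status(200)
        .send(render(response));
    });
  };
}

// Stand-in for the response object so the sketch runs on its own.
// It records what the handler sets and supports the same call chaining.
function fakeResponse() {
  const res = { headers: {}, statusCode: null, body: null };
  res.header = (name, value) => { res.headers[name] = value; return res; };
  res.status = code => { res.statusCode = code; return res; };
  res.send = body => { res.body = body; return res; };
  return res;
}

const handler = makeSearchHandler(
  // Fake query: echoes the search text back as a single result title.
  (queryText, callback) => callback({ results: [{ title: queryText }] }),
  // Fake renderer: wraps the first title in a tiny XML fragment.
  response => `<feed><entry>${response.results[0].title}</entry></feed>`
);

const res = fakeResponse();
handler({ query: { q: "acme" } }, res);
console.log(res.statusCode, res.body);
```

Injecting the query and render functions keeps the handler easy to exercise in isolation, which can be handy before pointing Salesforce at the real app.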

Querying outside of the global search box

I mentioned before that federated search allows you to query its data via SOSL. This is because External Objects are used to represent the search results. This comes in handy when you want to use the search capabilities not only from the global search box, but also from within Apex. One example is displaying Account-related information within a custom Lightning component based on the Account name.

@AuraEnabled
public static List<Watson_Discovery__x> getSearchResults(String accountName) {
    List<List<SObject>> allResults = [FIND :accountName IN Name Fields RETURNING Watson_Discovery__x(Title__c) LIMIT 25];
    List<Watson_Discovery__x> results = (List<Watson_Discovery__x>) allResults[0];
    return results;
}

Code explanation:

  • Line 3: A SOSL query for the Watson_Discovery__x object is executed with the given accountName parameter as search value.
  • Line 4: As SOSL always returns a list of object lists, we’re fetching the first one (which makes sense as we don’t query across multiple objects here).

Security considerations

The example node.js app doesn’t include any security mechanisms. For a real-world implementation, always secure it with an OAuth2 server add-on like node-oauth2-server.

Get hands on!

The Watson Discovery Service is a powerful service to ingest, enrich, and query large amounts of unstructured data. Sign up for an IBM Bluemix account and explore it with IBM’s provided test data. From there, get the source code of the node.js app from GitHub and create your own middle tier on Heroku. Follow the instructions in the GitHub repo to set up your first custom federated search connector with Watson Discovery.

About the author

René Winkelmeyer works as a Senior Developer Evangelist at Salesforce. He focuses on enterprise integrations, mobile, and security with the Salesforce Platform. You can follow him on Twitter at @muenzpraeger.

September 26, 2017
