Sunday, December 04, 2016

Composing and reusing request handlers: the "Invisible Queries Request Handler"

Here is an extract of an old article [1] on Lucidworks.com by Grant Ingersoll:

"It is often necessary in many applications to execute more than one query for any given user query.  For instance, in applications that require very high precision (only good results, forgoing marginal results), the app. may have several fields, one for exact matches, one for case-insensitve matches and yet another with stemming.  Given a user query, the app may try the query against the exact match field first and if there is a result, return only that set.  If there are no results, then the app would proceed to search the next field, and so on."


The passage above assumes that the reader is able to change the behaviour of the (client) application, so that it issues several subsequent requests to Solr for a single user query.
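To make that client-side approach concrete, here is a minimal sketch in Java. It is not taken from the article: the field names, the `firstNonEmpty` method and the `search` function (a stand-in for a real Solr call) are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.function.BiFunction;

final class ClientSideFallback {

    /**
     * Runs the user query against each field, in order, and returns the first
     * non-empty result list, or an empty Optional if every field comes back empty.
     *
     * @param search hypothetical search function: (field, query) -> matching document ids
     */
    static Optional<List<String>> firstNonEmpty(
            List<String> fields,
            String userQuery,
            BiFunction<String, String, List<String>> search) {
        for (String field : fields) {
            List<String> hits = search.apply(field, userQuery);
            if (!hits.isEmpty()) {
                return Optional.of(hits); // stop at the first field that matches
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stub standing in for a real Solr call: only the stemmed field matches.
        BiFunction<String, String, List<String>> stub = (field, query) ->
                field.equals("title_stemmed")
                        ? Collections.singletonList("doc-42")
                        : Collections.<String>emptyList();

        List<String> fields = Arrays.asList("title_exact", "title_lowercase", "title_stemmed");
        System.out.println(firstNonEmpty(fields, "running shoes", stub));
    }
}
```

The important point is that this loop lives in the client: every fallback query is an extra round trip that the client application has to know about.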

What if you don't have such control? Imagine you're the relevance engineer of an e-commerce portal that has been built using Magento, which, in this scenario, acts as the Solr client; someone installed and configured the Solr connector and everything is working: when the user submits a search, the connector forwards the request to Solr, which in turn executes a (single) query according to the configuration.

What if that query returns no results? The interaction is over, and the user will probably see something like "Sorry, no results for your search". Although this sounds perfectly reasonable, in this post I'd like to focus on an alternative approach (which could still end with a "no results" message), based on the "invisible queries" idea described in the extract above.

The main point here is a precondition: I cannot change the client code. That may be because, for example:
  • I don't want to introduce custom code in Magento
  • I don't know PHP
  • I'm strictly responsible for the Solr infrastructure and the frontend developer doesn't want to / is not able to implement this feature in a configurable way
  • I want to move as much of the search logic as possible into Solr

What I'd like to do is provide my clients with a single entry point (i.e. one single request handler) and, behind the scenes, be able to execute a workflow like this:


In a few words: I want to execute several request handlers, one after another, until one of them produces a positive result.

Before getting into the implementation, which is very simple and can (and will) be improved with a lot of cool things, note that a working version of this component is available on my GitHub account:


Besides the commented source code, you will find brief documentation, unit and integration tests, and a Maven repository.

The underlying idea is to provide a Facade which is able to chain several handlers; something like this, in solrconfig.xml:


<requestHandler name="/search" class="...InvisibleQueriesRequestHandler">
    <str name="chain">/rh1,/rh2,/rh3</str>
</requestHandler>

where /rh1, /rh2 and /rh3 are standard SearchHandler instances that you've already declared and that you want to chain in the workflow described in the diagram above.
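As an illustration only, this is one way a comma-separated "chain" value like the one above could be turned into an ordered list of handler names. The `ChainConfig` class and `parseChain` method are hypothetical; the actual component may read its configuration differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class ChainConfig {

    /** Splits a value such as "/rh1,/rh2,/rh3" into an ordered list of handler names. */
    static List<String> parseChain(String chainValue) {
        return Arrays.stream(chainValue.split(","))
                .map(String::trim)                 // tolerate spaces around commas
                .filter(name -> !name.isEmpty())   // ignore empty entries
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(parseChain("/rh1,/rh2,/rh3"));
    }
}
```

The order of the names matters, because it is the order in which the handlers would be tried.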

The InvisibleQueriesRequestHandler implementation is very simple: as you can see, the handleRequestBody method executes the configured handler references sequentially, and stops as soon as a query returns positive results (i.e. numFound > 0):

chain.stream()
    // Get the request handler associated with a given name
    .map(refName -> requestHandler(request, refName))
    // Only SearchHandler instances are allowed in the chain
    .filter(SearchHandler.class::isInstance)
    // Execute the handler logic
    .map(handler -> executeQuery(request, response, params, handler))
    // Discard negative (i.e. no results) executions
    .filter(qresponse -> howManyFound(qresponse) > 0)
    // Stop at the first positive execution
    .findFirst()
    // or, if there are no positive executions, just return an empty response
    .orElse(emptyResponse(request, response));
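The reason this pipeline stops at the first positive execution is that Java streams are lazy: elements are pulled through map and filter one at a time, and findFirst short-circuits. The following self-contained sketch shows that behaviour with stubs instead of real Solr handlers; `LazyChainDemo`, `firstPositive` and the hit counts are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class LazyChainDemo {

    /**
     * Executes the handlers in chain order and returns the first positive
     * hit count, or 0 if no handler finds anything.
     *
     * @param handler hypothetical stand-in for a handler execution: name -> number of hits
     */
    static int firstPositive(List<String> chain, Function<String, Integer> handler) {
        return chain.stream()
                .map(handler)               // executed lazily, one name at a time
                .filter(found -> found > 0) // discard executions with no results
                .findFirst()                // short-circuits the rest of the chain
                .orElse(0);
    }

    public static void main(String[] args) {
        List<String> executed = new ArrayList<>();

        // Stub: only /rh2 finds something; every call is recorded.
        Function<String, Integer> stub = name -> {
            executed.add(name);
            return name.equals("/rh2") ? 7 : 0;
        };

        int found = firstPositive(Arrays.asList("/rh1", "/rh2", "/rh3"), stub);
        System.out.println("found = " + found + ", executed = " + executed);
    }
}
```

Running it shows that /rh3 is never executed. One detail worth keeping in mind: the argument of orElse is evaluated eagerly, even when a positive result exists, so if building the fallback value is expensive or has side effects, orElseGet with a supplier defers it.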

I tried to use a composed method approach, so the rest of the class is made up of several small and (hopefully) cohesive methods. I think the code is more readable this way.

As I said before, although the handler works, it can be improved a lot. One useful addition, for example, would be to put default parameters / values in the InvisibleQueriesRequestHandler and have each chained handler just override them (instead of declaring everything from scratch).
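A minimal sketch of that "defaults plus overrides" idea, using plain maps rather than Solr's own parameter classes. This is not part of the current component; the `ParamDefaults` class, the `withDefaults` method and the sample values are assumptions, and real Solr parameters can be multi-valued, which this simplification ignores.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ParamDefaults {

    /**
     * Returns a new map containing the shared defaults, with the
     * handler-specific values taking precedence. Inputs are not modified.
     */
    static Map<String, String> withDefaults(Map<String, String> defaults,
                                            Map<String, String> overrides) {
        Map<String, String> merged = new LinkedHashMap<>(defaults);
        merged.putAll(overrides); // handler-specific values win
        return merged;
    }

    public static void main(String[] args) {
        // Shared defaults that would be declared once, on the facade handler.
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("defType", "edismax");
        defaults.put("rows", "10");

        // What a single chained handler would declare (illustrative values).
        Map<String, String> exactMatch = new LinkedHashMap<>();
        exactMatch.put("qf", "title_exact");
        exactMatch.put("rows", "20");

        System.out.println(withDefaults(defaults, exactMatch));
    }
}
```

With something like this, each handler in the chain would only need to declare what makes it different, for instance the field it searches on.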

If you want to give it a try without getting into the implementation details, there's a Maven repository with the latest stable version of the library; see the README in the GitHub repository for detailed instructions.

Feel free to try it and, as usual, any feedback is warmly welcome!