streamsx.solr

Integration with Apache Solr

This project is maintained by Alex-Cook4

Welcome to the Solr Toolkit (incubating)

We are excited to offer Solr integration with Streams, bringing one of the most powerful search platforms to your stream processing. Find out more about why you should be excited about Solr here.

Operators

For toolkit documentation, read the SPLDOC.

SolrDocumentSink

This operator writes tuples as Solr documents to a Solr collection. It takes a set of attributes and, optionally, a map<ustring,ustring> on its input port. The attributes are committed to the Solr collection at a configurable interval (elapsed time or number of tuples). If specified, the map (attribute: atomicUpdateMap) must give the type of update for each attribute: set, add, remove, removeregex, or inc. The map should NOT include the uniqueIdentifier attribute, because that is provided by a parameter. See the SolrDocSinkSample for map syntax. If no map is provided, all attributes are committed as if the map specified "set" for each of them. The order of tuples within a committed buffer is not guaranteed.
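To make the role of the update map concrete, here is a minimal Python sketch of how a set of attributes and an update map could be combined into a document in Solr's JSON atomic-update syntax. It illustrates the Solr update format only, not the operator's internals, which are not described here. The function name, its arguments, and the sample field names (id, price, tags, copies) are hypothetical.

```python
import json

# Update types listed above for the atomicUpdateMap.
ALLOWED_UPDATES = {"set", "add", "remove", "removeregex", "inc"}


def build_atomic_update(unique_id_field, unique_id, attributes, update_map=None):
    """Build one Solr atomic-update document as a plain dict.

    All names here are hypothetical and chosen for this sketch only.
    """
    if update_map is None:
        # No map provided: behave as if every attribute were mapped to "set".
        update_map = {name: "set" for name in attributes}
    if unique_id_field in update_map:
        raise ValueError("the update map must not include the unique identifier")

    doc = {unique_id_field: unique_id}
    for name, value in attributes.items():
        if name not in update_map:
            raise ValueError(f"no update type given for attribute: {name}")
        update_type = update_map[name]
        if update_type not in ALLOWED_UPDATES:
            raise ValueError(f"unsupported update type: {update_type}")
        # Solr's atomic-update syntax wraps the value in {"<type>": value}.
        doc[name] = {update_type: value}
    return doc


doc = build_atomic_update(
    "id",
    "book-1",
    {"price": "12.50", "tags": "fiction", "copies": "1"},
    {"price": "set", "tags": "add", "copies": "inc"},
)
print(json.dumps([doc]))
```

Values are kept as strings in this sketch because the map type is map<ustring,ustring>; how the operator converts them for numeric updates such as inc is not stated here.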

SolrQuery

This operator queries a Solr server. One of the incoming attributes must be a Solr query (default: solr_query). The query attribute should look like the portion after "/select/?" in an HTTP query URL. The user is responsible for encoding special characters correctly: queries should look like queries made from a browser. For example, spaces should be encoded as '%20', while '=' and '&' characters should remain as is. Example:

solr_query = "q=*:*&sort=id%20asc&fq=cat:electronics&fl=id,cat,name" ;
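If the query string is assembled by a program before it reaches the operator, the encoding rules above can be applied with a standard URL-quoting routine. The following Python sketch is one way to do that, using only the standard library. The helper name build_solr_query is hypothetical, and the choice to leave '*', ':' and ',' unencoded is an assumption made to match the example above.

```python
from urllib.parse import quote


def build_solr_query(params):
    """Join (name, value) pairs into the text that follows "/select/?"."""
    # '*', ':' and ',' are kept literal to match the example above;
    # spaces and other reserved characters are percent-encoded.
    # '=' and '&' are only ever added here as separators.
    return "&".join(f"{name}={quote(value, safe='*:,')}" for name, value in params)


solr_query = build_solr_query([
    ("q", "*:*"),
    ("sort", "id asc"),
    ("fq", "cat:electronics"),
    ("fl", "id,cat,name"),
])
print(solr_query)
```

With these inputs the sketch produces the same string as the example: the space in "id asc" becomes %20 and everything else is passed through unchanged.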

See the sample and tests for more query examples. A Solr collection must either be specified as an operator parameter or be sent in as an attribute with each tuple. The response to the Solr query is placed in an output attribute (default: solr_response). The responseFormat, numberOfRows, and omitHeader parameters are for convenience; the same settings can instead be provided as part of the solr_query. Query specifications override parameter values (for example, rows=5 in the query overrides numberOfRows=15 as a parameter). Incoming attributes that are unrelated to the query and have corresponding output attributes are forwarded along with the solr_response. An optional error port can receive failed queries and their error messages.
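The precedence rule between the convenience parameters and the query string can be sketched in a few lines of Python. This models the described behavior only and is not the operator's code. The function and argument names are hypothetical, and the mapping of the convenience parameters onto Solr's wt, rows, and omitHeader request parameters is an assumption.

```python
from urllib.parse import parse_qsl

# Solr request parameters assumed to correspond to the convenience parameters.
OVERRIDABLE = ("wt", "rows", "omitHeader")


def effective_settings(solr_query, response_format=None, number_of_rows=None,
                       omit_header=None):
    """Return the settings that win after the query overrides the parameters."""
    settings = {}
    # Start from the operator-parameter values, if any were given.
    if response_format is not None:
        settings["wt"] = response_format
    if number_of_rows is not None:
        settings["rows"] = str(number_of_rows)
    if omit_header is not None:
        settings["omitHeader"] = "true" if omit_header else "false"
    # Anything specified in the query itself takes precedence.
    for name, value in parse_qsl(solr_query, keep_blank_values=True):
        if name in OVERRIDABLE:
            settings[name] = value
    return settings


print(effective_settings("q=*:*&rows=5", number_of_rows=15))
print(effective_settings("q=*:*", response_format="json", number_of_rows=15))
```

In the first call the rows=5 in the query wins over the parameter value of 15; in the second call the query sets none of these values, so the parameter values are used as given.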

SolrStemmer

This operator is used for stemming words. For example, apples -> apple, walked -> walk, talked -> talk. This operator is the least developed and has been minimally tested.

Authors and Contributors

This streamsx.solr project was proposed by @jjbosox and initial code was contributed by @Alex-Cook4.