Operator HBASEGet

Toolkit: com.ibm.streamsx.hbase 3.9.3 (IBMStreams). Namespace: com.ibm.streamsx.hbase.

The HBASEGet operator gets tuples from an HBase table. It is similar to the ODBCEnrich operator in the Database Toolkit. It puts the result in the attribute of the output port that is specified in the outAttrName parameter. The operator accepts four types of queries. In the simplest case, you specify a row, columnFamily, and columnQualifier, and the output value is the single value in that entry.
stream<rstring who, rstring infoType, rstring requestedDetail, rstring value,
       int32 numResults> queryResults = HBASEGet(queries) {
  param
    tableName : "streamsSample_lotr";
    rowAttrName : "who";
    columnFamilyAttrName : "infoType";
    columnQualifierAttrName : "requestedDetail";
    outAttrName : "value";
    outputCountAttr : "numResults";
}
If the type of the attribute given by outAttrName is a tuple, the operator interprets the attribute names of that tuple as the column qualifiers, and thus gets multiple values at once. Suppose that you represent a book in HBase as a row with a single column family, and entries with different column qualifiers represent the title, the author_fname, the author_lname, and the year. We could do a separate query for each column qualifier using the approach above, but as a shortcut, we let you populate the whole tuple at once. Let GetBookType represent the type of a book.
type GetBookType = rstring title, rstring author_fname, rstring author_lname, rstring year, rstring fiction;
Then we query the table as follows:
stream<rstring key,GetBookType value> enriched = HBASEGet(keyStream) {
  param
    rowAttrName: "key";
    tableName: "streamsSample_books";
    staticColumnFamily: "all";
    outAttrName: "value";
}

Additionally, you can get all the entries for a given row-columnFamily pair by supplying an output attribute that is of type map. The map is populated with column qualifiers mapping to their corresponding values. If you want all the entries for a given row, you can supply an output attribute that is a map of maps. The outer map takes column families to an inner map of column qualifiers to values. See the GetSample in samples for details.
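
The two map-based query forms described above could be sketched as follows. This is a minimal illustration and not the GetSample itself: the composite name, the Beacon source, and the row and column family values ("Gandalf", "location") are hypothetical, and the sketch assumes that HBASE_HOME is set because no hbaseSite parameter is given.

```spl
use com.ibm.streamsx.hbase::HBASEGet;

composite GetMapsSketch {
  graph
    // Hypothetical query source: a single row key and column family.
    stream<rstring who, rstring infoType> queries = Beacon() {
      param
        iterations : 1u;
      output
        queries : who = "Gandalf", infoType = "location";
    }

    // Output attribute of type map: every column qualifier in the
    // given (row, columnFamily) pair mapped to its value.
    stream<rstring who, rstring infoType, map<rstring, rstring> value>
        familyEntries = HBASEGet(queries) {
      param
        tableName : "streamsSample_lotr";
        rowAttrName : "who";
        columnFamilyAttrName : "infoType";
        outAttrName : "value";
    }

    // Output attribute of type map of maps: no column family is given,
    // so the outer key is the column family and the inner key is the
    // column qualifier.
    stream<rstring who, map<rstring, map<rstring, rstring>> value>
        rowEntries = HBASEGet(queries) {
      param
        tableName : "streamsSample_lotr";
        rowAttrName : "who";
        outAttrName : "value";
    }
}
```

In this sketch the type of the value attribute, together with which column parameters are supplied, is what selects the query form.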

If you wish to get multiple versions of a given entry, you can do that by using a list type instead of a primitive type. In all cases, if an attribute with the name outputCountAttr exists on the output port, it is populated with the number of values found. This behavior can help you distinguish between the case where the value returned is zero and the case where no such entry existed in HBase.
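
A multi-version query might look like the sketch below. The composite name, the Beacon source, and the row, column family, and column qualifier values are hypothetical. The sketch also assumes that the column family in HBase is configured to retain more than one version; otherwise only one value can come back.

```spl
use com.ibm.streamsx.hbase::HBASEGet;

composite GetVersionsSketch {
  graph
    // Hypothetical query source for a single cell.
    stream<rstring who, rstring infoType, rstring requestedDetail> queries = Beacon() {
      param
        iterations : 1u;
      output
        queries : who = "Gandalf", infoType = "location",
                  requestedDetail = "beginFellowship";
    }

    // A list type for the output attribute asks for multiple versions
    // of the same entry instead of a single value.
    stream<rstring who, rstring infoType, rstring requestedDetail,
           list<rstring> value, int32 numResults> allVersions = HBASEGet(queries) {
      param
        tableName : "streamsSample_lotr";
        rowAttrName : "who";
        columnFamilyAttrName : "infoType";
        columnQualifierAttrName : "requestedDetail";
        outAttrName : "value";
        outputCountAttr : "numResults";
        maxVersions : 0; // 0 means all versions, per the maxVersions parameter
    }
}
```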

Behavior in a consistent region

The HBASEGet operator can be in a consistent region. It is treated as a stateless operator, which means that if the underlying HBase table changes between the first time a tuple is sent and the time it is replayed, the HBASEGet operator might give a different answer. As a result, if you use this operator in a consistent region in conjunction with an operator that changes the state of a tuple, you could get unexpected behavior. This might happen, for example, if the HBASEGet operator feeds a Functor that increments a value and then puts the tuple back into HBase by using the HBASEPut operator. HBASEGet is not supported as the source of a consistent region.
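
As a rough sketch of the supported arrangement, HBASEGet can sit downstream of the operator that starts the region. The composite name, the Beacon source, the checkpoint period, and the query values are hypothetical; the sketch assumes a Streams version that supports the @consistent annotation and that HBASE_HOME is set.

```spl
use com.ibm.streamsx.hbase::HBASEGet;

composite ConsistentGetSketch {
  graph
    // An application with a consistent region needs a JobControlPlane.
    () as JCP = JobControlPlane() {}

    // The region starts at the source, not at HBASEGet.
    @consistent(trigger = periodic, period = 30.0)
    stream<rstring who, rstring infoType, rstring requestedDetail> queries = Beacon() {
      param
        iterations : 100u;
        period : 1.0;
      output
        queries : who = "Gandalf", infoType = "location",
                  requestedDetail = "beginFellowship";
    }

    // HBASEGet is stateless here: a replayed tuple is looked up again,
    // so it reflects whatever the table contains at replay time.
    stream<rstring who, rstring infoType, rstring requestedDetail,
           rstring value> results = HBASEGet(queries) {
      param
        tableName : "streamsSample_lotr";
        rowAttrName : "who";
        columnFamilyAttrName : "infoType";
        columnQualifierAttrName : "requestedDetail";
        outAttrName : "value";
    }
}
```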

Summary

Ports
This operator has 1 input port and 2 output ports.
Windowing
This operator does not accept any windowing configurations.
Parameters
This operator supports 14 parameters.

Required: rowAttrName

Optional: authKeytab, authPrincipal, columnFamilyAttrName, columnQualifierAttrName, hbaseSite, maxVersions, minTimestamp, outAttrName, outputCountAttr, staticColumnFamily, staticColumnQualifier, tableName, tableNameAttribute

Metrics
This operator does not report any metrics.

Properties

Implementation
Java

Input Ports

Ports (0)

Description of which tuples to get

Output Ports

Assignments
Java operators do not support output assignments.
Ports (0)

Output tuple with value or values from HBASE

Ports (1)

Optional port for error information. This port submits an error message and a tuple when an error occurs during an HBase action.

Parameters

authKeytab

The authKeytab parameter specifies the Kerberos keytab file that is created for the principal.

authPrincipal

The authPrincipal parameter specifies the Kerberos principal, which is typically the principal that is created for the HBase server.

columnFamilyAttrName

Name of the attribute on the input tuple containing the columnFamily. Cannot be used with staticColumnFamily.

columnQualifierAttrName

Name of the attribute on the input tuple containing the columnQualifier. Cannot be used with staticColumnQualifier.

hbaseSite

The hbaseSite parameter specifies the path of the hbase-site.xml file. This is the recommended way to specify the HBase configuration. If it is not specified, then HBASE_HOME must be set when the operator runs, and the operator uses $HBASE_HOME/conf/hbase-site.xml.
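
A connection setup with an explicit configuration file and Kerberos credentials might be sketched as follows. The composite name, the Beacon source, the file paths, and the principal are all hypothetical placeholders; substitute the values for your own cluster.

```spl
use com.ibm.streamsx.hbase::HBASEGet;

composite GetWithSiteSketch {
  graph
    // Hypothetical query source for a single cell.
    stream<rstring who, rstring infoType, rstring requestedDetail> queries = Beacon() {
      param
        iterations : 1u;
      output
        queries : who = "Gandalf", infoType = "location",
                  requestedDetail = "beginFellowship";
    }

    stream<rstring who, rstring infoType, rstring requestedDetail,
           rstring value> results = HBASEGet(queries) {
      param
        // Hypothetical path; with hbaseSite set, HBASE_HOME is not needed.
        hbaseSite : "/opt/hbase/conf/hbase-site.xml";
        // Hypothetical Kerberos settings for a secured cluster.
        authKeytab : "/opt/hbase/conf/hbase.service.keytab";
        authPrincipal : "hbase/host.example.com@EXAMPLE.COM";
        tableName : "streamsSample_lotr";
        rowAttrName : "who";
        columnFamilyAttrName : "infoType";
        columnQualifierAttrName : "requestedDetail";
        outAttrName : "value";
    }
}
```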

maxVersions

This parameter specifies the maximum number of versions that the operator returns. It defaults to a value of one. A value of 0 indicates that the operator gets all versions.

minTimestamp

This parameter specifies the minimum timestamp that is used for queries. The operator does not return any entries with a timestamp older than this value. Unless you specify the maxVersions parameter, the operator returns only one entry in this time range.
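
The combination of minTimestamp and maxVersions could be sketched as below. Two things here are assumptions rather than statements from this page: that the parameter takes an int64 value, and that the table uses the HBase default of timestamps in milliseconds since the epoch. The composite name, the Beacon source, the query values, and the timestamp itself are hypothetical.

```spl
use com.ibm.streamsx.hbase::HBASEGet;

composite GetSinceSketch {
  graph
    // Hypothetical query source for a single cell.
    stream<rstring who, rstring infoType, rstring requestedDetail> queries = Beacon() {
      param
        iterations : 1u;
      output
        queries : who = "Gandalf", infoType = "location",
                  requestedDetail = "beginFellowship";
    }

    // A list output plus maxVersions returns up to three versions,
    // none older than the given timestamp.
    stream<rstring who, rstring infoType, rstring requestedDetail,
           list<rstring> value> recent = HBASEGet(queries) {
      param
        tableName : "streamsSample_lotr";
        rowAttrName : "who";
        columnFamilyAttrName : "infoType";
        columnQualifierAttrName : "requestedDetail";
        outAttrName : "value";
        minTimestamp : 1500000000000l; // assumed int64, milliseconds since epoch
        maxVersions : 3;
    }
}
```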

outAttrName

This parameter specifies the name of the attribute of the output port in which the operator puts the retrieval results. The data type for the attribute depends on whether you specified a columnFamily or columnQualifier.

outputCountAttr

This parameter specifies the name of the attribute of the output port where the operator puts a count of the values it populated.
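
One way to use the count downstream is to split results into found and missing entries with a standard Filter operator. This is only a sketch: the composite name, the ResultT type, the Beacon source, the stream names, and the query values are hypothetical.

```spl
use com.ibm.streamsx.hbase::HBASEGet;

composite SplitByCountSketch {
  type
    ResultT = rstring who, rstring infoType, rstring requestedDetail,
              rstring value, int32 numResults;
  graph
    // Hypothetical query source for a single cell.
    stream<rstring who, rstring infoType, rstring requestedDetail> queries = Beacon() {
      param
        iterations : 1u;
      output
        queries : who = "Gandalf", infoType = "location",
                  requestedDetail = "beginFellowship";
    }

    stream<ResultT> queryResults = HBASEGet(queries) {
      param
        tableName : "streamsSample_lotr";
        rowAttrName : "who";
        columnFamilyAttrName : "infoType";
        columnQualifierAttrName : "requestedDetail";
        outAttrName : "value";
        outputCountAttr : "numResults";
    }

    // Tuples with at least one value go to the first port; tuples for
    // which nothing was found in HBase go to the second port.
    (stream<ResultT> found; stream<ResultT> missing) = Filter(queryResults) {
      param
        filter : numResults > 0;
    }
}
```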

rowAttrName

Name of the attribute on the input tuple containing the row. It is required.

staticColumnFamily

If this parameter is specified, it will be used as the columnFamily for all operations. (Compare to columnFamilyAttrName.) For HBASEScan, it can have cardinality greater than one.

staticColumnQualifier

If this parameter is specified, it will be used as the columnQualifier for all tuples. HBASEScan allows it to be specified multiple times.
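
When every query targets the same cell within a row, both static parameters can replace the per-tuple attributes, so the input tuple only has to carry the row key. The sketch below illustrates this; the composite name, the Beacon source, and the row, column family, and column qualifier values are hypothetical.

```spl
use com.ibm.streamsx.hbase::HBASEGet;

composite GetStaticColumnSketch {
  graph
    // Hypothetical source that supplies only the row key.
    stream<rstring who> names = Beacon() {
      param
        iterations : 1u;
      output
        names : who = "Gandalf";
    }

    // The column family and qualifier are fixed for all tuples, so
    // columnFamilyAttrName and columnQualifierAttrName are not used.
    stream<rstring who, rstring value> startLocations = HBASEGet(names) {
      param
        tableName : "streamsSample_lotr";
        rowAttrName : "who";
        staticColumnFamily : "location";
        staticColumnQualifier : "beginFellowship";
        outAttrName : "value";
    }
}
```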

tableName

Name of the HBase table. This parameter is optional, but one of the parameters 'tableName' or 'tableNameAttribute' must be set on the operator. Cannot be used with 'tableNameAttribute'. If the table does not exist, the operator throws an exception.

tableNameAttribute

Name of the attribute on the input tuple containing the tableName. Use this parameter to pass the table name to the operator via input port. Cannot be used with parameter 'tableName'. This is suitable for tables with the same schema.
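
A per-tuple table name might be supplied as in the sketch below. It assumes that the parameter takes the attribute name as a string, in the same way as rowAttrName, which is how the description above reads. The composite name, the Beacon source, the "table" attribute, the row key, and the column qualifier are hypothetical.

```spl
use com.ibm.streamsx.hbase::HBASEGet;

composite GetByTableAttrSketch {
  graph
    // Hypothetical source: each tuple names the table to query.
    stream<rstring table, rstring key> lookups = Beacon() {
      param
        iterations : 1u;
      output
        lookups : table = "streamsSample_books", key = "book1";
    }

    // tableName is omitted because tableNameAttribute is used instead.
    // All tables named in the input are expected to share one schema.
    stream<rstring table, rstring key, rstring value> titles = HBASEGet(lookups) {
      param
        tableNameAttribute : "table"; // assumed to be passed as a string
        rowAttrName : "key";
        staticColumnFamily : "all";
        staticColumnQualifier : "title";
        outAttrName : "value";
    }
}
```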

Libraries

Operator class library
Library Path: ../../impl/lib/com.ibm.streamsx.hbase.jar, ../../opt/downloaded/*