Processing SPL tuples in Python

SPL tuples are converted to Python objects and passed to a decorated callable.

Overview

For each SPL tuple arriving at an input port a Python callable is invoked with the SPL tuple converted to Python values suitable for the function call. How the tuple is passed is defined by the tuple passing style.

Tuple Passing Styles

An input tuple can be passed to a Python function using one of several styles:
  • dictionary
  • tuple
  • attributes by name (not yet implemented)
  • attributes by position

Dictionary

Passing the SPL tuple as a Python dictionary is flexible and makes the operator independent of any schema. A disadvantage is reduced readability of the Python function, since it has no formal parameters, though lookups such as tuple['id'] mitigate that to some extent. If the function is general purpose and can derive meaning from the keys, which are the attribute names, then **kwargs can be useful.

When the only function parameter is **kwargs, e.g. def myfunc(**tuple):, then the passing style is dictionary.

All of the attributes are passed in the dictionary using the attribute name as the key.
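
The dictionary style can be pictured in plain Python, without the Streams runtime. This is only a sketch: the function describe and the dictionary spl_tuple are hypothetical stand-ins for a decorated callable and an SPL tuple already converted to Python values.

```python
# Plain-Python sketch of the dictionary style; no Streams runtime is involved.
# describe and spl_tuple are hypothetical names used only for illustration.

def describe(**attrs):
    # Every attribute arrives in a single dict keyed by attribute name.
    return sorted(attrs.keys()), attrs['id']

# Stand-in for an SPL tuple converted to Python values.
spl_tuple = {'id': 'battery', 'temp': 23.7, 'increase': True}

# The effect is comparable to unpacking the attributes as keyword arguments.
names, ident = describe(**spl_tuple)
print(names, ident)
```

Because the function only looks up the keys it needs, attributes can be added to the schema without changing the function.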

Tuple

Passing the SPL tuple as a Python tuple is flexible and does not tie the operator to a fixed schema, but code that relies on the position of values is brittle to changes in the SPL schema. Another disadvantage is reduced readability of the Python function, since it has no formal parameters. However, if the function is general purpose and independent of the tuple contents, *args can be useful.

When the only function parameter is *args (e.g. def myfunc(*tuple):) then the passing style is tuple.

All of the attributes are passed as a Python tuple with the order of values matching the order of the SPL schema.
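
The tuple style can be pictured the same way in plain Python. In this sketch, summarize and spl_values are hypothetical names; the values are assumed to be listed in SPL schema order.

```python
# Plain-Python sketch of the tuple style; no Streams runtime is involved.
# summarize and spl_values are hypothetical names used only for illustration.

def summarize(*values):
    # values is a Python tuple ordered like the SPL schema.
    return len(values), values[0]

# Stand-in for an SPL tuple converted to Python values in schema order.
spl_values = ('battery', 23.7, True)

# The effect is comparable to unpacking the values as positional arguments.
count, first = summarize(*spl_values)
print(count, first)
```

The lookup values[0] shows the brittleness: it silently returns a different attribute if a new one is added at the front of the schema.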

Attributes by name

(not yet implemented)

Passing attributes by name can be robust against changes in the SPL schema, e.g. additional attributes being added in the middle of the schema, but does require that the SPL schema has matching attribute names.

When attributes by name is used then SPL tuple attributes are passed to the function by name for formal parameters. Order of the attributes and parameters need not match. This is supported for function parameters of kind POSITIONAL_OR_KEYWORD and KEYWORD_ONLY.

If the function signature also contains a parameter of the form **kwargs (VAR_KEYWORD) then any attributes not bound to formal parameters are passed in its dictionary using the attribute name as the key.

If the function signature also contains an arbitrary argument list *args then any attributes not bound to formal parameters or to **kwargs are passed in order of the SPL schema.

If there are only formal parameters any non-bound attributes are not passed into the function.
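
The by-name rules above can be approximated with the standard inspect module. The helper bind_by_name below is hypothetical and is not part of the toolkit (the style itself is marked as not yet implemented). It covers formal parameters and **kwargs only, and leaves out the *args case.

```python
import inspect

def bind_by_name(func, spl_tuple):
    """Select keyword arguments for func from a dict of SPL attributes."""
    params = inspect.signature(func).parameters
    named_kinds = (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                   inspect.Parameter.KEYWORD_ONLY)
    # Bind attributes whose names match formal parameters.
    kwargs = {name: spl_tuple[name]
              for name, p in params.items()
              if p.kind in named_kinds and name in spl_tuple}
    # With **kwargs present, the unbound attributes are passed through too.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        for name, value in spl_tuple.items():
            kwargs.setdefault(name, value)
    return kwargs

def f(increase, temp):
    return (increase, temp)

def g(id, **rest):
    return (id, rest)

spl_tuple = {'id': 'battery', 'temp': 23.7, 'increase': True}
result_f = f(**bind_by_name(f, spl_tuple))  # id is unbound and dropped
result_g = g(**bind_by_name(g, spl_tuple))  # temp and increase land in rest
print(result_f, result_g)
```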

Attributes by position

Passing attributes by position allows the SPL operator to be independent of the SPL schema but is brittle to changes in the SPL schema. For example a function expecting an identifier and a sensor reading as the first two attributes would break if an attribute representing region was added as the first SPL attribute.

When attributes by position is used then SPL tuple attributes are passed to the function by position for formal parameters. The first SPL attribute in the tuple is passed as the first parameter. This is supported for function parameters of kind POSITIONAL_OR_KEYWORD.

If the function signature also contains an arbitrary argument list *args (VAR_POSITIONAL) then any attributes not bound to formal parameters are passed in order of the SPL schema.

The function signature must not contain a parameter of the form **kwargs (VAR_KEYWORD).

If there are only formal parameters any non-bound attributes are not passed into the function.

The SPL schema must have at least the number of positional arguments the function requires.
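
The by-position rules above can also be sketched with the inspect module. The helper bind_by_position is hypothetical and only mirrors the rules as described here; the error messages are illustrative, not the toolkit's.

```python
import inspect

def bind_by_position(func, spl_values):
    """Select positional arguments for func from values in schema order."""
    P = inspect.Parameter
    params = list(inspect.signature(func).parameters.values())
    # **kwargs is not allowed with this style.
    if any(p.kind is P.VAR_KEYWORD for p in params):
        raise TypeError('**kwargs cannot be used with attributes by position')
    formal = [p for p in params if p.kind is P.POSITIONAL_OR_KEYWORD]
    required = [p for p in formal if p.default is P.empty]
    # The schema must supply at least the required positional arguments.
    if len(spl_values) < len(required):
        raise TypeError('schema provides too few attributes')
    # With *args, every attribute is passed; otherwise the extras are dropped.
    if any(p.kind is P.VAR_POSITIONAL for p in params):
        return tuple(spl_values)
    return tuple(spl_values[:len(formal)])

def f(a, b):
    return (a, b)

def g(key, *rest):
    return (key, rest)

values = ('battery', 23.7, True)
result_f = f(*bind_by_position(f, values))  # third attribute is dropped
result_g = g(*bind_by_position(g, values))  # remaining attributes go to rest
print(result_f, result_g)
```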

Selecting the style

For signatures containing only a parameter of the form *args or **kwargs the style is implicitly defined:

  • def f(**tuple) - dictionary - tuple will contain a dictionary of all of the SPL tuple attributes' values, with the keys being the attribute names.
  • def f(*tuple) - tuple - tuple will contain all of the SPL tuple attributes' values in the order of the SPL schema definition.

Otherwise the style is set by the style parameter to the decorator, defaulting to attributes by name. The style value can be set to:
  • 'name' - attributes by name
  • 'position' - attributes by position

Note: For backwards compatibility @spl.pipe and @spl.sink always use attributes by position and do not support **kwargs. They do not support the style parameter.
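
The selection rules can be summarized as a small decision function. This is a sketch only: passing_style is a hypothetical helper that inspects a plain function signature, and it does not model the @spl.pipe and @spl.sink exception noted above.

```python
import inspect

def passing_style(func, style='name'):
    """Return the tuple passing style implied by func's signature."""
    params = list(inspect.signature(func).parameters.values())
    # A lone **kwargs or *args parameter defines the style implicitly.
    if len(params) == 1:
        if params[0].kind is inspect.Parameter.VAR_KEYWORD:
            return 'dictionary'
        if params[0].kind is inspect.Parameter.VAR_POSITIONAL:
            return 'tuple'
    # Otherwise the explicit style applies, defaulting to 'name'.
    if style not in ('name', 'position'):
        raise ValueError("style must be 'name' or 'position'")
    return style

def as_dict(**tuple): pass
def as_tuple(*tuple): pass
def by_name(id, temp): pass

print(passing_style(as_dict))                     # dictionary
print(passing_style(as_tuple))                    # tuple
print(passing_style(by_name))                     # name
print(passing_style(by_name, style='position'))   # position
```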

Examples

These examples show how an SPL tuple with the schema and value:

tuple<rstring id, float64 temp, boolean increase>
{id='battery', temp=23.7, increase=true}

is passed into a variety of functions by showing the effective Python call and the resulting values of the function's parameters.

Dictionary consuming all attributes by **kwargs:

from streamsx.spl import spl  # SPL decorators, needed by every example below

@spl.map()
def f(**tuple):
    pass
# f(id='battery', temp=23.7, increase=True)
#    tuple={'id':'battery', 'temp':23.7, 'increase':True}

Tuple consuming all attributes by *args:

@spl.map()
def f(*tuple):
    pass
# f('battery', 23.7, True)
#    tuple=('battery', 23.7, True)

Attributes by name consuming all attributes:

@spl.map()
def f(id, temp, increase):
    pass
# f(id='battery', temp=23.7, increase=True)
#    id='battery'
#    temp=23.7
#    increase=True

Attributes by name consuming a subset of attributes:

@spl.map()
def f(id, temp):
    pass
# f(id='battery', temp=23.7)
#    id='battery'
#    temp=23.7

Attributes by name consuming a subset of attributes in a different order:

@spl.map()
def f(increase, temp):
    pass
# f(temp=23.7, increase=True)
#    increase=True
#    temp=23.7

Attributes by name consuming id by name and remaining attributes by **kwargs:

@spl.map()
def f(id, **tuple):
    pass
# f(id='battery', temp=23.7, increase=True)
#    id='battery'
#    tuple={'temp':23.7, 'increase':True}

Attributes by name consuming id by name and remaining attributes by *args:

@spl.map()
def f(id, *tuple):
    pass
# f(id='battery', 23.7, True)
#    id='battery'
#    tuple=(23.7, True)

Attributes by position consuming all attributes:

@spl.map(style='position')
def f(key, value, up):
    pass
# f('battery', 23.7, True)
#    key='battery'
#    value=23.7
#    up=True

Attributes by position consuming a subset of attributes:

@spl.map(style='position')
def f(a, b):
    pass
# f('battery', 23.7)
#    a='battery'
#    b=23.7

Attributes by position consuming id by position and remaining attributes by *args:

@spl.map(style='position')
def f(key, *tuple):
    pass
# f('battery', 23.7, True)
#    key='battery'
#    tuple=(23.7, True)

In all cases the SPL tuple must be able to provide all parameters required by the function. If the SPL schema is insufficient then an error will result, typically an SPL compile-time error.

The SPL schema can provide a subset of the formal parameters if the remaining parameters are optional (have a default value).

Attributes by name consuming a subset of attributes with an optional parameter not matched by the schema:

@spl.map()
def f(id, temp, pressure=None):
    pass
# f(id='battery', temp=23.7)
#    id='battery'
#    temp=23.7
#    pressure=None
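
The effect of optional parameters and of an insufficient schema can be reproduced in plain Python with inspect.Signature.bind. This is a sketch of the Python-level behaviour only; it does not show how or when the toolkit itself reports the error.

```python
import inspect

def f(id, temp, pressure=None):
    return id, temp, pressure

sig = inspect.signature(f)
spl_tuple = {'id': 'battery', 'temp': 23.7, 'increase': True}

# Keep only the attributes that match a formal parameter.
matched = {k: v for k, v in spl_tuple.items() if k in sig.parameters}
bound = sig.bind(**matched)
bound.apply_defaults()  # pressure falls back to its default
arguments = dict(bound.arguments)
print(arguments)

# A schema lacking the required temp attribute cannot satisfy the signature.
try:
    sig.bind(id='battery')
    insufficient = False
except TypeError:
    insufficient = True
print(insufficient)
```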