streamsx.topology.schema module

Schemas for streams.

On a structured stream a tuple is a sequence of attributes, and an attribute is a named value of a specific type.

The supported types are defined by IBM Streams Streams Processing Language (SPL).

class streamsx.topology.schema.CommonSchema

Bases: enum.Enum

Common stream schemas for interoperability within Streams applications.

Streams applications can publish streams that are subscribed to by other applications. Use of common schemas allows stream connections regardless of the application implementation language.

Python applications publish streams using publish() and subscribe using subscribe().

  • Python - Stream contains Python objects.
  • Json - Stream contains JSON objects.
  • String - Stream contains strings.
  • Binary - Stream contains binary tuples.
  • XML - Stream contains XML documents.
Binary = <streamsx.topology.schema.StreamSchema object>

Stream where each tuple is a binary object (sequence of bytes).

Warning

Binary is not yet supported for Python applications.

Json = <streamsx.topology.schema.StreamSchema object>

Stream where each tuple is logically a JSON object.

Json can be used as a natural interchange format between Streams applications implemented in different programming languages. All languages supported by Streams support publishing and subscribing to JSON streams.

A Python callable receives each tuple as a dict, as though it were created from json.loads(json_formatted_str), where json_formatted_str is the JSON formatted representation of the tuple.

Python objects that are to be converted to JSON objects must be supported by JSONEncoder. If the object is not a dict then it will be converted to a JSON object with a single key payload containing the value.
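The conversion rules above can be illustrated with a minimal sketch in plain Python. The helper names to_json_tuple and from_json_tuple are hypothetical and are not part of streamsx; the sketch only mirrors the documented behavior using the standard json module.

```python
import json

def to_json_tuple(obj):
    """Hypothetical helper: mimic how an object becomes a tuple on a Json stream."""
    # A dict is used as the JSON object directly; any other value is
    # wrapped in a JSON object under the single key "payload".
    json_obj = obj if isinstance(obj, dict) else {"payload": obj}
    # json.dumps relies on JSONEncoder, so unsupported types raise TypeError.
    return json.dumps(json_obj)

def from_json_tuple(json_formatted_str):
    """Hypothetical helper: what a Python callable would receive."""
    return json.loads(json_formatted_str)

reading = from_json_tuple(to_json_tuple({"id": "TempSensor", "value": 20.3}))
wrapped = from_json_tuple(to_json_tuple([1, 2, 3]))
```

In this sketch reading is the original dict, while wrapped is a dict whose only key is payload.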

Python = <streamsx.topology.schema.StreamSchema object>

Stream where each tuple is a Python object. Each object must be picklable to allow execution in a distributed environment where streams can connect processes running on the same or different resources.
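One way to check the picklable requirement before submitting objects is to attempt to serialize them with the standard pickle module. This is a minimal sketch; is_picklable is a hypothetical helper and not part of streamsx.

```python
import pickle

def is_picklable(obj):
    """Hypothetical helper: return True if obj can be serialized by pickle."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        # Different object kinds fail with different exception types.
        return False

ok = is_picklable({"id": "TempSensor", "value": 20.3})
# A lambda cannot be pickled by the standard pickle module.
bad = is_picklable(lambda x: x + 1)
```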

Python streams can only be used by Python applications.

String = <streamsx.topology.schema.StreamSchema object>

Stream where each tuple is a string.

String can be used as a natural interchange format between Streams applications implemented in different programming languages. All languages supported by Streams support publishing and subscribing to string streams.

A Python callable receives each tuple as a str object.

Python objects are converted to strings using str(obj).

XML = <streamsx.topology.schema.StreamSchema object>

Stream where each tuple is an XML document.

Warning

XML is not yet supported for Python applications.

class streamsx.topology.schema.StreamSchema(schema)

Bases: object

Defines a schema for a structured stream.

On a structured stream a tuple is a sequence of attributes, and an attribute is a named value of a specific type.

The supported types are defined by IBM Streams Streams Processing Language and include such types as int8, int16, rstring and list<float32>.

A schema is defined with the syntax tuple<type name [,...]>, for example:

tuple<rstring id, timestamp ts, float64 value>

represents a schema with three attributes suitable for a sensor reading.

The complete list of supported types is:

Type                      Description                           Python representation
boolean                   True or False                         bool
int8                      8-bit signed integer                  int
int16                     16-bit signed integer                 int
int32                     32-bit signed integer                 int
int64                     64-bit signed integer                 int
uint8                     8-bit unsigned integer                int
uint16                    16-bit unsigned integer               int
uint32                    32-bit unsigned integer               int
uint64                    64-bit unsigned integer               int
float32                   32-bit binary floating point          float
float64                   64-bit binary floating point          float
decimal32                 32-bit decimal floating point         decimal.Decimal
decimal64                 64-bit decimal floating point         decimal.Decimal
decimal128                128-bit decimal floating point        decimal.Decimal
complex32                 complex using float32 values          complex
complex64                 complex using float64 values          complex
timestamp                 Timestamp with nanosecond resolution  Timestamp
rstring                   Character string (UTF-8 encoded)      str (unicode 2.7)
rstring[N]                Bounded string (UTF-8 encoded)        str (unicode 2.7)
ustring                   Character string (UTF-16 encoded)     str (unicode 2.7)
blob                      Sequence of bytes                     memoryview
list<T>                   List with elements of type T          list
list<T>[N]                Bounded list, limited to N elements   list
set<T>                    Set with elements of type T           set
set<T>[N]                 Bounded set, limited to N elements    set
map<K,V>                  Map with typed keys and values        dict
map<K,V>[N]               Bounded map, limited to N pairs       dict
enum{id [,...]}           Enumeration                           Not supported
xml                       XML value                             Not supported
tuple<type name [, ...]>  Nested tuple                          Not supported

When a type is not supported in Python it can only be used in a schema used for streams produced and consumed by invocation of SPL operators.

A StreamSchema can be created by passing a string of the form tuple<...> or by passing the name of an SPL type from an SPL toolkit, for example com.ibm.streamsx.transportation.vehicle::VehicleLocation.

Attribute names must start with an ASCII letter or underscore, followed by ASCII letters, digits, or underscores.
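The naming rule can be expressed as a regular expression. The following is a minimal sketch; is_valid_attribute_name is a hypothetical helper, not a streamsx function, and it assumes that the characters after the first one are optional.

```python
import re

# First character: ASCII letter or underscore.
# Remaining characters (assumed to be zero or more): ASCII letters, digits, underscores.
_ATTRIBUTE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_valid_attribute_name(name):
    """Hypothetical helper: check a name against the documented rule."""
    return _ATTRIBUTE_NAME.fullmatch(name) is not None

valid = [n for n in ("id", "_ts", "value2", "2value", "sensor-id") if is_valid_attribute_name(n)]
```

Here valid keeps only id, _ts and value2; the other two names start with a digit or contain a hyphen.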

When a tuple on a structured stream is passed into Python it is converted to a dict containing all attributes of the tuple. Each key is the attribute name as a str and the value is the attribute’s value.

When a Python object is submitted to a structured stream, for example as the return from the function invoked in a map() with the schema parameter set, it must be:

  • A Python dict. Attributes are set by name using the value in the dict for that name. If a value does not exist (the name does not exist as a key) or is set to None then the attribute has its default value: zero, false, empty list or string, etc.
  • A Python tuple. Attributes are set by position, with the first attribute being the value at index 0 in the Python tuple. If a value does not exist (the tuple has fewer values than the structured schema has attributes) or is set to None then the attribute has its default value: zero, false, empty list or string, etc.
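The two submission rules can be sketched in plain Python as follows. This is an illustration of the documented semantics only, not the library's implementation; ATTRIBUTES and to_attributes are hypothetical names, and the defaults shown stand in for a schema such as tuple<rstring id, int32 count, float64 value>.

```python
# Hypothetical stand-in for tuple<rstring id, int32 count, float64 value>:
# each entry is (attribute name, default value for its type).
ATTRIBUTES = [("id", ""), ("count", 0), ("value", 0.0)]

def to_attributes(obj):
    """Hypothetical helper: map a submitted dict or tuple onto the attributes."""
    if isinstance(obj, dict):
        # Set by name; a missing key yields None.
        values = [obj.get(name) for name, _ in ATTRIBUTES]
    elif isinstance(obj, tuple):
        # Set by position; pad a short tuple with None.
        values = list(obj) + [None] * (len(ATTRIBUTES) - len(obj))
    else:
        raise TypeError("expected a dict or a tuple")
    # A missing or None value falls back to the attribute's default.
    return {
        name: default if value is None else value
        for (name, default), value in zip(ATTRIBUTES, values)
    }

from_dict = to_attributes({"id": "TempSensor", "value": None})
from_tuple = to_attributes(("TempSensor", 3))
```

In both cases the attributes that were not supplied, or were supplied as None, end up with their default values.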
Parameters:schema (str) – Schema definition. Either a schema definition or the name of an SPL type.
as_dict()

Create a structured schema that will pass stream tuples into callables as dict instances. This allows a return to the default calling style for a structured schema.

If this instance represents a common schema then it will be returned without modification. Stream tuples with common schemas are always passed according to their definition.

Returns:Schema passing stream tuples as dict if allowed.
Return type:StreamSchema

New in version 1.8.

as_tuple()

Create a structured schema that will pass stream tuples into callables as tuple instances.

If this instance represents a common schema then it will be returned without modification. Stream tuples with common schemas are always passed according to their definition.

Returns:Schema passing stream tuples as tuple if allowed.
Return type:StreamSchema

New in version 1.8.

extend(schema)

Extend a structured schema by another.

For example extending tuple<rstring id, timestamp ts, float64 value> with tuple<float32 score> results in tuple<rstring id, timestamp ts, float64 value, float32 score>.

Parameters:schema (StreamSchema) – Schema to extend this schema by.
Returns:New schema that is an extension of this schema.
Return type:StreamSchema
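
The effect of extending one schema by another can be illustrated at the level of the schema definition strings. This is a sketch of the documented result only; extend_definition is a hypothetical helper that works on plain strings and does not show how extend() is implemented.

```python
def extend_definition(base, extra):
    """Hypothetical helper: append the attributes of extra to those of base."""
    prefix, suffix = "tuple<", ">"
    for definition in (base, extra):
        if not (definition.startswith(prefix) and definition.endswith(suffix)):
            raise ValueError("expected a definition of the form tuple<...>")
    # Strip the outer tuple<...> wrapper from each and join the attribute lists.
    base_attrs = base[len(prefix):-len(suffix)]
    extra_attrs = extra[len(prefix):-len(suffix)]
    return prefix + base_attrs + ", " + extra_attrs + suffix

extended = extend_definition(
    "tuple<rstring id, timestamp ts, float64 value>",
    "tuple<float32 score>",
)
```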
schema()

Private method. May be removed at any time.

spl_json()

Private method. May be removed at any time.

style

Style in which stream tuples will be passed into a callable.

For the common schemas the style is fixed as:
  • CommonSchema.Python - object - Stream tuples are arbitrary objects.
  • CommonSchema.String - str - Stream tuples are strings.
  • CommonSchema.Json - dict - Stream tuples are a dict that represents the JSON object.
For a structured schema the supported styles are:
  • dict - Stream tuples are passed as a dict with the key being the attribute name and the value the attribute value. This is the default. E.g. with a schema of tuple<rstring id, float32 value> a value is passed as {'id':'TempSensor', 'value':20.3}.

  • tuple - Stream tuples are passed as a tuple with the values being the attribute values in order. A schema is set to pass stream tuples as tuples using as_tuple(). E.g. with a schema of tuple<rstring id, float32 value> a value is passed as ('TempSensor', 20.3).

Structured schemas may be changed to pass the stream tuple as a tuple using

Returns:Class of tuples that will be passed into callables.
Return type:type

New in version 1.8.

streamsx.topology.schema.is_common(schema)

Is schema a common schema.

Parameters:schema – Schema to test.

Returns:True if schema is a common schema, otherwise False.
Return type:bool