Python Topology API

Develop IBM Streams applications with Python.

Overview

A functional API for developing streaming applications for IBM Streams using Python. Streams are defined, transformed, and terminated (sinked) using Python callables. The return value of a callable determines the content of the stream. Tuples on a stream are Python objects or structured SPL tuples; an object stream may contain objects of different types.

Prerequisites

  • IBM Streams Version 4.2 (or later).
  • Python 3.5, 3.6, or 3.7.
    • Either install Anaconda or Miniconda; these are the easiest options. When running distributed, the Streams instance application environment variable PYTHONHOME must be set to the install location.
    • Or install CPython. This is more involved and usually requires building Python from source code.
    • Python 3.5 is required when using the Streaming Analytics service on IBM Cloud.

HelloWorld Application

Example code that builds and then submits a Hello World topology.

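from streamsx.topology.topology import Topology
import streamsx.topology.context

# Declare the topology (the application graph).
topo = Topology("HelloWorld")
# Create a stream from an iterable of two strings and a number.
hw = topo.source(['hello', 'world', 2018])
# Print each tuple on the stream.
hw.for_each(print)
# Build the topology and run it as a standalone application.
streamsx.topology.context.submit("STANDALONE", topo.graph)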

The source function is passed either an iterable or a callable that returns an iterable, in this case a list of two strings and a number.

The SPL runtime creates an iterator from the source's iterable. Each value returned from the iterator is then sent on the stream.

The for_each function is passed a callable that will be called for each tuple on the stream, in this case the builtin print function (Python 3).
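
The flow described above can be imitated in plain Python. The sketch below does not use the streamsx API; run_source is a hypothetical stand-in for what the runtime does conceptually, under the assumption that a source is either an iterable or a callable returning one, and that the terminating callable is invoked once per tuple.

```python
def run_source(func_or_iterable, on_tuple):
    """Hypothetical helper (not part of streamsx): mimics source -> for_each."""
    # A callable is invoked to obtain the iterable; an iterable is used directly.
    iterable = func_or_iterable() if callable(func_or_iterable) else func_or_iterable
    # An iterator is created from the iterable...
    iterator = iter(iterable)
    # ...and each value it returns is handed to the terminating callable.
    for tup in iterator:
        on_tuple(tup)


def countdown():
    """A callable that returns an iterable."""
    return range(3, 0, -1)


received = []
run_source(['hello', 'world', 2018], received.append)  # iterable passed directly
run_source(countdown, received.append)                 # callable returning an iterable
```

After both calls, received holds the three list items followed by 3, 2, 1, in the order the iterators produced them.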

User-supplied callables to operations

Operations such as source and sink accept a callable as input. The callable must be one of the following:
  • the name of a built-in function
  • the name of a function defined at the top level of a module
  • an instance of a callable class defined at the top level of a module that implements the function __call__ and is picklable. Using a callable class allows state information such as user-defined parameters to be stored during class initialization and utilized when the instance is called.
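
A callable class of the kind described in the last item might look like the following minimal sketch. AddSuffix and its suffix parameter are hypothetical names chosen for illustration; the only assumptions are the ones stated above, namely a class defined at the top level of a module that implements __call__ and can be pickled.

```python
import pickle


class AddSuffix:
    """Hypothetical callable class, defined at the top level of its module."""

    def __init__(self, suffix):
        # User-defined parameter stored as state when the instance is created.
        self.suffix = suffix

    def __call__(self, tup):
        # Invoked once per tuple; uses the state stored during initialization.
        return str(tup) + self.suffix


add_bang = AddSuffix("!")
result = [add_bang(t) for t in ['hello', 'world', 2018]]

if __name__ == "__main__":
    # Round trip through pickle to check the instance is picklable.
    restored = pickle.loads(pickle.dumps(add_bang))
    print(restored('hello'))
```

Because the instance holds only a string attribute and its class can be found by name in its module, pickle can serialize it. An instance holding an open file or a lambda would not satisfy the picklable requirement.
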
The modules containing the callables, along with third-party libraries required by the modules, are copied into the Streams Application Bundle (sab file).
  • Dependent libraries can be individual modules or packages.
  • Dependent libraries can be installed in site packages, or not installed and simply reside in a directory in the Python search path.
  • Dependent native libraries outside of the package directory are not copied into the bundle.
Limitations on callable inputs to operations:
  • Importing modules that contain user-defined functions with importlib is unsupported. The PYTHONPATH or sys.path must contain the directory where modules to import are located.
  • Importing modules that contain user-defined functions from zip/egg/wheel files is unsupported.
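
As an illustration of the first limitation, the sketch below makes a module importable by placing its directory on sys.path and using an ordinary import statement rather than importlib. The module name demo_stream_funcs and the function shout are hypothetical, and the temporary directory only stands in for wherever the modules actually live.

```python
import os
import sys
import tempfile

# Stand-in for a directory that holds modules with user-defined functions.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_stream_funcs.py"), "w") as f:
    f.write("def shout(tup):\n    return str(tup).upper()\n")

# The directory must be on the Python search path before the import runs.
sys.path.insert(0, module_dir)

# Ordinary import statement; the function is defined at the module's top level.
import demo_stream_funcs

shouted = demo_stream_funcs.shout("hello")
```

Setting PYTHONPATH to the same directory before starting Python would have the same effect as the sys.path.insert call.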

Microservices

Python applications can publish streams and subscribe to streams from other applications running in the same IBM Streams instance. This allows the interchange of streaming data between applications implemented in any language supported by IBM Streams.

See the namespace com.ibm.streamsx.topology.topic for more details.
