Publish-subscribe Overview (com.ibm.streamsx.topology 2.1.0, com.ibm.streamsx.topology.topic)
Applications can publish streams to a topic name which can then be subscribed to by other applications (or even the same application). Publish-subscribe works across applications written in SPL and those written using application APIs provided by this toolkit.
A subscriber matches a publisher if the subscriber's topic filter matches the publisher's topic name and the subscriber's stream type is an exact match for the publisher's stream type. It is recommended that a single stream type is used for a topic name.
A topic is an rstring value (encoded with UTF-8), based upon the MQTT topic style.
Topic names for publishing a stream:
Without a wildcard character, a topic filter is an exact match for a topic name, so the filter a/b/c only matches the topic name a/b/c.
Note: a publish-subscribe match requires the stream type to match as well as the topic filter matching the topic name.
Note: For correct wildcard matching, both publisher and subscriber must have been compiled with version 1.3.0 or later.
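The matching rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the toolkit's implementation: it assumes the standard MQTT wildcards (`+` for a single level, `#` for the remaining levels, including the parent level), and it models a stream type as a plain string compared for equality. The function names `topic_matches` and `subscription_matches` and the example schema strings are hypothetical.

```python
def topic_matches(topic_filter: str, topic_name: str) -> bool:
    """Return True if an MQTT-style topic filter matches a topic name.

    Assumes MQTT wildcard semantics: '+' matches exactly one level,
    '#' matches the remaining levels and is only valid as the last level.
    """
    filter_levels = topic_filter.split("/")
    name_levels = topic_name.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: must be the final level of the filter.
            return i == len(filter_levels) - 1
        if i >= len(name_levels):
            # The filter has more levels than the topic name.
            return False
        if level != "+" and level != name_levels[i]:
            return False
    # Without a trailing '#', both must have the same number of levels.
    return len(filter_levels) == len(name_levels)


def subscription_matches(topic_filter: str, subscriber_type: str,
                         topic_name: str, publisher_type: str) -> bool:
    """A match needs both a topic match and an exact stream type match.

    Stream types are modelled as strings here (an assumption for the sketch).
    """
    return (topic_matches(topic_filter, topic_name)
            and subscriber_type == publisher_type)


# Hypothetical stream type used only for illustration.
CDR_TYPE = "tuple<rstring caller, rstring callee>"

print(topic_matches("a/b/c", "a/b/c"))
print(topic_matches("cdr/voice/+", "cdr/voice/dropped"))
print(subscription_matches("cdr/#", CDR_TYPE, "cdr/sms", "tuple<rstring text>"))
```

In this sketch the last call returns False even though the topic filter matches, because the stream types differ.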
Publish-subscribe is a many-to-many relationship: any number of publishers can publish to the same topic and stream type, and there can be many subscribers to a topic.
For example, a telco ingest application may process Call Detail Records from network switches and publish the processed records on multiple topics (cdr/voice/normal, cdr/voice/dropped, cdr/sms, etc.) by publishing each processed stream with its own topic. A dropped call analytic application would then subscribe to the cdr/voice/dropped topic.
Publish-subscribe is dynamic because it uses IBM Streams dynamic connections. An application can be submitted that subscribes to topics published by other applications that are already running. Once the new application has initialized, it starts consuming tuples from the streams published by the existing applications. Likewise, any stream the new application publishes will be subscribed to by existing applications where the topic and stream type match.
An application only receives tuples that are published while it is connected, so tuples published during a connection failure are lost.
An SPL application uses Publish to publish a stream to a topic, and Subscribe to subscribe to a topic.
A Java application uses TStream.publish(topic) to publish streams.
A Python application uses Stream.publish(topic) to publish streams and Topology.subscribe(topic) to subscribe to published streams.
Python supports publishing and subscribing to streams containing Python, SPL (structured) schemas, JSON or String objects.
Published streams can be subscribed to by IBM Streams applications written in different languages, by ensuring common stream types (schemas).
The topic publish-subscribe model was changed in release 1.1.6 to have intuitive and defined behavior when the publisher or subscriber is in a parallel region (with width > 1).
The intuitive case is that parallel regions are used to partition tuple processing across the channels, so that each channel processes a subset of the tuples, and each tuple is processed by a single channel.
Any subscriber connecting to a publisher must then process the complete output from the publisher and process each published tuple exactly once. For example, if a publisher is in a parallel region with width three, then there are three published channels. A connecting subscriber with width two will have two subscribing channels. Each published channel must connect to a single subscribing channel, so in this case one subscriber channel will connect to two published channels and the other to the remaining published channel.
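Publisher | Subscriber | Behavior |
---|---|---|
not-parallel | not-parallel | Single connection between the publisher and subscriber containing all the tuples. |
parallel(1) | parallel(1) | Single connection between the publisher and subscriber containing all the tuples. |
not-parallel | parallel(N) N > 1 | One and only one of the N subscribers connects to the single publisher and thus the subscriber region correctly processes the tuples once. Note that which subscriber channel processes the single published channel is not defined; it could be any of 0,1,...,N-1. |
parallel(1) | parallel(N) N > 1 | One and only one of the N subscribers connects to the single publisher and thus the subscriber region correctly processes the tuples once. Note that which subscriber channel processes the single published channel is not defined; it could be any of 0,1,...,N-1. |
parallel(M) M > 1 | not-parallel | Single subscriber connects to each of the M publishers so that the subscriber region will process all the published tuples once. |
parallel(M) M > 1 | parallel(1) | Single subscriber connects to each of the M publishers so that the subscriber region will process all the published tuples once. |
parallel(M) M > 1 | parallel(N) N > 1 | Each published channel is connected to a single subscribe channel with channel number (publish channel % N). Thus the subscriber region processes the tuples once. |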
Note: It's important to remember that there may be multiple applications publishing and/or subscribing to the same topic, and thus publish-subscribe must work in all combinations. For example, there may be publishers to the same topic that are non-parallel, parallel(5) and parallel(3), and subscribers that are non-parallel and parallel(7). This is especially true in a microservices style architecture where analytic applications may come and go and there may not be any prior agreement that publishers and subscribers will have matching channel numbers.
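The connection rules in the table above can be expressed as a small Python function. This is a sketch of the documented behavior, not code from the toolkit. The function name `subscriber_channel` is hypothetical, a non-parallel publisher or subscriber is modelled as width 1, and `None` stands for the case where the table says the choice of subscriber channel is not defined.

```python
from typing import Optional


def subscriber_channel(pub_channel: int, pub_width: int,
                       sub_width: int) -> Optional[int]:
    """Return the subscriber channel that receives one published channel.

    Widths of 1 model both not-parallel and parallel(1).
    Returns None when the table leaves the subscriber channel undefined.
    """
    if sub_width <= 1:
        # A single subscriber connects to every published channel.
        return 0
    if pub_width <= 1:
        # Exactly one of the N subscriber channels connects,
        # but which one is not defined.
        return None
    # parallel(M) publisher and parallel(N) subscriber: channel % N.
    return pub_channel % sub_width


# Publisher of width three, subscriber of width two.
mapping = [subscriber_channel(c, 3, 2) for c in range(3)]
print(mapping)  # [0, 1, 0]
```

With a publisher of width three and a subscriber of width two, subscriber channel 0 receives published channels 0 and 2 and subscriber channel 1 receives published channel 1, so every tuple is processed by exactly one subscriber channel.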