See: Description
Class | Description |
---|---|
ConsistentRegionConfig |
Immutable consistent region configuration.
|
Enum | Description |
---|---|
ConsistentRegionConfig.Trigger |
Defines how the drain-checkpoint cycle of a consistent region is triggered.
|
A consistent region is a subgraph where the states region (the operators and transformations within it) become consistent by processing all the tuples within defined points on a stream. This enables tuples within the subgraph to be processed at least once. The consistent region is periodically drained of its current tuples. All tuples in the consistent region are processed through to the end of the subgraph. In-memory state of operators are automatically serialized and stored on checkpoint for each of the operators in the region.
If any element in a consistent region fails at run time, IBM Streams detects the failure and triggers the restart of the element and the reset of the region. In-memory state of the region is automatically reloaded to a consistent point.
The capability to drain the subgraph, which is coupled with start operators that can replay their output streams, enables a consistent region to achieve at-least-once processing.
A stream processing application can be defined with zero, one, or more
consistent regions. You can define the start of a consistent region with
the setConsistent()
.}.
IBM Streams then determines
the scope of the consistent region automatically, but you can reduce the
scope of the region with TStream.autonomous()
.
When a subgraph is a consistent region, IBM Streams enables the operators in that region to drain and reset. When a region is draining, it establishes logical divisions in the output streams of each operator in the region. A drain is successful when all operators in the region establish a logical division in their output streams, and when all tuples before the logical division are processed in their input streams. If a drain is successful, it means that all operators in the region consumed all input streams up until the established logical division. While the region is draining or resetting, operators in the region that completed their draining or resetting cannot submit new tuples. This behavior means that the tuple flow within the subgraph briefly stops while the region is draining or resetting.
TStream.setConsistent(ConsistentRegionConfig)
.
The operator that is the source of the TStream
must support logic that,
after a failure in the region, can replay tuples since the last checkpoint.
Typically the stream is a source stream that produces tuples from an external system
like Apache Kakfa.
The logic that produces a replayable stream must be implemented
using IBM Streams Java or C++ primitive operator apis.
A functional source
(e.g. Topology.source(com.ibm.streamsx.topology.function.Supplier)
)
cannot be used as the start of a consistent region,
thus the stream must be from an invocation of an SPL operator through
SPL
,
JavaPrimitive
.
Such an invocation may be wrapped by a Java method that provides
a simplified version of the invocation for application developers.
After any failure in the region, all operators are reset to a previous consistent point and then tuple processing resumes with the source operator replaying tuples since the last consistent point.
At least once processing is only seen when the operator modifies external state that cannot
be undone during a reset. For example sending an SMS text cannot be undone, so a IBM Streams
application using a consistent region for monitoring and sending text alerts could result
in more than once text message indicating an issue.
With coordination between the operator and the external system exactly once processing
is possible, depending on the capabilities of the external system, for example exactly
once can be achieved with database and file systems.
autonomous checkpointing
.
t = t.filter(s -> !s.empty())
on a TStream<String>
.
During a drain-checkpoint cycle no processing occurs related to the stateless function.
TStream.autonomous()
.
Any processing against the stream returned by autonomous()
is outside of the consistent region.