public interface SPLStream extends TStream<com.ibm.streams.operator.Tuple>, SPLInput
SPLStream
is a declaration of a continuous sequence of tuples with
a defined SPL schema. A SPLStream
is a TStream<Tuple> thus may be
handled using any functional logic where each tuple will be an instance of
com.ibm.streams.operator.Tuple
.TStream.Routing
Modifier and Type | Method and Description |
---|---|
SPLStream |
autonomous()
Return a stream matching this stream whose subsequent
processing will execute in an autonomous region.
|
<T> TStream<T> |
convert(Function<com.ibm.streams.operator.Tuple,T> convertor)
Convert SPL tuples into Java objects.
|
SPLStream |
endLowLatency()
Return a TStream that is no longer guaranteed to run in the same process
as the calling stream.
|
SPLStream |
endParallel()
Ends a parallel region by merging the channels into a single stream.
|
SPLStream |
filter(Predicate<com.ibm.streams.operator.Tuple> filter)
Declare a new stream that filters tuples from this stream.
|
com.ibm.streams.operator.StreamSchema |
getSchema()
SPL schema of this stream.
|
SPLStream |
isolate()
Return a stream whose immediate subsequent processing will execute
in a separate operating system process from this stream's processing.
|
SPLStream |
lowLatency()
Return a stream that is guaranteed to run in the same process as the
calling TStream.
|
SPLStream |
modify(UnaryOperator<com.ibm.streams.operator.Tuple> modifier)
Declare a new stream that modifies each tuple from this stream into one
(or zero) tuple of the same type
T . |
SPLStream |
parallel(int width)
Parallelizes the stream into a a fixed
number of parallel channels using round-robin distribution.
|
SPLStream |
parallel(Supplier<java.lang.Integer> width,
Function<com.ibm.streams.operator.Tuple,?> keyFunction)
Parallelizes the stream into a number of parallel channels
using key based distribution.
|
SPLStream |
parallel(Supplier<java.lang.Integer> width,
TStream.Routing routing)
Parallelizes the stream into
width parallel channels. |
SPLStream |
sample(double fraction)
Return a stream that is a random sample of this stream.
|
SPLStream |
setConsistent(ConsistentRegionConfig config)
Set the source operator for this stream to be the start of a
consistent region to support at least once and exactly once
processing.
|
SPLStream |
throttle(long delay,
java.util.concurrent.TimeUnit unit)
Throttle a stream by ensuring any tuple is submitted with least
delay from the previous tuple. |
TStream<com.ibm.json.java.JSONObject> |
toJSON()
Transform SPL tuples into JSON.
|
TStream<java.lang.String> |
toStringStream()
Create a TStream<Tuple> from this stream.
|
TStream<java.lang.String> |
toTupleString()
Create a stream that converts each input tuple on this stream to its SPL
character representation representation.
|
asType, connectTo, getTupleClass, getTupleType, join, join, joinLast, joinLast, last, last, last, multiTransform, output, parallel, print, publish, publish, sink, split, transform, union, union, window
addResourceTags, colocate, getResourceTags, isPlaceable, operator
builder, graph, topology
com.ibm.streams.operator.StreamSchema getSchema()
TStream<com.ibm.json.java.JSONObject> toJSON()
getSchema()
returns
JSONSchemas.JSON
then each tuple is taken as a serialized JSON and deserialized.
If the serialized JSON is an array,
then a JSON object is created, with
a single attribute payload
containing the deserialized
value.
com.ibm.streams.operator.encoding.JSONEncoding
.
JSONObject
.<T> TStream<T> convert(Function<com.ibm.streams.operator.Tuple,T> convertor)
transform(converter)
.convertor
- Function to convertT
transformed from this
stream's SPL tuples.TStream.transform(Function)
TStream<java.lang.String> toTupleString()
TStream<java.lang.String> toStringStream()
SPLStream
must
have a schema of SPLSchemas.STRING
.java.lang.IllegalStateException
- Stream does not have a value schema for TStream<Tuple>.SPLStreams.stringToSPLStream(TStream)
SPLStream endLowLatency()
---source---.lowLatency()---filter---transform---.endLowLatency()---filter2
It is guaranteed that the filter and transform operations will run in
the same process, but it is not guaranteed that the transform and
filter2 operations will run in the same process.
endLowLatency
in interface TStream<com.ibm.streams.operator.Tuple>
SPLStream filter(Predicate<com.ibm.streams.operator.Tuple> filter)
t
on this stream will appear in the returned stream if
filter.test(t)
returns true
. If
filter.test(t)
returns false
then then t
will not
appear in the returned stream.
Examples of filtering out all empty strings from stream s
of type
String
// Java 8 - Using lambda expression
TStream<String> s = ...
TStream<String> filtered = s.filter(t -> !t.isEmpty());
// Java 7 - Using anonymous class
TStream<String> s = ...
TStream<String> filtered = s.filter(new Predicate<String>() {
@Override
public boolean test(String t) {
return !t.isEmpty();
}} );
filter
in interface TStream<com.ibm.streams.operator.Tuple>
filter
- Filtering logic to be executed against each tuple.TStream.split(int, ToIntFunction)
SPLStream lowLatency()
TStream.endLowLatency()
is called.
---source---.lowLatency()---filter---transform---.endLowLatency()---
It is guaranteed that the filter and transform operations will run in
the same process.
lowLatency
in interface TStream<com.ibm.streams.operator.Tuple>
SPLStream isolate()
-->transform1-->.isolate()-->transform2-->transform3-->.isolate()-->sink
It is guaranteed that:
transform1
and transform2
will execute in separate processes.transform3
and sink
will execute in separate processes. t1, t2, t3
) are applied to a stream returned from isolate()
then it is guaranteed that each of them will execute in a separate operating
system process to this stream, but no guarantees are made about where t1, t2, t3
are placed in relationship to each other.
SPLStream modify(UnaryOperator<com.ibm.streams.operator.Tuple> modifier)
T
. For each tuple t
on this stream, the returned stream will contain a tuple that is the
result of modifier.apply(t)
when the return is not null
.
The function may return the same reference as its input t
or
a different object of the same type.
If modifier.apply(t)
returns null
then no tuple
is submitted to the returned stream for t
.
Example of modifying a stream String
values by adding the suffix 'extra
'.
TStream<String> strings = ...
TStream<String> modifiedStrings = strings.modify(new UnaryOperator() {
@Override
public String apply(String tuple) {
return tuple.concat("extra");
}});
This method is equivalent to
transform(Function<T,T> modifier
).
modify
in interface TStream<com.ibm.streams.operator.Tuple>
modifier
- Modifier logic to be executed against each tuple.T
modified from this
stream's tuples.SPLStream sample(double fraction)
SPLStream throttle(long delay, java.util.concurrent.TimeUnit unit)
delay
from the previous tuple.SPLStream parallel(int width)
round-robin fashion
.
width
channels until TStream.endParallel()
is called or
the stream terminates.
TStream.parallel(Supplier, Routing)
for more information.SPLStream parallel(Supplier<java.lang.Integer> width, TStream.Routing routing)
width
parallel channels. Tuples are routed
to the parallel channels based on the TStream.Routing
parameter.
TStream.Routing.ROUND_ROBIN
is specified the tuples are routed to parallel channels such that an
even distribution is maintained.
TStream.Routing.HASH_PARTITIONED
is specified then the
hashCode()
of the tuple is used to route the tuple to a corresponding
channel, so that all tuples with the same hash code are sent to the same channel.
TStream.Routing.KEY_PARTITIONED
is specified each tuple is
is taken to be its own key and is
routed so that all tuples with the same key are sent to the same channel.
This is equivalent to calling TStream.parallel(Supplier, Function)
with
an identity function.
TStream<String> myStream = ...;
TStream<String> parallel_start = myStream.parallel(of(3), TStream.Routing.ROUND_ROBIN);
TStream<String> in_parallel = parallel_start.filter(...).transform(...);
TStream<String> joined_parallel_streams = in_parallel.endParallel();
a visual representation a visual representation for parallel() would be
as follows:
|----filter----transform----|
| |
---myStream------|----filter----transform----|--joined_parallel_streams
| |
|----filter----transform----|
Each parallel channel can be thought of as being assigned its own thread.
As such, each parallelized stream function (filter and transform, in this
case) are separate instances and operate independently from one another.
parallel(...)
will only parallelize the stream operations performed after
the call to parallel(...)
and before the call to endParallel()
.
In the above example, the parallel(3)
was invoked on myStream
, so
its subsequent functions, filter(...)
and transform(...)
, were parallelized.
TStream<String> myStream = ...;
TStream<String> myParallelStream = myStream.parallel(6);
myParallelStream.print();
myParallelStream
will be printed to output in parallel. In other
words, a parallel sink is created by calling TStream.parallel(int)
and
creating a sink operation (such as TStream.sink(Consumer)
).
It is not necessary to invoke TStream.endParallel()
on parallel sinks.
parallel(...)
should never be made immediately after another call to parallel(...)
without
having an endParallel()
in between. parallel()
should not be invoked immediately after another call to
parallel()
. The following is invalid:
myStream.parallel(2).parallel(2);
Every call to endParallel()
must have a call to parallel(...)
preceding it. The
following is invalid:
TStream myStream = topology.strings("a","b","c");
myStream.endParallel();
A parallelized stream cannot be joined with another window, and a
parallelized window cannot be joined with a stream. The following is
invalid:
TWindow numWindow = topology.strings("1","2","3").last(3);
TStream stringStream = topology.strings("a","b","c");
TStream paraStringStream = myStream.parallel(5);
TStream filtered = myStream.filter(...);
filtered.join(numWindow, ...);
parallel
in interface TStream<com.ibm.streams.operator.Tuple>
width
- The degree of parallelism. see TStream.parallel(int width)
for more details.routing
- Defines how tuples will be routed channels.Topology.createSubmissionParameter(String, Class)
,
TStream.split(int, ToIntFunction)
SPLStream parallel(Supplier<java.lang.Integer> width, Function<com.ibm.streams.operator.Tuple,?> keyFunction)
t
keyer.apply(t)
is called
and then the tuples are routed
so that all tuples with the
same key are sent to the same channel
.parallel
in interface TStream<com.ibm.streams.operator.Tuple>
width
- The degree of parallelism.keyFunction
- Function to obtain the key from each tuple.width
channels
at the beginning of the parallel region.TStream.Routing.KEY_PARTITIONED
,
TStream.parallel(Supplier, Routing)
SPLStream endParallel()
endParallel
in interface TStream<com.ibm.streams.operator.Tuple>
TStream.parallel(int)
,
TStream.parallel(Supplier, Routing)
,
TStream.parallel(Supplier, Function)
SPLStream autonomous()
embedded
and standalone
contexts and will be ignored.autonomous
in interface TStream<com.ibm.streams.operator.Tuple>
SPLStream setConsistent(ConsistentRegionConfig config)
Consistent regions are only supported in distributed contexts.
setConsistent
in interface TStream<com.ibm.streams.operator.Tuple>
ConsistentRegionConfig