The Ngrams operator implements a rolling hash technique and uses ngramhashing. The operator has two custom output functions:
- CountNgrams returns a list of n-gram counts.
- GetNgrams returns a map of n-grams and their counts.
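To make the rolling hash idea concrete, here is a minimal C++ sketch. It is not the operator's code: the source does not say which hash function ngramhashing uses, so this assumes a generic polynomial (Rabin-Karp style) rolling hash. The function name `rollingNgramHashes` and the base value 257 are hypothetical choices made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not part of the Ngrams operator: returns one hash per
// n-gram of `data`, using a polynomial rolling hash. Unsigned arithmetic wraps
// modulo 2^64, which is well defined in C++.
std::vector<std::uint64_t> rollingNgramHashes(const std::string& data, std::size_t n) {
    std::vector<std::uint64_t> hashes;
    if (n == 0 || data.size() < n) return hashes;  // no complete n-gram

    const std::uint64_t base = 257;  // assumed base for this sketch

    // base^(n-1): the weight of the byte that leaves the window on each slide.
    std::uint64_t highWeight = 1;
    for (std::size_t i = 1; i < n; ++i) highWeight *= base;

    // Hash of the first window, computed directly.
    std::uint64_t h = 0;
    for (std::size_t i = 0; i < n; ++i)
        h = h * base + static_cast<unsigned char>(data[i]);
    hashes.push_back(h);

    // Slide the window: remove the leading byte, then append the next byte.
    for (std::size_t i = n; i < data.size(); ++i) {
        h -= static_cast<unsigned char>(data[i - n]) * highWeight;
        h = h * base + static_cast<unsigned char>(data[i]);
        hashes.push_back(h);
    }
    return hashes;
}
```

Each slide costs a constant number of operations regardless of n, which is the usual reason to prefer a rolling hash over rehashing every window. Identical n-grams always produce identical hashes, so for the input "abab" with n = 2 the first and third hashes are equal.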
Summary
- Ports
- This operator has 1 input port and 1 output port.
- Windowing
- This operator does not accept any windowing configurations.
- Parameters
- This operator does not support parameters.
- Metrics
- This operator does not report any metrics.
Properties
- Implementation
- C++
- Threading
- Always - The operator always provides a single-threaded execution context.
Input Ports
- Ports (0)
- The Ngrams operator is configurable with a single input port. The input port is non-mutating and its punctuation mode is Oblivious.
- Properties
Output Ports
- Assignments
- This operator allows any SPL expression of the correct type to be assigned to output attributes. Attributes not assigned in the output clause will be automatically assigned from the attributes of the input ports that have the same name and type. If there is no such input attribute, an error is reported at compile-time.
- Output Functions
- NgramsFS
- <any T> T AsIs(T v)
- list<uint32> CountNgrams(rstring data, uint32 n)
- Counts each n-gram in the string and stores its count in the result list at the index where the n-gram occurs in the string.
- map<rstring,uint32> GetNgrams(rstring data, uint32 n)
- Counts each n-gram in the string and stores each n-gram and its count in the result map as a key/value pair.
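The following C++ sketch illustrates one reading of the two output functions described above. It is an assumption-laden illustration, not the operator's implementation: it counts n-grams by plain substring lookup instead of hashing, treats the string as raw bytes, and assumes that CountNgrams yields one entry per n-gram start position. The names `getNgrams` and `countNgrams` are hypothetical stand-ins for the SPL functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical analogue of GetNgrams: maps each n-gram to its occurrence count.
std::map<std::string, std::uint32_t> getNgrams(const std::string& data, std::uint32_t n) {
    std::map<std::string, std::uint32_t> counts;
    if (n == 0 || data.size() < n) return counts;  // no complete n-gram
    for (std::size_t i = 0; i + n <= data.size(); ++i)
        ++counts[data.substr(i, n)];
    return counts;
}

// Hypothetical analogue of CountNgrams, assuming result[i] holds the total
// count of the n-gram that starts at byte position i of the input.
std::vector<std::uint32_t> countNgrams(const std::string& data, std::uint32_t n) {
    const std::map<std::string, std::uint32_t> counts = getNgrams(data, n);
    std::vector<std::uint32_t> result;
    if (counts.empty()) return result;
    for (std::size_t i = 0; i + n <= data.size(); ++i)
        result.push_back(counts.at(data.substr(i, n)));
    return result;
}
```

Under these assumptions, the input "abab" with n = 2 contains the n-grams "ab" (twice) and "ba" (once), so `getNgrams` yields a map with "ab" mapped to 2 and "ba" mapped to 1, and `countNgrams` yields the list 2, 1, 2.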
- Ports (0)
- The Ngrams operator is configurable with one output port. The output port is mutating and its punctuation mode is Preserving.
- Properties