Content
- Operators
  - ContentRanking: This operator uses a previously trained model to determine the most likely intent of a text.
- Types
Composites
composite ContentRanking(output OutStream; input InStream)
This operator uses a previously trained model to determine the most likely intent of a text.
The ContentRanking operator is intended for Streams releases that do not provide SPL Python primitive support (e.g. Streams 3.2). It uses a ShellPipe operator to invoke Python scripts. With Streams release 4.2 or later, it is recommended to create an SPL Python primitive operator that invokes Python classes or functions instead.
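As a rough illustration of the pipe-based pattern described above, a script driven through a shell pipe would typically read one document per line and write one result line per document. The sketch below is not the toolkit's ContentRanking.py: the line-based protocol is an assumption, and `rank` is a hypothetical stand-in for the trained model that simply matches article names against the words of the document.

```python
import io


def rank(document, kb_names):
    """Hypothetical stand-in for the trained model: return the article IDs
    whose name occurs as a word in the document, in knowledge-base order."""
    words = set(document.lower().split())
    return [name for name in kb_names if name.lower() in words]


def process(in_stream, out_stream, kb_names):
    """Read one document per line; write one line of space-separated IDs."""
    for line in in_stream:
        ids = rank(line.rstrip("\n"), kb_names)
        out_stream.write(" ".join(ids) + "\n")
        # Flush after every result so the process reading the pipe is not stalled.
        out_stream.flush()


# Demo with in-memory streams; a real script would pass sys.stdin and sys.stdout.
demo_out = io.StringIO()
process(io.StringIO("My printer lost the network\n"), demo_out,
        ["printer", "network", "password"])
print(demo_out.getvalue(), end="")
```

Flushing after each line matters in this kind of setup because the calling side waits for a result per input tuple; buffered output could otherwise delay results indefinitely.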
Parameters
- pythonCommand: The name of the Python binary. The default is python. Use this parameter to select the version and location of the Python command for your environment. The content ranking scripts require Python 2.7 or later.
- pythonScript: The name of the python script. The default is <toolkit_dir>/etc/python/ContentRanking.py.
- modelFile: The trained model dumped as a pklz file.
- kbIndexFile: The knowledge base dictionary as a JSON file.
- dictLemmasFile: The lemmas dictionary as a JSON file.
- kbNamesFile: The knowledge base article IDs as a text file. Article names (IDs) are separated by a space character.
- documentAttribute: The attribute used for the content ranking.
- outStreamType: The schema of OutStream (output port 0) of this operator. The schema must include the attributes defined by ContentRanking.resultType.
- initOnFirstTuple: By default, the script is called on operator startup. If this parameter is set to true, the script is called when the first tuple arrives instead.
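To make the file formats named in the parameter list more concrete, here is a minimal sketch of how a script might read the space-separated article IDs and a JSON dictionary. The helper names `load_kb_names` and `load_json_dict`, the file names, and the sample contents are all invented for illustration; the actual structure of the knowledge base and lemma dictionaries is not specified here.

```python
import json
import os
import tempfile


def load_kb_names(path):
    """Read article IDs from a text file.

    split() with no argument splits on any whitespace, so a trailing
    newline after the space-separated IDs is harmless.
    """
    with open(path) as f:
        return f.read().split()


def load_json_dict(path):
    """Read a dictionary stored as a JSON file."""
    with open(path) as f:
        return json.load(f)


# Demo using temporary files with made-up content (not real toolkit data).
tmp_dir = tempfile.mkdtemp()

names_path = os.path.join(tmp_dir, "kb_names.txt")
with open(names_path, "w") as f:
    f.write("KB001 KB002 KB003\n")

index_path = os.path.join(tmp_dir, "kb_index.json")
with open(index_path, "w") as f:
    json.dump({"printer": ["KB001"]}, f)

kb_names = load_kb_names(names_path)
kb_index = load_json_dict(index_path)
print(kb_names)
print(kb_index)
```

The model file is left out of the sketch on purpose, since the exact serialization behind the pklz extension is not described in this page.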
Input Ports
- InStream: Each tuple carries one document.
Output Ports
- OutStream: The result tuple with a single attribute (a list of article IDs). This stream must include the schema defined by ContentRanking.resultType.
Static Types
- ContentRanking.resultType = tuple<list<rstring> articleIDs>;