Installation and Setup
Setup instructions
These are the basic requirements to create Streams applications with Python:
- Set up a Streams instance
- Set up your development environment
- Set up a connection to the Streams instance
Set up a Streams instance
The Python API is used to create a Topology, or application that is executed by the Streams runtime. The Streams runtime can be in the public or private cloud or installed locally.
Choose the option that matches your desired Streams runtime, and follow the steps to install and/or configure the Streams instance.
Streams on Cloud Pak for Data (Recommended)
Cloud Pak for Data v3.0+
If you are developing for a Streams instance running in IBM Cloud Pak for Data, make sure you have installed the add-on and provisioned an instance of the service.
Cloud Pak for Data v2.5
You can optionally install Streams as a stand-alone instance. Watch this video for detailed steps.
See the Installation section in the documentation for more details.
Streaming Analytics service/CPDaaS
Using the Streaming Analytics service or Cloud Pak for Data as a Service (CPDaaS)
The Streaming Analytics service is Streams' Software as a Service offering. You do not need to install Streams to use it. Instead, create an instance of the service in the IBM Cloud. When you have an instance of the service, you can create applications that will run on the service using:
- A notebook in Watson Studio in Cloud Pak for Data as a Service
- Any IDE to develop your Python applications
Create an instance of the Streaming Analytics service
If you have not already done so, create an instance of the Streaming Analytics service in IBM Cloud:
- Go to the IBM Cloud web portal and sign in (or sign up for a free IBM Cloud account).
- Click Catalog, browse for the Streaming Analytics service, and then click it.
- Enter the service name and then click Create to set up your service. The service dashboard opens and your service starts automatically. The service name appears as the title of the service dashboard.
Local installation of Streams
Developing for a local Streams installation
These steps assume that you are installing Python 3.6 from Anaconda on a Linux workstation.
Install version 4.2 or later of IBM Streams or the IBM Streams Quick Start Edition:
IBM Streams Quick Start Edition Docker image or the Linux version.
(IBM Streams only, doesn't apply to the Quick Start Edition) If necessary, install a supported version of Python. Python 3.5, 3.6 and 3.7 are supported. Python 2.7 support is currently deprecated.
Important: Python 3.6 is required to build application bundles that can be submitted to your IBM Streams on Cloud Pak for Data or Cloud Pak for Data as a Service.
You can choose from one of these options:
(Recommended) Anaconda
CPython: https://www.python.org
If you build Python from source, remember to pass --enable-shared as a parameter to configure. After installation, set the LD_LIBRARY_PATH environment variable to <Python_Install>/lib.
Streams also includes a version of the streamsx package. To make sure you are using the latest version of streamsx and not the one bundled with Streams, you should either:
- Remove the PYTHONPATH environment variable, e.g. unset PYTHONPATH
- Or, make sure that PYTHONPATH does not include a path ending with com.ibm.streamsx.topology/opt/python/package.
Tip: Add the unset PYTHONPATH line to your home-directory/.bashrc shell initialization file. Otherwise, you'll have to enter the command every time you start IBM Streams.
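The PYTHONPATH condition described above can also be checked from Python before importing streamsx. This is a minimal sketch using only the standard library; the helper name find_bundled_streamsx_paths is made up for illustration and is not part of streamsx.

```python
import os

# Path suffix of the streamsx copy bundled with Streams (taken from the text above).
BUNDLED_SUFFIX = "com.ibm.streamsx.topology/opt/python/package"


def find_bundled_streamsx_paths(pythonpath=None):
    """Return PYTHONPATH entries that point at the bundled streamsx package.

    Hypothetical helper: reads the PYTHONPATH environment variable unless a
    value is passed in explicitly.
    """
    if pythonpath is None:
        pythonpath = os.environ.get("PYTHONPATH", "")
    entries = [p for p in pythonpath.split(os.pathsep) if p]
    # Ignore a trailing slash so ".../package/" is also detected.
    return [p for p in entries if p.rstrip("/").endswith(BUNDLED_SUFFIX)]


offending = find_bundled_streamsx_paths()
if offending:
    print("PYTHONPATH still includes the bundled streamsx package:", offending)
else:
    print("PYTHONPATH does not reference the bundled streamsx package.")
```

If the check reports an entry, either unset PYTHONPATH or remove that entry before starting Python.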
Set the PYTHONHOME application environment variable on your Streams instance by entering the following streamtool command on the command line:
streamtool setproperty -i <INSTANCE_ID> -d <DOMAIN_ID> --application-ev PYTHONHOME=<path_to_python_install>
For example, if using the Quick Start Edition:
streamtool setproperty -i StreamsInstance -d StreamsDomain --application-ev PYTHONHOME=/opt/pyenv/versions/3.6.1 --embeddedzk
You can also set the environment variable from the Streams Console in your service.
- Not required for the Quick Start Edition: Configure your Streams instance to use SSH keys instead of password authentication. See the documentation for details.
Note: If your applications are a mix of Python and SPL (Streams Processing Language) code, a local installation of Streams is required.
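If you script your local setup, the streamtool setproperty command from the PYTHONHOME step above can be assembled in Python. The sketch below only builds and prints the command, using the flags shown in this section; the function name pythonhome_command is hypothetical, and the values are the Quick Start Edition example values.

```python
import shlex


def pythonhome_command(instance_id, domain_id, python_home, embedded_zk=False):
    """Build the argument list for `streamtool setproperty` (hypothetical helper)."""
    args = [
        "streamtool", "setproperty",
        "-i", instance_id,
        "-d", domain_id,
        "--application-ev", "PYTHONHOME=" + python_home,
    ]
    if embedded_zk:
        # The Quick Start Edition example in this section adds this flag.
        args.append("--embeddedzk")
    return args


cmd = pythonhome_command("StreamsInstance", "StreamsDomain",
                         "/opt/pyenv/versions/3.6.1", embedded_zk=True)
# Print a shell-safe rendering of the command for review.
print(" ".join(shlex.quote(a) for a in cmd))

# To execute it on a host where streamtool is on the PATH:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems if the Python install path contains spaces.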
Set up your development environment
To get your development environment ready:
- Install Python on your local development environment. The version of Python you install must be supported by the Streams instance.
- Install the streamsx Python package.
- Install a Java 1.8 JRE, if you do not already have one.
See the following sections for more information on these steps.
Install a supported version of Python
Make sure you have the right version of Python for your Streams instance:
- For the Streaming Analytics service in IBM Cloud, use Python 3.6.
- For a local installation of IBM Streams, Python 3.5, 3.6 or 3.7 are supported.
- IBM Cloud Pak for Data:
  - The Streams add-on is pre-configured with Python 3.6, so install Python 3.6.
  - For a standalone installation of Streams, make sure you install, at a minimum, the same version of Python installed with Streams.
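The version rules above can be turned into a quick check of the interpreter you are about to use. This is a sketch only: the target names in the table are labels invented for this example, and the standalone case is left out because it depends on the Python version installed with your Streams instance.

```python
import sys

# Supported Python versions per target, as listed above.
# The dictionary keys are hypothetical labels, not streamsx identifiers.
SUPPORTED = {
    "streaming_analytics": {(3, 6)},
    "local": {(3, 5), (3, 6), (3, 7)},
    "cloud_pak_for_data": {(3, 6)},
}


def is_supported(target, version=None):
    """Return True if `version` (default: the running interpreter) fits `target`."""
    if version is None:
        version = sys.version_info[:2]
    # Compare on (major, minor) only; the patch level does not matter here.
    return tuple(version[:2]) in SUPPORTED[target]


current = "%d.%d" % sys.version_info[:2]
for target in sorted(SUPPORTED):
    status = "supported" if is_supported(target) else "not supported"
    print("Python %s for %s: %s" % (current, target, status))
```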
Install the streamsx package
- Use pip to install streamsx:
  pip install streamsx
  If streamsx is already installed, upgrade to the latest version:
  pip install --upgrade streamsx
- Set the JAVA_HOME environment variable to a Java 1.8 JRE or JDK/SDK.
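To confirm both steps took effect, you can run a short check in the Python environment you plan to develop in. The sketch below uses only the standard library; check_dev_environment is a hypothetical helper, and it only verifies that JAVA_HOME points to a directory, not that the directory holds a Java 1.8 runtime.

```python
import importlib.util
import os


def check_dev_environment(environ=None):
    """Return a list of human-readable problems (empty when none are found)."""
    if environ is None:
        environ = os.environ
    problems = []
    # find_spec returns None when the package cannot be imported.
    if importlib.util.find_spec("streamsx") is None:
        problems.append("streamsx is not installed; run: pip install streamsx")
    java_home = environ.get("JAVA_HOME")
    if not java_home:
        problems.append("JAVA_HOME is not set")
    elif not os.path.isdir(java_home):
        problems.append("JAVA_HOME does not point to a directory: " + java_home)
    return problems


for problem in check_dev_environment():
    print("WARNING:", problem)
```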
For the most complete instructions regarding installation, including when a local installation of Streams is required, see the developer setup page of the streamsx project documentation.
Set up a connection to the Streams instance
A Streams Python application, or Topology, must always be compiled and run on a Streams instance.
After defining the application, you programmatically submit the Topology to the Streams instance to be compiled and run using the streamsx.topology.context.submit function.
Below is sample code that you can use to connect to the Streams instance and submit your Topology. Copy it now and add it to your Python script or to a cell in your notebook.
You will see an example of how this sample code is used later in this tutorial.
Using a project in Cloud Pak for Data
Submit an application from a notebook in Cloud Pak for Data
In this context you need to provide the name of the Streams instance.
To find your Streams instance name:
- From the navigation menu, click Services > Instances.
- Select the Streams instance you want to use, and set the value of STREAMS_INSTANCE_ID where indicated in the code.
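A minimal sketch of a notebook cell for this context follows. It assumes the icpd_core module that Cloud Pak for Data makes available inside notebooks, and that icpd_util.get_service_instance_details accepts the instance name as shown; the exact signature can differ between Cloud Pak for Data releases. The function name and the placeholder instance name are illustrative.

```python
def submit_from_cpd_notebook(topo, streams_instance_id, ssl_verify=False):
    """Submit `topo` from a notebook inside a Cloud Pak for Data project.

    Assumption: `icpd_core` is only importable inside Cloud Pak for Data,
    so both imports are done lazily, when the function is called.
    """
    from icpd_core import icpd_util
    from streamsx.topology import context

    # Look up the connection details of the named Streams instance.
    cfg = icpd_util.get_service_instance_details(name=streams_instance_id)
    # Certificate verification is off by default in this sketch, which suits
    # clusters with self-signed certificates; pass ssl_verify=True otherwise.
    cfg[context.ConfigParams.SSL_VERIFY] = ssl_verify
    return context.submit(context.ContextTypes.DISTRIBUTED, topo, config=cfg)


# Placeholder: change this to the name of your Streams instance.
STREAMS_INSTANCE_ID = "my-streams-instance"

# Once a Topology named `topo` exists:
# submission_result = submit_from_cpd_notebook(topo, STREAMS_INSTANCE_ID)
```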
Submit to Cloud Pak for Data without a project
Submit without using a Cloud Pak for Data project
Collect the following information. Set the values for each variable where indicated.
- CP4D_URL: Cloud Pak for Data deployment URL, e.g. https://cp4d_server:31843.
- STREAMS_INSTANCE_ID:
  - From the navigation menu, click My instances.
  - Click the Provisioned Instances tab.
  - Select the Streams instance you want to use, and set the value of STREAMS_INSTANCE_ID where indicated in the code.
- STREAMS_USERNAME: (optional) User name to submit the job as, defaulting to the current operating system user name.
- STREAMS_PASSWORD: Password for authentication.
See the documentation or contact your administrator for details.
If you are using a username to authenticate, enter it when prompted; otherwise, delete that line before running the code.
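One way to wire these values together is sketched below. It assumes that streamsx reads CP4D_URL, STREAMS_INSTANCE_ID, STREAMS_USERNAME and STREAMS_PASSWORD from the environment, as its documentation describes for this context; the two function names and the example values are placeholders.

```python
import getpass
import os


def configure_cpd_connection(url, instance_id, password, username=None, environ=None):
    """Store the connection settings in the environment variables listed above."""
    if environ is None:
        environ = os.environ
    environ["CP4D_URL"] = url
    environ["STREAMS_INSTANCE_ID"] = instance_id
    environ["STREAMS_PASSWORD"] = password
    if username:
        # Optional: when unset, the operating system user name is used.
        environ["STREAMS_USERNAME"] = username
    return environ


def submit_to_cpd(topo, ssl_verify=False):
    """Submit `topo`; requires the streamsx package (imported lazily)."""
    from streamsx.topology import context

    cfg = {context.ConfigParams.SSL_VERIFY: ssl_verify}
    return context.submit(context.ContextTypes.DISTRIBUTED, topo, config=cfg)


# Example with placeholder values, once a Topology named `topo` exists:
# configure_cpd_connection("https://cp4d_server:31843", "my-streams-instance",
#                          getpass.getpass("Password: "),
#                          username=input("Username: "))
# submit_to_cpd(topo)
```

Prompting with getpass keeps the password out of the script and out of the notebook history.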
Copy this code snippet:
CPDaaS/Streaming Analytics service
Code to submit to the Streaming Analytics service
To connect to the Streaming Analytics service in IBM Cloud, you need to get the service credentials from the Streaming Analytics service dashboard. To copy your service credentials, open the Streaming Analytics service dashboard, click Service Credentials, then View Credentials, and copy the contents of the cell. Click Add new credentials if there are no credentials listed.
Copy this code snippet:
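A minimal sketch of such a snippet, assuming the streamsx package is installed and that the copied credentials are pasted in as a JSON string. The function names and the service name passed to them are placeholders; the configuration keys come from streamsx.topology.context.ConfigParams.

```python
import json


def build_vcap_services(service_name, credentials_json):
    """Wrap the pasted service credentials in the VCAP-services layout."""
    credentials = json.loads(credentials_json)
    return {
        "streaming-analytics": [
            {"name": service_name, "credentials": credentials}
        ]
    }


def submit_to_streaming_analytics(topo, service_name, credentials_json):
    """Submit `topo` to the service; requires streamsx (imported lazily)."""
    from streamsx.topology import context

    cfg = {
        context.ConfigParams.VCAP_SERVICES: build_vcap_services(service_name, credentials_json),
        context.ConfigParams.SERVICE_NAME: service_name,
    }
    return context.submit(
        context.ContextTypes.STREAMING_ANALYTICS_SERVICE, topo, config=cfg)


# Once a Topology named `topo` exists, with the credentials copied from the dashboard:
# submit_to_streaming_analytics(topo, "Streaming-Analytics", '<paste credentials here>')
```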
Local install
Code to submit to Streams v4.2 or v4.3
Use these steps if Streams is installed locally, or if you are using the Streams Quick Start Edition (QSE). If you are using the Streams Quick Start Edition, you do not have to modify the code as it uses the default instance and domain IDs.
Otherwise, make sure that the STREAMS_INSTANCE_ID and STREAMS_DOMAIN_ID are set to match your installation.
Copy this code snippet:
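A minimal sketch of such a snippet, assuming it runs in a shell where the local Streams environment is already set up and streamsx is installed. The default IDs are the Quick Start Edition values used in the streamtool example earlier on this page; the helper names are illustrative.

```python
import os

# Quick Start Edition defaults, as used in the streamtool example on this page.
QSE_DEFAULTS = {
    "STREAMS_INSTANCE_ID": "StreamsInstance",
    "STREAMS_DOMAIN_ID": "StreamsDomain",
}


def apply_local_defaults(environ=None):
    """Fill in the instance and domain IDs without overriding existing values."""
    if environ is None:
        environ = os.environ
    for name, value in QSE_DEFAULTS.items():
        environ.setdefault(name, value)
    return environ


def submit_locally(topo):
    """Submit `topo` to the local instance; requires streamsx (imported lazily)."""
    from streamsx.topology import context

    apply_local_defaults()
    return context.submit(context.ContextTypes.DISTRIBUTED, topo)


# Once a Topology named `topo` exists:
# submit_locally(topo)
```

Because setdefault is used, IDs that you have already exported for a non-QSE installation are left untouched.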
Streams on Kubernetes/OpenShift
Code to submit to a standalone installation of Streams v5
In order to submit a Streams application you need the following information from the Streams instance:
- STREAMS_BUILD_URL: Streams build service URL, e.g. when the service is exposed as node port: https://<NODE-IP>:<NODE-PORT>
- STREAMS_REST_URL: Streams SWS service (REST API) URL
- STREAMS_USERNAME: (optional) User name to submit the job as, defaulting to the current operating system user name.
- STREAMS_PASSWORD: Password for authentication.
The documentation has the steps to retrieve the URLs for the Build and REST service. Set the values for each variable where indicated in the code.
Copy this code snippet:
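A minimal sketch of such a snippet, assuming the four values are supplied as environment variables of the same names and that streamsx is installed. It checks that the required settings are present before attempting a submission; the helper names are hypothetical.

```python
import os

# STREAMS_USERNAME is optional, so it is not part of the required set.
REQUIRED = ("STREAMS_BUILD_URL", "STREAMS_REST_URL", "STREAMS_PASSWORD")


def missing_settings(environ=None):
    """Return the names of required settings that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED if not environ.get(name)]


def submit_to_standalone(topo, ssl_verify=False):
    """Submit `topo` to a standalone Streams v5 instance (streamsx imported lazily)."""
    missing = missing_settings()
    if missing:
        raise RuntimeError(
            "Set these environment variables first: " + ", ".join(missing))
    from streamsx.topology import context

    cfg = {context.ConfigParams.SSL_VERIFY: ssl_verify}
    return context.submit(context.ContextTypes.DISTRIBUTED, topo, config=cfg)


# Once a Topology named `topo` exists and the variables are exported:
# submit_to_standalone(topo)
```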
Create your first application
Now you are ready to create your first application with the Streams Python API.
The application will ingest temperature readings from a simulated sensor and compute the rolling average reading for each sensor.
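To make the goal concrete before introducing the API, here is the same computation in plain Python, without Streams. It is only an illustration of a per-sensor rolling average: the reading format, sensor names, and window size are assumptions made for this sketch, not the data or API used in the tutorial application.

```python
import random
from collections import defaultdict, deque


def simulated_readings(n, sensor_ids=("sensor_1", "sensor_2"), seed=0):
    """Yield `n` fake temperature readings as dictionaries."""
    rng = random.Random(seed)
    for _ in range(n):
        yield {"id": rng.choice(sensor_ids), "value": rng.gauss(20.0, 2.0)}


def rolling_averages(readings, window=3):
    """Yield (sensor id, average of that sensor's last `window` readings)."""
    # One bounded queue per sensor; old readings fall out automatically.
    windows = defaultdict(lambda: deque(maxlen=window))
    for reading in readings:
        recent = windows[reading["id"]]
        recent.append(reading["value"])
        yield reading["id"], sum(recent) / len(recent)


for sensor_id, average in rolling_averages(simulated_readings(5)):
    print(sensor_id, round(average, 2))
```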
Learn more about the API
After you create your first application, visit the Process data with common Streams transforms section to learn more about the API.