Cybersecurity Toolkit - Getting Started

Introduction

The Cybersecurity Toolkit provides operators that are capable of analyzing network traffic and detecting suspicious behaviour.

To get started with the Cybersecurity Toolkit, it is highly recommended that you use the sample applications as a baseline for building cybersecurity applications. In many cases, the data must be pre-processed (filtered and enriched) before it is analyzed; otherwise the analytics will not work correctly. The sample applications contain the pre-processing operators that the analytics need. The three introductory sample projects are:

  • DomainProfiling - Detects suspicious behaviour based on profiles built using domains found in DNS response traffic.
  • HostProfiling - Detects suspicious behaviour based on profiles built using hosts found in DNS response traffic.
  • PredictiveBlacklisting - Predicts whether a domain should be added to a blacklist.

Because the Cybersecurity Toolkit is focused on analyzing network traffic, you need the com.ibm.streamsx.network toolkit, found in the streamsx.network GitHub repository. When you build the samples, the build.xml file contained in each sample application automatically downloads the latest release of the com.ibm.streamsx.network toolkit and places it in the same directory as the samples.

Download Quick Start Edition VM

See the Installing Streams Quick Start Edition VM Image for more information.

Install Dependencies - Quick Start Edition VM

If you are using the Quick Start VM, you will need to download and build the following dependencies in order to use the cybersecurity toolkit:

  • GNU Bison
  • Flex
  • libpcap

GNU Bison

  1. Navigate to http://ftp.gnu.org/gnu/bison/ and download the latest version of GNU Bison to the Quick Start VM. As of the time of this writing, the latest version of GNU Bison was 3.0.4.
  2. Execute the following commands to extract the tarball and run the install:

    tar -xvf bison-3.0.4.tar.gz
    cd bison-3.0.4
    ./configure
    make
    sudo make install

Flex

  1. Navigate to http://flex.sourceforge.net and download the latest version of Flex to the Quick Start VM. When this guide was written, the latest version of Flex was 2.5.39.
  2. Execute the following commands to extract the tarball and run the install:

    tar -xvf flex-2.5.39.tar.gz
    cd flex-2.5.39
    ./configure
    make
    sudo make install

libpcap

  1. Navigate to http://www.tcpdump.org and download the latest version of libpcap to the Quick Start VM. When this guide was written, the latest version of libpcap was 1.7.4.
  2. Execute the following commands to extract the tarball and run the install:

    tar -xvf libpcap-1.7.4.tar.gz
    cd libpcap-1.7.4
    ./configure
    make
    sudo make install
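
After the three installs finish, it can be useful to confirm that the tools are visible on the PATH. The following is a minimal sketch, not part of the toolkit: check_tool is a hypothetical helper, and it assumes that libpcap's "make install" placed a pcap-config script on the PATH (a default install typically uses /usr/local/bin).

```shell
# Report whether a command is available on the current PATH.
# check_tool is a hypothetical helper introduced for this example.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# bison and flex install executables of the same name. pcap-config is
# assumed to be installed alongside libpcap by "make install".
for tool in bison flex pcap-config; do
    check_tool "$tool"
done
```

Any tool reported as missing most likely means the corresponding install step failed or the install prefix is not on the PATH.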

Install SPSS (Optional)

The SPSS Modeler Solution Publisher is only required if you want to run the PredictiveBlacklisting sample application. In order to download and install this version of SPSS, you need a license for the product.

  1. Download and install SPSS Modeler Solution Publisher on the Quick Start VM.
  2. Modify the /home/streamsadmin/.bashrc file and set the CLEMRUNTIME environment variable to the SPSS install path:

    echo "export CLEMRUNTIME=/usr/IBM/SPSS/ModelerSolutionPublisher/17.0/" >> /home/streamsadmin/.bashrc
    source ~/.bashrc
  3. Download and extract the com.ibm.spss.streams.analytics toolkit using the following commands:

    cd Downloads
    wget https://github.com/IBMPredictiveAnalytics/streamsx.spss.v4/raw/master/com.ibm.spss.streams.analytics.tar.gz
    tar -xvf com.ibm.spss.streams.analytics.tar.gz
  4. At this point, the PredictiveBlacklisting sample application can be compiled using the steps below.

NOTE: When building the PredictiveBlacklisting sample using ant, you must specify the spss.toolkit.path property on the command line and set its value to the toolkit path. For example: ant -Dspss.toolkit.path=/home/streamsadmin/Downloads/com.ibm.spss.streams.analytics
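
A small pre-flight check can catch a wrong toolkit path before the build starts. The sketch below is illustrative only: spss_build_command is a hypothetical helper that prints the ant command from the note rather than running it, and the default path assumes the toolkit was extracted into the Downloads directory as in step 3.

```shell
# Print the ant command for the PredictiveBlacklisting build, or fail if
# the SPSS toolkit directory does not exist.
# spss_build_command is a hypothetical helper introduced for this example.
spss_build_command() {
    toolkit_path="$1"
    if [ ! -d "$toolkit_path" ]; then
        echo "SPSS toolkit not found at $toolkit_path" >&2
        return 1
    fi
    echo "ant -Dspss.toolkit.path=$toolkit_path"
}

# Assumed location: the toolkit extracted under ~/Downloads.
spss_build_command "$HOME/Downloads/com.ibm.spss.streams.analytics" || true
```

If the directory exists, the printed command can be copied and run from the sample directory.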

Sample Applications

The cybersecurity toolkit sample applications should be used as a baseline for building cybersecurity applications on Streams. The samples contain the necessary filter and enrichment operators that allow the analytics to work properly.

By default, the sample applications will use the PacketFileSource operator (found in the com.ibm.streamsx.network toolkit) to read sample PCAP files packaged with the toolkit. However, this operator can easily be replaced with the PacketLiveSource operator, which allows for ingesting and parsing live data.

Download/Build/Run

The following steps can be taken to download, compile and run the cybersecurity samples:

  1. From the command line, clone the samples GitHub repository and navigate to the ‘cybersecurity’ directory. Here you will find directories containing the sample applications:

    git clone https://github.com/IBMStreams/samples.git
    cd samples/cybersecurity
    ls
    DomainProfilingSamples  HostProfilingSamples  PredictiveBlacklistingSamples
  2. Navigate to the DomainProfilingSamples directory. The directory contains a build.xml file that will download any necessary dependencies (including the networking toolkit) and compile one of the applications. Run the ant command to kick off the build.

    ant
  3. Use the Streams Console to submit the application to the instance. To get the URL for the Streams Console, run the following command:

    streamtool geturl
    https://streamsqse.localdomain:8443/streams/domain/console
  4. Open this URL in a browser to load the Streams Console.

  5. At the top of the Streams Console, switch to the Application Dashboard, which allows you to submit, cancel and monitor applications.

  6. With the Application Dashboard open, click the Submit Job icon. Select the *.sab file found in the ‘output/’ directory of the sample application. For example, for the DomainProfilingSample application, you would select this file: /path/to/DomainProfilingSample/output/DomainProfilingBasic_Output/com.ibm.streams.cybersecurity.sample.DomainProfilingBasic.sab

  7. Once the application has been submitted, the Streams Console displays the running application.
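
For step 6, the compiled bundle can be located from the command line before switching to the console. This is a minimal sketch: find_sab is a hypothetical helper, and it assumes the command is run from a sample directory after ant has completed, so that an ‘output’ directory exists.

```shell
# List every .sab bundle under the given build output directory.
# find_sab is a hypothetical helper introduced for this example.
find_sab() {
    find "$1" -type f -name '*.sab' | sort
}

# Run from a sample directory after "ant" has finished.
if [ -d output ]; then
    find_sab output
fi
```

The printed path is the file to select in the Submit Job dialog. It could also be passed to streamtool submitjob if you prefer the command line, depending on how the domain and instance are configured.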

Analyze Output

The sample applications write the results of the analytics to the data directory. Two files are generated in this directory:

  • dpresults_suspicious.csv - lists the domains that were classified as suspicious
  • dpresults_benign.csv - lists the domains that were classified as benign

For the DomainProfilingBasic application, only the classified domains are written to the files. However, you will generally want to output additional information, such as the IP addresses of the hosts that accessed these domains. The ‘DomainProfilingExtended’ sample application demonstrates how to collect the set of unique IPs that accessed each domain.
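
A quick way to get a feel for the results is to count how many entries landed in each of the two files. The sketch below makes several assumptions: count_rows is a hypothetical helper, DATA_DIR is a placeholder for the application's data directory, and each non-empty line is treated as one record (the exact column layout and any header row are not specified here).

```shell
# Count the non-empty lines in a results file.
# count_rows is a hypothetical helper introduced for this example.
count_rows() {
    awk 'NF { n++ } END { print n + 0 }' "$1"
}

# DATA_DIR is a placeholder; point it at the application's data directory.
DATA_DIR="${DATA_DIR:-data}"
for f in dpresults_suspicious.csv dpresults_benign.csv; do
    if [ -f "$DATA_DIR/$f" ]; then
        echo "$f: $(count_rows "$DATA_DIR/$f") rows"
    else
        echo "$f: not found in $DATA_DIR"
    fi
done
```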

Importing into Streams Studio

The cybersecurity sample applications can be imported into Streams Studio as SPL Projects. When importing the cybersecurity samples, you must add the com.ibm.streamsx.network toolkit location to Streams Explorer. If you are planning on using the PredictiveBlacklisting samples, you must add the com.ibm.spss.streams.analytics toolkit to Streams Explorer as well.