Resource Managers

This project focuses on running IBM InfoSphere Streams with popular cluster resource managers.

Cluster resource managers let multiple data-intensive computing and storage frameworks coexist within the same cluster and share its physical resources. They also let practitioners mix programming paradigms to combine disparate datasets, for instance analysing data in motion with a stream processing system and data at rest with a batch processing system. The predominant model for cluster management is two-level scheduling: at the first level, the cluster manager allocates resources to each framework; at the second level, each framework distributes its allocation across the jobs it needs to execute. Examples of this type of cluster management include Apache YARN and Apache Mesos. The purpose of this project is to integrate IBM InfoSphere Streams with these cluster managers. As a first step towards this goal, the project currently supports Apache YARN.
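The two-level scheduling model described above can be illustrated with a small toy sketch. This is not how YARN, Mesos, or Streams implement scheduling. The `Framework` class, the `allocate_cluster` function, the equal-split policy, and the job names are all hypothetical, chosen only to show where the two levels sit.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Framework:
    """Hypothetical framework (e.g. a stream or batch engine) with its own jobs."""
    name: str
    job_demands: Dict[str, int]  # job name -> cores requested
    allocation: int = 0          # cores granted by the cluster manager

    def schedule_jobs(self) -> Dict[str, int]:
        """Second level: split this framework's allocation across its jobs."""
        remaining = self.allocation
        placement = {}
        for job, demand in self.job_demands.items():
            granted = min(demand, remaining)
            placement[job] = granted
            remaining -= granted
        return placement


def allocate_cluster(total_cores: int, frameworks: List[Framework]) -> None:
    """First level: the cluster manager divides cores among frameworks.

    Assumption: an equal share per framework, capped by its total demand.
    Real cluster managers apply richer policies (queues, fairness, priorities).
    """
    share = total_cores // len(frameworks)
    for fw in frameworks:
        fw.allocation = min(share, sum(fw.job_demands.values()))


# Two illustrative frameworks sharing a 12-core cluster.
streams = Framework("streams", {"ingest": 4, "filter": 3})
batch = Framework("batch", {"nightly-report": 10})

allocate_cluster(12, [streams, batch])
result = {fw.name: fw.schedule_jobs() for fw in (streams, batch)}
print(result)
# {'streams': {'ingest': 4, 'filter': 2}, 'batch': {'nightly-report': 6}}
```

The cluster manager never sees individual jobs and each framework never sees the other's allocation, which is the separation of concerns that lets different frameworks share one cluster.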

For more information, refer to the Resource Manager Wiki.

For instructions on using Streams with YARN, see the Streams On Yarn README.