Monday, March 18, 2019

Beyond BroControl - A New Process Supervision Model for Zeek

Current State of Affairs

A near-term item on the Zeek Roadmap is to provide an alternative, and eventual successor, to BroControl.  For context on why that's the case, here are the main pain points:
  • Process supervision in an external tool/process like BroControl is flaky.
Other modern process supervision tools recognize the benefit and control gained from being the direct parent of supervised processes.  By moving all the process supervision logic into Zeek itself, we can have more confidence in, and control over, the ongoing health and state of a deployment.
  • It's awkward to develop and test new scripts that are destined for production environments.
The common use-case is load-balanced network traffic across a cluster of worker processes.  We need to make it easy to test, from the command-line, using just PCAP files, a complete cluster deployment (scaled down) as it would work in production.  Having to use a separate/intermediate tool, like BroControl, during development is not conducive to a fluid programming workflow or feedback loop.
  • Atypical system/service/container management and administration.
A goal is to cater more towards modern sysadmin expectations, and we've gotten feedback that the current approach of using BroControl rather than Zeek/Bro directly is not a typical way of operating.  We want to improve that by no longer requiring the installation and use of an entirely separate piece of software.  The main Zeek process will be all that's needed to both supervise a deployment and serve as the central point of integration in existing service/system management schemes.
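
To make the first pain point concrete, here is a minimal sketch of what being the direct parent of a supervised process buys you.  It is written in Python purely for illustration and is not Zeek's design or implementation; the function name `supervise` and its parameters `max_restarts` and `backoff_seconds` are hypothetical.  The point is only that a direct parent learns about a child's exit, and its exit status, immediately and reliably, and can decide to restart it.

```python
import subprocess
import sys
import time


def supervise(argv, max_restarts=3, backoff_seconds=0.0):
    """Run argv as a direct child process and restart it if it fails.

    Hypothetical illustration only. Returns the list of exit codes seen,
    one per run of the child.
    """
    exit_codes = []
    restarts = 0
    while True:
        # The supervisor spawns the child itself, so it is the direct parent.
        child = subprocess.Popen(argv)
        # wait() blocks until the child exits and reports its exit status;
        # no polling of PID files or process tables is needed.
        code = child.wait()
        exit_codes.append(code)
        # Stop on a clean exit, or once the restart budget is used up.
        if code == 0 or restarts >= max_restarts:
            return exit_codes
        restarts += 1
        time.sleep(backoff_seconds)
```

For example, `supervise([sys.executable, "-c", "import sys; sys.exit(3)"], max_restarts=2)` runs the failing child three times in total (the initial run plus two restarts) before giving up.  An external tool that is not the parent has to infer the same information indirectly, which is where much of the flakiness comes from.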

BroControl evolved from a prior tool that was originally built to satisfy a particular research use-case, not necessarily modern deployments.  That's to be expected given how long ago it was written.  However, with a large user-base now depending on Zeek for production use, it's wise to design a new tool that takes the wider community's needs into account from the start.

The Plan

There has already been a brief round of internal discussion, which produced the following design and implementation notes:

Zeek Supervisor Design Doc

To summarize the goal: we want to make the main Zeek/Bro process the point of entry for deployments, so that simply running it creates a cluster deployment comparable to what BroControl would currently configure.

We haven't started implementing any of this yet, so that we can first capture and respond to community feedback.  Please get in touch with any feedback you may have.  The mailing list (zeek@zeek.org) is a good place to discuss.