Thursday, September 12, 2019

Zeek Week to Gather Expert Users and Developers from Around the World to Showcase New Zeek Technology Innovations and Enhancements



The leading event for open-source Zeek network security monitor comes to Seattle

San Francisco, Calif. – Sept. 12, 2019 – Zeek Week 2019 (formerly BroCon), the most important community event for users, developers, incident responders, threat hunters and security architects who rely on the open-source Zeek network security monitor, today announced a full lineup of speakers with areas of expertise including DNSSEC protocol parsing, MITRE ATT&CK-based analytics, SSL/TLS encryption, Zeek performance optimization, and incident response.

The week will kick off with a keynote from Freddy Dezeure, founder and former head of CERT-EU. A renowned expert in cybersecurity and cyber risk management, Dezeure has held a variety of management positions with the European Commission for more than 20 years. He set up the EU Computer Emergency Response Team (CERT-EU) in 2011, and since then it has grown into one of the most respected CERTs in Europe.

Dezeure’s keynote, “Threats are Changing, So are We as Defenders,” will present insights into the current attack trends used by adversaries, their motives and techniques, and the challenges these create for enterprises.

“The changing threat landscape requires us to continuously adapt our defenses to mitigate the risk to our organizations - and society as a whole - to an acceptable level,” said Dezeure of his keynote topic. “Complacency is not an option.”

Zeek Week, presented by the Zeek open-source community and hosted by Corelight, providers of the most powerful network visibility solution for cybersecurity, is an annual user conference featuring technical talks, demonstrations and discussions about the project, its many applications, and its future.

“This past year we have seen a major rise in innovation across the Zeek user community and we are excited to highlight many of these new use cases and developments at Zeek Week,” said Dr. Vern Paxson, Zeek creator and co-founder of Corelight. “In the more than two decades since Zeek was created, the technology has thrived, thanks in part to a dedicated and growing user community that has augmented the platform with powerful new functionality.

“I look forward to this gathering every year because it provides the single greatest opportunity to learn how open source Zeek is transforming network traffic analysis for thousands of users and organizations around the globe,” added Paxson.

Zeek Week content and sessions are focused on the ever-evolving cybersecurity landscape and how Zeek is helping organizations across the public and private sectors by providing better data and network traffic analytics. In addition, this year’s conference will include announcement of the winners of the Zeek Package Contest, which will award the creators of five of the most innovative and useful open source Zeek packages that extend Zeek’s threat hunting and detection capabilities.

The full agenda is now live and scheduled speakers include:

  • Vlad Grigorescu, ESnet
  • Mark Fernandez, The MITRE Corporation
  • Jim Mellander, Lawrence Berkeley National Laboratory
  • Robin Sommer, Corelight and Zeek Leadership Team
  • Fatema Bannat Wala, University of Delaware
  • Jordi Ros-Giralt, Reservoir Labs
  • Justin Azoff, Corelight
  • Michal Purzynski, Mozilla Corporation and Zeek Leadership Team
  • Adam Pumphrey, Nimbus LLC
  • Seth Hall, Corelight and Zeek Leadership Team
  • Aashish Sharma, Lawrence Berkeley National Laboratory and Zeek Leadership Team
  • Justin Kohler, Gigamon
  • Jason Lu, Gigamon
  • Johanna Amann, ICSI, Corelight and Zeek Leadership Team
  • Nick Skelsey, Secure Network
  • Keith Lehigh, Indiana University and Zeek Leadership Team
  • Amber Graner, Corelight


Zeek Week 2019 will take place at the King St. Ballroom & Perch at Embassy Suites by Hilton in Seattle, Wash., October 8-11. For additional information, or to register, visit https://www.zeekweek.com.

Zeek Week 2019 is generously sponsored by Bricata, Humio, AlphaSOC, Reservoir Labs, BluVector, Gigamon, and Brim Security.


About Zeek

Zeek (formerly known as Bro) is a powerful open-source network analysis framework that is much different from the typical IDS you may know. While focusing on network security monitoring, Zeek provides a comprehensive platform for more general network traffic analysis as well. For more information, visit https://www.zeek.org.

Thursday, August 8, 2019

Zeek 3.0.0 RC1 released

(Note: We will update this blog posting for the final release.  Please provide feedback on anything that would be helpful to add.)



We just published a release candidate for Zeek 3.0.0—our first major release since Bro 2.0 came out in 2012. This version is quite special as it undertakes The Big Zeekification™: It is executing on the technical side of the name change that we announced last year by now renaming the tool itself, including binaries, scripts, and even some events. “Bro” is now “Zeek.” 

This name change brings some disruption for existing users, which is unavoidable for a long-term codebase where the original name had more than 20 years to proliferate into pretty much every corner. Nevertheless, we have been trying hard to keep Zeek 3.0.0 backwards compatible with Bro 2.6 as much as possible to facilitate smooth upgrades. Wherever we reasonably could, we put aliases and redirects in place so that old names remain working in parallel to the new ones. When using the old names, you will in many cases see explicit deprecation warnings that point you to the places that need updating. These transition mechanisms will remain in place for the Zeek 3.0.x series. We’ll remove them with the next feature release 3.1.0 and likewise with the next long-term stable release 4.0.0, in accordance with our new release schedule.

Below is a more detailed summary of the main changes coming with the renaming. In addition, Zeek 3.0.0 comes with a number of new features as well, including:

  • New analyzers for NTP and MQTT, and extended analyzers for DNS (SPF/DNSSEC), RDP, SMB, and TLS. 
  • Support for decapsulating VXLAN tunnels.
  • Support for logging in UTF8.
  • Several extensions of the scripting language:
    • Closures for anonymous functions
    • Iteration over key/value pairs of a table through for ( key, value in t ) ...
    • Python-style vector slicing (v[2:4])
    • A new data structure, paraglob, for efficiently matching strings against a large list of globs.
  • See the NEWS file for more detailed release notes, and CHANGES for the complete list of changes

Upgrading to Zeek 3


The following summarizes the main naming-related changes that you will encounter after installing Zeek 3.0.0. Unless otherwise noted, the Bro 2.6 names and paths will continue to work with this release, but often trigger deprecation warnings.

  • The names of all executables that had “bro” in their name have changed: bro -> zeek, bro-config -> zeek-config, broctl -> zeekctl, bro-cut -> zeek-cut. Zeek 3.0.0 installs wrappers under the old names that will let them continue to work.
  • The default install prefix is now /usr/local/zeek instead of /usr/local/bro. If your existing installation used the previous default and you are using the new default when upgrading, we'll symlink /usr/local/zeek to /usr/local/bro. Certain subdirectories get similar treatment: share/bro, include/bro, and lib/bro.
  • Along with BroControl becoming ZeekControl, installation directories and files with broctl in their name have changed to use zeekctl instead. However, these changes remain backwards compatible with previous Bro installations by continuing to pull from existing locations where customizations might have been made. For example, if you have a broctl.cfg file from a previous installation, installing Zeek over it will retain that file and even symlink the new zeekctl.cfg to it.
  • The new extension for Zeek scripts is .zeek. This leads to two major changes:
    • All scripts ending in .bro have been renamed to .zeek. In particular, $prefix/share/bro/site/local.bro has been renamed to local.zeek. However, if you have an existing local.bro file from a previous Bro installation—possibly with customizations made to it—Zeek will install a symlink local.zeek file that points to that pre-existing local.bro. In that case, you may want to just copy local.bro into the new local.zeek location to avoid confusion, but things should generally also work properly without intervention.
    • The search logic for the @load script directive now prefers files ending in .zeek, but will still fall back to loading a .bro file if it exists. E.g. @load foo will first check for a foo.zeek file to load and then otherwise foo.bro. Note that @load foo.bro (with the explicit .bro file suffix) prefers the opposite order: it first checks for foo.bro and then falls back to a foo.zeek, if that exists.
  • Changes affecting scripts:
    • The events bro_init, bro_done, and bro_script_loaded are now deprecated; use zeek_init, zeek_done, and zeek_script_loaded instead. Any existing event handlers for the deprecated versions will automatically alias to the new events such that existing code will not break, but their usage will emit deprecation warnings.
    • The functions bro_is_terminating and bro_version are deprecated and replaced by functions named zeek_is_terminating and zeek_version. The old names likewise continue to work with deprecation warnings.
  • The namespace used by all the builtin plugins that ship with Zeek has changed to use Zeek::.
  • Any Broker topic names used in scripts shipped with Zeek that previously were prefixed with bro/ are now prefixed with zeek/ instead. In the case where external applications were using a bro/ topic to send data into a Bro process, a Zeek process still subscribes to those topics in addition to the equivalently named zeek/ topic. In the case where external applications were using a bro/ topic to subscribe to remote messages or query data stores, there's no backwards compatibility and external applications must be changed to use the new zeek/ topic. The NEWS file has a list of the most common topic names that one may need to change.
  • The Broxygen component, which is used to generate our Doxygen-like scripting API documentation, has been renamed to Zeekygen. This likely has no breaking or visible changes for most users, except in the case one used it to generate their own documentation via the --broxygen flag, which is now named --zeekygen. Besides that, various documentation in scripts has also been updated to replace Sphinx cross-referencing roles and directives like :bro:see: with :zeek:see:.
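
The @load search order described in the list above can be modeled in a few lines. The following Python sketch is only an illustration, not Zeek's implementation: candidate_names and resolve_load are hypothetical helper names, the lookup is limited to a single directory, and Zeek's real search path handling is not modeled. How an explicit .zeek suffix behaves is not stated above, so the sketch simply assumes an exact-name lookup in that case.

```python
from pathlib import Path
from typing import List, Optional


def candidate_names(name: str) -> List[str]:
    """Return file names to try, in order, for a given @load argument."""
    if name.endswith(".bro"):
        # Explicit .bro suffix: try the .bro file first, then the .zeek twin.
        stem = name[: -len(".bro")]
        return [name, stem + ".zeek"]
    if name.endswith(".zeek"):
        # Assumption: an explicit .zeek suffix is looked up as-is.
        return [name]
    # No suffix: prefer .zeek, fall back to .bro.
    return [name + ".zeek", name + ".bro"]


def resolve_load(name: str, directory: Path) -> Optional[Path]:
    """Return the first existing candidate in `directory`, or None."""
    for candidate in candidate_names(name):
        path = directory / candidate
        if path.is_file():
            return path
    return None
```

With only foo.bro present, resolve_load("foo", directory) returns the .bro file; once foo.zeek exists next to it, the same call returns foo.zeek instead.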


Upgrading to the Zeek Package Manager


The external package manager switched its name as well, from bro-pkg to zkg. On PyPI, both the old bro-pkg and new zkg packages share the same code-base, so you may continue using bro-pkg if you want, but it’s easy enough to switch for the sake of consistency: run pip uninstall bro-pkg && pip install zkg. Either way, a wrapper script is provided that forwards from bro-pkg to zkg.


Renaming External Packages  


It's up to a package’s maintainer whether they want to rename a package that’s been using “bro” in its name—there’s nothing about such a package name that will be incompatible with Zeek 3.0.0. If you do want to rename your package, we recommend the following process, assuming it’s hosted on GitHub:
  1. Rename your GitHub repository from bro-foo to zeek-foo. GitHub will automatically provide a redirect from the old URL to the new URL, so people who had installed a package using the old URL will still be fine going forward.
  2. Add an alias to the package’s metadata: aliases = zeek-foo bro-foo. This tells zkg that old and new names are referring to the same package, and it will create corresponding symlinks so that explicit @load bro-foo directives will continue to work. See the documentation for more on aliases.
  3. Optionally, update the depends metadata field. The special dependencies zeek and zkg are replacing bro and bro-pkg, respectively, and zkg treats them as aliases. Note, however, that existing bro-pkg installations won’t recognize the new names yet, so you might want to leave them in there to support users who have not yet upgraded. See the documentation for more.
  4. Re-register the renamed package, zeek-foo, with the central package source. Follow the normal directions to update your index file: remove the old URL for bro-foo and add the URL for zeek-foo.
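
Step 2 above can also be scripted if you maintain several packages. The sketch below is a hedged illustration only: it assumes the package metadata is an INI-style file with a [package] section (as in a typical zkg.meta), add_aliases is a hypothetical helper name, and rewriting the file through configparser will normalize its formatting and drop comments.

```python
import configparser
import io


def add_aliases(meta_text: str, new_name: str, old_name: str) -> str:
    """Return metadata text with 'aliases = <new_name> <old_name>' set."""
    # Interpolation is disabled so that '%(...)s' substitutions, which
    # package metadata may contain, are passed through untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(meta_text)
    if not parser.has_section("package"):
        parser.add_section("package")
    parser.set("package", "aliases", f"{new_name} {old_name}")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    original = "[package]\ndescription = Example package\n"
    print(add_aliases(original, "zeek-foo", "bro-foo"))
```

Editing the file by hand is just as valid; the only point is that both the new and the old name end up in the aliases field.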


Common Issues When Upgrading 


  • If you were running Bro as the bro user and intend to use a zeek user now, don't forget to remove/update any potential cron jobs you may have.
  • If you're installing Zeek on an old Bro host, remember to first shut down the old cluster using broctl.
  • Symptoms of overlapping Bro/Zeek installations:
    • Plugins may fail with symbol errors depending on whether you run Zeek or Bro.
    • zkg packages may fail to install with an error that btest can't find init-bare.bro.  This may be caused by certain packages using an old version of the get-bro-env script or bro_dist metadata substitution in combination with having the bro-pkg/zkg configuration set to use a mismatched Bro/Zeek sourcetree. 
  • Not remembering to update zkg configuration (i.e. updating the paths in ~/.zkg/config or ~/.bro-pkg/config in case you’re now using a different source/installation path for Zeek 3.0.0)
  • Not updating PATH environment variable (to either remove an old /usr/local/bro path or to add the new /usr/local/zeek path)
  • Plugins will generally need to be recompiled for Zeek 3.0.0 (as is usually the case with new versions). Plugins that require --bro-dist have been seen to have build issues. The best solution is to switch the plugin to the new skeleton code. However, we will try to address any specific issues if you file a ticket with instructions on how to reproduce.
  • If you run the BHR scripts, you may need to change them to run as the zeek user and update the permissions on the queue directory.
  • Not remembering to update both the location where an external process (e.g. a cron job) writes Intel files and the location where the Intel configuration (e.g. Intel::read_files) expects to read them, in case you choose to use the new default installation path. E.g. if Intel files were previously written to /usr/local/bro and you now want to use /usr/local/zeek, remember to update both the Zeek configuration and whatever external process may be writing the Intel files.
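
Several of the issues above come down to stale references to the old installation prefix. As a rough aid, the following Python sketch checks a PATH value for entries under the old and new default prefixes. It assumes the default prefixes named in this post (/usr/local/bro and /usr/local/zeek); classify_path_entries is a hypothetical helper name, and the same idea would need adapting for custom install locations.

```python
import os
from typing import Dict, List

# Default prefixes mentioned in this post; adjust for custom installs.
OLD_PREFIX = "/usr/local/bro"
NEW_PREFIX = "/usr/local/zeek"


def classify_path_entries(path_value: str) -> Dict[str, List[str]]:
    """Split a PATH string and collect entries under the old/new prefix."""
    found: Dict[str, List[str]] = {"old": [], "new": []}
    for entry in path_value.split(os.pathsep):
        if entry == OLD_PREFIX or entry.startswith(OLD_PREFIX + "/"):
            found["old"].append(entry)
        elif entry == NEW_PREFIX or entry.startswith(NEW_PREFIX + "/"):
            found["new"].append(entry)
    return found


if __name__ == "__main__":
    result = classify_path_entries(os.environ.get("PATH", ""))
    if result["old"]:
        print("PATH still references the old prefix:", result["old"])
    if not result["new"]:
        print("PATH has no entry under", NEW_PREFIX)
```

Seeing entries in both lists is a hint that Bro and Zeek installations may overlap on that host.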

Feedback 


We realize that we may have missed some places where the name change can impact existing setups. We need your help to close those gaps: if you’re running into any issues upgrading from Bro 2.6 to Zeek 3.0.0, please let us know. If it’s something that we can/should fix, please file a ticket on GitHub. If you have advice for others on how to adapt their setups, scripts, or packages, please leave a comment on this blog posting or email the Zeek mailing list. We’ll be updating this blog posting once the final 3.0.0 release comes out.


Contributors


Thanks to Mike Dopheide, Jon Siwek, and Justin Azoff for contributing to this blog posting.

Wednesday, July 31, 2019

An update on Community ID

By Christian Kreibich, Senior Engineer at Corelight


Nearly a year has passed since the introduction of the Community ID flow hashing standard, so I’d like to recap the goals of the project, share an update on what has happened since, and lay out the next steps.

The Community ID aims to simplify the correlation of flow-level logs produced by multiple network monitoring applications. Without the ID, one needs to locate the required parts of the flow tuple (typically the IP address and port of each endpoint, plus the transport protocol) in each log’s rendering, combine them, and match them up. This “join” is tedious in the best case, and in corner cases (specific ICMP message types, for example) can become fairly tricky. The ID standardizes the rendering of flow tuples into hash-like strings, reducing the correlation to a simple string comparison.

The project originated out of efforts to simplify the correlation of logs produced by two of the major modern open-source network monitors: Suricata and Zeek. The former added support in version 4.1, while Zeek users can install a package that adds support from Zeek version 2.5 onward. At last year’s SuriCon in Vancouver we presented the project in more detail. Feedback was very positive and led to a series of early adopters, including Moloch, Elastic Beats and Common Schema, HELK, and most recently MISP and VAST. Other projects (such as D4 and Sysmon) have declared their intention to add support. A major thanks to all developers involved! They not only took on the burden of implementing the standard, they did so from non-reusable implementations and a largely informational “specification” document.

We’ve recently updated the ID’s main document to become more normative, including a pseudo code implementation. At the moment, the ID is perhaps easiest to explore via our recently released communityid Python module: it installs via pip and significantly reduces the barrier to entry, particularly in data-processing / SIEM environments. It ships with a command-line tool that reports the ID for a given flow tuple, as follows:

     $ community-id tcp 10.0.0.1 192.168.0.1 1234 80

     1:K4ienR4L7rjxkkNvuZGIZwbbphY=
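
To give a feel for what goes into such an ID, here is a minimal Python sketch of the version 1 computation for TCP and UDP flows, using only the standard library. It is based on a reading of the specification rather than on the communityid module's code, so treat the spec as authoritative: the function name community_id_v1 is hypothetical, and ICMP handling and other corner cases are left out.

```python
import base64
import hashlib
import ipaddress
import struct

# IP protocol numbers for the two transports this sketch supports.
PROTO_NUMBERS = {"tcp": 6, "udp": 17}


def community_id_v1(proto: str, saddr: str, daddr: str,
                    sport: int, dport: int, seed: int = 0) -> str:
    """Compute a version-1 style ID for a TCP/UDP flow tuple."""
    src = ipaddress.ip_address(saddr).packed
    dst = ipaddress.ip_address(daddr).packed
    # Order the endpoints so both directions of a flow hash identically:
    # the smaller (address, port) pair goes first.
    if (src, sport) > (dst, dport):
        src, dst, sport, dport = dst, src, dport, sport
    data = (
        struct.pack("!H", seed)  # 16-bit seed, network byte order
        + src
        + dst
        # protocol number, one padding byte, then both ports
        + struct.pack("!BBHH", PROTO_NUMBERS[proto], 0, sport, dport)
    )
    digest = hashlib.sha1(data).digest()
    return "1:" + base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    print(community_id_v1("tcp", "10.0.0.1", "192.168.0.1", 1234, 80))
```

Because the endpoints are ordered before hashing, swapping source and destination yields the same string, which is what reduces log correlation to a plain string comparison.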

Going forward, our goals are threefold:

Gather feedback and experience reports. The ID provides version support, and the community has raised several interesting ideas for future revisions. The first version is, quite literally, the simplest approach we could think of. We’re particularly curious to hear about operational use of the ID, its proneness to hash collisions, practical concerns, or creative applications. If you have any feedback, please open tickets!

Provide as many off-the-shelf implementations of the ID as possible. We recently released the communityid Python module that installs via pip and significantly reduces the barrier to entry, particularly in data-processing / SIEM environments. Several of the existing implementations look like they will be relatively straightforward to make reusable. A C library would obviously be a great way to unify and simplify existing implementations, and enable others. If you are interested in working on these, please get in touch!

Add support to more network monitoring applications. Most immediately, we’re looking to support Wireshark, with others to follow. Whether you’re considering an implementation, are actively working on one, or have a tool that you would like to see support the ID, shout!

Please feel free to explore the ID at https://github.com/corelight/community-id-spec. We look forward to your feedback.

Tuesday, July 30, 2019

Open Source Zeek Leadership Team Meeting Minutes - 26 July 2019



The open source Zeek project Leadership Team (LT) is made up of contributors from multiple organizations throughout the community. The LT acts as both a technical steering committee and governance body. You can find out more about the LT on the team page of the website.
Below are the notes from the LT meeting held on 26 July 2019.

Zeek.org Leadership Team Members (Bold indicates attendance)
  • Keith Lehigh (Chair), Indiana University
  • Johanna Amann, International Computer Science Institute/Corelight/Lawrence Berkeley National Laboratory
  • Seth Hall, Corelight
  • Vern Paxson, Corelight & University of California at Berkeley
  • Michal Purzynski, Mozilla Foundation
  • Aashish Sharma, Lawrence Berkeley Lab
  • Adam Slagell, ESnet
  • Robin Sommer, Corelight


  • Amber Graner*, Corelight, Director of Community for the Open Source Zeek Community
  • Nicole Fischer*, Creative, Graphics Design
*not a member

Agenda

  • Zeek Logo Discussion
  • ZeekWeek Tagline and SWAG Discussion
  • Other Topics

Minutes


  • Zeek Logo Discussion
    • Narrowed down the choices, gave feedback about tweaking 3 of the designs.
    • Nicole to take the feedback and present at the next LT meeting on 9 August

  • ZeekWeek Tagline and SWAG Discussion
    • “Zeek and ye shall find” will be the ZeekWeek Tagline
    • Stickers, T-shirts, Mugs

  • Other topics
    • Consider changing deadline times to end in a timezone other than PT
    • Zeek Events Naming - ZeekHours, ZeekDays, ZeekWeek

Helpful Links and information:

Getting Involved: If you would like to be part of the Open Source Zeek Community and contribute to the success of the project, please sign up for our mailing lists, join our IRC Channel, come to our events, follow the blog and/or Twitter feed. If you’re writing scripts or plugins for Zeek we would love to hear from you! Can’t figure out what your next step should be? Just reach out. Together we can find a place for you to actively contribute and be a part of this growing community.


About Zeek (formerly Bro): Zeek is a powerful network analysis framework that is much different from the typical IDS you may know. https://www.zeek.org/

Thursday, July 25, 2019

Announcing The Zeek Package Contest - Calling All Zeek Users



Zeek Package Contest


  • Are you a Zeek user?
  • Do you enjoy writing Zeek scripts?
  • Do you like being recognized for your awesome work?
  • Do you want to make the world’s networks safer?
  • Do you like winning prizes and claiming bragging rights?
  • Do you want the opportunity to present your work at Zeek events?

If you answered, “yes” to any of the above questions, then the Zeek Package Contest sponsored by Corelight, Inc. may be just the competition for you!

This contest is intended to inspire Zeek users to demonstrate their creativity and ingenuity while winning the admiration of their peers, and giving back to the community.


What is the Zeek Package Contest?


The challenge is straightforward: Create an innovative and useful open source Zeek package that extends Zeek’s threat hunting and detection capabilities.

  • 1st place wins one free trip (hotel and airfare) to ZeekWeek 2019, $5000 cash and Zeek swag (T-shirts, stickers, etc)
  • 2nd place wins $2500 cash and Zeek swag (T-shirts, stickers, etc)
  • 3rd place wins $1000 USD cash and Zeek swag (T-shirts, stickers, etc)
  • 4th and 5th place each win a $100 gift card and Zeek swag (T-shirts, stickers, etc)

The winners may also get the opportunity to present their work at future Zeek events and/or have their contributions featured on the Zeek blog.

Submissions need to be made available through the central Zeek package repository. We will evaluate them in terms of their overall functionality & quality, utility for incident responders, customizability, test coverage, and clarity of documentation. The jury will consist of Zeek core developers and other long-time Zeek community members. More details below.


Jury Members


  • Aashish Sharma (Community)
  • Jeff Atkinson (Community)
  • Johanna Amann (Corelight)
  • Justin Azoff (Corelight)
  • Nick Turley (Community)
  • Robin Sommer (Corelight)
  • Seth Hall (Corelight)
  • Vlad Grigorescu (Community)


Important Dates


  • Submission opens: August 1, 2019
  • Submission deadline: September 1, 2019
  • Notification: September 25, 2019
  • Announcement of results: ZeekWeek 2019 (October 8-11, 2019)


Contest Results


Contest results will be posted here when the results have been announced.



Rules of Engagement 


  1. The goal is to create an innovative and useful Zeek package that's compatible with the Zeek Package Manager. The focus is on Zeek scripts, not binary plugins. A package may include a plugin to support its scripts through new built-in functions (“*.bif files”). However, the contest will not consider packages with other binary functionality, such as protocol or file analyzers, log writers, input readers, etc.
  2. To submit a package to the contest, it must first be made available through the central Zeek package repository. You can then nominate it for consideration by filling out the webform. Please include with your nomination: a link to the package’s git repository, a list of authors, a short summary describing the motivation for the work, and documentation of the package’s usage. We will acknowledge receipt, and we will evaluate the version of the package as the package manager installs it at that time.
  3. All submissions must be received no later than September 1, 2019, 11:59PM PDT. The winners will be notified on September 25, 2019.
  4. Packages already included in the Zeek package repository prior to the start of this contest, 1 August 2019, will not be eligible for this contest.
  5. Submitted packages must work with the Zeek 2.6 release. They must build and install on recent, standard Linux systems. Please specify any specific OS requirements of your package, if necessary.
  6. Submitted packages must be open source. We prefer BSD licensed submissions, but will accept any OSI-approved license. By submitting an entry, you declare that you own the copyright to the source code and all related materials, and are authorized to submit it.
  7. Submissions may leverage other packages included in the Zeek package repository as dependencies as long as the package manager can resolve them during installation. They may also link against external libraries as long as their installation is clearly documented and easy to follow.
  8. The top 5 winners of the contest will get the prizes mentioned above. We reserve the right to award fewer than 5 awards if we do not receive a sufficient number of high-quality submissions.
  9. A committee of Zeek core developers and other long-time Zeek community members, chosen by Corelight, will decide the winners based on the following criteria: overall functionality & quality, utility for incident responders, customizability, test coverage, and clarity of documentation.
  10. In order to collect the cash prizes, winners will need to provide a legal picture identification and bank account information within 30 days of notification. The bank transfer will be made within two weeks after the winner is authenticated.
  11. Group entries are allowed; the prize will be paid to a person designated by the group.
  12. You may submit more than one package for the contest, but we limit awards to one per person/group.
  13. Names/aliases of the winners will be listed on the "Zeek Package Contest" web page.
  14. Zeek team members, members of the selection committee, and Corelight employees are not eligible to participate.


The Legal Stuff


In no event will Corelight be liable to you or any party entering this contest for lost profits or any form of indirect, special, incidental, or consequential damages of any character from any causes of action of any kind with respect to this contest, whether based on breach of contract, tort (including negligence), or otherwise, and whether or not you have been advised of the possibility of such damage.


More Information


If you have any questions, please contact us at contest@zeek.org.
Find out more about Zeek at: https://www.zeek.org/
Current packages list can be found at: https://packages.zeek.org/ and https://github.com/zeek/packages

The Zeek Package Contest is inspired by and modeled after the Hex-Rays Plugin and Volatility contests.

Wednesday, July 24, 2019

Complacency is not an option - Freddy Dezeure to keynote ZeekWeek 2019

The Zeek Leadership Team is pleased to announce that Freddy Dezeure will keynote ZeekWeek 2019 which will take place in Seattle, Wash., Oct. 8-11, 2019.

Dezeure’s keynote, “Threats are changing, so are we as defenders”, will present insights into the current attack trends used by adversaries, their motives and techniques and the challenges these create for enterprises. Dezeure will highlight changes to - and increased dependency on - our infrastructure. He will also provide an overview of the innovative new methods the threat hunting community is creating as well as share practical guidance and pointers to best practices and tools.

“The changing threat landscape requires us to continuously adapt our defenses to mitigate the risk to our organizations and the society as a whole to an acceptable level,” said Dezeure of his keynote topic. “Complacency is not an option.”

The keynote will take place Wednesday, Oct. 9 at 9:30 a.m.

Registration is open! Make sure you register soon; prices will increase on Aug. 1, 2019.

About Freddy Dezeure: Freddy Dezeure graduated from the KUL in Belgium, with a master of science in engineering in 1982. He was CIO of a private company from 1982 until 1987. He joined the European Commission in 1987 where he held a variety of management positions in administrative, financial and operational areas, in particular in information technology. He set up the EU Computer Emergency and Response Team (CERT-EU) for EU institutions, agencies and bodies in 2011 and made it into one of the most mature and respected CERTs in Europe. Until May 2017 he held the position of the Head of CERT-EU. Presently, he is an independent management consultant providing strategic advice in cybersecurity and cyber-risk management and serving as a board member and/or advisory board member for several technology companies.

About ZeekWeek: ZeekWeek (formerly BroCon) is the most important community event for users, developers, incident responders, threat hunters and architects who rely on the open-source Zeek network security monitor as a critical element in their security stack. Attending ZeekWeek is your opportunity to learn from the open-source Zeek founders, experts and enthusiasts (of all levels).

Friday, July 19, 2019

Zeke on Zeek: Working With Open-Source Zeek: Adding a Key-value For-Loop

By Zach Medley

Getting started working on Zeek can be daunting because of the sheer size of the repository. While designed reasonably, Zeek is big and a lot of reasonable design can still be a lot to handle. This blog post walks through how I added Zeek’s key-value for loop in the hope that it might make it easier for future Zeek developers to get started.

Zeek, formerly Bro, is an open-source network security monitoring tool that transforms raw traffic into rich logs, extracted files, and custom insights via a Turing-complete Zeek programming language. It’s all open source, and developed on GitHub with its community.

Defining the Problem


Before the addition of a key-value for loop, you could iterate over the items in a container in Zeek with a standard range-based for loop:



However, looping over tables where there are both keys and values requires a separate lookup:



This is less than ideal for both ergonomic and efficiency reasons. At its core, when Zeek does a lookup in a table, it retrieves the corresponding value as well, which makes the second lookup unnecessary, as Zeek user Jon points out below:




As for the syntax, Zeek’s tables can be indexed by tuples. The existing for loop supported iteration over tables with tuples by wrapping the keys in brackets and unpacking the tuple.



Christian suggested that we extend this tuple unpacking for use with key-value for loops.


Writing Tests


The testing framework Zeek uses is called btest and tests written using it are commonly called “btests.” Zeek's btests live in the testing/btest/ directory. Once you get the hang of them, they are pretty straightforward, but at first glance they can be a little confusing.

A btest usually consists of a test and a baseline. Btest works by running your test and comparing its output to a known baseline. A difference between the output and the baseline results in a failed test. In addition to cloning Zeek, you’ll need to install btest separately, as follows:

To get btest we suggest installing the development version. This will give you access to a more up-to-date btest version that the master version of Zeek may depend on. After cloning Zeek, move to the directory that it’s installed in and run:

     pip install -e aux/btest/

With btest installed, we can begin to write our tests. Zeek already has tests that cover for-loops in testing/btest/language/for.bro, so modifying that file would be fine, but I chose to add a separate test file called key-value-for.bro. I wrote a couple of tests for key-value for-loops and added one for iterating over tables with more than one index value, because there wasn’t a test for that yet. My tests for the key-value for-loop look like this:




Note: It's important that your test has the # @TEST-EXEC … line at the top. If it doesn’t, btest won't know what command to use to run the test. In this case, our btest runs Zeek on the content that follows, and a subsequent diff compares the output to our baseline of expected output.
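To make the comparison step concrete, here is a minimal C++ sketch of checking a test's output against a baseline. It is not btest's implementation, and first_mismatch is a hypothetical helper invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper, not part of btest: returns the 1-based number of the
// first line where `output` and `baseline` differ, or 0 if they match.
std::size_t first_mismatch(const std::string& output, const std::string& baseline) {
    std::istringstream out(output), base(baseline);
    std::string out_line, base_line;
    std::size_t line_no = 0;
    while (true) {
        const bool has_out = static_cast<bool>(std::getline(out, out_line));
        const bool has_base = static_cast<bool>(std::getline(base, base_line));
        ++line_no;
        if (!has_out && !has_base) return 0;        // both ended: identical
        if (has_out != has_base) return line_no;    // one side has extra lines
        if (out_line != base_line) return line_no;  // content differs
    }
}
```

A result of 0 corresponds to a passing test; any other value points at the first line worth inspecting.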

With the test written, you’ll now have to add a baseline so that btest knows what the desired output should be. There is no single best way to create a baseline, as many approaches work well. Ultimately, once you find a way you like, and as long as you’re left with a working test in the end, it’s likely fine.

The easiest way to create a baseline for a simple btest is to temporarily replace the test script with an ad-hoc script that produces the same output. For the above, we might replace it with some print statements that produce the desired output. Then you can go ahead and run the test with the -U parameter, which will prompt you to make a baseline. Once that’s done, don't forget to change the script back to the one you want to test.

For more complicated tests, though, this ad-hoc method can get troublesome. Here, Christian suggests running the real test, letting it fail, then copying the “out” file it creates over to the baseline directory.

More or less in line with Christian’s suggestion, I created my btests by moving to the /btest/Baseline/ directory. There I created a new folder with the name <the btest folder your test is in>.<the name of your test file>. For example, my test file was named key-value-for.bro and lived in the btest/language folder, so I added a folder to the btest/Baseline folder called language.key-value-for. Inside your new folder, add a file called out and write the expected output of your test. My out file looks like this:



Now we can run our test and see if it fails. To run the test, first build and install Zeek by running:

     ./configure

     make

     make install


Then, change back to the ./btest directory and run:

     btest -d language/key-value-for.bro
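The baseline naming rule described earlier (test folder, a dot, then the test file name without its extension) can be sketched as a small C++ function. baseline_dir_name is a hypothetical helper, and the rule is inferred only from the language.key-value-for example in this post.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: derive the baseline folder name for a test path
// given relative to the btest directory.
std::string baseline_dir_name(const std::string& test_path) {
    std::string name = test_path;
    // Drop the file extension, looking only at the final path component.
    const std::size_t slash = name.find_last_of('/');
    const std::size_t dot = name.find_last_of('.');
    if (dot != std::string::npos && (slash == std::string::npos || dot > slash))
        name.erase(dot);
    // Folder separators become dots.
    std::replace(name.begin(), name.end(), '/', '.');
    return name;
}
```

For the test used in this post, baseline_dir_name("language/key-value-for.bro") yields language.key-value-for.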

Writing Code


Adding new language functionality in Zeek can be done in a couple of simple steps:

Modify parse.y so that the new syntax is recognized and handled properly.

Write the underlying C++ code to make it all work.

We’ll start by writing the code to parse the new for-loop.

Parsing


Zeek uses lex and yacc to generate its parser. The part that we’re concerned with can be found in src/parse.y. Specifically, we’re interested in the part that parses the for statement, underneath for_head:



I’ll walk through this code to give an overview of how it works, and then show the new parsing rules for a key-value for-loop.

TOK_FOR '(' TOK_ID TOK_IN expr ')'

This pattern describes the syntax that the code following it handles. Each of the tokens can be referred to below by its position, with TOK_FOR corresponding to the number 1 and ')' corresponding to the number 6.

set_location(@1, @6);

When a Zeek script is parsed, objects can be associated with a location in the source. For more information on why this is useful, see Bison’s documentation on locations. For a little more on how a location is represented, see src/Obj.h.

ID* loop_var = lookup_ID($3, current_module.c_str());

In this case, $3 refers to TOK_ID. Here we get loop_var’s previous definition if it already exists in the current module.





This is the meat of the parse phase. Here, if loop_var already has a definition, we make sure that it is not a global variable. Otherwise, we initialize it.

$$ = new ForStmt(loop_vars, $5);

Finally, we build a new for-statement from the loop variables and $5, which refers to the expression we’re iterating over.
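The lookup-then-check logic from this walkthrough can be sketched in isolation. The C++ below is a simplified illustration, not Zeek's code: Id, Scope and ResolveLoopVar are hypothetical stand-ins for Zeek's identifier and scope machinery, and the error text is only an assumption.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical, simplified identifier record; Zeek's real ID class is richer.
struct Id {
    std::string name;
    bool is_global;
};

// Hypothetical scope: a name-to-identifier map for one module.
class Scope {
public:
    // Returns the existing identifier, or nullptr if the name is unknown.
    Id* Lookup(const std::string& name) {
        auto it = ids_.find(name);
        return it == ids_.end() ? nullptr : &it->second;
    }

    // Registers a new identifier and returns it.
    Id* Install(const std::string& name, bool is_global) {
        auto result = ids_.emplace(name, Id{name, is_global});
        return &result.first->second;
    }

private:
    std::unordered_map<std::string, Id> ids_;
};

// Mirrors the rule's logic: reuse an existing local, reject a global,
// otherwise install a fresh local loop variable.
Id* ResolveLoopVar(Scope& scope, const std::string& name) {
    if (Id* existing = scope.Lookup(name)) {
        if (existing->is_global)
            throw std::runtime_error("global variable used in for loop");
        return existing;
    }
    return scope.Install(name, /*is_global=*/false);
}
```

Doing this while parsing the loop head means the loop variable is defined before the body is parsed.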

My implementation follows the basic for-loop’s parsing procedure very closely and calls an alternate version of the constructor that I’ll discuss next.




Core Functionality


In order to preserve as much of the original for-loop’s functionality as possible, I opted to write an alternate constructor for the for-loop that included a variable for values to be stored in as the loop moves through the table. The constructor first calls the regular for-loop constructor on the loop variables and expression, and then runs some additional code to verify the type of the value variable.
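The shape of that alternate constructor can be illustrated with a C++ delegating constructor. This is a sketch under assumptions rather than Zeek's ForStmt: Var, TableExpr and ForLoop are hypothetical types, and comparing type names as strings is a placeholder for the real type check.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified types; none of these are Zeek's real classes.
struct Var {
    std::string name;
    std::string type;  // e.g. "count"; empty means "not declared yet"
};

struct TableExpr {
    std::string name;
    std::string yield_type;  // type of the values stored in the table
};

class ForLoop {
public:
    // Original form: key variables plus the expression being iterated.
    ForLoop(std::vector<Var> keys, TableExpr table)
        : keys_(std::move(keys)), table_(std::move(table)) {}

    // Key-value form: delegate to the original constructor first,
    // then run the extra check on the value variable.
    ForLoop(std::vector<Var> keys, TableExpr table, Var value)
        : ForLoop(std::move(keys), std::move(table)) {
        if (value.type.empty())
            value.type = table_.yield_type;  // adopt the table's value type
        else if (value.type != table_.yield_type)
            throw std::invalid_argument("value variable type does not match table");
        value_ = std::move(value);
        has_value_ = true;
    }

    bool HasValueVar() const { return has_value_; }
    const Var& ValueVar() const { return value_; }

private:
    std::vector<Var> keys_;
    TableExpr table_;
    Var value_;
    bool has_value_ = false;
};
```

Delegating keeps all of the existing validation in one place, so the key-value form only adds what is new.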

The most interesting part of the for-loop is the actual looping. This is done in the DoExec part of the for-loop in src/Stmt.cc.




We’re only interested in the part of the for-loop that deals with looping over tables, because tables are the only data type supported by key-value for-loops. This code is mostly self-explanatory, with the exception of the usage of Ref() and Unref().

Zeek uses reference counting under the hood to clean up objects once they are no longer in use. If you’re familiar with modern C++, the idea is the same one behind shared_ptr. Each object keeps track of how many references it has; if that number drops to zero, Zeek cleans the object up. Whenever we set an element in a frame, we need to call Ref() on it. This increases the object’s reference count, indicating that the frame still needs the value until some later point when Unref() is called on it.

Reference counting in Zeek can be difficult to get the hang of, and mistakes lead to bugs that are hard to track down. Take care when using a value after passing it elsewhere: if you get a segfault, a value that was released too early is often the cause. Debuggers like gdb and tools like valgrind can help track down what got deleted.
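A minimal sketch of this ownership pattern in C++ follows. It is not Zeek's implementation: Counted and Slot are hypothetical classes, Slot plays the role of a one-element frame, and live_objects exists only so the behavior can be checked.

```cpp
#include <cassert>

static int live_objects = 0;  // test aid: number of Counted objects alive

// Hypothetical reference-counted base class.
class Counted {
public:
    Counted() { ++live_objects; }
    virtual ~Counted() { --live_objects; }
    int RefCount() const { return ref_cnt_; }

private:
    friend void Ref(Counted*);
    friend void Unref(Counted*);
    int ref_cnt_ = 1;  // the creator holds the first reference
};

void Ref(Counted* o) { ++o->ref_cnt_; }

// Drops one reference and deletes the object once nobody holds it.
void Unref(Counted* o) {
    if (o && --o->ref_cnt_ <= 0)
        delete o;
}

// Hypothetical one-element "frame": it takes over one reference per stored
// value and releases it when the value is replaced or the slot goes away.
class Slot {
public:
    ~Slot() { Unref(value_); }
    void Set(Counted* v) {
        Unref(value_);
        value_ = v;
    }

private:
    Counted* value_ = nullptr;
};
```

In this sketch, a caller that wants to keep using a value after storing it calls Ref() before Set(), so that the slot and the caller each hold their own reference.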

Conclusion


The addition of key-value for loops to Zeek makes the process of iterating over a table simpler and more performant:



When possible, key-value for loops should be preferred to regular loops over tables.

If you’re interested in contributing to Zeek there is no bar to entry. For C and C++ people, the Zeek core is a great place to get your feet wet developing a scripting language. You can also get involved just writing Zeek. Much of Zeek is written in Zeek. Even if you don’t program much, I wrote the README so I’m sure it's got a couple spelling and grammar errors.
No matter how you do it, working on Zeek can be an incredibly rewarding experience. It's fun, challenging, educational, and keeps the world’s networks safe.


Helpful Links and information:

Getting Involved: If you would like to be part of the Open Source Zeek Community and contribute to the success of the project, please sign up for our mailing lists, join our IRC Channel, come to our events, and follow the blog and/or Twitter feed. If you’re writing scripts or plugins for Zeek, we would love to hear from you! If you can’t figure out what your next step should be, just reach out. Together we can find a place for you to actively contribute and be a part of this growing community.

About Zeek (formerly Bro): Zeek is a powerful network analysis framework that is much different from the typical IDS you may know. https://www.zeek.org/