Monday, December 23, 2019

How To Add A JPEG File Analyzer To Zeek - Part 4

by Keith J. Jones, Ph.D

Introduction

The last three blog posts demonstrated how to add a JPEG file analysis plugin into the core Zeek source code or as a package.  This part will demonstrate how you can add tests to your code when distributed as part of the Zeek source code (blog posts 1 and 2) or as a package (blog post 3).  First, this post will introduce Btest, the testing mechanism used by Zeek.  Next, this post will walk you through the mechanisms you will use to test either the Zeek core source code or packages you write for Zeek.  Although this post is specific to our JPEG example from the prior three posts, it is important to note that the testing concepts presented in this post can be used for other source code development within Zeek.  Therefore, this post should be relevant no matter what type of source code development you may be performing on Zeek.  Read on to learn more.

How Does Btest Work?

You can think of btest (https://github.com/zeek/btest) as a generic driver to automate testing using whatever language is appropriate for the task.  While that explanation sounds vague, it is actually a very powerful concept.  Btest can be used to test Zeek with pretty much any language that is appropriate; for some tests you can use shell scripting, and for other tests you can use Zeek scripts.  Such a powerful testing framework would seem complex, and it is, but using the framework is nearly trivial once you learn the basics of how it works.
Btest operates by reading any file in the current directory and executing the btest commands within them that are prefixed by “@TEST-EXEC” tokens.  Therefore, by placing the btest commands in comments for whatever language we want to use to test our code, we are able to test Zeek using pretty much any Unix command line tool we like.  The documentation explains this process better than I could reproduce it here, so at least skim the concepts introduced here:
Stop here and read the documentation at the link above before you continue further on this post.  No, really; it’s important.  I will wait.  I will now assume you know the basic concepts of btest before I describe how we are going to test the JPEG plugin we have been working on.
For this blog post, we will demonstrate some tests that will use shell and Zeek scripts to ensure our JPEG plugin is performing as expected.  Our inputs to btest will be the shell and Zeek scripts, and our tests will use the PCAP files we used to test our code in the prior blog posts.  Our commands used to run the tests will be defined in the comments of the shell and Zeek scripts.  This will make more sense as we look at a concrete example applying the concepts in the Zeek testing documentation above.
Note that some additional documentation for Zeek testing can be found at https://www.zeek.org/development/projects/testing.html.
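To make the idea of commands hidden in comments more concrete, here is a minimal shell sketch.  It is not how btest is implemented; it only shows that the “@TEST-EXEC” lines are ordinary comments that a driver can pick out of a test file.  The file name “example.zeek” and its content are made up for illustration.

```shell
#!/bin/sh
# Conceptual sketch only: this is NOT btest's implementation.
# It shows that test commands live in ordinary comments of a test file.

workdir=$(mktemp -d)

# A stand-in test file (hypothetical name and content).
cat > "$workdir/example.zeek" <<'EOF'
# @TEST-EXEC: zeek %INPUT >output
# @TEST-EXEC: btest-diff output
print "hello";
EOF

# Keep only the command that follows each "@TEST-EXEC:" token.
directives=$(sed -n 's/^.*@TEST-EXEC: *//p' "$workdir/example.zeek")
printf '%s\n' "$directives"

rm -rf "$workdir"
```

The Zeek interpreter never sees these commands as code because they sit behind the “#” comment character; only the test driver cares about them.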

Writing Tests For The JPEG Analyzer Plugin

If you read parts 1-3 of this blog series, you will remember that we coded the JPEG analyzer twice.  The first time, we coded it directly into the Zeek core source code tree.  Then, we took the same logic and made it into an installable package.  We will discuss testing for both methods in this post, but we will start with the package deployment method first since it is the easier of the two methods to distribute your logic.  Then, I will show how to add some tests to the Zeek core source code method for completeness.

Package Deployment

When we ran “init-plugin” in the last post, it automatically created a test for us in the “tests” directory:
The first file we will want to discuss is the “btest.cfg” configuration file:
Line 2 says that the test directories include “jpeg”.  This is the directory we will add our new test to.  Line 15 shows where we can store pcap files.  The symbol “TRACES” is expandable within btest tests, so we do not have to work out the parent path ourselves.  We will use the trace we downloaded earlier in this series from the Wireshark project to create some tests, and this is the path we will place it in.
After running “init-plugin”, from the last blog post, one test is added to our package:
This file is a zeek script.  Inside the comments, “@TEST-EXEC:” says to run zeek in the following manner:
zeek -NN FileAnalyzers::JPEG |sed -e 's/version.*)/version)/g' >output
This line runs Zeek in a mode where the plugins are searched for FileAnalyzers::JPEG, then the “sed” command strips the version number from the result so that the test does not break when the plugin version changes, and it is all output to a file called “output”.  Then, the second line says to run “btest-diff” on the “output” file generated previously.  As you recall from the btest documentation, if the output file were to change between test runs, the test would fail.  It is the output that defines the tests, and deviations from the output signal a failed test.
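If the “sed” expression is hard to read, the short sketch below applies it to a single sample line.  The sample is the line that “zeek -N” prints for this plugin in Part 3; I am assuming the first line of the “-NN” output has the same shape.

```shell
#!/bin/sh
# Sample plugin line (taken from the "zeek -N" listing in Part 3).
line='FileAnalyzers::JPEG - JPEG File Analyzer (dynamic, version 0.1.0)'

# Same sed expression as in the generated test: drop the version number
# so the recorded baseline stays valid across plugin versions.
normalized=$(printf '%s\n' "$line" | sed -e 's/version.*)/version)/g')
printf '%s\n' "$normalized"
```

Everything between the word “version” and the closing parenthesis is removed, so bumping the plugin version later does not invalidate the baseline.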
In order to detect changes between tests, baselines must be created.  Baselines can be created with the following btest command from within the “tests” directory:
$ btest -U
After that, any subsequent runs of “btest” will detect changes from this baseline run.
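The baseline idea can be sketched with nothing more than “cp” and “cmp”.  This is a simplification under my own assumptions (the directory name “Baseline” and the sample output are stand-ins), and btest itself does considerably more, but the pass/fail logic is the same in spirit.

```shell
#!/bin/sh
# Conceptual sketch of baseline testing; btest itself does more than this.
workdir=$(mktemp -d)
mkdir "$workdir/Baseline"

# "Update" run: record the current output as the baseline.
printf 'marker 1\nmarker 2\n' > "$workdir/output"
cp "$workdir/output" "$workdir/Baseline/output"

# Later run with identical output: the comparison succeeds.
if cmp -s "$workdir/output" "$workdir/Baseline/output"; then
    first=ok
else
    first=failed
fi

# Later run where the output changed: the comparison fails.
printf 'marker 1\nmarker 3\n' > "$workdir/output"
if cmp -s "$workdir/output" "$workdir/Baseline/output"; then
    second=ok
else
    second=failed
fi

echo "unchanged run: $first"
echo "changed run: $second"

rm -rf "$workdir"
```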
Now, let us add a simple test to our plugin.  Create a file called “jpeg.zeek” in the “tests/jpeg” directory with the following content:
# @TEST-EXEC: zeek -r $TRACES/http_with_jpegs.pcap %INPUT >jpeg.out
# @TEST-EXEC: btest-diff jpeg.out
# @TEST-EXEC: btest-diff jpeg.log
event file_jpeg_marker(f: fa_file, m: FileAnalyzers::JPEGMarker)
    {
    print m;
    }
The first line above runs Zeek on a trace file in the “$TRACES” directory that we discussed previously in the btest.cfg file.  This command uses the current file (%INPUT) as the input script.  The next two test lines detect changes in jpeg.out and jpeg.log from the baseline.  Lastly, the event handler defined here outputs some of the marker information detected from the trace file.  Simply printing the marker data adds it to the test because we are looking for differences in the output of the Zeek command.
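As a rough illustration of how the two placeholders in the first test line resolve, the sketch below substitutes them by hand.  The directory “/path/to/tests/Traces” is a hypothetical value, and btest performs this resolution internally in its own way; the point is only to show what the final Zeek command looks like.

```shell
#!/bin/sh
# Hypothetical values; btest supplies the real ones at run time.
traces_dir=/path/to/tests/Traces
input=jpeg.zeek

# The command from the first @TEST-EXEC line, kept as literal text.
directive='zeek -r $TRACES/http_with_jpegs.pcap %INPUT >jpeg.out'

# Replace %INPUT with the test file and $TRACES with the trace directory.
expanded=$(printf '%s\n' "$directive" \
    | sed -e 's|%INPUT|'"$input"'|g' -e 's|\$TRACES|'"$traces_dir"'|g')
printf '%s\n' "$expanded"
```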
We will need a PCAP file, so place the following trace file in the “Traces” directory and call it “http_with_jpegs.pcap” once you have decompressed it:
Now, record the baseline with the following command:
$ btest -U
From this point forward, running “btest” again will detect changes and fail the test if the output is different.  The changes between the last blog post and this post to add these tests can be found at the following link:

Core Zeek Source Code Deployment

Adding the same test to the core Zeek source code deployment method is pretty much the same as the package version once you are able to locate the key areas to add the tests.  The new code we are adding is available at:
The “jpeg.zeek” script from above will be located at the path “testing/btest/scripts/base/frameworks/file-analysis/http/jpeg.zeek” based upon the matching directory structures between the code we added and this path.  Note that there are a few changes between this version of “jpeg.zeek” and the packaged version.  The changes are mainly associated with record pathing and the path to our pcap file.
The trace from above needs to be placed at “testing/btest/Traces/http/http_with_jpegs.pcap”.  Now, enter the “testing/btest” directory and update our new baseline for this one test only:
$ btest -U scripts/base/frameworks/file-analysis/http/jpeg.zeek
Next, you can test our new test with the following command:
$ btest scripts/base/frameworks/file-analysis/http/jpeg.zeek
If there has been a change to the baseline, btest will notify you.  If you decide to test all of Zeek by simply executing “btest”, be warned that there are many tests and not all of them will pass.  You will need to find your plugin’s test among them, so for development purposes you will likely want to run single tests to save time.

Conclusion

This blog post demonstrated how to add testing to our JPEG file analyzers.  We tested the analyzer as part of the core Zeek source code, in case you choose to deploy it that way, and we also tested the package we created in the last blog post to cover that deployment method.  With testing added, your new custom JPEG file analyzer is ready to be consumed by the open source community.  You can either apply to add your source code to the core Zeek source code distribution on GitHub, or you can add your package to https://packages.zeek.org/ so that other users can benefit from your work.

-------------

About Keith J. Jones, Ph.D

Dr. Jones is an internationally industry-recognized expert with over two decades of experience in cyber security, incident response, and computer forensics. His expertise includes software development, innovative prototyping, information security consulting, application security, malware analysis & reverse engineering, software analysis/design and image/video/audio analysis.

Dr. Jones holds undergraduate degrees in Electrical Engineering and Computer Engineering from Michigan State University. He also earned a Master of Science degree in Electrical Engineering from MSU. Dr. Jones recently completed his Ph.D. in Cyber Operations from Dakota State University in 2019.

Friday, December 20, 2019

How To Add A JPEG File Analyzer To Zeek - Part 3

by Keith J. Jones, Ph.D
Introduction

The last two blog posts (Part 1 and Part 2) demonstrated how to add a JPEG file analysis plugin. This part will show you how to take our working JPEG source code and make it a Zeek package.  A Zeek package can be installed dynamically instead of requiring compilation directly into the Zeek source code tree like we did in the last post.  At this point, it might be a good idea to read the following blog post if you are unfamiliar with packages:
You will find that one of the benefits of using Zeek packages is that your development time will be shortened considerably when compared to the method we used in the past two posts.  Basically, the Zeek source code will only need to be compiled and installed once while our package can be repeatedly compiled in a minute or two.  For that benefit alone, it is worth understanding how you can quickly create new file analyzers for Zeek through packages.
The source code for this post is available at:
I wish I could say that there are easy methods for determining the changes to the source code from our last blog post to make it load as a package, but there are no shortcuts here.  Most of the changes discussed in this blog post were determined based upon several days of trial and error to match the logic we wrote previously to the requirements for a package produced by the “init-plugin” script.  I will try to describe the reasons for the changes the best that I can, but as convoluted as it might sound, remember that you have all of the source code through the links above.  This means you can experiment as I did to understand the concepts I try to present here if my explanations are unclear.

What Are Zeek Plugins and Packages?

Zeek plugins are extensions to the Zeek C++ source code.  Plugins can be distributed in the Zeek core repository or they can be distributed separately as packages.  Compiled plugins eventually end up as dynamically linked libraries that can be loaded into Zeek at run time, along with any required Zeek scripts.  This allows for a pluggable model for execution such that you do not have to provide the whole Zeek source code tree in addition to your plugin logic to your end users.  If you were to try to distribute your plugin source code with the rest of Zeek, say because it didn’t fit in the core Zeek source code, this distribution mechanism would quickly become cumbersome for you and your users.  Packages containing plugins make distribution of your custom logic much easier.
Another benefit of developing packages rather than working in the core Zeek source code is that your development time will be shortened dramatically.  The Zeek source code will only need to be compiled and installed once.  Our package can be recompiled in a minute or two for each little change we might like to try and then loaded into our working copy of Zeek.
For development velocity alone, it is worth understanding how you can quickly create new file analyzers for Zeek through packages.  However, there are other benefits.  With packages, you are able to keep portions of your custom logic private and only share it with users that might want it without worrying about them having to compile and install a brand new version of Zeek.  Therefore, this post will assume that you want to translate the JPEG code we developed in the last two posts into a package we can install into Zeek.  
At this point, compile and install the stock version of Zeek with debugging enabled (this was discussed in part 1: https://blog.zeek.org/2019/12/how-to-add-jpeg-file-analyzer-to-zeek.html).  We want to overwrite the custom version we compiled in our last blog post because it may conflict with the code in this post.  We don’t need our custom version of Zeek anymore when we have a plugin!
For your future reference, the basic Zeek package documentation is available at the following links:

Creating A JPEG File Analyzer Zeek Package

The Zeek source code comes with a shell script called “init-plugin” that creates the basic structure of a loadable Zeek plugin.  It is available at https://github.com/zeek/zeek-aux/blob/master/plugin-support/init-plugin.  You can create the plugin skeleton in the directory “~/Source/ZeekFileAnalyzers” with the Zeek namespace “FileAnalyzers” and the plugin name “JPEG” with the following command:
$ ./aux/zeek-aux/plugin-support/init-plugin ~/Source/ZeekFileAnalyzers FileAnalyzers JPEG
Executing the script creates the skeleton that can be found in the “initial” branch of the GitHub repository for this plugin.  You can either run the script above or check out the “initial” branch:
The source code can now be compiled with the following commands:
$ ./configure --enable-debug --zeek-dist=/your/directory/with/zeek/source/code
$ make
I recommend you navigate through all of the files in this skeleton and become familiar with the information in the comments.  You will find this information helpful as I walk you through the portions of the skeleton that must be modified to support our JPEG file analyzer.  The subsections below will walk you through the remaining steps and the reasoning behind the changes to the skeleton source code.
The changes to the output of the “init-plugin” command to add our JPEG logic are explained in a thirteen-step process below.  The short section for each step explains the minimal changes required to get our logic working and compiling as a package.

Step 1 - Copy Over Our JPEG Logic

First, we know we are going to need the following custom files we wrote previously, so copy them to the “src” directory in the skeleton.  We know we need these files because there are no equivalent files in the skeleton for us to modify, so just copy them as they are from our last blog post:
  1. JPEG.h
  2. JPEG.cc
  3. jpeg.pac
  4. jpeg-file.pac
  5. jpeg-file-headers.pac
  6. jpeg-analyzer.pac
  7. events.bif
The next several steps will fix up some of these and other files in the repository we are creating so that they are able to be compiled and installed as a Zeek package.
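The copy in Step 1 can be scripted in a few lines of shell.  This is only a sketch: both directories below are temporary stand-ins so that it runs anywhere, and the placeholder files are empty.  In practice you would point “old_src” at wherever your JPEG files from the last post live and “plugin_src” at the “src” directory of the skeleton.

```shell
#!/bin/sh
# Stand-in directories (hypothetical); substitute your real paths.
old_src=$(mktemp -d)      # where the JPEG files from the last post live
plugin_src=$(mktemp -d)   # the "src" directory of the plugin skeleton

# The seven files listed in Step 1.
files="JPEG.h JPEG.cc jpeg.pac jpeg-file.pac jpeg-file-headers.pac jpeg-analyzer.pac events.bif"

# Create empty placeholders so the sketch is self-contained.
for f in $files; do
    : > "$old_src/$f"
done

# Copy each file into the skeleton unchanged.
for f in $files; do
    cp "$old_src/$f" "$plugin_src/$f"
done

copied=$(ls "$plugin_src" | wc -l | tr -d ' ')
echo "copied $copied files"

rm -rf "$old_src" "$plugin_src"
```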

Step 2 - Fix Up JPEG.h

Most of JPEG.h is correct, but we need to include two new files that will exist in our package.  In the include section, add the following lines:
#include "events.bif.h"
#include "types.bif.h"
These lines can be viewed at the following link:  https://github.com/corelight/zeek-jpeg/blob/master/src/JPEG.h#L8

Step 3 - Fix Up events.bif

We changed the namespace when we made this a package; therefore, we have to fix up a few places where we use records.  Make sure the last line of events.bif looks like this:
This change makes the “file_jpeg_marker” event accept the new type “FileAnalyzers::JPEGMarker” we will be defining next.

Step 4 - Add types.bif

We need to add the type we just used in Step 3.  We will be defining this type in a new file called “types.bif”.  You will remember this type from our last blog post.  Your new types.bif file should look like this:

Step 5 - Configure zkg.meta

This file contains the metadata for the package and was created by “init-plugin”.  zkg.meta tells the various Zeek tools how to install this package.  This file should have been created mostly correct, but the configure command for our package needs the Zeek installation directory.  Change your “build_command” to match the one in this file:

Step 6 - Fix Up jpeg.pac

You will need to link jpeg.pac to events inside the package with the include lines 4-6 here:
During this process, I also renamed the file from “JPEG.pac” to “jpeg.pac” for OCD consistency. ;-)

Step 7 - Fix Up jpeg-file.pac

You can delete the line containing “%include jpeg-file-types.pac” since we no longer have that file in our package (we no longer need what was originally defined in it).

Step 8 - Fix Up jpeg-analyzer.pac

We now have to use the new type we defined, since we are not in the Zeek core source code.  The new type we defined must be included with the following line:
The new type must be used by modifying the following line to match as follows:
Note that the modification is the record type “BifType::Record::FileAnalyzers::JPEGMarker” instead of the original value of “BifType::Record::JPEG::JPEGMarker” since we have a new namespace with this package.

Step 9 - Fix Up CMakeLists.txt

We must add the new binpac, cc, and bif files we placed in this project to the makefiles.  This is done with lines 9-17 in the following file:
As you may recall, this file identifies which files will be made (or compiled) into the package.  Therefore, we are identifying all of the new files here.

Step 10 - Make The Module Attach The Analyzer

To tell Zeek to attach the new analyzer to the file analyzer pipeline for JPEG files, the following line must be added to the load script within the package:
This says that when the module is loaded, the “main.zeek” script will be executed.  Recall from our last blog post that “main.zeek” has all of the logic for processing JPEG files.

Step 11 - Fix Up main.zeek

Only one small change is needed from our original main.zeek from the last blog post.  Copy over the previous main.zeek, but change the following line to point to our new namespace:
Specifically, we are changing the argument from “m: JPEG::JPEGMarker” to “m: FileAnalyzers::JPEGMarker”.
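The namespace change here, like the record type change in Step 8, is a plain text substitution, so it can be done with “sed” if you prefer.  The sketch below works on a temporary stand-in file containing one line in the style of the event handler signature; it is not the real main.zeek.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Stand-in for the line that still uses the old namespace.
printf '%s\n' 'event file_jpeg_marker(f: fa_file, m: JPEG::JPEGMarker)' \
    > "$workdir/main.zeek"

# Rewrite the old namespace; write to a new file and move it into place.
sed 's/JPEG::JPEGMarker/FileAnalyzers::JPEGMarker/g' "$workdir/main.zeek" \
    > "$workdir/main.zeek.new"
mv "$workdir/main.zeek.new" "$workdir/main.zeek"

renamed=$(cat "$workdir/main.zeek")
printf '%s\n' "$renamed"

# Count lines that still mention the old namespace.
leftovers=$(grep -c 'JPEG::JPEGMarker' "$workdir/main.zeek")
echo "lines still using the old namespace: $leftovers"

rm -rf "$workdir"
```

The same expression also turns “BifType::Record::JPEG::JPEGMarker” into “BifType::Record::FileAnalyzers::JPEGMarker”, and running it a second time changes nothing because the new name no longer contains the old pattern.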

Step 12 - Add types.zeek

Instead of having our types ready through the “init-bare.zeek” script, as we showed in the last post, we are going to add it to a file named “types.zeek” that identifies this record.  Create the content in the following file:

Step 13 - Fix Up Plugin.cc

Add “JPEG.h” to the include section of this file:
Add the new component we are creating to Zeek:
Then, add a “config.description” to look like the following:

Build And Install The Package

With the source code completed, we can build and install the package, which will install the JPEG plugin, with the following commands:
$ ./configure --enable-debug --zeek-dist=/your/directory/with/zeek/source/code
$ make
$ sudo make install
Recall from our last post that you can view the installed plugins with the following command:
$ zeek -N
Zeek::ARP - ARP Parsing (built-in)
Zeek::AsciiReader - ASCII input reader (built-in)
...
Zeek::XMPP - XMPP analyzer (StartTLS only) (built-in)
Zeek::ZIP - Generic ZIP support analyzer (built-in)
FileAnalyzers::JPEG - JPEG File Analyzer (dynamic, version 0.1.0)
As you can see from the last line above, the FileAnalyzers::JPEG plugin has been loaded dynamically.  We can now use it in our Zeek scripts.
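If you want to check for the plugin from a script rather than by eye, a “grep” over the plugin list is enough.  The sketch below uses an abbreviated copy of the listing above as stand-in input so that it runs without Zeek installed.

```shell
#!/bin/sh
# Stand-in for the "zeek -N" output, abbreviated from the listing above.
plugin_list='Zeek::ARP - ARP Parsing (built-in)
Zeek::ZIP - Generic ZIP support analyzer (built-in)
FileAnalyzers::JPEG - JPEG File Analyzer (dynamic, version 0.1.0)'

# Look for our plugin and confirm it is reported as dynamically loaded.
if printf '%s\n' "$plugin_list" | grep -q '^FileAnalyzers::JPEG .*(dynamic'; then
    status=loaded
else
    status=missing
fi
echo "FileAnalyzers::JPEG: $status"
```

On a real installation you would pipe the output of “zeek -N” into the same “grep” instead of using the stand-in variable.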

Using The Plugin

Create a file called “jpeg.zeek” with the following content:
@load FileAnalyzers/JPEG
event file_jpeg_marker(f: fa_file, m: FileAnalyzers::JPEGMarker)
    {
    print m;
    }
You can execute Zeek with our JPEG analyzer on the PCAP we used from the last blog post, like so:
$ zeek -r pcaps/http_with_jpegs.cap jpeg.zeek
The output is too lengthy to reproduce here, but if you are following along you will see output for every JPEG marker encountered, just as we saw in the last blog post.  Note that because we load “main.zeek” from the “__load__.zeek” script, we only have to load the module with “@load FileAnalyzers/JPEG” to get things started from our custom scripts.  We know the “file_jpeg_marker” event is fired because we see the output from our script.
At this point, we have replicated our JPEG functionality as an installable package.  The various “zkg” commands will work on this directory, or if you point it at the GitHub URL, such as:
$ zkg autoconfig

Create The Test Baseline

This is going to be a topic of a future blog post, but for now go into the “tests” directory and run the following command to generate the baselines needed for the basic package test:
$ btest -U

Source Code Differences

You can see the default source code from the stub provided by Zeek’s “init-plugin” command at the following link:
The changes between the output from the “init-plugin” script and the code presented in this blog post can be viewed at:

Conclusion

This blog post walked you through the steps required to make your Zeek JPEG file analyzer into a dynamically loadable package.  A considerable portion of the logic could be copied “as-is”, but there were several modifications needed so that it could be compiled and installed as a package.  This blog post discussed those modifications before building, installing, and using the JPEG analyzer.  Lastly, this blog post created the baseline required for “btest”, the testing system Zeek uses.  The next blog post will discuss how tests work in Zeek’s world, and we will add some additional tests to our JPEG analyzer.
