This is a guest blog post from Martin Holste. He's been a great participant in our community and is the lead developer of the log search utility ELSA. We asked him to do a guest blog post because we think ELSA is so important for giving security analysts better visibility into their Bro logs.

One of Bro's greatest strengths is the massive amount of incredibly detailed information it produces, describing exactly what's taking place on your network. It does all of this by default, with no extra configuration or tuning required. On top of that, it provides a framework for creating advanced IDS signatures. This is an amazing thing, but the benefit is only as good as the extent to which the security or IT staff is able to make use of the data. Here is an example line of output from Bro:

1322829241.041505 drj3tWq4mu8 10.236.41.95 63714 198.78.209.254 80 HTTP::MD5 10.236.41.95 c28ec592ac13e009feeea6de6b71f130 http://au.download.windowsupdate.com/msdownload/update/software/secu/2011/01/msipatchregfix-amd64_fdc2d81714535111f2c69c70b39ed1b7cd2c6266.exe c28ec592ac13e009feeea6de6b71f130 10.236.41.95 198.78.209.254 80 - worker-0 Notice::ACTION_LOG 6 3600.000000 - - - - - - - - -

There are many currently available methods for making sense of this output. Most of them are variations on using text utilities to search the log data and format it into the requested output. The problem is that for large installations, scalability quickly becomes an issue. To start with, combining logs from multiple servers is non-trivial if no single location has enough disk space to store all of the logs. Even if you can get all of the logs in one location, grepping through the hundreds of gigabytes per day per sensor that Bro can produce in large environments is prohibitively inefficient.

How much does Bro log?

A large network with tens of thousands of users will generate a few thousand HTTP requests per second during the day.
Bro will create many logs describing this activity. Per request:
1 HTTP connect log
1 DNS log (when a lookup is necessary)
1 Notice log (if an executable is downloaded)
2 Connection logs (TCP for HTTP, UDP for DNS)
1 Software inventory log (if this client hasn't been seen before)
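
The entries above are a worst case, since three of them are conditional. As a minimal sketch, the tally and the volume it implies can be worked out in a few lines of Python. The function and variable names are illustrative only (they are not part of Bro or ELSA), and the 2,000 requests per second and 300-byte average log size are example figures for a large network, assumed constant over the day.

```python
# Back-of-the-envelope log volume estimate.
# All names are illustrative; none of this is part of Bro or ELSA.

SECONDS_PER_DAY = 24 * 60 * 60

# Worst-case logs written for one HTTP request, per the list above.
LOGS_PER_REQUEST = {
    "http": 1,
    "dns": 1,       # only when a lookup is necessary
    "notice": 1,    # only if an executable is downloaded
    "conn": 2,      # TCP for HTTP, UDP for DNS
    "software": 1,  # only if the client has not been seen before
}


def estimate_volume(requests_per_sec, avg_log_bytes=300):
    """Return (logs/sec, logs/day, bytes/sec, bytes/day) at a constant rate."""
    logs_per_sec = requests_per_sec * sum(LOGS_PER_REQUEST.values())
    logs_per_day = logs_per_sec * SECONDS_PER_DAY
    bytes_per_sec = logs_per_sec * avg_log_bytes
    bytes_per_day = bytes_per_sec * SECONDS_PER_DAY
    return logs_per_sec, logs_per_day, bytes_per_sec, bytes_per_day


if __name__ == "__main__":
    lps, lpd, bps, bpd = estimate_volume(2000)
    print(f"{lps:,} logs/sec, {lpd:,} logs/day")
    print(f"{bps / 1e6:.1f} MB/sec, {bpd / 1e9:.0f} GB/day")
    # prints:
    # 12,000 logs/sec, 1,036,800,000 logs/day
    # 3.6 MB/sec, 311 GB/day
```
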
That's a total of six logs for just one HTTP request. If the network is seeing 2,000 requests per second, that's 12,000 logs per second (about one billion per day). The logs average about 300 bytes, which means this is about 3.6 MB/sec of logs. That's about 311 gigabytes of logs per day (if the rate were constant). Text utility speeds vary greatly, but searching even a few gigabytes of data will take many seconds or minutes. Searching 311 gigabytes will take hours.

To put this in perspective, if we assume that a single log entry is represented by a stalk of hay, and a stalk of hay is 50 grams, and a hay bale contains 1,000 stalks for 50 kg, then one billion logs would take 1,000,000 bales. If a bale is one meter long and half a meter wide, that would be 500,000 square meters (half a square kilometer) of hay to search through, per day. That's a haystack of about 15 square kilometers per month (more than four times the size of New York's Central Park) to search through for a given log.

Constant Time

Enter ELSA: the open-source project for Enterprise Log Search and Archive. ELSA (http://enterprise-log-search-and-archive.googlecode.com) is capable of receiving, parsing, indexing, and storing logs at obscene rates. It provides an easy-to-use full-text web search interface for getting that data into the hands of analysts and customers. In addition to basic search, ELSA provides ways to report on arbitrary fields such as time, hostname, and URL, email alerts for log searches, and a mechanism for storing and sharing search results.

Read more: Bro blog