Let’s go a bit deeper and discuss some best practices for centralized logging and what other log files you can put in your security information and event management (SIEM) server. Before I do, picture this scenario:
It’s 11:00 p.m. Saturday night over the Labor Day weekend. Your helpdesk reported that the network is slow in New York City. That is very odd: no one is working Saturday in New York, Chicago, Los Angeles or in any of your offices.
What is going on?
You haven’t yet implemented centralized logging or a SIEM tool, so you call the operations team (Ops) and alert them that something is going on in New York. You wait for them to get back to you. Thirty minutes pass; then you get a text back:
Ops: Yes, there is a problem in New York. It is in the big video conference room on the 45th floor. Someone or something is flooding the network with traffic. The entire network in New York is crawling at a snail’s pace.
- Maybe a criminal is launching a ransomware attack.
- Maybe there is a denial-of-service (DoS) attack.
- Maybe the customer you hosted on Friday afternoon put a Raspberry Pi on the network in the conference room and has attacked the network.
What is going on?!
For you or your team to be able to answer this question quickly, you need to know what is happening on your network. You need centralized logging.
As you read this post, you might be thinking, “I can’t afford this, John!” and you’re probably right. Information security probably doesn’t have the budget for centralized logging just for the sake of information security. But once you have the logs in a central location, they can be used for other business purposes besides infosec.
OK, that makes more sense. How do you get started?
First, you want to follow best practices; namely, plan ahead and think it through. Planning and thinking through this kind of project will pay off on several fronts, not just for information security.
Here are some of the things to consider when you say to yourself, “I want centralized logging to improve my information security program.”
Step One: Make a plan and have a strategy for this project
Do not buy the first SIEM tool you find. Think about what data you want to collect. As part of this planning process you’ll ask questions of your network team and others including:
- How big are the daily logs from the web servers, SQL, Oracle DBs, etc.?
- What is our network traffic load like (gigabytes of network logs per day? terabytes?)
- How many devices do we want (or need) to monitor (servers, switches, firewalls, wireless APs)?
- From what other systems do we want to collect logs (antivirus, home-grown applications, VoIP traffic, printer logs, your Kubernetes farm, etc.)?
- What kind of shop are you running? All Microsoft? All Linux? A hybrid?
- Besides security monitoring, why are you logging all this information? Application troubleshooting? Customer support? Continuous improvement?
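To put rough numbers behind these questions, here is a minimal Python sketch that estimates how much storage a central log store might need. The source names, daily volumes, retention period and headroom factor are all made-up assumptions for illustration; swap in the figures your network team gives you.

```python
def estimate_storage_gb(daily_gb_by_source, retention_days, headroom=1.0):
    """Return the storage (in GB) needed to keep every source for the retention window.

    headroom is a multiplier for growth and indexing overhead (1.2 means 20% extra).
    """
    daily_total = sum(daily_gb_by_source.values())
    return daily_total * retention_days * headroom


# Hypothetical daily log volumes in GB; replace with measured values.
daily = {
    "web_servers": 12.0,
    "databases": 8.0,
    "firewalls": 25.0,
    "switches": 5.0,
}

total = estimate_storage_gb(daily, retention_days=90, headroom=1.2)
print(f"Daily ingest: {sum(daily.values()):.1f} GB")
print(f"Storage for 90 days with 20% headroom: {total:.1f} GB")
```

Even a back-of-the-envelope estimate like this is useful when you compare SIEM tools, since the size of your daily ingest and your retention window both affect what you will need to buy.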
Step Two: Standardize
Before you purchase anything, make sure the structure of the logs you are collecting is consistent or can be made consistent in the tool.
You won’t be able to ingest logs from multiple data sources unless there is a consistent log format. Your network infrastructure devices will have a format, most likely syslog, and your firewall(s) will likely have a similar one. Beyond that, things can get proprietary (ugly, in other words). Remember, you are not just dumping data into a SQL server and then magically extracting useful information and meaningful insight about your network.
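Here is a small Python sketch of what "made consistent" can look like in practice: two differently shaped log lines are mapped onto one common set of fields. Both input formats, the field names (ts, device, service, msg) and the device names are hypothetical; real devices and real SIEM tools will have their own layouts and their own parsers.

```python
import json
import re

# Assumed syslog-style layout: "<timestamp> <host> <app>: <message>"
SYSLOG_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<host>\S+)\s+(?P<app>[^:\s]+):\s+(?P<message>.*)$"
)


def normalize_syslog(line):
    """Map a syslog-style text line onto the common record schema."""
    m = SYSLOG_RE.match(line)
    if m is None:
        raise ValueError(f"unrecognized syslog-style line: {line!r}")
    return {
        "timestamp": m["timestamp"],
        "source": m["host"],
        "app": m["app"],
        "message": m["message"],
    }


def normalize_json(line):
    """Map a (hypothetical) JSON firewall record onto the same schema."""
    raw = json.loads(line)
    return {
        "timestamp": raw["ts"],
        "source": raw["device"],
        "app": raw.get("service", "unknown"),
        "message": raw["msg"],
    }


# Made-up sample lines for illustration only.
syslog_line = "2022-09-03T23:00:05-04:00 nyc-sw-45-conf linkmon: high broadcast rate on port 12"
json_line = (
    '{"ts": "2022-09-03T23:00:07-04:00", "device": "nyc-fw-01", '
    '"service": "filter", "msg": "no unusual outbound traffic"}'
)

records = [normalize_syslog(syslog_line), normalize_json(json_line)]
for record in records:
    print(record)
```

Once every source lands in the same shape, a single search on "source" or "timestamp" works across switches, firewalls and anything else you add later.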
Step Three: Keep all your network devices synchronized
This might seem obvious, but to be clear, you need to make sure the logs are all synced to the same time. All network devices and computer systems have a clock, so you will get the date and time for the events that you are logging. You want to use the Network Time Protocol (NTP) to sync all the systems to the same time source or you’ll have problems. Time is relative, sure, but for the purposes of logging events in a SIEM tool for troubleshooting, you need the clocks on your devices set to the same time and time zone.
If you have a switch (or two) that thinks it’s 1990 but you know it’s 2022, you are going to have a real tough time figuring out what happened Saturday night. It is easy for network devices and servers to get out of sync; you need them to be synchronized so that you know what happened at exactly what time.
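As a rough illustration, the Python sketch below converts device timestamps to a single reference (UTC, which is my assumption here, not a requirement) and flags any device whose clock has drifted too far. The device names, the sample timestamps and the five-second tolerance are hypothetical. This only detects drift in timestamps you have already collected; keeping the clocks in sync is still NTP’s job.

```python
from datetime import datetime, timedelta, timezone


def to_utc(timestamp_text):
    """Parse an ISO 8601 timestamp with an explicit offset and convert it to UTC."""
    parsed = datetime.fromisoformat(timestamp_text)
    if parsed.tzinfo is None:
        raise ValueError(f"timestamp has no UTC offset: {timestamp_text!r}")
    return parsed.astimezone(timezone.utc)


def find_skewed_devices(reported_times, reference_utc, tolerance=timedelta(seconds=5)):
    """Return {device: drift} for devices whose clock differs from the reference
    by more than the tolerance."""
    skewed = {}
    for device, text in reported_times.items():
        drift = to_utc(text) - reference_utc
        if abs(drift) > tolerance:
            skewed[device] = drift
    return skewed


# Reference time: Saturday 11:00:05 p.m. Eastern Daylight Time, expressed in UTC.
reference = datetime(2022, 9, 4, 3, 0, 5, tzinfo=timezone.utc)

# Hypothetical "current time" reports collected from three devices.
reported = {
    "nyc-sw-45-conf": "2022-09-03T23:00:06-04:00",      # one second ahead
    "chi-sw-02": "2022-09-03T22:00:04-05:00",           # Central time, one second behind
    "nyc-sw-server-room": "1990-01-01T00:00:00-05:00",  # thinks it is 1990
}

skewed = find_skewed_devices(reported, reference)
for device, drift in skewed.items():
    print(f"{device} is out of sync by {drift}")
```

Note that the Chicago switch is not flagged: it reports a different local time, but once the offset is applied it agrees with the reference. Only the switch stuck in 1990 shows up.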
Step Four: Ensure that each data source has unique identifiers
If you are searching through log data looking to see what happened Saturday night at 11:00 p.m. Eastern Daylight Time, make sure you know which switch is in the server room and which switch is in the big video conference room on the 45th floor. Here is an example of a switch log record; note the various fields and values that you want to be able to search and index.
You can see that this single log record has lots of information, but what switch did it come from? You need to be able to answer that question or all your time and effort will be wasted.
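One way to make sure every record can answer "which switch did this come from?" is to tag it with a unique identifier when it is collected. The Python sketch below does that with a small inventory lookup. The IP addresses, device IDs, locations and the sample records are all hypothetical and not taken from a real switch; many collectors can add this kind of field for you.

```python
# Hypothetical inventory: maps the address a log arrived from to a unique device identity.
INVENTORY = {
    "10.45.0.12": {
        "device_id": "nyc-sw-45-conf",
        "location": "NYC, 45th floor, video conference room",
    },
    "10.1.0.5": {
        "device_id": "nyc-sw-server-room",
        "location": "NYC, server room",
    },
}


def tag_record(record, source_ip, inventory=INVENTORY):
    """Return a copy of the record with source_ip, device_id and location added."""
    identity = inventory.get(
        source_ip,
        {"device_id": f"unknown-{source_ip}", "location": "unknown"},
    )
    return {**record, "source_ip": source_ip, **identity}


def records_from(records, device_id):
    """Filter records down to those produced by one device."""
    return [r for r in records if r["device_id"] == device_id]


# Made-up records for illustration only.
raw = {"timestamp": "2022-09-03T23:00:05-04:00", "message": "high broadcast rate on port 12"}
other = {"timestamp": "2022-09-03T23:00:06-04:00", "message": "link up on port 3"}

tagged = [tag_record(raw, "10.45.0.12"), tag_record(other, "10.1.0.5")]

for r in records_from(tagged, "nyc-sw-45-conf"):
    print(r["timestamp"], r["location"], r["message"])
```

Records from an address that is not in the inventory are still kept, but they are labeled "unknown" so that the gap in your inventory is visible instead of silently ignored.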
Step Five: Keep your production logs and centralized logs separate
This is, again, probably obvious; but to be clear: Your SIEM tool does not replace your SQL logs (or Oracle logs or other production logs). When you need to roll back transactions in SQL or Oracle, etc., you are going to use those production logs. The value of the SIEM tool is to gather insights about your network and servers and other devices. We are mainly focused on security insights (telemetry, correlation, etc.), but the same advice applies to troubleshooting a cranky application, investigating dropped VoIP calls or providing customer support.
“Hooray! Now I’m done!”
You’re not done yet.
“Wait—I’m not?”
Well, you’re more than halfway done. You’ve done the heavy lifting of getting your log data organized and centralized so that you can identify problems on your network when they happen. That is great! Now you get to use this new tool to get insight into what is happening on your network.
Flash back to the Labor Day incident and you can start to see how this tool can help you figure out what is happening.
It’s 11:00 p.m. Saturday night over the Labor Day weekend. Your helpdesk reported that the network is slow in New York City. That is very odd: no one is working Saturday in New York, Chicago, Los Angeles or in any of your offices.
What is going on?
You tell the helpdesk to put in a ticket to network operations about the slow network in New York. The Ops team opens up the SIEM tool and does a query. Sure enough, the switch in the conference room is blasting out a ton of bad packets. When they look a bit closer, they see the offender is an IoT device that has gone bad and is flooding the network with bad packets.
No other alerts have been triggered.
- The firewall is not showing unusual activity out of New York or anywhere else.
- The database servers are humming along fine in the server room.
- The only problem is that one switch in the conference room.
- It’s not ransomware; you’re not under attack.
- You don’t have to call the CEO or the CFO about a possible ransomware attack.
The Ops team shuts off the port on the switch, traffic returns to normal and the event is logged in the ticketing system. A ticket has been opened for a support person in New York to replace the bad IoT device first thing Tuesday morning.
Mystery solved, crisis averted—and you can chalk up that win to using the SIEM tool to identify the offending switch. That is #Winning.
Guide to Computer Security Log Management: Recommendations of the National Institute of Standards and Technology