Introduction:
The Altiris IIS Log Analyzer - version 2 (aila2) is a C# rewrite of the first version of aila, which was written in C on a Linux system (it was my way to get intimate with C, pointers, and efficient or not-so-efficient data structures).
The rationale behind the move to C# was simple: it is much easier and much more efficient for users to run the tool on their own server, so that the results are produced as close as possible to where the (large) data resides and where the results are consumed.
So we have a set of downloads and files here on Connect, but I need to go one step further in documenting and sharing how the tool works. Hence this code review and the following articles, which will show how easily the code can be customized or improved as everyone sees fit.
In this first article we will review the overall code structure and make some changes to get a standalone C# source file that can be compiled directly on your server or workstation.
Presenting the aila2 version 2 code:
The code as of aila2 release 2 [1] is attached to this article. It is made of 3 files:
- aila2.cs (attached as aila2.cs_.txt, 779 lines)
- constant.cs (attached as constant.cs_.txt, 288 lines)
- aila2.csproj (attached as aila2.csproj.txt, 48 lines)
aila2.cs is the main program file. It makes use of constants defined in constant.cs, whilst the csproj file is used with the Visual Studio IDE (version 2005) to compile the code into an executable file. We will not need the csproj file, but it is attached here for documentation purposes.
aila2 code structure
Here is an outline of the different classes, methods and important properties that make up aila2:
class aila2 {
    static int Main(string[]);
    public enum log_levels;

    class Logger {
        public static void log_evt(log_levels, string);
    }

    class CLIConfig {
        public parse_results status;
        public string file_path;
        public static log_levels log_level = log_levels.error;
        public bool progress_bar;
        public bool stdin;
        public string out_path;
        public CLIConfig();
        public enum parse_results;
        public void dump_config();
        public int CheckConfig(string[]);
    }

    class ResultSet {
        public int LineCount;
        public int DataLines;
        public int SchemaDef;
        public int[] MIME_TYPE_hit_counter;
        public int[] IIS_STATUS_hit_counter;
        public int[] IIS_SUB_STATUS_hit_counter;
        public int[] IIS_WIN32_STATUS_hit_counter;
        public long[,] WEBAPP_Hit_counter;
        public long[,] AGENT_Hit_counter;
        public long[,] TASK_Hit_counter;
        public long[,] IRM_Hit_counter;
        public int[,] HOURLY_hit_counter;
        public IpHitLists IP_Handler;
        public ResultSet();
    }

    class Timer {
        public static void Init();
        public static void Start();
        public static void Stop();
        public static string tickCount();
        public static string duration();
    }

    public static readonly string[];
    public enum FieldPositions;

    class SchemaParser {
        public string current_schema_string;
        public List field_positions;
        public bool ready;
        public SchemaParser();
        public int ParseSchemaString(string schema);
    }

    class IpHitLists {
        public SortedDictionary ip_list;
        public SortedList> ip_hitters;
        public IpHitLists();
        public string GetIpList();
    }

    class LogAnalyzer {
        private ResultSet results;
        private SchemaParser schema;
        private CLIConfig config;
        private string[] current_line;
        private int _hour;
        private int _timetaken;
        private long _status;
        private long _substatus;
        private long _win32status;
        private string md5_hash;
        private string filename;
        public LogAnalyzer(CLIConfig);
        public void AnalyzeFile(string filepath);
        public bool AnalyzeStdin(string line);
        private void AnalyzeLine(ref string line);
        private int Analyze_MimeTypes(ref string uri);
        private int Analyze_WebApp(ref string uri, ref string param);
        private int Analyze_NSAgent(ref string uri);
        private void Analyze_TaskMgmt(ref string uri);
        private void Analyze_IRM(ref string param);
        private void SaveToFile(string filepath, string data);
        private string FloatToDottedString(float f);
        public void DumpResults();
    }
}
aila2 execution flow:
And here is a rundown of the execution, with the most important features explained in detail.
- aila2 starts and checks the command line arguments provided. If no file path is specified, the data stream to analyze is expected to come from stdin.
- for file based analysis we read each line one at a time and check if it's a schema definition, comment or data-line
- for stdin based analysis we read each line from stdin and check if it's a schema definition, comment or data-line
- schema definition lines are used to update the current IIS log file schema (which defines in which position the interesting fields are located). When a new schema is encountered it is immediately used. If the schema is unchanged we go to the next line
- comment lines are skipped
- data-lines are parsed to update statistics in the ResultSet class
- at the end of a file stream the ResultSet is output as a JSON file and the IP list is output as a text file
- at the end of a stdin stream the ResultSet is output to stdout
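To make the flow above more concrete, here is a minimal sketch of the line classification and schema handling. It is not the aila2 source: the names LineKind, LineSketch, Classify and ParseSchema are hypothetical, and the sample log lines are made up. It only assumes the W3C extended log format convention that directives start with "#" and that the column list is given on a "#Fields:" line.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names throughout; this is not the aila2 source code.
enum LineKind { Schema, Comment, Data }

static class LineSketch
{
    // W3C extended log files declare their columns on a "#Fields:" line.
    const string SchemaPrefix = "#Fields:";

    public static LineKind Classify(string line)
    {
        if (line.StartsWith(SchemaPrefix, StringComparison.Ordinal)) return LineKind.Schema;
        if (line.StartsWith("#", StringComparison.Ordinal)) return LineKind.Comment;
        return LineKind.Data;
    }

    // Maps each field name to its zero-based column position.
    public static Dictionary<string, int> ParseSchema(string schemaLine)
    {
        Dictionary<string, int> positions = new Dictionary<string, int>();
        string[] names = schemaLine.Substring(SchemaPrefix.Length)
            .Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        for (int i = 0; i < names.Length; i++)
            positions[names[i]] = i;
        return positions;
    }

    static void Main()
    {
        // Made-up sample lines, for illustration only.
        string[] log = {
            "#Version: 1.0",
            "#Fields: date time cs-uri-stem sc-status",
            "2013-01-01 10:15:00 /example/page.aspx 200",
            "#Fields: date time cs-uri-stem sc-status",
            "2013-01-01 10:15:01 /example/other.aspx 404"
        };

        string currentSchema = null;
        Dictionary<string, int> positions = null;
        int dataLines = 0;

        foreach (string line in log)
        {
            LineKind kind = Classify(line);
            if (kind == LineKind.Comment) continue;
            if (kind == LineKind.Schema)
            {
                // Only re-parse when the schema string actually changed.
                if (line != currentSchema)
                {
                    currentSchema = line;
                    positions = ParseSchema(line);
                }
                continue;
            }
            if (positions == null) continue; // data seen before any schema: skip it
            dataLines++;
            string[] fields = line.Split(' ');
            Console.WriteLine("status = " + fields[positions["sc-status"]]);
        }
        Console.WriteLine("data lines: " + dataLines);
    }
}
```

With the sample lines above, the second "#Fields:" line is identical to the first, so it is not parsed again, and the two data lines report status 200 and 404.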
Key classes and methods in the execution flow:
The classes and methods that we will review in detail relate to how we handle schema and data lines, and how we store and output the statistics. As such the classes that matter are:
- LogAnalyzer -> this is where we implement the log stream analysis
- SchemaParser -> this is how we ensure the lines are analyzed correctly
- IpHitLists -> a couple of lists that store IP addresses with their hit counts, and hit counts with their lists of IP addresses
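As an illustration of the two lists kept by IpHitLists, here is a minimal sketch. The generic type arguments are an assumption (the outline above does not show them), and the names IpHitSketch, Record and BuildHitters are hypothetical; the real class may be organized differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; the type arguments below are assumed, not taken from aila2.
class IpHitSketch
{
    // IP address -> number of hits seen for that address.
    public SortedDictionary<string, int> IpList = new SortedDictionary<string, int>();

    public void Record(string ip)
    {
        int count;
        IpList.TryGetValue(ip, out count); // count is left at 0 when the IP is new
        IpList[ip] = count + 1;
    }

    // Hit count -> all IP addresses that reached exactly that count.
    public SortedList<int, List<string>> BuildHitters()
    {
        SortedList<int, List<string>> hitters = new SortedList<int, List<string>>();
        foreach (KeyValuePair<string, int> entry in IpList)
        {
            List<string> ips;
            if (!hitters.TryGetValue(entry.Value, out ips))
            {
                ips = new List<string>();
                hitters.Add(entry.Value, ips);
            }
            ips.Add(entry.Key);
        }
        return hitters;
    }

    static void Main()
    {
        IpHitSketch sketch = new IpHitSketch();
        // Documentation-range addresses, for illustration only.
        string[] clients = { "192.0.2.10", "192.0.2.20", "192.0.2.10", "192.0.2.30" };
        foreach (string ip in clients) sketch.Record(ip);

        foreach (KeyValuePair<int, List<string>> entry in sketch.BuildHitters())
            Console.WriteLine(entry.Key + " hit(s): " + string.Join(", ", entry.Value.ToArray()));
    }
}
```

The first list answers "how many hits did this address generate?", while the second, built from the first, groups addresses by hit count so the heaviest hitters can be listed together.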
The LogAnalyzer key methods are:
- AnalyzeFile
- AnalyzeStdin
- AnalyzeLine
- Analyze_*
These methods follow the process flow described previously. For each line in the log stream (whether it comes from stdin or from a file reader object) we call AnalyzeLine(ref string line). Note that we needn't really pass the string by reference - but we'll correct this at a later stage ;).
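The remark about ref can be checked with a small sketch. In C# a string is a reference type, so passing it by value hands the method the same string object rather than a copy; ref only matters when the method must reassign the caller's variable. The names RefSketch, Echo and TrimInPlace are hypothetical and are not part of aila2.

```csharp
using System;

static class RefSketch
{
    // Returns its argument untouched, so the caller can compare references.
    public static string Echo(string line) { return line; }

    // 'ref' is only required when the method must reassign the caller's variable.
    public static void TrimInPlace(ref string line) { line = line.Trim(); }

    static void Main()
    {
        string line = "  GET /index.html 200  ";

        // Prints True: the by-value call received the same string object, not a copy.
        Console.WriteLine(object.ReferenceEquals(line, Echo(line)));

        // Here the caller's variable is replaced, which does need 'ref'.
        TrimInPlace(ref line);
        Console.WriteLine("[" + line + "]");
    }
}
```

Since a method that only reads the line never reassigns it, dropping ref from such a signature should not change its behavior.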
Now the Analyze_* methods are where the real work takes place, and they are most likely the location that will need to be modified to further customize the application to support new web applications or counters (on top of the constants section - but we'll get to it right now).
Constants.cs file content:
With quite a lot of detail already shown above, we will not dive into the constants file and the arrays and enums that are defined within it.
Let's just state here that the file was a port of the C implementation, and that some elements were added and not many removed, so cleaning it up will be a task for us later on as well.
So, let's just merge the files now to get a standalone C# file for quick and easy compilation using csc.exe.
Merging aila2.cs and constants.cs
This is a very simple process but let's describe it here step-by-step, to make sure we don't overlook anything and get a usable cs file:
- open constants.cs_.txt with another instance of your editor (or in a tab)
- remove the first 5 lines of constants.cs_.txt
- remove the last line with a single brace (})
- select all of the text in the document and copy it (Ctrl+A then Ctrl+C in most GUI based editors)
- open aila2.cs_.txt with your favorite editor
- go to the beginning of the last line of aila2.cs_.txt, where a single closing brace is present (})
- paste (Ctrl+V) the content extracted from constants.cs_.txt right there
- save the file (Ctrl+S)
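If you would rather script the merge than do it by hand, the steps above can be sketched as follows. This is a hedged illustration, not a tool shipped with aila2: MergeSketch and Merge are hypothetical names, the three command line arguments are whatever file paths you choose, and the sketch assumes that the closing brace really is the very last line of both files (no trailing blank lines).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of aila2.
static class MergeSketch
{
    // Inserts the constants (minus header lines and closing brace)
    // just before the last line of the main file.
    public static List<string> Merge(string[] mainLines, string[] constLines, int headerLines)
    {
        if (mainLines.Length == 0) throw new ArgumentException("main file is empty");

        List<string> merged = new List<string>();
        int lastMain = mainLines.Length - 1;

        // Everything from the main file except its final closing brace.
        for (int i = 0; i < lastMain; i++) merged.Add(mainLines[i]);

        // Constants without the first headerLines lines and without the last line.
        for (int i = headerLines; i < constLines.Length - 1; i++) merged.Add(constLines[i]);

        // Put the main file's closing brace back at the very end.
        merged.Add(mainLines[lastMain]);
        return merged;
    }

    static int Main(string[] args)
    {
        if (args.Length != 3)
        {
            Console.WriteLine("usage: merge <main.cs> <constants.cs> <output.cs>");
            return 1;
        }
        string[] mainLines = File.ReadAllLines(args[0]);
        string[] constLines = File.ReadAllLines(args[1]);

        // 5 header lines are dropped, as in the manual steps above.
        List<string> merged = Merge(mainLines, constLines, 5);
        File.WriteAllLines(args[2], merged.ToArray());
        return 0;
    }
}
```

It is worth opening the output file once to confirm that the constants landed inside the outer class before compiling it.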
You now have a single cs file that can be compiled into an executable with csc.exe. Instructions on how to find the correct version of csc.exe are available in another document here on Symantec Connect [2]. Using the csc.bat file built up in that document, you can now run the following command line (provided the batch file and the cs file are in the same folder):
csc aila2.cs
This should output an executable file that can analyze any log files you point it to or stream to it.
Conclusion:
In this article we have started reviewing the code that makes up aila2 as it is currently available on Symantec Connect. Next we will add some new features and clean up the code to make it as small as possible (without compromising on its documentation).
References:
[1] https://www-secure.symantec.com/connect/downloads/aila2-c-program-analyze-altiris-iis-log-files
[2] https://www-secure.symantec.com/connect/articles/csharp-essentials-generating-sha256-directory-hash