Analyzing a (small) log file


I got a log file with some request trace data from a customer, and I wanted a better view of what was actually going on. The log file was only 35MB, which made things very easy.

I know about Log Parser, but to be honest, it would take more time to learn to use that effectively than to write my own tool for a single use case.

The first thing I needed to do was get the file into a format that I could work with:

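// TextFieldParser lives in the Microsoft.VisualBasic.FileIO namespace
// (add a reference to Microsoft.VisualBasic).
var file = @"C:\Users\Ayende\Downloads\u_ex140904\u_ex140904.log";
var parser = new TextFieldParser(file)
{
    CommentTokens = new[] {"#"},    // skip the "#Fields: ..." header lines
    Delimiters = new[] {" "},       // fields are space-separated
    HasFieldsEnclosedInQuotes = false,
    TextFieldType = FieldType.Delimited,
    TrimWhiteSpace = false,
};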

// Fields, in order:
// "date", "time", "s-ip", "cs-method", "cs-uri-stem", "cs-uri-query", "s-port", "cs-username", "c-ip",
// "cs(User-Agent)", "sc-status", "sc-substatus", "sc-win32-status", "time-taken"

var entries = new List<LogEntry>();

while (parser.EndOfData == false)
{
    var values = parser.ReadFields();
    if (values == null)
        break;
    var entry = new LogEntry
    {
        Date = DateTime.Parse(values[0]),
        Time = TimeSpan.Parse(values[1]),
        ServerIp = values[2],
        Method = values[3],
        Uri = values[4],
        Query = values[5],
        Port = int.Parse(values[6]),
        UserName = values[7],
        ClientIp = values[8],
        UserAgent = values[9],
        Status = int.Parse(values[10]),
        SubStatus = int.Parse(values[11]),
        Win32Status = int.Parse(values[12]),
        TimeTaken = int.Parse(values[13])
    };
    entries.Add(entry);
}
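
The LogEntry class isn't shown above, so here is a minimal sketch of what it might look like, with the property names and types inferred from the object initializer. The LogReader.Parse helper is a hypothetical wrapper of my own around the same parsing loop. It takes a TextReader so it can be tried against an in-memory string instead of a real file. The [Serializable] attribute is an assumption based on the BinaryFormatter step mentioned next.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical shape of LogEntry, inferred from the initializer in the post.
[Serializable] // assumed: BinaryFormatter needs the type to be serializable
public class LogEntry
{
    public DateTime Date { get; set; }
    public TimeSpan Time { get; set; }
    public string ServerIp { get; set; }
    public string Method { get; set; }
    public string Uri { get; set; }
    public string Query { get; set; }
    public int Port { get; set; }
    public string UserName { get; set; }
    public string ClientIp { get; set; }
    public string UserAgent { get; set; }
    public int Status { get; set; }
    public int SubStatus { get; set; }
    public int Win32Status { get; set; }
    public int TimeTaken { get; set; }
}

// Hypothetical helper: the same loop as above, wrapped so it accepts any TextReader.
public static class LogReader
{
    public static List<LogEntry> Parse(TextReader reader)
    {
        var inv = CultureInfo.InvariantCulture; // avoid machine-specific date formats
        var entries = new List<LogEntry>();
        using (var parser = new TextFieldParser(reader)
        {
            CommentTokens = new[] { "#" },
            Delimiters = new[] { " " },
            HasFieldsEnclosedInQuotes = false,
            TextFieldType = FieldType.Delimited,
            TrimWhiteSpace = false,
        })
        {
            while (parser.EndOfData == false)
            {
                var values = parser.ReadFields();
                if (values == null)
                    break;
                entries.Add(new LogEntry
                {
                    Date = DateTime.Parse(values[0], inv),
                    Time = TimeSpan.Parse(values[1], inv),
                    ServerIp = values[2],
                    Method = values[3],
                    Uri = values[4],
                    Query = values[5],
                    Port = int.Parse(values[6], inv),
                    UserName = values[7],
                    ClientIp = values[8],
                    UserAgent = values[9],
                    Status = int.Parse(values[10], inv),
                    SubStatus = int.Parse(values[11], inv),
                    Win32Status = int.Parse(values[12], inv),
                    TimeTaken = int.Parse(values[13], inv)
                });
            }
        }
        return entries;
    }
}
```

Calling LogReader.Parse(new StringReader(text)) on a couple of made-up lines is a quick way to check the field order before pointing it at the real 35MB file via File.OpenText.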

Since I want to run many queries, I serialized the output to a binary file, to save the parsing cost next time. But the binary file (BinaryFormatter) was actually 41MB in size, and while parsing the text file took 5.5 seconds, loading the binary file took 6.7 seconds. In other words, the "optimized" format was both bigger and slower than just parsing the text again.

After that, I can run queries like this:

var q = from entry in entries
        where entry.TimeTaken > 10
        group entry by new {entry.Uri}
        into g
        where g.Count() > 2
        select new
        {
            g.Key.Uri,
            Avg = g.Average(e => e.TimeTaken)
        }
        into r
        orderby r.Avg descending
        select r;
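
To make it clearer what that query returns, here is a small self-contained sketch of the same shape of query in method syntax, run over a handful of made-up entries. The Hit class, the SlowRequests.Find helper and the sample numbers are all hypothetical, and the thresholds are parameters rather than the hard-coded 10 and 2 from above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical cut-down entry with just the two fields the query touches.
public class Hit
{
    public string Uri { get; set; }
    public int TimeTaken { get; set; }
}

public static class SlowRequests
{
    // Filter by time taken first, then group by Uri, keep groups with more than
    // minCount remaining entries, and order by average time taken, slowest first.
    public static List<KeyValuePair<string, double>> Find(
        IEnumerable<Hit> entries, int minTimeTaken, int minCount)
    {
        return entries
            .Where(e => e.TimeTaken > minTimeTaken)
            .GroupBy(e => e.Uri)
            .Where(g => g.Count() > minCount)
            .Select(g => new KeyValuePair<string, double>(
                g.Key, g.Average(e => e.TimeTaken)))
            .OrderByDescending(r => r.Value)
            .ToList();
    }

    // Made-up sample data, only for illustration.
    public static List<Hit> Sample()
    {
        return new List<Hit>
        {
            new Hit { Uri = "/a", TimeTaken = 20 },
            new Hit { Uri = "/a", TimeTaken = 40 },
            new Hit { Uri = "/a", TimeTaken = 60 },
            new Hit { Uri = "/b", TimeTaken = 100 },
            new Hit { Uri = "/b", TimeTaken = 200 },
            new Hit { Uri = "/c", TimeTaken = 5 },
            new Hit { Uri = "/c", TimeTaken = 15 },
            new Hit { Uri = "/c", TimeTaken = 30 },
            new Hit { Uri = "/d", TimeTaken = 12 },
            new Hit { Uri = "/d", TimeTaken = 14 },
            new Hit { Uri = "/d", TimeTaken = 16 },
        };
    }
}
```

With SlowRequests.Find(SlowRequests.Sample(), 10, 2), only /a (average 40) and /d (average 14) come back, in that order. /b is very slow but has only two requests, and /c has three requests but only two of them pass the time filter. The count is applied after the filter, so it counts slow requests, not all requests to that Uri.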

And start digging into what the data is telling me.