Understanding %D in LogFormat for apache2 logs

A colleague recently asked me if %D, when used in your LogFormat for apache2 access logs includes the ‘download’ time of transmitting the response back to the client.

Following are observations from my apache2 access logs when using the %D format code. The first column is %D in the following log entries.

# 5 second pause on server side code. In this test case I deliberately induce a 5 second delay on server side code using the php sleep function. This delay can be seen in the timestamp (microseconds)
5003909 138.130.86.92 - - [11/Sep/2009:16:43:50 +1000] "GET /test.php HTTP/1.1" 200 139 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3"

# Static image open bandwidth. In this test case I let the browser download a static image as fast as my mobile broadband permits. The time takes approx 4.8 secs to download approx 900KB.
4860133 138.130.86.92 - - [11/Sep/2009:17:00:51 +1000] "GET /gorilla.jpg HTTP/1.1" 200 912128 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3"

# Static image restricted bandwidth. In this case I restrict the bandwidth to any destination port of 80 to 1KB/sec using ipfw[1]. This time it takes approx 12.8 secs to download the same file.
12891663 138.130.86.92 - - [11/Sep/2009:17:02:56 +1000] "GET /gorilla.jpg HTTP/1.1" 200 912128 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3"

[1] to restrict bandwidth on my mac I did this:
sudo ipfw pipe 1 config bw 1KByte/s
sudo ipfw add 1 pipe 1 dst-port 80
sudo ipfw delete 1

To sum up, when using %D in LogFormat for apache2, this timestamp is increased by any:
1. server side processing delays
2. delays attributed to restricted bandwidth etc

QED

Read More

Establishing Concurrency from your Web Logs

I recently had a conversation with a colleague about how one might determine the concurrency for a given page or transaction at any point in time. The system under test was a typical web server and we had access to the web logs for analysis.

That is, we wanted to know how many virtual users were hitting a web page at the same time without using a sledgehammer to open the walnut so to speak… To that end we fell back to simple unix commands like awk, sort and uniq.

The typical web log entry we were looking for looked like this:
10.1.20.10 - Unauth [16/Dec/2008:11:59:16 +1100] "GET HTTP://system.under.test/secure.html HTTP/1.1" 200 692

On windows we would use a command similar to this to generate our report:
gawk "{print $4,$7}" web01-request.log | grep "\/secure" | uniq -c | sort -T C:\temp | gawk "{print $1}" | uniq -c

The first awk prints the date time column and url column which we then grep for any entries containing our target word ‘secure’ (we could have done this in the awk statement too).

Then we use uniq -c to generate a unique count of rows that have the same date time, sort that (using a temp file), awk again on the first column and uniq -c again. The output looks like this:

2218 1
1483 2
263 3
258 4
38 5
36 6
2 7
5 8
1 12

So we can then determine for example that there were 1,483 occurrences where 2 virtual users hit the target at the same time. There were 258 occurrences where 4 virtual users hit the target at the same time and so on.

You can then graph this using the google chart API. No excel required =)

Read More