HTTP Logs Analysis using Microsoft Log Parser

Several free tools are available on the web to analyze your website traffic, and they do a great job of it (Google Analytics, Google Webmaster Tools, Bing Webmaster Tools …). These tools provide great value for tracking your traffic and troubleshooting potential issues on your website. Like any tool, they have limitations, and at some point you will need alternative or complementary solutions.

In this post I will discuss the use of Microsoft Log Parser to analyze “hits” on your web server. Whatever its size or complexity, any website runs into these types of problems over time:

1)    Change of URL
2)    Removing old pages
3)    Error pages

To some extent the tools mentioned above will show you these errors, but they might not give you exactly what you are looking for from a data analysis perspective. Take error pages, for example: if some of your pages crash and return an HTTP 500 status code, you might not be able to capture that data with the normal Google Analytics JavaScript, depending on how you handle these crashes.

One way to get access to this data is to analyze your web server logs (if logging is enabled, of course). Rather than go into a detailed explanation, here are some utility queries that will help you troubleshoot issues in your application. (After installing Log Parser you will be able to run the commands below from the command line.)

HTTP 200 OK from Google Bots
[SQL]
LogParser.exe "SELECT date, count(*) as hit INTO HTTP200.jpg FROM Path\to\Logs\*.log WHERE cs(User-Agent) LIKE '%%google%%' AND sc-status = 200 GROUP BY date ORDER BY date" -i:w3c -groupSize:800x600 -chartType:Area -categories:ON -legend:OFF -fileType:JPG -chartTitle:"HTTP 200 Hits"
[/SQL]

HTTP 301 Moved Permanently from Google Bots
[SQL]
LogParser.exe "SELECT date, count(*) as hit INTO HTTP301.jpg FROM Path\to\Logs\*.log WHERE cs(User-Agent) LIKE '%%google%%' AND sc-status = 301 GROUP BY date ORDER BY date" -i:w3c -groupSize:800x600 -chartType:Area -categories:ON -legend:OFF -fileType:JPG -chartTitle:"HTTP 301 Hits"
[/SQL]

HTTP 4xx Not Found / Gone from Google Bots
[SQL]
LogParser.exe "SELECT date, count(*) as hit INTO HTTP4xx.jpg FROM Path\to\Logs\*.log WHERE cs(User-Agent) LIKE '%%google%%' AND sc-status >= 400 AND sc-status < 500 GROUP BY date ORDER BY date" -i:w3c -groupSize:800x600 -chartType:Area -categories:ON -legend:OFF -fileType:JPG -chartTitle:"HTTP 4xx Hits"
[/SQL]

These queries will produce nice graphs of how many HTTP 200, 301 and 4xx hits you receive per day while the Google bot is crawling your site.

You can easily get the same figures for your human visitors by changing cs(User-Agent) LIKE '%%google%%' to cs(User-Agent) NOT LIKE '%%bot%%'.

Of course these figures are approximate, because not all bots include the keyword “bot” in their user-agent.
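If you want to double check the numbers without Log Parser, the same filtering can be sketched in a few lines of Python. This is only a minimal sketch, not what Log Parser does internally: it assumes W3C extended logs with a "#Fields:" header line, and the function names (iter_w3c_records, hits_per_day) and the sample log content are made up for illustration.

```python
from collections import Counter


def iter_w3c_records(lines):
    """Yield one dict per log entry, keyed by the names in the '#Fields:' directive."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directive line: only '#Fields:' matters here, it names the columns.
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue
        values = line.split()  # fields are space separated
        if len(values) == len(fields):
            yield dict(zip(fields, values))


def hits_per_day(lines, status_min, status_max, agent_contains=None, agent_excludes=None):
    """Count hits per date with status_min <= sc-status <= status_max, filtered by user agent."""
    counts = Counter()
    for rec in iter_w3c_records(lines):
        status = rec.get("sc-status", "")
        if not status.isdigit() or not status_min <= int(status) <= status_max:
            continue
        agent = rec.get("cs(User-Agent)", "").lower()
        if agent_contains and agent_contains not in agent:
            continue
        if agent_excludes and agent_excludes in agent:
            continue
        counts[rec.get("date", "")] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample data, only here to make the sketch runnable.
SAMPLE_LOG = """\
#Software: Microsoft Internet Information Services 7.5
#Fields: date time cs-method cs-uri-stem sc-status cs(User-Agent)
2012-01-01 10:00:00 GET /index.html 200 Mozilla/5.0+(compatible;+Googlebot/2.1)
2012-01-01 10:05:00 GET /old-page 404 Mozilla/5.0+(compatible;+Googlebot/2.1)
2012-01-01 10:06:00 GET /index.html 200 Mozilla/5.0+(Windows+NT+6.1)
2012-01-02 09:00:00 GET /index.html 200 Mozilla/5.0+(compatible;+Googlebot/2.1)
"""

if __name__ == "__main__":
    sample = SAMPLE_LOG.splitlines()
    print(hits_per_day(sample, 200, 200, agent_contains="google"))  # Google bot, HTTP 200
    print(hits_per_day(sample, 400, 499, agent_contains="google"))  # Google bot, HTTP 4xx
    print(hits_per_day(sample, 200, 200, agent_excludes="bot"))     # visitors, HTTP 200
```

Passing 500 and 599 as the status range would give you the daily count of crashing pages in the same way. For real logs you would feed the function the lines of your own log files instead of the sample.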

I hope this comes in handy. If you have more queries to share, drop by and leave a comment.
Further reading:

http://blogs.iis.net/carlosag/archive/2010/03/25/analyze-your-iis-log-files-favorite-log-parser-queries.aspx

http://logparserplus.com/

Using windows hosts file

The Windows hosts file, located under “[SystemDriveLetter]:\Windows\System32\drivers\etc”, is very useful when you have to test web applications hosted either locally or on a remote server and you do not wish to register them in your DNS.

Let’s take an example where you have a website named http://www.my-simple-web-application.com. You will most likely have 3-4 versions of the application: dev, preprod, test and live (where live would be http://www.my-simple-web-application.com).

To facilitate testing you could come up with a standard way of addressing these environments:

http://dev.my-simple-web-application.com
http://preprod.my-simple-web-application.com
http://test.my-simple-web-application.com

Each of these sub-domains might point to the same server or to different servers. This is where the hosts file comes in handy. You can configure something like:

127.0.0.1 dev.my-simple-web-application.com
127.0.0.1 preprod.my-simple-web-application.com
127.0.0.1 test.my-simple-web-application.com

In this example all IP addresses are local; you can change them as needed. Be aware that this configuration has to be placed on each desktop (development and test) from which you want to use these sub-domains.
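Since the same entries have to be copied to several desktops, it can help to check what a given hosts file actually contains. Here is a minimal Python sketch that reads hosts-file text into a name-to-IP mapping. The function name parse_hosts is made up for illustration, and the sketch assumes that the first entry for a name is the one that takes effect.

```python
def parse_hosts(text):
    """Return {hostname: ip} from hosts-file text, ignoring comments and blank lines."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        parts = line.split()
        if len(parts) < 2:
            continue  # an address without a hostname is not a usable entry
        ip, names = parts[0], parts[1:]
        for name in names:
            # Assumption: the first entry for a name wins, later ones are ignored.
            mapping.setdefault(name.lower(), ip)
    return mapping


# The entries from the example above, plus a comment line.
SAMPLE_HOSTS = """\
# development environments
127.0.0.1 dev.my-simple-web-application.com
127.0.0.1 preprod.my-simple-web-application.com
127.0.0.1 test.my-simple-web-application.com
"""

if __name__ == "__main__":
    for name, ip in parse_hosts(SAMPLE_HOSTS).items():
        print(name, "->", ip)
```

To inspect a real machine you would pass in the content of the hosts file from the folder mentioned above instead of the sample text.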

On another note, this configuration can also be achieved network-wide if you have a configurable router where you can add global hosts.

There are a number of other situations where the hosts file can be helpful:
1) You are migrating your website to a new server. In this case you can specify your existing domain name in the hosts file and point it to the IP of the new server.
2) You have multiple web servers hosting the same application and one of them is not working properly. You can change your hosts file to point only to the misbehaving server and target it directly.
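In both situations it is easy to forget whether the override is in place on the machine you are testing from. A small Python sketch like the one below can confirm which IP a name currently resolves to. It relies on socket.gethostbyname from the standard library, which goes through the system resolver; the function name check_overrides and the IP address 203.0.113.10 are made up for illustration.

```python
import socket


def check_overrides(expected, resolver=socket.gethostbyname):
    """Return {hostname: (expected_ip, actual_ip_or_None)} for names that do not resolve as expected."""
    mismatches = {}
    for hostname, expected_ip in expected.items():
        try:
            actual_ip = resolver(hostname)
        except OSError:
            actual_ip = None  # the name did not resolve at all
        if actual_ip != expected_ip:
            mismatches[hostname] = (expected_ip, actual_ip)
    return mismatches


if __name__ == "__main__":
    # Stand-in resolver so the demo runs without touching the network.
    # Drop the resolver argument to query the real system resolver.
    def demo_resolver(name):
        table = {"www.my-simple-web-application.com": "203.0.113.10"}
        if name not in table:
            raise socket.gaierror("unknown host")
        return table[name]

    expected = {
        "www.my-simple-web-application.com": "203.0.113.10",  # hypothetical new server IP
        "dev.my-simple-web-application.com": "127.0.0.1",
    }
    print(check_overrides(expected, resolver=demo_resolver))
```

An empty result means every name resolves to the IP you expect. Anything else lists the names to fix, together with the expected and the actual address.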

Lenovo 3000 N100

I bought a Lenovo 3000 N100 6 months ago. Now it’s time for a review of this piece of hardware.

The hardware specification is as follows:

Model: N100 0768-FFG
CPU: Core 2 Duo T5600 1.83 GHz
Memory: 1 GB DDR2 PC5300
Hard Drive: 120 GB HDD (Fujitsu) SATA
Screen: 15.4″ WSXGA+ 1680×1050 Glossy
Optical Drive: DVD-RW Matshita
GPU: NVIDIA 7300 Go 128 MB (dedicated)
Network/Wireless: Intel Wireless 3945A/B/G, Realtek 10/100 Ethernet Card, Modem and Bluetooth
Inputs: 84 Key Keyboard with Two Button Touchpad with Scroll Bar
Buttons: Power, Lenovo Care, Power Up and Down, Mute, and WiFi/Bluetooth On/Off Switch.
Ports:

  • Four USB 2.0
  • Four-Pin Firewire
  • 4-in-1 Card Reader
  • Ethernet
  • Modem
  • VGA Out
  • S-Video Out
  • Microphone
  • Headphone
  • Security Lock
  • Power Connector

Integrated Camera (1.3 MegaPixel)
Fingerprint reader
6-cell Li-Ion battery

It was delivered with Windows Vista Business Edition, along with some other software that makes it run slower than it should. So after the first month with Vista on it, I decided to downgrade to Windows XP Pro SP2, and it runs just fine. Before doing this, make sure you download all the drivers you need, since the laptop is not delivered with a driver CD. I found all the drivers on the Lenovo website.

Here are some stats and tests that I did.

Using CPU-Z (http://www.cpuid.com/cpuz.php)

[Images: CPU, Cache, SPD, Mainboard]

Super Pi Calculation (http://www.overclock.net/downloads/28044-definitive-super-pi-thread.html)

Will be coming soon… let me get some time.

Web Stress Tool

Today I have been trying to stress test my development website using ACT (Application Center Test) from Microsoft. I am using Visual Studio 2005 Professional, and I went looking for this precious tool, which was readily available in the previous version, VS.NET 2003. I couldn’t find it. (Uh! Did I forget to install it?) I checked, but no luck. I then searched the web and found a forum post saying that Application Center Test is no longer available in VS.NET 2005; you have to buy a licence for VS.NET Team tester to be able to use a stress tool. OK, how nice 🙂 marketing strategy…

Anyway, I have been wandering around the web to find a proper web stress/load tool to test my web developments. The results have been pretty disappointing: I could not find a proper tool for testing ASP.NET websites. After some time I ended up on SourceForge and found this tool: WebLOAD (open source performance testing), which is the open source version of the well-known RadView WebLOAD.

I’ll be trying this tool now and will give some feedback soon after.