HTTP Logs Analysis using Microsoft Log Parser

Several free tools are available on the web to analyze your website traffic, and they do it well (Google Analytics, Google Webmaster Tools, Bing Webmaster Tools, …). These tools provide great value for tracking your traffic and troubleshooting potential issues on your website. Like any tool, they have limitations, and at some point you need alternative or complementary solutions.

In this post I will discuss the use of Microsoft Log Parser to analyze “hits” on your web server. Any website, whatever its size or complexity, runs into these types of problems over time:

1)    Change of URL
2)    Removing old pages
3)    Error pages

To some extent the tools mentioned above will show you these errors, but they might not give you exactly what you need from a data analysis perspective. Take error pages, for example: if some of your pages crash and return an HTTP 500 status code, you might not be able to capture that data with the normal Google Analytics JavaScript, depending on how you handle these crashes.

One way to get access to this data is to analyze your web server logs (if logging is enabled, of course). To avoid going into too much detail, here are some utility commands that will help you troubleshoot issues in your application. (After installing Log Parser you will be able to run the commands below from the command line.)

HTTP 200 OK from Google Bots
LogParser.exe "SELECT date, count(*) as hit INTO HTTP200.jpg FROM Path\to\Logs\*.log WHERE cs(User-Agent) LIKE '%%google%%' AND sc-status = 200 GROUP BY date ORDER BY date" -i:w3c -groupSize:800x600 -chartType:Area -categories:ON -legend:OFF -fileType:JPG -chartTitle:"HTTP 200 Hits"

HTTP 301 Moved Permanently from Google Bots
LogParser.exe "SELECT date, count(*) as hit INTO HTTP301.jpg FROM Path\to\Logs\*.log WHERE cs(User-Agent) LIKE '%%google%%' AND sc-status = 301 GROUP BY date ORDER BY date" -i:w3c -groupSize:800x600 -chartType:Area -categories:ON -legend:OFF -fileType:JPG -chartTitle:"HTTP 301 Hits"

HTTP 4xx Not Found / Gone from Google Bots
LogParser.exe "SELECT date, count(*) as hit INTO HTTP4xx.jpg FROM Path\to\Logs\*.log WHERE cs(User-Agent) LIKE '%%google%%' AND sc-status >= 400 AND sc-status < 500 GROUP BY date ORDER BY date" -i:w3c -groupSize:800x600 -chartType:Area -categories:ON -legend:OFF -fileType:JPG -chartTitle:"HTTP 4xx Hits"

These queries will produce nice graphs of how many HTTP 200, 301 and 4xx hits you receive per day while the Google bot is crawling your site.
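
For readers who want to check the numbers behind such a chart, the same per-day counting can be sketched in Python. This is only an illustration and not part of Log Parser: the function name `daily_hits`, its parameters and the sample log lines are all made up for the example, and it assumes a W3C extended log whose `#Fields:` directive lists `date`, `sc-status` and `cs(User-Agent)`.

```python
from collections import Counter


def daily_hits(lines, agent_keyword="google", status_min=200, status_max=200):
    """Count hits per date for user agents containing agent_keyword
    whose status code lies in [status_min, status_max].

    Hypothetical helper for illustration; assumes W3C extended log lines.
    """
    fields = []
    hits = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The #Fields directive names the columns of the entries that follow.
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        # Skip other directives, and entries seen before any #Fields line.
        if line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        agent = row.get("cs(User-Agent)", "").lower()
        status = row.get("sc-status", "")
        if not status.isdigit():
            continue
        if agent_keyword in agent and status_min <= int(status) <= status_max:
            hits[row.get("date", "")] += 1
    return dict(sorted(hits.items()))


# Made-up sample entries, only to exercise the function.
sample_log = [
    "#Version: 1.0",
    "#Fields: date time cs-uri-stem sc-status cs(User-Agent)",
    "2012-03-01 10:00:00 /index.html 200 Mozilla/5.0+(compatible;+Googlebot/2.1)",
    "2012-03-01 10:05:00 /old.html 404 Mozilla/5.0+(compatible;+Googlebot/2.1)",
    "2012-03-01 10:06:00 /index.html 200 Mozilla/5.0+(Windows+NT+6.1)",
    "2012-03-02 09:00:00 /about.html 200 Mozilla/5.0+(compatible;+Googlebot/2.1)",
]

ok_hits = daily_hits(sample_log)
not_found_hits = daily_hits(sample_log, status_min=400, status_max=499)
print(ok_hits)
print(not_found_hits)
```

Passing a status range instead of a single code mirrors the difference between the 200/301 queries and the 4xx query.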

You can also easily find out the same thing for your users by changing the cs(User-Agent) LIKE '%%google%%' to cs(User-Agent) NOT LIKE '%%bot%%'.

Of course these numbers are approximate, because not all bots include the keyword “bot” in their user-agent.
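
To make that limitation concrete, here is a small Python sketch of the same keyword test. The helper name `looks_like_bot` and the extended keyword list are assumptions chosen for illustration, not a complete or authoritative bot list. The Yahoo! Slurp user-agent is an example of a crawler that a plain “bot” filter would count as a regular user.

```python
def looks_like_bot(user_agent, keywords=("bot",)):
    """Return True if any keyword appears in the user-agent (case-insensitive).

    Hypothetical helper mirroring the LIKE '%bot%' test from the queries.
    """
    ua = user_agent.lower()
    return any(keyword in ua for keyword in keywords)


googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
slurp = "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"

# The plain "bot" keyword catches Googlebot but misses Yahoo! Slurp.
print(looks_like_bot(googlebot))  # True
print(looks_like_bot(slurp))      # False

# An illustrative, non-exhaustive keyword list narrows the gap.
wider = ("bot", "slurp", "spider", "crawler")
print(looks_like_bot(slurp, keywords=wider))  # True
```

In a Log Parser query the equivalent would be additional NOT LIKE conditions combined with AND, one per keyword.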

I hope this comes in handy. If you have more queries to share, drop by and leave a comment.
Further reading:

SEO: Bounce rate of a website

Why is my bounce rate so high ?

Definition: A bounce occurs when a person leaves your website after reaching your entry page. Both of the cases below count equally as bounces from your website.

1) The visitor enters your site and presses back immediately (before or even after the page has loaded).

2) The visitor waits for the page to load, stays on it for some time and then presses back or navigates to another site. (In this case the visitor might have found the information and then chosen to navigate elsewhere for supplementary information. Or he/she might not have found it but just read some pieces to see what is there. A third possibility is that the person did not like the website, its content or its colors, and went away.)
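
As a quick worked example of the definition, the bounce rate is the share of visits that end on the entry page. The sketch below assumes you already have the number of pages viewed in each visit; the function name `bounce_rate` and the sample numbers are made up for illustration.

```python
def bounce_rate(pages_per_visit):
    """Fraction of visits that viewed only the entry page.

    pages_per_visit: list of page-view counts, one per visit (hypothetical input).
    """
    if not pages_per_visit:
        return 0.0
    bounces = sum(1 for pages in pages_per_visit if pages == 1)
    return bounces / len(pages_per_visit)


# Five made-up visits: three of them stopped at the entry page.
visits = [1, 3, 1, 2, 1]
rate = bounce_rate(visits)
print(f"Bounce rate: {rate:.0%}")  # Bounce rate: 60%
```

Note that both cases listed above land in the same bucket here: a visit with a single page view counts as a bounce no matter how long the visitor stayed.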

Therefore there are a considerable number of aspects to take into account to get a more precise answer to “Why is my bounce rate so high ?”. There isn’t any straightforward answer to this question, but there are many questions that can lead to possible solutions.

Here are some of the questions that might come to mind when you look at your bounce rate:

Why is my bounce rate so high ?

User Interface
Is my layout/presentation/design attractive to visitors ?
Do my pages load slowly ?
Do my pages have appropriate ads ? Are these ads non-aggressive towards the user ?
Is my page browser friendly ? (Can it be viewed the same way at any resolution and in any browser ?)