httpsum - an apache log file analyzer

No log file analyzer seemed perfect. So I wrote one that incorporates the features I need based on my output requirements:

Details

Examples

Here's a real world example of anylizing the log files from my Captured on Earth Photography website:

Example Input Config

  <httpsum>
    <global>
      <include src="agents.xml" />
    </global>
    <sites>
      <site name="capturedonearth.com">
        <include>type-gallery2.xml</include>
      </site>
    </sites>
  </httpsum>

Example Output

 # httpsum -c config.xml capturedonearth.com/access.log.2010-05-1*

 ----- capturedonearth.com -----

 Bot hits:
      8 Ask Jeeves
      3 SurveyBot
    548 dotnetdotcom
     21 Twiceler
   1663 MSN
   ...

 Hits:
      1 Item: 6379
      1 core: DownloadItem - 6831
        1 Item: 81
      1 core: DownloadItem - 6647
        1 Item: 8545
      ...
     28 Item: 8545
        1 http://twitter.com/
        1 Item: 8549
        1 Item: 8545
        1 http://twitter.com/NevadaWolf/geocachers
        6 http://touch.facebook.com/
     62 slideshow: DownloadPicLens
        1 Item: 7128
        1 slideshow: Slideshow - 35
        ...
        2 slideshow: Slideshow - 6638
        3 slideshow: Slideshow - 6363
        8 http://capturedonearth.com/main.php
       10 Item: 8545
       13 Item: 8549
The indented lines are the referring site. EG, picture number 8545 was refered to 6 times by http://touch.facebook.com/.
Wes Hardaker
Last modified: Mon May 24 10:48:18 PDT 2010