Weblog analysis

This presentation gives an overview of web log analysis tools.

Published in: Technology

Transcript

  • 1. Presented by: Somnath Mazumdar, somnath.mazumdar@ucdconnect.ie, https://www.csi.ucd.ie/users/somnath-mazumdar
  • 2. z Introductionz Pros & Cons of Methodsz AWStatsz Google Analyticsz AWStats Vs Google Analyticsz Packet Sniffingz Approachz Conclusion 1
  • 3.
      • Weblogs: activity/transaction information recorded by web servers.
      • Originally, weblogs were used only to count visitors.
      • Web analysis: off-site and on-site.
      • On-site information retrieval:
          1. Page tagging
          2. Historical web data analysis
      • Uses:
          1. Performance
          2. Security
          3. Prediction (regression/CART)
          4. Reporting and profiling:
              4.1. Web statistics
              4.2. Business analytics (k-means, MC)
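To make "activity/transaction information" concrete, here is a minimal sketch that parses one weblog entry. It assumes the Apache/NCSA Combined Log Format; the slides do not say which format the servers use, and the sample line, `LOG_PATTERN`, and `parse_line` are illustrative names and data.

```python
import re
from datetime import datetime

# Assumed layout: Apache/NCSA Combined Log Format (not stated in the slides).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A "-" size means no body was sent; treat it as zero bytes.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    return record

# Made-up sample entry for illustration only.
sample = ('203.0.113.7 - - [10/Oct/2012:13:55:36 +0100] '
          '"GET /index.html HTTP/1.1" 200 2326 '
          '"http://example.com/start" "Mozilla/5.0"')
record = parse_line(sample)
print(record["host"], record["path"], record["status"], record["size"])
```

Each parsed record carries the visitor address, requested resource, status, and bytes sent, which is the raw material for the performance, security, and reporting uses listed on the slide.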
  • 4.
      • Pros (page tagging):
          1. Accuracy: data comes from the end user.
          2. Speed of data reporting
          3. Data collection flexibility
          4. No need for your own web server
      • Cons:
          1. Users or firewalls can block the tag.
          2. Every page must be tagged.
          3. Cannot report on non-page hits.
          4. Cannot track bandwidth, server response time, or completed downloads.
  • 5.
      • Pros (log file analysis):
          1. Non-invasive data collection
          2. Can track bandwidth and completed downloads
          3. Helps to optimize for search engines
          4. Securely captures HTTP user names
          5. Can track “spiders” or robots
  • 6.
      • Pros (continued):
          6. Exact content delivery information
          7. Website content time-to-serve
          8. Information on missing or broken pages
      • Cons:
          1. Proxy/caching inaccuracies
          2. No event tracking (JavaScript, Flash, or AJAX)
          3. Log management: log generation, log storage, and log file transfer
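Several of the log-file pros above (bandwidth, missing or broken pages, robots) reduce to simple aggregation over parsed log records. The sketch below assumes records have already been parsed into dicts with `path`, `status`, `size`, and `agent` keys; the records and the `BOT_MARKERS` list are made up for illustration and are not a complete robot detector.

```python
from collections import Counter

# Substrings that often appear in crawler user agents (illustrative, incomplete).
BOT_MARKERS = ("bot", "spider", "crawl")

def summarize(records):
    """Aggregate bandwidth, 404 paths, and robot hits from parsed log records."""
    total_bytes = 0
    broken = Counter()
    robot_hits = 0
    for r in records:
        total_bytes += r["size"]              # bandwidth served
        if r["status"] == 404:                # missing or broken page
            broken[r["path"]] += 1
        if any(marker in r["agent"].lower() for marker in BOT_MARKERS):
            robot_hits += 1                   # likely spider/robot
    return {"bytes": total_bytes, "broken": broken, "robot_hits": robot_hits}

# Hypothetical records, as a log parser might produce them.
records = [
    {"path": "/index.html", "status": 200, "size": 2326, "agent": "Mozilla/5.0"},
    {"path": "/old-page", "status": 404, "size": 512, "agent": "Mozilla/5.0"},
    {"path": "/robots.txt", "status": 200, "size": 68, "agent": "Googlebot/2.1"},
    {"path": "/old-page", "status": 404, "size": 512, "agent": "ExampleSpider/1.0"},
]
report = summarize(records)
print(report["bytes"], dict(report["broken"]), report["robot_hits"])
```

None of these figures is available to a page tag, since robots rarely execute tags and a 404 page usually carries none.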
  • 7.
      • Goal: system based or product based
      • Cost: freeware or commercial
      • Storage: log storage (3rd party)
      • Report/tips: generate static or real-time reports, with tips
      AWStats is a powerful log analyzer that creates advanced web, FTP, mail, and streaming server statistics reports.
      Google Analytics provides in-depth product marketing information and tips (Google AdWords/AdSense).
  • 8.
      • Freeware
      • Graphically presented reports
      • Customizable reports
      • Reports based on users, OS, browser, location, data transfer, bookmarks, total visits, and so on
      • Standard and custom log formats supported
      • Works from the CLI as well as a CGI (flexibility)
      • Written in Perl
      • Many desired features
      • But less visual/interactive than Google Analytics (GA)
  • 9.
      • Issues:
          1. DNS lookup and full-year view (time)
          2. Database format: the "xml" format is 3 times larger than the default.
          3. The feature that excludes records from SPAM referrers is 5 times slower.
          4. Differentiating URLs of dynamic pages (memory)
          5. Accuracy hampers speed: keywords (1%), search engines (9%), worm detection (15%), OS (2%).
          6. Each extra section reduces AWStats speed by 8%. A wrong setup may eat all memory.
  • 10.
      • Session "unknown"
      • AWStats counts everything as pages.
      • Reports cannot be generated based on the current/custom date.
      • Reports cannot be generated based on a custom date range or on a weekly basis.
      • On a few Intel Pentium4 / Xeon4 based host systems, log file time cannot be computed correctly.
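When a weekly or custom-range report is needed, one workaround is to aggregate the parsed log dates directly. This is a sketch outside AWStats, not an AWStats feature; `hits_per_iso_week`, `hits_in_range`, and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date

def hits_per_iso_week(dates):
    """Count hits per (ISO year, ISO week number)."""
    counts = Counter()
    for d in dates:
        iso_year, iso_week, _weekday = d.isocalendar()
        counts[(iso_year, iso_week)] += 1
    return counts

def hits_in_range(dates, start, end):
    """Count hits whose date falls in the inclusive range [start, end]."""
    return sum(1 for d in dates if start <= d <= end)

# Made-up hit dates, as they might be extracted from log timestamps.
hit_dates = [
    date(2012, 10, 1), date(2012, 10, 3),
    date(2012, 10, 8), date(2012, 10, 14),
    date(2012, 10, 15),
]
weekly = hits_per_iso_week(hit_dates)
ranged = hits_in_range(hit_dates, date(2012, 10, 3), date(2012, 10, 14))
print(dict(weekly), ranged)
```

ISO weeks run Monday to Sunday, so 14 October 2012 (a Sunday) falls in the same week as 8 October.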
  • 12.
      • “Google Analytics shows you how people found your site, how they explored it, and how you can enhance their visitor experience.” (Google)
      • Free
      • Helps visitors by providing better keyword search
      • Provides information related to website design
      • Tagging: automatic for content management systems or blogging platforms, but manual for customized websites
      • Confidentiality: third-party data processing
  • 14.
      Name            | AWStats                  | Google Analytics
      Based on logs   | Yes                      | Site Search data
      Page tagging    | No                       | Yes
      Hits count      | Count everything as page | IP address and cookies
      Confidentiality | Not an issue             | Issue (if not owner)
      Meant for       | Website traffic analysis | Website traffic and marketing effectiveness
      Market share    | NA                       | Around 49.95% of top 1,000,000 hosts
  • 15.
      • The power of analysis is limited by the information in the logs.
      • Extensive logging consumes resources.
      • ... the more we measure, the less accurately we understand ...
      • Results from AWStats, Webalizer, and Google Analytics always differ because they use different techniques.
      • Use AWStats together with Google Analytics for better prediction.
  • 17.
      • A packet sniffer can capture and decode data streams passing over a digital network.
      • Non-intrusive technology: no log, no page tag.
      • Deploy the sniffer in the local network of the servers to be tracked.
      • Completely transparent to the tracked website(s)
      • Supports multiple servers without affecting server response time.
      Figure: Block diagram of packet sniffing
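The "decode" step can be illustrated with a small sketch that extracts the request line and headers from an already-captured TCP payload. It assumes the payload holds a complete, unencrypted HTTP/1.x request head; real sniffers must also capture packets and reassemble TCP streams, which is omitted here. `decode_http_request` and the sample bytes are hypothetical.

```python
def decode_http_request(payload: bytes):
    """Decode the request line and headers from a captured TCP payload.

    Returns None when the payload does not look like an HTTP/1.x request.
    """
    # Keep only the head: everything before the blank line that ends the headers.
    head, _, _ = payload.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        return None
    method, path, version = parts
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:  # skip malformed header lines without a colon
            headers[name.strip().lower()] = value.strip()
    return {"method": method, "path": path, "version": version, "headers": headers}

# Made-up payload for a non-page hit (an image request).
captured = (b"GET /images/logo.png HTTP/1.1\r\n"
            b"Host: www.example.com\r\n"
            b"User-Agent: Mozilla/5.0\r\n\r\n")
request = decode_http_request(captured)
print(request["method"], request["path"], request["headers"]["host"])
```

Because the request is read off the wire, no server log and no page tag is involved, and the tracked web server does no extra work.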
  • 19.
      • Information on client communication disconnects
      • Server-side timing information
      • Website content delivery information
      • Full spectrum of hits, including non-pages
      • Copes with proxy or browser caching
      • Data on robots and automated agents available
      • Website content time-to-serve
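Time-to-serve and disconnect information can both be derived from packet timestamps. The sketch below assumes the sniffer has already reduced traffic to `(timestamp, connection_id, kind)` events, with one request per connection; the event format, `time_to_serve`, and the sample values are assumptions for illustration.

```python
def time_to_serve(events):
    """Compute per-connection time-to-serve from sniffed events.

    events: iterable of (timestamp_seconds, connection_id, kind) tuples, where
    kind is "request" or "response_end". Returns (durations, unfinished):
    durations maps connection_id to seconds; unfinished lists connections
    whose request never saw a completed response (e.g. a client disconnect).
    """
    started = {}
    durations = {}
    for timestamp, conn, kind in sorted(events):  # process in time order
        if kind == "request":
            started[conn] = timestamp
        elif kind == "response_end" and conn in started:
            durations[conn] = timestamp - started.pop(conn)
    return durations, sorted(started)

# Hypothetical events; connection "c3" never receives a full response.
events = [
    (10.0, "c1", "request"), (10.25, "c1", "response_end"),
    (11.0, "c2", "request"), (11.5, "c2", "response_end"),
    (12.0, "c3", "request"),
]
durations, unfinished = time_to_serve(events)
print(durations, unfinished)
```

A production tool would also handle several requests on one keep-alive connection, which this simplification overwrites.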