A STUDY ON WEB ANALYTICS
WITH REFERENCE TO SELECT SPORTS WEBSITES
A project report submitted to GITAM Institute of Management, GITAM
University in partial fulfillment for the award of degree of
BACHELOR OF BUSINESS ADMINISTRATION
(BUSINESS ANALYTICS)
Submitted by
Y. Bhanu Prakash,
Regd.No: 1214415127
Under the guidance of
Dr. D. Vijaya Geeta
Associate Professor
GITAM INSTITUTE OF MANAGEMENT
GITAM UNIVERSITY
VISAKHAPATNAM
2015-2018
Declaration By Student
I, Y. Bhanu Prakash, Regd.No: 1214415127, hereby declare that the project titled “A study
on web analytics with reference to select sports websites”, submitted to GITAM Institute
of Management, GITAM University, is an original work done by me and is not being
submitted to any other University for award of any degree or diploma.
Y. Bhanu Prakash
Regd.No:1214415127
Certificate By Guide
This is to certify that the project titled “A study on web analytics with reference to select
sports websites” is a project work undertaken by Y. Bhanu Prakash, Regd.No: 1214415127,
under my guidance.
Place:- Visakhapatnam Dr. D. Vijaya Geeta
Date:- Associate Professor
ACKNOWLEDGEMENT
It is a genuine pleasure to express my deep sense of thanks and gratitude to my principal,
Prof. P. Sheela, GITAM Institute of Management, GITAM University, Visakhapatnam,
Andhra Pradesh, for her continuous support and guidance throughout my project. Her
dedication, keen interest and, above all, her overwhelming willingness to help her students
have been mainly responsible for the completion of my work. Her timely advice,
meticulous scrutiny and scholarly guidance have helped me to a very great extent to
accomplish my task.
I also take this moment to thank my guide, Dr. D. Vijaya Geeta, Associate Professor,
GITAM Institute of Management, GITAM University, Visakhapatnam, Andhra Pradesh.
Her prompt inspiration, timely suggestions, kindness, enthusiasm and dynamism
have enabled me to complete my project. I perceive this opportunity as a big milestone
in my career development. I will strive to use the skills and knowledge gained in the best
possible way, and I will continue to work on improving them in order to attain my desired
career objectives. I hope to continue cooperating with all of you in the future.
Y. Bhanu Prakash
CONTENTS
CHAPTER 1:
INTRODUCTION TO ANALYTICS
• DATA ANALYTICS
• WEB ANALYTICS
CHAPTER 2:
PROFILE OF ALEXA.COM
• ALEXA INTERNET
CHAPTER 3:
METHODOLOGY
• RESEARCH PROBLEM
• OBJECTIVES OF THE STUDY
• METHODOLOGY OF THE STUDY
• SCOPE OF THE STUDY
• LIMITATIONS OF THE STUDY
CHAPTER 4:
ANALYSIS AND DATA INTERPRETATION
CHAPTER 5:
OBSERVATIONS AND CONCLUSION
• OBSERVATIONS
• CONCLUSION
• BIBLIOGRAPHY
• ANNEXURE
LIST OF TABLES
Table 1: List Of Top 10 Websites (Global)
Table 2: List Of Top 10 Websites In India
Table 3: List Of Summary Statistics Of The 50 Selected Sports (Cricket) Websites
Table 4: Count Of The Websites With Reference To Their Summary Statistics
LIST OF CHARTS
Chart 1: Percentage Of Visits By Indian User For Top 10 Websites (Global)
Chart 2: Percentage Of Visits By Indian User For Top 10 Websites In India
LIST OF FIGURES
Figure 1: The Websites Which Has Highest Trend During April – July Months
Figure 2: The Websites Which Has Highest Trend During August-October Months
Figure 3: The Websites Which Has Highest Trend During October-December Months
Figure 4: The Websites Which Has Highest Trend During January-March Months
CHAPTER – I
INTRODUCTION TO ANALYTICS
Data Analytics
1. Introduction
Imagine a world without data storage; a place where every detail about a person or
organization, every transaction performed, or every aspect which can be documented is
lost directly after use. Organizations would thus lose the ability to extract valuable
information and knowledge, perform detailed analyses, as well as provide new
opportunities and advantages. Data is an essential part of our lives, and the ability to store
and access such data has become a crucial task which we cannot live without. Anything
ranging from customer names and addresses, to products available, to purchases made, to
employees hired, etc. has become essential for day to day continuity. Data is the building
block upon which any organization thrives.
2. Big Data
The term "Big Data" has recently been applied to datasets that grow so large that they
become awkward to work with using traditional on-hand database management tools.
They are data sets whose size is beyond the ability of commonly used software tools and
storage systems to capture, store, manage, as well as process the data within a tolerable
elapsed time. Big data also refers to databases which are measured in terabytes and above,
and are too complex and large to be effectively used on conventional systems.
Big data sizes are a constantly moving target, currently ranging from a few dozen
terabytes to many petabytes of data in a single data set. Consequently, some of the
difficulties related to big data include capture, storage, search, sharing, analytics, and
visualizing. Today, enterprises are exploring large volumes of highly detailed data so as to
discover facts they didn’t know before. Business benefit can commonly be derived from
analyzing larger and more complex data sets that require real-time or near-real-time
capabilities; however, this leads to a need for new data architectures, analytical methods,
and tools. In this section, we will discuss the characteristics of big data as well as the issues
surrounding storing and analyzing such data.
2.1. Big Data Characteristics
Big data is data whose scale, distribution, diversity, and/or timeliness require the use of
new technical architectures, analytics, and tools in order to enable insights that unlock
new sources of business value. Big data is characterized by three main features: volume,
variety, and velocity. The volume of the data is its size, and how enormous it is. Velocity
refers to the rate with which data is changing, or how often it is created. Finally, variety
includes the different formats and types of data, as well as the different kinds of uses and
ways of analyzing the data.
2.2. Importance of Managing Big Data
There are five broad ways in which using big data can create value. First, big data
can unlock significant value by making information transparent and usable at a much
higher frequency. Second, as organizations create and store more and more
transactional data in digital form, they can collect more accurate and detailed
performance information on everything from product inventories to sick days. This can
expose variability in the data and boost performance. Third, big data
allows a narrower segmentation of customers and therefore much more precisely tailored
products or services to meet their needs and requirements. Fourth, sophisticated
analytics performed on big data can substantially improve decision making. Finally, big
data can also be used to improve the development of the next generation of products and
services. For example, manufacturers are currently using data obtained from sensors
which are embedded in products to create innovative after-sales service offerings such as
proactive maintenance, which are preventive measures that take place before a failure
occurs or is even noticed by the customer.
Nowadays, along with the increasing ubiquity of technology comes the increase in the
amount of electronic data. Only a few years ago, corporate databases tended to be
measured in the range of tens to hundreds of gigabytes. Now, however, multi-terabyte
(TB) or even petabyte (PB) databases have become normal. According to Longbottom,
the World Data Center for Climate (WDCC) stores over 6PB of data overall and the
National Energy Research Scientific Computing Center (NERSC) has over 2.8PB of
available data around atomic energy research, physics projects and so on. These are only a
couple of examples of the enormous amounts of data which must be dealt with nowadays.
Furthermore, even companies such as Amazon are running with databases in the tens of
terabytes, and companies which wouldn’t be expected to have to worry about such
massive systems are dealing with databases with sizes of hundreds of terabytes.
Additionally, other companies with large databases in place include telecom companies
and service providers, as well as social media sites. For telecom companies, just dealing
with log files of all the events happening and call logs can easily build up database sizes.
Moreover, social media sites, even those that are primarily text, such as Twitter or
Facebook, have big enough problems; and sites such as YouTube have to deal with
massively expanding datasets. With such increasing amounts of big data, there arises an
essential need to be able to analyze the datasets. Thus, big data analytics will be discussed
in the subsequent section.
3. Big Data Analytics
Big data analytics is where advanced analytic techniques operate on big data sets.
Analytics based on large data samples reveals and leverages business change. However,
the larger the set of data, the more difficult it becomes to manage. Sophisticated analytics
can substantially improve decision making, minimize risks, and unearth valuable insights
from the data that would otherwise remain hidden. Sometimes decisions do not
necessarily need to be automated, but rather augmented by analyzing huge, entire datasets
using big data techniques and technologies instead of just smaller samples that individuals
with spreadsheets can handle and understand. Therefore, decision making may never be
the same. Some organizations are already making better decisions by analyzing entire
datasets from customers, employees, or even sensors embedded in products. In this
section, we will discuss the data analytics lifecycle, followed by some advanced data
analytics methods, as well as some possible tools and methods for big data analytics in
particular.
3.1. Advanced Data Analytics Methods
With the evolution of technology and the growing volume of data flowing in and out
of organizations daily, there is now a need for faster and more efficient ways of
analyzing such data. Having piles of data on hand is no longer enough to make efficient
decisions at the right time. The acquired data must not only be accurate, consistent, and
sufficient to base decisions upon, but it must also be integrated and subject-oriented,
as well as non-volatile and time-variant. New tools and algorithms have
been designed to aid decision makers in automatically filtering and analyzing these
diverse pools of data.
Data analytics is the process of applying algorithms to sets of data in order to extract
previously unknown, useful, valid, and hidden patterns and information, as well as to
detect important relationships among the stored variables. Analytics has thus had a
significant impact on research and technologies, since decision makers have become more
and more interested in learning from previous data and thereby gaining competitive
advantage.
Nowadays, people don’t just want to collect data, they want to understand the meaning
and importance of the data, and use it to aid them in making decisions. Data analytics
have gained a great amount of interest from organizations throughout the years, and have
been used for many diverse applications. Some of the applications of data analytics
include science, such as particle physics, remote sensing, and bioinformatics, while other
applications focus on commerce, such as customer relationship management, consumer
finance, and fraud detection.
In this section, we will take a look at some of the most common data analytics approaches,
how they can be applied, and which algorithms are frequently used. Three different
approaches will be discussed: association rules, clustering, and decision trees.
3.2. Association Rules
Association rule mining is one of the most popular data analytics tasks for discovering
interesting relations between variables in large databases. It is an approach for pattern
detection which finds the most common combinations of categorical variables.
Association rules show relationships between data items by identifying patterns of their
co-occurrence. Since a great many association rules can be derived from even a tiny
dataset, interest is restricted to those rules that apply to a reasonably large
number of instances and have a reasonably high accuracy on the instances to which they
apply.
Association rule analytics discover interesting correlations between attributes of a
database by using two measures, support and confidence. Support is the probability that
two different attributes occur together in a single event, or the frequency of occurrence,
while confidence is the probability that when one attribute occurs, the other will also
occur in the same event. Association rules are normally used in business applications to
determine the items which are usually purchased together. An example of an association
rule would be the statement that people who buy cars also buy CDs 80% of the time,
written as Car → CD. In this case the two attributes being associated are the car and the
CD, the confidence value is 80%, and the support value is the proportion of transactions
in the database in which both a car and a CD were bought together. If a rule meets the
minimum support threshold, it is considered a frequent rule, while rules which meet both
the minimum support and the minimum confidence thresholds are considered strong rules.
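The two measures can be illustrated with a minimal Python sketch. The ten transactions below are invented purely for illustration (they are not data from this study), and the function names `support` and `confidence` are hypothetical helpers; the data is chosen so that the rule Car → CD comes out with the 80% confidence used in the example above.

```python
# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"car", "cd"},
    {"car", "cd"},
    {"car", "cd", "insurance"},
    {"car", "cd"},
    {"car", "insurance"},
    {"cd"},
    {"cd", "book"},
    {"book"},
    {"insurance"},
    {"book", "insurance"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing `antecedent`, the fraction that also
    contain `consequent`. Returns 0.0 if the antecedent never occurs."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = support(set(antecedent) | set(consequent), transactions)
    return both / base


rule_support = support({"car", "cd"}, transactions)
rule_confidence = confidence({"car"}, {"cd"}, transactions)
print(f"support(Car -> CD)    = {rule_support:.2f}")
print(f"confidence(Car -> CD) = {rule_confidence:.2f}")
```

In this toy data, a car and a CD appear together in 4 of the 10 transactions and a car appears in 5, so the rule has a support of 0.40 and a confidence of 0.80.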
One of the most common algorithms for association rule analytics is the Apriori
algorithm. Like most association rule algorithms, it splits the problem into two major
tasks. The first task is frequent itemset generation, in which the objective is to find all the
itemsets which satisfy the minimum support threshold and are thus frequent itemsets. The
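formula for calculating support is:

$$\text{support}(A \rightarrow B) = \frac{\sigma(A \cup B)}{N}$$

where $\sigma(X)$ is the number of transactions that contain every item of the itemset $X$, and $N$ is the total number of transactions.
The second task is rule generation, in which the objective is to extract the high confidence,
or strong, rules from the previously found frequent itemsets. The formula for calculating
confidence is:

$$\text{confidence}(A \rightarrow B) = \frac{\sigma(A \cup B)}{\sigma(A)}$$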
Since the first step is computationally expensive and requires the generation of all
combinations of itemsets, the Apriori algorithm provides a principle for guiding itemset
generation and reducing computational requirements.
The Apriori principle states that every subset of a frequent itemset must also be frequent.
Conversely, if an itemset is not frequent, none of its supersets can be frequent, so it is
discarded and is not used to generate larger itemsets. The algorithm uses a breadth-first
search strategy and a tree structure to count candidate itemsets efficiently.
Each level in the tree contains all the k-itemsets, where k is the number of items in the
itemset. For example level 1 contains all 1-itemsets, level 2 all 2-itemsets, and so forth.
Instead of ending up with so many itemsets through all possible combinations of items,
the Apriori algorithm only considers the frequent itemsets. So in the first level, the
algorithm calculates the support of each itemset. Frequent itemsets which pass the
minimum support are taken to the next level, and all possible 2-itemset combinations are
made only out of these frequent sets, while all others are discarded. Finally, rules are
extracted from the frequent itemsets in the form of A → B (if A then B). The confidence
for each rule is calculated, and rules which pass the minimum confidence are taken as
strong rules.
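The level-wise procedure described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions and not an optimized implementation: it recounts support by scanning all transactions instead of using a tree structure, and the basket data, the thresholds, and the function names `apriori` and `strong_rules` are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset (level-wise search)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support_of(candidate):
        return sum(1 for t in transactions if candidate <= t) / n

    # Level 1: keep only the frequent 1-itemsets.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        candidate = frozenset([item])
        s = support_of(candidate)
        if s >= min_support:
            current[candidate] = s

    frequent = dict(current)
    k = 2
    while current:
        previous = list(current)
        # Join step: combine frequent (k-1)-itemsets into k-itemset candidates.
        candidates = {a | b for a in previous for b in previous if len(a | b) == k}
        # Prune step (Apriori principle): every (k-1)-subset must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support_of(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def strong_rules(frequent, min_confidence):
    """Extract rules A -> B from frequent itemsets that meet min_confidence."""
    rules = []
    for itemset, sup in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(itemset, size):
                a = frozenset(antecedent)
                conf = sup / frequent[a]  # subsets of frequent itemsets are frequent
                if conf >= min_confidence:
                    rules.append((a, itemset - a, conf))
    return rules


# Hypothetical market baskets, invented for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent = apriori(baskets, min_support=0.6)
rules = strong_rules(frequent, min_confidence=0.8)
for antecedent, consequent, conf in rules:
    print(set(antecedent), "->", set(consequent), f"confidence={conf:.2f}")
```

With a minimum support of 0.6, four single items and four pairs are frequent in this toy data, and no 3-itemset survives. With a minimum confidence of 0.8, the only strong rule is beer → diapers.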
3.3. Clustering
Data clustering is a technique which uses unsupervised learning, or in other words
discovers unknown structures. Clustering is the process of grouping sets of objects
together into classes based on similarity measures and the behavior of the group. Instances
within the same group are similar to each other, and are dissimilar to instances in other
groups. Clustering is similar to classification in that it groups data into classes; however
the main difference is that clustering is unsupervised, and the classes are defined by the
data alone, hence they are not predefined. Therefore, data to be analyzed is not compared
to a model built from training data, but is rather compared to other data and clustered
according to the level of similarity between them. Several representations of clusters are
depicted.
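As an illustration of grouping by similarity without predefined classes, the following is a minimal sketch of k-means, one common clustering algorithm. The text above does not name a specific algorithm, so k-means is chosen here only as an example; the sample points, the Euclidean distance measure, and the choice of the first k points as starting centroids are all assumptions made for this sketch.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Group points into k clusters; returns (centroids, clusters)."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                new_centroids.append(
                    tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical two-dimensional observations, invented for illustration only.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 1.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

No class labels are supplied: the two groups emerge only from the distances between the points themselves, which is the sense in which clustering is unsupervised.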
3.4. Decision Trees
Another type of data analytics technique is the decision tree. Decision trees are used as
predictive models to map observations about an item to conclusions about the item’s
target value. A decision tree is a hierarchical structure of nodes and directed
edges which consists of three types of nodes. The root node is a node with no incoming
edges and zero or more outgoing edges to other nodes. An internal node is a node in the
middle levels of the tree, and has exactly one incoming edge and two or more outgoing
edges. Finally, a leaf node has exactly one incoming edge and no outgoing edges, and is
assigned a class label which provides the decision of the tree.
Each of the tree’s nodes specifies a test of a certain attribute of the instance, and each
descending branch from the node corresponds to one of the attribute’s possible values. An
instance is classified by moving down the tree by starting at the root node, testing the
attribute specified by that node, and moving down the branch which corresponds to the
value of the given attribute to a new node. The same process is repeated at that node, until
a leaf node providing a decision is finally reached.
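The classification procedure described above can be sketched in Python as follows. The tree is built by hand rather than learned from data, and the attributes (`outlook`, `humidity`, `windy`), their values, and the class labels are hypothetical examples used only to show the traversal from the root to a leaf.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    # Attribute tested at a root or internal node (None for a leaf).
    attribute: Optional[str] = None
    # One outgoing branch per possible value of the tested attribute.
    children: Dict[str, "Node"] = field(default_factory=dict)
    # Class label, set only on leaf nodes.
    label: Optional[str] = None


def classify(node: Node, instance: Dict[str, str]) -> str:
    """Walk from the root to a leaf, following the branch that matches
    the instance's value for each tested attribute."""
    while node.label is None:
        value = instance[node.attribute]
        node = node.children[value]
    return node.label


# Hypothetical hand-built tree deciding whether a match is played.
tree = Node(attribute="outlook", children={
    "sunny": Node(attribute="humidity", children={
        "high": Node(label="no"),
        "normal": Node(label="yes"),
    }),
    "overcast": Node(label="yes"),
    "rainy": Node(attribute="windy", children={
        "true": Node(label="no"),
        "false": Node(label="yes"),
    }),
})

print(classify(tree, {"outlook": "sunny", "humidity": "normal", "windy": "false"}))
```

For this instance, the root tests `outlook` and follows the "sunny" branch, the internal node tests `humidity` and follows the "normal" branch, and the leaf reached there supplies the decision.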
4. Big Data Analytics Tools and Methods
Big data is too large to be handled by conventional means, and the larger the data grows,
the more organizations purchase more powerful hardware and computational resources.
However, the data keeps on growing and performance needs increase, but the available
resources have a maximum capacity and capability. The MapReduce paradigm is based on
adding more computers or resources, rather than increasing the power or storage capacity
of a single computer; in other words, scaling out rather than scaling up. The fundamental
idea of MapReduce is breaking a task down into stages and executing the stages in parallel
in order to reduce the time needed to complete the task.
MapReduce is a parallel programming model which is suitable for big data processing.
Hadoop is a concrete platform which implements MapReduce. In
MapReduce, data is split into distributable chunks, which are called shards. The steps to
process those chunks are defined, and the big data processing is run in parallel on the
chunks. This model is scalable, in that the bigger the data processing becomes, or the
more computational resources are required, the more machines can be added to
process the chunks.
The first phase of the MapReduce job is to map input values to a set of key/value pairs as
output. Thus, unstructured data such as text can be mapped to a structured key/value pair,
where, in this case, the key could be the word in the text and the value is the number of
occurrences of the word. This output is then the input to the "Reduce" function. Reduce
then performs the collection and combination of this output. For example, assume we have
millions of text documents and would like to count the occurrences of a certain word. The
text documents would be divided among several workers, or machines, which perform
parallel processing. These workers act as mappers, each mapping the desired word to the
number of its occurrences in the text documents given to that worker. The
reducers then aggregate these counts, thus giving the total count across the millions of text
documents.
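The word-count example can be sketched in plain Python to show the shape of the map and reduce steps. This is a single-machine simulation and does not use Hadoop or any real MapReduce API; the function names `map_phase`, `shuffle`, and `reduce_phase`, and the sample documents, are hypothetical. In a real cluster each call to `map_phase` would run on a different worker in parallel.

```python
from collections import Counter, defaultdict


def map_phase(document):
    """Mapper: emit (word, count) pairs for one chunk of text."""
    counts = Counter(word.lower() for word in document.split())
    return list(counts.items())


def shuffle(pairs):
    """Group all emitted values by key before they reach the reducers."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: combine the partial counts for one word."""
    return key, sum(values)


# Hypothetical documents, invented for illustration only.
documents = [
    "cricket is a sport",
    "cricket fans follow cricket scores",
    "web analytics for sport websites",
]

# Each document stands in for the chunk handed to one worker.
mapped = [pair for doc in documents for pair in map_phase(doc)]
word_counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(word_counts["cricket"])
```

The mappers never see each other's documents; only the grouped partial counts reach the reducers, which is why the work can be spread over as many machines as needed.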
Hadoop is a framework for performing big data analytics which provides reliability,
scalability, and manageability by providing an implementation for the MapReduce
paradigm as well as gluing the storage and analytics together. Hadoop consists of two
main components: the Hadoop Distributed File System (HDFS) for the big data storage,
and MapReduce for big data analytics. The HDFS storage function provides a redundant
and reliable distributed file system which is optimized for large files. Data is stored in
replicated file blocks across the multiple Data Nodes, and the Name Node acts as a
regulator between the client and the Data Node, directing the client to the particular Data
Node which contains the requested data. Additionally, the data processing and analytics
functions are performed by MapReduce, which consists of a Java API as well as software
to implement the services which Hadoop needs to function.
The MapReduce function within Hadoop depends on two different nodes: the Job Tracker
and the Task Tracker nodes. The Job Tracker nodes are the ones which are responsible for
distributing the Mapper and Reducer functions to the available Task Trackers, as well as
monitoring the results. On the other hand, the Task Tracker nodes actually run the jobs
and communicate results back to the Job Tracker. That communication between nodes is
often through files and directories in HDFS so inter-node communication is minimized.
5. Big Data Challenges
Several issues will have to be addressed in order to capture the full potential of big data.
Policies related to privacy, security, intellectual property, and even liability all need to be
addressed in a big data world. Organizations need to put the right talent and technology in
place, as well as additionally structure workflows and incentives to optimize the use of big
data. Access to data is critical, and companies will need to increasingly integrate
information from multiple data sources, often from third parties or different locations.
Furthermore, questions on how to store and analyze data with volume, variety, and
velocity have arisen, and current research lacks the capability for providing an answer.
Consequently, the biggest problem has become not only the sheer volume of data, but the
fact that the type of data companies must deal with is changing. In order to accommodate
for the change in data, the approaches for storing data have changed throughout the years.
Data storage started with data warehouses, data marts, data cubes, and then moved on to
master data management, data federation and other techniques such as in-memory
databases. However, database suppliers are still struggling to cope with enormous
amounts of data, and the emergence of interest in big data has led to a need for storing and
managing such large amounts of data.
Web Analytics
Web analytics is the reporting and analysis of data on website visitor activity. It is not only a
tool to measure web traffic but can also be used as a tool for business and market research.
It covers techniques used to assess and improve the contribution of e-marketing to a business, such
as referrals, click streams, online research data, customer satisfaction surveys, and leads
and sales. Marketers explore web analytics data and reports to build their
knowledge of customers' preferences and behavior, such as which types of sites they visit and
which areas they click most often when they are online. It also helps marketers understand their
customers better and improve their business performance.
There are three stages to consider when setting up a web analytics tool. Analysis
is the ticket for moving from Setupland to Actionland: it is the isolating of
meaningful and actionable insights in data and reports that, when acted upon by the
organization, can drive business value.
Alignment Stage: At this early planning stage, marketers need to gather
their business objectives and define an online measurement strategy that captures
stakeholders' online behavior. A clearly understood measurement strategy and careful
analysis of visitors are critical to success. Marketers therefore have to handle carefully
the relevant and meaningful data which will directly affect the business in the long term.
Collection Stage: At this stage, large companies may spend a considerable amount of time
on technical implementation, for example across multiple web domains and online marketing
initiatives.
Reporting Stage: This is the last stage in moving from Setupland to
Actionland. At this stage, reports are created and distributed to the
organization using a manual or, preferably, automated approach.
TOOLS AND METHODS USED TO HELP MARKETERS:-
There are two types of web analytics, on-site and off-site web analytics.
ON-SITE ANALYTICS
On-site web analytics is used by marketers to measure a visitor's activity while he or she
browses the website. This includes its drivers and conversions, for example which
ads on a landing page encourage more people to purchase and which information titles
visitors click most. This data is used to analyze visitors' online behavior and can be used
to improve the audience response to a website or marketing campaign.
Simply put, on-site web analytics tools are used to analyze and measure the visitor's
journey and the actual visitor traffic arriving on the website: for example, which landing
pages encourage visitors to make a purchase, how visitors reached the site (through a link
from a search engine or by coming there directly), and how much time they spent
on a given page. On-site web analytics therefore measures the performance of a website in a commercial
context.
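As a rough illustration of such on-site measures, the sketch below computes the conversion rate per landing page and the average time spent per page from a handful of session records. The record layout (`landing_page`, `source`, `page_views`, `purchased`), the page names, and the numbers are all hypothetical; real analytics tools collect and store this data in their own formats.

```python
from collections import defaultdict

# Hypothetical session records, invented for illustration only.
# page_views holds (page, seconds spent on that page) in visit order.
sessions = [
    {"landing_page": "/home", "source": "search",
     "page_views": [("/home", 30), ("/shop", 60)], "purchased": True},
    {"landing_page": "/home", "source": "direct",
     "page_views": [("/home", 10)], "purchased": False},
    {"landing_page": "/offers", "source": "search",
     "page_views": [("/offers", 20), ("/shop", 40)], "purchased": True},
    {"landing_page": "/home", "source": "direct",
     "page_views": [("/home", 20)], "purchased": False},
]


def conversion_rate_by_landing_page(sessions):
    """Share of sessions starting on each landing page that ended in a purchase."""
    visits = defaultdict(int)
    purchases = defaultdict(int)
    for session in sessions:
        visits[session["landing_page"]] += 1
        if session["purchased"]:
            purchases[session["landing_page"]] += 1
    return {page: purchases[page] / visits[page] for page in visits}


def average_time_on_page(sessions):
    """Mean number of seconds spent on each page across all sessions."""
    total_seconds = defaultdict(float)
    views = defaultdict(int)
    for session in sessions:
        for page, seconds in session["page_views"]:
            total_seconds[page] += seconds
            views[page] += 1
    return {page: total_seconds[page] / views[page] for page in total_seconds}


conversion = conversion_rate_by_landing_page(sessions)
time_on_page = average_time_on_page(sessions)
for page, rate in conversion.items():
    print(f"{page}: {rate:.0%} of sessions converted")
```

In this toy data, one of the three sessions that started on `/home` ended in a purchase, while the single session that started on `/offers` converted, which is the kind of comparison a marketer would use to judge landing pages.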
For businesses, the website has become more important than ever before, as it handles more
information. Companies also need to know whether their internet-based marketing campaigns are
working.
OFF-SITE ANALYTICS
Off-site analytics data can be obtained for any website, including those of your competitors and
partners. It means analyzing the internet as a whole rather than a single website. The key
off-site web analytics measures are your potential audience (opportunity),
share of voice (visibility), and buzz (comments).
On-site web analytics, by contrast, only captures what happens when visitors visit and
engage with your website, using various technologies to monitor and analyze the
website in order to produce meaningful actions and results. However, as the social web becomes a
more popular and ascendant channel for internet users, and everything becomes more
transparent on it, information about organizations is shared and spread there. Through
this platform, marketers are able to measure the latest buzz about a website or organization.
It is important for marketers to monitor not only what happens on the website but also
what happens outside it, learning from what other people are saying about the company
and providing products and services that match customers' requirements. Off-site web analytics
solutions can help businesses stay on the leading edge of overall trends.
CHAPTER- II
PROFILE OF ALEXA.COM
Alexa Internet
Alexa Internet, Inc. is a California-based company that provides commercial web traffic
data and analytics. It is a wholly owned subsidiary of Amazon.com.
Founded as an independent company in 1996, Alexa was acquired by Amazon in 1999.
Its toolbar collects data on browsing behavior and transmits them to the Alexa website,
where they are stored and analyzed, forming the basis for the company's web traffic
reporting. According to its website, Alexa provides traffic data, global rankings and other
information on 30 million websites, and as of 2015 its website is visited by over 6.5
million people monthly.
Operation & History
Alexa Internet was founded in April 1996 by American web entrepreneurs Brewster Kahle
and Bruce Gilliat. The company's name was chosen in homage to the Library of
Alexandria of Ptolemaic Egypt, drawing a parallel between the largest repository of
knowledge in the ancient world and the potential of the Internet to become a similar store
of knowledge.
Alexa initially offered a toolbar that gave Internet users suggestions on where to go next,
based on the traffic patterns of its user community. The company also offered context for
each site visited: to whom it was registered, how many pages it had, how many other sites
pointed to it, and how frequently it was updated. Alexa's operations grew to include
archiving of web pages as they are crawled. This database served as the basis for the
creation of the Internet Archive accessible through the Wayback Machine. In 1998, the
company donated a copy of the archive, two terabytes in size, to the Library of Congress.
Alexa continues to supply the Internet Archive with Web crawls.
In 1999, as the company moved away from its original vision of providing an "intelligent"
search engine, Alexa was acquired by Amazon.com for approximately US$250 million in
Amazon stock. Alexa began a partnership with Google in early 2002 and with the web
directory DMOZ in January 2003. In May 2006, Amazon replaced Google with Bing (at
the time known as Windows Live Search) as a provider of search results. In December
2006, Amazon released Alexa Image Search. Built in-house, it was the first major
application built on the company's Web platform.
In December 2005, Alexa opened its extensive search index and Web-crawling facilities
to third party programs through a comprehensive set of web services and APIs. These
could be used, for instance, to construct vertical search engines that could run on Alexa's
own servers or elsewhere. In May 2007, Alexa changed their API to limit comparisons to
three websites, reduce the size of embedded graphs in Flash, and add mandatory
embedded BritePic advertisements.
On November 27, 2008, Amazon announced that Alexa Web Search was no longer
accepting new customers, and that the service would be deprecated or discontinued for
existing customers on January 26, 2009. Thereafter, Alexa became a purely analytics-
focused company.
On March 31, 2009, Alexa launched a major website redesign. The redesigned site
provided new web traffic metrics—including average page views per individual user,
bounce rate, and user time on site. In the following weeks, Alexa added more features,
including visitor demographics, click stream and search traffic statistics. Alexa introduced
these new features to compete with other web analytics services.
Toolbar
Alexa ranks sites based primarily on tracking a sample set of internet traffic—users of its
toolbar for the Internet Explorer, Firefox and Google Chrome web browsers. The Alexa
Toolbar includes a popup blocker, a search box, links to Amazon.com and the Alexa
homepage, and the Alexa ranking of the site that the user is visiting. It also allows the user
to rate the site and view links to external, relevant sites. In early 2005, Alexa stated that
there had been 10 million downloads of the toolbar, though the company did not provide
statistics about active usage.
Originally, web pages were only ranked amongst users who had the Alexa Toolbar
installed, and could be biased if a specific audience subgroup was reluctant to take part in
the rankings. This caused some controversies over how representative Alexa's user base
was of typical Internet behavior, especially for less-visited sites.
Until 2007, a third-party-supplied plugin for the Firefox Browser served as the only option
for Firefox users after Amazon abandoned its A9 toolbar. On July 16, 2007, Alexa
released an official toolbar for Firefox called Sparky.
On 16 April 2008, many users reported drastic shifts in their Alexa rankings. Alexa
confirmed this later in the day with an announcement that they had released an updated
ranking system, claiming that they would now take into account more sources of data
"beyond Alexa Toolbar users".
Certified Statistics
Using the Alexa Pro service, website owners can sign up for "certified statistics," which
gives Alexa more access to a site's traffic data. Site owners add JavaScript code to each
page of their website that, if permitted by the user's security and privacy settings, runs and
sends traffic data to Alexa. This allows Alexa to display more accurate statistics, such as
total pageviews and unique pageviews, or to withhold them, depending on the owner's preference.
CHAPTER – III
METHODOLOGY
RESEARCH PROBLEM:- Study on web analytics with reference to select sports websites.
OBJECTIVES OF THE STUDY:-
The objectives of the study are:
• To find the sports (cricket) websites with the highest number of visitors in India.
• To know the rank of each website in India as well as globally.
• To know the bounce rate, daily pageviews per visitor and daily time on site of the
selected websites.
METHODOLOGY OF THE STUDY:-
The top 50 sports websites related to cricket are taken for the study. Website traffic
rank is a combined measure of page views and users. So, although a website may have more
reach, i.e., a larger number of users visiting it, its rank may differ depending on the
number of unique pages visited on the website.
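The effect of combining reach and page views into one rank can be illustrated with a minimal sketch. Alexa does not publish its exact formula, so the geometric mean used as the score here is only an assumption, and the site names and figures are hypothetical.

```python
from math import sqrt

def traffic_rank(sites):
    """Rank sites by a combined score of reach and page views.

    `sites` maps a site name to (daily_visitors, daily_pageviews).
    The geometric mean used as the score is an assumption for
    illustration, not Alexa's published method.
    """
    scores = {name: sqrt(visitors * pageviews)
              for name, (visitors, pageviews) in sites.items()}
    # Highest combined score gets rank 1.
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}

# Hypothetical figures: site_b has more reach, site_a has more page views.
example = {
    "site_a": (80_000, 400_000),
    "site_b": (100_000, 150_000),
}
ranks = traffic_rank(example)
print(ranks)
```

In this sketch site_b has more visitors, yet site_a is ranked higher because its visitors view many more pages.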
The main aim of the study is to identify the most popular cricket website in India as
well as globally, with reference to the alexa.com website, and to know the bounce rate,
daily pageviews per visitor and daily time on site of the selected websites.
The metrics used to measure the popularity of the websites are as follows:
• Bounce Rate:- The percentage of visitors to a particular website who navigate
away from the site after viewing only one page. A rising bounce rate is usually a sign
that the landing page is boring or off-putting.
• Daily Pageviews Per Visitor:- The average number of pages viewed by each
visitor to the website per day. When this number is higher, the website is
considered to have more engaging information.
• Daily Time On Site:- The average number of minutes spent on the website by
each visitor per day. As with "Daily Pageviews Per Visitor", when this number is
higher, the website is considered to have more engaging information.
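The three metrics can be worked out from a simple visit log, as in the sketch below. The record layout (visitor id, pages viewed, seconds on site) and the sample log are hypothetical, and bounce rate is counted per visit here, which is one common convention and an assumption about how the data would be recorded.

```python
from collections import defaultdict

def site_metrics(sessions):
    """Compute the three metrics from one day's visit records.

    Each record is (visitor_id, pages_viewed, seconds_on_site).
    """
    pages = defaultdict(int)
    seconds = defaultdict(int)
    bounces = 0
    for visitor, pages_viewed, secs in sessions:
        pages[visitor] += pages_viewed
        seconds[visitor] += secs
        if pages_viewed == 1:  # a single-page visit counts as a bounce
            bounces += 1
    visitors = len(pages)
    return {
        "bounce_rate_pct": 100 * bounces / len(sessions),
        "pageviews_per_visitor": sum(pages.values()) / visitors,
        "minutes_per_visitor": sum(seconds.values()) / visitors / 60,
    }

# Hypothetical log: four visits made by three visitors.
log = [("u1", 1, 30), ("u2", 4, 240), ("u1", 3, 150), ("u3", 1, 120)]
metrics = site_metrics(log)
print(metrics)
```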
SCOPE OF THE STUDY:-
The Web traffic data for this study is collected from alexa.com, which estimates traffic
from a sample of Internet users, chiefly the users of its browser toolbar.
Only the top 50 sports (cricket) websites as ranked by alexa.com as on 31st May, 2016 are
taken for the study.
LIMITATIONS OF THE STUDY:- The study is limited to the sites and data given on the
alexa.com website, and only to the top 50 sports (cricket) websites. The rank of each
website is taken only globally and in India; ranks in other countries are not considered
in the study.
CHAPTER – IV
ANALYSIS AND INTERPRETATION
In this section, the results are shown along with their interpretation.
Table 1 and Chart 1 present the top 10 websites by global rank as on 31st May, 2016,
together with their rank in India and percentage of visitors in India. Table 2 and
Chart 2 present the top 10 websites by rank in India. These tables also include the
bounce rate, daily pageviews per visitor and daily time on site of the selected sports
(cricket) websites.
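Selecting the top websites by either rank can be sketched as below. The dictionary layout and the use of None for a rank that is not reported are assumptions made for this illustration; the four rows are taken from the annexure.

```python
def top_n(sites, key, n=10):
    """Return the n best-ranked sites for `key`, skipping sites with no rank."""
    ranked = [s for s in sites if s.get(key) is not None]
    # A smaller rank number means a more popular site.
    return sorted(ranked, key=lambda s: s[key])[:n]

# A few rows from the annexure; None marks a rank that is not reported.
sites = [
    {"site": "Espncricinfo.com", "global_rank": 370, "india_rank": 75},
    {"site": "Cricbuzz.com", "global_rank": 623, "india_rank": 110},
    {"site": "Bbc.com/sport/cricket", "global_rank": 126, "india_rank": 131},
    {"site": "Msn.com/en-in/sports/cricket", "global_rank": 15, "india_rank": None},
]
by_global = [s["site"] for s in top_n(sites, "global_rank", n=3)]
by_india = [s["site"] for s in top_n(sites, "india_rank", n=3)]
print(by_global)
print(by_india)
```

A site without a rank in India is left out of the India list but can still appear in the global list.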
Table 1 : List Of Top 10 Websites (Global)
S.No  Website                                  Global   Rank In  % Of Visitors  Bounce  Daily Pageviews  Daily Time
                                               Rank     India    In India       Rate    Per Visitor      On Site
1     Https://www.youtube.com/user/CricketICC  2        3        9.3            33.30   13.42            23:42
2     Msn.com/en-in/sports/cricket             15       -        -              43.30   4.19             11:36
3     Bbc.com/sport/cricket                    126      131      7.1            52.20   2.75             4:30
4     Telegraph.co.uk/sport/cricket/           327      293      8.4            71.20   2.28             3:10
5     Espncricinfo.com                         370      75       47.6           33.30   3.44             5:46
6     Cricbuzz.com                             623      110      83.6           27.10   3.52             5:47
7     Skysports.com/cricket                    1110     1035     8.9            44.10   2.89             4:12
8     Iplt20.com                               1522     8028     64.7           27.50   4.01             6:36
9     Smh.com.au/sport/cricket/                1914     4627     4              30.10   2.16             5:32
10    Stuff.co.nz/sport/cricket                3242     7930     4.5            41.00   3.78             7:07
Chart 1 : % Of Visitors In India (pie chart, top 10 websites by global rank)
The above chart shows the percentage of visits by Indian users for the top 10 sports
(cricket) websites by global rank as on 31st May, 2016. Among these websites,
cricbuzz.com has the highest percentage of visitors from India (83.6%), which makes up
35% of the chart. Msn.com/en-in/sports/cricket has the 15th rank globally but does not
have any rank in India because it deals mainly with England cricket.
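The slice sizes in a pie chart of this kind are each website's share of the column total, not the "% of visitors in India" value itself. A minimal sketch of that calculation, using the Table 1 values for the websites that report a percentage (the function name is only illustrative):

```python
def pie_shares(values):
    """Convert raw values into percentage shares of their total."""
    total = sum(values.values())
    return {name: 100 * value / total for name, value in values.items()}

# "% of visitors in India" from Table 1 (Msn.com reports no value).
pct_india = {
    "Https://www.youtube.com/user/CricketICC": 9.3,
    "Bbc.com/sport/cricket": 7.1,
    "Telegraph.co.uk/sport/cricket/": 8.4,
    "Espncricinfo.com": 47.6,
    "Cricbuzz.com": 83.6,
    "Skysports.com/cricket": 8.9,
    "Iplt20.com": 64.7,
    "Smh.com.au/sport/cricket/": 4,
    "Stuff.co.nz/sport/cricket": 4.5,
}
shares = pie_shares(pct_india)
for name, share in shares.items():
    print(f"{name}: {share:.0f}%")
```

With these values the share of cricbuzz.com comes to about 35%, which is the figure shown on its slice, while its actual percentage of visitors from India is 83.6%.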
Table 2 : List Of Top 10 Websites In India
S.No  Website                                  Rank In  % Of Visitors  Global   Bounce  Daily Pageviews  Daily Time
                                               India    In India       Rank     Rate    Per Visitor      On Site
1     Https://www.youtube.com/user/CricketICC  3        9.3            2        33.3    13.42            23:42
2     Espncricinfo.com                         75       47.6           370      33.3    3.44             5:46
3     Cricbuzz.com                             110      83.6           623      27.1    3.52             5:47
4     Bbc.com/sport/cricket                    131      7.1            126      52.2    2.75             4:30
5     Telegraph.co.uk/sport/cricket/           293      8.4            327      71.2    2.28             3:10
6     Skysports.com/cricket                    1035     8.9            1110     44.1    2.89             4:12
7     Icc-cricket.com/cricket-world-cup        2635     57.4           18158    56.5    1.95             2:52
8     Cricket.com.au                           4249     37.3           14784    53.9    2.23             3:39
9     Smh.com.au/sport/cricket/                4627     4              1914     30.1    2.16             5:32
10    Bcci.tv                                  4955     83.5           71275    45.5    2.59             2:49
The above chart shows the percentage of visits by Indian users for the top 10 sports
(cricket) websites by rank in India as on 31st May, 2016. Among the top 10 websites in
India, cricbuzz.com and bcci.tv have the highest percentages of visitors from India
(83.6% and 83.5%), each making up about 24% of the chart. Smh.com.au gets a lower
percentage of visitors from India as it deals mainly with Australian cricket.
SUMMARY STATISTICS
Table 3 : List Of Summary Statistics Of The 50 Selected Sports (Cricket) Websites
                    Global   Rank In  % Of Visitors  Bounce  Daily Pageviews  Daily Time
                    Rank     India    In India (%)   Rate    Per Visitor      On Site
Maximum             773118   256703   92.9           75      24               42.18
Minimum             2        3        1              4       1.16             1.11
Range               773116   256700   91.9           71      22.84            41.16
Mean                221274   70601    34.34          43.73   3.72             6.61
Median              172102   44956    21.5           42.8    2.78             3.83
Standard Deviation  220738   71308    29.51          15.22   3.62             8.29
The above table gives the summary statistics of the data. The following inferences can be
drawn from the table:
• Websites whose bounce rate is lower than the mean bounce rate are said to be more
popular sites.
• Websites whose daily pageviews per visitor are greater than the mean are said to be
more popular sites.
• Similarly, websites whose daily time on site is greater than the mean are said to be
more popular sites.
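Summary statistics of this kind can be reproduced with Python's standard library, as sketched below. The ten bounce rates from Table 1 are used only as a small sample, the choice of the sample standard deviation is an assumption, and the mm:ss helper is one possible way of preparing the time column before averaging.

```python
import statistics

def mmss_to_minutes(text):
    """Convert a 'mm:ss' string to minutes as a float."""
    minutes, seconds = text.split(":")
    return int(minutes) + int(seconds) / 60

def summarize(values):
    """Return the summary statistics listed in Table 3 for one column."""
    return {
        "maximum": max(values),
        "minimum": min(values),
        "range": max(values) - min(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Sample standard deviation; the report does not say which form it used.
        "stdev": statistics.stdev(values),
    }

# Bounce rates of the ten websites in Table 1, used only as a small sample.
bounce = [33.3, 43.3, 52.2, 71.2, 33.3, 27.1, 44.1, 27.5, 30.1, 41.0]
stats = summarize(bounce)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```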
Table 4 : Count Of The Websites With Reference To Their Summary Statistics
                                 Global   Rank In  % Of Visitors  Bounce  Daily Pageviews
                                 Rank     India    In India (%)   Rate    Per Visitor
Greater Than Mean                19       16       16             24      14
Lesser Than Mean                 31       23       23             26      36
Greater Than Median              25       19       19             25      25
Lesser Than Median               25       19       19             25      25
Greater Than Standard Deviation  19       16       17             49      14
Lesser Than Standard Deviation   31       23       22             1       36
From the above table we can say that 38% (19) of the 50 websites have a global rank value
greater than the mean, and 32% (16) have a rank in India and a percentage of visitors in
India greater than the mean. 52% (26) of the websites have a bounce rate lower than the
mean, which means that visitors to these websites go deeper into the website for more
information. Only 28% (14) of the websites have daily pageviews per visitor greater than
the mean. So, even though the bounce rate is good for many websites, daily pageviews per
visitor is not so good for most of them.
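The counts and percentages of this kind can be obtained as in the following sketch. The ten daily pageviews per visitor values from Table 1 are used only as a small sample, and the function names are illustrative.

```python
import statistics

def split_by_mean(values):
    """Count how many values fall above and below the mean."""
    mean = statistics.mean(values)
    above = sum(1 for v in values if v > mean)
    below = sum(1 for v in values if v < mean)
    return above, below

def share_pct(count, total):
    """Express a count as a percentage of the total number of websites."""
    return 100 * count / total

# Daily pageviews per visitor of the ten websites in Table 1 (sample only).
pageviews = [13.42, 4.19, 2.75, 2.28, 3.44, 3.52, 2.89, 4.01, 2.16, 3.78]
above, below = split_by_mean(pageviews)
print(above, below)
print(share_pct(14, 50))  # share of the 50 websites above the mean in Table 4
```

One very large value pulls the mean up, so most websites fall below it. This is why far fewer websites are above the mean than above the median for daily pageviews per visitor.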
INTERPRETATION OF WEBSITES CATEGORIZED BY THE TREND
Category 1 : The Websites Which Have Their Highest Trend During April – July Months
1. Espncricinfo.com 2. Cricbuzz.com
3. Msn.com/en-in/sports/cricket 4. Icc-cricket.com/cricket-world-cup
5. Iplt20.com 6. Pakpassion.net
7. Batsman.com 8. Cricketnmore.com
9. Caribbeancricket.com 10. Cricwaves.com
11. Cricketworld.com 12. Cricketweb.net
13. Pcb.com.pk 14. Lastmanstands.com
15. Cricketweb.net/forum/ 16. Kkr.in
17. Cricket365.com 18. Cricruns.com
19. www.youtube.com/user/CricketICC 20. Windiescricket.com
21. Royalchallengers.in 22. Cricketfresh.in
From the above table, we can say that all the websites have the highest point of their
trend line during the April – July months. Many of the sites are from India, and the peak
is because of the IPL season. Some sites of other countries, such as msn.com,
caribbeancricket.com and windiescricket.com, also have their peak during April because
it was the time of the T20 World Cup playoffs.
Category 2 : The Websites Which Have Their Highest Trend During August – October Months
1. Ecb.co.uk 2. Skysports.com/cricket
3. Islandcricket.lk 4. Lords.org
5. Foxsports.com.au/cricket/ 6. Kiaoval.com
7. Yorkshireccc.com 8. Lccc.co.uk
9. Telegraph.co.uk/sport/cricket 10. Srilankacricket.lk
11. Mcc.org.au 12. Glamorgancricket.com
13. Smh.com.au/sport/cricket 14. Middlesexccc.com
15. Kentcricket.co.uk 16. Wccc.co.uk
17. Indiancricketfans.com 18. Islandcricket.lk
The sites in the above table have the highest point of their trend line during the months
of August – October. Most of the sites in this category belong to England cricket. Since
it was the time of county cricket and England played most of their matches during that
time, these sites had their peak. Two sites, namely srilankacricket.lk and
indiancricketfans.com, also have their peak during those months because their countries
played some international cricket during that time.
Category 3 : The Websites Which Have Their Highest Trend During October – December Months
1. Bbc.com/sport/cricket 2. Supersport.com/cricket/
3. Sport24.co.za/Cricket/ 4. Stuff.co.nz/sport/cricket
5. Iol.co.za/sport/cricket/ 6. Cricbay.com
The sites in the above table have the highest point of their trend line during the months
of October – December. Most of the above websites belong to Australia and New Zealand.
These sites have their peak during these months because it was the time of the Big Bash
League.
Category 4 : The Websites Which Have Their Highest Trend During January – March Months
1. Cricket.com.au 2. Cricketfansforum.net
3. Cricket.co.za 4. Bcci.tv
From the above table, we can say that all the websites have the highest point of their
trend line during the January – March months. These websites belong to India. Since
India had played more international cricket during this period, these sites had their peak.
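In the study the categories were assigned by looking at the trend line graph of each website. The same idea can be sketched in code as below, assuming a monthly traffic index is available for each website. The rule of choosing the window with the highest average, and the sample values, are assumptions for illustration; averaging is used because two of the windows share October.

```python
# Month numbers covered by each of the four categories.
WINDOWS = {
    "April - July": [4, 5, 6, 7],
    "August - October": [8, 9, 10],
    "October - December": [10, 11, 12],
    "January - March": [1, 2, 3],
}

def trend_category(monthly_traffic):
    """Pick the category whose months have the highest average traffic.

    `monthly_traffic` maps a month number (1-12) to any traffic measure.
    """
    def window_mean(months):
        return sum(monthly_traffic[m] for m in months) / len(months)
    return max(WINDOWS, key=lambda name: window_mean(WINDOWS[name]))

# Hypothetical index values for a website that peaks during the IPL season.
ipl_site = {1: 40, 2: 45, 3: 50, 4: 90, 5: 100, 6: 60, 7: 50,
            8: 35, 9: 30, 10: 30, 11: 25, 12: 30}
print(trend_category(ipl_site))
```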
CHAPTER – V
OBSERVATIONS AND CONCLUSION
OBSERVATIONS
The findings from the above inferences are given below:
• Globally, www.youtube.com/user/CricketICC holds the first rank among cricket
websites.
• In India also, www.youtube.com/user/CricketICC holds the top rank.
• The websites can be categorized into 4 categories by using their trend line graph.
1. April – July Months:- The websites under this category have their highest
point during that time because of the IPL and the T20 World Cup playoffs.
2. August – October Months:- The websites under this category have their
highest point during that time because of county cricket in England.
3. October – December Months:- The websites under this category have their
highest point during that time because of the Big Bash League in Australia.
4. January – March Months:- The websites under this category have their
highest point during that time because India played more international cricket
then.
• 32% (16) of the websites have a percentage of visitors in India greater than the mean.
• 52% (26) of the websites have a bounce rate lower than the mean.
• 28% (14) of the websites have daily pageviews per visitor greater than the mean.
CONCLUSION
This study has been undertaken to understand the popularity and usage of cricket
websites in India. Categorization of the 50 websites preferred by users has given an
insight into the kind of information that users seek on the net. Specifically, four
categories have been identified by looking at the trend lines of the websites. This
categorization makes it easy to see when a website is most popular and the reason for it.
The Web traffic analysis method covers a large population but does not give much
information about individual users. A combination of survey and Web traffic analysis
methods can be adopted to get more information about the users.
BIBLIOGRAPHY
• Avinash Kaushik, Web Analytics 2.0, SYBEX, A Wiley Brand, New Delhi, 2015.
• http://www.uniassignment.com/essay-samples/information-technology/big-data-analytics-opportunities-and-challenges-information-technology-essay.php
• https://www.ukessays.com/essays/information-technology/what-are-web-analytics-information-technology-essay.php
• http://www.alexa.com/topsites/category/Top/Sports/Cricket
ANNEXURE
S.No  Website                                  Global   Rank In  % Of Visitors  Bounce  Daily Pageviews  Daily Time
                                               Rank     India    In India       Rate    Per Visitor      On Site
1     Espncricinfo.com                         370      75       47.6           33.30   3.44             5:46
2     Cricbuzz.com                             623      110      83.6           27.10   3.52             5:47
3     Cricket.com.au                           14784    4249     37.3           53.90   2.23             3:39
4     Bbc.com/sport/cricket                    126      131      7.1            52.20   2.75             4:30
5     Msn.com/en-in/sports/cricket             15       -        -              43.30   4.19             11:36
6     Bcci.tv                                  71275    4955     83.5           45.50   2.59             2:49
7     Icc-cricket.com/cricket-world-cup        18158    2635     57.4           56.50   1.95             2:52
8     Iplt20.com                               1522     8028     64.7           27.50   4.01             6:36
9     Supersport.com/cricket/                  6185     -        -              34.80   3.99             5:20
10    Ecb.co.uk                                76612    17529    32.4           50.90   2.65             2:51
11    Sport24.co.za/Cricket/                   12579    68318    1.4            55.60   2.48             3:50
12    Skysports.com/cricket                    1110     1035     8.9            44.10   2.89             4:12
13    Pakpassion.net                           72568    26045    21.5           38.00   3.90             8:12
14    Batsman.com                              121160   -        2.6            29.80   8.40             17:00
15    Cricketnmore.com                         120204   11835    92.9           25.00   24.00            40:52
16    Islandcricket.lk                         95949    116395   4.8            63.30   1.93             2:26
17    Lords.org                                177820   141290   9.8            36.70   3.60             3:34
18    Foxsports.com.au/cricket/                9447     8667     8.6            59.90   2.06             3:43
19    Cricwaves.com                            91438    36022    52.7           59.80   1.59             1:45
20    Cricketworld.com                         193051   35991    52.9           65.10   1.76             1:35
21    Kiaoval.com                              300467   209357   8.5            46.90   2.14             3:20
22    Yorkshireccc.com                         301917   167856   9.9            28.30   3.20             3:50
23    Stuff.co.nz/sport/cricket                3242     7930     4.5            41.00   3.78             7:07
24    Cricketweb.net                           194708   104355   28.7           21.70   6.00             10:09
25    Pcb.com.pk                               211355   256703   -              49.00   3.10             3:15
26    Lccc.co.uk                               319616   111051   18             32.00   3.60             3:51
27    Caribbeancricket.com                     400535   171264   13.8           29.70   4.10             6:25
28    Lastmanstands.com                        342004   -        -              41.00   4.70             4:36
29    Iol.co.za/sport/cricket                  5485     44956    1              52.70   2.86             4:52
30    Cricketfansforum.net                     467359   42368    83.2           4.00    3.80             42:18
31    Telegraph.co.uk/sport/cricket/           327      293      8.4            71.20   2.28             3:10
32    Essexcricket.org.uk                      411842   -        -              40.70   2.80             4:08
33    Cricbay.com                              473376   -        -              21.70   8.80             9:28
34    Srilankacricket.lk                       418448   -        -              52.40   1.90             2:17
35    Cricketweb.net/forum/                    194708   104355   28.7           21.70   6.00             10:09
36    Kkr.in                                   184488   67495    70.2           56.90   1.90             2:33
37    Mcc.org.au                               544616   -        -              31.30   3.50             4:05
38    Cricket365.com                           390088   159607   18.2           75.00   1.40             1:08
39    Glamorgancricket.com                     474829   129761   21.1           42.30   2.80             3:08
40    Smh.com.au/sport/cricket/                1914     4627     4              30.10   2.16             5:32
41    Cricket.co.za                            508743   194482   14.6           51.00   1.90             3:11
42    Cricruns.com                             166384   61398    85.5           65.40   1.16             1:11
43    Middlesexccc.com                         626138   -        -              39.90   2.10             2:36
44    Kentcricket.co.uk                        674008   -        -              37.90   2.70             2:31
45    Https://www.youtube.com/user/CricketICC  2        3        9.3            33.30   13.42            23:42
46    Wccc.co.uk                               602966   -        -              34.00   2.40             2:55
47    Indiancricketfans.com                    773118   98548    54.5           46.30   2.30             2:49
48    Windiescricket.com                       405667   82095    53.2           55.40   1.80             2:46
49    Royalchallengers.com                     67773    82706    82.2           56.90   2.09             2:34
50    Cricketfresh.in                          512557   168911   52             74.40   1.40             11:45
 

Recently uploaded (20)

毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 

A study on web analytics with reference to select sports websites

  • 1. 1 | P a g e A STUDY ON WEB ANALYTICS WITH REFERENCE TO SELECT SPORTS WEBSITES A project report submitted to GITAM Institute of Management, GITAM University in partial fulfillment for the award of the degree of BACHELOR OF BUSINESS ADMINISTRATION (BUSINESS ANALYTICS) Submitted by Y. Bhanu Prakash, Regd.No: 1214415127 Under the guidance of Dr. D. Vijaya Geeta Associate Professor GITAM INSTITUTE OF MANAGEMENT GITAM UNIVERSITY VISAKHAPATNAM 2015-2018
  • 2. Declaration By Student I, Y. Bhanu Prakash, Regd.No: 1214415127, hereby declare that the project titled “A study on web analytics with reference to select sports websites”, submitted to GITAM Institute of Management, GITAM University, is an original work done by me and is not being submitted to any other University for the award of any degree or diploma. Y. Bhanu Prakash Regd.No: 1214415127
  • 3. Certificate By Guide This is to certify that the project titled “A study on web analytics with reference to select sports websites” is project work undertaken by Y. Bhanu Prakash, Regd.No: 1214415127, under my guidance. Place:- Visakhapatnam Dr. D. Vijaya Geeta Date:- Associate Professor
  • 4. ACKNOWLEDGEMENT It is a genuine pleasure to express my deep sense of thanks and gratitude to my principal, Prof. P. Sheela, GITAM Institute of Management, GITAM University, Visakhapatnam, Andhra Pradesh, for her continuous support and guidance throughout my project. Her dedication, her keen interest and, above all, her overwhelming willingness to help her students were chiefly responsible for the completion of my work. Her timely advice, meticulous scrutiny and scholarly guidance have helped me to a very great extent to accomplish my task. I also take this moment to thank my guide, Dr. D. Vijaya Geeta, Associate Professor, GITAM Institute of Management, GITAM University, Visakhapatnam, Andhra Pradesh. Her prompt inspiration, timely suggestions, kindness, enthusiasm and dynamism have enabled me to complete my project. I perceive this opportunity as a big milestone in my career development. I will strive to use the skills and knowledge I have gained in the best possible way, and I will continue to work on improving them in order to attain my desired career objectives. I hope to continue cooperating with all of you in the future. Y. Bhanu Prakash
  • 5. CONTENTS (Pg. No.)
    CHAPTER 1: INTRODUCTION TO ANALYTICS
      - DATA ANALYTICS 8-16
      - WEB ANALYTICS 17-19
    CHAPTER 2: PROFILE OF ALEXA.COM
      - ALEXA INTERNET 21-23
    CHAPTER 3: METHODOLOGY
      - RESEARCH PROBLEM 25
      - OBJECTIVES OF THE STUDY 25
      - METHODOLOGY OF THE STUDY 25-26
      - SCOPE OF THE STUDY 26
      - LIMITATIONS OF THE STUDY 26
    CHAPTER 4: ANALYSIS AND DATA INTERPRETATION 28-40
    CHAPTER 5: OBSERVATIONS AND CONCLUSION
      - OBSERVATIONS 42
      - CONCLUSION 43
      - BIBLIOGRAPHY 44
      - ANNEXURE 45-46
  • 6. LIST OF TABLES (Table No., Title, Page No.)
      1. List Of Top 10 Websites (Global), 29
      2. List Of Top 10 Websites In India, 30
      3. List Of Summary Statistics Of The 50 Selected Sports (Cricket) Websites, 31
      4. Count Of The Websites With Reference To Their Summary Statistics, 32
    LIST OF CHARTS (Chart No., Title, Page No.)
      1. Percentage Of Visits By Indian User For Top 10 Websites (Global), 29
      2. Percentage Of Visits By Indian User For Top 10 Websites In India, 30
    LIST OF FIGURES (Figure No., Title, Page No.)
      1. The Websites Which Has Highest Trend During April – July Months, 33-35
      2. The Websites Which Has Highest Trend During August-October Months, 36-38
      3. The Websites Which Has Highest Trend During October-December Months, 39
      4. The Websites Which Has Highest Trend During January-March Months, 40
  • 7. CHAPTER – I INTRODUCTION TO ANALYTICS
  • 8. CHAPTER-I Data Analytics 1. Introduction Imagine a world without data storage; a place where every detail about a person or organization, every transaction performed, or every aspect which can be documented is lost directly after use. Organizations would thus lose the ability to extract valuable information and knowledge, perform detailed analyses, as well as provide new opportunities and advantages. Data is an essential part of our lives, and the ability to store and access such data has become a crucial task which we cannot live without. Anything ranging from customer names and addresses, to products available, to purchases made, to employees hired, etc. has become essential for day-to-day continuity. Data is the building block upon which any organization thrives. 2. Big Data The term "Big Data" has recently been applied to datasets that grow so large that they become awkward to work with using traditional on-hand database management tools. They are data sets whose size is beyond the ability of commonly used software tools and storage systems to capture, store, manage, and process the data within a tolerable elapsed time. Big data also refers to databases which are measured in terabytes and above, and are too complex and large to be effectively used on conventional systems. Big data sizes are a constantly moving target, currently ranging from a few dozen terabytes to many petabytes of data in a single data set. Consequently, some of the difficulties related to big data include capture, storage, search, sharing, analytics, and visualization. Today, enterprises are exploring large volumes of highly detailed data so as to discover facts they didn’t know before. Business benefit can commonly be derived from analyzing larger and more complex data sets that require real-time or near-real-time capabilities; however, this leads to a need for new data architectures, analytical methods, and tools. 
In this section, we will discuss the characteristics of big data as well as the issues surrounding storing and analyzing such data.
  • 9. 2.1. Big Data Characteristics Big data is data whose scale, distribution, diversity, and/or timeliness require the use of new technical architectures, analytics, and tools in order to enable insights that unlock new sources of business value. Big data is characterized by three main features: volume, variety, and velocity. Volume is the size of the data, and how enormous it is. Velocity refers to the rate at which data is changing, or how often it is created. Finally, variety includes the different formats and types of data, as well as the different kinds of uses and ways of analyzing the data. 2.2. Importance of Managing Big Data There are five broad ways in which using big data can create value. First, big data can unlock significant value by making information transparent and usable at a much higher frequency. Second, as organizations create and store more and more transactional data in digital form, they can collect more accurate and detailed performance information on everything from product inventories to sick days. This can expose variability in the data and boost performance. Third, big data allows a narrower segmentation of customers and therefore much more precisely tailored products or services to meet their needs and requirements. Fourth, sophisticated analytics performed on big data can substantially improve decision making. Finally, big data can also be used to improve the development of the next generation of products and services. For example, manufacturers are currently using data obtained from sensors which are embedded in products to create innovative after-sales service offerings such as proactive maintenance, that is, preventive measures that take place before a failure occurs or is even noticed by the customer. Nowadays, along with the increasing ubiquity of technology comes the increase in the amount of electronic data. 
Only a few years ago, corporate databases tended to be measured in the range of tens to hundreds of gigabytes. Now, however, multi-terabyte (TB) or even petabyte (PB) databases have become normal. According to Longbottom, the World Data Center for Climate (WDCC) stores over 6PB of data overall and the National Energy Research Scientific Computing Center (NERSC) has over 2.8PB of
  • 10. available data around atomic energy research, physics projects and so on. These are only a couple of examples of the enormous amounts of data which must be dealt with nowadays. Furthermore, even companies such as Amazon are running with databases in the tens of terabytes, and companies which wouldn’t be expected to have to worry about such massive systems are dealing with databases with sizes of hundreds of terabytes. Additionally, other companies with large databases in place include telecom companies and service providers, as well as social media sites. For telecom companies, just dealing with log files of all the events happening and call logs can easily build up database sizes. Moreover, social media sites, even those that are primarily text, such as Twitter or Facebook, have big enough problems; and sites such as YouTube have to deal with massively expanding datasets. With such increasing amounts of big data, there arises an essential need to be able to analyze the datasets. Thus, big data analytics will be discussed in the subsequent section. 3. Big Data Analytics Big data analytics is where advanced analytic techniques operate on big data sets. Analytics based on large data samples reveals and leverages business change. However, the larger the set of data, the more difficult it becomes to manage. Sophisticated analytics can substantially improve decision making, minimize risks, and unearth valuable insights from the data that would otherwise remain hidden. Sometimes decisions do not necessarily need to be automated, but rather augmented by analyzing huge, entire datasets using big data techniques and technologies instead of just smaller samples that individuals with spreadsheets can handle and understand. Therefore, decision making may never be the same. Some organizations are already making better decisions by analyzing entire datasets from customers, employees, or even sensors embedded in products. 
In this section, we will discuss the data analytics lifecycle, followed by some advanced data analytics methods, as well as some possible tools and methods for big data analytics in particular.
  • 11. 3.1. Advanced Data Analytics Methods With the evolution of technology and the increased multitudes of data flowing in and out of organizations daily, there is now a need for faster and more efficient ways of analyzing such data. Having piles of data on hand is no longer enough to make efficient decisions at the right time. The acquired data must not only be accurate, consistent, and sufficient to base decisions upon, but it must also be integrated and subject-oriented, as well as non-volatile and time-variant. New tools and algorithms have been designed to aid decision makers in automatically filtering and analyzing these diverse pools of data. Data analytics is the process of applying algorithms in order to analyze sets of data and extract previously unknown, useful, valid, and hidden patterns and information from large data sets, as well as to detect important relationships among the stored variables. Analytics has therefore had a significant impact on research and technologies, since decision makers have become more and more interested in learning from previous data and thus gaining competitive advantage. Nowadays, people don’t just want to collect data; they want to understand the meaning and importance of the data, and use it to aid them in making decisions. Data analytics has gained a great amount of interest from organizations throughout the years, and has been used for many diverse applications. Some applications are in science, such as particle physics, remote sensing, and bioinformatics, while others focus on commerce, such as customer relationship management, consumer finance, and fraud detection. In this section, we will take a look at some of the most common data analytics methods. 
In order to fully grasp the concept of data analytics, we will take a look at some of the most common approaches as well as how they can be applied and what algorithms are frequently used. Three different data analytics approaches will be discussed: association rules, clustering, and decision trees.
  • 12. 3.2. Association Rules Association rules are one of the most popular data analytics tasks for discovering interesting relations between variables in large databases. It is an approach for pattern detection which finds the most common combinations of categorical variables. Association rules show relationships between data items by identifying patterns of their co-occurrence. Since so many association rules can be derived from even a tiny dataset, interest is restricted to those rules that apply to a reasonably large number of instances and have a reasonably high accuracy on the instances to which they apply. Association rule analytics discovers interesting correlations between attributes of a database by using two measures, support and confidence. Support is the probability that two different attributes occur together in a single event, or the frequency of co-occurrence, while confidence is the probability that when one attribute occurs, the other will also occur in the same event. Association rules are normally used in business applications to determine the items which are usually purchased together. An example of an association rule would be the statement that people who buy cars also buy CDs 80% of the time, written as Car → CD. In this case the two attributes being associated are the car and the CD, the confidence value is 80%, and the support value is the fraction of transactions in the database in which both a car and a CD were bought together. If a rule passes the minimum support then it is considered a frequent rule, while rules which pass both minimum support and minimum confidence are considered strong rules. One of the most common algorithms for association rule analytics is the Apriori algorithm. Like most association rule algorithms, it splits the problem into two major tasks. 
The first task is frequent itemset generation, in which the objective is to find all the itemsets which satisfy the minimum support threshold and are thus frequent itemsets. For a rule A → B over a database of N transactions, the formula for calculating support is: $\mathrm{support}(A \rightarrow B) = \frac{\sigma(A \cup B)}{N}$, where $\sigma(A \cup B)$ is the number of transactions that contain both A and B. The second task is rule generation, in which the objective is to extract the high confidence, or strong, rules from the previously found frequent itemsets. The formula for calculating confidence is: $\mathrm{confidence}(A \rightarrow B) = \frac{\sigma(A \cup B)}{\sigma(A)}$.
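The two tasks described above, frequent itemset generation and rule generation, can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation used by any particular tool: the transactions, the thresholds and the function names (`support`, `frequent_itemsets`, `strong_rules`) are all hypothetical. Support is taken as the fraction of transactions that contain an itemset, and exact fractions are used so that the 80% confidence of the Car → CD example is not disturbed by floating-point rounding.

```python
from fractions import Fraction
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))

def frequent_itemsets(transactions, min_support):
    """Task 1: level-wise search; size-k candidates come only from frequent (k-1)-itemsets."""
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while level:
        scored = {s: support(s, transactions) for s in level}
        kept = {s: v for s, v in scored.items() if v >= min_support}
        frequent.update(kept)
        k += 1
        # Join frequent itemsets into candidates of size k ...
        candidates = {a | b for a in kept for b in kept if len(a | b) == k}
        # ... and drop any candidate that has an infrequent (k-1)-subset.
        level = [c for c in candidates
                 if all(frozenset(s) in kept for s in combinations(c, k - 1))]
    return frequent

def strong_rules(frequent, min_confidence):
    """Task 2: keep rules A -> B whose confidence = support(A and B) / support(A) is high enough."""
    rules = []
    for itemset, sup in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), size):
                lhs = frozenset(lhs)
                confidence = sup / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((set(lhs), set(itemset - lhs), sup, confidence))
    return rules

# Hypothetical purchase records: 5 of 6 contain a car, 4 of those also contain a CD.
transactions = [frozenset(t) for t in (
    {"car", "cd"}, {"car", "cd", "map"}, {"car", "cd"},
    {"car", "cd"}, {"car"}, {"map"},
)]

frequent = frequent_itemsets(transactions, min_support=Fraction(1, 2))
rules = strong_rules(frequent, min_confidence=Fraction(4, 5))
for lhs, rhs, sup, conf in rules:
    print(sorted(lhs), "->", sorted(rhs), "support", sup, "confidence", conf)
```

With these made-up records, "map" falls below the minimum support and is dropped at the first level, so no larger itemset containing it is ever counted.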
  • 13. Since the first step is computationally expensive and requires the generation of all combinations of itemsets, the Apriori algorithm provides a principle for guiding itemset generation and reducing computational requirements. The Apriori principle states that a subset of a frequent itemset must also be frequent. In this case, if an itemset is not frequent, then it will be discarded and will not be used as a subset for the generation of another itemset. The algorithm uses a breadth-first search strategy and a tree structure to count candidate itemsets efficiently. Each level in the tree contains all the k-itemsets, where k is the number of items in the itemset. For example, level 1 contains all 1-itemsets, level 2 all 2-itemsets, and so forth. Instead of ending up with so many itemsets through all possible combinations of items, the Apriori algorithm only considers the frequent itemsets. So in the first level, the algorithm calculates the support of each itemset. Frequent itemsets which pass the minimum support are taken to the next level, and all possible 2-itemset combinations are made only out of these frequent sets, while all others are discarded. Finally, rules are extracted from the frequent itemsets in the form of A → B (if A then B). The confidence for each rule is calculated, and rules which pass the minimum confidence are taken as strong rules. 3.3. Clustering Data clustering is a technique which uses unsupervised learning, or in other words discovers unknown structures. Clustering is the process of grouping sets of objects together into classes based on similarity measures and the behavior of the group. Instances within the same group are similar to each other, and are dissimilar to instances in other groups. Clustering is similar to classification in that it groups data into classes; however, the main difference is that clustering is unsupervised, and the classes are defined by the data alone, hence they are not predefined. 
Therefore, data to be analyzed is not compared to a model built from training data, but is rather compared to other data and clustered according to the level of similarity between them. Several representations of clusters are depicted.
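The text does not name a specific clustering algorithm, so the sketch below swaps in k-means, one common choice, purely as an illustration of grouping points by similarity without predefined classes. The sample points, the value of k, the iteration count and the function names are assumptions made for this example.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance, used here as the (dis)similarity measure."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=20, seed=0):
    """Group points into k clusters by repeatedly assigning each to its nearest centroid."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the data points
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical 2-D observations forming two visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, sorted(cluster))
```

No labels are supplied at any point: the two groups emerge only from the distances between the points, which is the sense in which the classes are "defined by the data alone".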
  • 14. 3.4. Decision Trees Another type of data analytics technique is the decision tree. Decision trees are used as predictive models to map observations about an instance's attributes to conclusions about its target value. A decision tree is a hierarchical structure of nodes and directed edges which consists of three types of nodes. The root node is a node with no incoming edges and zero or more outgoing edges to other nodes. An internal node is a node in the middle levels of the tree, and has one incoming edge and two or more outgoing edges. Finally, the leaf node has exactly one incoming edge and no outgoing edges, and is assigned a class label which provides the decision of the tree. Each of the tree’s non-leaf nodes specifies a test of a certain attribute of the instance, and each descending branch from the node corresponds to one of the attribute’s possible values. An instance is classified by starting at the root node, testing the attribute specified by that node, and moving down the branch which corresponds to the value of the given attribute to a new node. The same process is repeated at that node, until a leaf node providing a decision is finally reached. 4. Big Data Analytics Tools and Methods Big data is too large to be handled by conventional means, and the larger the data grows, the more organizations purchase more powerful hardware and computational resources. However, the data keeps on growing and performance needs increase, while the available resources have a maximum capacity and capability. The MapReduce paradigm is based on adding more computers or resources, rather than increasing the power or storage capacity of a single computer; in other words, scaling out rather than scaling up. The fundamental idea of MapReduce is breaking a task down into stages and executing the stages in parallel in order to reduce the time needed to complete the task. 
MapReduce is a parallel programming model which is suitable for big data processing. Hadoop is a concrete platform which implements it. In MapReduce, data is split into distributable chunks, which are called shards. The steps to process those chunks are defined, and the big data processing is run in parallel on the chunks. This model is scalable, in that the bigger the data processing becomes, or the
  • 15. more computational resources are required, the more machines can be added to process the chunks. The first phase of the MapReduce job is to map input values to a set of key/value pairs as output. Thus, unstructured data such as text can be mapped to a structured key/value pair, where, in this case, the key could be the word in the text and the value is the number of occurrences of the word. This output is then the input to the "Reduce" function. Reduce then performs the collection and combination of this output. Suppose we have millions of text documents and would like to count the occurrences of a certain word. The text documents would be divided among several workers, or machines, which perform the processing in parallel. These workers act as mappers, each mapping the desired word to its number of occurrences in the text documents given to that worker. The reducers then aggregate these counts, thus giving the total count across the millions of text documents. Hadoop is a framework for performing big data analytics which provides reliability, scalability, and manageability by providing an implementation of the MapReduce paradigm as well as gluing the storage and analytics together. Hadoop consists of two main components: the Hadoop Distributed File System (HDFS) for the big data storage, and MapReduce for big data analytics. The HDFS storage function provides a redundant and reliable distributed file system which is optimized for large files. Data is stored in replicated file blocks across multiple Data Nodes, and the Name Node acts as a regulator between the client and the Data Nodes, directing the client to the particular Data Node which contains the requested data. Additionally, the data processing and analytics functions are performed by MapReduce, which consists of a Java API as well as software to implement the services which Hadoop needs to function. 
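The word-count example above can be imitated in plain Python to show how the map and reduce steps fit together. This is a single-process sketch under stated assumptions, not Hadoop code: the shards, the documents and the function names (`mapper`, `shuffle`, `reducer`) are hypothetical, and the mappers run one after another here, whereas on a cluster each shard would be processed by a separate worker in parallel.

```python
from collections import Counter, defaultdict

def mapper(shard):
    """Map step for one worker: emit (word, count) pairs for its own shard of documents."""
    counts = Counter()
    for document in shard:
        counts.update(document.lower().split())
    return list(counts.items())

def shuffle(mapped_outputs):
    """Group the emitted values by key so that one reducer sees every count for a word."""
    grouped = defaultdict(list)
    for pairs in mapped_outputs:
        for word, count in pairs:
            grouped[word].append(count)
    return grouped

def reducer(word, counts):
    """Reduce step: combine the partial counts for one word into a total."""
    return word, sum(counts)

# Hypothetical input: three shards of text documents, one shard per worker.
shards = [
    ["big data is big", "data grows"],
    ["big clusters process data"],
    ["data is stored in blocks"],
]

mapped_outputs = [mapper(shard) for shard in shards]  # sequential stand-in for parallel workers
totals = dict(reducer(word, counts) for word, counts in shuffle(mapped_outputs).items())
print(totals["data"], totals["big"])
```

Because each mapper only ever touches its own shard, adding more shards and workers leaves the code unchanged, which is the scaling-out property the paradigm relies on.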
The MapReduce function within Hadoop depends on two different nodes: the Job Tracker and the Task Tracker nodes. The Job Tracker nodes are the ones which are responsible for distributing the Mapper and Reducer functions to the available Task Trackers, as well as monitoring the results. On the other hand, the Task Tracker nodes actually run the jobs
  • 16. and communicate results back to the Job Tracker. That communication between nodes is often through files and directories in HDFS, so inter-node communication is minimized. 5. Big Data Challenges Several issues will have to be addressed in order to capture the full potential of big data. Policies related to privacy, security, intellectual property, and even liability all need to be addressed in a big data world. Organizations need to put the right talent and technology in place, and also structure workflows and incentives to optimize the use of big data. Access to data is critical, and companies will increasingly need to integrate information from multiple data sources, often from third parties or different locations. Furthermore, questions on how to store and analyze data with volume, variety, and velocity have arisen, and current research lacks the capability for providing an answer. Consequently, the biggest problem has become not only the sheer volume of data, but the fact that the type of data companies must deal with is changing. In order to accommodate the change in data, the approaches for storing data have changed throughout the years. Data storage started with data warehouses, data marts, and data cubes, and then moved on to master data management, data federation and other techniques such as in-memory databases. However, database suppliers are still struggling to cope with enormous amounts of data, and the emergence of interest in big data has led to a need for storing and managing such large amounts of data.
  • 17. Web Analytics Web analytics is the reporting and analysis of data on website visitor activity. It is not only a tool for measuring web traffic but can also be used as a tool for business and market research. It covers techniques used to assess and improve the contribution of e-marketing to a business, drawing on referrals, click streams, online research data, customer satisfaction surveys, and leads and sales. Marketers explore web analytics data and reports to build their knowledge of customers' preferences and behavior by type of site, for example which areas customers click most often when they are online. This helps marketers understand their customers better and improve their business performance. There are three stages that marketers need to consider when setting up a web analytics tool. Analysis is the ticket for moving from Setupland to Actionland: it is the isolation of meaningful and actionable insights in data and reports which, when acted upon by the organization, can drive business value. Alignment Stage: At this early planning stage, marketers need to gather their business objectives and decide, through their online measurement strategy, how stakeholders' online behavior will be captured. A clear understanding of the measurement strategy and a careful analysis of visitors are critical to success, so marketers have to handle carefully the relevant and meaningful data that will directly affect the business in the long term. Collection Stage: At this stage, large companies may spend a considerable amount of time on technical implementation, for example across multiple web domains and online marketing initiatives. Reporting Stage: This is the last stage in a company's move from Setupland to Actionland. At this stage reports are created and distributed to the organization using a manual or, preferably, automated approach. TOOLS AND METHODS USED TO HELP MARKETERS: There are two types of web analytics: on-site and off-site.
  • 18. ON-SITE ANALYTICS On-site web analytics is used by marketers to measure a visitor's activity while browsing a website. This includes its drivers and conversions, for example which ads on a landing page encourage more people to purchase and which headlines visitors click most. The data is used to analyze visitors' online behavior and to improve the audience response of a website or marketing campaign. Put simply, on-site web analytics tools measure and analyze the visitor's journey and the actual traffic arriving at the website: which landing pages encourage visitors to make a purchase, how visitors reached the site (through a link from a search engine or directly), and how long they stayed on a given page. On-site web analytics therefore measures the performance of a website in a commercial context. For a business, the website has become more important than ever because it handles more information, and companies also need to know whether their internet-based marketing campaigns are working. OFF-SITE ANALYTICS Off-site analytics data can be obtained for any website, including those of competitors and partners; it is the analysis of websites across the internet as a whole. The key off-site measures are a website's potential audience (opportunity), share of voice (visibility), and buzz (comments). On-site web analytics, by contrast, only captures what happens when visitors visit and engage with the website itself, using various technologies to monitor and analyze the site and turn the findings into meaningful actions and results. As social websites become a more popular and ascendant channel for internet users and everything on the social web becomes more transparent, information about organizations is shared and spread there, so marketers can use this platform to measure the latest buzz about a website or organization. 
It is important for marketers to monitor not only what happens on the website but also what happens outside it.
  • 19. By learning from what other people are saying about the company, marketers can provide products and services that match customers' requirements. Off-site web analytics solutions can help businesses stay on the leading edge of overall trends.
  • 20. CHAPTER – II PROFILE OF ALEXA.COM
  • 21. Alexa Internet
Alexa Internet, Inc. is a California-based company that provides commercial web traffic data and analytics. It is a wholly owned subsidiary of Amazon.com. Founded as an independent company in 1996, Alexa was acquired by Amazon in 1999. Its toolbar collects data on browsing behavior and transmits them to the Alexa website, where they are stored and analyzed, forming the basis for the company's web traffic reporting. According to its website, Alexa provides traffic data, global rankings and other information on 30 million websites, and as of 2015 its website is visited by over 6.5 million people monthly.
Operation & History
Alexa Internet was founded in April 1996 by American web entrepreneurs Brewster Kahle and Bruce Gilliat. The company's name was chosen in homage to the Library of Alexandria of Ptolemaic Egypt, drawing a parallel between the largest repository of knowledge in the ancient world and the potential of the Internet to become a similar store of knowledge. Alexa initially offered a toolbar that gave Internet users suggestions on where to go next, based on the traffic patterns of its user community. The company also offered context for each site visited: to whom it was registered, how many pages it had, how many other sites pointed to it, and how frequently it was updated. Alexa's operations grew to include archiving of web pages as they are crawled. This database served as the basis for the creation of the Internet Archive, accessible through the Wayback Machine. In 1998, the company donated a copy of the archive, two terabytes in size, to the Library of Congress. Alexa continues to supply the Internet Archive with Web crawls. In 1999, as the company moved away from its original vision of providing an "intelligent" search engine, Alexa was acquired by Amazon.com for approximately US$250 million in Amazon stock.
Alexa began a partnership with Google in early 2002 and with the web directory DMOZ in January 2003. In May 2006, Amazon replaced Google with Bing (at the time known as Windows Live Search) as a provider of search results.
  • 22. In December 2006, Amazon released Alexa Image Search. Built in-house, it was the first major application built on the company's Web platform. In December 2005, Alexa opened its extensive search index and Web-crawling facilities to third-party programs through a comprehensive set of web services and APIs. These could be used, for instance, to construct vertical search engines that could run on Alexa's own servers or elsewhere. In May 2007, Alexa changed its API to limit comparisons to three websites, reduce the size of embedded graphs in Flash, and add mandatory embedded BritePic advertisements. On November 27, 2008, Amazon announced that Alexa Web Search was no longer accepting new customers, and that the service would be deprecated or discontinued for existing customers on January 26, 2009. Thereafter, Alexa became a purely analytics-focused company. On March 31, 2009, Alexa launched a major website redesign. The redesigned site provided new web traffic metrics, including average page views per individual user, bounce rate, and user time on site. In the following weeks, Alexa added more features, including visitor demographics, click stream and search traffic statistics. Alexa introduced these new features to compete with other web analytics services.
Toolbar
Alexa ranks sites based primarily on tracking a sample set of internet traffic: users of its toolbar for the Internet Explorer, Firefox and Google Chrome web browsers. The Alexa Toolbar includes a popup blocker, a search box, links to Amazon.com and the Alexa homepage, and the Alexa ranking of the site that the user is visiting. It also allows the user to rate the site and view links to external, relevant sites. In early 2005, Alexa stated that there had been 10 million downloads of the toolbar, though the company did not provide statistics about active usage.
Originally, web pages were only ranked amongst users who had the Alexa Toolbar installed, and could be biased if a specific audience subgroup was reluctant to take part in the rankings. This caused some controversies over how representative Alexa's user base was of typical Internet behavior, especially for less-visited sites.
  • 23. Until 2007, a third-party-supplied plugin for the Firefox browser served as the only option for Firefox users after Amazon abandoned its A9 toolbar. On July 16, 2007, Alexa released an official toolbar for Firefox called Sparky. On 16 April 2008, many users reported drastic shifts in their Alexa rankings. Alexa confirmed this later in the day with an announcement that it had released an updated ranking system, claiming that it would now take into account more sources of data "beyond Alexa Toolbar users".
Certified Statistics
Using the Alexa Pro service, website owners can sign up for "certified statistics," which allows Alexa more access to a site's traffic data. Site owners place JavaScript code on each page of their website that, if permitted by the user's security and privacy settings, runs and sends traffic data to Alexa. This allows Alexa to display (or not display, depending on the owner's preference) more accurate statistics such as total pageviews and unique pageviews.
  • 24. CHAPTER – III METHODOLOGY
  • 25. RESEARCH PROBLEM:- Study on web analytics with reference to select sports websites.
OBJECTIVES OF THE STUDY:- The objectives of the study are:
- To find out the sports (cricket) websites with the highest number of visitors from India.
- To know the rank of each website in India as well as globally.
- To know the bounce rate, daily pageviews per visitor and daily time on site of the selected websites.
METHODOLOGY OF THE STUDY:- The top 50 sports websites related to cricket are taken for the study. Website traffic rank is a combined measure of page views and users. So, although a website may have more reach, i.e., a larger number of users visiting it, its rank may differ depending on the number of unique pages visited on the website. The main aim of the study is to identify the most popular cricket websites in India as well as globally with reference to the alexa.com website, and also to know the bounce rate, daily pageviews per visitor and daily time on site of the selected websites. The metrics used to measure the popularity of the websites are as follows:
- Bounce Rate:- The percentage of visitors to a particular website who navigate away from the site after viewing only one page. A rising bounce rate is often a sign that the page visitors land on is not engaging them.
- Daily Pageviews Per Visitor:- The average number of pages viewed by each visitor to your website per day. When this number is higher, your website is considered to have more engaging information.
  • 26. - Daily Time On The Site:- The average number of minutes spent on your website by each visitor per day. As with "Daily Pageviews Per Visitor", when this number is higher, your website is considered to have more engaging information.
SCOPE OF THE STUDY:- The web traffic data for this study is collected from alexa.com, which gathers traffic data mainly from users of its toolbar (see Chapter II). Only the top 50 sports (cricket) websites as ranked by alexa.com as on 31st May, 2016 are taken for the study.
LIMITATIONS OF THE STUDY:- The study is limited to the sites and the data given on the alexa.com website, and only to the top 50 sports (cricket) websites. The rank of each website is taken only globally and for India; ranks in other countries are not considered in the study.
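The three metrics defined in the methodology can be illustrated with a small calculation. The following is only a sketch under stated assumptions: the record layout (`Visit`), the function name `daily_metrics` and the sample figures are hypothetical, and alexa.com's own computation is not described in this report.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """One visit to a site on a given day (hypothetical record layout)."""
    visitor_id: str
    pages_viewed: int
    seconds_on_site: int


def daily_metrics(visits):
    """Return (bounce rate %, pageviews per visitor, minutes on site per visitor)."""
    if not visits:
        raise ValueError("at least one visit is required")
    visitors = {v.visitor_id for v in visits}
    # A bounce is a visit that ends after viewing a single page.
    bounces = sum(1 for v in visits if v.pages_viewed == 1)
    bounce_rate = 100.0 * bounces / len(visits)
    pageviews_per_visitor = sum(v.pages_viewed for v in visits) / len(visitors)
    minutes_per_visitor = sum(v.seconds_on_site for v in visits) / 60 / len(visitors)
    return bounce_rate, pageviews_per_visitor, minutes_per_visitor


# Made-up sample data for one day: visitor "a" comes twice, "b" and "c" once.
sample = [
    Visit("a", 1, 30),
    Visit("a", 4, 300),
    Visit("b", 1, 20),
    Visit("c", 6, 610),
]
bounce, pages, minutes = daily_metrics(sample)
print(f"bounce rate {bounce:.1f}%, {pages:.2f} pages/visitor, {minutes:.2f} min/visitor")
```

In this sample, two of the four visits stop after one page, so the bounce rate is 50%, while the averages per visitor are taken over the three distinct visitors.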
  • 27. CHAPTER – IV ANALYSIS AND INTERPRETATION
  • 28. In this section, the results are shown along with their interpretations. The rank in India and the percentage of visitors in India of the top 10 websites by global rank as on 31st May, 2016 are exhibited in Table 1 and Chart 1. The same details for the top 10 websites in India are shown in Table 2 and Chart 2. These tables also include the bounce rate, daily pageviews per visitor and daily time on site of the selected sports (cricket) websites.
  • 29. Table 1 : List Of Top 10 Websites (Global)
S.No | Website | Global Rank | Rank In India | % Of Visitors In India | Bounce Rate (%) | Daily Pageviews Per Visitor | Daily Time On Site (mm:ss)
1 | Https://www.youtube.com/user/CricketICC | 2 | 3 | 9.3 | 33.30 | 13.42 | 23:42
2 | Msn.com/en-in/sports/cricket | 15 | - | - | 43.30 | 4.19 | 11:36
3 | Bbc.com/sport/cricket | 126 | 131 | 7.1 | 52.20 | 2.75 | 4:30
4 | Telegraph.co.uk/sport/cricket/ | 327 | 293 | 8.4 | 71.20 | 2.28 | 3:10
5 | Espncricinfo.com | 370 | 75 | 47.6 | 33.30 | 3.44 | 5:46
6 | Cricbuzz.com | 623 | 110 | 83.6 | 27.10 | 3.52 | 5:47
7 | Skysports.com/cricket | 1110 | 1035 | 8.9 | 44.10 | 2.89 | 4:12
8 | Iplt20.com | 1522 | 8028 | 64.7 | 27.50 | 4.01 | 6:36
9 | Smh.com.au/sport/cricket/ | 1914 | 4627 | 4 | 30.10 | 2.16 | 5:32
10 | Stuff.co.nz/sport/cricket | 3242 | 7930 | 4.5 | 41.00 | 3.78 | 7:07
(- = value not given in the source data)
Chart 1 : % Of Visitors In India (pie chart of the top 10 websites by global rank)
Chart 1 shows the percentage of visits by Indian users for the top 10 sports (cricket) websites by global rank as on 31st May, 2016. Among these websites, cricbuzz.com has the highest percentage of visitors from India (83.6%), which gives it the largest share of the chart, i.e., 35%. Msn.com has global rank 15 but does not have any rank in India because it deals mainly with England cricket.
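The 35% quoted for cricbuzz.com appears to be its share of the pie chart rather than the Alexa figure itself. A minimal sketch of that arithmetic, assuming the chart simply divides each "% Of Visitors In India" value in Table 1 by the sum of the column (the variable names are illustrative):

```python
# "% Of Visitors In India" from Table 1 (Msn.com has no value and is left out).
india_visitors_pct = {
    "Https://www.youtube.com/user/CricketICC": 9.3,
    "Bbc.com/sport/cricket": 7.1,
    "Telegraph.co.uk/sport/cricket/": 8.4,
    "Espncricinfo.com": 47.6,
    "Cricbuzz.com": 83.6,
    "Skysports.com/cricket": 8.9,
    "Iplt20.com": 64.7,
    "Smh.com.au/sport/cricket/": 4.0,
    "Stuff.co.nz/sport/cricket": 4.5,
}

total = sum(india_visitors_pct.values())
# Assumed chart logic: slice size = value / sum of all values.
chart_share = {site: 100 * pct / total for site, pct in india_visitors_pct.items()}

for site, share in sorted(chart_share.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{site:42s} {india_visitors_pct[site]:5.1f}% of visitors -> {share:4.1f}% of chart")
```

Under this assumption, 83.6% of cricbuzz.com's visitors come from India, and that value makes up roughly 35% of the chart.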
  • 30. Table 2 : List Of Top 10 Websites In India
S.No | Website | Rank In India | % Of Visitors In India | Global Rank | Bounce Rate (%) | Daily Pageviews Per Visitor | Daily Time On Site (mm:ss)
1 | Https://www.youtube.com/user/CricketICC | 3 | 9.3 | 2 | 33.3 | 13.42 | 23:42
2 | Espncricinfo.com | 75 | 47.6 | 370 | 33.3 | 3.44 | 5:46
3 | Cricbuzz.com | 110 | 83.6 | 623 | 27.1 | 3.52 | 5:47
4 | Bbc.com/sport/cricket | 131 | 7.1 | 126 | 52.2 | 2.75 | 4:30
5 | Telegraph.co.uk/sport/cricket/ | 293 | 8.4 | 327 | 71.2 | 2.28 | 3:10
6 | Skysports.com/cricket | 1035 | 8.9 | 1110 | 44.1 | 2.89 | 4:12
7 | Icc-cricket.com/cricket-world-cup | 2635 | 57.4 | 18158 | 56.5 | 1.95 | 2:52
8 | Cricket.com.au | 4249 | 37.3 | 14784 | 53.9 | 2.23 | 3:39
9 | Smh.com.au/sport/cricket/ | 4627 | 4 | 1914 | 30.1 | 2.16 | 5:32
10 | Bcci.tv | 4955 | 83.5 | 71275 | 45.5 | 2.59 | 2:49
Chart 2 shows the percentage of visits by Indian users for the top 10 sports (cricket) websites by rank in India as on 31st May, 2016. Among the top 10 websites in India, cricbuzz.com and bcci.tv have the highest percentages of visitors from India (83.6% and 83.5%), each accounting for about 24% of the chart. Smh.com.au gets the lowest percentage of visitors from India (4%) as it deals mainly with Australian cricket.
  • 31. SUMMARY STATISTICS
Table 3 : List Of Summary Statistics Of The 50 Selected Sports (Cricket) Websites
Statistic | Global Rank | Rank In India | % Of Visitors In India | Bounce Rate (%) | Daily Pageviews Per Visitor | Daily Time On Site
Maximum | 773118 | 256703 | 92.9 | 75 | 24 | 42.18
Minimum | 2 | 3 | 1 | 4 | 1.16 | 1.11
Range | 773116 | 256700 | 91.9 | 71 | 22.84 | 41.16
Mean | 221274 | 70601 | 34.34 | 43.73 | 3.72 | 6.61
Median | 172102 | 44956 | 21.5 | 42.8 | 2.78 | 3.83
Standard Deviation | 220738 | 71308 | 29.51 | 15.22 | 3.62 | 8.29
The above table gives the summary statistics of the data. The following inferences can be drawn from the table:
- Websites with a bounce rate lower than the mean bounce rate are said to be more popular sites.
- Websites with daily pageviews per visitor greater than the mean value are said to be more popular sites.
- Similarly, websites with a daily time on site greater than the mean value are said to be more popular sites.
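The summary statistics in Table 3 can be reproduced for one column as a check. The sketch below uses the 50 bounce-rate values listed in the Annexure and Python's standard `statistics` module. The report does not say whether the sample or the population formula was used for the standard deviation, so the choice of `statistics.stdev` here is an assumption.

```python
import statistics

# Bounce rate (%) of the 50 websites, in the order listed in the Annexure.
bounce_rates = [
    33.30, 27.10, 53.90, 52.20, 43.30, 45.50, 56.50, 27.50, 34.80, 50.90,
    55.60, 44.10, 38.00, 29.80, 25.00, 63.30, 36.70, 59.90, 59.80, 65.10,
    46.90, 28.30, 41.00, 21.70, 49.00, 32.00, 29.70, 41.00, 52.70, 4.00,
    71.20, 40.70, 21.70, 52.40, 21.70, 56.90, 31.30, 75.00, 42.30, 30.10,
    51.00, 65.40, 39.90, 37.90, 33.30, 34.00, 46.30, 55.40, 56.90, 74.40,
]

summary = {
    "maximum": max(bounce_rates),
    "minimum": min(bounce_rates),
    "range": max(bounce_rates) - min(bounce_rates),
    "mean": statistics.mean(bounce_rates),
    "median": statistics.median(bounce_rates),
    # Sample standard deviation (assumption, see the note above).
    "standard deviation": statistics.stdev(bounce_rates),
}

for name, value in summary.items():
    print(f"{name:20s} {value:8.2f}")
```

The maximum, minimum, range, mean and median computed this way agree with the Bounce Rate column of Table 3 (75, 4, 71, 43.73 and 42.8).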
  • 32. Table 4 : Count Of The Websites With Reference To Their Summary Statistics
Comparison | Global Rank | Rank In India | % Of Visitors In India | Bounce Rate | Daily Pageviews Per Visitor
Greater Than Mean | 19 | 16 | 16 | 24 | 14
Lesser Than Mean | 31 | 23 | 23 | 26 | 36
Greater Than Median | 25 | 19 | 19 | 25 | 25
Lesser Than Median | 25 | 19 | 19 | 25 | 25
Greater Than Standard Deviation | 19 | 16 | 17 | 49 | 14
Lesser Than Standard Deviation | 31 | 23 | 22 | 1 | 36
From the above table, 38% (19) of the websites have a global rank value greater than the mean, and 32% (16) have a rank in India and a percentage of visitors in India greater than the mean. 52% (26) of the websites have a bounce rate lesser than the mean, which means that viewers are going deeper into these websites for more information. 28% (14) of the websites have daily pageviews per visitor greater than the mean, i.e., only 28% of the websites are browsed more intensively than average. So, even though the bounce rate is good for many websites, daily pageviews per visitor is not as good.
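The percentages quoted for Table 4 are the counts expressed as a share of the 50 websites. The short sketch below shows that arithmetic for the first two rows of the table. The helper name `share_of_sample` is illustrative, and reading the shortfall in a column as websites without a value is an inference from the blank cells in the Annexure.

```python
TOTAL_WEBSITES = 50

# "Greater Than Mean" and "Lesser Than Mean" rows of Table 4.
greater_than_mean = {
    "Global Rank": 19,
    "Rank In India": 16,
    "% Of Visitors In India": 16,
    "Bounce Rate": 24,
    "Daily Pageviews Per Visitor": 14,
}
lesser_than_mean = {
    "Global Rank": 31,
    "Rank In India": 23,
    "% Of Visitors In India": 23,
    "Bounce Rate": 26,
    "Daily Pageviews Per Visitor": 36,
}


def share_of_sample(count, total=TOTAL_WEBSITES):
    """Express a count of websites as a percentage of the whole sample."""
    return 100 * count / total


# Websites counted in neither row have no value for that metric.
not_counted = {
    metric: TOTAL_WEBSITES - greater_than_mean[metric] - lesser_than_mean[metric]
    for metric in greater_than_mean
}

for metric in greater_than_mean:
    print(
        f"{metric:28s} above mean {share_of_sample(greater_than_mean[metric]):4.0f}%  "
        f"below mean {share_of_sample(lesser_than_mean[metric]):4.0f}%  "
        f"not counted {not_counted[metric]}"
    )
```

For "Rank In India" and "% Of Visitors In India" the two rows add up to 39, so 11 of the 50 websites are not counted for these two metrics.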
  • 33. INTERPRETATION OF WEBSITES CATEGORIZED BY THE TREND
Category 1 : The Websites Which Have Their Highest Trend During April – July Months
1. Espncricinfo.com
2. Cricbuzz.com
3. Msn.com/en-in/sports/cricket
4. Icc-cricket.com/cricket-world-cup
5. Iplt20.com
6. Pakpassion.net
7. Batsman.com
8. Cricketnmore.com
9. Caribbeancricket.com
10. Cricwaves.com
  • 34. 11. Cricketworld.com
12. Cricketweb.net
13. Pcb.com.pk
14. Lastmanstands.com
15. Cricketweb.net
16. Kkr.in
17. Cricket365.com
18. Cricruns.com
  • 35. 19. www.youtube.com/user/CricketICC
20. Windiescricket.com
21. Royalchallengers.com
22. Cricketfresh.in
From the above list, we can say that all these websites have the highest point in their trend line during the April – July months. Many of the sites are from India, and the peak is because of the IPL season. Some sites from other countries, such as msn.com, caribbeancricket.com and windiescricket.com, also have their peak during April because it was the time of the T20 World Cup playoffs.
  • 36. Category 2 : The Websites Which Have Their Highest Trend During August – October Months
1. Ecb.co.uk
2. Skysports.com/cricket
3. Islandcricket.lk
4. Lords.org
5. Foxsports.com.au/cricket/
6. Kiaoval.com
7. Yorkshireccc.com
8. Lccc.co.uk
9. Telegraph.co.uk/sport/cricket
10. Srilankacricket.lk
  • 37. 11. Mcc.org.au
12. Glamorgancricket.com
13. Smh.com.au/sport/cricket
14. Middlesexccc.com
15. Kentcricket.co.uk
16. Wccc.co.uk
17. Indiancricketfans.com
18. Islandcricket.lk
  • 38. The sites in the above list have the highest point in their trend line during the months of August – October. Most of the sites in this category belong to England cricket. Since it was the time of county cricket and England played most of its matches during that period, these sites had their peak then. Two sites, namely srilankacricket.lk and indiancricketfans.com, also have their peak during those months because their countries played some international cricket during that time.
  • 39. Category 3 : The Websites Which Have Their Highest Trend During October – December Months
1. Bbc.com/sport/cricket
2. Supersport.com/cricket/
3. Sport24.co.za/Cricket/
4. Stuff.co.nz/sport/cricket
5. Iol.co.za/sport/cricket/
6. Cricbay.com
The sites in the above list have the highest point in their trend line during the months of October – December. Most of the above websites belong to Australia and New Zealand. These sites have their peak during these months because it was the time of the Big Bash League.
  • 40. Category 4 : The Websites Which Have Their Highest Trend During January – March Months
1. Cricket.com.au
2. Cricketfansforum.net
3. Cricket.co.za
4. Bcci.tv
From the above list, we can say that all these websites have the highest point in their trend line during the January – March months. These websites belong to India. Since India played more international cricket during this period, these sites had their peak then.
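In this study the four categories were read off the trend line graphs. If monthly traffic figures were available, the same grouping could be sketched as below. The traffic values are hypothetical and not taken from alexa.com. October appears in two of the ranges used above, so placing it in Category 2 is an assumption made only so that every month maps to exactly one category.

```python
# Month-to-category mapping following the four ranges used in this chapter.
# October is placed in Category 2 here (assumption, see the note above).
CATEGORY_BY_MONTH = {
    "Apr": 1, "May": 1, "Jun": 1, "Jul": 1,
    "Aug": 2, "Sep": 2, "Oct": 2,
    "Nov": 3, "Dec": 3,
    "Jan": 4, "Feb": 4, "Mar": 4,
}


def peak_category(monthly_traffic):
    """Return (peak month, category number) for a {month: traffic} mapping."""
    peak_month = max(monthly_traffic, key=monthly_traffic.get)
    return peak_month, CATEGORY_BY_MONTH[peak_month]


# Hypothetical trend-line values for one website over a year.
example_site = {
    "Jan": 40, "Feb": 42, "Mar": 55, "Apr": 90, "May": 97, "Jun": 60,
    "Jul": 48, "Aug": 35, "Sep": 33, "Oct": 38, "Nov": 41, "Dec": 39,
}

month, category = peak_category(example_site)
print(f"peak in {month} -> Category {category}")
```

A site whose traffic peaks in May, as in this made-up example, would fall into Category 1 (April – July).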
  • 41. CHAPTER – V OBSERVATIONS AND CONCLUSION
  • 42. OBSERVATIONS
The findings from the above inferences are given below:
- Globally, www.youtube.com/user/CricketICC holds the first rank among cricket websites.
- In India also, www.youtube.com/user/CricketICC holds the top rank.
- The websites can be categorized into 4 categories by using their trend line graphs:
  1. April – July Months:- The websites under this category have their highest point during that time because of the IPL and the T20 World Cup playoffs.
  2. August – October Months:- The websites under this category have their highest point during that time because of county cricket in England.
  3. October – December Months:- The websites under this category have their highest point during that time because of the Big Bash League in Australia.
  4. January – March Months:- The websites under this category have their highest point during that time because India played more international cricket then.
- 32% (16) of the websites have a good percentage of visitors in India.
- 52% (26) of the websites have a good bounce rate.
- 28% (14) of the websites have good daily pageviews per visitor.
  • 43. CONCLUSION
This study has been undertaken to understand the popularity and usage of cricket websites in India. Categorization of the 50 websites preferred by users has given an insight into the kind of information that users seek on the net. Specifically, four categories have been identified by looking at the websites' trend lines. This categorization makes it easy to see when each website is popular and why. The web traffic analysis method covers a large population but fails to provide more information about individual users. A combination of survey and web traffic analysis methods can be adopted to get even more information about the users.
  • 44. BIBLIOGRAPHY
- Avinash Kaushik, Web Analytics 2.0, Sybex, A Wiley Brand, New Delhi, 2015.
- http://www.uniassignment.com/essay-samples/information-technology/big-data-analytics-opportunities-and-challenges-information-technology-essay.php
- https://www.ukessays.com/essays/information-technology/what-are-web-analytics-information-technology-essay.php
- http://www.alexa.com/topsites/category/Top/Sports/Cricket
  • 45. ANNEXURE
S.No | Website | Global Rank | Rank In India | % Of Visitors In India | Bounce Rate (%) | Daily Pageviews Per Visitor | Daily Time On Site (mm:ss)
1 | Espncricinfo.com | 370 | 75 | 47.6 | 33.30 | 3.44 | 5:46
2 | Cricbuzz.com | 623 | 110 | 83.6 | 27.10 | 3.52 | 5:47
3 | Cricket.com.au | 14784 | 4249 | 37.3 | 53.90 | 2.23 | 3:39
4 | Bbc.com/sport/cricket | 126 | 131 | 7.1 | 52.20 | 2.75 | 4:30
5 | Msn.com/en-in/sports/cricket | 15 | - | - | 43.30 | 4.19 | 11:36
6 | Bcci.tv | 71275 | 4955 | 83.5 | 45.50 | 2.59 | 2:49
7 | Icc-cricket.com/cricket-world-cup | 18158 | 2635 | 57.4 | 56.50 | 1.95 | 2:52
8 | Iplt20.com | 1522 | 8028 | 64.7 | 27.50 | 4.01 | 6:36
9 | Supersport.com/cricket/ | 6185 | - | - | 34.80 | 3.99 | 5:20
10 | Ecb.co.uk | 76612 | 17529 | 32.4 | 50.90 | 2.65 | 2:51
11 | Sport24.co.za/Cricket/ | 12579 | 68318 | 1.4 | 55.60 | 2.48 | 3:50
12 | Skysports.com/cricket | 1110 | 1035 | 8.9 | 44.10 | 2.89 | 4:12
13 | Pakpassion.net | 72568 | 26045 | 21.5 | 38.00 | 3.90 | 8:12
14 | Batsman.com | 121160 | - | 2.6 | 29.80 | 8.40 | 17:00
15 | Cricketnmore.com | 120204 | 11835 | 92.9 | 25.00 | 24.00 | 40:52
16 | Islandcricket.lk | 95949 | 116395 | 4.8 | 63.30 | 1.93 | 2:26
17 | Lords.org | 177820 | 141290 | 9.8 | 36.70 | 3.60 | 3:34
18 | Foxsports.com.au/cricket/ | 9447 | 8667 | 8.6 | 59.90 | 2.06 | 3:43
19 | Cricwaves.com | 91438 | 36022 | 52.7 | 59.80 | 1.59 | 1:45
20 | Cricketworld.com | 193051 | 35991 | 52.9 | 65.10 | 1.76 | 1:35
21 | Kiaoval.com | 300467 | 209357 | 8.5 | 46.90 | 2.14 | 3:20
22 | Yorkshireccc.com | 301917 | 167856 | 9.9 | 28.30 | 3.20 | 3:50
23 | Stuff.co.nz/sport/cricket | 3242 | 7930 | 4.5 | 41.00 | 3.78 | 7:07
24 | Cricketweb.net | 194708 | 104355 | 28.7 | 21.70 | 6.00 | 10:09
25 | Pcb.com.pk | 211355 | 256703 | - | 49.00 | 3.10 | 3:15
26 | Lccc.co.uk | 319616 | 111051 | 18 | 32.00 | 3.60 | 3:51
27 | Caribbeancricket.com | 400535 | 171264 | 13.8 | 29.70 | 4.10 | 6:25
28 | Lastmanstands.com | 342004 | - | - | 41.00 | 4.70 | 4:36
29 | Iol.co.za/sport/cricket | 5485 | 44956 | 1 | 52.70 | 2.86 | 4:52
(- = value not given in the source data)
  • 46. 30 | Cricketfansforum.net | 467359 | 42368 | 83.2 | 4.00 | 3.80 | 42:18
31 | Telegraph.co.uk/sport/cricket/ | 327 | 293 | 8.4 | 71.20 | 2.28 | 3:10
32 | Essexcricket.org.uk | 411842 | - | - | 40.70 | 2.80 | 4:08
33 | Cricbay.com | 473376 | - | - | 21.70 | 8.80 | 9:28
34 | Srilankacricket.lk | 418448 | - | - | 52.40 | 1.90 | 2:17
35 | Cricketweb.net/forum/ | 194708 | 104355 | 28.7 | 21.70 | 6.00 | 10:09
36 | Kkr.in | 184488 | 67495 | 70.2 | 56.90 | 1.90 | 2:33
37 | Mcc.org.au | 544616 | - | - | 31.30 | 3.50 | 4:05
38 | Cricket365.com | 390088 | 159607 | 18.2 | 75.00 | 1.40 | 1:08
39 | Glamorgancricket.com | 474829 | 129761 | 21.1 | 42.30 | 2.80 | 3:08
40 | Smh.com.au/sport/cricket/ | 1914 | 4627 | 4 | 30.10 | 2.16 | 5:32
41 | Cricket.co.za | 508743 | 194482 | 14.6 | 51.00 | 1.90 | 3:11
42 | Cricruns.com | 166384 | 61398 | 85.5 | 65.40 | 1.16 | 1:11
43 | Middlesexccc.com | 626138 | - | - | 39.90 | 2.10 | 2:36
44 | Kentcricket.co.uk | 674008 | - | - | 37.90 | 2.70 | 2:31
45 | Https://www.youtube.com/user/CricketICC | 2 | 3 | 9.3 | 33.30 | 13.42 | 23:42
46 | Wccc.co.uk | 602966 | - | - | 34.00 | 2.40 | 2:55
47 | Indiancricketfans.com | 773118 | 98548 | 54.5 | 46.30 | 2.30 | 2:49
48 | Windiescricket.com | 405667 | 82095 | 53.2 | 55.40 | 1.80 | 2:46
49 | Royalchallengers.com | 67773 | 82706 | 82.2 | 56.90 | 2.09 | 2:34
50 | Cricketfresh.in | 512557 | 168911 | 52 | 74.40 | 1.40 | 11:45
(- = value not given in the source data)