This document discusses web usage mining and clickstream data analysis. It provides an overview of web usage mining goals such as understanding user behavior and customizing websites. It also describes different types of clickstream data sources like web server log files, page tags, and cookies. Additionally, it covers various aspects of processing clickstream data like sessionization, path completion, and data integration to model usage patterns and identify frequent behaviors. The overall aim is to apply these analytical techniques to clickstream data from a website to help maximize revenue.
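The sessionization step mentioned above is commonly done with a timeout heuristic: a new session starts whenever the gap between two consecutive clicks from the same user exceeds a threshold. A minimal sketch (the 30-minute threshold is a widely used convention, not taken from the document):

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # common heuristic for web logs

def sessionize(clicks):
    """Split a chronologically ordered list of (timestamp, url) clicks
    into sessions, starting a new session whenever the gap between
    consecutive clicks exceeds SESSION_TIMEOUT."""
    sessions = []
    current = []
    last_time = None
    for ts, url in clicks:
        if last_time is not None and ts - last_time > SESSION_TIMEOUT:
            sessions.append(current)
            current = []
        current.append((ts, url))
        last_time = ts
    if current:
        sessions.append(current)
    return sessions

# Example: two clicks ten minutes apart, then one nearly two hours later
clicks = [
    (datetime(2024, 1, 1, 9, 0), "/home"),
    (datetime(2024, 1, 1, 9, 10), "/products"),
    (datetime(2024, 1, 1, 11, 0), "/home"),
]
sessions = sessionize(clicks)  # two sessions: [2 clicks], [1 click]
```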
Document management system (DMS) with digitization using SharePoint 2013 sp... (Milind Kumthekar)
This presentation provides a SharePoint 2010-based solution to scan and capture documents, create metadata from the scanned or captured documents, and route them to the appropriate library in a SharePoint DMS. The solution, with digital rights management and restricted access, has been implemented for various clients.
The document outlines a proposed framework for developing an economical digital content management system. It discusses how current CMS tools have high costs that prevent many public organizations from using them, instead relying on inefficient methods to store and manage content. The proposed framework aims to design a low-cost CMS that stores content in the public cloud and provides multi-channel access and improved storage principles using digital object storage. The research methodology involves investigating existing systems and literature, designing the framework and addressing challenges of storage, security, compliance and multi-device access.
Moving mountains with SharePoint - Document Management with SharePoint 2013 (Oliver Wirkus)
This presentation shows how to add a Document Management System to an existing SharePoint intranet, along with best practices for starting such projects and for document management with SharePoint in general.
Few knowledge management tools are available on the market today, and SharePoint is one of the most widely adopted.
This presentation describes the main SharePoint features and presents a few other content management systems, such as Documentum, TeamSite, OpenText ECM Suite, Oracle UCM, and others. It further describes user adoption strategies and information governance, both in general and specifically in SharePoint.
This presentation was given by Marianne van Wanrooij of Connected Solutions as part of the Sparked Toolkit Session: SharePoint Nightmares.
It discusses her SharePoint Nightmare and her solution.
This document provides an outline of the course content for SAP Business Objects 4.0 training. The training covers topics such as SAP Business Objects Web Intelligence, BI Launch Pad, Dashboards, Crystal Reports, Information Design Tool, Universe Design Tool, and administration. Specific topics include creating reports with queries, enhancing report presentation, calculating data, connecting to data sources, and customizing and scheduling BusinessObjects.
- Agnes Molnar is a SharePoint consultant and MVP who has contributed to books on SharePoint.
- SharePoint 2010 provides document management capabilities including document libraries, content types, document sets, metadata management, and workflows. Content types allow classifying content with properties, templates and workflows. Document sets are like folders that enable collaboration on related documents.
- Metadata, content organizer rules, and search help ensure content is well organized and findable. Content organizer rules route incoming documents to the proper libraries, but they do not apply to documents that are modified in place.
This document discusses key features in SharePoint related to document management, records management, and web content management. It describes document libraries, content types, templates, and policies that allow organizations to control the lifecycle of documents. It also explains records centers and the DoD 5015.2 standard for compliant records management, including routing records, holds, and an immutable records repository. Overall, the document outlines how SharePoint provides capabilities for organizations to centrally manage documents, records, and web content throughout their lifecycles.
This document discusses key features in SharePoint Server 2007 related to document management, records management, and web content management. It describes document libraries, content types, templates, and policies that allow organizations to control the lifecycle of documents. Records management concepts like routing records, holds, and DoD 5015.2 compliance are also summarized. The document aims to explain how these features can help organizations address challenges in managing documents and records.
Using SharePoint to solve business problems #spsnairobi2014 (Amos Wachanga)
This presentation was given by Amos Wachanga of Techno Brain Ltd at the SharePoint Saturday Nairobi event on 18 October 2014, held at Techno Brain HQ in Nairobi, Kenya.
The presentation opens with a business scenario, then introduces SharePoint and highlights key features that address the identified business problems, and finally ties everything together by walking through a typical solution for that scenario, using the case study and examples.
The Enterprise Content Management features in SharePoint have steadily improved with each new release of the platform. In this session, we will explore the top 10 new ECM features that have been added to SharePoint 2013, with an emphasis on "new". The session will include demos that showcase real-world examples of how each feature can be used to enhance the overall user experience when working with email, collaborative documents as well as official records.
Enterprise Document Management in SharePoint 2010 (Agnes Molnar)
This document discusses best practices for enterprise document management in SharePoint 2010. It provides an overview of the speaker and their sessions at an upcoming conference. The document then reviews SharePoint 2010's capabilities for document management, including document libraries, content types, metadata management, workflows, and search. It discusses best practices for organizing documents, both within and outside of SharePoint, as well as online and offline access through SharePoint Workspace.
Managed Metadata and Taxonomies in SharePoint 2013 (Chris McNulty)
This document contains the presentation slides for a talk given by Chris McNulty at the SPTechCon 2014 conference. The slides cover topics such as the SharePoint farm topology, content types, managed metadata, records management, and best practices for advanced enterprise content management and taxonomy in SharePoint 2013. Contact information is provided for Chris McNulty for any additional questions.
SPSNYC14 - Must Love Term Sets: The New and Improved Managed Metadata Service... (Jonathan Ralton)
This document summarizes a presentation on the new and improved Managed Metadata Service in SharePoint 2013. The presentation covers the content management capabilities in SharePoint, the services architecture including service applications and proxies, and the new information architecture features in the Term Store. Key changes discussed include content type syndication across site collections using the Content Type Hub and enhanced management of terms, term sets and term set groups in the centralized Term Store.
SPS Philly Architecting a Content Management Solution (Patrick Tucker)
This document summarizes Patrick Tucker's presentation on architecting a SharePoint 2013 content management solution. It discusses defining and organizing content through tools like content types, metadata, and taxonomies. It also covers tracking and routing content with features like the content organizer and document ID service. Finally, it discusses management and retention of content through policies, holds, records management, and eDiscovery tools in SharePoint 2013.
Overview of how to improve records management and findability using SharePoint 2010, EMM, the Term Store, Content Types, and ConceptClassifier for SharePoint.
This document proposes a new system for extracting data from JavaScript web applications. It aims to address the challenge of extracting dynamic content that is generated through asynchronous JavaScript calls. The key aspects of the proposed system include using a headless browser to fetch pages and deal with dynamically generated DOM, extracting JSON data cached in pages rather than parsing DOM trees, and allowing users to define extraction rules and schedules through a configuration system. The preliminary results showed the system could successfully extract data from modern JavaScript web applications.
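The "extract JSON cached in pages rather than parse DOM trees" idea can be sketched as follows. Note that the `window.__DATA__` variable name and the flat-object regex are illustrative assumptions: real pages use varying variable names, and deeply nested objects require a proper parser rather than a non-greedy regex.

```python
import json
import re

# Many JavaScript-heavy pages ship their data as a JSON blob embedded in a
# <script> tag (e.g. window.__DATA__ = {...};). Pulling that blob out is
# often simpler and more robust than parsing the rendered DOM tree.
DATA_RE = re.compile(r"window\.__DATA__\s*=\s*(\{.*?\});", re.DOTALL)

def extract_embedded_json(html):
    """Return the first embedded window.__DATA__ object as a dict, or None."""
    match = DATA_RE.search(html)
    if not match:
        return None
    return json.loads(match.group(1))

# Example page with a flat JSON object embedded in a script tag
html = '<script>window.__DATA__ = {"items": [1, 2, 3], "total": 3};</script>'
data = extract_embedded_json(html)
```

In the proposed system, a headless browser would first render the page so that asynchronously loaded data is present before extraction rules like this are applied.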
This document summarizes a presentation on using SharePoint 2013 for enterprise content management (ECM) and records management (ERM). It discusses why organizations use SharePoint for these purposes due to its cost advantages over competitors and integration capabilities. It outlines SharePoint's ECM and ERM features and limitations. It provides examples of overcoming limitations through custom configurations and third-party tools. The presentation emphasizes aligning ERM with business needs and integrating it with everyday processes rather than creating isolated records systems.
Who says you can't do records management in SharePoint? (John F. Holliday)
Although records management features have steadily improved with each new SharePoint version, many industry observers are starting to express their doubts as to whether SharePoint is a viable platform for building real-world ERM solutions. This session will explore the enhanced RM capabilities of SharePoint 2013 and show how to leverage them to full advantage. The session will also introduce several third-party tools that further enhance the platform to enable true enterprise-class content lifecycle management.
The document discusses web mining, which involves applying data mining techniques to extract useful information from the web. It describes the three types of web mining: web content mining of web page content, web structure mining of hyperlink structures, and web usage mining of web server logs to analyze user behavior patterns. Challenges of web mining include the complexity and dynamic nature of web data as well as diversity of users. Applications of web mining include marketing, data analysis, audience behavior analysis, and advertising campaign analysis.
This document provides an overview of web mining. It defines web mining as using data mining techniques to automatically discover and extract information from web documents and services. It discusses the differences between web mining and data mining, and covers the main topics in web mining including web graph analysis, structured data extraction, and web advertising. It also describes the different approaches of web content mining, web structure mining, and web usage mining.
This document provides an overview of SharePoint document management capabilities including:
- SharePoint allows storage, collaboration, updating, management, archiving and restoring of documents and work items according to compliance policies.
- It reviews SharePoint versions and features like document libraries, version control, and content approval.
- An exercise walks through uploading a document to a library, enabling versioning and content approval, and checking the document in and out while making changes.
AIIM Seminar - SharePoint Crossroads May 23 - Bending but Not Breaking - Spea... (Bill England)
At the AIIM SharePoint seminar in DC this past May, Buildingi presented our experience moving a Project Knowledge Center (PKC) .NET application to SharePoint, and was joined by Joanna Elazrak from Microsoft, who spoke on 'Using SharePoint for Microsoft Records Management'.
This document introduces SharePoint 2010 for document compliance, management and automation. It provides an overview of Netwoven, a company that provides SharePoint services and custom development. It describes the key solution areas Netwoven addresses including SharePoint upgrades, portal development, and business intelligence. It also covers business needs for document management and the components required for compliance, management and automation like storage, security and search. Finally, it outlines scenarios for using SharePoint and its key document management features.
6. SIM Fanji: Database and Information Management (Yoyo Sudaryo)
This document summarizes key concepts about database and information management from Chapter 6 of a Management Information Systems textbook. It describes the problems with traditional file-based data storage, how database management systems address these issues, different database models (relational, hierarchical, network, object-oriented), database design principles, and new database trends like data warehousing and online analytical processing. It also discusses management opportunities and challenges in creating an effective corporate database environment.
This document discusses web usage mining and related processes. It begins with an introduction to web usage mining and its goal of analyzing user behavioral patterns on websites. It then covers topics like data collection and pre-processing, including cleaning, fusion, transformation, and reduction. Specific pre-processing techniques are described, such as sessionization, pageview identification, and user identification. The document also discusses data modeling and discovery of patterns, including various pattern types like decision trees, paths, groups, and associations. Finally, it covers potential applications and conclusions about web usage mining.
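The user-identification step described above is often approximated, when no login or cookie identifier is available, by grouping requests on the (IP address, user agent) pair. A minimal sketch, with an assumed log-entry schema for illustration:

```python
from collections import defaultdict

def identify_users(log_entries):
    """Approximate distinct users in a server log by grouping requests on
    (IP address, user agent) -- a standard heuristic when no login or
    cookie identifier is available. Each log entry is assumed to be a
    dict with 'ip', 'agent', and 'url' keys."""
    users = defaultdict(list)
    for entry in log_entries:
        key = (entry["ip"], entry["agent"])
        users[key].append(entry["url"])
    return dict(users)

# Example: three requests from two distinct (ip, agent) pairs
entries = [
    {"ip": "10.0.0.1", "agent": "Firefox", "url": "/home"},
    {"ip": "10.0.0.1", "agent": "Firefox", "url": "/about"},
    {"ip": "10.0.0.2", "agent": "Chrome", "url": "/home"},
]
users = identify_users(entries)  # two approximate users
```

This heuristic over-merges users behind shared proxies and splits users who switch browsers, which is why the document's pre-processing stage treats user identification as its own step.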
The document discusses web mining and analyzing user behavior data from web logs. It provides examples of different types of web mining including web structure mining, web content mining, and web usage mining. It also discusses criteria for evaluating user behavior data such as credibility, validity, and reliability. The document includes two case studies, one on the Institute for Policy Studies website which finds that most new visitors leave immediately and campaign traffic has a high bounce rate, and another on the City of Prague, Oklahoma website which sees most traffic from Oklahoma and high bounce rates from other countries.
This document discusses personal web usage mining, which involves analyzing individual user's web browsing and navigation data recorded on the client side, rather than server side web logs. It describes logging a user's local and remote web activities into an activity log, warehousing that data, mining it for patterns and profiles, and building tools to help and enhance the individual user's web experience. The goal is true personalization by understanding each user's interests and preferences to provide customized recommendations and assistance.
This document discusses personal web usage mining, which involves analyzing individual user's web browsing and navigation data recorded on the client side, rather than server side web logs. It proposes recording both remote activities sent to web servers as well as local on-desktop activities in an activity log. This log, along with cached web pages, would be stored and processed in a data warehouse to facilitate data mining and the development of tools and applications to understand users' interests and enhance their web experience.
ADV Slides: Building and Growing Organizational Analytics with Data Lakes (DATAVERSITY)
Data lakes are providing immense value to organizations embracing data science.
In this webinar, William will discuss the value of having broad, detailed, and seemingly obscure data available in cloud storage for purposes of expanding Data Science in the organization.
Part of a SharePoint Farm Administrator's role is to make sure the SharePoint farm is available for end users to access their information. This will typically include regular maintenance and support for the environment. This session will look at some of the activities that help keep your environment humming along, with tips and tricks, PowerShell help, and other useful suggestions along the way. The presentation is primarily geared toward SharePoint Farm Administrators, but general SharePoint users will get some great background information about the inner workings of SharePoint.
This document discusses web content mining and summarizes key concepts from a lecture on the topic. It covers extracting both structured and unstructured data from web pages, including lists, details pages, text, opinions and reviews. Pre-processing steps for web content mining are outlined, including removing HTML tags, identifying main content blocks, and detecting duplicate pages. Text preprocessing techniques like stop word removal and stemming are also summarized. The document concludes by discussing web spamming techniques used to improperly influence search engine rankings.
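A minimal sketch of the text pre-processing pipeline summarized above (tag removal, stop-word removal, crude stemming). The stop-word list and suffix rules here are illustrative assumptions; production systems typically use a full Porter or Snowball stemmer and a larger stop-word list:

```python
import re

# Tiny illustrative stop-word list; real lists contain hundreds of entries.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in"}

def strip_tags(html):
    """Remove HTML tags; crude but adequate for simple pages."""
    return re.sub(r"<[^>]+>", " ", html)

def simple_stem(word):
    """Very crude suffix-stripping stemmer (illustrative only; it can
    produce non-words like 'runn' where Porter stemming would not)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def preprocess(html):
    """Tag removal, lowercasing, tokenization, stop-word removal, stemming."""
    text = strip_tags(html).lower()
    tokens = re.findall(r"[a-z]+", text)
    return [simple_stem(t) for t in tokens if t not in STOP_WORDS]

tokens = preprocess("<p>The servers are running</p>")
```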
Dealing with Common Data Requirements in Your EnterpriseWSO2
To view recording of this webinar please use below URL:
http://wso2.com/library/webinars/2016/11/dealing-with-common-data-requirements-in-your-enterprise/
Today’s enterprises are challenged with fast growing data requirements. Unlike in the past, where organizations relied on a single database or isolated data silos, today’s enterprises need to cope with multiple data sources and complex access control requirements. They also need to analyze large amounts of data in order to gain insights into their business functions.
This webinar will discuss how the WSO2 platform can help deal with common enterprise data requirements such as data as service transactions, aggregation of corporate entities and management of fragmented data sources to build an efficient enterprise data management strategy.
The document discusses using big data architecture and Hadoop. It compares relational database management systems (RDBMS) to Hadoop, noting differences in schema, speed, governance, processing, and data types between the two. A scenario is presented of a trucking company collecting sensor data from vehicles via GPS, acceleration, braking etc. and how that data could flow through the Hadoop ecosystem using Flume, Sqoop, Hive, Pig, and Spark. Another example discusses acquiring and processing user event data from a bank. The document outlines the reference architecture and requirements extraction process for designing a big data system.
Traditional web analytics tools were not designed for today's digital landscape with multiple channels, devices, and data speeds. Forrester defined a new approach called "digital intelligence" to accommodate emerging needs. Splunk is a tool that provides digital intelligence by capturing machine data from various sources, allowing real-time insights, segmentation, correlation across data sources, and drilldown to original data. This provides businesses with a comprehensive view of customer interactions to optimize experiences and make better decisions.
Charlotte SPUG - Planning for MySites and Social in the EnterpriseMichael Oryszak
This document summarizes a presentation about planning for MySites and social features in SharePoint enterprises. It covers architecture and feature overviews, planning deployment considerations like MySite host configuration and capacity planning, planning for user profiles including custom attributes and privacy policies, and approaches to governance like appropriate use policies and quota management.
Data Systems Integration & Business Value Pt. 2: CloudDATAVERSITY
Certain systems are more data focused than others. Usually their primary focus is on accomplishing integration of disparate data. In these cases, failure is most often attributable to the adoption of a single pillar (silver bullet). The three webinars in the Data Systems Integration and Business Value series are designed to illustrate that good systems development more often depends on at least three DM disciplines (pie wedges) in order to provide a solid foundation.
Data Systems Integration & Business Value Pt. 2: CloudData Blueprint
The document discusses cloud-based integration and its prerequisites. It states that for organizations to benefit from cloud integration, data must be (1) of higher quality, (2) lower volume, and (3) more shareable than data residing outside the cloud. Investments in data engineering are needed to cleanse, reduce the size of, and increase the shareability of datasets so that organizations can realize increased capacity, flexibility, and cost savings from cloud-based computing. The webinar will show how to identify opportunities for cloud integration and properly oversee efforts to capitalize on those opportunities.
SQLSaturday 664 - Troubleshoot SQL Server performance problems like a Microso...Marek Maśko
The document discusses tools used by Microsoft engineers to troubleshoot SQL Server performance problems when assisting customers. It describes how the Performance and Diagnostic Monitor (PSSDiag) collects diagnostic data from a SQL Server and how Microsoft engineers analyze the collected data using tools like SQL Nexus and PAL to identify issues, root causes, and solutions.
Industry Ontologies: Case Studies in Creating and Extending Schema.org for In...MakoLab SA
The presentation introduces listeners into the details of the most important global semantic vocabulary build jointly by Google, Yahoo, Microsoft and Yandex: schema.org. It then discusses the experiences related to the creation of “hosted” extensions for the automotive industries (existing: auto.schema.org) and for the financial industries (in making: fibo.schema.org). The two extensions, built by an international team of specialists managed by MakoLab with full respect to the community processes, have two different creation strategies which will be presented and discussed.
The use cases for both vocabularies will be demonstrated. They are related to both “external” business effects (better visibility of the websites using them on the web) and “internal” effects (new kind of analytics and search capacities).
The presentation will also invite to participate to two W3C Community Groups responsible for the open communication activities around the two extensions.
The document discusses metadata in data warehousing and business intelligence contexts. Some key points:
1. Metadata provides information about data in a data warehouse or warehouse components like data marts. It describes data structures, attributes, transformations and more.
2. Metadata is important for tasks like ETL processing, querying, reporting and overall data management. It helps users understand what data is available and how to access and analyze it.
3. There are different types of metadata including technical metadata about data storage and processes, and business metadata that provides business definitions and rules. Maintaining accurate and consistent metadata is vital for a successful data warehouse.
The document discusses databases and database management systems. It provides examples of common database applications like banking, universities, sales, and airlines. It defines what a database is, the role of a database management system, and examples of DBMS software. It also compares the advantages and disadvantages of using a database system versus a traditional file system to store data. Key benefits of a DBMS include supporting complex queries, controlling redundancy and consistency, handling concurrent access from multiple users, and providing security and data recovery.
Build applications with generative AI on Google CloudMárton Kodok
We will explore Vertex AI - Model Garden powered experiences, we are going to learn more about the integration of these generative AI APIs. We are going to see in action what the Gemini family of generative models are for developers to build and deploy AI-driven applications. Vertex AI includes a suite of foundation models, these are referred to as the PaLM and Gemini family of generative ai models, and they come in different versions. We are going to cover how to use via API to: - execute prompts in text and chat - cover multimodal use cases with image prompts. - finetune and distill to improve knowledge domains - run function calls with foundation models to optimize them for specific tasks. At the end of the session, developers will understand how to innovate with generative AI and develop apps using the generative ai industry trends.
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...sameer shah
"Join us for STATATHON, a dynamic 2-day event dedicated to exploring statistical knowledge and its real-world applications. From theory to practice, participants engage in intensive learning sessions, workshops, and challenges, fostering a deeper understanding of statistical methodologies and their significance in various fields."
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeWalaa Eldin Moustafa
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...Social Samosa
The Modern Marketing Reckoner (MMR) is a comprehensive resource packed with POVs from 60+ industry leaders on how AI is transforming the 4 key pillars of marketing – product, place, price and promotions.
The Ipsos - AI - Monitor 2024 Report.pdfSocial Samosa
According to Ipsos AI Monitor's 2024 report, 65% Indians said that products and services using AI have profoundly changed their daily life in the past 3-5 years.
2. Web Usage Mining
• Mining the behavior of human users
• Understand the customers
• Track their behavior and make recommendations
• Customize the appearance of the site
• Based on clickstream analysis
– This is the lowest level of data
– Needs to be aggregated to session-level data
12/3/2018 Professor V. Nagadevara
3. Web Usage Mining
• Analyze Click-stream Data
– From client or server point of view
• Used for
– Personalization
– Determining frequent access patterns
– Caching
– Improving sales and advertisement
4. Sources of Data
• Web server log files
• Page tags
• Cookies
5. Types of click-stream data
• Site centric
– Server log files of a website
– Information on behavior within the website
– Information of cookie ID and IP address
– Lack information regarding activity on other sites
(competing sites?)
6. Web Server Log Files
• Also called click stream data
• The log files are customized by the server.
There are four general formats:
– NCSA Common Log (Access Log format),
– NCSA Combined Log,
– NCSA Separate Log, and
– W3C Extended Log
7. NCSA Common Log
• Includes the client IP address, client identifier,
visitor username, date and time, HTTP
request, status code for the request, and the
number of bytes transferred
• 172.21.100.30 – nagadev [18/Dec/2013:11:25:15 +0530] “GET /index.html HTTP/1.0” 200 1043
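The field layout above maps directly onto a regular expression. A minimal Python sketch (the group names are mine, not part of the NCSA format):

```python
import re

# NCSA Common Log layout: host ident authuser [date] "request" status bytes
COMMON_LOG = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def parse_common_log(line):
    """Parse one Common Log line into a dict, or return None if malformed."""
    m = COMMON_LOG.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    # The HTTP request itself has three subfields: method, resource, protocol.
    parts = rec["request"].split()
    rec["method"], rec["resource"], rec["protocol"] = (parts + [None] * 3)[:3]
    return rec

line = '172.21.100.30 - nagadev [18/Dec/2013:11:25:15 +0530] "GET /index.html HTTP/1.0" 200 1043'
rec = parse_common_log(line)
print(rec["ip"], rec["resource"], rec["status"])  # 172.21.100.30 /index.html 200
```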
8. NCSA Combined Log
• common log plus
– the referring URL, the visitor’s Web browser and
operating system information, and the cookie
• 172.21.100.30 – nagadev [18/Dec/2013:11:25:15 +0530] “GET /index.html HTTP/1.0” 200 1043 “http://www.dataminingresources.blogspot.com” “Mozilla/4.05 [en] (WinNT; I)” “USERID=CustomerA; IMPID=01234”
9. NCSA Separate Log
• Same information as the combined log, but in three separate files: the access log, the referral log, and the agent log
Common Log: 172.21.100.30 – nagadev [18/Dec/2013:11:25:15 +0530] “GET /index.html HTTP/1.0” 200 1043
Referral Log: [18/Dec/2013:11:25:15 +0530] “http://www.dataminingresources.blogspot.com/”
Agent Log: [18/Dec/2013:11:25:15 +0530] “Microsoft Internet Explorer - 7.0”
10. W3C Extended Log
• Provides better control and manipulation of data while producing a log file readable by most Web analytics tools
• #Software: Microsoft Internet Information Services 6.0
• #Version: 1.0
• #Date: 2009-05-24 20:18:01
• #Fields: date time c-ip cs-username s-ip s-port cs-method cs-uri-stem cs-uri-query sc-status sc-bytes cs-bytes time-taken cs(User-Agent) cs(Referrer)
• 2009-05-24 20:18:01 172.224.24.114 - 206.73.118.24 80 GET /Default.htm - 200 7930 248 31 Mozilla/4.0+(compatible;+MSIE+7.01;+Windows+2000+Server) http://54.114.24.224/
11. W3C Extended log
• Can be extended with customized fields
• #Software: Microsoft Internet Information Services 6.0
• #Version: 1.0
• #Date: 2002-05-24 20:18:01
• #Fields: date time c-ip cs-username s-ip s-port cs-method cs-uri-stem cs-uri-query sc-status sc-bytes cs-bytes time-taken cs(User-Agent) cs(Referrer)
• 2002-05-24 20:18:01 172.224.24.114 - 206.73.118.24 80 GET /Default.htm - 200 7930 248 31 Mozilla/4.0+(compatible;+MSIE+5.01;+Windows+2000+Server) http://64.224.24.114/
12. Page Tags
• This is client-side data collection
• Tags (JavaScript snippets) are added to web pages
• When web pages are downloaded, the “tags” are also downloaded
• These tags are then “executed” and information is sent to a data center by requesting a small file, appending a long query string to the request – called a “Web Bug”
• The data center parses the query and sends the file, completing the transaction
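The “Web Bug” transaction above can be sketched from the collecting side: the tag requests a tiny file and appends the collected data as a long query string, which the data center parses. The URL and parameter names below are invented for illustration:

```python
from urllib.parse import urlparse, parse_qs

# A hypothetical web-bug hit: a 1x1 image requested by the page tag,
# with the collected data appended as URL-encoded query parameters.
hit = ("http://collector.example.com/pixel.gif"
       "?page=%2Fproducts%2Fitem42&res=1280x800&visitor=abc123")

# The data center's side of the transaction: parse the query string.
params = parse_qs(urlparse(hit).query)
record = {k: v[0] for k, v in params.items()}
print(record["page"], record["visitor"])  # /products/item42 abc123
```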
13. Page Tags
• Tags can be customized
• Variables can be pre-determined and pre-formatted
• Cookies can be dropped for unique identification
• Data can be parsed automatically
• More accurate because collection is client-side: crawlers don’t really render pages!
• Data can be reported/analyzed in real time
14. Page Tags
• Issues
– Dependence on JavaScript
– Adding tags to each page (manual tagging is very difficult)
– Adds “weight” to pages
– Errors on pages or failed downloads
– Vendors do not like individual customization
– Ownership of data is an issue
– Privacy issues
15. Cookies
• Used for identifying the uniqueness of the user
• Can be deleted or prevented
• A first-party cookie is dropped (served) directly from the website
• Third-party cookies are served from another domain; these can “observe” the user’s behavior across multiple domains
16. Primary Groups of Data
• Usage data
• Content data
• Structure data
• User Data
17. Usage Data
• “Page View” is the most basic level
– “Aggregate representation of a collection of web objects contributing to the display on a user’s browser resulting from a single user action (click)”
– It is a collection of web objects or resources representing a specific user event
– E.g. reading an article, viewing a product list, viewing a detailed list, adding an item to the cart
18. Usage Data
• Session
– “A session is a sequence of page views by a single
user during a single visit”
– We normally select a subset of page views that are
significant or relevant for the analysis
19. Content Data
• “Collection of objects and relationships that is conveyed to the user”
• Consists of static pages, multimedia files, dynamic page segments, records from operational databases, etc.
• Also includes conceptual hierarchies such as product categories
20. Structure Data
• “Represents the designer’s view of the content organization”
• Captured by the inter-page linkage structure between pages
• Reflected by hyperlinks
21. User Data
• Information regarding user profile
• Demographic information on registered users
• Past purchases
• Reviews and ratings
• Visit histories
• Anonymous information collected by cookies
22. Data Pre-processing
• Data Fusion and Cleaning
• Page View identification
• User identification
• Sessionization
• Path Completion
• Data Integration
23. Data Fusion and Cleaning
• Data is drawn from multiple web or application servers
• Data fusion is merging log files from different servers
• Cleaning involves removal of unnecessary data from log files
• Removal of crawler navigation (by crawler name or by heuristics)
• “Keynote”, a performance monitoring system, accessed the source site for KDD Cup 2000 three times per minute, all day, every day!
24. Page View Identification
• Requires understanding of the structure of the
site, page contents, site domain knowledge
• Can be single file (one-to-one relationship
correspondence with page view)
• Can be a collection of objects, or dynamically
constructed page
• Can be hierarchical list (eg. Information pages,
product views, registration, shopping cart
changes, payment etc.)
25. User Identification
• Easy if the user has to log in
• IP addresses are commonly used but not very accurate (problems with proxy servers)
• Combination of IP address and browser helps
• More difficult across different sessions (multiple machines and multiple users)
• Cookies are a possible option
– Different browsers
– Different computers
– Cookies are deleted!
26. Sessionization
• Process of identifying the page views
requested by a single user in a single session
• Find all page requests from the same user and
group them using heuristics
• Issue a “session id”
• Modify the URL in the log record to include
session id
• Decide when the session ended!
27. Sessionization
• Time-oriented heuristics
– Total session duration may not exceed Θ
– Total time on a page may not exceed δ
• Referrer-oriented
– A request q is added to the session S if the referrer for q was previously invoked in S
– Otherwise q is the starting point for a new session
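The time-oriented heuristics can be sketched as follows; the values chosen for Θ and δ are assumptions for illustration, not from the source:

```python
# Time-oriented sessionization: close the session when either heuristic fires.
# THETA bounds total session duration; DELTA bounds the stay on a single page.
THETA = 30 * 60   # max total session duration, in seconds (assumed value)
DELTA = 10 * 60   # max time on one page, in seconds (assumed value)

def sessionize(requests):
    """requests: time-ordered list of (timestamp_seconds, url) for ONE user.
    Returns a list of sessions, each a list of (timestamp, url)."""
    sessions, current = [], []
    for ts, url in requests:
        if current and (ts - current[0][0] > THETA or ts - current[-1][0] > DELTA):
            sessions.append(current)   # either heuristic violated: close session
            current = []
        current.append((ts, url))
    if current:
        sessions.append(current)
    return sessions

hits = [(0, "/index"), (120, "/search"), (3000, "/profile"), (3100, "/msg")]
print(len(sessionize(hits)))  # the long gap before /profile starts a new session: 2
```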
28. Sessionization
• Episode
– A subset of relevant page views in a session
– Comprising functionally or semantically related page views
– Requires classification of page views into functional or concept categories
29. Path Completion
• The paths are incomplete
– Caching leads to missing entries
– Caching by proxy servers
– Back button creates missing links
• Session log contains time stamps which can be
mined
– Missing pages do not have time stamps
– Dynamic pages are unique and not cached!
• Requires knowledge of the site structure and referrer
information
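A sketch of the structure/referrer-based inference described above: when a requested page is not reachable from the current page, assume the user pressed “Back” (serving cached pages that never reach the log) until a previously visited page that does link to it, and re-insert those page views. The LINKS graph and page names here are invented for illustration:

```python
# Hypothetical site link structure: page -> set of pages it links to.
LINKS = {
    "/index": {"/search", "/profile"},
    "/search": {"/results"},
    "/results": {"/item1", "/item2"},
    "/item1": set(), "/item2": set(), "/profile": set(),
}

def complete_path(clicks):
    """Re-insert cached 'Back' page views missing from the logged click path."""
    path = [clicks[0]]
    for nxt in clicks[1:]:
        back = len(path) - 1
        # Walk back through already-visited pages until one links to nxt.
        while back > 0 and nxt not in LINKS.get(path[back], set()):
            back -= 1
            path.append(path[back])   # re-insert the cached page view
        path.append(nxt)
    return path

# /item2 was reached by going Back from /item1 to /results (served from cache)
print(complete_path(["/index", "/search", "/results", "/item1", "/item2"]))
# → ['/index', '/search', '/results', '/item1', '/results', '/item2']
```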
30. Data Integration
• Pre-processing results in a set of sessions or episodes
• Other data (demographics, ratings, past purchases
etc.) needs to be integrated to lead to WA/BI metrics
such as customer conversion ratios, lifetime value
• Additional data – shopping cart changes, shipping and
address info, click throughs, impressions
• The transactional database is extracted into data marts or OLAP cubes after a certain amount of aggregation
31. Modeling
• Statistical Analysis
– Aggregated by pre-determined units (days, sessions,
visitors etc.)
– Most frequent pages, average view time, length of
path, entry and exit etc.
– Referrers, user agents, requested resources
– Usually presented in bar charts, tables and
comparative tables
32. Modeling
• Segmentation – use cluster analysis
• Associations and correlation analysis
• Frequent item-set graph
• Sequential and navigational patterns
• Predictive analytics using classification
techniques
34. Information from Web Analytics
How many visitors visit the page daily?
Who are the regular visitors?
What percentage of the visitors to the page are registered users?
What are the top pages that are visited on the web page?
What is the average visit time on the website?
How often does the visitor return to the site?
What is the average page depth of a visitor?
What is the geographic distribution of users of the website?
[Diagram: Web Analytics → Personalization, System Improvement, Site Modification, Business Intelligence, Usage Characteristics]
35. Objectives of the Study
• The objectives of this study are to
– Explore Web analytics and its usefulness to web
based business.
– Identify the techniques used in click stream
analysis.
– Identify the application of click stream analysis
through analyzing click stream data obtained from
a particular website using appropriate click stream
analysis techniques.
36. Methodology
• This study analyzes the click stream data obtained from a web site, which
specializes in an online information exchange service to facilitate
identification of suitable partners, in India and other countries.
• The site has a very different revenue model. The visitors are allowed to
browse through the site without any initial payment. The visitors are
allowed to look at the profiles of prospective partners free of charge. The
visitors will have to become members by making a one-time payment only
when they need to contact the prospective brides or grooms.
• Users can search for profiles through advanced search options on the site
on various preferences ranging from basic details of preferred partner to
lifestyle, career, education, profession etc.
37. Methodology
• Members can make initial contact with each other through services
available via Chat, SMS, and e-mail.
• Users can avail free registration on the website and are assured of
exclusive privacy and confidentiality. The website allows the users to
create their profiles, search for other profiles, and express interest in
other profiles and contact others. Registration and creating a profile is free
of cost.
• Registered users can become paid members that will allow them to
contact others, view contact details of other members, write personalized
messages, initiate chats and let other members view their contact details.
Paid memberships are provided for a specified duration.
38. Methodology
• The click stream data is analyzed to identify different
paths taken by the visitors and the sequence of
pages that lead to payment of membership fee.
Based on this analysis, specific strategies are
recommended to maximize the revenue for the
website.
39. DATA PREPARATION
Problem : Format of data
– Clickstream data files are neither delimited nor fixed length files
Solution:
– Used the date in the clickstream as the delimiter to import data to database
– Have to perform string handling in database to separate out the fields
10.208.65.96 172.16.8.37, 124.124.35.130 - - [23/May/2008:00:00:00 -0400] "GET /billing/billing.php?user=&cid=22401528da14a61c43512fa025b59578i353273 HTTP/1.0" 200 1832
10.208.65.96 68.126.193.219 - - [23/May/2008:00:00:00 -0400] "GET /profile/js/common.js HTTP/1.1" 200 12462
10.208.65.96 59.95.71.32 - - [23/May/2008:00:00:00 -0400] "GET /P/css/comm_style.css HTTP/1.1" 200 2640
10.208.65.96 122.163.70.145 - - [23/May/2008:00:00:00 -0400] "GET /P/search.php?checksum=&searchchecksum=16465054&j=300&newsearch=&inf_checksum=&castemapping=&crmback=&searchorder=T&label_select_no=&savesearch=&from_index=&viewall=&save_search_redirect=&hide_search_bar=y HTTP/1.1" 200 21561
10.208.65.96 61.1.81.153 - - [23/May/2008:00:00:00 -0400] "GET /P/css/homestyle.css HTTP/1.1" 304 26
10.208.65.96 68.197.236.117 - - [23/May/2008:00:00:00 -0400] "GET /profile/mainmenu.php?checksum=3590208069017f9d75933dfa9ac9005d|i|537f26ca181f05c308393257397ab261i2810388 HTTP/1.1" 200 3333
10.208.65.96 172.16.25.60, 59.145.189.43 - - [23/May/2008:00:00:00 -0400] "GET /P/css/homestyle.css HTTP/1.0" 304 26
10.208.65.96 10.232.65.96, 10.232.49.1, 203.126.136.220 - - [23/May/2008:00:00:00 -0400] "GET /profile/mainmenu.php?checksum= HTTP/1.1" 200 3329
40. Data
• Data is obtained from the site in the form of click stream
records. Each record consists of the details of clicks by the
visitors and each record contains the following details:
– Server IP
– Client IP
– Time stamp with Date
– Status: HTTP Status code
– URL requested: has three subfields namely The request method,
resource requested and the protocol used
– No. of bytes transferred
• The country of origin for a specific request is identified using
the IP address.
41. Data
• URL is used to identify the information/web page browsed by the
visitors.
• Time stamp of each click is used to sequence the movement of the
visitors across different pages in the website.
• Identifying a unique user session is an important step in the analysis
of click stream data. Inactivity for more than 30 minutes is
considered as a break of session.
• This is an approximation since there could be multiple users
accessing from the same IP, or the same user accessing from
different IPs.
• In the absence of additional data, hits from each unique IP are considered as belonging to a unique user and a unique session.
42. No of Sessions
Day     Number of sessions   Number of clicks
Day 1   23,440               460,211
Day 2   22,717               453,977
Day 3   24,694               461,518
43. DATA PREPARATION
Problem 3: Volume of data
– Volume of data is huge. Performing string handling on this volume hits performance
Solution:
– Convert data fields into non-string fields: dates as dates, numbers as numbers, etc.
– Remove unnecessary data (server IP)
– Process data in batches of 100,000 records
– Database tuning, indexing and query tuning required
– Over 1,500 lines of code written
– Processing still required more than 24 hours of run time

Day         Number of records
24-May-08   6,285,949
25-May-08   6,061,424
26-May-08   6,298,494
44. DATA PREPARATION
Analyzing information in the clickstream:

10.208.65.96 10.232.65.96, 10.232.49.1, 203.126.136.220 - - [23/May/2008:00:00:00 -0400] "GET /profile/mainmenu.php?checksum= HTTP/1.1" 200 3329

Field           Description
Server IP       IP address of the server, e.g. 10.208.65.96
Client IP       IP address of the client, e.g. 10.232.65.96, 10.232.49.1, 203.126.136.220
Date Time       Date and time of the click (server date/time), e.g. [23/May/2008:00:00:00 -0400]
URL requested   Request line exactly as it came from the client. It has 3 subfields: the request method (GET), the resource requested (/profile/mainmenu.php?checksum=), and the protocol used (HTTP/1.1)
Status          The HTTP status code returned to the client, e.g. 200
Bytes           The content-length of the document transferred, e.g. 3329
45. Data Preparation
• Getting additional information
– IP addresses allocation by country
– Website mapping (identifying key actions on the website)
– Identifying visitors, registered users and paid users through
the actions performed on the website
• Data transformation
– Extract client IP address
– Represent time as number of seconds past midnight
– Extract web action from the URL string
– Day of the week
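The transformation steps listed above can be sketched on one parsed record; the input field names here are assumptions, not the study's actual schema:

```python
from datetime import datetime

def transform(record):
    """Derive the analysis fields listed above from one parsed log record."""
    ts = datetime.strptime(record["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    return {
        # first client IP follows the (constant) server IP in the IP chain
        "client_ip": record["ips"].split()[1].rstrip(","),
        "seconds_past_midnight": ts.hour * 3600 + ts.minute * 60 + ts.second,
        "day_of_week": ts.strftime("%A"),
        # the "web action" is taken here to be the script name in the URL path
        "action": record["request"].split()[1].split("?")[0].rsplit("/", 1)[-1],
    }

rec = {"ips": "10.208.65.96 68.197.236.117",
       "timestamp": "23/May/2008:00:00:05 -0400",
       "request": "GET /profile/mainmenu.php?checksum= HTTP/1.1"}
t = transform(rec)
print(t["client_ip"], t["seconds_past_midnight"], t["action"])
# 68.197.236.117 5 mainmenu.php
```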
50. Data Preparation
• Session Identification
– Each unique client IP address is considered as a unique user
– A break of more than 30 minutes between clicks is considered
as the end of one session
– Clicks in a session are ordered by the time of occurrence
• Session Sampling
– Data volume is huge, need to select sample sessions for further
analysis
– Sessions having between 50 and 100 clicks are selected for further analysis
– Only those records that relate to a specific user action are
retained, remaining records are discarded.
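The two rules above (a 30-minute inactivity break ends a session; only sessions of 50–100 clicks are sampled) can be sketched as:

```python
GAP = 30 * 60  # inactivity longer than 30 minutes ends a session (seconds)

def split_sessions(timestamps):
    """timestamps: sorted click times (seconds) for one client IP."""
    sessions, current = [], [timestamps[0]]
    for ts in timestamps[1:]:
        if ts - current[-1] > GAP:     # break of more than 30 minutes
            sessions.append(current)
            current = []
        current.append(ts)
    sessions.append(current)
    return sessions

def sample(sessions, lo=50, hi=100):
    """Keep only sessions whose click count falls in [lo, hi]."""
    return [s for s in sessions if lo <= len(s) <= hi]

# 60 clicks a minute apart, then a long gap, then 5 more clicks.
clicks = list(range(0, 60 * 60, 60)) + list(range(3 * 3600, 3 * 3600 + 300, 60))
sessions = split_sessions(clicks)
print([len(s) for s in sessions], len(sample(sessions)))  # [60, 5] 1
```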
51. DATA PREPARATION
10.208.65.96 61.1.81.153 - - [23/May/2008:00:00:00 -0400] "GET /P/css/homestyle.css HTTP/1.1" 304 26
10.208.65.96 68.197.236.117 - - [23/May/2008:00:00:00 -0400] "GET /profile/mainmenu.php?checksum=3590208069017f9d75933dfa9ac9005d|i|537f26ca181f05c308393257397ab261i2810388 HTTP/1.1" 200 3333
10.208.65.96 172.16.25.60, 59.145.189.43 - - [23/May/2008:00:00:00 -0400] "GET /P/css/homestyle.css HTTP/1.0" 304 26
10.208.65.96 10.232.65.96, 10.232.49.1, 203.126.136.220 - - [23/May/2008:00:00:00 -0400] "GET /profile/mainmenu.php?checksum= HTTP/1.1" 200 3329

Day             Number of sessions   Number of clicks
24th May 2008   23,440               460,211
25th May 2008   22,717               453,977
26th May 2008   24,694               461,518

Day         Number of records
24-May-08   6,285,949
25-May-08   6,061,424
26-May-08   6,298,494
53. DATA PREPARATION
Learnings:
Clickstream data should be processed at runtime, or at least on a daily basis. Processing this data in batches is not efficient.
Have a mechanism to capture the user ID of the person logged on. This is very important information that is missing in the clickstream data.
57. Exit Points
Last action performed in a session
[Bar chart: counts (0–7,000) of the last action performed in a session, for actions including logout, view profile, contact_hit_try, photocheck, index, mm_showmsg, top_search_band, mainmenu, contacts_made_received, search_clustering, search, single_contact_aj, login, mem_comparison, sim, profile_search]
61. Associations
Consequent    Antecedent 1     Antecedent 2     Antecedent 3     Antecedent 4   Support %   Confidence %
Payment = T   Photorequest=T   memcomp=T                                        100         73.1
Payment = T   Country=India    Photorequest=T   memcomp=T                       80          73
Payment = T   Login=T          Photorequest=T   memcomp=T                       60          73
Payment = T   ViewProfile=T    Photorequest=T   memcomp=T                       90          72.8
Payment = T   ViewProfile=T    Login=T          Photorequest=T   memcomp=T      60          72.5
Payment = T   Country=India    ViewProfile=T    Photorequest=T   memcomp=T      70          71.4
Payment = T   Mmshowmsg=T      Photorequest=T   memcomp=T                       50          67.2
Payment = T   ViewProfile=T    Mmshowmsg=T      Photorequest=T   memcomp=T      50          66.4
62. Summary and Conclusions
• Usage of the website by time of the day.
– This helps identify busy hours, and provides information on the server capacity required for the website and when a maintenance window can be scheduled.
• Usage of the website from different geographic locations.
– This provides the distribution of users across geographical locations.
• Exit screens
– provide information on where the users exit from the
website. This input can help redesign the webpage if it
provides information on which pages are breaking the flow
of the user session.
63. Summary and Conclusions
• Most accessed and least accessed pages
– This can be used for variable pricing of advertisements on the web page. It can also be used for better user interface design and space utilization, by removing or repositioning the links that are infrequently accessed.
• Associations
– Provide information on unique actions on the website and the
sequence in which the user has performed these actions. This
can be used in better user interface design.
• Web diagrams
– Give information on co-occurrence of actions on the webpage and their significance – also provide inputs on user interface design.