Webmining (1)

Web Mining
Low precision
Low recall
Discovering new knowledge from the web
Personalised web page synthesis
Learning about individual users

Web Content Mining
Web Page Content Mining
Search Result Mining

Web Structure Mining
1. PageRank
i. PageRank Algorithm
ii. Standing of a Node
2. Traversing and Intrinsic Links
3. Reference Nodes and Index Nodes
i. Index nodes
ii. Reference Nodes
4. Clustering and Determining Similar Pages
i. Bibliographic Coupling
Bibliographic coupling occurs when two works reference a common third work in their bibliographies.
ii. Co-citation
Co-citation is defined as the frequency with which two documents are cited together by other documents [1]. If at least one other document cites two documents in common, these documents are said to be co-cited.
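The two similarity measures defined above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the citation data and the names `cites`, `bibliographic_coupling`, and `co_citation` are hypothetical and chosen only for this example.

```python
# Hypothetical citation data: each key cites the documents in its set.
cites = {
    "A": {"C", "D"},
    "B": {"C", "D", "E"},
    "F": {"A", "B"},
    "G": {"A", "B"},
}


def bibliographic_coupling(cites, x, y):
    """Count the works that both x and y reference."""
    return len(cites.get(x, set()) & cites.get(y, set()))


def co_citation(cites, x, y):
    """Count the documents that cite both x and y."""
    return sum(1 for refs in cites.values() if x in refs and y in refs)


coupling_ab = bibliographic_coupling(cites, "A", "B")  # A and B both reference C and D
cocited_ab = co_citation(cites, "A", "B")              # F and G each cite both A and B
print(coupling_ab, cocited_ab)
```

Coupling looks at outgoing references (what two pages point to), while co-citation looks at incoming references (who points to both), so the two measures can disagree for the same pair of pages.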

Figure: Bibliographic Coupling and Co-Citation
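The slides list the PageRank algorithm under Web Structure Mining without giving details. As a rough sketch, the commonly described power-iteration form can be written as follows. The damping factor of 0.85, the fixed iteration count, the even redistribution of rank from pages without out-links, and the example graph are all assumptions made for this illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by dangling pages (no out-links) is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for node, targets in links.items():
            if targets:
                # Each page passes an equal share of its rank to every out-link.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web graph.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, page C receives links from three pages and ends up with the highest score, while D, which nothing links to, keeps only the baseline share.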

Web Usage Mining
General Access Pattern Tracking
Customized Usage Tracking

Text Mining
Information Retrieval
Information Extraction
Computational Linguistics

Unstructured Text
● Features
○ Word Occurrences
○ Stop Words
○ Latent Semantic Indexing
○ Stemming
○ n-Grams
○ POS (Part-of-Speech)
○ Positional Collocations
○ Higher Order Features
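Three of the features listed above (word occurrences, stop words, and n-grams) can be illustrated with a short Python sketch. The stop-word list here is a tiny assumed sample, the tokenizer is a deliberately simple regular expression, and the function names are hypothetical. Stemming, POS tagging, and latent semantic indexing are not shown.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "from"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


text = "Web mining is the discovery of new knowledge from the web"
tokens = remove_stop_words(tokenize(text))
counts = Counter(tokens)      # word occurrences
bigrams = ngrams(tokens, 2)   # n-grams with n = 2
print(tokens)
print(counts.most_common(1))
print(bigrams)
```

Removing stop words before building n-grams is one possible ordering; keeping them would preserve phrases such as "of new", so the choice depends on the task.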

Episode Rule Discovery for Texts
Hierarchy of Categories
Text Clustering
● Scatter/Gather
