Weblog Extraction With Fuzzy Classification Methods
Presentation for the Second International Conference on the Applications of Digital Information and Web Technologies

Published in: Education, Technology

Transcript

  • 1. Weblog Extraction with Fuzzy Classification Methods
    Edy Portmann -
    University of Fribourg - Switzerland
  • 2. Content
    Introduction
    Weblog extraction – Folksonomies - Fuzzy logic – Fuzzy data clustering
    Fuzzy weblog extraction
    Building blocks – Interface - Query engine - Meta search engine - Aggregated documents
    Example
    Concluding Remarks
    Questions and Answers
  • 3. Weblog extraction
    Website with regular (reverse-chronological) entries of comments, descriptions of events, or other material
    Weblogs provide instant news on a particular subject, and readers can leave comments
    Data extraction is the act or process of retrieving data out of unstructured data sources
  • 4. Folksonomies
    Practice and technique of collaboratively creating and managing tags to annotate and categorize content
    Freely chosen keywords instead of controlled vocabulary
    User-generated taxonomy
    • To harvest social knowledge from tags
    • To generate an ontology
  • 5. Fuzzy logic
    [Figure: membership level (0 to 1) for the fuzzy sets Infant, Teenager and Adult, with range labels 0 to 7.5, 7.5 to 10, 10 to 20, 20 to 22.5, and 22.5 upward]
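The membership curves on this slide can be sketched as trapezoidal membership functions. This is a minimal illustration, not the presenter's implementation: it assumes the horizontal axis is age, that the breakpoints are the 7.5, 10, 20 and 22.5 values visible in the figure labels, and that the transitions are linear. The names `trapezoid` and `age_memberships` are hypothetical.

```python
INF = float("inf")


def trapezoid(x, a, b, c, d):
    """Membership is 0 outside [a, d], 1 on [b, c], and linear in between."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)   # rising edge between a and b
    return (d - x) / (d - c)       # falling edge between c and d


def age_memberships(age):
    """Degree to which an age belongs to each fuzzy set (assumed breakpoints)."""
    return {
        "Infant": trapezoid(age, -INF, -INF, 7.5, 10.0),
        "Teenager": trapezoid(age, 7.5, 10.0, 20.0, 22.5),
        "Adult": trapezoid(age, 20.0, 22.5, INF, INF),
    }


print(age_memberships(5.0))    # fully Infant
print(age_memberships(8.75))   # halfway between Infant and Teenager
print(age_memberships(21.25))  # halfway between Teenager and Adult
```

In the overlapping ranges an age belongs to two sets at once, each with a degree below 1, which is the contrast with a crisp (Boolean) classification.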
  • 6. Hard vs. fuzzy clustering
    In hard clustering, data is divided into distinct clusters, where each data element belongs to exactly one cluster
    In fuzzy clustering, data elements can belong to more than one cluster, and associated with each element is a set of membership levels
  • 7. Content
    Introduction
    Weblog extraction – Folksonomies - Fuzzy logic – Fuzzy data clustering
    Fuzzy weblog extraction
    Building blocks – Interface - Query engine - Meta search engine - Aggregated documents
    Example
    Concluding Remarks
    Questions and Answers
  • 8. Building blocks
    [Figure: diagram of four numbered building blocks]
  • 9. Interface
    [Figure: Blogretrievr™ search interface (www.blogretrievr.com/) with the example query "Yo-yo"]
    Caption
    1. Search box
    2. Fuzzyness Factor
    3. Go!
  • 10. Query engine: Grassroots Tagging
    [Figure: three tagged items]
    Tags: Yo-yo, Triangle, Green
    Tags: Yo-yo, Triangle, Red
    Tags: Yo-yo, Triangle, Blue
    According to these tags, yo-yo, triangle and the colours green, red and blue must be related in some way!
    But in which way?
  • 11. Query engine: Jaccard coefficient
    Jaccard coefficient: $J(A, B) = \frac{|A \cap B|}{|A \cup B|}$
    [Figure: Venn diagrams of sets with increasing overlap, labelled "Not at all similar", "Somewhat similar" and "Quite similar"]
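As a small illustration of the Jaccard coefficient, the sketch below compares the tag sets from the grassroots tagging slide. The function name `jaccard` and the choice to return 0.0 for two empty sets are assumptions made for this example.

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union of two tag sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumed convention: two empty sets count as not similar
    return len(a & b) / len(union)


green_item = {"yo-yo", "triangle", "green"}
red_item = {"yo-yo", "triangle", "red"}
unrelated = {"hand puppet"}

print(jaccard(green_item, red_item))    # 0.5: two shared tags out of four distinct tags
print(jaccard(green_item, green_item))  # 1.0: identical sets
print(jaccard(green_item, unrelated))   # 0.0: no shared tags
```

A value of 0 corresponds to "not at all similar" and values approaching 1 to "quite similar".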
  • 12. Query engine: fuzzy c-means (FCM)
    FCM is a method of clustering which allows one piece of data to belong to two or more clusters
    [Figure: data points and clusters, with distances labelled d]
  • 13. Query engine: fuzzy c-means (FCM)
    For each term, the algorithm determines its degree of membership in each cluster
    It is possible that a term belongs to more than one cluster
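The following is a minimal sketch of the standard fuzzy c-means iteration, included to show how one item can receive a membership degree in several clusters. It is a simplification, not the system described in the slides: it clusters one-dimensional numbers rather than terms, uses a fixed number of iterations, and takes the initial centers as an argument. All names and the sample data are hypothetical.

```python
def fcm_memberships(points, centers, m=2.0):
    """Membership of each point in each cluster for the given centers."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # The point sits exactly on one or more centers: share full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            rows.append([h / total for h in hits])
        else:
            rows.append(
                [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
            )
    return rows


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate membership and center updates; return (centers, memberships)."""
    centers = list(centers)
    for _ in range(iterations):
        u = fcm_memberships(points, centers, m)
        # Each center becomes the mean of all points, weighted by membership ** m.
        centers = [
            sum((u[i][j] ** m) * x for i, x in enumerate(points))
            / sum(u[i][j] ** m for i in range(len(points)))
            for j in range(len(centers))
        ]
    return centers, fcm_memberships(points, centers, m)


points = [0.8, 1.0, 1.2, 3.0, 4.8, 5.0, 5.2]
centers, u = fuzzy_c_means(points, centers=[0.0, 6.0])
for x, row in zip(points, u):
    print(x, [round(v, 3) for v in row])
```

The point 3.0 lies midway between the two groups, so it ends up with a substantial membership degree in both clusters, whereas hard clustering would have to assign it to exactly one.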
  • 14. Query engine: iterative FCM
    Identical terms that belong to different clusters are linked together
    The clusters and the membership degrees remain unchanged
    [Figure: membership level for the clusters Green, Red and Blue]
  • 15. Query engine: iterative FCM (ontology)
    Each term is linked with other terms
    Every other term is again linked with terms
    Every newly tagged source (on the Internet) creates new term links
    [Figure: term A and its cluster membership in Green, Red and Blue]
  • 16. Query engine: dendrogram
    [Figure: dendrogram with distance axis d and membership level for the clusters Red, Blue and Green]
  • 17. Meta search engine
    [Figure: the fuzzy set search query flows from the meta search engine to blog search engines (Technorati, Blogdigger, etc.) and on to the blogosphere; steps numbered 1 to 6]
    Action
    2. The meta search engine sends the fuzzy set search query to other blog search engines
    3. Each blog search engine sends the query to the blogosphere…
    4. …and gathers the results
    5. The meta search engine collects all results…
    6. …and aggregates them
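The steps above can be sketched as a small aggregation loop. Everything in this example is hypothetical: the two engines are in-memory stubs rather than real blog search APIs, the URLs and relevance values are invented for illustration, and weighting each hit by the term's membership degree and keeping the best score per URL is just one plausible way to aggregate, not the method confirmed by the slides.

```python
def make_stub_engine(index):
    """Build a fake blog search engine backed by a dict: term -> [(url, relevance)]."""
    def search(term):
        return index.get(term, [])
    return search


def meta_search(fuzzy_query, engines):
    """Send every query term to every engine, then merge the results by URL."""
    best = {}
    for term, degree in fuzzy_query.items():
        for engine in engines:
            for url, relevance in engine(term):
                score = degree * relevance  # assumed weighting scheme
                if score > best.get(url, 0.0):
                    best[url] = score
    # Highest score first; URL as a tie-breaker for a stable order.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


engine_one = make_stub_engine({
    "OLED": [("blog.example/a", 0.9)],
    "LED": [("blog.example/b", 0.8)],
})
engine_two = make_stub_engine({
    "OLED": [("blog.example/a", 0.7), ("blog.example/c", 0.5)],
})

fuzzy_query = {"OLED": 1.0, "LED": 0.9}
for url, score in meta_search(fuzzy_query, [engine_one, engine_two]):
    print(url, round(score, 2))
```

Duplicate hits from different engines collapse into a single entry, which corresponds to the final aggregation step.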
  • 18. Aggregated documents
    [Figure: Blogretrievr™ results page (www.blogretrievr.com/) showing a search map with the terms "Yo-yo" and "Hand puppet"]
    Caption
    1. Search Map
    2. Search Results
    3. Map Rotation
    4. Zoom in/out
    5. New search
  • 19. Content
    Introduction
    Weblog extraction – Folksonomies - Fuzzy logic – Fuzzy data clustering
    Fuzzy weblog extraction
    Building blocks – Interface - Query engine - Meta search engine - Aggregated documents
    Example
    Concluding Remarks
    Questions and Answers
  • 20. Example: problem specifications
    What is coming around the edge?
    Samsung is screening the competitors for new killer applications
    In the blogosphere new technologies are discussed earlier than in other media
    [Figure labels: OLED, LCD, LED, OEL]
  • 21. Example: Pre-search
    [Figure: terms related to OLED, with membership degrees]
    OLED: [1]
    LED: [0.9,1]
    OEL: [0.6,1]
  • 22. Example: The search
    Search for a weblog with new OLED technology
    The membership degree is [0.8,1]
    This includes OLED [1] and LED [0.9,1]
    But not OEL [0.6,1]
    [Figure: FuzzynessFactor set to [0.8..1], covering OLED [1] and LED [0.9,1] but not OEL [0.6,1]]
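The effect of the fuzziness setting in this example can be illustrated with a short sketch. The membership degrees (OLED 1.0, LED 0.9, OEL 0.6) and the 0.8 threshold come from the slides; the posts, their tags, and the function names are hypothetical and serve only to contrast an exact-term search with a fuzzy one.

```python
# Membership degrees of terms related to the query term "OLED" (from the example).
related_terms = {"OLED": 1.0, "LED": 0.9, "OEL": 0.6}

# Hypothetical weblog posts and the terms they are tagged with.
posts = {
    "post-1": {"OLED"},
    "post-2": {"LED"},
    "post-3": {"OEL"},
    "post-4": {"LCD"},
}


def expand_query(terms, lower_bound):
    """Keep every related term whose membership degree reaches the lower bound."""
    return {term for term, degree in terms.items() if degree >= lower_bound}


def search(posts, query_terms):
    """Return the posts tagged with at least one of the query terms."""
    return sorted(name for name, tags in posts.items() if tags & query_terms)


boolean_hits = search(posts, {"OLED"})
fuzzy_hits = search(posts, expand_query(related_terms, 0.8))

print(boolean_hits)  # ['post-1']
print(fuzzy_hits)    # ['post-1', 'post-2']
```

Lowering the bound to 0.6 would also pull in the OEL post, so the setting trades precision for recall.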
  • 23. Example: Results
    [Figure: results for OLED, LCD, LED and OEL under Boolean search and under fuzzy search [0.8..1]]
    Legend:
    • Not found with Boolean Search
    • Not found with Fuzzy Search [0.8..1]
    • Found with Boolean Search
    • Found with Fuzzy Search [0.8..1]
  • 25. Content
    Introduction
    Weblog extraction – Folksonomies - Fuzzy logic – Fuzzy data clustering
    Fuzzy weblog extraction
    Building blocks – Interface - Query engine - Meta search engine - Aggregated documents
    Example
    Concluding Remarks
    Questions and Answers
  • 26. Concluding remarks
    In fuzzy set theory, the boundaries of a set are not sharply defined
    • The idea is a membership function over the elements of the set
    • This function takes values in the interval [0,1]
    Membership in a fuzzy set is intrinsically gradual rather than abrupt
    As a result it is possible to find more relevant documents
  • 28. Concluding remarks
    Aggregated documents aim to organize the search results into several meaningful categories (clusters)
    A cluster is a group of similar topics that are related to the original
    The user benefits include:
    • Get an overview of the available themes or topics
    • View similar results together in folders rather than scattered throughout a list
  • 30. Content
    Introduction
    Weblog extraction – Folksonomies - Fuzzy logic – Fuzzy data clustering
    Fuzzy weblog extraction
    Building blocks – Interface - Query engine - Meta search engine - Aggregated documents
    Example
    Concluding Remarks
    Questions and Answers
  • 31. Questions and Answers
