Data Mining and Text Mining
      A marriage made in heaven!
              Tom Khabaza




© ISL 2002        1
Data Mining and Text Mining A marriage made in heaven!

  1. Data Mining and Text Mining
     A marriage made in heaven!
     Tom Khabaza
     © ISL 2002
  2. Data Mining & Text Mining
     A marriage made in heaven
     • Data mining & Clementine
     • Text mining – A data miner’s view
     • Requirement for integration
     • First steps towards integration
     • A new integration
     • A bright future
  3. What is Data Mining?
     Finding patterns in your data which you can use to do your business better
     Critical factors:
     • Business Knowledge
     • Comprehensive Facilities
     • Deployment of Results
  4. Applications of Data Mining
     • Customer Relationship Management (CRM)
         - Who are our best customers?
         - Can we get more like that?
         - What/why do they buy?
         - Why do they leave?
     • eCRM – Web-mining
         - How do they behave?
     • Fraud detection
     • Money laundering detection
     • Network intrusion detection
     • Wind turbine maintenance
     • Industrial process optimisation & QA
     • Environmental management / conservation
     • Drug discovery
     • Medical research
     • Food authentication
     • Crime analysis
  5. Clementine data mining system
     Comprehensive
     Interactive
     Problem-oriented
     Enables the analyst to “engage” with the data
  6. What is data mining like?
  7. Text Mining – a data miner’s view
     • Data mining addresses problems through data
     • Information extraction derives structured data from free text documents

       Name       Age  Income  Mar/Sin/Div  Car  Card  Purchases  Pur Val  Last Purch  Children  Source
       F. Bloggs  25   25000   Single       Y    M/C   5          23.5     34          0         L1
       J. Smith   37   33000   Mar.         Yes  VISA  3          123.4    102         2         L2
       J. Dow     45   40000   Div.         No   VISA  12         15.2     48          1         L1

     • A natural match
     • Find the concepts using information extraction then find the patterns using data mining
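
The two-step idea on this slide (derive structured rows from free text, then look for patterns in those rows) can be sketched in Python. This is only an illustration under stated assumptions: the example sentences, the regular expression, and the function names `extract_record` and `average_income_by_status` are invented for this sketch and are not part of Clementine or any other tool named in the deck.

```python
import re
from collections import defaultdict

# Hypothetical free-text notes, written for this sketch only.
NOTES = [
    "F. Bloggs, aged 25, single, income 25000.",
    "J. Smith, aged 37, married, income 33000.",
    "J. Dow, aged 45, divorced, income 40000.",
]

# Assumed sentence shape: "<name>, aged <n>, <status>, income <n>"
PATTERN = re.compile(
    r"(?P<name>[A-Z]\. \w+), aged (?P<age>\d+), "
    r"(?P<status>single|married|divorced), income (?P<income>\d+)"
)

def extract_record(text):
    """Information extraction step: free text -> one structured row (or None)."""
    m = PATTERN.search(text)
    if m is None:
        return None
    return {
        "name": m.group("name"),
        "age": int(m.group("age")),
        "status": m.group("status"),
        "income": int(m.group("income")),
    }

def average_income_by_status(records):
    """Toy pattern-finding step: mean income per marital status."""
    totals = defaultdict(lambda: [0, 0])  # status -> [income sum, row count]
    for r in records:
        totals[r["status"]][0] += r["income"]
        totals[r["status"]][1] += 1
    return {status: total / n for status, (total, n) in totals.items()}

records = [r for r in map(extract_record, NOTES) if r is not None]
summary = average_income_by_status(records)
```

A real extraction engine would rely on linguistic analysis rather than a single fixed pattern; the point here is only that once the text has become rows, ordinary structured-data analysis applies.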
  8. Data Mining / Text Mining Integration
     The Need
     • A large proportion of available data is free-text
     • Classical data mining addresses problems only through structured data
     • But… the requirements to analyse structured and free-text data co-exist in the same application, often in the same database
     • Data mining suppliers get a constant stream of requests for free-text analysis
  9. First steps towards integration
     • An initial “demonstration”
     • Power industry news items / competitive intelligence
     • Integrated a purpose-built information extraction engine (from Brighton U. ITRI) with Clementine
     • Highly successful, useful results
     But: very domain-specific
     So expensive to replicate
  10. A New Integration
      • LexiQuest text mining company acquired by SPSS in February 2002
      • LexiQuest text mining tools
          - Mine – text mining
          - Categorize – document classifier
          - Information retrieval tools
  11. LexiQuest Mine Integration
      with Clementine
      • LexiQuest’s underlying “extraction engine”
      • Integrated with Clementine
      • The free-text miner’s dream:
          - Extract the concepts
          - Find the patterns
      As simple as that!
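
"Extract the concepts, find the patterns" can be illustrated with a small concept co-occurrence count in Python. The sketch assumes a hand-written concept dictionary (`CONCEPTS`) and invented documents (`DOCS`); it does not reproduce LexiQuest Mine's extraction engine or any Clementine functionality, and every name in it is hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical concept dictionary: surface term -> concept label.
CONCEPTS = {
    "refund": "REFUND",
    "chargeback": "REFUND",
    "late delivery": "DELIVERY_PROBLEM",
    "delayed": "DELIVERY_PROBLEM",
    "cancel": "CANCELLATION",
}

# Invented example documents.
DOCS = [
    "Order delayed again, customer wants a refund.",
    "Late delivery reported; customer threatens to cancel.",
    "Customer asked for a chargeback after the parcel was delayed.",
]

def extract_concepts(text):
    """Step 1: map a document to the set of concepts it mentions."""
    lowered = text.lower()
    return {label for term, label in CONCEPTS.items() if term in lowered}

def cooccurrence_counts(docs):
    """Step 2: count how often each pair of concepts appears in the same document."""
    counts = Counter()
    for doc in docs:
        for pair in combinations(sorted(extract_concepts(doc)), 2):
            counts[pair] += 1
    return counts

pair_counts = cooccurrence_counts(DOCS)
```

Mapping several surface terms to one label is what lets the second step count "refund" and "chargeback" as the same concept; substring matching is a deliberate simplification.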
  12. Text Mining
      Integrated into Clementine
  14. Data Mining and Text Mining
      A Bright Future
      • Structured data and free-text data are no longer separate domains
      • The fluency of exploration and discovery provided by data mining for structured data is now available for free-text data and for combinations of the two
      • This will revolutionise CRM, fraud detection, crime investigation, competitive intelligence, …
