Progress reports
        2010/7/15
Student / Rui-Zhe Liu, Meng-Lun Wu
           Advisor / Chia-Hui Chang
Outline
       Introduction
       Methods
           Baseline (k-means)
           ITCC (information-theoretic co-clustering)
           CCAM (co-clustering with augmented matrix)
       Evaluations
           Results-based approach
           Feature-based approach

Introduction(1/2)
       Dhillon et al. proposed information-theoretic co-clustering (ITCC)
        to perform two-way clustering of a document-word matrix.
       Sometimes we have additional information (called an augmented
        matrix) that ITCC does not consider.
           For example, in addition to the user-ad link matrix, we may have
            a user description matrix and an advertisement description
            matrix.

Introduction (2/2)
       To fully utilize the augmented matrices, we propose a new method
        called co-clustering with augmented matrix (CCAM).
       We also use mutual information to model each data matrix.
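As a rough illustration of the mutual-information view, the sketch below computes the mutual information of a count matrix and the amount lost when rows and columns are merged into co-clusters. ITCC looks for co-clusters that keep this loss small. The matrix `links` and both label assignments are made-up toy data, and the helper names are hypothetical, not taken from the slides.

```python
import math

def mutual_information(counts):
    """Mutual information (in bits) of the joint distribution given by a count matrix."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c > 0:
                p = c / total
                # p(x, y) / (p(x) * p(y)), written with raw counts
                mi += p * math.log2(p * total * total / (row_sums[i] * col_sums[j]))
    return mi

def coclustered(counts, row_labels, col_labels, k_rows, k_cols):
    """Sum the counts inside each (row cluster, column cluster) block."""
    block = [[0] * k_cols for _ in range(k_rows)]
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            block[row_labels[i]][col_labels[j]] += c
    return block

# Hypothetical 4 ads x 4 users click counts with two obvious blocks.
links = [[5, 5, 0, 0],
         [5, 5, 0, 0],
         [0, 0, 5, 5],
         [0, 0, 5, 5]]

good = coclustered(links, [0, 0, 1, 1], [0, 0, 1, 1], 2, 2)  # follows the blocks
bad = coclustered(links, [0, 1, 0, 1], [0, 1, 0, 1], 2, 2)   # mixes the blocks

loss_good = mutual_information(links) - mutual_information(good)
loss_bad = mutual_information(links) - mutual_information(bad)
```

On this toy matrix the block-aligned assignment loses no mutual information, while the mixed assignment loses all of it (1 bit).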

Methods
       Baseline (k-means):
           Data:
               ad feature + ad-user link matrix
               lohas game + user-ad link matrix
       ITCC:
           Data: ad-user link matrix
       CCAM:
           Data: ad-user link, ad feature, and lohas game matrices
       Each method produces an ad-cluster result matrix and a user-group
        result matrix.
       [Figure: data layout; rows Ad_id, columns User id and Ad features]
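A minimal sketch of the baseline input, assuming that "+" means joining the two matrices column-wise on a shared row order (one row per ad). The k-means routine is a plain Lloyd's iteration seeded with the first k rows; both the seeding and the toy data are assumptions for illustration, not the setup used in the experiments.

```python
def concat_columns(left, right):
    """Join two matrices that share the same row order (e.g. one row per ad)."""
    return [l + r for l, r in zip(left, right)]

def kmeans(rows, k, iterations=20):
    """Plain Lloyd's k-means; the first k rows seed the centroids (an assumption)."""
    centroids = [list(r) for r in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(row, centroids[c])))
            for row in rows
        ]
        # Update step: mean of the members; an empty cluster keeps its centroid.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical data: 4 ads, 2 ad features, 3 users.
ad_features = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
ad_user_links = [[3, 2, 0], [0, 0, 4], [2, 3, 0], [0, 1, 5]]

ad_rows = concat_columns(ad_features, ad_user_links)  # ad feature + ad-user link
ad_clusters = kmeans(ad_rows, k=2)
```

The user side would be built the same way from the lohas game matrix and the user-ad link matrix.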


Evaluations(1/8)
       The output of each method is evaluated with classifiers: SVM,
        decision tree, and simple CART.
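The scores reported on the following slides are F-measures. The sketch below shows one common way to compute an F-measure for a multi-class prediction, the macro average of per-class F1. The slides do not say which averaging was used, so the macro average and the toy labels are assumptions.

```python
def f_measure(true_labels, predicted_labels):
    """Macro-averaged F1: the unweighted mean of the per-class F1 scores."""
    classes = sorted(set(true_labels) | set(predicted_labels))
    pairs = list(zip(true_labels, predicted_labels))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)

# Hypothetical cluster labels (used as classes) and classifier predictions.
true_clusters = [0, 0, 1, 1, 2, 2]
predicted = [0, 0, 1, 2, 2, 2]
score = f_measure(true_clusters, predicted)
```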

Evaluations(2/8) – result based
       Methods: baseline (k-means), co-clustering
           Evaluation data:
               ad feature + ad-user link + method result (ad cluster) matrix
               lohas game + user-ad link + method result (user group) matrix
       Results are as follows.

Evaluations(3/8)
    Evaluation of ad cluster (K=5), F-measure by classifier

                     SVM      CART     Decision tree
    Co-clustering    0.312    0.277    0.349
    Baseline         0.965    0.826    0.822

Evaluations(4/8)
    Evaluation of user group (K=5), F-measure by classifier

                     SVM      CART     Decision tree
    Co-clustering    0.861    0.729    0.729
    Baseline         0.931    0.677    0.677

Evaluations(5/8)
    Unfortunately, k-means scores better, because it generates the
     standard answers (class labels) used for classification.
    Therefore, we propose another way to evaluate.

Evaluations(6/8) – feature based
    Method: baseline
        Evaluation data:
            ad feature + ad-user link data + baseline (k-means) result matrix
            lohas game + user-ad link data + baseline (k-means) result matrix
    Methods: ITCC, CCAM
        Evaluation data:
            ad feature + ad-user link data + co-clustering feature
             (ad-user group matrix) + baseline (k-means) result matrix
            lohas game + user-ad link data + co-clustering feature
             (user-ad cluster matrix) + baseline (k-means) result matrix
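One possible reading of the co-clustering feature, sketched under assumptions: the ad-user group matrix is obtained by summing the ad-user link columns of all users that fall in the same user group, and the result is appended to the baseline evaluation data as extra columns. The slides do not state how this matrix is built, so the aggregation, the helper name `group_columns`, and the toy data are hypothetical.

```python
def group_columns(link_matrix, column_groups, n_groups):
    """Collapse an ad x user matrix into an ad x user-group matrix by summing columns."""
    grouped_rows = []
    for row in link_matrix:
        grouped = [0] * n_groups
        for value, group in zip(row, column_groups):
            grouped[group] += value
        grouped_rows.append(grouped)
    return grouped_rows

# Hypothetical data: 2 ads, 2 ad features, 3 users in 2 user groups.
ad_features = [[1.0, 0.0], [0.0, 1.0]]
ad_user_links = [[3, 2, 0], [0, 1, 4]]
user_groups = [0, 0, 1]   # assumed co-clustering output: group id per user
kmeans_labels = [0, 1]    # assumed baseline result, used as the class label

ad_user_group = group_columns(ad_user_links, user_groups, n_groups=2)

# ad feature + ad-user link + co-clustering feature + baseline result (last column)
dataset = [
    features + links + groups + [label]
    for features, links, groups, label
    in zip(ad_features, ad_user_links, ad_user_group, kmeans_labels)
]
```

Under this reading, the baseline row is the same list without the `groups` columns, so the two settings differ only in the added co-clustering feature.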
    Results are as follows.
    [Figure: evaluation data layout; rows Ad_id, columns User id, methods,
     and User group]

Evaluations(7/8)

    [Figure: Co-clustering comparison of ad clustering. Y axis: average
     F-measure (0.800 to 1.050). X axis: k = 2, 3, 4, 5. Legend entries:
     ccam, our method, itcc, baseline.]

Evaluations(8/8)

    [Figure: Co-clustering comparison of user group. Y axis: average
     F-measure (0.700 to 1.000). X axis: k = 2, 3, 4, 5. Legend entries:
     ccam, our method, itcc, baseline.]

Future work
    Discretize ad feature data.
    Try different parameters for CCAM.
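As one possible starting point for the discretization item, the sketch below uses equal-width binning. The slides do not name a discretization scheme, so equal-width bins, the bin count, and the sample values are all assumptions.

```python
def equal_width_bins(values, n_bins):
    """Map each value to a bin index in [0, n_bins - 1] using equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls into the first bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

# Hypothetical continuous ad feature.
feature_values = [0.0, 2.5, 5.0, 7.5, 10.0]
bins = equal_width_bins(feature_values, n_bins=4)
```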

Thank you for listening.
