MACHINE LEARNING
PROJECTS WITH R
Yiou (Leo) Li
Outline


   Classification of glass data

   Clustering of glass data
Classification by ridge regression
Plotting the three classes by four features
[Figure: Simple Scatterplot Matrix of features V2, V3, V4, and V5]
Performance looks good when considering only the classification error rate
Performance is poor when considering the ROC curve
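
A low error rate alongside a poor ROC curve is what one would expect when the positive class is rare: a classifier can be right most of the time while missing nearly every positive. The following is a minimal sketch of how ROC points and the area under the curve can be computed from scores. It is written in Python rather than the deck's R, and the scores and labels are made up for illustration; they are not the glass data.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by sweeping a threshold over the scores.

    labels are 1 (positive) or 0 (negative); a higher score means "more positive".
    """
    pos = sum(labels)
    neg = len(labels) - pos
    # Sort by decreasing score so that each step lowers the threshold.
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Process tied scores together so they produce a single point.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy, made-up data: 2 positives among 10 samples (imbalanced).
labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.40, 0.45, 0.30, 0.20, 0.25, 0.10, 0.15, 0.05, 0.35, 0.12]

# Thresholding at 0.5 predicts "negative" for every sample.
predicted = [1 if s >= 0.5 else 0 for s in scores]
error_rate = sum(p != y for p, y in zip(predicted, labels)) / len(labels)

roc = roc_points(scores, labels)
print("error rate:", error_rate)
print("AUC:", auc(roc))
```

In this toy case the error rate is only 0.2 even though no positive is detected at the 0.5 threshold, while the AUC stays well below 1.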
Using higher-order polynomial terms helps improve the ROC curve
[Figure: ROC curve with the decision point marked]
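
The model formula given with the results lists products such as V2*V3 and V2*V4 alongside the original features. A minimal sketch of such an expansion is shown here in Python rather than the deck's R. The helper name `add_pairwise_products` and the sample values are hypothetical, and it is an assumption that only products of distinct features are added (no squared terms), since the slides elide the rest of the formula.

```python
from itertools import combinations


def add_pairwise_products(row, names):
    """Append the product of every pair of distinct features to one data row.

    Returns (expanded_row, expanded_names). Squared terms are NOT included;
    whether the original model used them is not stated in the slides.
    """
    new_row = list(row)
    new_names = list(names)
    for i, j in combinations(range(len(row)), 2):
        new_row.append(row[i] * row[j])
        new_names.append(f"{names[i]}*{names[j]}")
    return new_row, new_names


# Hypothetical feature values for one sample; only the column names follow the slides.
names = ["V2", "V3", "V4"]
row = [1.5, 13.0, 4.0]

expanded, expanded_names = add_pairwise_products(row, names)
print(expanded_names)
print(expanded)
```

Under this assumption, nine features V2 to V10 would yield 9 + 36 = 45 columns, which is one reason a penalty such as ridge is useful for the expanded model.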
Using higher-order polynomial terms helps improve TPR and FPR!

Model: Y ~ [V2, V3, …, V10, V2*V3, V2*V4, …]
                       Training       Test
True Positive Rate     0.6820833      0.55
False Positive Rate    0.008368031    0.0804762
Error rate             0.03953965     0.1270588

Model: Y ~ [V2, V3, …, V10]
                       Training       Test
True Positive Rate     0              0
False Positive Rate    0.00685288     0.007142857
Error rate             0.1104277      0.1102941
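
The three rows of each table can all be derived from the same confusion-matrix counts. The sketch below, in Python rather than the deck's R, shows the arithmetic. The function name `rates` and the counts are hypothetical and chosen only to illustrate how a true positive rate of 0 can coexist with a modest error rate; they are not the counts behind the tables.

```python
def rates(tp, fp, tn, fn):
    """Compute TPR, FPR, and error rate from confusion-matrix counts."""
    positives = tp + fn   # samples whose true class is positive
    negatives = fp + tn   # samples whose true class is negative
    total = positives + negatives
    return {
        "tpr": tp / positives,
        "fpr": fp / negatives,
        "error": (fp + fn) / total,
    }


# Hypothetical counts: 100 samples, 11 of them positive, none detected.
baseline = rates(tp=0, fp=1, tn=88, fn=11)
print(baseline)
```

With these made-up counts the classifier misses every positive, yet its error rate is only 0.12, because the negatives dominate the total.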
Notes on ridge regression

    1. The ridge solutions are not invariant under scaling of the inputs, so the
       inputs are usually standardized before fitting; the solution then does not
       depend on the units of the inputs.

    2. The intercept β0 should be left out of the penalty term, so that the solution
       is invariant to the choice of origin of the inputs and outputs.
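
Both notes can be checked numerically in the one-feature case, where the ridge solution with an unpenalized intercept has a closed form. The following is a minimal sketch in Python rather than the deck's R. The function names, the data, and the penalty value are made up for illustration, and the one-feature setting is a simplification of the multi-feature model used in the slides.

```python
from statistics import mean, pstdev


def ridge_1d(x, y, lam):
    """One-feature ridge fit with an unpenalized intercept.

    Minimizes sum((y_i - b0 - b*x_i)^2) + lam * b^2 and returns (b0, b).
    """
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / (sxx + lam)
    b0 = y_bar - b * x_bar
    return b0, b


def fitted(x, y, lam, standardize=False):
    """Fitted values on the training inputs, optionally standardizing x first."""
    if standardize:
        m, s = mean(x), pstdev(x)
        x = [(xi - m) / s for xi in x]
    b0, b = ridge_1d(x, y, lam)
    return [b0 + b * xi for xi in x]


# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1]
x_cm = [100.0 * xi for xi in x]    # same input expressed in different units
y_shift = [yi + 10.0 for yi in y]  # same output with a different origin

raw = fitted(x, y, lam=5.0)
raw_rescaled = fitted(x_cm, y, lam=5.0)
std = fitted(x, y, lam=5.0, standardize=True)
std_rescaled = fitted(x_cm, y, lam=5.0, standardize=True)
shifted = fitted(x, y_shift, lam=5.0)

print("raw vs rescaled:", raw[0], raw_rescaled[0])
print("standardized vs rescaled:", std[0], std_rescaled[0])
print("shifted minus raw:", shifted[0] - raw[0])
```

Without standardization the fitted values change when the input is rescaled; with standardization they agree. Because the intercept is not penalized, shifting the output by 10 shifts every fitted value by 10 and leaves the slope unchanged.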
Outline


   Classification of glass data

   Clustering of glass data
Multi-Dimensional Scaling of glass data (labeled as: 1, 2, 3, 5, 6, 7)
[Figure: Metric MDS, Coordinate 1 vs. Coordinate 2, legend: 1, 2, 3, 5, 6, 7]
K-means clustering of glass data
[Figure: K-means cluster, correct rate (0.0 to 1.0) by original label]
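
The slides do not say how the correct rate per original label is computed for a clustering. One common convention, assumed here, is to assign each cluster the majority original label among its members and then count, for each label, the fraction of its samples that land in a cluster carrying that label. The sketch below is in Python rather than the deck's R; the function name and the data are hypothetical, not the glass results.

```python
from collections import Counter, defaultdict


def correct_rate_by_label(true_labels, cluster_ids):
    """Per-label correct rate under a majority-vote cluster-to-label mapping.

    ASSUMPTION: each cluster is mapped to the most frequent original label
    among its members; the slides do not state the mapping actually used.
    """
    members = defaultdict(list)
    for label, cid in zip(true_labels, cluster_ids):
        members[cid].append(label)
    majority = {
        cid: Counter(labs).most_common(1)[0][0] for cid, labs in members.items()
    }

    hits = Counter()
    totals = Counter(true_labels)
    for label, cid in zip(true_labels, cluster_ids):
        if majority[cid] == label:
            hits[label] += 1
    return {label: hits[label] / totals[label] for label in sorted(totals)}


# Made-up labels and cluster assignments for illustration only.
true_labels = [1, 1, 1, 1, 2, 2, 2, 7, 7, 7]
cluster_ids = [0, 0, 0, 1, 1, 1, 0, 2, 2, 2]

result = correct_rate_by_label(true_labels, cluster_ids)
print(result)
```

The same function applies unchanged to cluster assignments from any method, so it can be reused to compare K-means, hierarchical, and EM results on equal terms.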
Hierarchical clustering of glass data
[Figure: Hierarchical cluster, correct rate (0.0 to 1.0) by original label]
EM clustering of glass data
[Figure: EM, correct rate (0.0 to 1.0) by original label]
