Rule-based classifiers classify records using a set of "if-then" rules. Each rule has a condition part (antecedent) and a class label part (consequent). The classifier applies the rules sequentially to a record and assigns the class label of the first matching rule. The algorithm PRISM is used to generate rules by starting with an empty rule and iteratively adding conditions to maximize the ratio of correctly classified to total instances covered by the rule. It separates out instances covered by each rule until all instances are classified.
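The rule-growing loop described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the reference PRISM implementation: records are assumed to be dictionaries of categorical attribute values with the label stored under a hypothetical "class" key, and the names learn_one_rule and prism are invented for this example. Ties in the correct/covered ratio are broken by preferring the condition that covers more correct instances.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]
Rule = Tuple[Dict[str, str], str]  # (conditions, class label)


def covers(conditions: Dict[str, str], record: Record) -> bool:
    """True if the record satisfies every attribute=value test."""
    return all(record.get(a) == v for a, v in conditions.items())


def learn_one_rule(records: List[Record], target: str,
                   label_key: str = "class") -> Tuple[Dict[str, str], List[Record]]:
    """Grow one rule for `target`, starting from an empty antecedent."""
    conditions: Dict[str, str] = {}
    covered = list(records)
    # Keep adding tests until the rule covers only `target` instances.
    while any(r[label_key] != target for r in covered):
        best = None  # (ratio, correct count, attribute, value)
        for r in covered:
            for attr, value in r.items():
                if attr == label_key or attr in conditions:
                    continue
                subset = [x for x in covered if x[attr] == value]
                correct = sum(1 for x in subset if x[label_key] == target)
                cand = (correct / len(subset), correct, attr, value)
                if best is None or cand[:2] > best[:2]:
                    best = cand
        if best is None:  # no unused attributes left; accept an impure rule
            break
        _, _, attr, value = best
        conditions[attr] = value
        covered = [x for x in covered if x[attr] == value]
    return conditions, covered


def prism(records: List[Record], label_key: str = "class") -> List[Rule]:
    """Learn rules class by class, removing covered instances after each rule."""
    rules: List[Rule] = []
    for target in sorted({r[label_key] for r in records}):
        remaining = list(records)  # each class starts from the full data set
        while any(r[label_key] == target for r in remaining):
            conditions, _ = learn_one_rule(remaining, target, label_key)
            rules.append((conditions, target))
            remaining = [r for r in remaining if not covers(conditions, r)]
    return rules
```

Calling prism on a list of such records returns (conditions, label) pairs in the order they were learned. Numeric attributes and pruning are left out of this sketch.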
The document discusses several alternative classification techniques including rule-based classifiers, nearest neighbors classifiers, and Naive Bayes classifiers. It provides examples of how each technique works and some key aspects to consider, such as how to build rule-based classifiers directly from data or indirectly from other models like decision trees. It also covers concepts like mutual exclusivity of rules, rule coverage and accuracy, and how to order rules.
Classification: Alternative Techniques and Nearest Neighbor Classifiers (aqeelyounus90)
This document discusses rule-based classifiers and how they work. It provides examples of classification rules for different types of animals based on their attributes. It then discusses strategies for handling non-mutually exclusive and non-exhaustive rule sets, such as using an ordered rule set or default class. The document also covers techniques for building rule-based classifiers like sequential covering and evaluating rules using measures like information gain.
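An ordered rule set with a default class can be sketched as follows. The animal attributes, rule contents, and the classify function are hypothetical, chosen only to illustrate the idea: rules are tried top to bottom, the first match decides the class, and the default class handles records that no rule covers.

```python
from typing import Dict, List, Tuple

Rule = Tuple[Dict[str, str], str]  # (conditions, class label)

# Hypothetical ordered rule set; the order resolves overlapping rules.
ANIMAL_RULES: List[Rule] = [
    ({"gives_birth": "no", "can_fly": "yes"}, "bird"),
    ({"gives_birth": "no", "lives_in_water": "yes"}, "fish"),
    ({"gives_birth": "yes", "blood_type": "warm"}, "mammal"),
    ({"gives_birth": "no", "can_fly": "no"}, "reptile"),
]


def classify(record: Dict[str, str], rules: List[Rule], default: str) -> str:
    """Return the label of the first rule whose conditions all hold."""
    for conditions, label in rules:
        if all(record.get(a) == v for a, v in conditions.items()):
            return label
    return default  # the rule set is not exhaustive, so fall back


# This record satisfies both the "fish" and the "reptile" rule;
# the earlier rule wins because the set is ordered.
swimmer = {"gives_birth": "no", "can_fly": "no", "lives_in_water": "yes"}
label = classify(swimmer, ANIMAL_RULES, default="unknown")
```

Reordering the list changes the prediction for records that trigger more than one rule, which is why rule ordering matters for sets that are not mutually exclusive.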
The document summarizes different techniques for rule-based classification in data mining, including rule-based classifiers, nearest neighbor classifiers, Bayesian classifiers, artificial neural networks, and support vector machines. It focuses on rule-based classifiers, explaining how rules are generated from data (direct and indirect methods) and how rule-based classifiers work by applying rules to classify new data instances. Sequential covering is described as a direct method for building rules sequentially from data.
Rule-based classification uses a set of if-then rules to classify tuples. Each rule has a condition part (if) and a consequent part (then) that assigns a class. The document discusses evaluating rule coverage, accuracy, and characteristics of rule sets such as being mutually exclusive or exhaustive. It also describes direct and indirect methods for building classification rules, including sequential covering algorithms and extracting rules from decision trees.
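Coverage and accuracy of a single rule can be computed with a short sketch like the one below. It assumes the usual definitions: coverage is the fraction of all records that satisfy the rule's condition, and accuracy is the fraction of those covered records whose class matches the rule's consequent. The function name and the "class" key are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def coverage_and_accuracy(conditions: Dict[str, str], label: str,
                          records: List[Record],
                          label_key: str = "class") -> Tuple[float, float]:
    """Return (coverage, accuracy) of the rule `conditions -> label`."""
    if not records:
        return 0.0, 0.0
    # Records that satisfy the antecedent of the rule.
    covered = [r for r in records
               if all(r.get(a) == v for a, v in conditions.items())]
    coverage = len(covered) / len(records)
    if not covered:
        return coverage, 0.0
    # Covered records that also carry the class the rule predicts.
    correct = sum(1 for r in covered if r[label_key] == label)
    return coverage, correct / len(covered)
```

A rule with high accuracy but very low coverage describes only a handful of records, so both numbers are usually read together.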
What is the Covering (Rule-based) algorithm?
Classification Rules- Straightforward
1. If-Then rule
2. Generating rules from Decision Tree
Rule-based Algorithm
1. The 1R Algorithm / Learn One Rule
2. The PRISM Algorithm
3. Other Algorithm
Application of Covering algorithm
Discussion on e/m-learning application
chap4_rule_based data mining power point.
1. Data Mining
Classification: Alternative Techniques
Lecture Notes for Chapter 4
Rule-Based
Introduction to Data Mining, 2nd Edition
by Tan, Steinbach, Karpatne, Kumar
2. 9/30/2020 Introduction to Data Mining, 2nd Edition 2
Rule-Based Classifier
Classify records by using a collection of “if…then…” rules
Rule: (Condition) → y
– where
Condition is a conjunction of tests on attributes
y is the class label
– Examples of classification rules:
(Blood Type=Warm) ∧ (Lay Eggs=Yes) → Birds
(Taxable Income < 50K) ∧ (Refund=Yes) → Evade=No
3. 9/30/2020 Introduction to Data Mining, 2nd Edition 3
Rule-based Classifier (Example)
R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds
R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians
Name Blood Type Give Birth Can Fly Live in Water Class
human warm yes no no mammals
python cold no no no reptiles
salmon cold no no yes fishes
whale warm yes no yes mammals
frog cold no no sometimes amphibians
komodo cold no no no reptiles
bat warm yes yes no mammals
pigeon warm no yes no birds
cat warm yes no no mammals
leopard shark cold yes no yes fishes
turtle cold no no sometimes reptiles
penguin warm no no sometimes birds
porcupine warm yes no no mammals
eel cold no no yes fishes
salamander cold no no sometimes amphibians
gila monster cold no no no reptiles
platypus warm no no no mammals
owl warm no yes no birds
dolphin warm yes no yes mammals
eagle warm no yes no birds
4. 9/30/2020 Introduction to Data Mining, 2nd Edition 4
Application of Rule-Based Classifier
A rule r covers an instance x if the attributes of the instance satisfy the condition of the rule
R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds
R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians
The rule R1 covers a hawk => Bird
The rule R3 covers the grizzly bear => Mammal
Name Blood Type Give Birth Can Fly Live in Water Class
hawk warm no yes no ?
grizzly bear warm yes no no ?
5. 9/30/2020 Introduction to Data Mining, 2nd Edition 5
Rule Coverage and Accuracy
Coverage of a rule:
– Fraction of records that satisfy the antecedent of a rule
Accuracy of a rule:
– Fraction of records that satisfy the antecedent that also satisfy the consequent of a rule
Tid Refund Marital Status Taxable Income Class
1 Yes Single 125K No
2 No Married 100K No
3 No Single 70K No
4 Yes Married 120K No
5 No Divorced 95K Yes
6 No Married 60K No
7 Yes Divorced 220K No
8 No Single 85K Yes
9 No Married 75K No
10 No Single 90K Yes
(Status=Single) → No
Coverage = 40%, Accuracy = 50%
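The coverage and accuracy figures for the rule (Status=Single) → No can be checked directly against the ten records on this slide. The sketch below is only an illustration; the function name `coverage_and_accuracy` and the tuple layout are assumptions made for the example, not part of the lecture notes.

```python
# Records from the slide: (Tid, Refund, Marital Status, Taxable Income in K, Class)
records = [
    (1, "Yes", "Single", 125, "No"),
    (2, "No", "Married", 100, "No"),
    (3, "No", "Single", 70, "No"),
    (4, "Yes", "Married", 120, "No"),
    (5, "No", "Divorced", 95, "Yes"),
    (6, "No", "Married", 60, "No"),
    (7, "Yes", "Divorced", 220, "No"),
    (8, "No", "Single", 85, "Yes"),
    (9, "No", "Married", 75, "No"),
    (10, "No", "Single", 90, "Yes"),
]


def coverage_and_accuracy(records, antecedent, consequent):
    """Return (coverage, accuracy) of the rule antecedent -> consequent."""
    covered = [r for r in records if antecedent(r)]
    coverage = len(covered) / len(records)
    # Accuracy is measured only over the records the rule covers.
    correct = sum(1 for r in covered if r[4] == consequent)
    accuracy = correct / len(covered) if covered else 0.0
    return coverage, accuracy


# Rule from the slide: (Status=Single) -> No
coverage, accuracy = coverage_and_accuracy(
    records, lambda r: r[2] == "Single", "No"
)
print(f"Coverage = {coverage:.0%}, Accuracy = {accuracy:.0%}")
```

Four of the ten records are Single (Tid 1, 3, 8, 10), and two of those four have class No, which gives the 40% and 50% stated on the slide.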
6. 9/30/2020 Introduction to Data Mining, 2nd Edition 6
How does Rule-based Classifier Work?
R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds
R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians
A lemur triggers rule R3, so it is classified as a mammal
A turtle triggers both R4 and R5
A dogfish shark triggers none of the rules
Name Blood Type Give Birth Can Fly Live in Water Class
lemur warm yes no no ?
turtle cold no no sometimes ?
dogfish shark cold yes no yes ?
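The three cases on this slide (one rule triggered, several triggered, none triggered) can be reproduced with a small sketch. Encoding each rule as a dictionary of attribute tests is an assumption made for illustration; the helper names `covers` and `triggered` are hypothetical.

```python
# Rules R1-R5 from the slide: (name, condition as attribute tests, class label)
RULES = [
    ("R1", {"Give Birth": "no", "Can Fly": "yes"}, "Birds"),
    ("R2", {"Give Birth": "no", "Live in Water": "yes"}, "Fishes"),
    ("R3", {"Give Birth": "yes", "Blood Type": "warm"}, "Mammals"),
    ("R4", {"Give Birth": "no", "Can Fly": "no"}, "Reptiles"),
    ("R5", {"Live in Water": "sometimes"}, "Amphibians"),
]


def covers(condition, record):
    """A rule covers a record if every attribute test in its condition holds."""
    return all(record.get(attr) == value for attr, value in condition.items())


def triggered(record):
    """Names of all rules whose condition the record satisfies."""
    return [name for name, condition, _ in RULES if covers(condition, record)]


lemur = {"Blood Type": "warm", "Give Birth": "yes", "Can Fly": "no", "Live in Water": "no"}
turtle = {"Blood Type": "cold", "Give Birth": "no", "Can Fly": "no", "Live in Water": "sometimes"}
dogfish_shark = {"Blood Type": "cold", "Give Birth": "yes", "Can Fly": "no", "Live in Water": "yes"}

for name, record in [("lemur", lemur), ("turtle", turtle), ("dogfish shark", dogfish_shark)]:
    print(name, triggered(record))
```

The turtle shows that this rule set is not mutually exclusive, and the dogfish shark shows that it is not exhaustive.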
7. 9/30/2020 Introduction to Data Mining, 2nd Edition 7
Characteristics of Rule Sets: Strategy 1
Mutually exclusive rules
– Classifier contains mutually exclusive rules if the rules are independent of each other
– Every record is covered by at most one rule
Exhaustive rules
– Classifier has exhaustive coverage if it accounts for every possible combination of attribute values
– Each record is covered by at least one rule
8. 9/30/2020 Introduction to Data Mining, 2nd Edition 8
Characteristics of Rule Sets: Strategy 2
Rules are not mutually exclusive
– A record may trigger more than one rule
– Solution?
Ordered rule set
Unordered rule set – use voting schemes
Rules are not exhaustive
– A record may not trigger any rules
– Solution?
Use a default class
9. 9/30/2020 Introduction to Data Mining, 2nd Edition 9
Ordered Rule Set
Rules are rank ordered according to their priority
– An ordered rule set is known as a decision list
When a test record is presented to the classifier
– It is assigned to the class label of the highest ranked rule it has triggered
– If none of the rules fired, it is assigned to the default class
R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds
R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians
Name Blood Type Give Birth Can Fly Live in Water Class
turtle cold no no sometimes ?
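A decision list can be sketched as a loop that returns the label of the first rule that fires and falls back to a default class otherwise. The slide does not name a default class, so the value "Unknown" below is a placeholder assumption, as is the function name `classify`.

```python
# Rules R1-R5 in rank order: (name, condition as attribute tests, class label)
RULES = [
    ("R1", {"Give Birth": "no", "Can Fly": "yes"}, "Birds"),
    ("R2", {"Give Birth": "no", "Live in Water": "yes"}, "Fishes"),
    ("R3", {"Give Birth": "yes", "Blood Type": "warm"}, "Mammals"),
    ("R4", {"Give Birth": "no", "Can Fly": "no"}, "Reptiles"),
    ("R5", {"Live in Water": "sometimes"}, "Amphibians"),
]


def classify(record, rules=RULES, default="Unknown"):
    """Return (label, rule name) of the highest ranked rule the record triggers.

    'default' is a hypothetical default class used when no rule fires.
    """
    for name, condition, label in rules:
        if all(record.get(attr) == value for attr, value in condition.items()):
            return label, name
    return default, None


turtle = {"Blood Type": "cold", "Give Birth": "no", "Can Fly": "no", "Live in Water": "sometimes"}
dogfish_shark = {"Blood Type": "cold", "Give Birth": "yes", "Can Fly": "no", "Live in Water": "yes"}

print(classify(turtle))
print(classify(dogfish_shark))
```

With the rules ranked R1 to R5, the turtle triggers both R4 and R5 but is assigned the label of R4 because it is ranked higher; reordering the list would change the prediction.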
10. 9/30/2020 Introduction to Data Mining, 2nd Edition 10
Rule Ordering Schemes
Rule-based ordering
– Individual rules are ranked based on their quality
Class-based ordering
– Rules that belong to the same class appear together
Rule-based Ordering
(Refund=Yes) ==> No
(Refund=No, Marital Status={Single,Divorced}, Taxable Income<80K) ==> No
(Refund=No, Marital Status={Single,Divorced}, Taxable Income>80K) ==> Yes
(Refund=No, Marital Status={Married}) ==> No
Class-based Ordering
(Refund=Yes) ==> No
(Refund=No, Marital Status={Single,Divorced}, Taxable Income<80K) ==> No
(Refund=No, Marital Status={Married}) ==> No
(Refund=No, Marital Status={Single,Divorced}, Taxable Income>80K) ==> Yes
11. 9/30/2020 Introduction to Data Mining, 2nd Edition 11
Building Classification Rules
Direct Method:
Extract rules directly from data
Examples: RIPPER, CN2, Holte’s 1R
Indirect Method:
Extract rules from other classification models (e.g. decision trees, neural networks, etc.).
Examples: C4.5rules
12. 9/30/2020 Introduction to Data Mining, 2nd Edition 12
Direct Method: Sequential Covering
1. Start from an empty rule
2. Grow a rule using the Learn-One-Rule function
3. Remove training records covered by the rule
4. Repeat Steps (2) and (3) until the stopping criterion is met
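The four steps above can be sketched as follows. The slide does not specify how Learn-One-Rule scores candidate conjuncts, so this sketch assumes a greedy general-to-specific search that picks the attribute test with the highest accuracy (ties broken by the number of positives covered). The function names and the seven-animal sample, taken from the earlier vertebrate table, are illustrative choices.

```python
def covers(rule, record):
    """A rule (dict of attribute tests) covers a record if all tests hold."""
    return all(record[attr] == value for attr, value in rule.items())


def learn_one_rule(records, target, attributes):
    """Grow one rule for 'target' from the empty rule (assumed scoring: accuracy)."""
    rule, covered = {}, list(records)
    while any(r["class"] != target for r in covered):
        best = None
        for attr in attributes:
            if attr in rule:
                continue
            for value in sorted({r[attr] for r in covered}):
                subset = [r for r in covered if r[attr] == value]
                pos = sum(r["class"] == target for r in subset)
                if pos == 0:
                    continue
                score = (pos / len(subset), pos)
                if best is None or score > best[0]:
                    best = (score, attr, value, subset)
        if best is None:  # no attribute left to add
            break
        _, attr, value, covered = best
        rule[attr] = value
    return rule


def sequential_covering(records, target, attributes):
    rules, remaining = [], list(records)
    # Stopping criterion assumed here: no positive records remain.
    while any(r["class"] == target for r in remaining):
        rule = learn_one_rule(remaining, target, attributes)          # step 2
        rules.append(rule)
        remaining = [r for r in remaining if not covers(rule, r)]     # step 3
    return rules


ATTRIBUTES = ["Blood Type", "Give Birth", "Can Fly", "Live in Water"]
ROWS = [
    ("warm", "no", "yes", "no", "birds"),            # pigeon
    ("warm", "no", "no", "sometimes", "birds"),      # penguin
    ("cold", "no", "no", "no", "reptiles"),          # python
    ("cold", "no", "no", "yes", "fishes"),           # salmon
    ("warm", "yes", "yes", "no", "mammals"),         # bat
    ("warm", "yes", "no", "no", "mammals"),          # human
    ("cold", "no", "no", "sometimes", "amphibians"), # frog
]
data = [dict(zip(ATTRIBUTES + ["class"], row)) for row in ROWS]

bird_rules = sequential_covering(data, "birds", ATTRIBUTES)
print(bird_rules)
```

Real learners such as RIPPER use different growth and stopping criteria; the point here is only the loop structure of growing a rule, removing the records it covers, and repeating.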
13. 9/30/2020 Introduction to Data Mining, 2nd Edition 13
Example of Sequential Covering
[Figure: (i) Original Data; (ii) Step 1]
14. 9/30/2020 Introduction to Data Mining, 2nd Edition 14
Example of Sequential Covering…
[Figure: (iii) Step 2, with rule R1; (iv) Step 3, with rules R1 and R2]
15. 9/30/2020 Introduction to Data Mining, 2nd Edition 15
Rule Growing
Two common strategies
[Figure: (a) General-to-specific: starting from the empty rule { }, candidate conjuncts such as Refund=No, Status=Single, Status=Divorced, Status=Married and Income>80K are compared by the Yes/No class counts of the records they cover. (b) Specific-to-general: the rules (Refund=No, Status=Single, Income=85K) → Class=Yes and (Refund=No, Status=Single, Income=90K) → Class=Yes are generalized to (Refund=No, Status=Single) → Class=Yes.]
16. 9/30/2020 Introduction to Data Mining, 2nd Edition 16
Rule Evaluation
FOIL’s Information Gain
– R0: {} => class (initial rule)
– R1: {A} => class (rule after adding conjunct)
– $\mathrm{Gain}(R_0, R_1) = p_1 \times \left[ \log_2 \frac{p_1}{p_1 + n_1} - \log_2 \frac{p_0}{p_0 + n_0} \right]$
– $p_0$: number of positive instances covered by R0
$n_0$: number of negative instances covered by R0
$p_1$: number of positive instances covered by R1
$n_1$: number of negative instances covered by R1
FOIL: First Order Inductive Learner, an early rule-based learning algorithm
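As a worked instance of the gain formula, the sketch below uses counts taken from the ten-record tax table on slide 5, with class Yes as the positive class: the empty rule R0 covers 3 positive and 7 negative records, and adding the conjunct Refund=No gives R1, which covers 3 positive and 4 negative records. The function name `foil_gain` and the convention of returning 0 when R1 covers no positives are assumptions for illustration.

```python
from math import log2


def foil_gain(p0, n0, p1, n1):
    """FOIL's information gain of extending rule R0 to rule R1.

    p0, n0: positives/negatives covered by R0; p1, n1: same for R1.
    """
    if p1 == 0:
        return 0.0  # assumed convention: a rule covering no positives gains nothing
    return p1 * (log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0)))


# R0: {} -> Yes covers 3 Yes and 7 No; R1: (Refund=No) -> Yes covers 3 Yes and 4 No
gain = foil_gain(p0=3, n0=7, p1=3, n1=4)
print(f"Gain(R0, R1) = {gain:.3f}")
```

Here the gain is 3 × [log2(3/7) − log2(3/10)] = 3 × log2(10/7), roughly 1.544. The factor p1 rewards conjuncts that keep many positives, not only those that raise the positive fraction.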
17. 9/30/2020 Introduction to Data Mining, 2nd Edition 17
Direct Method: RIPPER
For 2-class problem, choose one of the classes as positive class, and the other as negative class
– Learn rules for positive class
– Negative class will be default class
For multi-class problem
– Order the classes according to increasing class prevalence (fraction of instances that belong to a particular class)
– Learn the rule set for smallest class first, treat the rest as negative class
– Repeat with next smallest class as positive class
18. 9/30/2020 Introduction to Data Mining, 2nd Edition 18
Direct Method: RIPPER
Growing a rule:
– Start from empty rule
– Add conjuncts as long as they improve FOIL’s information gain
– Stop when rule no longer covers negative examples
– Prune the rule immediately using incremental reduced error pruning
– Measure for pruning: v = (p-n)/(p+n)
p: number of positive examples covered by the rule in the validation set
n: number of negative examples covered by the rule in the validation set
– Pruning method: delete any final sequence of conditions that maximizes v
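The pruning step can be sketched by scoring every prefix of the rule's conditions on a validation set and keeping the prefix with the highest v. The tie-breaking choice (prefer the shorter rule), the treatment of rules that cover nothing, and the four validation records are all assumptions made for this illustration.

```python
def covers(conditions, record):
    """conditions is an ordered list of (attribute, value) tests."""
    return all(record[attr] == value for attr, value in conditions)


def prune_metric(conditions, validation, positive):
    """v = (p - n) / (p + n) measured on the validation set."""
    covered = [r for r in validation if covers(conditions, r)]
    if not covered:
        return float("-inf")  # assumption: never prefer a rule that covers nothing
    p = sum(r["class"] == positive for r in covered)
    n = len(covered) - p
    return (p - n) / (p + n)


def prune_rule(conditions, validation, positive):
    """Delete the final sequence of conditions whose removal maximizes v."""
    best_k = max(
        range(1, len(conditions) + 1),
        # Ties are broken in favour of the shorter rule (assumed convention).
        key=lambda k: (prune_metric(conditions[:k], validation, positive), -k),
    )
    return conditions[:best_k]


# Hypothetical grown rule and validation records, for illustration only.
rule = [("Give Birth", "no"), ("Can Fly", "yes"), ("Live in Water", "no")]
validation = [
    {"Give Birth": "no", "Can Fly": "yes", "Live in Water": "no", "class": "Birds"},
    {"Give Birth": "no", "Can Fly": "yes", "Live in Water": "sometimes", "class": "Birds"},
    {"Give Birth": "no", "Can Fly": "no", "Live in Water": "no", "class": "Reptiles"},
    {"Give Birth": "yes", "Can Fly": "yes", "Live in Water": "no", "class": "Mammals"},
]

pruned = prune_rule(rule, validation, "Birds")
print(pruned)
```

On this data the one-condition prefix scores v = 1/3, while the two- and three-condition prefixes both score v = 1, so the last condition is dropped.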
19. 9/30/2020 Introduction to Data Mining, 2nd Edition 19
Direct Method: RIPPER
Building a Rule Set:
– Use sequential covering algorithm
Finds the best rule that covers the current set of positive examples
Eliminate both positive and negative examples covered by the rule
– Each time a rule is added to the rule set, compute the new description length
Stop adding new rules when the new description length is d bits longer than the smallest description length obtained so far
20. 9/30/2020 Introduction to Data Mining, 2nd Edition 20
Direct Method: RIPPER
Optimize the rule set:
– For each rule r in the rule set R
Consider 2 alternative rules:
– Replacement rule (r*): grow new rule from scratch
– Revised rule (r′): add conjuncts to extend the rule r
Compare the rule set for r against the rule set for r* and r′
Choose the rule set that minimizes the description length (MDL principle)
– Repeat rule generation and rule optimization for the remaining positive examples
21. 9/30/2020 Introduction to Data Mining, 2nd Edition 21
Indirect Methods
Rule Set
r1: (P=No,Q=No) ==> -
r2: (P=No,Q=Yes) ==> +
r3: (P=Yes,R=No) ==> +
r4: (P=Yes,R=Yes,Q=No) ==> -
r5: (P=Yes,R=Yes,Q=Yes) ==> +
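[Figure: decision tree equivalent to the rule set. Root P. P=No: test Q (Q=No: −; Q=Yes: +). P=Yes: test R (R=No: +; R=Yes: test Q (Q=No: −; Q=Yes: +)).]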
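Extracting one rule per root-to-leaf path can be sketched in a few lines. The nested-tuple encoding of the tree below is an assumption, and the tree itself is reconstructed from rules r1 to r5 on this slide rather than copied from the figure.

```python
# Internal node: (attribute, {branch value: child}); leaf: class label string.
TREE = ("P", {
    "No": ("Q", {"No": "-", "Yes": "+"}),
    "Yes": ("R", {
        "No": "+",
        "Yes": ("Q", {"No": "-", "Yes": "+"}),
    }),
})


def extract_rules(node, path=()):
    """Return one (conditions, label) rule for every root-to-leaf path."""
    if isinstance(node, str):  # reached a leaf
        return [(path, node)]
    attribute, branches = node
    rules = []
    for value, child in branches.items():
        rules.extend(extract_rules(child, path + ((attribute, value),)))
    return rules


rules = extract_rules(TREE)
for i, (conditions, label) in enumerate(rules, start=1):
    antecedent = ",".join(f"{attr}={value}" for attr, value in conditions)
    print(f"r{i}: ({antecedent}) ==> {label}")
```

Rules obtained this way are mutually exclusive and exhaustive, because every record follows exactly one path through the tree.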
22. 9/30/2020 Introduction to Data Mining, 2nd Edition 22
Indirect Method: C4.5rules
Extract rules from an unpruned decision tree
For each rule, r: A → y,
– consider an alternative rule r′: A′ → y where A′ is obtained by removing one of the conjuncts in A
– Compare the pessimistic error rate for r against all r′
– Prune if one of the alternative rules has lower pessimistic error rate
– Repeat until we can no longer improve generalization error
23. 9/30/2020 Introduction to Data Mining, 2nd Edition 23
Indirect Method: C4.5rules
Instead of ordering the rules, order subsets of rules (class ordering)
– Each subset is a collection of rules with the same rule consequent (class)
– Compute description length of each subset
Description length = L(error) + g L(model)
g is a parameter that takes into account the presence of redundant attributes in a rule set (default value = 0.5)
24. 9/30/2020 Introduction to Data Mining, 2nd Edition 24
Example
Name Give Birth Lay Eggs Can Fly Live in Water Have Legs Class
human yes no no no yes mammals
python no yes no no no reptiles
salmon no yes no yes no fishes
whale yes no no yes no mammals
frog no yes no sometimes yes amphibians
komodo no yes no no yes reptiles
bat yes no yes no yes mammals
pigeon no yes yes no yes birds
cat yes no no no yes mammals
leopard shark yes no no yes no fishes
turtle no yes no sometimes yes reptiles
penguin no yes no sometimes yes birds
porcupine yes no no no yes mammals
eel no yes no yes no fishes
salamander no yes no sometimes yes amphibians
gila monster no yes no no yes reptiles
platypus no yes no no yes mammals
owl no yes yes no yes birds
dolphin yes no no yes no mammals
eagle no yes yes no yes birds
25. 9/30/2020 Introduction to Data Mining, 2nd Edition 25
C4.5 versus C4.5rules versus RIPPER
C4.5rules:
(Give Birth=No, Can Fly=Yes) → Birds
(Give Birth=No, Live in Water=Yes) → Fishes
(Give Birth=Yes) → Mammals
(Give Birth=No, Can Fly=No, Live in Water=No) → Reptiles
( ) → Amphibians
[Figure: C4.5 decision tree. Give Birth? Yes: Mammals; No: Live In Water? Yes: Fishes; Sometimes: Amphibians; No: Can Fly? Yes: Birds; No: Reptiles]
RIPPER:
(Live in Water=Yes) → Fishes
(Have Legs=No) → Reptiles
(Give Birth=No, Can Fly=No, Live In Water=No) → Reptiles
(Can Fly=Yes, Give Birth=No) → Birds
() → Mammals
26. 9/30/2020 Introduction to Data Mining, 2nd Edition 26
C4.5 versus C4.5rules versus RIPPER
RIPPER (rows: actual class; columns: predicted class):
Actual \ Predicted Amphibians Fishes Reptiles Birds Mammals
Amphibians 0 0 0 0 2
Fishes 0 3 0 0 0
Reptiles 0 0 3 0 1
Birds 0 0 1 2 1
Mammals 0 2 1 0 4
C4.5 and C4.5rules (rows: actual class; columns: predicted class):
Actual \ Predicted Amphibians Fishes Reptiles Birds Mammals
Amphibians 2 0 0 0 0
Fishes 0 2 0 0 1
Reptiles 1 0 3 0 0
Birds 1 0 0 3 0
Mammals 0 0 1 0 6
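The two confusion matrices can be summarized by their accuracy, the fraction of records on the diagonal. In the sketch below, the matrix whose Amphibians column is all zeros is taken to be RIPPER's, since the RIPPER rule set on the previous slide has no rule predicting Amphibians and uses Mammals as its default class; that assignment is an inference from the rule sets, and the variable names are illustrative.

```python
CLASSES = ["Amphibians", "Fishes", "Reptiles", "Birds", "Mammals"]

# Rows are actual classes, columns are predicted classes, both in CLASSES order.
ripper = [
    [0, 0, 0, 0, 2],
    [0, 3, 0, 0, 0],
    [0, 0, 3, 0, 1],
    [0, 0, 1, 2, 1],
    [0, 2, 1, 0, 4],
]
c45 = [
    [2, 0, 0, 0, 0],
    [0, 2, 0, 0, 1],
    [1, 0, 3, 0, 0],
    [1, 0, 0, 3, 0],
    [0, 0, 1, 0, 6],
]


def accuracy(matrix):
    """Correct predictions (diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


print(f"RIPPER: {accuracy(ripper):.0%}")
print(f"C4.5 and C4.5rules: {accuracy(c45):.0%}")
```

Both matrices account for the same 20 training records, with 12 on the diagonal for RIPPER and 16 for C4.5 and C4.5rules.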
27. 9/30/2020 Introduction to Data Mining, 2nd Edition 27
Advantages of Rule-Based Classifiers
Has characteristics quite similar to decision trees
– As highly expressive as decision trees
– Easy to interpret (if rules are ordered by class)
– Performance comparable to decision trees
Can handle redundant and irrelevant attributes
Variable interaction can cause issues (e.g., X-OR problem)
Better suited for handling imbalanced classes
Harder to handle missing values in the test set