This document contains summaries of different machine learning concepts including supervised learning, scatter plots, basic probability, support vector machines, decision trees, data sets, regression, outliers, and clustering. It provides examples and questions to test understanding of these topics, with points assigned for correct answers. The document covers a wide range of foundational machine learning topics at a basic level.
Supervised learning (6 points total)
1) Does she like the song under ‘?’? Red circles mark the songs
she likes and diamonds mark the
ones she doesn’t. (3 points)
Yes
No
Unclear
2) Does she like the song under ‘?’? Red circles mark the songs
she likes and diamonds mark the
ones she doesn’t. (3 points)
Yes
No
Unclear
Scatter Plots (19 points total)
1) Match the road bumpiness and slope with the colored X
markings in the figure. (7 points)
1) green A) very steep and some bumpiness
2) blue B) low slope and no bumpiness
3) red C) low slope and extreme bumpiness
2) Is ‘?’ more like circles or more like diamonds? (3 points)
circles
diamonds
unclear
3) What is ‘?’ more like? (3 points)
circles
diamonds
unclear
4) What is ‘?’ more like? (3 points)
circles
diamonds
unclear
5) Which line separates the classes best? (3 points)
Steepest
Moderately steep
Least steep
Basic Probability (60 points total)
Question 1: A die is rolled, find the probability that an even
number is obtained. (2 points)
Question 2: Two coins are tossed, find the probability that two
heads are obtained. (2 points)
Question 3: Which of these numbers cannot be a probability? (4
points)
a) -0.00001
b) 0.5
c) 1.001
d) 0
e) 1
f) 20%
Question 4: If two dice are rolled, what is the probability that
the sum is
a) equal to 1 (2 points)
b) equal to 4 (2 points)
c) less than 13 (2 points)
Question 5: A die is rolled and a coin is tossed, find the
probability that the die shows an odd number
and the coin shows a head. (4 points)
Question 6: A card is drawn at random from a deck of cards.
Find the probability of getting the 3 of
diamonds. (4 points)
Question 7: A card is drawn at random from a deck of cards.
Find the probability of getting a queen. (4
points)
Question 8: A jar contains 3 red marbles, 7 green marbles and
10 white marbles. If a marble is drawn
from the jar at random, what is the probability that this marble
is white? (4 points)
Question 9: The blood groups of 200 people are distributed as
follows: 50 have type A blood, 65 have type B
blood, 70 have type O blood and 15 have type AB blood. If
a person from this group is selected at
random, what is the probability that this person has type O
blood? (6 points)
Question 10: A die is rolled, find the probability that the number
obtained is greater than 4. (6 points)
Question 11: Two coins are tossed, find the probability that exactly
one head is obtained. (6 points)
Question 12: Two dice are rolled, find the probability that the sum is
equal to 5. (4 points)
Question 13: A card is drawn at random from a deck of cards. Find the
probability of getting the king of hearts. (4 points)
Question 14: Is the message ‘Love life’ more likely from Chris or Sara? (2
points)
Question 15: Is the message ‘Love deal’ more likely from Chris or Sara? (2
points)
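The dice, coin and card questions in this section all rest on the same rule for equally likely outcomes: count the favorable outcomes and divide by the size of the sample space. A minimal Python sketch of that counting approach follows. The helper name `probability` and the three example events are illustrative assumptions chosen so that they differ from the questions above.

```python
from fractions import Fraction
from itertools import product


def probability(sample_space, event):
    """Return P(event) as an exact fraction, assuming equally likely outcomes."""
    outcomes = list(sample_space)
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


die = range(1, 7)    # faces of a fair six-sided die
coin = ("H", "T")    # sides of a fair coin

# One die: probability of rolling at most 2.
p_at_most_two = probability(die, lambda face: face <= 2)

# Two dice: probability that the sum equals 7.
p_sum_seven = probability(product(die, die), lambda roll: sum(roll) == 7)

# Three coins: probability of getting no heads at all.
p_no_heads = probability(product(coin, repeat=3), lambda toss: "H" not in toss)

print(p_at_most_two, p_sum_seven, p_no_heads)  # 1/3 1/6 1/8
```

Using `Fraction` keeps the results exact, which makes it easy to compare them with hand-computed answers.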
Support Vector Machines (26 points total)
1) Mark the line that separates the clusters best.
A) (3 points)
Steepest
Moderately steep
Least steep
B) (3 points)
Vertical
3) Can you separate the clusters with a line? (1 point)
4) Is the transformed data below linearly separable? (1 point)
5) How would you transform the data below to make it linearly
separable? (6 points)
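As a general illustration of what "transforming data to make it linearly separable" can mean, the sketch below maps one-dimensional points to two dimensions by adding the square of each value as a second feature. The data, the labels `inner` and `outer`, and the threshold are all hypothetical and are not taken from the figure in this question.

```python
# Hypothetical 1-D data: "inner" points sit near zero, "outer" points lie on both sides.
# No single cut on the original axis separates the two classes.
points = [
    (-3.0, "outer"), (-2.5, "outer"),
    (-0.5, "inner"), (0.0, "inner"), (0.7, "inner"),
    (2.4, "outer"), (3.1, "outer"),
]


def transform(x):
    """Map a 1-D point to 2-D by adding its square as a second feature."""
    return (x, x * x)


def classify(x, threshold=2.0):
    """Linear rule in the transformed space: the horizontal line x2 = threshold."""
    _, x2 = transform(x)
    return "outer" if x2 > threshold else "inner"


all_correct = all(classify(x) == label for x, label in points)
print(all_correct)  # True
```

Which transformation works depends on the shape of the data; squaring suits this made-up example only because the classes differ in their distance from zero.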
Decision tree (10 points total)
1) Is the data below linearly separable? (1 point)
2) Construct a decision tree that classifies data below. (9
points)
Data sets (15 points total)
1) What type of data is ‘job title’? (3 points)
2) What type of data is the ‘time stamp’ on an email? (3 points)
3) What type of data is the content of an email? (3 points)
4) What type of data is ‘number of emails sent by a person’? (3
points)
5) What type of data are the ‘to/from’ fields in an email? (3 points)
Regression (41 points total)
1) Pick two points to connect for the line that best approximates the
data. (3 points)
A
B
C
D
E
F
2) What is the most reasonable slope and intercept for the line
that approximates the data below? (6
points)
3) Which line has the greatest slope? (3 points)
Left-most
Middle
Right-most
4) Which line has the greatest intercept? (3 points)
Upper
Middle
Lower
5) For the line 6.25 * x + 30, which predicts wealth in thousands
of dollars given age, what is the
wealth of a person at age 36? (3 points)
6) Which figure is a good candidate for linear regression? (5
points)
1st (from left and top to bottom)
2nd
3rd
4th
5th
7) What is a good formula of the form y = a*x1 + b*x2 + c for the
data below; what are good values for a, b, and c?
(9 points)
8) What is a good formula of the form y = a*x1 + b*x2 + c for the
data below; what are good values for a, b, and c?
(9 points)
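The regression questions use two linear forms: a single-feature line such as 6.25 * x + 30 from question 5, and the two-feature form y = a*x1 + b*x2 + c from questions 7 and 8. A minimal sketch of evaluating either form is shown here. The function name `linear_model`, the age of 20, and the coefficients in the two-feature example are assumptions made for illustration and are not read from any figure.

```python
def linear_model(coefficients, intercept, features):
    """Evaluate y = a1*x1 + a2*x2 + ... + intercept for one data point."""
    return sum(a * x for a, x in zip(coefficients, features)) + intercept


# Single feature: the line from question 5, evaluated at an arbitrary age of 20.
wealth_at_20 = linear_model([6.25], 30, [20])
print(wealth_at_20)  # 155.0 (thousands of dollars)

# Two features: hypothetical a = 2.0, b = -1.0, c = 0.5 at x1 = 3.0, x2 = 4.0.
y_example = linear_model([2.0, -1.0], 0.5, [3.0, 4.0])
print(y_example)  # 2.5
```

The same function covers both forms because the single-feature line is just the case with one coefficient.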
Outliers (15 points total)
1) What is the best fit line for the data? (3 points)
Top horizontal
Bottom horizontal
Sloped
Unclear
2) Which data sets have outliers? (6 points)
1st
2nd
4) What is the effect on the slope in linear regression when the
outlier below is removed? (3 points)
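To see how a single outlier can influence a fitted line, the sketch below computes the ordinary least-squares slope with and without one outlying point. The data set is hypothetical and does not reproduce the figure in this question, and `least_squares_slope` is an illustrative helper name.

```python
def least_squares_slope(points):
    """Return the ordinary least-squares slope for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Hypothetical data: five points on the line y = x, plus one outlier far below it.
clean = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
outlier = (6, 0)

slope_with = least_squares_slope(clean + [outlier])
slope_without = least_squares_slope(clean)
print(round(slope_with, 3), round(slope_without, 3))  # 0.143 1.0
```

In this made-up case the outlier flattens the line, but the direction and size of the change depend on where the outlier sits relative to the rest of the data.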
Clustering (3 points total)
1) How many clusters do you see in the data? (2 points)
2) Is it possible for clustering to group the data below into 2
clusters, given 3 initial cluster centers? (1
point)