CaseWare Data Scientist test.
1. CaseWare International Inc.
Data Scientist Exam
Time Limit: 1 hour 30 minutes
Name: Date:
This test will be used to evaluate your design skills and logical thought processes.
Although important in our interview process, it is only one part of the
evaluation. Do not worry if you do not finish all of the questions; just do the best
that you can.
Use whichever programming language you are most comfortable with. Feel free
to use pseudocode if you want.
You should attempt to answer all questions.
2. Q) Write a function that takes three arrays of integers (a1, a2 and a3) and their
common size (a_size) as parameters and returns the number of values that occur
in all three arrays. You are told that each array does not contain any duplicates.
Also, the arrays are sorted in ascending order. For example, a valid input is
a1={1,2,3,4}, a2={2,3,4,5}, a3={3,5,8,9} and a_size=4. In this case, the
function returns 1 since only one value is common to all three arrays (i.e. the
value 3). The solution should iterate over the arrays and not use built-in methods
of any language to find the intersection of arrays.
Answer:
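One possible approach, sketched in Python, walks the three arrays with one index each and never uses a built-in intersection. The function name count_common is a hypothetical choice for illustration; the parameters a1, a2, a3 and a_size come from the question. It relies on the stated guarantees that the arrays are sorted ascending and contain no duplicates.

```python
def count_common(a1, a2, a3, a_size):
    """Count the values that occur in all three sorted, duplicate-free arrays."""
    i = j = k = 0  # one cursor per array
    count = 0
    while i < a_size and j < a_size and k < a_size:
        x, y, z = a1[i], a2[j], a3[k]
        if x == y == z:
            # Found a value present in all three arrays; move past it everywhere.
            count += 1
            i += 1
            j += 1
            k += 1
        elif x < y:
            i += 1  # x is too small to match y or anything after it
        elif y < z:
            j += 1  # y is too small to match z or anything after it
        else:
            k += 1  # here z <= y <= x and they are not all equal, so z is too small
    return count
```

With the example input from the question, count_common([1, 2, 3, 4], [2, 3, 4, 5], [3, 5, 8, 9], 4) returns 1. Each cursor only moves forward, so the loop runs at most 3 * a_size times.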
3. Q) Write a function that takes a string containing a sentence and reverses the
words in the sentence. For example, the sentence “Lord of the Rings” becomes
“Rings the of Lord”. You may not use high-level language constructs such as split,
join, or reverse. Assume you are using a lower-level language such as C.
Answer:
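A common way to do this in a C-like style is to reverse the whole character buffer in place and then reverse each word in place. The sketch below is written in Python but mimics a C char array with a bytearray, so it assumes ASCII input and single spaces between words. The names reverse_range and reverse_words are hypothetical, and no split, join, or reverse calls are used.

```python
def reverse_range(buf, lo, hi):
    """Reverse buf[lo..hi] (inclusive) in place by swapping from both ends."""
    while lo < hi:
        buf[lo], buf[hi] = buf[hi], buf[lo]
        lo += 1
        hi -= 1


def reverse_words(sentence):
    """Return the sentence with its words in reverse order (ASCII input assumed)."""
    buf = bytearray(sentence, "ascii")  # mutable buffer, like a C char array
    n = len(buf)
    # Step 1: reverse every character, which puts the words in the right
    # order but leaves each word spelled backwards.
    reverse_range(buf, 0, n - 1)
    # Step 2: reverse each word back to its original spelling.
    start = 0
    for i in range(n + 1):
        if i == n or buf[i] == ord(" "):
            reverse_range(buf, start, i - 1)
            start = i + 1
    return buf.decode("ascii")
```

For the example in the question, reverse_words("Lord of the Rings") returns "Rings the of Lord". The approach needs no extra buffer beyond the mutable copy of the input.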
4. Q)
    if (var1 == true)
    {
        var2 = true;
    }
    else
    {
        var2 = false;
    }
Provide a one-line piece of code that is functionally equivalent to the above code
block.
Answer:
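One possible answer assigns the result of the comparison directly. The sketch below shows this in Python, wrapped in a hypothetical function named assign only so that it can be exercised; the one line of interest is the assignment to var2. In a language where var1 is already a boolean, assigning var1 to var2 directly would have the same effect.

```python
def assign(var1):
    var2 = (var1 == True)  # the one-line equivalent of the if/else block
    return var2
```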
5. Q) The FizzBuzz test. Write code to print the numbers from 1 to 100, with the
following exceptions. For numbers that are a multiple of 3, print Fizz instead of the
number. For numbers that are a multiple of 5, print Buzz. For numbers that are a multiple
of both 3 and 5, print FizzBuzz.
Answer:
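A minimal sketch in Python follows. The helper names fizzbuzz_line and fizzbuzz are hypothetical; the only point that needs care is testing the "multiple of both 3 and 5" case before the single-multiple cases.

```python
def fizzbuzz_line(n):
    """Return the text to print for the number n."""
    if n % 15 == 0:  # multiple of both 3 and 5, so this must be checked first
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


def fizzbuzz(limit=100):
    """Return the lines for 1..limit inclusive."""
    return [fizzbuzz_line(n) for n in range(1, limit + 1)]


if __name__ == "__main__":
    for line in fizzbuzz():
        print(line)
```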
6. Q) Imagine you have four cards such that each side of each card has either an “A”,
“B”, “2” or “3”. The four cards are arranged so you can see one side of each card. I
propose a theory that where a card has a vowel on one side, the other side will have an
even number. Which cards do we need to turn over in order to test my theory?
Explain your reasoning.
7. Q) Given a requirement to write a method to add 2 integers, what are some
possibilities as to how you would handle the fact that one or both numbers may be equal to
or close to int_max?
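Two of the possible strategies, rejecting the addition or clamping the result, can be sketched as follows. Python integers do not overflow, so this sketch simulates a 32-bit signed int; the constants INT_MAX and INT_MIN and the function names would_overflow, checked_add and saturating_add are assumptions made for illustration. The key idea is to test for overflow before adding, using only comparisons that themselves stay in range. Other options, such as widening to a larger integer type, are not shown.

```python
INT_MAX = 2**31 - 1   # assumed 32-bit signed range
INT_MIN = -2**31


def would_overflow(a, b):
    """Return True if a + b would fall outside [INT_MIN, INT_MAX]."""
    if b > 0:
        return a > INT_MAX - b   # INT_MAX - b cannot overflow when b > 0
    if b < 0:
        return a < INT_MIN - b   # INT_MIN - b cannot overflow when b < 0
    return False


def checked_add(a, b):
    """Option 1: report the problem to the caller instead of wrapping around."""
    if would_overflow(a, b):
        raise OverflowError("sum does not fit in a 32-bit signed int")
    return a + b


def saturating_add(a, b):
    """Option 2: clamp the result to the nearest representable value."""
    if would_overflow(a, b):
        return INT_MAX if b > 0 else INT_MIN
    return a + b
```

Which option is appropriate depends on the caller: raising makes the failure visible, while saturating keeps the computation going at the cost of a silently inexact result.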
8. Q) Given 2 rectangles in 2-dimensional space, determine if they overlap. For each
rectangle, you’re given: x1, x2, y1, y2, the x position of the left line, x position of the
right line, y position of the top line, and y position of the bottom line.
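One way to approach this is to check the x and y axes independently: two axis-aligned rectangles overlap only if their extents overlap on both axes. The sketch below makes two assumptions that the question leaves open. First, rectangles that merely touch along an edge or corner are not counted as overlapping. Second, the direction of the y axis is unknown, so the top and bottom values are normalized with min and max. The Rect type and the overlaps function are hypothetical names.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x1: float  # x position of the left line
    x2: float  # x position of the right line
    y1: float  # y position of the top line
    y2: float  # y position of the bottom line


def overlaps(a, b):
    """Return True if rectangles a and b share some interior area."""
    # Normalize the vertical extent so the check works whether y grows up or down.
    a_lo, a_hi = min(a.y1, a.y2), max(a.y1, a.y2)
    b_lo, b_hi = min(b.y1, b.y2), max(b.y1, b.y2)
    # Two intervals overlap when each one starts before the other one ends.
    x_overlap = a.x1 < b.x2 and b.x1 < a.x2
    y_overlap = a_lo < b_hi and b_lo < a_hi
    return x_overlap and y_overlap
```

For example, overlaps(Rect(0, 2, 2, 0), Rect(1, 3, 3, 1)) is True, while overlaps(Rect(0, 1, 1, 0), Rect(2, 3, 1, 0)) is False. Changing the strict comparisons to <= would count touching rectangles as overlapping.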
9. Patterns
Find the value of “x” in each of the following patterns:
1, 4, 27, 256, x ________
1, 8, 27, 64, 125, 216, x ________
-1, 0, 3, 8, 15, 24, 35, 48, 63, 80, x ________
60, 30, 20, 15, 12, x ________
2, 4, 12, 48, 240, 1440, x ________
17, 19, 23, 29, 31, 37, x ________
4, 6, 10, 14, 22, 26, x ________
32, 36, 9, 12, 4, 6, x ________
13, 6, 10, 9, 7, 12, 4, 15, x ________
25, 32, 37, 47, 58, 71, 79, x ________
10. Machine Learning Quiz
You may answer the questions with one or two sentences each.
Q) Explain the difference between supervised, unsupervised, and
semisupervised learning.
Q) What is the difference between classification, regression, and
recommendation problems?
Q) What types of predictors are in common usage today (e.g. k-NN, SVM, etc.)?
Please name some others.
Q) What is the purpose of regularization?
Q) What is the purpose of normalizing data?
Q) What methods may be used to create ensembles?
Q) What techniques may be used for feature selection?
Q) Why is feature selection important?
Q) Specifically, what issues can be caused by redundant features?
Q) What measures may be used to evaluate classification models?
Q) With SVMs, what are the support vectors?
Q) What techniques may be used for hyper-parameter tuning?