Big Data Analyst Questionnaire
Within this document are four different questions. Each question is structured in the following manner:
1) Premise
- Contains any needed background information
2) Request
- The actual question, what you are to solve
3) Notes
- A space if you feel like including notes of any kind for the given question
Please place your answer for each question in a separate file, following this naming convention:
Name_Qn.docx, where n is the question number (e.g., 1, 2, ...). For example, the file for the first question should be named ‘Name_Q1.docx’.
When complete, please package everything together and send email responses to the designated POCs.
Premise:
You have a table named “TRADES” with the following six columns:
Column Name | Data Type | Description
Date | DATE | The calendar date on which the trade took place.
Firm | VARCHAR(255) | A symbol representing the Broker/Dealer who conducted the trade.
Symbol | VARCHAR(10) | The security traded.
Side | VARCHAR(1) | Denotes whether the trade was a buy (purchase) or a sell (sale) of a security.
Quantity | BIGINT | The number of shares involved in the trade.
Price | DECIMAL(18,8) | The dollar price per share traded.
You write a query looking for all trades in the month of August 2019. The query returns the following:
DATE | FIRM | SYMBOL | SIDE | QUANTITY | PRICE
8/5/2019 | ABC | 123 | B | 200 | 41
8/5/2019 | CDE | 456 | B | 601 | 60
8/5/2019 | ABC | 789 | S | 600 | 70
8/5/2019 | CDE | 789 | S | 600 | 70
8/5/2019 | FGH | 456 | B | 200 | 62
8/6/2019 | 3CDE | 456 | X | 300 | 61
8/8/2019 | ABC | 123 | B | 300 | 40
8/9/2019 | ABC | 123 | S | 300 | 30
8/9/2019 | FGH | 789 | B | 2100 | 71
8/10/2019 | CDE | 456 | S | 1100 | 63
Questions:
1) Conduct an analysis of the data set returned by your query. Write a paragraph describing your analysis. Please also note any questions or assumptions made about this data.
2) Your business user asks you to show them a table output that includes an additional column, named ‘Tier’, categorizing the TRADES data into volume-based tiers. Quantities between 0-250 will be considered ‘Small’, quantities greater than ‘Small’ but less than or equal to 500 will be considered ‘Medium’, quantities greater than ‘Medium’ but less than or equal to 500 will be considered ‘Large’, and quantities greater than ‘Tier 3’ will be considered ‘Very Large’.
a. Please write the SQL query you would use to add the column to the table output.
b. Please show the exact results you expect based on your SQL query.
3) Your business user asks you to show them a table output summarizing the TRADES data (Buy and Sell) on a week-by-week basis.
a. Please write the SQL query you would use to query this table.
b. Please show the exact results you expect based on your SQL query.
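One way to experiment with the tiering and weekly-summary requests is to load the ten sample rows into an in-memory SQLite database from Python. The sketch below is only an illustration under stated assumptions, not a model answer: dates are rewritten as ISO strings so SQLite can parse them; `LARGE_MAX = 1000` is a hypothetical upper bound for ‘Large’, because the text gives 500 as the ceiling for both ‘Medium’ and ‘Large’; weeks are Monday-based (`%W`); and rows whose Side is neither ‘B’ nor ‘S’ are left out of the weekly summary.

```python
import sqlite3

# Sample rows copied from the August 2019 result set (dates as ISO strings).
ROWS = [
    ("2019-08-05", "ABC", "123", "B", 200, 41),
    ("2019-08-05", "CDE", "456", "B", 601, 60),
    ("2019-08-05", "ABC", "789", "S", 600, 70),
    ("2019-08-05", "CDE", "789", "S", 600, 70),
    ("2019-08-05", "FGH", "456", "B", 200, 62),
    ("2019-08-06", "3CDE", "456", "X", 300, 61),
    ("2019-08-08", "ABC", "123", "B", 300, 40),
    ("2019-08-09", "ABC", "123", "S", 300, 30),
    ("2019-08-09", "FGH", "789", "B", 2100, 71),
    ("2019-08-10", "CDE", "456", "S", 1100, 63),
]

# ASSUMPTION: hypothetical ceiling for 'Large'; the questionnaire repeats 500.
LARGE_MAX = 1000

# CASE branches are evaluated in order, so each one only needs an upper bound.
TIER_SQL = """
SELECT "Date", Firm, Symbol, Side, Quantity, Price,
       CASE
           WHEN Quantity <= 250 THEN 'Small'
           WHEN Quantity <= 500 THEN 'Medium'
           WHEN Quantity <= :large_max THEN 'Large'
           ELSE 'Very Large'
       END AS Tier
FROM TRADES
ORDER BY "Date", Firm, Symbol
"""

# One output row per (week, side) with trade count and total shares.
WEEKLY_SQL = """
SELECT strftime('%Y-%W', "Date") AS Week,  -- Monday-based week of the year
       Side,
       COUNT(*) AS Trades,
       SUM(Quantity) AS Shares
FROM TRADES
WHERE Side IN ('B', 'S')                   -- ASSUMPTION: other codes excluded
GROUP BY Week, Side
ORDER BY Week, Side
"""


def build_db():
    """Create an in-memory TRADES table holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE TRADES ("Date" TEXT, Firm TEXT, Symbol TEXT, '
        "Side TEXT, Quantity INTEGER, Price REAL)"
    )
    conn.executemany("INSERT INTO TRADES VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return conn


conn = build_db()
tiers = conn.execute(TIER_SQL, {"large_max": LARGE_MAX}).fetchall()
weekly = conn.execute(WEEKLY_SQL).fetchall()

for row in tiers:
    print(row)
for row in weekly:
    print(row)
```

Changing `LARGE_MAX`, the week convention, or the Side filter changes the output, so any answer built along these lines should state those choices alongside the results.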
Notes:
Premise:
You need to describe in writing how to accomplish a task. Your audience has never completed this task before.
Question:
In a few paragraphs, please describe how to complete a task of your choice. You may choose a task of your own liking or one of the sample tasks below:
1) How to make a peanut butter and jelly sandwich
2) How to get leaves off a lawn
3) How to make a cup of tea
Notes:
Premise:
Below is a snapshot of data from two tables: “Orders” and
“Customers”, taken on 02/05/2016. You find the following
documentation:
· The ORDERS table gets updated at the end of every day
· The CUSTOMERS table gets updated at the end of every week
ORDERS Table
Field Name | Description
ORDER_DT | Date the order was placed.
ORDER_ID | A unique identifier for each order.
ORDER_STATUS | The status of an order.
CUSTOMER_ID | Identifies a unique customer.
CUSTOMERS table
Field Name | Description
CUSTOMER_ID | The unique identifier of the Customer trading in the market.
CUSTOMER_STATUS | The Customer's account status. It should be ‘Active’ in order to be eligible for Order processing.
CUSTOMER_FNAME | First name of a customer.
CUSTOMER_MNAME | Middle name of a customer.
CUSTOMER_LNAME | Last name of a customer.
GENDER | Gender of a customer.
AGE | Age of a customer.
Table Name: ORDERS
ORDER_DT | ORDER_ID | ORDER_STATUS | ORDER_STATUS_CD | CUSTOMER_ID
2/1/2016 | 1000002 | Completed | S | 4
2/2/2016 | 2000008 | Processing | P | 6
2/2/2016 | 2000009 | Completed | S | 7
2/2/2016 | 2000010 | Completed
Any coding language can be used to query the data.
Question:
1) Your business user asks you to combine the details from
these two tables in one table output, without any duplicated
columns.
A. Please write the query you would use to query this (note
which language you are using).
B. Please show the exact results you expect based on your SQL
query.
C. If you make assumptions to complete the task, please
document them.
2) Through an investigation, your business user has learnt that
there has been an order that was processed successfully by
mistake.
A. Please write the query you would use to validate (or
disprove) this finding (note which language you are using).
B. Please show the exact results you expect based on your SQL
query.
C. If you make assumptions to complete the task, please
document them.
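The two requests above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration under assumptions, not a definitive answer: the three ORDERS rows are the complete rows visible in the snapshot, while every CUSTOMERS row (names, statuses, ages) is hypothetical and exists only so the queries have something to return. It also assumes that ‘Completed’ means an order was processed successfully and that CUSTOMER_ID is the join key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ORDERS (
    ORDER_DT TEXT, ORDER_ID INTEGER, ORDER_STATUS TEXT,
    ORDER_STATUS_CD TEXT, CUSTOMER_ID INTEGER);
CREATE TABLE CUSTOMERS (
    CUSTOMER_ID INTEGER, CUSTOMER_STATUS TEXT, CUSTOMER_FNAME TEXT,
    CUSTOMER_MNAME TEXT, CUSTOMER_LNAME TEXT, GENDER TEXT, AGE INTEGER);
""")

# The three complete ORDERS rows visible in the snapshot (ISO dates).
conn.executemany("INSERT INTO ORDERS VALUES (?, ?, ?, ?, ?)", [
    ("2016-02-01", 1000002, "Completed", "S", 4),
    ("2016-02-02", 2000008, "Processing", "P", 6),
    ("2016-02-02", 2000009, "Completed", "S", 7),
])

# HYPOTHETICAL customer rows, invented purely for illustration.
conn.executemany("INSERT INTO CUSTOMERS VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (4, "Active", "Ann", None, "Example", "F", 30),
    (6, "Active", "Bob", None, "Sample", "M", 41),
    (7, "Inactive", "Cy", None, "Placeholder", "M", 52),
])

# Request 1: combine both tables; CUSTOMER_ID is selected once, from ORDERS.
# LEFT JOIN keeps orders whose customer is missing from CUSTOMERS.
COMBINED_SQL = """
SELECT o.ORDER_DT, o.ORDER_ID, o.ORDER_STATUS, o.ORDER_STATUS_CD,
       o.CUSTOMER_ID, c.CUSTOMER_STATUS, c.CUSTOMER_FNAME,
       c.CUSTOMER_MNAME, c.CUSTOMER_LNAME, c.GENDER, c.AGE
FROM ORDERS AS o
LEFT JOIN CUSTOMERS AS c ON c.CUSTOMER_ID = o.CUSTOMER_ID
ORDER BY o.ORDER_ID
"""

# Request 2: completed orders whose customer is missing or not 'Active'.
SUSPECT_SQL = """
SELECT o.ORDER_ID, o.CUSTOMER_ID, c.CUSTOMER_STATUS
FROM ORDERS AS o
LEFT JOIN CUSTOMERS AS c ON c.CUSTOMER_ID = o.CUSTOMER_ID
WHERE o.ORDER_STATUS = 'Completed'
  AND (c.CUSTOMER_STATUS IS NULL OR c.CUSTOMER_STATUS <> 'Active')
ORDER BY o.ORDER_ID
"""

cursor = conn.execute(COMBINED_SQL)
combined_columns = [d[0] for d in cursor.description]
combined = cursor.fetchall()
suspects = conn.execute(SUSPECT_SQL).fetchall()

print(combined_columns)
for row in combined:
    print(row)
print(suspects)
```

Because ORDERS is refreshed daily and CUSTOMERS only weekly, a customer's status may have changed since the last CUSTOMERS update; a flagged order is therefore a lead to check rather than proof of a mistake.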
Notes: