2. History
• No friendly gateways to access historical forecasting snapshots (input, interim, output, etc.)
• No friendly gateways to submit ad-hoc queries (troubleshooting) and new algorithms
• SLA ETLs are hard to launch and maintain
• …
3. Architecture
Cloud Based Data Warehouse
Hadoop (EMR) Clusters
EASTEROS: Router service
EASTEROS: Analytic Portal / CLI
4. Why Easteros?
• Simple gateways for job submission and monitoring
– Access to each snapshot of pipeline run
• Separate the big data software stack from users (analysts, scientists, retail in-stock managers)
5. Easteros: Router service
• Users' perspective
– RESTful service to run Hive and Hadoop jobs
– Auto-select the proper EMR cluster based on cluster load
– Users don't need to set up and maintain clusters
– Sophisticated users can provide cluster configs
– Check job logs periodically (flushed to S3 every 5 minutes)
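
The load-based cluster selection mentioned above could look roughly like the following. This is a minimal sketch, not the Easteros implementation: the `Cluster` type, its `running_jobs` and `capacity` fields, and the "lowest load ratio wins" policy are all assumptions made for illustration, since the slides do not say how load is measured.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Cluster:
    """Hypothetical view of one EMR cluster as a router might track it."""
    cluster_id: str
    running_jobs: int
    capacity: int  # assumed maximum number of concurrent jobs

    @property
    def load(self) -> float:
        # Fraction of the assumed capacity currently in use.
        return self.running_jobs / self.capacity


def select_cluster(clusters: Sequence[Cluster]) -> Optional[Cluster]:
    """Return the least-loaded cluster with free capacity, or None if all are full."""
    candidates = [
        c for c in clusters
        if c.capacity > 0 and c.running_jobs < c.capacity
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.load)
```

When `select_cluster` returns `None`, a router of this kind would presumably queue the job or start a new cluster; the sketch leaves that decision to the caller.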
6. Easteros: Router service
• SDEs' perspective
– Spin up new clusters automatically
– Override site-specific Hive/Hadoop configurations
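
One way to picture the configuration override is as a merge of per-job settings over site defaults. The sketch below is an assumption about how such a layer might work, not a description of Easteros itself: the function names are hypothetical and the property values are illustrative. The only external detail relied on is the Hive CLI's `--hiveconf property=value` option.

```python
from typing import Dict, List, Mapping


def effective_config(
    site_defaults: Mapping[str, str],
    overrides: Mapping[str, str],
) -> Dict[str, str]:
    """Merge overrides on top of site defaults without mutating either input."""
    merged = dict(site_defaults)
    merged.update(overrides)  # per-job values win over site-specific ones
    return merged


def to_hiveconf_args(config: Mapping[str, str]) -> List[str]:
    """Render a config mapping as Hive CLI arguments (--hiveconf key=value)."""
    args: List[str] = []
    for key in sorted(config):  # sorted for a stable, reproducible order
        args += ["--hiveconf", f"{key}={config[key]}"]
    return args
```

For example, merging `{"hive.exec.parallel": "true"}` over site defaults that set it to `"false"` yields arguments in which the job-level value takes effect, while the site defaults themselves stay unchanged.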