NATC 2013 - Platform Development Efforts by Chidambaran Kollengode,Nokia and Hemanth Yamijala, Thoughtworks


    1. 1. BUILDING A MANAGED BIGDATA PLATFORM FOR THE CLOUD - NOKIA’S BUSINESS INSIGHTS SUITE Chidambaran Kollengode (Director, BigData Platform, Nokia) Hemanth Yamijala (Lead Consultant, ThoughtWorks)
    2. 2. AGENDA • Business context • Historical perspective • Moving to the cloud • Architecture • Deep dive: job management
    3. 3. BUSINESS CASES Device Analytics: • Device / app activity data • Quality metrics (crash reports, etc.) Location Analytics: • Maps data • Traffic patterns Customer Profiles: • Activations • Usage & preference patterns • App recommendations
    4. 4. WHY A MANAGED PLATFORM • Reuse infrastructure, data and process • Build and leverage specialized skill sets for the platform management • Separate concerns of building apps and managing the platform • Enable evaluation and adoption of new technologies
    5. 5. WHAT WAS OUR VISION? • Merge all the islands of data into one • Institute a data driven decision making culture • Provide a framework where standard tasks are addressed so that business can innovate • Provide a self service model
    6. 6. GROWTH OF THE PLATFORM [Charts: volume of data in TB (axis 0 to 1500) and number of jobs in 1000s (axis 0 to 20), plotted monthly from Mar-12 to Dec-13]
    7. 7. HOW MUCH DID WE SUCCEED • Merge all the islands of data into one • Institute a data driven decision making culture • Provide a framework where standard tasks are addressed so that business can innovate • Provide a self service model
    8. 8. A PROBLEM OF PLENTY! • Explosive growth in volumes of data & jobs • Difficulty in experimenting with new platforms / technologies • Difficulty in procuring hardware fast enough to meet needs • Cost factor? • Handling variety and variability in workloads • Jobs with varying resource requirements, such as high-memory / high-CPU jobs • Jobs submitted in unpredictable loads • Administering the cluster became more and more difficult • Backlog of needed improvements kept growing
    9. 9. REQUIREMENTS OF THE NEW PLATFORM • MUST meet predictable and unpredictable resource needs • Scalable resources on demand • Cost effective • MUST be able to satisfy varying needs of users without complicating operations • MUST be able to adhere to corporate security policies • SHOULD be able to migrate with minimal disruption • Should support almost all components of the current Hadoop ecosystem
    10. 10. WHY AMAZON WEB SERVICES • Provides services that support BigData processing • S3: Highly scalable, reliable, cost effective storage • Elastic MapReduce: At the time, the most mature implementation of Hadoop on the cloud • Availability & choice • Hardware and Software platforms • Deployment models • Tools for building enterprise applications • Security, Identity management, devops, etc. • Sufficient control with users to implement custom solutions • Granular API based services to build platforms on
    11. 11. THINGS WE EVALUATED • AWS account management & IAM – Unified Access Control Layer • S3 Filesystem & abstractions on top of it • EMR integration • Data ingestion strategies • User and Data Provisioning model (partial)
    12. 12. MIGRATION STRATEGY: INFRASTRUCTURE • BigData infrastructure: Data in S3, processing on EMR clusters • Other Managed Service: Use Amazon equivalents – RDS, etc. • Build a managed service platform on top of AWS • Evaluate and adopt AWS components where relevant
    13. 13. MIGRATION STRATEGY: DATA • Phased migration of data from DC to S3 • Implemented a custom migration utility for data transfer • Forking current data from DC ingestion layer to AWS ingestion layer • 653 TB of data migrated
    14. 14. MIGRATION STRATEGY: PROCESS • Phased migration of application code from DC to EMR • Critical ETL processes needed for other migrations are being looked at first • Attempting to provide an environment as similar to the DC as possible • Unsupported components are being custom built or packaged from the standard Hadoop ecosystem, e.g. Oozie • Until migration completes, data will be available in both places
    15. 15. SYSTEM ARCHITECTURE Presentation Download Manager Command Line Tools Analytics Workbench Data Analytics Data Asset Catalog ETL Query Aggregates Platform Ingest Extract Workflow Nokia Encrypted File System Job Management Provisioning UAC Trust Server Scheduler Eventing Infrastructure Redshift Data S3 EMR Compute IAM Identity & Entity SNS Framework
    16. 16. ELASTIC MAPREDUCE - 101 • Provisions a Hadoop cluster of a given size, using a given type of instances • Support for most of the ecosystem: Hive, Pig, HBase, etc. • Can scale the nodes of a cluster up and down on demand • User submits 'jobflows', each a sequence of Hadoop jobs • Integrates with S3 as the permanent store of data • Integrates with other Amazon services • Cost = std. EC2 instance cost + extra + std. S3 ops etc.
    17. 17. WHAT CAN AN ENTERPRISE TIER ON EMR OFFER ? • Improve usability by providing better abstractions, necessary automation • Improve cost utilization by reusing infrastructure • Improve performance by providing system level optimizations
    18. 18. IMPROVE USABILITY • EMR expects users to know the cluster sizes when launching jobs • Users will either not know how to launch clusters, or will launch incorrectly sized ones. • Separate cluster management from job management. • Have the system (or administrators) launch clusters on behalf of users • Have the system submit jobs to appropriate clusters • Scale them according to the needs of the jobs automatically or administratively
    19. 19. IMPROVE COST UTILIZATION • Different cluster types in EMR: ephemeral (default) and static • Ephemeral clusters can be a huge cost drain (note: the minimum charge is for an hour) • Static clusters can also waste money (if unused) • Go with a hybrid model • Launch clusters on demand, but maximize utilization relative to cost: keep them alive for at least an hour • Reuse them for other jobs transparently • Shut down if not used anymore • Saved $2000 in a month with this strategy
    20. 20. COLLABORATIONS • Remove bias towards alternatives that perpetuate status quo • Accelerate development of insights • Invent-it-ourselves and diminishing returns • Agile (Training and accelerated learning) • Together we win! • Opportunistically seeking areas where we can solve common problems together • Intense collaboration on highly complex areas
    21. 21. IN CONCLUSION • The Cloud offers new opportunities, but new challenges as well. Evaluate & Embrace • Be open to work being done by others – stand on the shoulders of giants • Be agile – value simplicity and quick feedback, even if it is a failure • Persist. Seemingly tough problems are solvable, one step at a time
    22. 22. THANK YOU !
    23. 23. BACKUP
    24. 24. A MANAGED BIGDATA PLATFORM BigData Infrastructure Data Process
    25. 25. FIRST HADOOP PLATFORM @ NOKIA - OPTIONAL Access node Ingest Namenode Access node Extract Access node JobTracker Scheduler Access node Workflow Edge Nodes Control Tier Slave Slave Slave Slave Slave Slave Slave Slave Slave Slave Slave Slave Hadoop Cluster
    26. 26. BEYOND NUMBERS • Maturity level of Hadoop • We should talk about our journey of using a vendor, then beginning to tailor it, then seeding a team, then contributing to open source and … • How do you choose a distribution vendor? • Here we should just say we went with a vendor we were comfortable with at the time, based on knowledge of the space, acquaintances and growing industry repute
    27. 27. TODO: MISCELLANY? • How much of this did we achieve in Slough? • Maturity level of Hadoop: how did it progress? • How do you choose a distribution vendor?
    28. 28. LEARNING FROM THE PAST - OPTIONAL • SHOULD democratize data • Make data discoverable, accessible to everyone • Canonicalize data attributes, formats, etc. • SHOULD consolidate heavy lifting processes • Develop ways to reuse ETL, aggregation frameworks
    29. 29. IMPROVE USABILITY • EMR API expects some repetitive setup steps as part of job submission. E.g. Hive setup for all Hive jobs • Simple, preventable errors in jobflows can result in long feedback cycles and unnecessary cost. E.g. presence or absence of file locations • Provide a service API with a simpler interface that automates the setup. • Let the service API deal with simple validations before the job is submitted to EMR.
    30. 30. A JOB MANAGEMENT SYSTEM • Job Management Service: front-end service API for users to submit their jobs • Job Executor: has knowledge of how to convert a user jobflow to an EMR jobflow; also knows how to submit jobflows to clusters identified by the Cluster Manager • Resource Estimator: monitors running jobs on clusters using CloudWatch (or a similar system), and determines whether to add or remove nodes in a cluster • Cluster Manager: manages provisioning, monitoring and terminating clusters; matches job requests to suitable clusters based on policy • Pool of clusters: brought up either on demand or predetermined, based on resource requirements, longevity, etc.
    31. 31. QUERY – QUBOLE INTEGRATION
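
The hybrid cluster model from slides 18, 19 and 30 (launch on demand, reuse idle clusters transparently, shut down when no longer used) can be illustrated with a minimal in-memory sketch. This is not the platform's actual code: the names `Cluster`, `ClusterPool`, `acquire`, `release` and `reap_idle`, the `instance_type` labels, and the five-minute shutdown margin are all assumptions made for illustration. The sketch makes no AWS calls, and it assumes hourly billing where every started hour is paid in full.

```python
from dataclasses import dataclass, field
from typing import List

# Assumption: a cluster is billed per started hour (slide 19 notes a
# one-hour minimum charge).
BILLING_UNIT_SECONDS = 3600


@dataclass
class Cluster:
    cluster_id: str
    instance_type: str   # hypothetical label, e.g. "high-memory"
    launched_at: float   # launch time in seconds
    busy: bool = False


@dataclass
class ClusterPool:
    """Hypothetical stand-in for the Cluster Manager's pool of clusters."""
    clusters: List[Cluster] = field(default_factory=list)
    _next_id: int = 0

    def acquire(self, instance_type: str, now: float) -> Cluster:
        """Reuse an idle cluster of the requested type, else launch one."""
        for cluster in self.clusters:
            if not cluster.busy and cluster.instance_type == instance_type:
                cluster.busy = True
                return cluster
        self._next_id += 1
        cluster = Cluster(f"cluster-{self._next_id}", instance_type, now, busy=True)
        self.clusters.append(cluster)
        return cluster

    def release(self, cluster: Cluster) -> None:
        """Mark a cluster idle so later jobs can reuse it."""
        cluster.busy = False

    def reap_idle(self, now: float, margin: float = 300.0) -> List[str]:
        """Terminate idle clusters that are near the end of a paid hour.

        An idle cluster early in its hour is kept, since that hour is
        already paid for and another job may still reuse it.
        """
        keep, terminated = [], []
        for cluster in self.clusters:
            into_hour = (now - cluster.launched_at) % BILLING_UNIT_SECONDS
            if not cluster.busy and into_hour >= BILLING_UNIT_SECONDS - margin:
                terminated.append(cluster.cluster_id)
            else:
                keep.append(cluster)
        self.clusters = keep
        return terminated
```

In this sketch a job-submission path would call `acquire` before handing a jobflow to a cluster and `release` when it finishes, while a periodic task calls `reap_idle`. If the periodic task misses the margin window, the idle cluster simply lives through another billed hour, which is a simplification a real policy would need to handle.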