
An introduction to Databricks

An introduction to Databricks: what is it, how does it work,
and what can it do?

Published in: Technology

  1. Databricks
     ● What is Databricks?
     ● Cloud services used
     ● Functionality
     ● Languages
     ● Spark Usage
     ● 3rd Party Apps
     ● Architecture
     ● Books
     www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  2. Databricks – What is it?
     ● A cloud-based Apache Spark cluster service
     ● Offers scalable Spark clusters based on AWS
     ● Developed by the same people who created Spark
     ● Multiple cluster management
     ● Job scheduling and library import
     ● Offers access to all Spark modules
  3. Databricks – Cloud Services
     ● Currently uses Amazon AWS
     ● Uses EC2 and has access to S3 buckets
     ● Uses a minimum of 2 EC2 instances
     ● Attempts to optimise EC2 usage
     ● Plans to extend to other cloud providers
  4. Databricks – Functionality
     ● Architecture based on Notebooks and folders
     ● Has a cluster manager for
       – Defined (min 54 GB) clusters
       – Spot clusters
       – On Demand clusters
     ● Has a job manager and scheduler
     ● Has user management
     ● Has full Spark functionality
     ● Has strong data visualisation capability
     ● Can export reports and dashboards
  5. Databricks – Languages
     ● Can have Notebooks in
       – Scala
       – Python
       – SQL
     ● SQL can be executed in non-SQL Notebooks
     ● Markdown comments can be placed in Notebooks
     ● Notebooks can be shared by multiple sessions
     ● Libraries can be imported and called in Notebooks
  6. Databricks – Spark Usage
     ● Latest Spark version available
       – e.g. DB 1.3.4 uses Spark 1.3.1 as of June 2015
     ● All Spark modules available
       – SQL, GraphX, MLlib, Streaming
     ● Strong integration between modules and visualisation
     ● Extensive use of tables to import data
     ● Tables available via SQL
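The table-centred workflow on the slide above (import data into a table, then query it with SQL from a non-SQL notebook) can be sketched in a few lines of Python. This is only an illustration of the pattern, not Databricks code: it uses Python's built-in sqlite3 module as a stand-in for Spark SQL so that it runs without a cluster, and the table name `trips` and its rows are made up for the example.

```python
import sqlite3

# Stand-in for a Databricks table: an in-memory SQLite table.
# Assumption: sqlite3 replaces Spark SQL here purely so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (city TEXT, fare REAL)")

# "Import" some rows into the table; the data is invented for illustration.
rows = [("Wellington", 12.5), ("Auckland", 20.0), ("Wellington", 7.5)]
conn.executemany("INSERT INTO trips VALUES (?, ?)", rows)

# Query the table with SQL from Python code, as a non-SQL notebook would.
query = "SELECT city, SUM(fare) FROM trips GROUP BY city ORDER BY city"
totals = conn.execute(query).fetchall()

print(totals)  # [('Auckland', 20.0), ('Wellington', 20.0)]
```

In a Python notebook on a Spark 1.3-era cluster, the corresponding step would most likely pass the same query string to `sqlContext.sql(query)`, which returns a DataFrame rather than a list of tuples; the table itself would need to exist in the workspace first.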
  7. Databricks – 3rd Party Apps
     ● Currently available, with more to come
       – Pentaho
       – Qlik
       – Tableau
       – TIBCO Jaspersoft
       – PanTera
       – ZoomData
  8. Databricks – Architecture
  9. Available Books
     ● See our Hadoop book from Apress / Springer
       – “Big Data Made Easy”
     ● Look out for our Apache Spark based book
       – from Packt in 2015
  10. Contact Us
      ● Feel free to contact us at
        – www.semtech-solutions.co.nz
        – info@semtech-solutions.co.nz
      ● We offer IT project consultancy
      ● We are happy to hear about your problems
      ● You pay only for the hours you need to solve your problems
