
Learn to Use Databricks for Data Science


Data scientists face numerous challenges throughout the data science workflow that hinder productivity. As organizations become more data-driven, a collaborative environment is more critical than ever: one that provides easier access and visibility into the data, the reports and dashboards built against it, reproducibility, and the insights uncovered within it. Join us to hear how Databricks' open and collaborative platform simplifies data science by enabling you to run all types of analytics workloads, from data preparation to exploratory analysis and predictive analytics, at scale, all on one unified platform.



  1. Learn to Use Databricks for Data Science
     Sean Owen, Principal Solutions Architect
     Austin Ford, Sr. Product Manager
  2. Data science is a tough job
     ▪ Today, companies are becoming more and more data-driven, and those that get the most out of their data will be the ones to succeed
     ▪ As a result, data science is now a core capability of many businesses
     ▪ Unfortunately, it comes with a challenging, complex workflow at scale
  3. What does a data science workflow look like? Phase 1: Setup
     I've been given a business question to answer with data. Before I can even get started on the data science, I need to set up my development environment.
     ▪ I need the correctly sized compute resource for my task
     ▪ I need to be able to find and access the right data sources to fuel my analysis
     ▪ I need to be sure my toolbox is ready with the packages and libraries required for my work
  4. What does a data science workflow look like? Phase 2: Data Science
     Once the initial overhead of setup is complete, the real work begins. At any point, I could be sent back to the Setup phase to add another data source, change the size of my compute resource, or pull in another library.
     ▪ I start with exploratory data analysis to familiarize myself with the data and form hypotheses
     ▪ I uncover insights through statistical inference, modeling, or other methods
     ▪ I synthesize the results of my work and the answers to the original business question
  5. What does a data science workflow look like? Phase 3: Sharing Results
     The most important step comes once I finish the analysis: sharing the results with my stakeholders.
     ▪ I formulate the results into a report or dashboard so they can be consumed
     ▪ I share the results with my business stakeholders via email or Slack
     ▪ I get feedback about my work from my stakeholders and iterate with them to have the biggest impact
  6. Our answer: the Databricks Lakehouse Platform
     We want to remove the overhead so you can focus on the most important part of your work: data science
  7. [Architecture diagram: the Lakehouse Platform. Structured, semi-structured, unstructured, and streaming data flow through Open Data Storage and Data Management & Governance into BI & SQL Analytics, Data Science & Engineering, Machine Learning, and Real-time Data Applications. Tagline: Simple | Open | Collaborative; Reliable | Scalable | Secure]
  8. [Same Lakehouse Platform diagram, with Data Science & Engineering highlighted as "Our focus today"]
  9. Databricks makes setup easy (Phase 1: Setup)
     ▪ The Lakehouse brings all your company's data together into a single place, so you don't have to go digging through a variety of data sources
     ▪ Easily choose the right compute resource for your task and switch as needed: single-machine VMs, GPUs, Spark clusters
     ▪ Databricks runtimes come prepackaged with the most common data science tools, and customization is easy: add Python libraries on top of a runtime with a single line of code
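The "single line of code" for adding a library typically means a notebook-scoped install with the `%pip` magic command in a Databricks notebook cell. A minimal sketch (the package name is illustrative, and this runs only inside a notebook, not as a plain Python script):

```python
# Databricks notebook cell: install a library scoped to this notebook's
# Python environment, leaving other notebooks on the cluster unaffected.
%pip install xgboost

# In a subsequent cell, the library can be imported as usual:
# import xgboost
```

Cluster-wide libraries can also be attached through the cluster configuration UI, but the notebook-scoped approach keeps experiments isolated.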
  10. Databricks has the tools to let you focus on your work (Phase 2: Data Science)
     ▪ Multi-language, collaborative notebooks with co-presence, commenting, and co-editing
     ▪ Built-in visualizations that take you from raw data to insights in two clicks
     ▪ Auto-logged revision history and Git integration to ensure reproducibility and enable version control
  11. Databricks lets you share results and iterate quickly (Phase 3: Sharing Results)
     ▪ Easily share your notebooks with stakeholders, who can view them as reports
     ▪ Create a dashboard directly from your notebook's results
     ▪ Iterate with your stakeholders directly in the notebook through comments and co-presence
  12. Getting practical: hands-on with an expert
     Sean Owen, Principal Solutions Architect
