
Big Data for Data Scientists - Info Session

In this talk, WeCloudData introduces the Hadoop/Spark ecosystem and how businesses use big data tools and platforms. For more details about WeCloudData's Big Data for Data Scientists course, please visit: https://weclouddata.com/data-science/

  1. Big Data for Data Scientists: Trends and Use Cases. WeCloudData | @WeCloudData | tordatascience | weclouddata
  2. Introduction – Data Skills Training | Career Services | Meetup Events. WeCloudData offers Toronto’s first data science accelerator program. We specialize in teaching leading-edge tools such as AWS, Spark, and Machine Learning and help our corporate clients upskill/reskill their data teams.
  3. Introduction – Faculty Team (21 instructors, 10 teaching assistants). WCD works with some of the most talented and experienced data science experts to deliver public and corporate trainings. We currently have 21 part-time and 2 full-time instructors. Our instructors bring their analytical expertise from various industries, teach students advanced tools such as Python, Hadoop, Spark, and AWS, and mentor students on end-to-end data projects.
  4. Product & Services – Corporate Training: Python for SAS and SQL Users | Machine Learning | Deep Learning | Big Data | Executive Workshops. We offer customized corporate training to Canadian companies with flexible schedules and learning support! We help train, upskill, and reskill data teams!
  5. WeCloudData Corporate Program – Corporate Data Programs: Python for SAS Users | Machine Learning | Big Data | AI/DS for Executives. We’ve delivered customized trainings to many large Canadian companies. We offer customized corporate training to Canadian companies with flexible schedules and learning support! We help train, upskill, and reskill data teams!
  6. Introduction – Communities we’re building: 8,000 members, 120 events. We organize one of the most active DS communities in Canada!
  7. Conference Workshop Provider: TMLS Conference (November 2018); TD Canada Analytics Month (October 2018) – Machine Learning Open Data, Spark ML and MLflow, Deep Learning with PyTorch, Python for SAS Users, Machine Learning with Python; Big Data & AI Toronto 2019 (June 2019) – Big Data in AWS Cloud, Spark for Data Science, Moving from On-Prem to Cloud. WeCloudData is the workshop provider of choice for conference vendors in Toronto thanks to our expertise and specialization.
  8. Analytics Events – We help companies with hiring/branding events. WeCloudData organizes one of the largest and most active data science communities in Toronto, with 7,500 members and 110 past events. We help companies facilitate mini-conferences and run hiring events.
  9. Instructor: Shaohua Zhang • Co-founder and CEO of WeCloudData; lead instructor for the corporate training program • Certified SAS Predictive Modeler since 2007 (among the first 20 in the world) • Helped build and lead the data science team at BlackBerry (2010 – 2015) • Helps the Communitech incubator and Open Data Exchange mentor startups on data strategies • Specializes in machine learning, big data, and cloud computing
  10. Learning Path – Data Science Program. Data Science Learning Path: Prerequisites; Data Science w/ Python (master data wrangling with Python); ML Applied (learn to build ML models using Sklearn); Big Data (harness big data with Hadoop, Hive, Presto, and AtScale); Spark (Machine Learning at Scale with PySpark ML and real-time deployment); ML Advanced (build your portfolio with hands-on Capstone projects). Contact us about the courses: info@weclouddata.com. Upcoming courses: https://weclouddata.com/upcoming-course-schedule
  11. Learning Path – Data Engineering Program. Data Engineering Learning Path: Programming for Data Engineering (Linux/Docker, Scala, Spark); ETL (Big Data) (Hadoop/Hive, Data Ingestion, Workflow, NoSQL); Spark In-Depth (Spark Internals, Spark Tuning); Realtime Analytics (Kafka, Spark Streaming, Apache Flink); Machine Learning Engineering (Scaling ML, Model Deployment, Pipeline Automation). Learn to build data pipelines, scale data processing with big data tools, and deploy real-time applications and machine learning models at scale. Contact us about the courses: info@weclouddata.com. Upcoming courses: https://weclouddata.com/upcoming-course-schedule
  12. Data Scientist
  13. Data Jobs in the Market: Data Handling, Complex Analytics, Big Data, Storytelling – Data Science / Data Scientist
  14. Data Science Essential Skills. Coding/Tools: Linux, Python/Scala/Java, Cloud (AWS), Hadoop, Spark. Math/ML: Statistics, Linear Algebra, Regression, Classification, Clustering, NLP. Storytelling: Presentation, Use cases, Project Mgmt, Communications.
  15. Data Science Job Requirements: Data Scientist vs. Data Analyst
  16. Data Science – The Myth. Data: Application, Scraping/API, Labeled data. Infra/Platform: RDBMS, Hadoop, Cloud. Data Engineering: ETL, Enrichment, Dataflow automation. AI/ML: Python ML, Deployment, Prediction API, Stream processing.
  17. Data Scientist – The Types. Operational DS – Focus: data wrangling, working with large/small messy data, building predictive models; Strength: data handling, tools, business knowledge. ML Engineer – Focus: ML model deployment, data pipelines; Strength: coding, algorithms, machine learning, platforms and tools. ML Researcher – Focus: algorithm development, research, IP; Strength: ML/DL algorithms, implementation, research. DS Product Manager – Focus: product strategy, business communications, project management; Strength: product sense, business requirements, DS acumen.
  18. Data Science Team. Data scientists are like unicorns, so they’re hard to find. Let’s focus instead on building data science teams that have data scientists, engineers, and analysts working towards the same goal.
  19. My DS Journey – Shaohua Zhang (2008–2018): Predictive Modeler, Grad School, Data Scientist, Instructor, DS Trainer, Mentor. Roles spanned Operational Data Scientist, Product Manager, and Data/ML Engineer. Projects: churn, up-sell/cross-sell, social network, recommender, big data, cloud, chatbot deployment, predictive maintenance (HR | Retail | Digital Analytics).
  20. Predictive Modeler – Customer Value across the lifecycle. Acquisition: Lead Gen, Digital Mktg, Mobile Ads (acquisition models). Growth: Cross/Up-sell, Segmentation, CLTV (LTV). Maturity: Taste graph, Personalization, Loyalty Management, Context-based Mktg (loyalty management). Decline: Churn models, Retention (retention – predict high-risk customers). Loss: Winback models (winback).
  21. Data Scientist
  22. Data Scientist
  23. Data Scientist | Business | Twitter API. Business: “Our new product feature received a lot of negative reviews. Can we do some analysis?”
  24. Data Scientist | Business. Business: “Our new product feature received a lot of negative reviews. Can we do some analysis?” “The analysis looks good. Can we build a small tool?”
  25. DS Trainer
  26. Big Data Analytics
  27. Data Collection
  28. Data Preparation – Credit Approval example (see the pandas sketch below):
      Client   | Age | Gender | Annual Salary | Months in Residence | Months in Job | Current Debt | Paid off Credit
      Client 1 | 23  | M      | $30,000       | 36                  | 12            | $5,000       | Yes
      Client 2 | 30  | F      | $45,000       | 12                  | 12            | $1,000       | Yes
      Client 3 | 19  | M      | $15,000       | 3                   | 1             | $10,000      | No
      Client 4 | 25  | M      | $25,000       | 12                  | 27            | $15,000      | ?
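
The pandas sketch below is not from the deck; it simply loads the slide's credit-approval table and does the minimal preparation step the slide alludes to. The column names and the numeric gender encoding are assumptions for illustration.

```python
# A minimal sketch (not course material): the slide's credit-approval table in pandas,
# with the categorical column encoded and the unlabeled client split out for scoring.
import pandas as pd

clients = pd.DataFrame({
    "age":                 [23, 30, 19, 25],
    "gender":              ["M", "F", "M", "M"],
    "annual_salary":       [30000, 45000, 15000, 25000],
    "months_in_residence": [36, 12, 3, 12],
    "months_in_job":       [12, 12, 1, 27],
    "current_debt":        [5000, 1000, 10000, 15000],
    "paid_off_credit":     ["Yes", "Yes", "No", None],   # Client 4 is the row to predict
}, index=["Client 1", "Client 2", "Client 3", "Client 4"])

clients["gender"] = clients["gender"].map({"M": 0, "F": 1})    # assumed numeric encoding
train = clients[clients["paid_off_credit"].notna()]             # labeled rows for model fitting
to_score = clients[clients["paid_off_credit"].isna()].drop(columns="paid_off_credit")
print(train)
print(to_score)
```
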
  29. Data Processing Engines
  30. Databases: Relational Database; NewSQL (e.g. Google F1); NoSQL (graph DB, search, cache)
  31. Analytics Tools
  32. Analytics Data Pipelines
  33. The Missing Piece – ML Prediction API. Velox Prediction Service (Model Manager, Web Application, HDFS), e.g. GET /velox/catify/predict?userid=22&song=277568 and GET /velox/catify/predict_top_k?userid=22&k=100 (see the client sketch below). Credit: https://arxiv.org/pdf/1409.3809.pdf
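
The two endpoints on the slide come from the cited Velox paper; as a hedged illustration only, the snippet below calls them with a plain HTTP client, assuming a prediction service is listening on localhost:8080. The host and port are assumptions; the paths and query parameters are taken from the slide.

```python
# Hypothetical client for the Velox-style prediction endpoints shown on the slide.
# The base URL is an assumption; the paths and query parameters come from the slide.
import requests

BASE = "http://localhost:8080"  # assumed address of the prediction service

# Score a single (user, song) pair
single = requests.get(f"{BASE}/velox/catify/predict", params={"userid": 22, "song": 277568})
print(single.status_code, single.text)

# Ask for the top-k recommendations for the same user
top_k = requests.get(f"{BASE}/velox/catify/predict_top_k", params={"userid": 22, "k": 100})
print(top_k.status_code, top_k.text)
```
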
  34. HADOOP ECOSYSTEM
  35. Big Data – 4 V’s: Volume, Velocity, Variety, Value. Volume: the Internet moves 2.3 zettabytes/day (2014); Facebook ingests 500 TB/day (2012). Velocity: programmatic ads respond in 200 ms, fraud detection in 400 ms, fraud prevention in 50 ms. Variety: structured (relational), unstructured (image/voice/text), semi-structured (graph). Value: “Regardless of its size, data is worthless if not turned into actionable insight.” “More data cross the internet every second than were stored in the entire internet just 20 years ago” – Big Data: The Management Revolution (HBR)
  36. Big Data – Volume. Internet: 2.5 exabytes (2.5x10^18) per day (2012); 2.3 zettabytes (2.3x10^21) per day (2014). Facebook: 500+ terabytes per day; 100+ petabytes in a single Hadoop cluster. “More data cross the internet every second than were stored in the entire internet just 20 years ago” – Big Data: The Management Revolution (HBR)
  37. Big Data – Velocity (video demo)
  38. Big Data – Variety. Structured: tables, relational. Unstructured: text, image, audio/video. Semi-structured: XML, JSON, graph.
  39. History of Big Data (Hadoop). Google Trends for “big data”, Hadoop, MapReduce, and Apache Spark: the Google MapReduce paper is published, Doug Cutting is hired by Yahoo! to work on Hadoop, and then Spark takes off.
  40. Knowing more tools is always helpful. Knowing how to put them to work together is more important!
  41. Hadoop Ecosystem
  42. Single Node Architecture • Traditionally, computation has been CPU bound • Complex computation on small data • For decades, the primary push has been to increase the computing power of a single machine
  43. Scale Up vs. Scale Out • Single Node Architecture • Scale-up advantages • Programming is easier than distributed computing • Faster processing on smaller data • Scale-up disadvantages • Hardware cost • Scalability • Scale-out advantages • Scalability • Cost
  44. Traditional Distributed Systems: Problems • Modern large-scale processing is distributed across machines • Often hundreds or thousands of nodes • Focuses on distributing the processing workload • Powerful compute nodes • Separate systems for data storage • Fast network connections to connect them • Problems with these distributed systems: • Complex programming model • It is difficult to deal with partial failures of the system • Bandwidth limitations • Data consistency • Typically at compute time, data is copied to the compute nodes • This doesn’t scale to today’s big data problems!
  45. Data Becomes the Bottleneck • Traditional distributed systems don’t scale to today’s Internet-scale data • Getting data to the processor becomes the bottleneck • Disk I/O is slow • Network bandwidth is a bottleneck • Solution: move computation to the data! Internet: 2.5 exabytes (2.5x10^18) per day (2012); 2.3 zettabytes (2.3x10^21) per day (2014). Facebook: 500+ terabytes per day; 100+ petabytes in a single Hadoop cluster.
  46. Modern Distributed Computing Cluster • Cluster architecture • A medium-to-large Hadoop cluster consists of a two-level or three-level architecture built with rack-mounted servers. Each rack of servers is interconnected using a 1 Gigabit Ethernet (GbE) switch. Each rack-level switch is connected to a cluster-level switch (which is typically a larger port-density 10 GbE switch). Stunning Photos Of Google's Massive Data Centers: http://www.forbes.com/pictures/edej45emjgl/up-above-the-massive-floor/
  47. HDFS – Hadoop Distributed File System. A file is split into blocks (Block 1, Block 2, Block 3) and the blocks are distributed across the cluster nodes (node1–node4).
  48. HDFS – Replication • The blocks are replicated to nodes throughout the cluster • Based on the replication factor (3 by default) • Replication increases reliability and performance • Reliability: can tolerate data loss • Performance: more opportunities for data locality
  49. The NameNode • The NameNode stores all metadata • Information about file locations in HDFS • Information about file ownership and permissions • Names of the individual blocks • Locations of the blocks • Metadata is stored on disk and read into memory when the NameNode daemon starts up • Changes/edits to the files are written to the logs. Example: file → /user/lab/myFile.txt, replication → 3, blocks → red, green, blue, block locations → … (see the hdfs CLI sketch below)
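
As a hedged illustration of slides 47–49 (not course material), the sketch below drives the standard hdfs command-line tool from Python, assuming a configured Hadoop client on the PATH; it uploads a file and then asks the NameNode for the block, replication, and location metadata described above. The local file name is an assumption; the HDFS path mirrors the slide's example.

```python
# Minimal sketch, assuming a configured Hadoop client on the PATH.
# Upload a local file, then inspect its blocks, replicas, and DataNode locations.
import subprocess

hdfs_path = "/user/lab/myFile.txt"   # path taken from the slide's example

# The file is split into blocks and replicated (replication factor 3 by default)
subprocess.run(["hdfs", "dfs", "-put", "-f", "myFile.txt", hdfs_path], check=True)

# Ask the NameNode for per-file metadata: blocks, replication, and block locations
subprocess.run(["hdfs", "fsck", hdfs_path, "-files", "-blocks", "-locations"], check=True)
```
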
  50. MapReduce – WordCount. Input document: “I wish to wish the wish you wish to wish, but if you wish the wish the witch wishes, I won’t wish the wish you wish to wish.” The job flows through Documents → Splitting → Map → Combine → Shuffle/Sort → Reduce, ending with one count per word. MapReduce handles the splitting, shuffling, and sorting automatically for you! (A minimal Python mapper/reducer sketch follows below.)
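
Since the slide only shows the data flow, here is a hedged, minimal WordCount written in Hadoop Streaming style (the syllabus covers "MapReduce with Python" later); the script name and invocation are illustrative assumptions, and Hadoop's own shuffle/sort supplies the grouping between the two functions.

```python
# wordcount.py - minimal Hadoop Streaming-style WordCount (illustrative, not course code).
# The mapper emits (word, 1) pairs; the reducer sums the counts for each word.
# Hadoop's shuffle/sort guarantees the reducer sees its input grouped and sorted by key.
import sys
from itertools import groupby


def mapper(lines):
    for line in lines:
        for word in line.strip().split():
            print(f"{word}\t1")


def reducer(lines):
    pairs = (line.rstrip("\n").split("\t") for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        print(f"{word}\t{sum(int(count) for _, count in group)}")


if __name__ == "__main__":
    # Used as `python wordcount.py map` / `python wordcount.py reduce` under Hadoop
    # Streaming; locally you can simulate the framework with: map | sort | reduce.
    (mapper if sys.argv[1] == "map" else reducer)(sys.stdin)
```
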
  51. MapReduce (Java) vs. Hive
  52. Apache Hive. Example query: hive> CREATE TABLE tweets_filter AS SELECT * FROM tweets WHERE to_date(ts) IN ('2010-03-02', '2010-03-03'); The Hive Driver interprets the query, optimizes the computation, and creates a job plan that is sent to Hadoop: the JobTracker and NameNode run on the master, TaskTrackers (TT 1–TT 3) run on the slaves over HDFS, and the metastore lives in MySQL. (See the PyHive sketch below.)
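
If you would rather submit the slide's query from Python than from the Hive CLI, one option (my assumption, not something the deck prescribes) is the PyHive client; the sketch below assumes a reachable HiveServer2, and the host, port, and tweets table are assumptions mirroring the slide's example.

```python
# Hypothetical sketch: submit the slide's CTAS query to HiveServer2 via PyHive.
# The host, port, and `tweets` table are assumptions mirroring the slide's example.
from pyhive import hive

conn = hive.connect(host="hive-server.example.com", port=10000)
cursor = conn.cursor()
cursor.execute("""
    CREATE TABLE tweets_filter AS
    SELECT *
    FROM tweets
    WHERE to_date(ts) IN ('2010-03-02', '2010-03-03')
""")
cursor.close()
conn.close()
```
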
  53. SQL on Hadoop: Presto, Impala
  54. Apache Presto Advantage: daily/hourly batch jobs vs. interactive queries
  55. Apache Presto Advantage: daily/hourly batch jobs, interactive queries, and SQL on any dataset
  56. Apache Kylin (OLAP)
  57. AtScale
  58. Processing Engine
  59. MapReduce vs. Spark: 10x – 100x
  60. RAM vs. Disk vs. Network (approximate throughput per node): RAM 10 GB/s; SSD 600 MB/s; hard drive 100 MB/s; network 1 Gb/s (125 MB/s); nodes in a different rack 0.1 Gb/s
  61. Spark – Unified Data Platform. One size fits many! A unified platform that supports many data processing needs, including • Batch processing (Spark) • Stream processing (Spark Streaming) • Interactive (SparkSQL) • Iterative (MLlib, ML, GraphX, GraphFrame) (See the PySpark sketch below.)
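
As a hedged sketch of the "one size fits many" point (not course code), the snippet below uses a single SparkSession for both a batch DataFrame aggregation and the equivalent interactive SparkSQL query; the input path and column names are assumptions.

```python
# Minimal PySpark sketch: one SparkSession serves batch DataFrames and SparkSQL alike.
# The input path and column names are assumptions for illustration.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("unified-platform-demo").getOrCreate()

events = spark.read.json("s3://my-bucket/events/")   # batch ingest from an assumed path

# Batch processing with the DataFrame API
per_user = events.groupBy("user_id").agg(F.count("*").alias("n_events"))
per_user.show()

# The same data queried interactively through SparkSQL
events.createOrReplaceTempView("events")
spark.sql("SELECT user_id, COUNT(*) AS n_events FROM events GROUP BY user_id").show()
```

Spark Streaming and the MLlib/ML APIs hang off this same session object, which is the sense in which the platform is "unified".
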
  62. Data Scientist Toolbox (Big Data) – Enterprise, Traditional: Visualization | Advanced Analytics | Data Processing | Database
  63. Data Scientist Toolbox (Big Data) – Enterprise, New/Cloud: Visualization | Advanced Analytics | Data Processing | Platform
  64. Data Scientist Toolbox (Big Data) – Startups | Tech | Digital Labs | Big Data Teams: Visualization | Advanced Analytics | Data Processing | Data Lake
  65. Data Scientist Toolbox (Big Data) – Enterprise, New Trend: Visualization | Advanced Analytics | Data Processing | Data Lake
  66. Big Data for Data Scientists – Course Detail
  67. Learning Path – Data Science Program. Data Science Learning Path: Prerequisites; Data Science w/ Python (master data wrangling with Python); ML Applied (learn to build ML models using Sklearn); Big Data (harness big data with Hadoop, Hive, Presto, and AtScale); Spark (Machine Learning at Scale with PySpark ML and real-time deployment); ML Advanced (build your portfolio with hands-on Capstone projects). Contact us about the courses: info@weclouddata.com. Upcoming courses: https://weclouddata.com/upcoming-course-schedule
  68. Big Data for Data Scientists – About this course • For learners who want to get started with big data, the sheer number of tools in the ecosystem always feels overwhelming and confusing. With a well-structured curriculum and instructors who have years of industry experience implementing big data solutions, Big Data for Data Scientists will help you focus on learning the tools that matter the most. • This course covers several popular big data platforms and frameworks that modern data scientists and analysts need to master. Throughout the course, students learn to integrate different tools such as Hadoop, Hive, Presto, AWS, and NoSQL to solve real-world data challenges. • The course is built around an end-to-end big data pipeline that processes terabyte-scale data (billions of records) in a cloud environment. Students gain first-hand experience with data collection, ingestion, distributed storage, distributed processing, and interactive visualizations. • Many big data use cases are covered to help consolidate the learning and, most importantly, give students the real-life experience and confidence to apply what they learn to their data science projects at work.
  69. Big Data for Data Scientists – Who is this course for? • This course serves as a great foundational course for professionals who want to switch careers, graduates who want to enter the field as data scientists, and big data enthusiasts who want to learn the hottest big data tools such as Hadoop, Hive, Presto, AWS, and NoSQL and apply them to solve real-world big data problems. • For new graduates and job seekers, this course teaches you the essential big data tools and concepts required for modern data scientist jobs, and complementary big data interview questions will get you prepared for interview challenges. • For data scientists who want to gain new skills, the course will give you a comprehensive view of the big data ecosystem and prepare you for big data tasks at work. • For tech-savvy project managers who want to gain a comprehensive understanding of big data use cases and lifecycles, the hands-on project in this course gives you exactly what you hope for.
  70. Big Data for Data Scientists – Learning outcomes. After this course, students will be able to • Take on real data challenges at the workplace and demonstrate experience and an advantage in the job market with the learned skills added to their resumes • Gain a solid understanding of the big data ecosystem and various real-world use cases • Work comfortably with different big data platforms such as Hortonworks and AWS EMR, run Hive ETL pipelines, and query large datasets with Apache Presto • Build and automate data pipelines with Apache Airflow and build a project demo via a visualization dashboard with Superset • Gain real-world experience through a hands-on project and convince your manager/peers that you’re up for big data related projects at work
  71. Big Data for Data Scientists – Instructor: Shaohua Zhang • Co-founder and CEO of WeCloudData. Lead instructor for the Big Data course and the corporate training program • Helped build and lead the data science team at BlackBerry (2010 – 2015) • Helps the Communitech incubator and Open Data Exchange mentor startups on data strategies • Specializes in machine learning, big data, and cloud computing
  72. Big Data for Data Scientists – Prerequisites • You do not need prior experience with programming languages such as Python, but it helps! • Familiarity with Linux commands, SQL, and relational database concepts • Having an understanding of your company’s big data use cases, technologies, and goals will motivate and direct your focus in this course
  73. Big Data for Data Scientists Syllabus (Weekend Cohort – 12 sessions/48 hours):
      1. Big Data: Introduction to Big Data; Big Data Use Cases; AWS – EC2/S3
      2. Hadoop: Hadoop Distributed File System; MapReduce with Python; AWS – EMR
      3. Apache Hive | Sqoop: Hive Introduction; Hive Queries; Apache Sqoop; Project kick-off
      4. SQL on Hadoop: Presto/Impala; Apache Kylin/AtScale
      5. NoSQL: Amazon DynamoDB; Cassandra; Elasticsearch
      6. Data Pipeline: Data pipeline with Airflow; Visualization with Superset; Project Discussion
      7. Spark Core: Introduction to Spark Core; Spark RDD Operations
      8. Spark DataFrame | SQL: Spark DataFrame and SQL; Complex Transformations and UDFs
      9. Spark Performance Tuning: Spark Internals; Performance Tunings
      10. Spark ML: Spark Machine Learning API; Building Classification and Regression Models
      11. Spark ML II: Recommender System with Spark; Deep Learning on Spark
      12. Spark Streaming: Kafka/Kinesis; Spark Streaming; Project Presentation
  74. Big Data for Data Scientists – Industry Use Cases. In this course, we not only teach students how to use the big data tools, but also the common use cases. Understanding real-world use cases and industry best practices allows students to apply these skills to their company’s data problems. Use Cases • Big data use cases in retail personalization • Big data use cases for retail banking • Big data use cases for fraud analytics • Big data use cases in compliance analytics • Big data use cases in online advertising
  75. Big Data for Data Scientists – Hands-on Project. This course is instructor-led and project-based. Students apply the big data knowledge acquired during the lectures to build an end-to-end big data project. Project: Building an AWS-based Big Data Pipeline • Real-time data collection and ingestion via the Twitter API, Kinesis, and NoSQL (see the ingestion sketch below) • Build Hive databases and ETL pipelines • Interactive data analysis with Presto • Build streaming MOLAP cubes with Apache Kylin • Real-time dashboard with Apache Superset • Workflow automation with Apache Airflow. Data size: 500 GB ~ 1 TB; Records: 1 billion+
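
As a hedged sketch of the project's first step (real-time collection into Kinesis), and not actual course material, the snippet below uses boto3 to push one tweet-like record into a stream; the stream name, region, and record shape are assumptions.

```python
# Hypothetical ingestion sketch: push one tweet-like record into a Kinesis stream.
# The stream name, region, and record shape are assumptions, not course materials.
import json
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")

def send_tweet(tweet: dict) -> None:
    kinesis.put_record(
        StreamName="tweets-stream",                  # assumed stream name
        Data=json.dumps(tweet).encode("utf-8"),
        PartitionKey=str(tweet.get("user_id", "unknown")),
    )

send_tweet({"user_id": 42, "text": "big data is fun", "ts": "2019-06-16T12:00:00Z"})
```
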
  76. Student Project Demo – Stock price prediction using Twitter sentiment and deep learning
  77. Student Project Demo – Real-time Twitter Sentiment Pipeline
  78. Student Project Demo – Real-time Twitter Sentiment Pipeline
  79. Big Data for Data Scientists – Learning Support. Support you will receive during this course includes • Mentorship and advice from an industry expert • In-classroom learning assistance from our assistant instructor • Online learning support on Slack from the instructor and TA • Hands-on labs and projects to help you apply what you learn • Additional resources to help you gain advanced knowledge • Help from our learning advisor on how to choose the learning path and specialization courses after the Big Data course
  80. Big Data for Data Scientists – Testimonials. Jason Lee: “This course really helped me with an in-depth explanation and application of cloud and big data technologies. The lead instructor is very enthusiastic and gifted, with years of industry experience as a chief data scientist. The course has a well-designed, systematic curriculum where you get to learn each component of the big data ecosystem with a big picture of the whole machine-learning pipeline (online and offline).” Grace Tian: “I took the Big Data course with WeCloudData. The course introduces the latest big data tools and platforms such as Apache Hadoop and Amazon Web Services, as well as real-world use cases and industrial best practices. The course also includes an end-to-end group project which will definitely be something you can be proud of. I chose this course basically because my company uses Apache Spark and the Hadoop distributed system, and I would like to learn more about it. Surprisingly, what I learned from this course has been far beyond my expectations! I wish I had known WeCloudData earlier so that I wouldn’t have struggled so much at work. I would also like to express my gratitude and appreciation to the instructor, Shaohua. He is extraordinarily knowledgeable and experienced, one of the best instructors I have ever seen! The way he approaches a theory is really straightforward and easy to understand. He is nice and patient while answering questions as well, and always makes sure every student is on the right track. The program managers of WeCloudData are kind and amiable too. It was a great pleasure to talk with them!”
  81. Big Data for Data Scientists – How to convince your employer. Did you know that most employers will reimburse training costs? • We have a detailed course syllabus and an email template that you can use to convince your manager that this is the right course for you and a good investment for your company • You will have a completed project and presentation that you can use to demo to your manager, showcase your newly minted big data skills, and get ready for more interesting data analytics projects
  82. Big Data for Data Scientists – Price. Course pricing: Big Data & Spark for DS, $2,000 + tax
  83. Upcoming WeCloud Events – Event Schedules
  84. Upcoming Events Schedule (for details, visit https://www.meetup.com/tordatascience/):
      Track | Meetup Org | Topic | Date
      Big Data | WeCloudData | Big Data for Data Scientists – Open Class | Jun 4
      Big Data | WeCloudData | Spark on Kubernetes | Jun 5
      Big Data | Lightbend | Running Kafka on Kubernetes with Strimzi | Jun 11
      Cloud | Big Data & AI Conference | Machine Learning from Experimentation to Production on AWS | Jun 12
      Big Data | Big Data & AI Conference | Transforming Big Data from On-premise to the Cloud | Jun 12
      Big Data | Big Data & AI Conference | Spark for Data Science | Jun 13
      Data Science | Big Data & AI Conference | Moving Towards a Python Environment | Jun 13
      Big Data, Data Science | WeCloudData | Machine Learning Deployment with Spark and Amazon SageMaker | Jun 16
      Big Data, Data Science | WeCloudData | Apache Spark Hands-on Workshop | Jun 18
  85. TYPES OF DATA JOB SEEKERS
