In this course, students learned what output is expected of a Data Scientist and how to use PySpark (part of Apache Spark) to deliver against those expectations. The course assignments included Log Mining, Textual Entity Recognition, and Collaborative Filtering exercises that taught students how to manipulate datasets using parallel processing with PySpark.
Certificate of Accomplishment: Introduction to Big Data with Apache Spark
Folco Bombardieri
Certificate of Accomplishment for "Introduction to Big Data with Apache Spark" Course from edX (offered by The University of California, Berkeley).
Part of the Big Data XSeries courses.
Duration: 5 weeks
Authenticity of the certificate can be verified at: https://verify.edx.org/cert/1a54a3e6ffa44c8d8aa24986adda4efd
Organizations use their data for decision support and to build data-intensive products and services, such as recommendation, prediction, and diagnostic systems. The collection of skills required by organizations to support these functions has been grouped under the term Data Science. This course will attempt to articulate the expected output of Data Scientists and then teach students how to use PySpark (part of Apache Spark) to deliver against these expectations. The course assignments include Log Mining, Textual Entity Recognition, and Collaborative Filtering exercises that teach students how to manipulate datasets using parallel processing with PySpark.
This course introduces the underlying statistical and algorithmic principles required to develop scalable real-world machine learning pipelines. We present an integrated view of data processing by highlighting the various components of these pipelines, including feature extraction, supervised learning, model evaluation, and exploratory data analysis. Students will gain hands-on experience applying these principles by using Apache Spark to implement several scalable learning pipelines.
7/26/2016 BerkeleyX CS105x Certificate | edX
https://courses.edx.org/certificates/f70ab6fc8b8c42f08a030fc1c38257c0
VERIFIED
CERTIFICATE of ACHIEVEMENT
This is to certify that
Jitendra Gehlot
successfully completed and received a passing grade in
CS105x: Introduction to Apache Spark
a course of study offered by BerkeleyX, an online learning initiative of University of California, Berkeley through edX.
Anthony D. Joseph
Professor in Electrical Engineering and Computer Science
University of California, Berkeley
Diana Wu
Executive Director, Berkeley Resource Center for Online Education
University of California, Berkeley
VERIFIED CERTIFICATE
Issued July 25, 2016
VALID CERTIFICATE ID
f70ab6fc8b8c42f08a030fc1c38257c0