www.cict.iba.edu.pk
education providing professional diplomas, certifications, and workshops in the fields of Information Technology, Information
Systems, and Computer Science. The CICT was established in 2016 to provide high-quality professional education to the private
and public sectors in Pakistan. The CICT is associated with renowned faculty members who conduct courses and
workshops that contribute to digitalization in the educational and professional sectors.
The Center for Information & Communication Technology (CICT) aspires to meet the highest standards of IT excellence
required in pursuit of management strategies.
The Center for Information & Communication Technology (CICT) aims to provide an excellent teaching and research environment,
especially in Information Technology, to produce students and professionals who distinguish themselves by their professional
competence, research, entrepreneurship, humanistic outlook, ethical rectitude, pragmatic approach to problem solving,
managerial skills, and ability to respond to the challenges of socio-economic development, and who serve as the vanguard of the
techno-industrial transformation of society.
• Be respectful to your teachers and classmates. Any kind of disrespect or misbehavior may lead to dismissal from the course.
• Be on time for your classes and attend regularly.
• All students are required to carry the ID cards provided by the administration to avoid any inconvenience while entering the premises.
• If a card is lost, the student is required to report to the department to state their case.
• Avoid using mobile devices during your sessions. The instructor has the right to dismiss any student found using a phone during class.
• Do not bring any food or drinks into classrooms, especially computer labs; this is strictly prohibited.
• Don't forget to turn off the machine once you are done using it.
• Do not plug in external devices without scanning them for computer viruses.
Note: IBA maintains decorum that applies to every candidate entering the premises. The campus is smoke-free, and any candidate
involved in misconduct will be penalized, which may result in dismissal from the course.
The convergence of big data and machine learning with technologies such as cloud services, sensors, ubiquitous computing, mobile
devices, and the Internet of Things has created vast new opportunities for business. Analytics has become a competitive and
sustainable advantage for many organizations. To harness the benefits of big data and machine learning, however, business
leaders face the pressing challenge of not only acquiring the right technologies and talent to analyze and interpret the data but also
weaving a data-centric mindset into the organization's structure and cultural fabric.
This four-month diploma will equip participants with the skills and confidence to tackle data-driven opportunities and accelerate
data-analysis transformation in their organizations. Through lectures, case studies, and discussions, participants will gain real-world
insights into various applications of big data analytics and machine learning, and into how these can be used to fuel better
decision-making within the context of their own department or organization.
Mr. Sohail Imran:
For more than 18 years, Mr. Sohail Imran has been conducting training and workshops on databases (SQL and NoSQL), Big Data
infrastructure, and Machine Learning for institutes, universities, and the corporate sector. He has more than 8 years of
professional experience in Big Data Analytics, Data Science, Data Mining, Data Warehousing, and DBMS (SQL and NoSQL). He
provides consultancy in designing and developing Big Data Analytics platforms using Java, RapidMiner, Radoop, Python, Hadoop,
Hive, Spark, Kafka, Spark Streaming, Storm, etc.
Mr. Muhammad Rizwan:
Muhammad Rizwan provides digital leadership to organizations, from strategy to execution, globally and locally. In a corporate
career of more than 25 years, he has worked in both public and private equity spaces with technology-led and digitally enabled
businesses in various management roles, including the C-suite. He holds a certificate in Machine Learning from Stanford
University, a master's degree from Hamdard, a bachelor's degree in Statistics/Commerce and Information Systems, and an
international diploma in Software Engineering. He currently heads information systems at Dollar Industries (Pvt) Ltd. He has
worked for Hino Pak Motors, Karachi Stock Exchange, and CPLC (car theft software) projects.
Dr. Affan Alim:
Dr. Muhammad Affan Alim has 16 years of teaching, research, and development experience in
Machine Learning, Deep Learning, Data Science, pattern classification, computer vision,
Optimization of models, and statistical & mathematical analysis. He also has several years of
professional experience in software development in Pakistan and the United Kingdom (UK). He has
developed several industry-based projects.
Mr. Muhammad Shamim Ahmed:
With 10 years of overall experience in project management and Oracle functional consultation,
and 4 years as an Oracle University trainer, Mr. Shamim has been a great asset to IBA
CICT. He has executed 5 projects to date in various capacities, with proven experience of
the complete life cycle of Oracle EBS implementation. He has been actively involved in all
stages of project management, such as initiation, Business Blueprint (AS-IS Document),
GAP Analysis, Solution design, and much more. He is exceptionally motivated, energetic,
and enthusiastic about learning and teaching.
I. Python Foundations for Big Data Analytics
• History and Evolution of Python
• Advantages of Apache Spark with Python in a Big Data Environment
• Setting up Big Data Programming Development Environments
• Programming Language Basics
• Collections and their types
• Conditional Control Structures
• Iterative Control Structures
• Methods with Practice Examples
• Module and File I/O
• Object-Oriented Programming Concepts
• Apache Spark and Python for Machine Learning
• Exam
II. Big Data Wrangling
• Introduction to data science; the roles of the database designer, data engineer, data analyst, and data scientist; what data scientists do
• Available data formats and the types of data a data scientist receives; the data science life cycle; data science competitions
• The difference between wrangling and feature engineering; the steps of wrangling in detail
• Different methods of reading a dataset using pandas in Python
• Steps of feature engineering for machine learning, with real examples discussed along the way; required tools and techniques of data science
• Missing data imputation using pandas: how to find missing values in a dataset and why handling missing values is important
• The fillna() method with different parameters for missing values
• Dropping all rows with missing data in a DataFrame, and dropping rows with missing data in a specific column
• How to fill using aggregate values, how to fill forward and backward, and how to fill with reference to other columns; practice questions
• A real application-based problem for missing values
• What an outlier is, the impact of outliers on data, and how to find and visualize outliers
• Outlier removal strategies; performing winsorization; a Python implementation
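The missing-value and outlier topics listed above can be illustrated with a small sketch. The course covers these steps with pandas; the version below deliberately uses only the Python standard library so the logic of each step is visible. The function names, the sample readings, and the fixed winsorization bounds are all hypothetical; in practice the bounds are usually taken from percentiles of the data (for example the 5th and 95th).

```python
from statistics import mean


def fill_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def forward_fill(values):
    """Carry the last observed value forward; leading gaps stay None."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def winsorize(values, lower, upper):
    """Clamp every value into [lower, upper] instead of dropping outliers."""
    return [min(max(v, lower), upper) for v in values]


# Hypothetical sample: two missing readings and one extreme value.
readings = [10.0, None, 14.0, None, 300.0]

mean_filled = fill_with_mean(readings)               # aggregate-based fill
ffilled = forward_fill(readings)                     # forward fill
capped = winsorize(ffilled, lower=10.0, upper=20.0)  # assumed bounds
```

Note how the extreme value 300.0 pulls the mean-based fill far above the typical readings, which is one reason outlier handling and imputation are usually considered together.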
• Three case studies for EDA
• Exploratory data analytics
• Handling structuring issues in the dataset: inconsistencies in dates and other attributes, and unnecessary attached characters; exploring these issues
• Handling structuring issues using regex
• Categorical variables: encoding categorical variables with one-hot encoding, dummy encoding, and effect encoding; pros and cons of each categorical variable encoding
• Discussion of project
• Live demonstration of Kaggle and its features
• How Kaggle is helpful for data scientists
• Exam
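The three categorical encodings named in the list above differ only in how they treat a reference category. The following is a minimal sketch in plain Python rather than a library implementation; the function names and the city categories are hypothetical, and the first category is assumed to be the reference.

```python
def one_hot(value, categories):
    """One column per category; exactly one column is 1."""
    return [1 if value == c else 0 for c in categories]


def dummy(value, categories):
    """Drop the first (reference) category: k - 1 columns.

    The reference category is encoded as all zeros.
    """
    return [1 if value == c else 0 for c in categories[1:]]


def effect(value, categories):
    """Like dummy encoding, but the reference category is all -1."""
    if value == categories[0]:
        return [-1] * (len(categories) - 1)
    return dummy(value, categories)


# Hypothetical categories; "Karachi" acts as the reference category.
cities = ["Karachi", "Lahore", "Islamabad"]

encoded = {
    city: (one_hot(city, cities), dummy(city, cities), effect(city, cities))
    for city in cities
}
```

One-hot encoding keeps every category explicit at the cost of a redundant column, while dummy and effect encoding use one column fewer and differ only in how the reference category is represented.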
III. Business Intelligence (BI) and Big Data Visualization
• Introduction to BI & commonly used BI tools.
• Power BI introduction & its components.
• Power View, Query, Pivot & Power BI Service
• Introduction to Power Query and its usage.
• Basic Power BI Navigation.
Basic Power BI Charts.
• Column Chart.
• Stacked Column Chart.
• Pie Chart.
• Donut Chart.
• Funnel Chart.
• Ribbon Chart.
• Include and Exclude.
• Export data from Visual.
Maps in Power BI Desktop
• Map.
• Filled Map.
• Map with Pie Chart.
• Formatting in Map.
• Background Changes in Map.
• Map of Pakistan in Power BI.
• Map of Australia.
Table & Matrix in Power BI Desktop
• Creating a Simple Table.
• Formatting in Table.
• Conditional Formatting in Table.
• Changing Aggregation in Table.
• Creating a Matrix in Power BI.
• Conditional Formatting in Matrix.
• Automatic Hierarchy in Matrix.
Other Charts in Power BI Desktop
• Line Chart.
• Drill down in Line Chart.
• Area Chart.
• Line vs Column Chart.
• Scatter Plot.
• Waterfall Chart.
Cards and Filters in Power BI Desktop.
• Number Card.
• Text Card.
• Date Card.
• Multi-Row Card.
• Filter on Visual.
• Filter on Page.
• Filter on All Pages.
• Drill through.
Slicers in Power BI Desktop
• Slicer for Text.
• Format Text Slicer.
• Date Slicer.
• Format Date Slicer.
• Number Slicer.
Advanced Charts in Power BI Desktop
• Animated Bar Chart Race.
• Drill Down Donut Chart.
• Drill Down Column Chart.
• Word Cloud.
• Sankey Chart.
• Infographic.
• Play Axis.
• Scroller.
• Sunburst Chart.
• Histogram.
Objects and Actions (Hyperlinks) in PBI
• Insert Image.
• Insert Text.
• Insert Shapes.
• Insert Buttons.
• Action - Web URL.
• Action - Page Navigation.
• Action - Bookmark Action.
• Action - Drill through.
Power BI Service Introduction
• Creating a Superstore Report.
• Create an Account on Power BI Service.
• Publish Report to Power BI Service Account.
• Export (PPT, PDF, PBIX) Report and Share.
• Comment, Share and Subscribe to a report.
• Create a dashboard in Power BI Service.
• Problem in Power BI Dashboard & its solution.
• Automatic Refresh - Data Gateway.
• Exam
IV. Big Data Management Systems with NoSQL Data Stores
• Introduction to NoSQL databases
• Comparison with SQL databases
• Document NoSQL Store
• Introduction and Installation
• Basic commands
• Document NoSQL Data Modeling
• Practice exercises
• Integration of Document NoSQL database with Apache Spark Machine Learning
• Practice exercises
• Graph NoSQL Store
• Introduction and Installation
• Basic commands
• Graph NoSQL Data Modeling
• Integration of Graph NoSQL database with Apache Spark Machine Learning
• Practice exercises
• Key-Value NoSQL Store
• Introduction and Installation
• Exam
V. Machine Learning for Big Data
• What Machine Learning is, and what tools are required for learning it
• Differences between classification and regression-based problems, supervised and unsupervised categories, and real-world examples
• Machine learning protocol for implementation
• Linear regression: how it works, mathematical and graphical representation of LR
• Python implementation of linear regression
• Performance metrics for regression-based problems
• Logistic regression: how it works, decision boundaries, sigmoid function
• Python implementation of logistic regression
• Performance metrics for classification-based problems
• A real-life problem for logistic regression
• For regression and classification-based problems
• K-nearest neighbours
• Support vector machine
• Discussion of overfitting and underfitting
• Cross-validation
• Hold out cross-validation
• K-fold and its types of cross-validation
• Leave one out cross-validation
• Bootstrap cross-validation
• Python implementation of Cross validation
• Classification and Regression
• Decision tree
• Random forest
• Parameter behavior of both algorithms
• Overfitting and underfitting handling
• Hyper-parameter
• Unsupervised learning: K-means
• Feature selection: PCA
• Discussion of project
• Live demonstration of a Kaggle submission
• Real-life problem solving
• Final Exam
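The cross-validation variants in the list above share one idea: every sample is held out for testing exactly once. The following is a minimal sketch of K-fold index splitting in plain Python, assuming unshuffled data; the function name `k_fold_indices` is hypothetical and not taken from any library. Leave-one-out cross-validation is the special case where the number of folds equals the number of samples.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k consecutive folds.

    When n_samples is not divisible by k, the first folds get one extra sample.
    """
    indices = list(range(n_samples))
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# 7 samples in 3 folds: test folds of sizes 3, 2, and 2.
folds = list(k_fold_indices(n_samples=7, k=3))

# Leave-one-out: as many folds as samples, one test sample per fold.
leave_one_out = list(k_fold_indices(n_samples=4, k=4))
```

A model would be trained on each `train` index set and scored on the matching `test` set, with the scores averaged across folds.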
VI. Case Study
• Infrastructure Development for Real-Time Big Data Analytics
• Streaming Introduction
• Big Data Pipelines: The Rise of Real-Time
Stream processing with Apache Storm
• How does Twitter compute trends
• Improve performance using distributed processing
• Building blocks of Storm Topologies
• Adding Parallelism in a Storm Topology
• Components of Storm Cluster
• A simple Hello World Topology
• Implementing Bolt & Submitting a Topology
Processing Data using Files
• Reading Data from a file
• Representing Data using Tuples
• Accessing Data from Tuples
• Writing Data to a File
• Assignment 1
Spark Streaming
• Streaming Architecture
• Deployment of Collection and Message Queuing Tiers
• Introduction of message queuing tier using Apache Kafka
Running The Collection Tier (Part II - Sending Data)
Data Access Tier
• Introduction to Data Access tier - MongoDB
• Exploring Spring Reactive
• Exposing Data Access tier in browser
• Analysis Tier
• Introduction to Analysis tier - Apache Spark
• Plug-in Spark Analysis Tier to Our Pipelines
• A brief overview of Spark RDDs
• Fault Tolerance
• Kafka Connect
• Assignment 2
Brief introduction to
• Lambda vs Kappa architecture
• DataFrame, DataSets, and SparkSQL
• Spark Structured Streaming
• Benefits of Kappa architecture
Building Data Pipelines using Apache Airflow
• Advantages of using DAGs in Apache Airflow
• Apache Airflow UI
• Building DAG using Airflow
• Airflow Monitoring and Logging
• Assignment 3
VII. Final Exam

Big Data - IBA.pptx

  • 2. education providing professional diplomas, certifications, and workshops in the field of Information technology, Information Systems, and Computer Science. The CICT was established in 2016 providing high-quality professional education to the private and public sectors in Pakistan. The CICT has been associated with renowned faculty members who conduct certain courses and workshops which contribute to the digitalization in the educational and professional sectors. The Center for Information & Communication Technology (CICT) aspires to meet the highest standards of IT excellence required in pursuit of management strategies. The Center for Information & Communication Technology (CICT) is to provide excellent teaching and research environment specially in Information Technology to produce students/professionals who distinguish themselves by their professional competence, research, entrepreneurship, humanistic outlook, ethical rectitude, pragmatic approach to problem solving, managerial skills and ability to respond to the challenge of socio-economic development to serve as the vanguard of techno- industrial transformation of the society.
  • 3. • Be respectful to your teachers as well as your classmates, any kind of disrespect or misbehavior will be subject to dismissal from the course. • Be on time for your classes and avoid being irregular. • All students are required to carry their ID Cards provided by the administration to avoid any inconveniences while entering the premise. • In case of Card loss, the student is required to report to the department to state their case. • Avoid usage of mobile devices during your sessions, the instructor has a full right to dismiss any student who has been found using the phone during class. • Do not bring any food or drinks inside your classrooms especially computer labs as it is strictly prohibited. • Don't forget to turn off the machine once you are done using it. • Do not plug in external devices without scanning them for computer viruses. Note: IBA plays a vital role in maintaining decorum which applies to every candidate entering the premises. Keeping in mind that the campus is Smoke- free any candidate carrying out any mal activity is reluctantly charged & may result is dismissal from the course.
  • 4. The convergence of big data and machine learning with technologies such as cloud services, sensors, ubiquitous computing, mobile devices, and the Internet of Things has created vast new opportunities for business. Analytics has become a competitive and sustainable advantage for many organizations. To harness the benefits of big data and machine learning, however, business leaders face the pressing challenge of not only acquiring the right technologies and talent to analyze and interpret the data but also weaving a data-centric mindset into the organization's structure and cultural fabric. This Four-Month Diploma will empower me with the skills and confidence to tackle data-driven opportunities and accelerate data- analysis transformation in the organization. Through lectures, case studies, and discussions, real-world insights will be gained on various applications of big data analytics and machine learning, and how they can be used to fuel better decision-making within the context of the attendee's own Department/Organization.
  • 5. Mr. Sohail Imran: For more than 18 years, Mr. Sohail Imran has been conducting training and workshops on databases (SQL and NoSQL), Big Data Infrastructure, and Machine Learning for different institutes, universities, and the corporate sector. He has more than 8 years of professional experience in Big Data Analytics, Data Science, Data Mining, Data Warehousing, and DBMS (SQL and NoSQL). He provides consultancy in designing and developing Big Data Analytics platforms using Java, RapidMiner, Radoop, Python, Hadoop, Hive, Spark, Kafka, Spark Streaming, Storm, etc.
Mr. Muhammad Rizwan: Muhammad Rizwan provides digital leadership to organizations, from strategy to execution, globally and locally. In his 25+ year corporate career, he has worked in both public and private equity spaces with technology-led and digitally-enabled businesses in various management capacities, including the C-Suite. He holds a certificate in Machine Learning from Stanford University, a master's degree from Hamdard, and a bachelor's degree in Statistics/Commerce and Information Systems. He also carries an international diploma in Software Engineering. He currently heads information systems at Dollar Industries (Pvt) Ltd. He has worked on projects for Hino Pak Motors, Karachi Stock Exchange, and CPLC (car theft software).
Dr. Affan Alim: Dr. Muhammad Affan Alim has 16 years of teaching, research, and development experience in Machine Learning, Deep Learning, Data Science, pattern classification, computer vision, model optimization, and statistical and mathematical analysis. He also has several years of professional experience in software development in Pakistan and the United Kingdom (UK). He has developed several industry-based projects.
Mr. Muhammad Shamim Ahmed: With 10 years of overall experience in project management and Oracle functional consultation, and 4 years as an Oracle University trainer, Mr. Shamim has been a great asset to IBA CICT. He has executed 5 projects to date in various capacities, with proven experience in the complete life cycle of Oracle EBS implementation. He has been actively involved in all stages of project management, such as initiation, Business Blueprint (AS-IS Document), GAP Analysis, Solution design, and much more. He is exceptionally motivated, energetic, and enthusiastic about learning and teaching.
  • 6. I. Python Foundations for Big Data Analytics
• History and Evolution of Python
• Advantages of Apache Spark with Python in a Big Data Environment
• Setting up Big Data Programming Development Environments
• Programming Language Basics
• Collections and their types
• Conditional Control Structures
• Iterative Control Structures
• Methods with Practice Examples
• Module and File I/O
• Object-Oriented Programming Concepts
• Apache Spark and Python for Machine Learning
• Exam
II. Big Data Wrangling
• Introduction to data science; the roles of the database designer, data engineer, data analyst, and data scientist; what data scientists do.
• The available formats of data and the types of data a data scientist receives. The life cycle of data science; data science competitions.
• The difference between wrangling and feature engineering; the steps of wrangling in detail.
• Different methods of reading a dataset using pandas in Python. Steps of feature engineering for machine learning, with some real examples discussed along the way. The tools and techniques required for data science.
• Missing data imputation using pandas: how to find the missing values in a dataset, and why handling missing values is important.
• The fillna() method with different parameters for missing values.
• Dropping all rows with missing data in a DataFrame, and dropping rows with missing data in a specific column.
• How to fill using aggregate values, how to fill forward and backward, and how to fill with reference to other columns. Practice questions.
• A real application-based problem for missing values.
• What an outlier is, the impact of outliers on data, and how to find and visualize outliers.
• Outlier removal strategies; performing winsorization; a Python implementation.
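To give a feel for the wrangling topics listed above (filling with an aggregate value, forward fill, and winsorization), here is a minimal sketch. It is an illustration only, not course material: the function names and the sample data are hypothetical, and it uses plain Python lists rather than the pandas methods (such as fillna()) that the module itself covers. The percentile cut-offs in winsorize use a simple index-based rule, which is one of several possible conventions.

```python
from statistics import mean


def fill_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def forward_fill(values):
    """Replace each None with the most recent observed value.

    Leading None entries have nothing to copy from and stay None.
    """
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def winsorize(values, lower_pct=0.05, upper_pct=0.95):
    """Clamp values to simple index-based lower/upper percentile cut-offs."""
    ordered = sorted(values)
    n = len(ordered)
    low = ordered[int(lower_pct * (n - 1))]
    high = ordered[int(upper_pct * (n - 1))]
    return [min(max(v, low), high) for v in values]


# Hypothetical sample data with two missing readings.
readings = [10.0, None, 14.0, None, 12.0]
print(fill_with_mean(readings))
print(forward_fill(readings))

# The extreme value 100 is pulled in to the upper cut-off instead of being dropped.
print(winsorize([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], lower_pct=0.1, upper_pct=0.9))
```

Winsorization keeps every row and only caps extreme values, which is why it is often weighed against outright outlier removal.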
  • 7. II. Big Data Wrangling
• Three case studies for EDA
• Exploratory data analysis
• Handling structuring issues in the dataset: inconsistencies in dates and other attributes, and unnecessary attached characters. Exploring these issues.
• Handling the structuring issues using regex
• Categorical variables: encoding categorical variables with one-hot encoding, dummy encoding, and effect encoding; pros and cons of each categorical variable encoding
• Discussion of project
• Live demonstration of Kaggle and its features
• How Kaggle is helpful for data scientists
• Exam
III. Business Intelligence (BI) and Big Data Visualization
• Introduction to BI and commonly used BI tools.
• Power BI introduction and its components.
• Power View, Query, Pivot & Power BI Service
• Introduction to Power Query and its usage.
• Basic Power BI Navigation.
Basic Power BI Charts
• Column Chart.
• Stacked Column Chart.
• Pie Chart.
• Donut Chart.
• Funnel Chart.
• Ribbon Chart.
• Include and Exclude.
• Export data from Visual.
Maps in Power BI Desktop
• Map.
• Filled Map.
• Map with Pie Chart.
• Formatting in Map.
• Background Changes in Map.
• Map of Pakistan in Power BI.
• Map of Australia.
Table & Matrix in Power BI Desktop
• Creating a Simple Table.
• Formatting in Table.
• Conditional Formatting in Table.
• Changing Aggregation in Table.
• Creating a Matrix in Power BI.
• Conditional Formatting in Matrix.
• Automatic Hierarchy in Matrix.
Other Charts in Power BI Desktop
• Line Chart.
• Drill down in Line Chart.
• Area Chart.
• Line vs Column Chart.
• Scatter Plot.
• Waterfall Chart.
  • 8. III. Business Intelligence (BI) and Big Data Visualization
Cards and Filters in Power BI Desktop
• Number Card.
• Text Card.
• Date Card.
• Multi-Row Card.
• Filter on Visual.
• Filter on Page.
• Filter on All Pages.
• Drill through.
Slicers in Power BI Desktop
• Slicer for Text.
• Format Text Slicer.
• Date Slicer.
• Format Date Slicer.
• Number Slicer.
Advanced Charts in Power BI Desktop
• Animated Bar Chart Race.
• Drill Down Donut Chart.
• Drill Down Column Chart.
• Word Cloud.
• Sankey Chart.
• Infographic.
• Play Axis.
• Scroller.
• Sunburst Chart.
• Histogram.
Objects and Actions (Hyperlinks) in PBI
• Insert Image.
• Insert Text.
• Insert Shapes.
• Insert Buttons.
• Action - Web URL.
• Action - Page Navigation.
• Action - Bookmark Action.
• Action - Drill through.
Power BI Service Introduction
• Creating a Superstore Report.
• Create an Account on Power BI Service.
• Publish Report to Power BI Service Account.
• Export (PPT, PDF, PBIX) Report and Share.
• Comment, Share and Subscribe to a report.
• Create a dashboard in Power BI Service.
• Problem in Power BI Dashboard & its solution.
• Automatic Refresh - Data Gateway.
• Exam
  • 9. IV. Big Data Management Systems with NoSQL Data Stores
• Introduction to NoSQL databases
• Comparison with SQL databases
• Document NoSQL Store
• Introduction and Installation
• Basic commands
• Document NoSQL Data Modeling
• Practice exercises
• Integration of Document NoSQL database with Apache Spark Machine Learning
• Practice exercises
• Graph NoSQL Store
• Introduction and Installation
• Basic commands
• Graph NoSQL Data Modeling
• Integration of Graph NoSQL database with Apache Spark Machine Learning
• Practice exercises
• Key-Value NoSQL Store
• Introduction and Installation
• Exam
V. Machine Learning for Big Data
• What Machine Learning is, and what tools are required for learning it
• Differences between classification and regression-based problems; supervised and unsupervised categories; real-world examples
• Machine learning protocol for implementation
• Linear regression: how it works; mathematical and graphical representation of LR
• Python implementation of linear regression
• Performance metrics for regression-based problems
• Logistic regression: how it works, decision boundaries, the sigmoid function
• Python implementation of logistic regression
• Performance metrics for classification-based problems
• A real-life problem using logistic regression
• For regression and classification-based problems:
• K-nearest neighbours
• Support vector machine
• Discussion of overfitting and underfitting
• Cross-validation
• Hold-out cross-validation
  • 10. V. Machine Learning for Big Data
• K-fold cross-validation and its types
• Leave-one-out cross-validation
• Bootstrap cross-validation
• Python implementation of cross-validation
• Classification and Regression
• Decision tree
• Random forest
• Parameter behavior of both algorithms
• Overfitting and underfitting handling
• Hyper-parameters
• Unsupervised learning: K-means
• Feature selection: PCA
• Discussion of project
• Live demonstration of a Kaggle submission
• Real-life problem solving
• Final Exam
VI. Case Study
• Infrastructure Development for Real-Time Big Data Analytics
• Streaming Introduction
• Big Data Pipelines: The Rise of Real-Time
Stream processing with Apache Storm
• How does Twitter compute trends
• Improve performance using distributed processing
• Building blocks of Storm Topologies
• Adding Parallelism in a Storm Topology
• Components of Storm Cluster
• A simple Hello World Topology
• Implementing Bolt & Submitting a Topology
Processing Data using Files
• Reading Data from a file
• Representing Data using Tuples
• Accessing Data from Tuples
• Writing Data to a File
• Assignment 1
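As an illustration of the K-fold and leave-one-out cross-validation topics listed under Module V, the sketch below shows how sample indices can be split into folds. It is a hedged, library-free example: the function name k_fold_indices is hypothetical, no shuffling is applied, and the course's own Python implementation may differ.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Every sample appears in exactly one test fold. Fold sizes differ by at
    most one because the remainder is spread over the first folds.
    """
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# 10 samples in 3 folds: each fold is held out once while the rest train.
for train_idx, test_idx in k_fold_indices(10, 3):
    print("train:", train_idx, "test:", test_idx)

# Leave-one-out is the special case where k equals the number of samples.
loo_folds = list(k_fold_indices(4, 4))
print(len(loo_folds))
```

In practice a model would be fitted on each training split and scored on the matching test split, and the scores averaged across folds.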
  • 11. VI. Case Study
Spark Streaming
• Streaming Architecture
• Deployment of Collection and Message Queuing Tiers
• Introduction of message queuing tier using Apache Kafka
• Running the Collection Tier (Part II - Sending Data)
Data Access Tier
• Introduction to Data Access tier - MongoDB
• Exploring Spring Reactive
• Exposing Data Access tier in browser
Analysis Tier
• Introduction to Analysis tier - Apache Spark
• Plug-in Spark Analysis Tier to Our Pipelines
• A brief overview of Spark RDDs
• Fault Tolerance
• Kafka Connect
• Assignment 2
Brief introduction to
• DataFrame, DataSets, and SparkSQL
• Spark Structured Streaming
• Lambda vs Kappa architecture
• Benefits of Kappa architecture
Building Data Pipelines using Apache Airflow
• Advantages of using DAGs in Apache Airflow
• Apache Airflow UI
• Building DAG using Airflow
• Airflow Monitoring and Logging
• Assignment 3
VII. Final Exam