CLOUD-BASED BIG DATA
ANALYTICS
INTRODUCTION:
• With the advent of the digital age, the amount of data being
generated, stored and shared has been on the rise. From data
warehouses, social media, webpages and blogs to audio/video
streams, all of these are sources of massive amounts of data.
• This data has huge potential, but it also brings ever-increasing
complexity, security risks, and a large share of irrelevant content.
• Big data, by definition, is a term used to
describe a variety of data (structured, semi-structured
and unstructured), which makes for a
complex data infrastructure.
• Big data is characterized by variety, volume, velocity
and veracity.
• The different types of data present in a dataset
determine variety, while the rate at which data is
produced determines velocity.
• The size of the data is called volume.
• Veracity indicates data reliability.
• The cloud computing environment offers
development, installation and
implementation of software and data
applications ‘as a service’.
• Software as a service (SaaS)
• Platform as a service (PaaS)
• Infrastructure as a service (IaaS)
• Infrastructure as a service is a model that
provides computing and storage resources as
a service.
• In the case of PaaS and SaaS, the cloud service
provides a software platform or the software itself.
LITERATURE SURVEY:
• Traditional data management tools and data processing or data
mining techniques cannot be used for big data analytics because of
the large volume and complexity of the datasets involved.
• Conventional business intelligence applications rely on
traditional analytics methods and techniques such as OLAP, BPM
and data mining, and on database systems such as an RDBMS.
• One of the most popular models for data processing on
clusters of computers is MapReduce.
• Hadoop is an open-source implementation of the
MapReduce framework, paired with its own distributed
file system (HDFS).
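To make the MapReduce model concrete, the following is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to word counting. It does not use Hadoop; the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are illustrative assumptions, and a real cluster would run the map and reduce steps in parallel on many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of strings."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data in the cloud", "cloud data analytics"]))
```

Because each map call depends only on its own record and each reduce call only on one key, both stages can be distributed across a cluster without changing the logic.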
PROBLEM STATEMENT:
• In order to move beyond the existing techniques and strategies
used for machine learning and data analytics, some challenges
need to be overcome. NESSI identifies the following
requirements as critical.
• In order to select an adequate method or design, a solid scientific
foundation needs to be developed.
• New efficient and scalable algorithms need to be developed.
• For proper implementation of devised solutions, appropriate
development skills and technological platforms must be identified and
developed.
• Lastly, the business value of the solutions must be explored just as
much as the data structure and its usability.
• This section describes two example applications in which
large-scale data management over the cloud is used. These are
specific use-case examples in telecom and finance.
• In the telecom domain, massive amounts of call detail records
can be processed to generate near real-time network usage
information.
• In the finance domain, the example is a fraud detection
application.
DESIGN, IMPLEMENTATION AND RESULT
ANALYSIS DETAILS:
1. Dashboard for CDR Processing:
• Telecom operators are interested in building a dashboard that would
allow the analysts and architects to understand the traffic flowing
through the network along various dimensions of interest.
• The traffic is captured using Call Detail Records (CDRs) whose volume
runs into a terabyte per day.
• CDR is a structured stream generated by the telecom switches to
summarize various aspects of individual services like voice, SMS, MMS,
etc.
• Typical dashboard queries include determining the cell site used most by
each customer, identifying whether users mostly make calls within the same
cell site, and, for cell sites in rural areas, identifying the source of
traffic, i.e. local versus routed calls.
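Two of the dashboard queries above can be sketched in a few lines. This is an illustration under stated assumptions: a CDR is reduced to a hypothetical tuple `(customer_id, caller_cell, callee_cell)`, which is not a real CDR layout, and the function names are invented for the example.

```python
from collections import Counter, defaultdict


def most_used_cell_site(cdrs):
    """Return the cell site each customer originated the most calls from."""
    usage = defaultdict(Counter)
    for customer_id, caller_cell, _callee_cell in cdrs:
        usage[customer_id][caller_cell] += 1
    return {cust: counts.most_common(1)[0][0] for cust, counts in usage.items()}


def share_within_cell(cdrs):
    """Return, per customer, the fraction of calls that stay in one cell site."""
    tallies = defaultdict(lambda: [0, 0])  # [same-cell calls, total calls]
    for customer_id, caller_cell, callee_cell in cdrs:
        tallies[customer_id][1] += 1
        if caller_cell == callee_cell:
            tallies[customer_id][0] += 1
    return {cust: same / total for cust, (same, total) in tallies.items()}


if __name__ == "__main__":
    sample = [
        ("alice", "A", "A"),
        ("alice", "A", "B"),
        ("alice", "C", "C"),
        ("bob", "B", "B"),
    ]
    print(most_used_cell_site(sample))
    print(share_within_cell(sample))
```

Both functions are per-key aggregations, so they map naturally onto the group-by-key processing that a distributed framework provides at terabyte scale.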
• Given the huge and ever-growing customer base and large call volumes,
solutions using a traditional warehouse will not be able to keep up with
the rates required for effective operation.
• The need is to process the CDRs in near real-time, mediate them (i.e.,
collect CDRs from individual switches, stitch, validate, filter, and
normalize them), and create various indices which can be exploited by
the dashboard, among other applications.
• An IBM Streams Processing Language (SPL) based system can
mediate 6 billion CDRs per day.
• CDRs can be loaded periodically into a cloud data management solution.
As the cloud provides flexible storage, one can decide
on the storage required depending on traffic.
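The mediation steps named above (collect, validate, filter, normalize) can be sketched as a small generator pipeline. This is a plain Python illustration, not SPL; the `CDR` fields, the rule that drops duplicate record IDs and negative durations, and all function names are assumptions made for the example. Stitching partial records together is omitted.

```python
from dataclasses import dataclass

REQUIRED_FIELDS = ("record_id", "switch_id", "caller", "callee", "duration_s", "service")


@dataclass(frozen=True)
class CDR:
    record_id: str
    switch_id: str
    caller: str
    callee: str
    duration_s: int
    service: str


def validate(raw):
    """Accept a raw record only if every required field is present and non-empty."""
    return all(raw.get(field) not in (None, "") for field in REQUIRED_FIELDS)


def clean_number(number):
    """Strip spaces and dashes so equal numbers compare equal."""
    return str(number).replace(" ", "").replace("-", "")


def normalize(raw):
    """Convert a validated raw dict into a typed, uniformly formatted CDR."""
    return CDR(
        record_id=str(raw["record_id"]),
        switch_id=str(raw["switch_id"]),
        caller=clean_number(raw["caller"]),
        callee=clean_number(raw["callee"]),
        duration_s=int(raw["duration_s"]),
        service=str(raw["service"]).strip().upper(),
    )


def mediate(switch_feeds):
    """Collect records from every switch feed, then validate, normalize and filter."""
    seen = set()
    for feed in switch_feeds:
        for raw in feed:
            if not validate(raw):
                continue
            cdr = normalize(raw)
            if cdr.duration_s < 0 or cdr.record_id in seen:
                continue  # filter out impossible durations and duplicates
            seen.add(cdr.record_id)
            yield cdr
```

At 6 billion CDRs per day the sustained rate is roughly 69,000 records per second (6 × 10^9 / 86,400), so a single-process loop like this only illustrates the logic; the throughput comes from running such steps in parallel on a stream processing platform.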
2. Credit Card Fraud Detection:
• More than one-tenth of the world’s population shops online, and credit
cards are the most popular mode of online payment. As the number of
credit card transactions rises, so do the opportunities for attackers to
steal credit card details and commit fraud.
• Because the attacker only needs to know a few details about the card (card
number, expiration date, etc.), the only way to detect online credit card
fraud is to analyze spending patterns and detect any inconsistency
with the usual spending pattern.
• Companies keep track of the geographical locations where credit card
transactions are made. If the location is far from the card holder’s area
of residence, or if two transactions from the same credit card are made
in two very distant areas within a relatively short timeframe, then the
transactions are potentially fraudulent.
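The "two distant locations in a short timeframe" rule can be sketched as a speed check between consecutive transactions. The haversine formula for great-circle distance is standard; the transaction layout `(timestamp_seconds, latitude, longitude)`, the function names, and the 900 km/h threshold (roughly airliner speed) are assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def is_suspicious(prev_txn, txn, max_speed_kmh=900.0):
    """Flag a pair of transactions whose implied travel speed is implausible.

    Each transaction is a (timestamp_seconds, latitude, longitude) tuple.
    """
    distance = haversine_km(prev_txn[1], prev_txn[2], txn[1], txn[2])
    hours = (txn[0] - prev_txn[0]) / 3600.0
    if hours <= 0:
        # Simultaneous transactions are suspicious only if the places differ.
        return distance > 0
    return distance / hours > max_speed_kmh
```

For example, a purchase in Mumbai followed one hour later by a purchase in London implies a speed of several thousand km/h and is flagged, while the same pair a day apart is not.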
• Various data mining algorithms are used to detect patterns within the
transaction data. Detecting these patterns requires the analysis of large
amounts of data.
• Using tuples of transactions, one can compute the distance between the
geographic locations of two consecutive transactions, the amounts of these
transactions, etc. These parameters identify potentially
fraudulent transactions. Further data mining based on a particular
user’s spending profile can increase confidence that
a transaction is indeed fraudulent.
• As the number of credit card transactions is huge and the processing
required is not typical relational processing (hence, warehouses are not
optimized for it), one can use a Hadoop-based solution
for this purpose, as depicted.
• Using Hadoop, one can create customer profiles as well as
matrices of consecutive transactions to decide whether a particular
transaction is fraudulent. As fraud needs to be found within
a specified time, stream processing can help.
• By employing massive resources to analyze potentially fraudulent
transactions, one can meet the response time guarantees.
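One simple way to picture a customer profile is a per-card mean and standard deviation of past amounts, built with a group-by-key step of the kind a MapReduce job would perform. This is a sketch under assumptions, not the method used in the deck: the `(card_id, amount)` input layout, the function names, and the three-standard-deviation threshold are all illustrative choices.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_profiles(transactions):
    """Group amounts by card, then reduce each group to (mean, std deviation)."""
    amounts = defaultdict(list)
    for card_id, amount in transactions:  # group values by key
        amounts[card_id].append(amount)
    return {card: (mean(vals), pstdev(vals)) for card, vals in amounts.items()}


def deviates_from_profile(profile, amount, threshold=3.0):
    """Return True if the amount lies more than `threshold` deviations from the mean."""
    average, std = profile
    if std == 0:
        # No observed variation: any different amount counts as a deviation.
        return amount != average
    return abs(amount - average) / std > threshold


if __name__ == "__main__":
    history = [("c1", 20.0), ("c1", 25.0), ("c1", 30.0), ("c2", 100.0)]
    profiles = build_profiles(history)
    print(deviates_from_profile(profiles["c1"], 500.0))
    print(deviates_from_profile(profiles["c1"], 27.0))
```

Profiles like these would be built in batch over the full history, while the per-transaction check is cheap enough to run in a stream within the required response time.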
3. Result Analysis:
• Several open-source data mining techniques, resources
and tools exist. These include R, GATE, RapidMiner
and Weka, among many others.
• Cloud-based big data analytics solutions must make
these affordable analytics tools available
on the cloud so that cost-effective and
efficient services can be provided.
• The fundamental reasons for the popularity of cloud-based
analytics are easy accessibility, cost-effectiveness
and ease of setting up and testing.
CONCLUSION AND FUTURE RESEARCH
DIRECTION:
• This is an age of big data and the emergence of this field of
study has attracted the attention of many practitioners and
researchers.
• Considering the rate at which data is being created in the
digital world, big data analytics has become all
the more relevant.
• Cloud infrastructure meets the storage and computing
requirements of data analytics algorithms. On the other hand,
open issues such as security, privacy and the lack of ownership and
control remain.
• Research studies in the area of cloud-based big data analytics
THANK YOU

Cloud-Based Big Data Analytics

  • 2. INTRODUCTION: • With the advent of the digital age, the amount of data being generated, stored and shared has been on the rise. From data warehouses, social media, webpages and blogs to audio/video streams, all of these are sources of massive amounts of data. • This data has huge potential, ever-increasing complexity, insecurity and risks, and irrelevance.
  • 3. • Big data, by definition, is a term used to describe a variety of data -structured, semi- structured and unstructured, which makes it a complex data infrastructure. • Big data includes variety, volume, velocity and veracity • The different types of data available on a dataset determine variety while the rate at which data is produced determines Velocity. • Predictably, the size of data is called Volume. • Veracity indicates data reliability. INTRODUCTION: CNTD…
  • 4. INTRODUCTION: CNTD… • The cloud computing environment offers development, installation and implementation of software and data applications ‘as a service’. • software as a service(SaaS) • Platform as a service(PaaS) • Infrastructure as a service(IaaS) • Infrastructure-as-a-service is a model that provides computing and storage resources as a service. • in case of PaaS and SaaS, the cloud services provide software platform or software itself
  • 5. LITERATURE SURVEY: • Traditional data management tools and data processing or data mining techniques cannot be used for Big Data Analytics for the large volume and complexity of the datasets that it includes. • Conventional business intelligence applications make use of methods, which are based on traditional analytics methods and techniques and make use of OLAP, BPM, Mining and database systems like RDBMS. • One of the most popular models used for data processing on cluster of computers is MapReduce. • Hadoop is simply an open-source implementation of the MapReduce framework, which was originally created as a distributed file system.
  • 6. PROBLEM STATEMENT: • In order to move beyond the existing techniques and strategies used for machine learning and data analytics, some challenges need to be overcome. NESSI identifies the following requirements as critical. • In order to select an adequate method or design, a solid scientific foundation needs to be developed. • New efficient and scalable algorithms need to be developed. • For proper implementation of devised solutions, appropriate development skills and technological platforms must be identified and developed. • Lastly, the business value of the solutions must be explored just as much as the data structure and its usability.
  • 7. PROBLEM STATEMENT: CNTD… • This section describes two example applications where large-scale data management over the cloud is used. These are specific use cases in telecom and finance. • In the telecom domain, massive amounts of call detail records can be processed to generate near real-time network usage information. • In the finance domain, the example is a fraud detection application.
  • 8. DESIGN, IMPLEMENTATION AND RESULT ANALYSIS DETAILS: 1. Dashboard for CDR Processing: • Telecom operators are interested in building a dashboard that allows analysts and architects to understand the traffic flowing through the network along various dimensions of interest. • The traffic is captured using Call Detail Records (CDRs), whose volume runs into a terabyte per day. • A CDR is a structured stream generated by the telecom switches to summarize various aspects of individual services like voice, SMS, MMS, etc. • The dashboard's tasks include determining the cell site used most by each customer, identifying whether users mostly make calls within a single cell site, and, for cell sites in rural areas, identifying the source of traffic, i.e. local versus routed calls.
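Two of the dashboard metrics listed above (the most-used cell site per customer and the share of calls that stay within one cell site) can be sketched as simple aggregations. The record layout `(customer_id, origin_cell, dest_cell)` and the sample values are hypothetical and only serve the illustration; real CDRs carry many more fields.

```python
from collections import Counter, defaultdict

# Hypothetical, simplified CDR layout: (customer_id, origin_cell, dest_cell).
cdrs = [
    ("c1", "A", "A"),
    ("c1", "A", "B"),
    ("c1", "B", "B"),
    ("c2", "C", "C"),
    ("c2", "C", "C"),
]


def top_cell_per_customer(records):
    """Return the origin cell site each customer used most often."""
    usage = defaultdict(Counter)
    for customer, origin, _dest in records:
        usage[customer][origin] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in usage.items()}


def within_cell_share(records):
    """Return, per customer, the fraction of calls that start and end in the same cell."""
    total = Counter()
    local = Counter()
    for customer, origin, dest in records:
        total[customer] += 1
        if origin == dest:
            local[customer] += 1
    return {c: local[c] / total[c] for c in total}


top_cells = top_cell_per_customer(cdrs)
local_share = within_cell_share(cdrs)
```

At a terabyte per day these aggregations would be computed incrementally over a stream or in a distributed job rather than over an in-memory list.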
  • 9. DESIGN, IMPLEMENTATION AND RESULT ANALYSIS DETAILS: 1. Dashboard for CDR Processing: CNTD… • Given the huge and ever-growing customer base and large call volumes, solutions using a traditional warehouse will not be able to keep up with the rates required for effective operation. • The need is to process the CDRs in near real-time, mediate them (i.e., collect CDRs from individual switches, stitch, validate, filter, and normalize them), and create various indices that can be exploited by the dashboard among other applications. • A system based on IBM Stream Processing Language (SPL) can mediate 6 billion CDRs per day. • CDRs can be loaded periodically into a cloud data management solution. As the cloud provides flexible storage, one can decide on the storage required depending on traffic.
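The mediation steps named above (collect, stitch, validate, filter, normalize) can be sketched as a small batch pipeline. This is not SPL and not the system described on the slide; the field names (`call_id`, `caller`, `duration_s`, `service`), the stitching rule (summing durations of partial records that share a `call_id`) and the set of accepted services are all assumptions made for the example.

```python
from itertools import chain

REQUIRED = ("call_id", "caller", "duration_s", "service")  # assumed schema


def stitch(records):
    """Merge partial records that share a call_id by summing their durations."""
    merged, loose = {}, []
    for record in records:
        key = record.get("call_id")
        if key is None:
            loose.append(record)  # cannot be stitched; validation rejects it later
        elif key in merged:
            merged[key]["duration_s"] = (
                merged[key].get("duration_s", 0) + record.get("duration_s", 0)
            )
        else:
            merged[key] = dict(record)
    return list(merged.values()) + loose


def validate(record):
    """Accept a record only if every required field is present and duration is non-negative."""
    return all(field in record for field in REQUIRED) and record["duration_s"] >= 0


def normalize(record):
    """Strip separators from the caller number and lower-case the service name."""
    out = dict(record)
    out["caller"] = record["caller"].replace(" ", "").replace("-", "")
    out["service"] = record["service"].lower()
    return out


def mediate(switch_feeds, services=("voice", "sms", "mms")):
    """Collect, stitch, validate, filter and normalize CDRs from several switches."""
    collected = chain.from_iterable(switch_feeds)
    stitched = stitch(collected)
    valid = [r for r in stitched if validate(r)]
    kept = [r for r in valid if r["service"].lower() in services]
    return [normalize(r) for r in kept]


feeds = [
    [{"call_id": 1, "caller": "555-0100", "duration_s": 30, "service": "Voice"},
     {"call_id": 2, "caller": "555 0101", "service": "SMS"}],  # missing duration
    [{"call_id": 1, "caller": "555-0100", "duration_s": 45, "service": "Voice"},
     {"call_id": 3, "caller": "5550102", "duration_s": 5, "service": "data"}],
]
mediated = mediate(feeds)
```

A streaming system would apply the same steps record by record with bounded state instead of materializing each stage as a list.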
  • 10. DESIGN, IMPLEMENTATION AND RESULT ANALYSIS DETAILS: 2. Credit Card Fraud Detection: • More than one-tenth of the world’s population shops online. Credit cards are the most popular mode of online payment. As the number of credit card transactions rises, the opportunities for attackers to steal credit card details and commit fraud also increase. • As the attacker only needs to know some details about the card (card number, expiration date, etc.), the only way to detect online credit card fraud is to analyze spending patterns and detect any inconsistency with the usual spending patterns. • Companies keep tabs on the geographical locations where credit card transactions are made: if the area is far from the card holder’s area of residence, or if two transactions from the same credit card are made in two very distant areas within a relatively short timeframe, then the transactions are potentially fraudulent.
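The location rule described above (two transactions on the same card in very distant places within a short time) can be sketched as an implied-speed check. The great-circle distance uses the standard haversine formula; the transaction fields (`lat`, `lon`, `time_h`) and the 900 km/h threshold are assumptions chosen for this example, not values from the slides.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def is_suspicious(tx_a, tx_b, max_speed_kmh=900.0):
    """Flag two transactions on one card if the implied travel speed is implausible."""
    distance = haversine_km(tx_a["lat"], tx_a["lon"], tx_b["lat"], tx_b["lon"])
    hours = abs(tx_b["time_h"] - tx_a["time_h"])
    if hours == 0:
        return distance > 0
    return distance / hours > max_speed_kmh


new_york = {"lat": 40.7128, "lon": -74.0060, "time_h": 0.0}
london_soon = {"lat": 51.5074, "lon": -0.1278, "time_h": 1.0}
london_later = {"lat": 51.5074, "lon": -0.1278, "time_h": 10.0}
```

Here `is_suspicious(new_york, london_soon)` flags the pair because roughly 5,500 km in one hour is not plausible, while the same trip over ten hours passes. A flag from this rule only marks a candidate for further checks, since card-not-present purchases can legitimately originate far from the card holder.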
  • 11. DESIGN, IMPLEMENTATION AND RESULT ANALYSIS DETAILS: 2. Credit Card Fraud Detection: CNTD… • Various data mining algorithms are used to detect patterns within the transaction data. Detecting these patterns requires the analysis of large amounts of data. • Using tuples of the transactions, one can find the distance between the geographic locations of two consecutive transactions, the amounts of these transactions, etc. From these parameters, one can identify potentially fraudulent transactions. Further data mining based on a particular user’s spending profile can be used to increase confidence that a transaction is indeed fraudulent.
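One simple way to compare a transaction with a user's spending profile is a z-score of the amount against that user's earlier transactions. The slides do not name a specific algorithm, so this is a stand-in technique for illustration; the threshold of 3 standard deviations and the warm-up of 5 transactions are assumed values.

```python
from statistics import mean, stdev


def amount_zscore(history, amount):
    """Number of standard deviations `amount` lies from the mean of `history`."""
    if len(history) < 2:
        return 0.0  # not enough data to estimate a spread
    sigma = stdev(history)
    if sigma == 0:
        return 0.0 if amount == history[0] else float("inf")
    return (amount - mean(history)) / sigma


def flag_transactions(amounts, threshold=3.0, warmup=5):
    """Flag each amount that deviates strongly from the user's earlier amounts."""
    flags = []
    for i, amount in enumerate(amounts):
        history = amounts[:i]
        score = amount_zscore(history, amount)
        flags.append(i >= warmup and abs(score) > threshold)
    return flags


amounts = [20, 25, 22, 18, 24, 21, 500, 23]
flags = flag_transactions(amounts)
# Only the 500 purchase is flagged.
```

Note that once the outlier enters the history it inflates the mean and spread, so a production profile would typically exclude confirmed fraud or use robust statistics such as the median.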
  • 12. DESIGN, IMPLEMENTATION AND RESULT ANALYSIS DETAILS: 2. Credit Card Fraud Detection: CNTD… • As the number of credit card transactions is huge and the processing required is not typical relational processing (hence, warehouses are not optimized for it), one can use a Hadoop-based solution for this purpose as depicted. • Using Hadoop, one can create customer profiles as well as matrices of consecutive transactions to decide whether a particular transaction is fraudulent. As one needs to find the fraud within some specified time, stream processing can help. • By employing massive resources to analyze potentially fraudulent transactions, one can meet the response time guarantees.
  • 13. DESIGN, IMPLEMENTATION AND RESULT ANALYSIS DETAILS: 3. Result Analysis: • Several open-source data mining techniques, resources and tools exist. These include R, GATE, RapidMiner and Weka, in addition to many others. • Cloud-based big data analytics solutions must make these affordable data analytics tools available on the cloud so that cost-effective and efficient services can be provided. • The fundamental reason why cloud-based analytics is so popular is its easy accessibility, cost-effectiveness and ease of setting up and testing.
  • 15. CONCLUSION AND FUTURE RESEARCH DIRECTION: • This is an age of big data, and the emergence of this field of study has attracted the attention of many practitioners and researchers. • Considering the rate at which data is being created in the digital world, big data analytics has become all the more relevant. • The cloud infrastructure meets the storage and computing requirements of data analytics algorithms. On the other hand, open issues like security, privacy and the lack of ownership and control remain. • Research studies in the area of cloud-based big data analytics