Data Analytics
Data analytics in Forensic Accounting
Data Analytics
 Data analytics is the science of analyzing raw data to make conclusions about that
information. Many of the techniques and processes of data analytics have been
automated into mechanical processes and algorithms that work over raw data for
human consumption.
 For example, manufacturing companies often record the runtime, downtime, and
work queue for various machines and then analyze the data to better plan the
workloads so the machines operate closer to peak capacity.
 Gaming companies use data analytics to set reward schedules for players that
keep the majority of players active in the game. Content companies use many of
the same data analytics to keep you clicking, watching, or re-organizing content to
get another view or another click.
Data Analytics process
 Data Collection
The first stage of the data pipeline is ingestion. During this stage, data is collected from
sources and moved into a system where it can be stored.
 Data Processing
The next stage of the data pipeline prepares the data for use and stores information in a
system accessible by users and applications. To maximize data quality, it must be cleaned
and transformed into information that can be easily accessed and queried.
 Data Modeling
In the next stage of the data pipeline, stored data is analyzed, and modeling algorithms are
created. Data may be analyzed by an end-to-end analytics platform like SAP, Oracle, or
SAS, or processed at scale by tools like Apache Spark.
 Decision-Making
After data has been ingested, prepared, and analyzed, it’s ready to be acted upon. Data
visualization and reporting help communicate the results of analytics.
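The four stages above can be sketched as a chain of small functions. This is only an illustrative sketch: the sample data, the column names, and the function names (`collect`, `process`, `model`, `report`) are assumptions made for the example, not part of any particular platform.

```python
import csv
import io
from statistics import mean

# Hypothetical raw export: machine name and runtime hours, with one incomplete row.
RAW = """machine,runtime_hours
press_a,7.5
press_b,
press_a,8.0
press_b,6.0
"""

def collect(text):
    """Data collection: ingest raw rows from a source."""
    return list(csv.DictReader(io.StringIO(text)))

def process(rows):
    """Data processing: drop incomplete rows and convert types."""
    return [
        {"machine": r["machine"], "runtime_hours": float(r["runtime_hours"])}
        for r in rows
        if r["runtime_hours"]
    ]

def model(rows):
    """Data modeling: average runtime per machine."""
    by_machine = {}
    for r in rows:
        by_machine.setdefault(r["machine"], []).append(r["runtime_hours"])
    return {m: mean(v) for m, v in by_machine.items()}

def report(summary):
    """Decision-making support: a plain-text report of the results."""
    return [f"{m}: {avg:.2f} h" for m, avg in sorted(summary.items())]

summary = model(process(collect(RAW)))
for line in report(summary):
    print(line)
```

Keeping each stage as its own function makes it easy to swap in a real data source or a more elaborate model later without touching the other stages.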
Types of Data Analytics
Data analytics is broken down into four basic types.
1. Descriptive analytics: This describes what has happened over a given period of time. Has
the number of views gone up? Are sales stronger this month than last?
2. Diagnostic analytics: This focuses more on why something happened. This involves more
diverse data inputs and a bit of hypothesizing. Did the weather affect ice cream sales? Did
that latest marketing campaign impact sales?
3. Predictive analytics: This moves to what is likely going to happen in the near term. What
happened to sales the last time we had a hot summer? How many weather models predict a
hot summer this year?
4. Prescriptive analytics: This suggests a course of action. If the likelihood of a hot summer,
measured as the average of these five weather models, is above 58%, we should add an
evening shift to the brewery and rent an additional tank to increase output.
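The brewery example can be written as a few lines of code to show how a description, a prediction, and a prescription relate. The sales figures and model probabilities below are invented for illustration; only the 58% threshold and the recommended action come from the text.

```python
from statistics import mean

# Hypothetical monthly sales (units) and hot-summer probabilities from five models.
sales = {"May": 1200, "June": 1500}
hot_summer_probabilities = [0.55, 0.62, 0.60, 0.58, 0.65]
THRESHOLD = 0.58  # the 58% cut-off from the brewery example

# Descriptive: what happened?
change = sales["June"] - sales["May"]

# Predictive input: average likelihood across the five models.
likelihood = mean(hot_summer_probabilities)

# Prescriptive: turn the prediction into a recommended action.
def recommend(p, threshold=THRESHOLD):
    if p > threshold:
        return "add evening shift and rent an additional tank"
    return "keep current capacity"

print(f"Sales change: {change:+d}")
print(f"Hot-summer likelihood: {likelihood:.0%}")
print(recommend(likelihood))
```

Diagnostic analytics is left out of the sketch because it needs additional inputs, such as weather or campaign data, to test a hypothesis about why sales changed.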
Data Analytics Tools
 R programming – A leading language for statistics and data modeling. R compiles and runs on various platforms such as UNIX,
Windows, and macOS. It also provides tools to install packages automatically as the user requires.
 Python – Python is an open-source, object-oriented programming language that is easy to read, write, and maintain. It provides various machine
learning and visualization libraries such as Scikit-learn, TensorFlow, Matplotlib, Pandas, and Keras. It can also work with data sources such as SQL
Server, a MongoDB database, or JSON files.
 Tableau Public/Power BI – Visualization tools, available in free editions, that connect to data sources such as Excel or a corporate data warehouse. They create
visualizations, maps, and dashboards with real-time updates on the web.
 SAS – A programming language and environment for data manipulation and analytics, this tool is easily accessible and can analyze data from
different sources.
 Microsoft Excel – One of the most widely used tools for data analytics. Mostly used for clients’ internal data, it summarizes
data with features such as pivot tables.
 RapidMiner – A powerful, integrated platform that works with many data source types such as Access, Excel, Microsoft SQL Server, Teradata, Oracle,
and Sybase. This tool is mostly used for predictive analytics, including data mining, text analytics, and machine learning.
 KNIME – Konstanz Information Miner (KNIME) is an open-source data analytics platform, which allows you to analyze and model data. With the
benefit of visual programming, KNIME provides a platform for reporting and integration through its modular data pipeline concept.
 Apache Spark – A large-scale data processing engine. The project has claimed that it runs applications in Hadoop clusters up to 100 times faster in memory
and 10 times faster on disk than Hadoop MapReduce. This tool is also popular for data pipelines and machine learning model development.
Data Analytics Methods
 Cluster analysis
The action of grouping a set of data elements in a way that said elements are more similar (in a particular
sense) to each other than to those in other groups – hence the term ‘cluster.’ Since there is no target variable
when clustering, the method is often used to find hidden patterns in the data. The approach is also used to
provide additional context to a trend or dataset.
 Regression analysis
Regression uses historical data to understand how a dependent variable's value is affected when one
(simple regression) or more (multiple regression) independent variables change or stay the same. By
understanding each variable's relationship and how they developed in the past, you can anticipate possible
outcomes and make better decisions in the future. Examples include weather forecasting and crop yield prediction.
 Data mining
A method of data analysis that is the umbrella term for engineering metrics and insights for additional
value, direction, and context. By using exploratory statistical evaluation, data mining aims to identify
dependencies, relations, patterns, and trends to generate advanced knowledge. When considering how to
analyze data, adopting a data mining mindset is essential to success - as such, it’s an area that is worth
exploring in greater detail.
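As one concrete instance of the methods above, regression with a single independent variable can be fitted by ordinary least squares in a few lines. The sketch below uses made-up temperature and sales figures, and the helper name `fit_line` is an assumption for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one independent variable: y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical history: average temperature (degrees C) vs. units sold.
temps = [18, 22, 26, 30]
units = [110, 130, 150, 170]

intercept, slope = fit_line(temps, units)
forecast = intercept + slope * 28  # anticipated sales at 28 degrees C
print(f"units = {intercept:.1f} + {slope:.1f} * temperature")
print(f"forecast at 28 degrees: {forecast:.0f} units")
```

Real data is rarely this tidy, so in practice the fit would be checked against held-out observations before it is used for decisions.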
 Neural networks
Neural networks form the basis for many of the intelligent algorithms of machine learning.
Loosely modeled on how the human brain processes information, they attempt, with minimal
human intervention, to generate insights and predict values. Neural networks learn from the
data they are trained on, meaning that they can improve as more data becomes available.
 Text analysis
Text analysis, also known in the industry as text mining, works by taking large sets of
textual data and arranging it in a way that makes it easier to manage. By working through
this cleansing process in stringent detail, you will be able to extract the data that is truly
relevant to your organization and use it to develop actionable insights that will propel you
forward.
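A very small version of the text analysis described above is to normalize free-text notes, drop common filler words, and count what remains. The notes, the stop-word list, and the function name `top_terms` are illustrative assumptions; real text mining tools go much further.

```python
import re
from collections import Counter

# A tiny, illustrative stop-word list; real lists are much longer.
STOPWORDS = {"the", "a", "to", "of", "and", "was", "for", "on"}

def top_terms(documents, n=3):
    """Count the most frequent non-stop-words across all documents."""
    counts = Counter()
    for doc in documents:
        words = re.findall(r"[a-z']+", doc.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

# Hypothetical free-text notes attached to transactions.
notes = [
    "Refund issued to customer for late delivery",
    "Refund approved, the delivery was late",
    "Customer asked for a refund on damaged goods",
]

for term, count in top_terms(notes):
    print(term, count)
```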
 Time series analysis
As its name suggests, time series analysis is used to analyze a set of data points
collected over a specified period of time. Analysts use this method to monitor data points
at regular intervals rather than intermittently, but its purpose is not simply to collect
data over time. Instead, it allows researchers to understand whether variables changed over
the duration of the study, how the different variables depend on one another, and how the
end result was reached.
 Decision Trees
Decision tree analysis aims to act as a support tool for making smart and strategic
decisions. By visually displaying potential outcomes, consequences, and costs in a tree-like
model, researchers and business users can easily evaluate all factors involved and
choose the best course of action. Decision trees are helpful for analyzing quantitative data,
and they allow for an improved decision-making process by helping you spot improvement
opportunities, reduce costs, and enhance operational efficiency and production.
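One common way to evaluate a decision tree of outcomes and costs is to compute the expected value of each option. The sketch below assumes that probabilities and payoffs have been estimated for each branch; the options, figures, and the `expected_value` helper are hypothetical.

```python
def expected_value(node):
    """Expected payoff of a node.

    A leaf carries a "payoff"; a chance node carries "branches",
    a list of (probability, child node) pairs.
    """
    if "payoff" in node:
        return node["payoff"]
    return sum(p * expected_value(child) for p, child in node["branches"])

# Hypothetical options with assumed probabilities and net payoffs.
options = {
    "add evening shift": {"branches": [
        (0.6, {"payoff": 50_000}),   # hot summer, extra demand
        (0.4, {"payoff": -10_000}),  # mild summer, idle capacity
    ]},
    "keep current capacity": {"payoff": 15_000},
}

values = {name: expected_value(tree) for name, tree in options.items()}
best = max(values, key=values.get)
for name, value in values.items():
    print(f"{name}: {value:,.0f}")
print("best option:", best)
```

Expected value is only one possible criterion; a risk-averse decision maker might weigh the downside branch more heavily.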
Types of data visualization charts
AI in Accounting
AI possesses the potential to take the strength of human knowledge (skills and rules)
and apply these insights to gigantic datasets without the human weaknesses of
inattention, bias, and fatigue. AI use in many industries has proliferated due to the
availability of big data and growing computing power.
 Accounting firms are heavily investing in the development of AI systems, ranging
from automation of processes (e.g. Robotic Process Automation, or RPA), to
contract analysis, to image recognition (using drones).
 Deloitte and EY have used Natural Language Processing (NLP) in their tax
services to expedite their sifting through thousands of legal documents.
 The use of machine learning algorithms to identify outliers and fraudulent records
has been among the accounting firms’ favorite AI applications.
Data analytics in Internal Audit
 A key benefit of data analytics is that it offers an alternative to sampling.
Previously, internal auditors relied on analyzing a few sample transactions – out of
millions – to identify instances of non-compliance, revenue leak, potentially
fraudulent activity, and other problems.
 For instance, when using the sampling method for internal auditing, it’s easy to
miss the fact that an unusually large number of transactions were entered on a
weekend, although the entity being audited is only open for business during
weekdays.
 Such mistakes can occur because audit sampling does not examine 100 percent
of the items within a class of transactions.
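The weekend example shows why testing the full population matters: a simple rule can be applied to every transaction instead of a sample. This is a minimal sketch with an invented ledger; the tuple layout and the function name `weekend_entries` are assumptions.

```python
from datetime import date

# Hypothetical ledger entries: (transaction id, posting date, amount).
transactions = [
    ("T001", date(2024, 3, 4), 120.00),   # Monday
    ("T002", date(2024, 3, 9), 980.00),   # Saturday
    ("T003", date(2024, 3, 10), 45.50),   # Sunday
    ("T004", date(2024, 3, 13), 300.00),  # Wednesday
]

def weekend_entries(rows):
    """Return every entry posted on a Saturday or Sunday.

    date.weekday() numbers Monday as 0, so 5 and 6 are the weekend.
    """
    return [r for r in rows if r[1].weekday() >= 5]

flagged = weekend_entries(transactions)
for txn_id, posted, amount in flagged:
    print(txn_id, posted.isoformat(), amount)
```

Because the check runs over all rows, no weekend entry can be missed by chance, which is the limitation of sampling described above.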
Benford’s Law application in uncovering fraud
 A great example of accountants leveraging data analytics to uncover fraud took place in 2014 when
Caseware Analytics client KPMG audited a call center. In this organization, hundreds of call center
operators could issue—without need for their manager’s approval—refunds of up to USD $50.
Within the span of several years, each operator issued more than 10,000 refunds. This presented
an ideal opportunity for theft, so KPMG used data analytics—Benford’s Law, specifically—to verify
the validity of the refunds. Benford’s Law expects that about 30.1% of numbers in a list of financial
transactions will begin with ‘1’, about 17.6% with ‘2’, and so on, with each successive digit predicted to
represent a progressively smaller proportion. When digits fall outside the expected pattern, it may
indicate fraud.
 Using the Benford’s Law functionality in their data analysis software, KPMG found that there was a
large spike in fours—the refunds did not follow Benford’s Law. As the accountants soon discovered,
several operators had been issuing refunds just below the $50 threshold to friends, families and
even themselves. Hundreds of thousands of dollars in fraudulent refunds had been processed and
may have gone undetected had a Benford’s analysis not been conducted on the refund data.
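The first-digit test described above can be sketched as follows. Benford's Law gives the expected share of leading digit d as log10(1 + 1/d). The refund amounts are invented to mimic clustering just under a $50 limit, and the helper names are assumptions; this is not the functionality of any specific audit software.

```python
import math
from collections import Counter

def benford_expected():
    """Expected share of each leading digit d: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(amount):
    """First significant digit of a non-zero amount, via scientific notation."""
    return int(f"{abs(amount):e}"[0])

def digit_shares(amounts):
    """Observed share of each leading digit in the data."""
    counts = Counter(leading_digit(a) for a in amounts)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

def largest_excess(amounts):
    """Leading digit whose observed share most exceeds the expected share."""
    expected = benford_expected()
    observed = digit_shares(amounts)
    return max(range(1, 10), key=lambda d: observed[d] - expected[d])

# Hypothetical refunds clustered just under a $50 approval limit.
refunds = [12.40, 18.75, 23.10, 31.00, 47.50, 48.00, 49.00, 49.50, 49.99, 45.25]

print("expected share of 1s:", round(benford_expected()[1], 3))
print("digit with largest excess:", largest_excess(refunds))
```

A spike like this is a prompt for further investigation rather than proof of fraud, and a real test would use far more than ten records.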
Big data
Big data refers to data sets that are too large or complex to be
dealt with by traditional data-processing software.
Data with many entries (rows) offer greater statistical power,
while data with higher complexity (more attributes or columns)
may lead to a higher false discovery rate.
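One way to see why more attributes raise the risk of spurious findings: if each attribute is tested independently at significance level alpha, the chance of at least one false positive is 1 - (1 - alpha)^m for m tests. The sketch below illustrates this related quantity (the family-wise error rate), under the assumption of independent tests; it is not a full false discovery rate calculation.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """Probability of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** num_tests

# More columns tested means more chances for a spurious "discovery".
for m in (1, 20, 100):
    print(f"{m:>3} attributes tested: {chance_of_false_positive(m):.1%}")
```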
Five V’s of Big data
Here are five V’s of big data
• Volume refers to the increasing size of the datasets that the financial industry must
process and analyze, which now measure in the petabytes (one petabyte equals 1 million
• Variety relates to the many different data sources that big data applications tap to create
analyses that more accurately represent a business’s financial operations today and in the
• Velocity refers to the high speed at which data is created, which requires distributed
processing techniques to collect and curate information in many different formats and
• Veracity describes the quality of the data being analyzed, especially whether the data is
consistent and certain. It also relates to the data’s ready availability and controllability.
• Value means that the data contributes in a meaningful way to the analysis rather than

Data Analytics Introduction.pptx

  • 1. Data Analytics Data analytics in Forensic Accounting
  • 2. Data Analytics  Data analytics is the science of analyzing raw data to make conclusions about that information. Many of the techniques and processes of data analytics have been automated into mechanical processes and algorithms that work over raw data for human consumption.  For example, manufacturing companies often record the runtime, downtime, and work queue for various machines and then analyze the data to better plan the workloads so the machines operate closer to peak capacity.  Gaming companies use data analytics to set reward schedules for players that keep the majority of players active in the game. Content companies use many of the same data analytics to keep you clicking, watching, or re-organizing content to get another view or another click.
  • 4. Data Analytics process  Data Collection The first stage of the data pipeline is ingestion. During this stage, data is collected from sources and moved into a system where it can be stored.  Data Processing The next stage of the data pipeline prepares the data for use and stores information in a system accessible by users and applications. To maximize data quality, it must be cleaned and transformed into information that can be easily accessed and queried.  Data Modeling In the next stage of the data pipeline, stored data is analyzed, and modeling algorithms are created. Data may be analyzed by an end-to-end analytics platform like SAP, Oracle, or SAS—or processed at scale by tools like Apache Spark*  Decision-Making After data has been ingested, prepared, and analyzed, it’s ready to be acted upon. Data visualization and reporting help communicate the results of analytics.
  • 5. Types of Data Analytics Data analytics is broken down into four basic types. 1. Descriptive analytics: This describes what has happened over a given period of time. Have the number of views gone up? Are sales stronger this month than last? 2. Diagnostic analytics: This focuses more on why something happened. This involves more diverse data inputs and a bit of hypothesizing. Did the weather affect icecream sales? Did that latest marketing campaign impact sales? 3. Predictive analytics: This moves to what is likely going to happen in the near term. What happened to sales the last time we had a hot summer? How many weather models predict a hot summer this year? 4. Prescriptive analytics: This suggests a course of action. If the likelihood of a hot summer is measured as an average of these five weather models is above 58%, we should add an evening shift to the brewery and rent an additional tank to increase output.
  • 6. Data Analytics Tools  R programming – This tool is the leading analytics tool used for statistics and data modeling. R compiles and runs on various platforms such as UNIX, Windows, and Mac OS. It also provides tools to automatically install all packages as per user-requirement.  Python – Python is an open-source, object-oriented programming language that is easy to read, write, and maintain. It provides various machine learning and visualization libraries such as Scikit-learn, TensorFlow, Matplotlib, Pandas, Keras, etc. It also can be assembled on any platform like SQL server, a MongoDB database or JSON  Tableau Public/Power BI– This is a free software that connects to any data source such as Excel, corporate Data Warehouse, etc. It then creates visualizations, maps, dashboards etc with real-time updates on the web.  SAS – A programming language and environment for data manipulation and analytics, this tool is easily accessible and can analyze data from different sources.  Microsoft Excel – This tool is one of the most widely used tools for data analytics. Mostly used for clients’ internal data, this tool analyzes the tasks that summarize the data with a preview of pivot tables.  RapidMiner – A powerful, integrated platform that can integrate with any data source types such as Access, Excel, Microsoft SQL, Tera data, Oracle, Sybase etc. This tool is mostly used for predictive analytics, such as data mining, text analytics, machine learning.  KNIME – Konstanz Information Miner (KNIME) is an open-source data analytics platform, which allows you to analyze and model data. With the benefit of visual programming, KNIME provides a platform for reporting and integration through its modular data pipeline concept.  Apache Spark – One of the largest large-scale data processing engine, this tool executes applications in Hadoop clusters 100 times faster in memory and 10 times faster on disk. 
This tool is also popular for data pipelines and machine learning model development.
  • 8. Data Analytics Methods  Cluster analysis The action of grouping a set of data elements in a way that said elements are more similar (in a particular sense) to each other than to those in other groups – hence the term ‘cluster.’ Since there is no target variable when clustering, the method is often used to find hidden patterns in the data. The approach is also used to provide additional context to a trend or dataset.  Regression analysis Regression uses historical data to understand how a dependent variable's value is affected when one (linear regression) or more independent variables (multiple regression) change or stay the same. By understanding each variable's relationship and how they developed in the past, you can anticipate possible outcomes and make better decisions in the future. i.e. weather forecasting, crop yield prediction etc.  Data mining A method of data analysis that is the umbrella term for engineering metrics and insights for additional value, direction, and context. By using exploratory statistical evaluation, data mining aims to identify dependencies, relations, patterns, and trends to generate advanced knowledge. When considering how to analyze data, adopting a data mining mindset is essential to success - as such, it’s an area that is worth exploring in greater detail.
  • 9. Data Analytics Methods
      • Neural networks – The neural network forms the basis for the intelligent algorithms of machine learning. It is a form of analytics that, with minimal human intervention, attempts to mimic how the human brain generates insights and predicts values. Neural networks learn from each data transaction, so they evolve and improve over time.
      • Text analysis – Text analysis, also known in the industry as text mining, takes large sets of textual data and arranges them in a way that makes them easier to manage. By working through this cleansing process in stringent detail, you can extract the data that is truly relevant to your organization and use it to develop actionable insights.
  • 10. Data Analytics Methods
      • Time series analysis – As its name suggests, time series analysis is used to analyze a set of data points collected over a specified period of time. Although analysts use this method to monitor data points at set intervals rather than intermittently, time series analysis is not used solely to collect data over time. Instead, it allows researchers to understand whether variables changed over the course of the study, how the different variables depend on one another, and how the end result was reached.
      • Decision trees – Decision tree analysis acts as a support tool for making smart, strategic decisions. By visually displaying potential outcomes, consequences, and costs in a tree-like model, researchers and business users can easily evaluate all the factors involved and choose the best course of action. Decision trees are helpful for analyzing quantitative data, and they improve decision-making by helping you spot improvement opportunities, reduce costs, and enhance operational efficiency and production.
  • 12. Types of data visualization charts
  • 13. AI in Accounting
      AI has the potential to take the strength of human knowledge (skills and rules) and apply these insights to gigantic datasets without the human weaknesses of inattention, bias, and fatigue. AI use in many industries has proliferated due to the availability of big data and growing computing power.
      • Accounting firms are investing heavily in the development of AI systems, ranging from automation of processes (e.g. Robotic Process Automation, or RPA), to contract analysis, to image recognition (using drones).
      • Deloitte and EY have used Natural Language Processing (NLP) in their tax services to speed up their sifting through thousands of legal documents.
      • The use of machine learning algorithms to identify outliers and fraudulent records has been among the accounting firms’ favorite AI applications.
  • 14. Data analytics in Internal Audit
      • A key benefit of data analytics is that it offers an alternative to sampling. Previously, internal auditors relied on analyzing a few sample transactions, out of millions, to identify instances of non-compliance, revenue leakage, potentially fraudulent activity, and other problems.
      • For instance, when using the sampling method for internal auditing, it is easy to miss the fact that an unusually large number of transactions were entered on a weekend, although the entity being audited is only open for business on weekdays.
      • Such oversights can occur because audit sampling does not examine 100 percent of the items within a class of transactions.
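As a rough illustration of the weekend example above, the sketch below scans every transaction in a population rather than a sample and returns those entered on a Saturday or Sunday. The record layout (`id`, `entered_on`, `amount`), the function name `weekend_entries`, and the ledger rows are all hypothetical; a real audit would read these fields from the client's accounting system.

```python
from datetime import date


def weekend_entries(transactions):
    """Return every transaction whose entry date is a Saturday or Sunday."""
    # date.weekday() numbers Monday as 0, so 5 and 6 are the weekend
    return [t for t in transactions if t["entered_on"].weekday() >= 5]


# Hypothetical ledger extract; the whole population is tested, not a sample
ledger = [
    {"id": 1, "entered_on": date(2024, 3, 1), "amount": 120.00},  # Friday
    {"id": 2, "entered_on": date(2024, 3, 2), "amount": 980.00},  # Saturday
    {"id": 3, "entered_on": date(2024, 3, 3), "amount": 45.50},   # Sunday
    {"id": 4, "entered_on": date(2024, 3, 4), "amount": 310.25},  # Monday
]

flagged = weekend_entries(ledger)
print([t["id"] for t in flagged])
```

A flagged entry is only a prompt for follow-up, since some businesses legitimately post entries outside opening hours.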
  • 15. Benford’s Law application in uncovering frauds
      • A notable example of accountants using data analytics to uncover fraud took place in 2014, when Caseware Analytics client KPMG audited a call center. In this organization, hundreds of call center operators could issue refunds of up to USD 50 without their manager’s approval. Over several years, each operator issued more than 10,000 refunds. This presented an ideal opportunity for theft, so KPMG used data analytics, specifically Benford’s Law, to check the validity of the refunds. Benford’s Law predicts that about 30.1% of the numbers in a list of financial transactions will begin with ‘1’, about 17.6% with ‘2’, and so on, with each successive leading digit accounting for a progressively smaller share. When leading digits fall outside the expected pattern, it may indicate fraud.
      • Using the Benford’s Law functionality in their data analysis software, KPMG found a large spike in fours: the refunds did not follow Benford’s Law. As the accountants soon discovered, several operators had been issuing refunds just below the $50 threshold to friends, family members, and even themselves. Hundreds of thousands of dollars in fraudulent refunds had been processed and might have gone undetected had a Benford’s analysis not been run on the refund data.
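A first-digit test of the kind described above can be sketched as follows. This is an illustrative outline, not the functionality of any particular audit package: the expected share of leading digit d under Benford’s Law is log10(1 + 1/d), and the function names, the `tolerance` threshold, and the refund amounts are hypothetical.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(amount):
    """First significant digit of a non-zero amount."""
    # Scientific notation puts the first significant digit first,
    # e.g. 49.5 -> '4.950000000000000e+01'
    return int(f"{abs(amount):.15e}"[0])


def first_digit_shares(amounts):
    """Observed share of each leading digit 1-9 among non-zero amounts."""
    digits = [leading_digit(a) for a in amounts if a != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


def flag_digits(amounts, tolerance=0.05):
    """Digits whose observed share exceeds the expected share by more than tolerance."""
    expected = benford_expected()
    observed = first_digit_shares(amounts)
    return [d for d in range(1, 10) if observed[d] - expected[d] > tolerance]


# Hypothetical refund amounts clustered just below a 50-dollar approval limit
refunds = [12.0, 15.5, 18.0, 23.0, 31.0, 44.0, 45.0, 47.5, 48.0, 49.0, 49.5, 49.99]
print(flag_digits(refunds))  # prints [4]
```

The fixed tolerance is a simplification chosen for readability; with a population as small as this sample, or in practice generally, a statistical test such as chi-squared would be a more defensible way to judge whether a deviation is meaningful.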
  • 16. Big data
      Big data refers to data sets that are too large or complex to be dealt with by traditional data-processing software. Data with many entries (rows) offer greater statistical power, while data with higher complexity (more attributes or columns) may lead to a higher false discovery rate.
  • 17. Five V’s of Big data
      Here are the five V’s of big data:
      • Volume refers to the increasing size of the datasets that the financial industry must process and analyze, which now measure in the petabytes (one petabyte equals 1 million
      • Variety relates to the many different data sources that big data applications tap to create analyses that more accurately represent a business’s financial operations today and in the
      • Velocity refers to the high speed at which data is created, which requires distributed processing techniques to collect and curate information in many different formats and
      • Veracity describes the quality of the data being analyzed, especially whether the data is consistent and certain. It also relates to the data’s ready availability and controllability.
      • Value means that the data contributes in a meaningful way to the analysis rather than