This document discusses big data analytics and its use in digital marketing. It begins by introducing big data and how early adopters like Google, eBay, and Facebook were built around big data. It then discusses how both individuals and companies now generate and consume large amounts of data. Examples are given of how much data companies like Google and Facebook process daily. The characteristics of big data are described. Traditional analytics are compared to big data analytics. Applications of big data analytics are discussed for various sectors like retail, healthcare, and government. Specific examples are provided of how analytics can provide insights from website visitors. The challenges and power of big data are also summarized before concluding with references.
Alternative data is everywhere. We must start using it as a competitive edge over competitors who look only at their traditional data sources.
Presentation: Big Data – From Strategy to Production - Mario Meir-Huber, Big Data Leader Eastern Europe, Teradata GmbH (AT), at the European Data Economy Workshop taking place back to back to SEMANTiCS2015 on 15 September 2015 in Vienna
DISUMMIT - Rishi Nalin Kumar from DataKind (DigitYser)
Rishi Nalin Kumar
Chief Scientist at eBench
Half professional, half collaborator, one quarter mathematician. Currently at eBench helping brands understand their consumers and win with their content. Previously leading data science & analytics in large-corporate consumer goods with a light touch of news & media. Proud volunteer at DataKind and a regular on the data & analytics speaker circuit.
DISUMMIT Keynote presentation from Kirk Borne - From Sensors to Sense-Making (DigitYser)
Dr. Kirk Borne is a Principal Data Scientist at Booz Allen Hamilton. With a rich background in astrophysics and computational science, he was a pioneer in introducing big data courses in academia. He is one of the most important promoters of data literacy in the world.
About Kirk and his view on data literacy and evolution
On his first visit to Brussels, Kirk's first activity was sharing his best practices for promoting data literacy. While enjoying a magnificent view of Brussels from the ING headquarters building, Kirk playfully (with a pair of socks!) explained how subjectivity plays a major role in the way data is understood, as a result of the wide variety involved. This keynote was delivered at the speakers' reception, which took place the day before the DI Summit.
The following day, Kirk wrapped up the DI Summit with his closing keynote on how data has shifted into something that is sense-making, following the evolution from “data” to “big data” to “smart data”, which is composed of both enriched and semantic data and is essential for IoT. He also discussed the levels of maturity in a self-driving enterprise, and closed by sharing this equation:
Big Data + IoT + Citizen Data Scientists = Partners in Sustainability
Kirk’s impression of the DI Summit was that it was a fun and informative event to join. His favorite format was the 5” pitches, as they were properly structured and gave attendees the most critical information. He also thinks that the networking dynamic ensured that all attendees met interesting people.
A takeaway from Kirk’s presentation
“Big data is not about how big it is, but the value you extract from it”
We look forward to having Kirk back in Brussels sometime soon!
2) Here are some video interviews that I have done:
https://www.youtube.com/watch?v=ku2na1mLZZ8
https://www.youtube.com/watch?v=iXjvht91nFk
Here is my TedX talk: https://www.youtube.com/watch?v=Zr02fMBfuRA
Trends in Big Data & Business Challenges (Experian_US)
Join our #DataTalk on Thursdays at 5 p.m. ET. This week, we tweeted with Sushil Pramanick, the founder and president of The Big Data Institute (TBDI).
You can learn about upcoming chats and see the archive of past big data tweetchats here
http://www.experian.com/blogs/news/about/datadriven
Whitepaper: Thriving in the Big Data Era: Manage Data before Data Manages You (Intellectyx Inc)
Paper Overview -
Every day, we create 2.5 quintillion bytes of data — so much that 90% of the data in the world today has been created in the last two years alone.
Data comes from everywhere and we are generating data more than ever before.
This white paper explains what Big Data is and provides practical examples, concluding with a message on how to put your data to work.
Learn about the emerging field of big data and advanced quantitative models and how the Rady School's MS in Business Analytics program is designed to solve important business problems.
This presentation offers a basic understanding of Big Data. It defines Big Data and covers the history of Big Data, Big Data by the numbers, and the 8 Laws of Big Data.
Big Data: From Hindsight to Insight to Foresight (Sunil Ranka)
When it comes to analytics and reporting, there is a fine line between hindsight, insight, and foresight. With the evolution of Big Data technology, there is a need to derive value from larger datasets that were not available in the past. Even before we start using the shiny new technologies, we need to understand what is categorized as reporting, business intelligence, or Big Data and analytics. Based on my experience, people struggle to distinguish between reporting, analytics, and business intelligence.
Why Everything You Know About bigdata Is A Lie (Sunil Ranka)
As a big data technologist, you can bet that you have heard it all: every crazy claim, myth, and outright lie about what big data is and what it isn't that you can imagine, and probably a few that you can't. If your company has a big data initiative or is considering one, you should be aware of these false statements and the reasons why they are wrong.
Welcome to the Age of Big Data in Banking (Andy Hirst)
Big Data in banking presentation from Sibos Dubai 2013. What are the use cases driving deployments in banking? See the banking use cases SAP was involved in during 2013.
Big Data Applications | Big Data Application Examples | Big Data Use Cases | ...Simplilearn
In this Big Data presentation, we will be discussing the Big data growth over the last few years followed by the various big data applications. We will look into the various sectors where big data is used such as weather forecast, healthcare, media and entertainment, logistics, travel & tourism and finally in the government & law enforcement sector.
We will discuss how the industries below are using Big Data:
1. Weather forecast
2. Media and entertainment
3. Healthcare
4. Logistics
5. Travel and tourism
6. Government and law enforcement
What is this Big Data Hadoop training course about?
The Big Data Hadoop and Spark developer course has been designed to impart in-depth knowledge of Big Data processing using Hadoop and Spark. The course is packed with real-life projects and case studies to be executed in the CloudLab.
What are the course objectives?
This course will enable you to:
1. Understand the different components of Hadoop ecosystem such as Hadoop 2.7, Yarn, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, and Apache Spark
2. Understand Hadoop Distributed File System (HDFS) and YARN as well as their architecture, and learn how to work with them for storage and resource management
3. Understand MapReduce and its characteristics, and assimilate some advanced MapReduce concepts
4. Get an overview of Sqoop and Flume and describe how to ingest data using them
5. Create database and tables in Hive and Impala, understand HBase, and use Hive and Impala for partitioning
6. Understand different types of file formats, Avro schemas, using Avro with Hive and Sqoop, and schema evolution
7. Understand Flume, Flume architecture, sources, flume sinks, channels, and flume configurations
8. Understand HBase, its architecture, data storage, and working with HBase. You will also understand the difference between HBase and RDBMS
9. Gain a working knowledge of Pig and its components
10. Do functional programming in Spark
11. Understand resilient distributed datasets (RDDs) in detail
12. Implement and build Spark applications
13. Gain an in-depth understanding of parallel processing in Spark and Spark RDD optimization techniques
14. Understand the common use-cases of Spark and the various interactive algorithms
15. Learn Spark SQL, creating, transforming, and querying Data frames
Learn more at https://www.simplilearn.com/big-data-and-analytics/big-data-and-hadoop-training
Nothing will be done with 80 percent of customer data this year. How much money is your organization leaving on the table as a result? Data is worth money. Forward-thinking companies manage their data so that they profit from it. And that gives them a considerable head start.
Abstract of the presentation by Fabio Rizzotto, IT Research & Consulting Director at IDC Italia, given at the IDC Big Data Conference II in Bologna on 19 November 2013
Abstract:
Big Data concern large-volume, complex, growing data sets with multiple, autonomous sources. With the fast development of networking, data storage, and the data collection capacity, Big Data are now rapidly expanding in all science and engineering domains, including physical, biological and biomedical sciences. This paper presents a HACE theorem that characterizes the features of the Big Data revolution, and proposes a Big Data processing model, from the data mining perspective. This data-driven model involves demand-driven aggregation of information sources, mining and analysis, user interest modeling, and security and privacy considerations. We analyze the challenging issues in the data-driven model and also in the Big Data revolution.
Whether or not you believe the hype around Big Data's promise to transform business, it is true that learning how to use the present deluge of data can help you make better decisions. Thanks to big data technologies, everything can now be used as data, giving you unparalleled access to market determinants. Contact V2Soft's Big Data Solutions if you wish to implement big data technology in your business and need help getting started. https://bit.ly/2kmiYFp
Beyond the Classroom consists of events, workshops and presentations meant to introduce Computer Science students to learning opportunities in addition to their regular classroom experiences. Beyond the Classroom events are free and open to all NHCC CSci students.
This presentation is about Big Data, how it changes the traditional data landscape, how different companies are using it, and which skills are in demand.
It’s Not About Big Data – It’s About Big Insights - SAP Webinar - 20 Aug 201...Edgar Alejandro Villegas
Presentation slides of:
It’s Not About Big Data – It’s About Big Insights - SAP Webinar - 20 Aug 2013 - PDF
Scott Mackenzie - Sr. Director, Platform & Analytics CoE
Michael Golzc - CIO for SAP Americas
Ken Demma - VP, Insight Driven Marketing
20 Aug 2013 - Webcast - http://goo.gl/T74WAL
Big Data Trends and Challenges Report - Whitepaper (Vasu S)
In this whitepaper, read how companies address common big data trends and challenges to gain greater value from their data.
https://www.qubole.com/resources/report/big-data-trends-and-challenges-report
This talk is an introduction to Data Science. It explains Data Science from two perspectives: as a profession and as a discipline. While covering the benefits of Data Science for business, it explains how to get started with embracing data science in business.
The world around us is changing. Data is embedded in everything, and users from all lines of business want to leverage this data to influence decisions. The trick is to create a culture for pervasive analytics and empower the business to use data everywhere.
The core enabling technology to make this happen is Apache Hadoop. By leveraging Hadoop, organizations of all sizes and across all industries are making business models more predictable, and creating significant competitive advantages using big data.
Join Cloudera and Forrester to learn:
- What we mean by pervasive analytics, how it impacts your organization, and how to get started
- How leading organizations are using pervasive analytics for competitive advantage
- How Cloudera’s extensive partner ecosystem complements your strategy, helping deliver results faster
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
As Europe's leading economic powerhouse and the fourth-largest #economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like #Russia and #China, #Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in #cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to #AdvancedPersistentThreats (#APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
1. Big Data Analytics and its use in Digital Marketing
Group 6
Members:
Mr. Sadanand Gupta (2015PGPMX022)
Mr. Samir Shah (2015PGPMX023)
Mr. Sanmeet Dhokay (2015PGPMX025)
Mr. Vishit Trivedi (2015PGPMX030)
11th June, 2017
2. Introduction
• Big Data, which burst upon the scene in the first decade of the 21st century, is the Next Big Thing in the IT world.
• The first organizations to embrace it were online and startup firms.
• Firms like Google, eBay, LinkedIn, and Facebook were built around big data from the beginning.
• Like many new information technologies, big data can bring about dramatic cost reductions, substantial improvements in the time required to perform a computing task, or new product and service offerings.
• Big Data generates value from the storage and processing of very large quantities of digital information that cannot be analyzed with traditional computing techniques.
3. The Model Has Changed…
• The Model of Generating/Consuming Data has Changed
Old Model: Few companies are generating data, all others are consuming data
New Model: all of us are generating data, and all of us are consuming data
4. How much data?
• Google processes 20 PB a day
• Facebook has over 2.5 PB of user data + 15 TB/day and handles 40 billion photos from its user base.
• eBay has 6.5 PB of user data + 50 TB/day (5/2009)
• Walmart handles more than 1 million customer transactions every hour.
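To make these figures easier to compare, the short sketch below converts three of them into per-second rates. It assumes decimal units (1 PB = 10^15 bytes, 1 TB = 10^12 bytes) and an even spread over the day or hour, which the slide does not state; the variable names are illustrative only.

```python
# Convert the slide's daily/hourly figures into per-second rates.
# Assumptions (not from the slide): decimal units and a uniform load.
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_HOUR = 60 * 60
PB = 10**15  # bytes in a petabyte (decimal)
TB = 10**12  # bytes in a terabyte (decimal)


def per_second(total, period_seconds):
    """Average rate per second for a total spread evenly over a period."""
    return total / period_seconds


google_bytes_per_s = per_second(20 * PB, SECONDS_PER_DAY)
ebay_bytes_per_s = per_second(50 * TB, SECONDS_PER_DAY)
walmart_tx_per_s = per_second(1_000_000, SECONDS_PER_HOUR)

print(f"Google processing: about {google_bytes_per_s / 10**9:.0f} GB per second")
print(f"eBay new data:     about {ebay_bytes_per_s / 10**6:.0f} MB per second")
print(f"Walmart:           about {walmart_tx_per_s:.0f} transactions per second")
```

Under these assumptions, 20 PB a day works out to roughly 231 GB every second, and one million transactions an hour to roughly 278 per second.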
5. Who is Generating Big Data
• Social media and networks (all of us are generating data)
• Scientific instruments (collecting all sorts of data)
• Mobile devices (tracking all objects all the time)
• Sensor technology and networks (measuring all kinds of data)
7. Big Data Analytics
Traditional Analytics (BI) vs Big Data Analytics
Focus on
• Traditional Analytics (BI): descriptive analytics, diagnostic analytics
• Big Data Analytics: predictive analytics, data science
Data Sets
• Traditional Analytics (BI): limited data sets, cleansed data, simple models
• Big Data Analytics: large scale data sets, more types of data, raw data, complex data models
Supports
• Traditional Analytics (BI): causation (what happened, and why?)
• Big Data Analytics: correlation (new insight, more accurate answers)
8. Merging the Traditional and Big Data Approaches
Traditional Approach: Structured & Repeatable Analysis
• Business users determine what question to ask
• IT structures the data to answer that question
• Examples: monthly sales reports, profitability analysis, customer surveys
Big Data Approach: Iterative & Exploratory Analysis
• IT delivers a platform to enable creative discovery
• Business users explore what questions could be asked
• Examples: brand sentiment, product strategy, maximum asset utilization
9. Challenges in Handling Big Data
• The bottleneck is in technology
• Given the data volume and processing demands, new architectures, algorithms, and techniques are needed
• It is also in technical skills
• Experts in using the new technology and dealing with big data are needed
12. Applications of Big Data Analytics
Insights from website visitors:
• How they reached your site
• The pages they visited
• What they bought
• How long they stayed on the site
• What page they were on when they left
And much more:
• What their occupation is
• Their level of education
• How much they earn
• If they have a family to support
• Their financial priorities
• How they like to spend their disposable income
• Why they have/haven’t bought from you
• What they were looking for when they came to your site
• Whether they found what they were looking for when they came to your site
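The first five insights above can be derived directly from a page-view log. The following is a minimal sketch of that derivation, not any particular analytics product's method. The event layout (session id, seconds since the session started, page, referrer), the `purchases` mapping, and all sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical page-view log: (session_id, seconds_since_start, page, referrer).
events = [
    ("s1", 0, "/home", "google.com"),
    ("s1", 40, "/products", None),
    ("s1", 95, "/checkout", None),
    ("s2", 0, "/blog", "facebook.com"),
    ("s2", 30, "/home", None),
]

# Hypothetical purchases keyed by session id.
purchases = {"s1": ["sku-42"]}


def summarize_sessions(events, purchases):
    """Group page views by session and derive the five basic insights."""
    by_session = defaultdict(list)
    for session_id, ts, page, referrer in events:
        by_session[session_id].append((ts, page, referrer))

    summaries = {}
    for session_id, hits in by_session.items():
        hits.sort(key=lambda hit: hit[0])  # order views by time
        summaries[session_id] = {
            "source": hits[0][2],                    # how they reached the site
            "pages": [page for _, page, _ in hits],  # the pages they visited
            "bought": purchases.get(session_id, []), # what they bought
            # Time between first and last view; time spent on the last
            # page is unknown from this log, so this is a lower bound.
            "duration_s": hits[-1][0] - hits[0][0],
            "exit_page": hits[-1][1],                # page they were on when they left
        }
    return summaries


summaries = summarize_sessions(events, purchases)
for session_id, summary in summaries.items():
    print(session_id, summary)
```

The items in the "much more" list (occupation, income, motivations) cannot be read off a click log like this one; they would need additional data sources such as surveys or third-party profiles.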
13. Applications of Big Data Analytics
For instance, Google Analytics can tell us:
• If certain pages of our site have higher than expected exit rates.
• Which pages of our site keep visitors engaged the longest (we can use this in collaboration with exit data to compare our best and worst performing pages and make changes accordingly).
• Which devices your visitors are using (and aren’t using) to access your website. This allows developers to prioritize optimizing the functionality of the site for the devices your visitors are actually using.
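As a rough illustration of the three reports above, the sketch below computes a per-page exit rate (exits divided by page views), average time on page, and the share of sessions per device. It does not use the Google Analytics API and may differ from how Google Analytics defines these metrics. The session structure and sample values are hypothetical, and every session is assumed to contain at least one view.

```python
from collections import Counter, defaultdict

# Hypothetical sessions: device plus ordered (page, seconds_on_page) views.
sessions = [
    {"device": "mobile", "views": [("/home", 20), ("/pricing", 45)]},
    {"device": "desktop", "views": [("/home", 10), ("/blog", 120), ("/pricing", 30)]},
    {"device": "mobile", "views": [("/home", 5)]},
]


def page_report(sessions):
    """Per page: exit rate (exits / views) and average seconds on page."""
    views = Counter()
    exits = Counter()
    seconds = defaultdict(int)
    for session in sessions:
        for page, secs in session["views"]:
            views[page] += 1
            seconds[page] += secs
        last_page = session["views"][-1][0]  # assumes a non-empty session
        exits[last_page] += 1
    return {
        page: {
            "exit_rate": exits[page] / views[page],
            "avg_seconds": seconds[page] / views[page],
        }
        for page in views
    }


def device_share(sessions):
    """Fraction of sessions per device type."""
    counts = Counter(session["device"] for session in sessions)
    total = sum(counts.values())
    return {device: n / total for device, n in counts.items()}


report = page_report(sessions)
shares = device_share(sessions)

# Sort pages from highest to lowest exit rate to spot problem pages first.
for page, stats in sorted(report.items(), key=lambda kv: -kv[1]["exit_rate"]):
    print(f"{page}: exit rate {stats['exit_rate']:.0%}, avg {stats['avg_seconds']:.1f}s")
print(shares)
```

Reading the exit rate next to average time on page is what lets best and worst performing pages be compared: a page with a high exit rate and little time spent is a stronger candidate for changes than one visitors read at length before leaving.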
14. Big data: This is just the beginning
[Figure: data volume in exabytes (axis 0 to 9000) and percentage of uncertain data (axis 0 to 100), from 2010 to 2025, with a "You are here" marker. Data sources shown: sensors & devices, VoIP, enterprise data, social media. Dimensions labeled: volume, variety, veracity.]
Source: IBM Global Technology Outlook 2012. IBM source data is based on analysis done by the IBM Market Intelligence Department. IBM Market Intelligence data is provided for illustrative purposes and is not intended to be a guarantee of future growth rates or market opportunity.