Presentation by John Repko to the Colorado Society for Information Management (http://www.sim-colorado.org/), March 19, 2013. It talks about big data "killer apps," and the two kinds of innovation ("Hindsight" and "Foresight") that big data can bring to any business.
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person... - Connotate
This document discusses how web data can reveal information about employees, business partners, and persons of interest. It outlines the business case for using web data to conduct background checks and screenings. It also discusses challenges like collecting good data from various sources and analyzing large amounts of unstructured data. Advanced text analytics solutions that use entity resolution and relationship extraction are presented as helping to understand web data. The document concludes by describing how these techniques were applied in a project with Thorn to detect child sex trafficking online.
The document discusses challenges organizations face in making analytics actionable. It notes that while data is growing exponentially from both internal and external sources, only a small portion of business leaders have the information they need to make decisions. Additionally, most employees do not understand their organization's goals or how their work aligns with these goals. The document advocates for improving data integration and rationalization to make more data available and useful for analytics. It also discusses how analytics maturity can be improved by moving from basic reporting to predictive analytics and optimizing business processes. Recent technological advances are helping organizations tackle more complex analytics to gain new insights.
The document discusses strategies for organizations to better manage big data when resources are limited. It recommends identifying unused data in the data warehouse in order to reduce costs by moving that data to cheaper platforms like Hadoop. Organizations can save millions by offloading data that is not frequently queried but must be retained for regulatory reasons. The document also suggests purging data that is not needed at all to further reduce storage and management costs. Proper classification and placement of data onto platforms suited to its usage level and type, such as Hadoop for less critical datasets, can help organizations get more value from their data with fewer resources.
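The classification step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `TableStats` record, the 90-day query count, the `hot_threshold` value, and the three placement labels are hypothetical and do not come from the summarized document.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    """Hypothetical usage record for one warehouse table."""
    name: str
    queries_last_90_days: int  # assumed usage metric
    must_retain: bool          # e.g. held for regulatory reasons


def placement(table: TableStats, hot_threshold: int = 10) -> str:
    """Suggest where a table should live based on usage and retention."""
    # Frequently queried data stays on the warehouse.
    if table.queries_last_90_days >= hot_threshold:
        return "warehouse"
    # Never queried and no retention duty: candidate for purging.
    if table.queries_last_90_days == 0 and not table.must_retain:
        return "purge_candidate"
    # Everything else is cold data that moves to a cheaper platform.
    return "offload"


tables = [
    TableStats("daily_sales", 420, False),
    TableStats("audit_log_2009", 0, True),
    TableStats("tmp_campaign_export", 0, False),
    TableStats("quarterly_rollup", 3, False),
]
plan = {t.name: placement(t) for t in tables}
print(plan)
```

In practice the threshold and the usage window would be tuned against the warehouse's own query logs; the point of the sketch is only that placement follows from usage level and retention requirements.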
Thwart Fraud Using Graph-Enhanced Machine Learning and AI - Neo4j
This webinar will discuss using graph-enhanced machine learning and AI to thwart fraud. On February 6th, Scott Heath from Expero and Amy Hodler from Neo4j will discuss how graph databases can be used to identify patterns and relationships in complex transactional data to detect fraud. The webinar is part of a series that will also cover building intelligent fraud prevention systems using machine learning and graphs, and obtaining funding for graph-enhanced fraud solutions.
Move It Don't Lose It: Is Your Big Data Collecting Dust? - Jennifer Walker
The document discusses the rapid growth of big data and challenges of gaining insights from data. Some key points:
- By 2020, the digital universe is projected to reach 40 zettabytes, with 5,200 GB of data for every person on Earth.
- Data is coming from a growing number of sources like IoT devices, mobile devices, social media, and more. Much of this data is unstructured.
- Moving large amounts of data to storage and analytics platforms in a timely manner is challenging using traditional ETL and bulk transfer methods, which can take months.
- Freshness of data is important for insights but current methods result in data becoming stale before it reaches its destination.
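The two figures in the first key point can be checked against each other with simple arithmetic. The sketch below assumes decimal units (1 ZB = 10^21 bytes, 1 GB = 10^9 bytes); the function name is illustrative only.

```python
ZETTABYTE = 10**21  # bytes, decimal (SI) definition assumed
GIGABYTE = 10**9    # bytes, decimal (SI) definition assumed


def implied_population(total_zb: float, per_person_gb: float) -> float:
    """Number of people implied by a total data volume and a per-person share."""
    return (total_zb * ZETTABYTE) / (per_person_gb * GIGABYTE)


people = implied_population(40, 5_200)
print(f"{people / 1e9:.2f} billion people")  # prints "7.69 billion people"
```

So 40 zettabytes spread at 5,200 GB per person corresponds to a world population of roughly 7.7 billion, which shows the two numbers in the projection are consistent with each other.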
Analytics 3.0 Measurable business impact from analytics & big data - Microsoft
Presentation from the Harvard Business Review event on Analytics and Big Data
(October 15, 2013)
"Featuring analytics expert Tom Davenport, author of Competing on Analytics, Analytics at Work, and the just-released Keeping Up with the Quants" 
The document discusses Jongwook Woo and his background working with big data. It provides details on Woo's experience as a professor focusing on big data research and education partnerships. It also outlines some of the topics Woo covers in his presentations including introductions to big data, artificial intelligence, and the relationship between AI and big data. Key technologies like Hadoop, Spark, and neural networks are mentioned.
Knowledge Graphs for a Connected World - AI, Deep & Machine Learning Meetup - Benjamin Nussbaum
We live in an era where the world is more connected than ever before and the trajectory is such that data relationships will only continue to increase with no signs of slowing down. Connected data is the key to your business succeeding and growing in today’s connected world. Leading enterprises will be the ones that utilize relationship-centric technologies to leverage connections from their internal operations and supply chain to their customer and user interactions. This ability to utilize connected data to understand all the nuanced relationships within their organization will propel them forward as they act on more holistic insights.
Every organization needs a knowledge graph because connected data is an essential foundation to advancing business. Additional reading on connected data can be found here: https://www.graphgrid.com/why-connected-data-is-more-useful/
The document discusses how big data is changing business due to the massive increase in data creation in recent years. It notes that 90% of data in the world was created in just the last two years alone. The document then provides an overview of what big data means and the factors involved, including volume, velocity, variety, and value. It also reviews some case studies and discusses how big data is affecting software companies and creating new opportunities.
How to use your data science team: Becoming a data-driven organization - Yael Garten
Talk given at Strata Hadoop World conference March 2016.
http://conferences.oreilly.com/strata/hadoop-big-data-ca/public/schedule/detail/48305
In this talk we review the culture, process and tools needed for a data driven organization. We review an example of how companies like LinkedIn use data to make business decisions, and then walk through the culture, process, and tools needed to foster this. We review the spectrum of data science used within an organization and explore organizational needs, such as the democratization of data via self-serve data platforms for experimentation, monitoring, and data exploration, as well as the challenges that come with such systems. Participants leave this session with the ability to identify opportunities for data scientists to contribute within their organization and with an understanding of what investments are needed to drive transformation into a data-driven organization.
Hadoop: Data Storage Locker or Agile Analytics Platform? It’s Up to You. - Jennifer Walker
The document discusses how Hadoop is often used primarily as a data storage system rather than an agile analytics platform. It argues that for Hadoop to enable productive analytics, companies need to transform Hadoop into a system that allows for iterative exploration of diverse data sources through intuitive interfaces that leverage machine learning. This requires addressing challenges such as a lack of data understanding, scarce expertise, and time-consuming data preparation processes. Adopting platforms that provide self-service access and leverage business context can help democratize data access and analysis.
GGV Capital: Venture Investing and the Cloud (2012) - GGV Capital
This document discusses venture investing in cloud computing. It provides an overview of why VCs continue to see opportunities in the cloud sector. The presentation agenda covers trends disrupting the cloud like mobile and big data, as well as opportunities in serving small and medium businesses. The document concludes with advice for cloud startups on effectively approaching VCs for funding, emphasizing differentiation, market size, scalability, financial model, and chemistry over legal terms.
The world around us is changing. Data is embedded in everything, and users from all lines of business want to leverage this data to influence decisions. The trick is to create a culture for pervasive analytics and empower the business to use data everywhere.
The core enabling technology to make this happen is Apache Hadoop. By leveraging Hadoop, organizations of all sizes and across all industries are making business models more predictable, and creating significant competitive advantages using big data.
Join Cloudera and Forrester to learn:
- What we mean by pervasive analytics, how it impacts your organization, and how to get started
- How leading organizations are using pervasive analytics for competitive advantage
- How Cloudera’s extensive partner ecosystem complements your strategy, helping deliver results faster
This talk is an introduction to Data Science. It explains Data Science from two perspectives: as a profession and as a discipline. While covering the benefits of Data Science for business, it explains how to get started with embracing data science in business.
Loras College 2016 Business Analytics Symposium Keynote - Rich Clayton
Leaders who embrace data have a profound impact on their organizations, yet too few seize the opportunity. Biases in decision making, technology myths, data quality, and analytical skills are the obstacles most frequently cited by organizations of all sizes. Technology advances have neutralized the scale advantage and have democratized analytics for every organization – so now what? Are you ready to engage more data in your management decisions? Do you have an analytic strategy that has two speeds – one for innovation and one for scale? Are you investing in your top talent so they can ask new questions?
We’ll explore these topics and how to create an analytic culture in your organization. We’ll share how leaders have transformed their organizations by innovating their analytic processes, re-designing the way they work and embracing new technology innovation. We’ll dispel myths about technology and provide you a foundation for building your journey to analytic excellence.
Intel, Cloudera and guest speaker Forrester Research, Inc. discuss the strategy of pervasive analytics and real life examples of how analytics have already been embedded into applications and workflows.
HPE IDOL Technical Overview - July 2016 - Andrey Karpov
Search and Analytics Platform for Text and Rich Media
Open Innovation is transforming everything
Connected people, apps and things generating massive data in many forms
How do you bridge the gap between data and outcomes?
Augmented Intelligence power apps for competitive advantage
Machine Learning at the Service of Business Augmented Intelligence
HPE Big Data Advanced Analytics Software Solutions
Strong information and weak information
HPE IDOL: Natural Language Processing (NLP) engine
Gayatri Patel, eBay, presents at the Big Analytics 2012 Roadshow
The wonder of what data can do for an organization is measured in the productivity and competitiveness of its teams' decisions. Some believe more data is the key. Agreed...but good decisions require more than just deriving intelligence from big data. In this dynamic market, the ability to socialize and evolve ideas with other teams, quickly correlate information across sources, and test ideas to fail fast and early are strong enablers for gaining competitive footing. eBay's analytic and technology advancements garner insights and approaches that continue to help our employees tell their "data stories" and make better decisions.
Module 6 The Future of Big and Smart Data - Online - caniceconsulting
This document provides an overview of predictions and trends related to the future of big data. Key predictions include machine learning becoming prominent in big data analysis, privacy emerging as a major challenge, and the creation of chief data officer positions. Emerging trends covered include the growth of open source solutions like Hadoop, the use of in-memory technologies to speed processing, and the incorporation of machine learning and predictive analytics. The document also discusses opportunities that big data presents for industries, such as increased productivity and sales.
What is the impact of Big Data on Analytics from a Data Science perspective?
Presented at the Big Data and Analytics Summit 2014, Nasscom by Mamatha Upadhyaya.
Big Data: The Force That’s Good for Consumers and Society - Experian_US
Craig Boundy, CEO of Experian North America, discusses how big data is being used as a force for good: good for consumers, good for business, and good for society. He shares his perspective on how Experian’s work in data and analytics has real-life applications.
The document discusses 25 predictions about the future of big data:
1) Data volumes and ways to analyze data will continue growing exponentially with improvements in machine learning and real-time analytics.
2) More companies will appoint chief data officers and use data as a competitive advantage.
3) Data governance, visualization, and delivery through data fabrics and marketplaces will be key to extracting insights from diverse data sources and empowering partners.
4) Data is becoming a new global currency and companies are monetizing their data through algorithms, services, and by becoming "data businesses."
From the MarTech Conference in London, UK, October 20-21, 2015. SESSION: The Human Side of Analytics. PRESENTATION: The Human Side of Data - Given by Colin Strong - @colinstrong - Managing Director - Verve, Author of Humanizing Big Data. #MarTech DAY2
Oracle ACE Director Dan Morgan and PTC Chief Strategy Officer Mark Swanholm presented this special webinar to discuss Big Data and the choices ahead for organizations. For more details about Performance Tuning Corporation, visit www.peftuning.com .
Organizations are being bombarded with messages telling you that you must make an investment in Big Data, that without it your organization will be rendered obsolete, a mere bystander, on the road to increased growth and profitability.
But do you? How exactly will your organization benefit from Big Data? When do you invest – and does investing in Big Data mean leaving the rest of your data strategy stranded?
Oracle ACE Director Dan Morgan, an internationally recognized expert in database technology and former University of Washington lecturer, and Mark Swanholm, PTC’s Chief Strategy Officer and 22-year IT veteran, will address the issue of Big Data from the standpoint of what it is, where the value can be found, and what is actually required to turn this new technology into something of value.
This Performance Tuning Corporation online event will focus on strategy, management, planning, and budgeting, and will provide you and your management team the information you need to plan and make the best possible decision with respect to an investment in Big Data technology.
Cutting Edge Predictive Analytics with Eric Siegel Databricks
Apache Spark empowers predictive analytics and machine learning by increasing their reach and potential. But before jumping to new deployments, it’s critical that we 1) get the analytics right and 2) not overlook less conspicuous business opportunities. In this keynote, Predictive Analytics World founder and “Predictive Analytics” author Eric Siegel ramps you up on a dangerous pitfall and a critical value proposition:
– PITFALL: Avoiding BS predictive insights, i.e., “bad science,” spurious discoveries
– OPPORTUNITY: Optimizing marketing persuasion by predicting the *influence* of marketing treatments, i.e., uplift modeling
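To make the uplift idea concrete: in its simplest form, uplift is the difference between the response rate of people who received a marketing treatment and the response rate of a comparable control group. The sketch below is not taken from the keynote; the record layout, segment names, and sample data are hypothetical, and real uplift models typically predict this difference per individual rather than per segment.

```python
from collections import defaultdict


def uplift_by_segment(records):
    """Estimate uplift per segment from (segment, treated, responded) records.

    Uplift = response rate of treated customers - response rate of controls.
    Segments missing either group are skipped because no comparison exists.
    """
    # counts[segment][treated] -> [number of responders, group size]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, treated, responded in records:
        cell = counts[segment][treated]
        cell[0] += int(responded)
        cell[1] += 1

    uplift = {}
    for segment, cells in counts.items():
        (t_resp, t_n), (c_resp, c_n) = cells[True], cells[False]
        if t_n == 0 or c_n == 0:
            continue
        uplift[segment] = t_resp / t_n - c_resp / c_n
    return uplift


# Hypothetical data: segment A responds to the treatment, segment B does not.
records = (
    [("A", True, True)] * 3 + [("A", True, False)] * 1
    + [("A", False, True)] * 1 + [("A", False, False)] * 3
    + [("B", True, True)] * 2 + [("B", True, False)] * 2
    + [("B", False, True)] * 2 + [("B", False, False)] * 2
)
result = uplift_by_segment(records)
print(result)  # {'A': 0.5, 'B': 0.0}
```

Segment B has the same response rate with or without the treatment, so spending on it buys nothing; a plain response model would not reveal that, which is the distinction the "influence" wording points at.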
A Journey into bringing (Artificial) Intelligence to the Enterprise - Patrick Deglon
- Dr. Patrick Deglon has a PhD in particle physics and spent 10 years studying the creation of the universe before moving into the business world. He has since held leadership roles in analytics at eBay, Motorola Mobility, and currently Teradata, where he drives Teradata's advanced analytics strategy.
- The document discusses using particle physics methods like combining all possibilities in a "cross-product" to analyze large datasets and extract signals from statistical noise, as well as examples of how these methods have been applied at CERN and in marketing analytics.
- It presents a vision of how cyberphysical systems and artificial intelligence will continue transforming enterprises and society over the coming decades.
This document discusses the need for a new paradigm in big data analytics using algorithms. It begins by describing the limitations of traditional analytics approaches like statistical analysis, data mining, visualization and business intelligence tools when applied to big data. These approaches are query-based and labor intensive. Emerging big data tools like Hadoop and in-memory databases help with storage and queries but do not provide automated insights. The document argues that the new paradigm should focus on algorithms that can automatically surface insights from data in seconds, replacing the need for data analysts to manually query databases. This represents a shift from humans digging for insights to algorithms surfacing insights for humans to evaluate.
This document provides an agenda for a presentation titled "Pictures at an Exhibition: Ruby, Rails, NoSQL and Big Data". The presentation explores solving big data problems using NoSQL databases and Ruby on Rails. It discusses key-value, document, and graph databases as well as MapReduce. Examples and code snippets are provided for Redis, Riak, MongoDB, Cassandra, Neo4J, and using MapReduce with Hadoop, Riak/MongoDB, and Elastic MapReduce. The goal is to show how big data problems typically have one of two solution patterns: using past patterns to predict the future (foresight) or using past events to explain current outcomes (hindsight).
2015 is knocking on the door and will be an exciting and surprising year for the BI industry. However, not everything will be a surprise for Panorama as we are always on top of the latest trends influencing the Business Intelligence community.
• What will the future hold for the industry?
• What are our BI experts' thoughts, predictions and internal assessments on what new directions the Business Intelligence community will see in the coming year?
• Countdown of the most important trends in the industry
A Journey into bringing (Artificial) Intelligence to the EnterprisePatrick Deglon
- Dr. Patrick Deglon has a PhD in particle physics and spent 10 years studying the creation of the universe before moving into the business world. He has since held leadership roles in analytics at eBay, Motorola Mobility, and currently Teradata, where he drives Teradata's advanced analytics strategy.
- The document discusses using particle physics methods like combining all possibilities in a "cross-product" to analyze large datasets and extract signals from statistical noise, as well as examples of how these methods have been applied at CERN and in marketing analytics.
- It presents a vision of how cyberphysical systems and artificial intelligence will continue transforming enterprises and society over the coming decades.
This document discusses the need for a new paradigm in big data analytics using algorithms. It begins by describing the limitations of traditional analytics approaches like statistical analysis, data mining, visualization and business intelligence tools when applied to big data. These approaches are query-based and labor intensive. Emerging big data tools like Hadoop and in-memory databases help with storage and queries but do not provide automated insights. The document argues that the new paradigm should focus on algorithms that can automatically surface insights from data in seconds, replacing the need for data analysts to manually query databases. This represents a shift from humans digging for insights to algorithms surfacing insights for humans to evaluate.
This document provides an agenda for a presentation titled "Pictures at an Exhibition: Ruby, Rails, NoSQL and Big Data". The presentation explores solving big data problems using NoSQL databases and Ruby on Rails. It discusses key-value, document, and graph databases as well as MapReduce. Examples and code snippets are provided for Redis, Riak, MongoDB, Cassandra, Neo4J, and using MapReduce with Hadoop, Riak/MongoDB, and Elastic MapReduce. The goal is to show how big data problems typically have one of two solution patterns: using past patterns to predict the future (foresight) or using past events to explain current outcomes (hindsight).
2015 is knocking on the door and will be an exciting and surprising year for the BI industry. However, not everything will be a surprise for Panorama as we are always on top of the latest trends influencing the Business Intelligence community.
• What will the future hold for the industry?
• What are our BI experts thoughts, predictions and internal assessments on what new directions the Business Intelligence community will see in the coming year?
• Countdown of the most important trends in the industry
This document provides an overview of big data and big data analytics. It defines big data as large, complex datasets that grow quickly in volume and variety. Big data analytics involves examining these large datasets to find patterns and useful information. The challenges of big data include increased storage needs and handling diverse data formats. Hadoop is a framework that allows distributed processing of big data across clusters of computers. Common big data analytics tools include MapReduce, Spark, HBase and Hive. The benefits of big data analytics include improved decision making, customer service and efficiency.
Data is being generated at a feverish pace and forward thinking companies are integrating big data and analytics as part of their core strategy from day one. However, it is often hard to sift through the hype around big data and many companies start with only a small subset of data. Can smaller companies benefit from big data efforts? We will discuss several use cases and examples of how startups are using data to optimize their operations, connect with their users, and expand their market.
Top Business Intelligence Trends for 2016 by Panorama SoftwarePanorama Software
10 top BI trends for 2016 – by Panorama
Its all about the insight
Visual perception rules
The learning suggestive system - AI gets real
The data product chain becomes democratized
Cloud (finally)
“Mobile”
Automated data integration
Interned of things data accelerating into reality
Hadoop accelerators are the last chance for Hadoop
Fading of the centralized on–premise DWH
1) In-memory computing is growing rapidly, with the total data market expected to grow from $69 billion in 2015 to $132 billion in 2020.
2) In-memory databases are gaining popularity for applications that require fast response times, like telecommunications and mobile advertising, as memory access is faster than disk access.
3) Modern applications are driving adoption of in-memory solutions as they generate more data from more users and transactions and require faster performance to handle growing traffic.
4) Two examples presented were DellEMC using MemSQL for a real-time customer 360 application and an IoT logistics application called MemEx that processes sensor data from warehouses for predictive analytics.
This document discusses how big data and analytics are moving from on-premises data warehouses to hybrid cloud environments that leverage technologies like Hadoop, Spark, and machine learning. It provides examples of how Oracle is helping customers with this transition by offering big data cloud services that give them flexibility to run workloads both on-premises and in the cloud while simplifying data management and enabling new types of advanced analytics.
Chad Richeson gave a presentation on harnessing big data. He discussed how nearly every industry is trying to apply big data concepts to improve opportunities, efficiencies, and minimize risk. Examples of big data applications in different industries were provided. Richeson emphasized that successful big data projects require blending analytics, business, and technical skills. He outlined key steps for moving big data projects from development to implementation, including focusing on business goals and gaining user agreement.
Enabling data scientists within an enterprise requires a well-thought out approach from an organization, technology, and business results perspective. In this talk, Tim and Hussain will share common pitfalls to data science enablement in the enterprise and provide their recommendations to avoid them. Taking an example, actionable use case from the financial services industry, they will focus on how Anaconda plays a pivotal role in setting up big data infrastructure, integrating data science experimentation and production environments, and deploying insights to production. Along the way, they will highlight opportunities for leveraging open source and unleashing data science teams while meeting regulatory and compliance challenges.
Watch here: https://bit.ly/3i2iJbu
You will often hear that "data is the new gold". In this context, data management is one of the areas that has received more attention by the software community in recent years. From Artificial Intelligence and Machine Learning to new ways to store and process data, the landscape for data management is in constant evolution. From the privileged perspective of an enterprise middleware platform, we at Denodo have the advantage of seeing many of these changes happen.
Join us for an exciting session that will cover:
- The most interesting trends in data management.
- Our predictions on how those trends will change the data management world.
- How these trends are shaping the future of data virtualization and our own software.
Enable Advanced Analytics with Hadoop and an Enterprise Data HubCloudera, Inc.
This document discusses enabling advanced analytics with Hadoop and an enterprise data hub. It describes current challenges around siloed data and long timelines for analytics projects. An agile analytics process is proposed using an enterprise data hub to break down data silos and deliver insights faster. Case studies are presented on how Monsanto used such a system to automate research and development decisions to reduce product development time from years to months. A second case study describes a system that analyzes mobile and social data to identify suicide risk factors in veterans in real time.
Oracle is a leading technology company focused on database software and cloud computing. It generates revenue from software licenses and cloud services. While Oracle faces competition from other large tech companies, its strengths include consulting services, global sales channels, and expertise in data storage and applications. The rise of big data presents both opportunities and challenges for Oracle to leverage new types and volumes of customer information through its products.
Big data is generated from a variety of sources like web data, purchases, social networks, sensors, and IoT devices. Telecom companies process exabytes and zettabytes of data daily, including call detail records, network configuration data, and customer information. This big data is analyzed to enhance customer experience through personalization, predict churn, and optimize networks. Analytics also helps with operations, data monetization through services, and identifying new revenue streams from IoT and M2M data. Frameworks like Hadoop and MapReduce are used to analyze this distributed big data across clusters in a distributed manner for faster insights.
Why Everything You Know About bigdata Is A LieSunil Ranka
As a big data technologist, you can bet that you have heard it all: every crazy claim, myth, and outright lie about what big data is and what it isn't that you can imagine, and probably a few that you can't.If your company has a big data initiative or is considering one, you should be aware of these false statements and the reasons why they are wrong.
Webinar: BI in the Sky - The New Rules of Cloud AnalyticsSnapLogic
In this webinar, we talk about the shift in data gravity as more and more business applications are moving to the cloud, and how the ability to deliver analytics in the cloud has evolved from idea to enterprise reality with new solutions being announced constantly that appeal to the need for speed, simplicity and access to insight on demand. Joining us in this webinar is David Glueck, Sr. Director of Data Science and Engineering at Bonobos.
To learn more, visit: www.SnapLogic.com/salesforce-analytics
This document provides an introduction and overview of big data for an organization. It begins by outlining the topics that will be covered, including what big data means beyond Hadoop, the historical forces that led to big data, whether big data is just another buzzword, how Canadian companies compare to the world in adopting big data, a reference big data architecture, big data at BMO Financial Group, and the road ahead. It then discusses the origins and definitions of big data, assessing where Canada stands in adoption compared to global leaders. Finally, it outlines challenges organizations face in adopting big data strategies and capabilities.
This document provides an introduction to big data, including definitions of big data and its key characteristics of volume, variety, velocity, variability, and veracity. It discusses big data analysis and how it differs from traditional analytics by examining large, diverse datasets. Hadoop is presented as a popular open-source framework for managing and analyzing big data, and its use by companies like Facebook, LinkedIn, Walmart, and Twitter is described. The document also briefly outlines Hadoop's history and architecture, common Hadoop variants, skills needed to work with Hadoop, and examples of big data case studies.
This document discusses big data analytics and its use in digital marketing. It begins by introducing big data and how early adopters like Google, eBay, and Facebook were built around big data. It then discusses how both individuals and companies now generate and consume large amounts of data. Examples are given of how much data companies like Google and Facebook process daily. The characteristics of big data are described. Traditional analytics are compared to big data analytics. Applications of big data analytics are discussed for various sectors like retail, healthcare, and government. Specific examples are provided of how analytics can provide insights from website visitors. The challenges and power of big data are also summarized before concluding with references.
This document provides an introduction to a training course on big data analytics. It discusses why big data has become important due to the exponential growth in data volume, velocity, and variety. The course aims to focus on cloud-based storage and processing of big data using systems like HDFS, MapReduce, HBase and Storm. It emphasizes that learning involves actively asking questions. Big data is introduced by explaining the three V's of volume, velocity and variety. Examples of big data usage are given in areas like baseball analytics, political campaigns and election predictions. Challenges of big data integration and processing large volumes of heterogeneous data are also covered.
1. John Repko -- Pikasoft LLC
Using Big Data to your Advantage
It’s not just about toy elephants anymore…
March 19, 2013
John Repko – john.repko@pikasoft.com
Source: http://blog.questionpro.com/2012/12/24/market-research-trends-2013-big-data/
2. John Repko -- Pikasoft LLC
Big Data Is Not Just About “Big” Data … It’s About FAST Data!
(http://www.pikasoft.com/journal/2011/5/13/not-big-data-fast-data.html)
Source: http://www.startribune.com/sports/164830346.html
Source: https://thedailyload.files.wordpress.com/2010/12/william_perry.jpg
So How Did We Get to Big Data Anyway?
3. John Repko -- Pikasoft LLC
There Are Big Data Breakthroughs Everywhere…
I’ve Heard About Big Data Successes…
• “Watson” Wins on Jeopardy
– Beat the best Jeopardy players ever
• Google Wins the Search Market
– Massively parallel web searches with results back in a tenth of a second
• Progressive’s Instant “Overnight” Rate Quotes
– Progressive creates an insurance quote for every car and truck in the US – every night
4. John Repko -- Pikasoft LLC
How Can I Determine If These Big Data Wins Apply to My Business?
• Where do I put the data?
• How do I load the system?
• How do I find the value in the data?
• How do I present it?
• How long is this going to take?
• How much is this going to cost?
You Need A Proven Approach to Finding the Value in Your Data
Source: http://www.beingjavaguys.com/2013/01/what-is-big-data-introduction-and.html
5. John Repko -- Pikasoft LLC
The Key is to Recognize That There IS a Pattern to Big Data Wins
• Foresight
– We are presented with a pattern: what has the outcome been when we’ve seen similar patterns in the past?
• Hindsight
– We are presented with an outcome: what pattern of events anticipated that outcome in the past?
The Variety Of Big Data Wins In The Press Fall Into Just Two Solution Patterns
We Don’t Need Dozens Of Solution Approaches For Big Data – Just Two
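The two solution patterns above can be sketched as two lookups over the same history of (pattern, outcome) observations. This is only an illustrative sketch: the `HISTORY` records and the function names `foresight` and `hindsight` are hypothetical and do not come from the presentation.

```python
from collections import Counter

# Hypothetical history of (pattern, outcome) observations; the values are
# invented purely for illustration.
HISTORY = [
    ("late_payments", "default"),
    ("late_payments", "default"),
    ("late_payments", "repaid"),
    ("on_time_payments", "repaid"),
    ("on_time_payments", "repaid"),
]


def foresight(history, pattern):
    """Given a pattern, count the outcomes that followed it in the past."""
    return Counter(outcome for p, outcome in history if p == pattern)


def hindsight(history, outcome):
    """Given an outcome, count the patterns that preceded it in the past."""
    return Counter(p for p, o in history if o == outcome)
```

Both functions scan the same data; only the direction of the question changes, which is the sense in which the slide says two approaches are enough.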
6. John Repko -- Pikasoft LLC
Big Data Wins – Not “10 Problems” But Only 2
1. Modeling True Risk
• What past patterns led to success or default?
2. Customer Churn Analysis
• What do customer churn patterns predict about our products and markets?
3. Recommendation Engine
• We have search terms – what have the results been from similar searches in the past?
4. Ad Targeting
• We have profile information – what offers have led to sales for similar profiles in the past?
5. PoS Transaction Analysis
• We have your purchase history – what deals might we offer in the future?
Summary – 10 Common Hadoop-able Problems*
Foresight Hindsight
In This Light, Let’s Take A Look At The “10 Hadoop-able Problems”
* http://info.cloudera.com/TenCommonHadoopableProblemsWhitePaper.html
7. John Repko -- Pikasoft LLC
Big Data Wins – Not “10 Problems” But Only 2
6. Analyzing Data Logs to Forecast Events
• We have your logs – what patterns of events have anticipated failures before?
7. Threat Analysis
• We have a specific event – what results have we seen from similar threats in the past?
8. Trade Surveillance
• Does this parcel raise any alarms, based on our history of past parcel-tracking?
9. Search Quality
• We have a set of search terms – what have similar searches succeeded in finding in the past?
10. Data “Sandbox”
• We have your data, possibly unstructured data. What patterns in that data might we bring to your attention now?
These Two Solution Types Apply Generally To The Hadoop-able Problems
Summary – 10 Common Hadoop-able Problems*
Foresight Hindsight
8. John Repko -- Pikasoft LLC
Data Warehouse Advanced Analytics Is Expensive and Generally Restricted To Structured Data
• According to Gartner, Enterprise Data will grow 650% by 2014. 85% of these data will be “unstructured data”, with a CAGR of 62%, a growth rate far higher than that of transactional data
• Growth is taking place in areas not well served by RDBMSs and DWs
Source: http://www.vertica.com/writable/knowledge_articles/file/bi_vertica.pdf
Source: http://thecloudtutorial.com/hadoop-tutorial.html
Structured: Managed by RDBMS & DW
Unstructured: Growth Areas Not Managed well by RDBMS or DW
9. John Repko -- Pikasoft LLC
The Tremendous Growth Of Data Is In Unstructured Data That Is Best Managed Outside The RDBMS
Structured: Managed by RDBMS or DW
Unstructured: Not Managed by RDBMS or DW
10. John Repko -- Pikasoft LLC
The New Areas Of Non-RDBMS Managed Data Are Rich In Business Value And Are Ripe For Analysis
Structured: Managed by RDBMS
Unstructured: Not Managed by RDBMS or DW
11. John Repko -- Pikasoft LLC
Big Data Stores Are Increasingly Architected With Open-Source Tools
• Data Integration: Tools which extract, transform, and load data between Relational and Non-Relational datasets.
• NoSQL Data Store: Datasets structured as columnar, key-value, or document-based in order to overcome limitations in traditional relational modeling for ‘Big’ datasets.
• Map Reduce Languages: Higher-level wrapper languages which simplify Map Reduce development efforts.
• Map Reduce Engine / Cloud MapReduce: Processes (‘Map’ and ‘Reduce’ functions) which analyze very large datasets across distributed systems.
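The ‘Map’ and ‘Reduce’ functions mentioned above can be illustrated with a minimal, single-process word count. This is a sketch of the programming model only, not of Hadoop itself: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real engine would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the engine would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Because each call to `map_phase` sees only one document and each call to `reduce_phase` sees only one key, both steps can be distributed without the workers coordinating with each other.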
12. John Repko -- Pikasoft LLC
You Have Data. Here’s What You Need to Unlock It
Needs
• Load the data in a system equipped with the tools to analyze it
– Via a standard interface, or
– Programmatically
• Determine valid relationships in the data
• Analyze the data for these common patterns
• Tune the analytics
• Visualize the results
• Pursue the patterns that emerge
Requirements
• The system has to live where the data lives (otherwise transmission costs become prohibitive)
• REST or SOAP are the most common interfaces
• Bloom Filters can provide fast (probabilistic) set-membership operations in large data sets
• ORM (Object-Relational Mapping) simplifies data access
• Hadoop provides parallelized analysis for unstructured data
• Starfish provides automatic analytics tuning for Hadoop
• Structured data can be analyzed via statistical analysis (for numbers) or free-text search (for text)
• Solution patterns can be applied automatically once the data is sandboxed
• Visualization can help to grasp the key patterns and results
The Right Platform Can Meet All Of These Requirements
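As an illustration of the Bloom filter requirement listed above, here is a minimal sketch of the data structure. The class name, the default sizes, and the use of SHA-256 with a seed prefix are all assumptions made for this example; production implementations pack the bits tightly and size the filter from a target false-positive rate.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, but occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch readable; real filters pack bits.
        self.bits = bytearray(size_bits)

    def _positions(self, item):
        """Derive num_hashes bit positions by hashing the item with a seed."""
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        """Set every bit position derived from the item."""
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        """False means definitely absent; True means possibly present."""
        return all(self.bits[pos] for pos in self._positions(item))
```

A typical use is to test a key against the filter before an expensive lookup: a `False` answer skips the lookup entirely, while a `True` answer still has to be confirmed against the real data.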
13. John Repko -- Pikasoft LLC
Additional Tools: With a Platform for Big Data, We Can Expand Our Analysis with Rich Analytics Tools
Key Big Data Analytics Solution Patterns
1. Predictive Modeling
2. Data Visualization
3. Cluster Partitioning
4. Outlier Analysis
5. AB Testing
6. Markov Chains
These Patterns Provide a Straightforward Way to Find Big Data Wins – Here’s How
Source: http://www.cognizant.com/InsightsCognizantiarticles/Cognizanti_Sow'sEar_Analytics.pdf
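To make one of the listed patterns concrete, the sketch below estimates first-order Markov chain transition probabilities from an observed sequence of states. The function name `transition_probabilities` and the example states are hypothetical; the slide names the pattern but does not give an implementation.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Estimate P(next state | current state) from consecutive pairs."""
    counts = defaultdict(Counter)
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for state, followers in counts.items()
    }
```

For example, feeding in a clickstream such as `["browse", "cart", "browse", "cart", "buy"]` yields the estimated chance of each next step from each state, which is a simple form of the Foresight pattern.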
14. John Repko -- Pikasoft LLC
Big Data And Classic Analysis Patterns Are Creating A New Class Of Enterprise Applications
[Figure: three columns labeled Data Sources, Data Processing, and Data Presentation; legible labels include Google Chart Tools and Public Data Sets on AWS]
These Offerings Emerged In The Consumer Domain And Enterprise Users Are Coming To Have Similar Expectations
15. John Repko -- Pikasoft LLC
But New Applications Will Remain Just Curiosities, “One-Offs” Unless The
Underlying Patterns Are Drawn Out
• There’s Nothing New Here: Hadoop is Turing-complete, as are most general-purpose
processing and analytics packages
• To provide richer insights, tools like Hadoop need more advanced processing patterns:
Basic Patterns
Filtering | Parsing | Counting/Summing | Collating | Sorting | Distributed Tasks | Chained Jobs
Advanced Patterns
Distinct | Group By | Secondary Sorts | Joins | Distributed Sorting
Leading-Edge Work
Classification | Clustering | Regression | Dimension Reduction | Evolutionary Code
To See More Advanced Patterns and Richer Presentation, The Basic
Patterns Must First Become Routine
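As one example from the Advanced Patterns row, here is a sketch of a reduce-side Join written in plain Python. The record layout (customer_id, name, total) and the function name are assumptions for illustration; in Hadoop the three commented steps would be a mapper, the framework's shuffle, and a reducer.

```python
from collections import defaultdict


def reduce_side_join(customers, orders):
    """Inner-join two record sets on customer_id, map/reduce style."""
    # Map: tag each record with its source and key it by customer_id.
    tagged = [(c["customer_id"], ("customer", c)) for c in customers]
    tagged += [(o["customer_id"], ("order", o)) for o in orders]

    # Shuffle: bring all records that share a key together.
    by_key = defaultdict(list)
    for key, record in tagged:
        by_key[key].append(record)

    # Reduce: within each key, pair every customer with every order.
    joined = []
    for key in sorted(by_key):
        names = [r["name"] for tag, r in by_key[key] if tag == "customer"]
        totals = [r["total"] for tag, r in by_key[key] if tag == "order"]
        joined += [(key, name, total) for name in names for total in totals]
    return joined


customers = [{"customer_id": 1, "name": "Ada"}, {"customer_id": 2, "name": "Grace"}]
orders = [
    {"customer_id": 1, "total": 40},
    {"customer_id": 1, "total": 15},
    {"customer_id": 3, "total": 9},
]
rows = reduce_side_join(customers, orders)
print(rows)
```

Keys that appear on only one side (customer 2, order for customer 3) drop out, which is the inner-join behavior; Group By and Distinct follow the same map, shuffle, reduce shape with a different reduce step.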
16. John Repko -- Pikasoft LLC
Software Will Capture the Value of Intellectual Property
2012 Internet Company Valuations as a Multiple of Revenue
• Pure services companies generally yield a company valuation of 0.5 to 1.0x Annual Revenue
• Recurring revenue businesses (hosting, support) typically generate 2.5 – 4.0x Revenue
• Product businesses derive their multiples from growth, product margin, network effects, customer
lock-in, and ecosystem effects; with a good product, valuations of > 5x Revenue are possible
http://abovethecrowd.com/wp-content/uploads/2011/05/pr_mults.png
17. John Repko -- Pikasoft LLC
Capturing Trends – Where Is the IT Industry Headed?
IT Product Breakthroughs Happen When Technology Advances Invalidate “Old”
Product Assumptions. Here Are The Principal Areas Where Old Assumptions
Will Be Obsoleted.
• 5 major trends
– Big Data: Big Data Just Beginning to Explode
– Cloud: Cloud Computing Market Size – Facts and Trends
– In-Memory: The Coming In-Memory Database Tipping Point
– Handheld: Five Emerging Trends in Analytics
– Real-time: Using Analytics to Create a Sense-and-Respond Organization
18. John Repko -- Pikasoft LLC
Capturing Trends – Why Bother? Who Cares?
• Big Data:
– According to Michael Stonebraker and Jeremy Kepner the future of Hadoop is doomed
– According to Mike Miller of Cloudant the days are numbered for Hadoop as we know it
• Cloud:
– Even PCI and HIPAA data is evolving into cloud-hosted models
• In-Memory:
– Spinning disk is "the new tape" (overflow, recovery)
• Handheld:
– Mobile Internet devices will outnumber humans this year, Cisco predicts
• Real-time:
– Future of computing technology belongs to handheld devices
“You can’t just ask customers what they want and then try to give that to them. By the time you get it built,
they’ll want something new. It took us three years to build the NeXT computer. If we’d given customers
what they said they wanted, we’d have built a computer they’d have been happy with a year after we spoke
to them — not something they’d want now.”
~ Steve Jobs
19. John Repko -- Pikasoft LLC
The Cloud Provides a Platform For Do It Yourself Analytics
• Why the cloud matters
– Analytics cannot be “do it yourself” until everyone has access to a platform suitable for
holding and processing Big Data.
– Only the cloud has the scale, speed, and availability to process Big Data universally
• What it gives us that is unique and differentiating
– Big Data projects today are 1) expensive, 2) long lead-time, and 3) run on masses of
local hardware. With inevitable commoditization this has to change.
– The trend is to “do it yourself” analytics – if we build the ability to give do it yourself
analytics, applications will appear that were inconceivable before the environment was
created
• What we need to make happen
– Robustness – at least three nines (99.9%) of availability and zero data loss
– Security – starting with things like 5 Ways Amazon Web Services Protects Cloud Data
– Privacy – where it begins: Complying to the Higher Standard
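For a sense of what the robustness target means in practice, the arithmetic for three nines can be sketched as follows. The function name is made up for this example, and the calculation assumes a 365-day year.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def allowed_downtime_hours(nines):
    """Yearly downtime budget for an availability of N nines (3 -> 99.9%)."""
    unavailability = 10 ** (-nines)  # 3 nines -> 0.001
    return HOURS_PER_YEAR * unavailability


budget = allowed_downtime_hours(3)
print(f"{budget:.2f} hours per year")
```

Three nines therefore leaves a budget of roughly 8.76 hours of downtime per year; each additional nine cuts that budget by a factor of ten.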
20. John Repko -- Pikasoft LLC
Handhelds Make Analytics Available Everywhere
• Why handheld client delivery matters
– There are now more smartphones than client PCs
– More than 25% of users use smartphones for their primary web access
– The future of internet computing is mobile
• What it gives us that is unique and differentiating
– Hadoop is dreadfully mismatched with handheld access (batch, no standard client or
reporting interface)
– Coming in-memory databases (HANA, Vertica, VoltDB) will provide a much better
fit with handheld access
• What we need to make happen
– Make handheld our primary target UI (design for thumbs, not mice … and more)
– Target do-it-yourself analytics use cases
21. John Repko -- Pikasoft LLC
Real-time Makes Previously Unthinkable Apps Possible
• Why real-time matters
– Users increasingly expect real-time analytics
– The first wave of real-time analytics tools is becoming available
• What it gives us that is unique and differentiating
– "Self-service" analytics
– Intuitive and unconstrained data exploration
– Instant visualization of complex datasets
– Viable plays for a variety of asset types
• Credit card debt, Student loan debt, Properties, Insurance, etc.
• What we need to make happen
– If Hadoop – we must evolve to interactive batch execution (or overnight batch, like
Progressive Insurance)
– If in-memory DB – we need to select and groom a handheld interface and design for
sub-100ms response times
22. John Repko -- Pikasoft LLC
Beyond Big Data – The Emerging Big Data Tech Platform
RDBMS In-Memory RDBMS
On-Premise Distributed Cloud
Structured Data DWs Big Data Universal Data
Batch Hadoop Batch Always
Hindsight Foresight
Lumpenprogramming Today Tomorrow
Report Specialists Data Scientists Everyone
Reports
Data
Warehouses
Big Data
DIY
Analytics
For what?
By whom?
What?
With what?
Stored where?
Processed where?
How?
When?
Here’s Where Our World Is Headed
What Happened? Why Did That Happen? What’s Next?
23. John Repko -- Pikasoft LLC
The Future: Here’s What The Evolution Looks Like
Trend | Development Initiatives | Who’s Doing It
Big Data • APIs. No one is likely to reach a market with Big Data analytics
fronted by their own UI. Success will come from API links to
• Level 1: REST Access API
• Level 2: Plug-in API
• Level 3: Runtime environment
Open territory! Infochimps has
Level 1, Amazon (Elastic
MapReduce) has Levels 2 and
3. Who else will play?
Cloud • All of the Cloud players are investigating DB-rich offerings
• VoltDB options with AWS High IO option
• “38% of all companies are planning a BI SaaS project before the end of 2013.”
Everybody: Amazon,
Rackspace, Heroku ...
Accenture
In-Memory • Move demo to DAHANA architecture (not hand-coded)
• Select non-HANA in-memory DB (probably VoltDB) as secondary
platform
• Hadoop evolves from a processing platform to an ETL gateway from
unstructured to structured data
• SAP / HANA
• HP / Vertica
• other NewSQL players
Handheld • Evolving UIs with HTML5 + jQuery Mobile
• Reporting platforms increasingly offer mobile interfaces
• Review Big Data interfaces to iPad and Android devices
Two principal camps -- Apple
iOS and Android
Real-Time • Investigate CDN options for Big Data deployment
• Confirm DB performance on buffer pool, locking, latching, recovery
• Design for sub-100ms delivery
Just getting started...
24. John Repko -- Pikasoft LLC
• Vision:
– Target Audience: Product Executives
– Anticipated Benefit: Keep up with market
leader Amazon, build up-sell and cross-sell
revenue
– Delivered Benefit: Better market
segmentation, enhanced revenue through
“customers who bought xxx also bought...”
recommendations.
– Alternatives: CRM recommendations do
not draw on a deep sense of customer intent
– Why It Kills: Provable revenue growth
through A-B testing
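The "provable revenue growth" claim rests on a significance test between the two arms of an A-B test. A minimal sketch, assuming conversion counts per arm and using a standard two-proportion z-test (the function name and the sample figures are made up for illustration):

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of arm A (control) and arm B (variant)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value


# Hypothetical pilot: 10,000 shoppers per arm, B sees recommendations.
z, p_value = two_proportion_z_test(conv_a=200, n_a=10_000, conv_b=260, n_b=10_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the lift from recommendations is unlikely to be chance, which is what turns an anticipated benefit into a demonstrated one.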
Today’s Killer Apps: Recommendation Engine For Enhanced Retail Marketing
• How to Implement It:
– Proof of Concept: Small cloud-based recommendation
engine, based on readily-available (customer profile,
purchase history) data stores
– Initial Rollout: Still cloud-based, but with broader
streams (e.g. search histories) and dynamic updates
– Test and Customer Acceptance: Pilot program with
configuration from the Initial Rollout, but now tied (on a
limited basis) into retailing process and systems
– Full Rollout: Could be cloud or in-house, but moving to
richer streams and real-time (i.e. in-memory) data access
– Maintenance: Tools updates, streams updates, transition
to real-time data access
Today’s Tools: The Killer Apps
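A proof of concept for "customers who bought xxx also bought..." can start from purchase history alone. The sketch below counts item co-occurrence within baskets; it is one simple approach under assumed data, not the method of any particular retailer, and the function names and sample baskets are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_also_bought(baskets):
    """Count, for each item, how often every other item shares its basket."""
    also_bought = defaultdict(Counter)
    for basket in baskets:
        for item, other in permutations(set(basket), 2):
            also_bought[item][other] += 1
    return also_bought


def recommend(also_bought, item, top_n=2):
    """Return the top_n items most often bought together with item."""
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(also_bought[item].items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:top_n]]


baskets = [  # made-up purchase histories
    ["camera", "tripod", "sd_card"],
    ["camera", "sd_card"],
    ["camera", "tripod", "bag"],
    ["laptop", "bag"],
]
model = build_also_bought(baskets)
suggestions = recommend(model, "camera")
print(suggestions)
```

The later rollout stages in the list above would feed broader streams such as search histories into the same counting step and refresh the counts dynamically.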
25. John Repko -- Pikasoft LLC
• Vision:
– Target Audience: High end retailers with profitable
service contracts (e.g. computers, cameras, sound
systems)
– Anticipated Benefit: Increase penetration rate of
service contracts by pre-calculating terms in advance of
sale or service renewal
– Delivered Benefit: Reward customers who have historically
low service costs, and increase penetration of profitable
service deals by pre-calculation of ideal rates
– Alternatives: Consumers generally know one-size-fits-all
service contracts are overpriced. If you can’t fit the terms
to the customer then you can’t complete the service
contract
– Why It Kills: Big data approach pre-calculates
appropriate terms for all customers in advance of a sales
or renewal transaction
• How to Implement It:
– Proof of Concept: Small cloud-based run with limited data
sets to confirm data adoption approaches and identify most
profitable segments in that sub-population
– Initial Rollout: Still cloud-based, but with larger data sets
and dynamic updates
– Test and Customer Acceptance: Pilot program with
configuration from the Initial Rollout, but now tied (on a limited
basis) into promotions and targeted marketing
– Full Rollout: Could be cloud or in-house, but moving to
larger data stores, real-time (i.e. in-memory) data access and
notifications across the full customer set
– Maintenance: Tools updates, stores updates, transition to
real-time data access and notifications
Today’s Tools: The Killer Apps
Today’s Killer Apps: Analysis and Prediction Engine
26. John Repko -- Pikasoft LLC
• Vision:
– Target Audience: Utilities executives
– Anticipated Benefit: Sell an energy or utilities package
that better fits customer interests and reduces customer
costs while increasing energy/utility margins
– Delivered Benefit: Customer gets a package that
better fits their specific interests (e.g. “green”) and exec
sells higher-margin offerings
– Alternatives: A one-size-fits-all plan does not capture
customer interests or deliver high-margin offerings well
– Why It Kills: More customized packages better fit
customer needs while reducing capital expenses and
increasing margins for the utility
• How to Implement It:
– Proof of Concept: Small cloud-based run with limited
data sets to capture basic patterns and confirm data
adoption approaches
– Initial Rollout: Still cloud-based, but with larger data
stores and dynamic updates
– Test and Customer Acceptance: Pilot program with
configuration from the Initial Rollout, but now tied (on a
limited basis) into production logs with reporting
– Full Rollout: Could be cloud or in-house, but moving to
larger data stores, real-time (i.e. in-memory) data access
and notifications
– Maintenance: Tools updates, stores updates, transition
to real-time data access and notifications
Today’s Tools: The Killer Apps
Today’s Killer Apps: Log Analysis Engine
27. John Repko -- Pikasoft LLC
This Is Only The Beginning. With A Standard
Platform We’ll See Richer Big Data
Discoveries Become Routine
The Solution Tools (Slide 13) Become
Straightforward if We Run Them on a
Standard Architecture
“One man’s noise is another man’s data.”
~ Bill Stensrud - InstantEncore
Summary
28. John Repko -- Pikasoft LLC
• John Repko: john.repko@pikasoft.com - (720) 624-6025
Contacts
https://pikasoft.s3.amazonaws.com/Using_Big_Data_To_Your_Advantage.ppt