With many organisations considering getting on the Hadoop bandwagon, this document provides an overview of the planned use cases for Hadoop, an illustration of some of the common technology components, suggestions on when Hadoop is worth considering, some of the challenges organisations are experiencing, cost considerations and, finally, how an organisation should position itself for a Big Data initiative. Any organisation considering a Big Data initiative with Hadoop should thoroughly consider each of these areas before embarking on a course of action.
Acknowledgment
This report draws extensively on, and focuses on, the work and viewpoints of industry participants, including:
Diversity Limited
Economist Intelligence Unit
Gartner
HBR
Hortonworks
IBM
ITG
Intel
McKinsey
Ordnance Survey
John Standish Consulting
Christopher Bienko @ IBM
Dirk deRoos @ IBM
John Choi @ IBM
Marc Andrews @ IBM
Paul Zikopoulos @ IBM
Rick Buglio @ IBM
Strategy Meets Action
References are included in-text as well as in the References section at the end of the report.
Challenges facing the industry
Difficult and uncertain economic conditions, low interest rates, decreasing underwriting profitability,
higher combined ratios and low investment returns are placing insurers under stress. Insurers also
have to confront commoditisation of the business, more informed consumers, high customer churn
rates, new distribution channels and strong competition. As if this were not enough, natural perils, increases in regulatory intervention and greater demands for transparency by regulators, together with ever-increasing compliance requirements, are placing immense strain on the capabilities of insurers.
According to IBM (2013), to thrive in this environment insurers must gain a specific set of capabilities
that will allow them to:
- Build a customer-centric business model
- Find profitable ways to sustain growth
- Develop new, competitively priced products
- Increase claims efficiency and effectiveness
- Improve capital management and investment decisions
- Improve risk management and regulatory reporting
(IBM, 2013, pg. 2)
Insurers are turning to analytics
The business of insurance is based on analysing data to understand and evaluate risks. Two important
insurance professions, actuarial and underwriting, emerged at the beginning of the modern insurance
era in the 17th century. These both revolve around and are dependent upon the analysis of data.
(Strategy Meets Action, 2012, pg. 3)
While the insurance industry has long been recognized for analysing data, what is new is the overwhelming amount of data now available for analysis and the sophistication of the
technology tools that can be used to perform the analysis. The opportunities for advanced analysis
are many and the potential business impact is enormous.
(Strategy Meets Action, 2013, pg. 3)
The Concept of Big Data
In simple terms Big Data refers to a data environment that cannot be handled by traditional
technologies.
Big Data is often described in terms of the three V's, and if you are at IBM, it is likely to be the four V's. Figure 1 below illustrates the IBM four-V representation of Big Data:
Figure 1: Big Data in dimensions
Figure 1. Four dimensions of big data. Copyright 2012 by IBM. Reprinted with permission.
Volume refers to the quantity (gigabytes, terabytes, petabytes etc.) of data that organizations are trying to harness. Importantly, there is no specific measure of volume that defines Big Data, as what constitutes truly “high” volume varies by industry and even geography. What is clear is that data volumes continue to rise.
Variety refers to different types (forms) of data and data sources. Data types include numeric, text, image, audio, web and log files, whether structured or unstructured. The growth of data sources such as social media, smart devices, sensors and the Internet of Things has not only resulted in increases in the volume of data but increases in the types of data as well.
Velocity refers to the speed at which data is created, processed and analysed. Velocity impacts latency, which is the lag time between when data is created or captured and when it is processed into an output form for decision-making purposes. Importantly, certain types of data must be analysed in real time to be of value to the business, a task that places impossible demands on traditional systems, where the ability to capture, store and analyse data in real time is severely limited.
Veracity refers to the level of reliability associated with certain types of data. According to IBM some
data is inherently uncertain, for example: sentiment and truthfulness in humans; GPS sensors
bouncing among the skyscrapers of Manhattan; weather conditions; economic factors; and the future.
When dealing with these types of data, no amount of data cleansing can correct for it. Yet despite
uncertainty, the data still contains valuable information. The need to acknowledge and embrace this
uncertainty is a hallmark of Big Data. (IBM, 2012, pg. 5)
The Big Data Impact
According to McKinsey (2011), Big Data creates value in several ways:
- Creating transparency
- Enabling experimentation to discover needs, expose variability, and improve performance
- Segmenting populations to customize actions
- Replacing/supporting human decision making with automated algorithms
- Innovating new business models, products, and services
To understand the impact at an organisational level, Erik Brynjolfsson and a team at MIT, working in partnership with McKinsey, Lorin Hitt at Wharton and MIT doctoral student Heekyung Kim, conducted structured interviews with executives at 330 public North American companies about their organizational and technology management practices, and gathered performance data from their annual reports and independent sources.
Based on the analyses they conducted, one relationship stood out: The more companies characterized
themselves as data-driven, the better they performed on objective measures of financial and
operational results. In particular, companies in the top third of their industry in the use of data-driven
decision making were, on average, 5% more productive and 6% more profitable than their
competitors. This performance difference remained robust after accounting for the contributions of
labour, capital, purchased services, and traditional IT investment. (HBR, 2012)
Further, an IBM study based on survey responses of more than 1,000 business and IT executives from more than 60 countries revealed four transformative shifts in the use of Big Data:
1. A solid majority of organizations are now realizing a return on their Big Data investments
within a year.
2. Customer centricity still dominates analytics activities, but organizations are increasingly
solving operational challenges using Big Data.
3. Integrating digital capabilities into business processes is transforming organizations.
4. The value driver for Big Data has shifted from volume to velocity.
(IBM, 2014, pg. 1)
While Big Data has resulted in significant opportunity, it has also brought new challenges. According to Zikopoulos, deRoos, Bienko, Buglio and Andrews (2014), some challenges include:
- Greater volumes of data than ever before
Placing more demands on the organisation's security plan.
- The experimental and analytical usage of the data
Democratizing data within the organisation requires building trust into the Big Data
platform. A data governance framework covering lineage, ownership etc. is required for any
successful Big Data project.
- The nature and characteristics of Big Data
The data consists of more sensitive personal details than ever before raising governance, risk
and compliance concerns.
- The adoption of technologies that are still maturing
Big Data technologies like Hadoop (and much of the NoSQL world) do not have all of the
enterprise hardening from a security perspective that’s needed, and there’s no doubt
compromises are being made.
A look at Big Data in Insurance
Exploration and discovery
Big Data necessitates an approach of exploration and discovery. As articulated by Gartner (2013),
business analysts have typically worked to a requirements-based model, answering clearly-defined
business questions. Big Data, however, demands a different approach, using opportunistic analytics
and exploring answers to ill-formed or non-existent questions. (Gartner, 2013, pg. 1)
Figure 2: Culture change - Discovery versus control
Figure 2. A better assessment of the data around and connected to a single piece of information enables a more complete,
in-context understanding. Copyright 2013 by IBM. Reprinted with permission.
Moving to a data-driven culture
Gartner (2014) has found that many insurance IT departments lack a consistent, enterprise-wide business intelligence and data management strategy, because of siloed, line-of-business-centric IT systems. (Gartner, 2014, pg. 6)
In embracing the Big Data paradigm, the Economist Intelligence Unit (2013) suggests moving towards what it calls a “data-driven culture”. According to the report, in promoting a data-driven culture organisations should consider the following:
- Data-driven companies place a high value on sharing. Companies own data, not employees. Data
are a resource that can power growth, not something to be hoarded.
- Shared data should be utilised by as many employees as possible, which in practice means rolling
out training wherever it is needed.
- Data collection needs to be a primary activity across departments
- Perhaps most importantly, implementing a data driven culture requires buy-in from the top;
without that, little will change.
(Economist Intelligence Unit, 2013, pg. 11)
Emerging techniques in Big Data on the insurance front
According to Ordnance Survey (2013), the following are some of the emerging techniques being deployed by insurers:
- Predictive modelling: already well used by insurance companies, this works even better when
more data is fed into the model.
- Data-clustering: automated grouping of similar data points can provide new insights into apparently familiar situations (a minimal sketch follows this list). Livehoods.org is an example of how social media and ‘machine learning’ can reveal previously-unseen patterns.
- Sentiment analysis: textual keyword analysis can help analyse the mood of Twitter chatter on a
given topic or brand.
- Web crawling: sophisticated programmes that can identify an individual’s ‘web footprint’ as a
result of posting on social media websites, blogs and photo-sharing services. Using data-matching,
this can be linked to public records and data from other third parties to build a multi-dimensional
profile of an individual.
(Ordnance Survey, 2013, pg. 22)
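As a minimal illustration of the data-clustering technique above, the following Python sketch groups invented claims records using scikit-learn's KMeans; the features, values and cluster count are assumptions for illustration only, not drawn from any insurer.

# A minimal data-clustering sketch on invented claims data.
# Assumes NumPy and scikit-learn are installed; features are hypothetical.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Toy claims: [claim amount, claimant age, days from policy start to claim]
claims = np.array([
    [1200.0, 45, 300],
    [1500.0, 50, 250],
    [9800.0, 23, 12],
    [11200.0, 25, 9],
    [1300.0, 47, 280],
    [10500.0, 22, 15],
])

# Scale features so no single unit dominates the distance calculation
scaled = StandardScaler().fit_transform(claims)

# Group the claims into two clusters; similar claims receive the same label
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scaled)
print(labels)  # e.g. [0 0 1 1 0 1]

In practice an analyst would then inspect each cluster for unexpected groupings, for example large claims filed shortly after policy inception.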
Data protection, a lurking risk
In addition to the transformative shifts in the use of Big Data mentioned earlier, the same IBM report
found that respondents rated data protection lowest on the list of data priorities; only 11 percent of respondents identified it as a “top three” priority. Given the proliferation of large-scale data breaches in recent years, organizations that do not take adequate precautions to safeguard data risk losing the confidence of customers and business partners, as well as incurring legal and remediation fees. Moreover, business leaders should thoughtfully consider how their organizations use data, to minimize any potential backlash from perceived privacy infringement. (IBM, 2014, pg. 9)
Skills gap
The Big Data environment requires a skill set that is new to most organisations: people with deep expertise in statistics and machine learning, as well as managers and analysts who know how to operate companies using insights from Big Data.
According to McKinsey (2011), the United States alone faces a shortage of 140,000 to 190,000 people
with deep analytical skills as well as 1.5 million managers and analysts to analyse Big Data and make
decisions based on their findings.
In addressing the skills gap, IBM (2014) suggests organisations should consider the following:
Learn from the best within your organization.
- Tap into the pockets of talent within the organization - those few using predictive or
prescriptive analytics - to expand the skills of others.
- Create a strong internal professional program to arm analysts and executives who already
understand the organization’s business fundamentals with analytics. Sharing resources and
knowledge is a cost-effective way to build skills and helps limit the need to seek talent
elsewhere.
Externally supplement skills based on business case.
Not all organizations need a data scientist full time; the same is true for niche analytics skills that may
be used only to solve specific challenges.
- Organizations should invest in the talent and skills they need to meet the majority of their analytics demands.
- Consider vendors to supplement critical niche skills that are hard to find and expensive to
employ.
(IBM, 2014, pg. 15)
Big Data technologies
Apache Hadoop is the starting point for most organizations wanting to take the plunge into Big Data
analysis.
The Hadoop ecosystem
In their book, Big Data Beyond the Hype, Zikopoulos, deRoos, Bienko, Buglio and Andrews (2014)
classify Hadoop as an ecosystem of software packages that provides a computing framework. These
include MapReduce, which leverages a K/V (key/value) processing framework (don’t confuse that with
a K/V database); a file system (HDFS); and many other software packages that support everything from
importing and exporting data (Sqoop) to storing transactional data (HBase), orchestration (Avro and
ZooKeeper), and more.
When you hear that someone is running a Hadoop cluster, it’s likely to mean MapReduce (or some
other framework like Spark) running on HDFS, but others will be using HBase (which also runs on
HDFS). Vendors in this space include IBM (with BigInsights for Hadoop), Cloudera, Hortonworks, MapR,
and Pivotal. On the other hand, NoSQL refers to non-RDBMS SQL database solutions such as HBase,
Cassandra, MongoDB, Riak, and CouchDB, among others.
(Zikopoulos, deRoos, Bienko, Buglio, Andrews, 2014, pg. 38)
Key components of many Big Data environments:
MapReduce
MapReduce is a system for parallel processing of large data sets.
According to IBM (2015), as an analogy, you can think of map and reduce tasks as the way a census was conducted in Roman times, where the census bureau would dispatch its people to each city in the empire. Each census taker in each city would be tasked to count the number of people in that city and then return their results to the capital city. At the capital, the results from each city would be reduced to a single count (sum of all cities) to determine the overall population of the empire. This mapping of people to cities, in parallel, and then combining the results (reducing) is much more efficient than sending a single person to count every person in the empire in a serial fashion. (IBM, 2015)
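To make the census analogy concrete, here is a minimal single-process Python sketch of the same idea; the cities and populations are invented, and a real Hadoop job would distribute the map tasks across cluster nodes.

# The census analogy expressed as map and reduce steps (toy sketch).
from functools import reduce

# Hypothetical per-city census returns
cities = {"Rome": 450_000, "Alexandria": 300_000, "Antioch": 250_000}

# Map: each "census taker" produces a count for one city; on a cluster
# these tasks would run in parallel on the nodes holding the data
counts = map(lambda city_pop: city_pop[1], cities.items())

# Reduce: the "capital" combines the per-city counts into a single total
total = reduce(lambda a, b: a + b, counts)
print(total)  # 1000000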
Hadoop
MapReduce is the heart of Hadoop. Hadoop is an open source software stack that runs on a cluster of
machines. Hadoop provides distributed storage and distributed processing for very large data sets.
NoSQL
NoSQL is a database environment. Using the definition from Planet Cassandra (2015), a NoSQL
database environment is, simply put, a non-relational and largely distributed database system that
enables rapid, ad-hoc organization and analysis of extremely high-volume, disparate data types.
NoSQL databases were developed in response to the sheer volume of data being generated, stored
and analyzed by modern users (user-generated data) and their applications (machine-generated data).
(Planet Cassandra, 2015)
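As an illustrative sketch of this kind of schema-flexible, ad-hoc storage (assuming a local MongoDB instance and the pymongo driver; the database, collection and fields below are invented):

# A minimal document-store (NoSQL) sketch using pymongo.
# Assumes MongoDB is running locally; all names and fields are hypothetical.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
claims = client["insurance"]["claims"]

# Documents in the same collection need not share a fixed schema
claims.insert_one({"claim_id": 1, "amount": 1200.0, "type": "auto"})
claims.insert_one({"claim_id": 2, "amount": 9800.0, "type": "auto",
                   "telematics": {"hard_brakes": 14, "night_miles": 310}})

# Ad-hoc query over whatever structure the documents happen to have
for doc in claims.find({"amount": {"$gt": 5000.0}}):
    print(doc["claim_id"], doc["amount"])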
Spark
What is Spark and what does it mean for Hadoop?
IBM (2014) refers to Spark as an open source engine for fast, large-scale data processing that can be
used with Hadoop, boasting speeds up to 100 times faster than Hadoop MapReduce in memory, or 10
times faster on disk. As with the early enthusiasm around Hadoop, Spark should not be thought of as
a singular platform for analytics, as it can be used with existing investments for the widest variety of
data types and analytics workloads. (IBM, 2014)
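A minimal PySpark sketch of the in-memory processing described above (assuming a local Spark installation; the file name and columns are invented):

# A minimal in-memory processing sketch with PySpark.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ClaimsDemo").getOrCreate()

# Read a hypothetical claims file once and cache it in memory, so that
# repeated queries avoid rereading from disk - the source of Spark's
# speed advantage over disk-bound MapReduce jobs
claims = spark.read.csv("claims.csv", header=True, inferSchema=True).cache()

# Two passes over the same cached data
print(claims.filter(claims["amount"] > 5000).count())
claims.groupBy("type").count().show()

spark.stop()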
Figure 3: Example of a Big Data environment
Figure 3. Application Enrichment with Hadoop. Copyright 2013 by Hortonworks Inc. Reprinted with permission.
The impact of Hadoop
According to IBM (2015), Hadoop changes the economics and dynamics of large-scale computing by
enabling a solution that is:
- Scalable: Add new nodes as needed without changing data formats, how data is loaded, how
jobs are written or the applications on top.
- Cost-effective: Hadoop brings massively parallel computing to commodity servers. The result
is a significant decrease in the cost per terabyte of storage, which in turn makes it affordable
to model all your data.
- Flexible: Hadoop is schema-less, and can absorb any type of data, structured or not, from a
number of sources. Data from multiple sources can be joined and aggregated in arbitrary
ways, enabling deeper analyses than any one system can provide by itself.
- Fault-tolerant: When you lose a node, the system redirects work to another location of the
data and continues processing without missing a beat.
(IBM, 2015, pg. 2)
Hadoop challenges
Hadoop is not without its own set of challenges. According to IBM (2014), there are four key areas of Hadoop that need to mature in order to drive wider adoption. These include:
1) Performance
2) Reduction of the skills required
3) Data governance
4) Deep integration with existing technologies
(IBM, 2014)
Along similar lines, TDWI Research (2015), in a recent survey, found respondents struggling with the following barriers to Hadoop implementation:
- Skills gap
- Weak business support
- Security concerns
- Data management hurdles
- Tool deficiencies
- Containing costs
(TDWI Research, 2015)
According to a study by the International Technology Group, organisations need to be particularly mindful of the highly skilled programming requirements demanded by most Hadoop environments. The study notes that:
Although the field of players has since expanded to include hundreds of venture capital-funded start-ups, along with established systems and services vendors and large end users, social media businesses continue to control Hadoop. Most of the more than one billion lines of code in the Apache Hadoop stack – more than 90 percent, according to some estimates – have to date been contributed by these businesses.
The priorities of this group have inevitably influenced Hadoop evolution. There tends to be an
assumption that Hadoop developers are highly skilled, capable of working with “raw” open source
code and configuring software components on a case-by-case basis as needs change. Manual coding
is the norm.
Decades of experience have shown that, regardless of which technologies are employed, manual
coding offers lower developer productivity and greater potential for errors than more sophisticated
techniques.
(ITG, 2013, pg. 2)
Big Data in the context of traditional technologies
The Big Data environment has been brought about by advances in technology that enable the processing and storage of data whose volume, variety, velocity and veracity are beyond the capabilities of traditional technology.
Big Data supplements traditional systems
As illustrated in Figure 3, the Big Data environment supports traditional technology, extending
capabilities into areas previously unsupported.
Gartner (2013) suggests that Big Data doesn't replace traditional data and analytics:
“… big data technologies are not really replacing incumbents such as business intelligence, relational
database management systems and enterprise data warehouses. Instead, they supplement traditional
information management and analytics.” (Gartner, 2013, pg. 13)
Examples of three insurance use cases with Big Data
According to Gartner (2013), Big Data and the associated technology have been shown to provide the
following benefits:
- Detection and prevention of fraud or other security violations
- High ROI
- Little operational disruption
(Gartner, 2013, pg. 5)
Big Data to fight fraud
According to John Standish Consulting (2013), mobilizing Big Data is gaining wider attention in anti-fraud circles. Insurers are sitting on troves of data, hard and soft. Much is never accessed for fraud-fighting. Insurers can dramatically increase their anti-fraud assertiveness by insightfully accessing, analyzing and mobilizing their large volumes of untapped data.
Marshaling analytics and big data with current rules and indicators into a seamless and unified anti-fraud effort creates an expansive world of possibilities.
- Imagine the ability to search a billion rows of data and derive incisive answers to complex
questions in seconds.
- Imagine being able to comb through huge numbers of claim files quickly.
- Imagine more quickly linking numerous ring members and entities acting in well-disguised concert. These suspects likely could not be detected with sole or even primary reliance on basic methods such as fraud indicators.
- Ultimately, imagine analyzing entire caseloads faster and more completely, thus addressing
the largest fraud problems and cost drivers in any of an insurer’s coverage territories.
(Standish, 2013)
Case study: Fraud at IBC
The Insurance Bureau of Canada (IBC) is the national insurance industry association representing
Canada’s home, car and business insurers. Because investigation of cases of suspected automobile
insurance fraud often took several years, the association's investigative services division wanted to accelerate the process. The IBC worked with IBM to conduct a proof of concept (POC) in Ontario,
Canada that explored new ways to increase the efficiency of fraud identification. The POC showed
how IBM solutions for big data can help identify suspect individuals and flag suspicious claims. IBM
solutions also help users visualize relationships and linkages to increase the accuracy and speed of
discovering potential fraud. In the POC, more than 233,000 claims from six years were analyzed. The
IBM solutions identified more than 2,000 suspected fraudulent claims with a value of CAD41 million.
IBM and the IBC estimate that these solutions could save the Ontario automobile insurance industry
approximately CAD200 million per year.
(IBM, 2012)
Big Data for customer segmentation
Case study: Customer segmentation at Progressive
In July 2012, Progressive Insurance released new findings from an analysis of five billion real-time
driving miles, confirming that driving behaviour has more than twice the predictive power of any other
insurance rating factor. Loss costs for drivers with the highest-risk driving behaviour are approximately
two-and-a-half times the costs for drivers with the lowest-risk behaviour. These results suggest that
car insurance rates could be far more personalized than they are today.
Progressive has also found that 70% of drivers who have signed up for its Snapshot UBI program pay
less for their insurance. The program involves installing a small monitoring device in the car (900,000
drivers have already done this) and driving normally. After the device has collected enough data,
customers receive a personalized rate for their insurance. Progressive is currently expanding access to
Snapshot to all drivers - not just Progressive customers - who can take a free test drive of the
technology and after 30 days find out whether their own driving behaviour can lower the price they
pay for insurance.
The problem with today's less granular systems of customer classification in the property and casualty
insurance market is that the majority of drivers who present a lower risk subsidize the minority of
higher-risk drivers.
(Gartner, 2013, pg. 5)
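As a rough illustration of how telematics data can feed a personalized rate, the sketch below reduces driving summaries to per-mile behaviour rates and combines them into a multiplier on a base premium. The event types, weights, offset and base rate are invented for illustration; this is not Progressive's actual Snapshot model.

```python
# Toy usage-based pricing sketch: behaviour summaries -> premium multiplier.
def risk_multiplier(miles, hard_brakes, night_miles):
    """Map driving-behaviour summaries to a pricing multiplier around 1.0."""
    brake_rate = hard_brakes / max(miles, 1) * 100  # hard brakes per 100 miles
    night_share = night_miles / max(miles, 1)       # fraction of miles at night
    score = 0.15 * brake_rate + 0.5 * night_share   # assumed behaviour weights
    # Offset so typical low-risk driving earns a discount; clamp the range so
    # behaviour can at most halve or double the premium.
    return min(max(1.0 + score - 0.2, 0.5), 2.0)

base_premium = 1200.0  # illustrative annual base rate
profiles = {
    "low-risk":  dict(miles=8000, hard_brakes=10,  night_miles=200),
    "high-risk": dict(miles=8000, hard_brakes=240, night_miles=2400),
}
for name, data in profiles.items():
    print(f"{name}: {base_premium * risk_multiplier(**data):.2f}")
```

Even this toy model produces the qualitative pattern the Progressive findings describe: behaviour spreads prices apart, so low-risk drivers stop subsidizing high-risk ones.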
Big Data for underwriting
Case study: Improving underwriting decisions
A large global property casualty insurance company wanted to accelerate catastrophe risk modelling
in order to improve underwriting decisions and determine when to cap exposures in its portfolio. The
current modelling environment was too slow and unable to handle the large-scale data volumes that
the company wanted to analyze. The goal was to run multiple scenarios and model losses in hours,
but the current environment required up to 16 weeks. As a result, the company conducted analysis
only three or four times per year. A proof of concept demonstrated that the company could improve
performance by 100 times, accelerating query execution from three minutes to less than three
seconds.
The company decided to implement IBM solutions for big data, and can now run multiple catastrophe
risk models every month instead of only three or four times per year. Once data is refreshed, the
company can create “what-if” scenarios in hours rather than weeks. With a better and faster
understanding of exposures and probable maximum losses, the company can take action sooner to
change loss reserves and optimize its portfolio.
(IBM, 2013, pg. 7)
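To make the "what-if" workflow concrete, the sketch below simulates annual catastrophe losses with a Monte Carlo model, estimates a probable maximum loss (PML) at a return period, and reruns the model with a capped per-event exposure. The event frequency, severity distribution, cap and return period are illustrative placeholders, not the company's actual model.

```python
# Minimal Monte Carlo sketch of catastrophe "what-if" modelling, assuming
# Poisson event frequency and lognormal event severity.
import numpy as np

rng = np.random.default_rng(42)
n_years = 100_000  # simulated years

def annual_losses(per_event_cap=None):
    """Simulate total insured loss per year, optionally capping each event."""
    counts = rng.poisson(lam=0.8, size=n_years)  # events per year (assumed)
    losses = np.zeros(n_years)
    for year in np.nonzero(counts)[0]:
        events = rng.lognormal(mean=15.0, sigma=1.2, size=counts[year])
        if per_event_cap is not None:
            events = np.minimum(events, per_event_cap)  # "what-if": cap exposure
        losses[year] = events.sum()
    return losses

for label, cap in [("uncapped", None), ("capped at 50M per event", 50e6)]:
    pml_250 = np.quantile(annual_losses(cap), 1 - 1 / 250)
    print(f"{label}: 250-year probable maximum loss = {pml_250 / 1e6:.1f}M")
```

The speed gains in the case study come from running exactly this kind of scenario loop over far larger portfolios on a distributed platform, so each rerun takes hours rather than weeks.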
Costs associated with typical Big Data implementations
Although a Big Data environment such as that illustrated in Figure 3 can be constructed from open
source software, such as Hadoop and a NoSQL database such as MongoDB, there are still substantial
costs involved. These include:
1) Hardware costs
2) IT and operational costs in setting up a machine cluster and supporting it
3) Cost of personnel to work on the ecosystem
These costs are NOT trivial, for the following reasons (a rough cost sketch follows this list):
- Dealing with cutting edge technology and finding people who know the technology is
challenging
- The technology introduces a different programming paradigm, frequently requiring additional
training of existing engineering teams
- These technologies are new and still evolving and are not yet mature in the enterprise
ecosystem
- The hardware is server-grade, and large clusters require resources such as network administration, security administration and system administration, as well as data centre operational costs such as electricity and cooling.
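The sketch below turns the cost categories above into a back-of-envelope annual estimate. Every figure is an illustrative placeholder to be replaced with real quotes; nothing here is a benchmark.

```python
# Back-of-envelope annual cost for a self-hosted cluster. EVERY figure below
# is an illustrative placeholder, not a benchmark; substitute real quotes.
nodes = 20                    # assumed cluster size
server_cost = 8_000           # assumed price per server-grade node (USD)
hw_amortised = nodes * server_cost / 3    # 3-year straight-line amortisation
ops_per_node = 1_500          # assumed annual power/cooling/rack space per node
admin_fte, engineer_fte = 1.0, 2.0        # assumed headcount on the cluster
fte_cost = 120_000            # assumed fully loaded annual cost per person

annual_total = (hw_amortised
                + nodes * ops_per_node
                + (admin_fte + engineer_fte) * fte_cost)
print(f"Illustrative annual cost, {nodes}-node cluster: ${annual_total:,.0f}")
```

Even with placeholder figures, the exercise shows that personnel, not hardware, typically dominates the total, which is one reason the IaaS option below is attractive.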
Infrastructure as a Service (IaaS)
One consideration that can mitigate the cost implications of hardware and support personnel is the
use of a cloud offering. As pointed out by Intel (2015) clouds are already deployed on pools of
server, storage, and networking resources and can scale up or down as needed. Cloud computing
offers a cost-effective way to support Big Data technologies and the advanced analytics applications
that can drive business value.
Diversity Limited (2010) defines Infrastructure as a Service (IaaS) as “a way of delivering Cloud
Computing infrastructure – servers, storage, network and operating systems – as an on-demand
service. Rather than purchasing servers, software, datacenter space or network equipment,
organisations instead buy those resources as a fully outsourced service on demand.”
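As one concrete illustration of the IaaS model - renting a cluster on demand rather than buying servers - the sketch below provisions a small Hadoop/Spark cluster on AWS EMR via boto3. AWS is used purely as an example; the region, instance types, counts, release label and IAM roles are assumptions that would need adapting to a given account.

```python
# Illustrative only: provisions a small on-demand Hadoop/Spark cluster on AWS
# EMR with boto3. Requires AWS credentials and the default EMR roles; the
# instance types, counts and release label are assumptions, not recommendations.
import boto3

emr = boto3.client("emr", region_name="us-east-1")  # assumed region
response = emr.run_job_flow(
    Name="bigdata-poc",                     # hypothetical cluster name
    ReleaseLabel="emr-6.15.0",              # assumed release; pick one available
    Applications=[{"Name": "Hadoop"}, {"Name": "Spark"}],
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 4,                 # 1 master + 3 workers
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    JobFlowRole="EMR_EC2_DefaultRole",      # default roles must already exist
    ServiceRole="EMR_DefaultRole",
)
print("Cluster starting:", response["JobFlowId"])
```

The cluster can be terminated as soon as the analysis finishes, converting the fixed hardware and operations costs listed earlier into a variable, usage-based expense.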
Recommended course for Big Data
IBM (2015) recommends that organisations consider the following when embarking on the Big Data
journey:
1. Choose projects with a high potential return on investment, for which data sources are
readily accessible and already in electronic form, and establish clear goals and quantifiable
metrics. There should be a strong business need for making the resulting data easily
accessible to broad user communities.
2. The data architecture should be extensible to allow the addition of other data sources, including streaming data, as needed (a sketch illustrating this follows below).
3. As the project continues, create a feedback loop to inform other departments of insights
derived about products, marketing and sales. This helps promote the value of analytics,
builds a culture that focuses on deriving even better information from analytics, and instils a
high level of trust in the data’s veracity and completeness.
4. Surround Hadoop with a strong ecosystem of Big Data tools and analytics capabilities. The
richer the portfolio of capabilities in the selected Hadoop solution, the more freedom teams
have to solve problems and advance the organization’s insights.
(IBM, 2015, pg. 4)
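A minimal sketch of what the extensibility in point 2 can mean in practice: each data source, batch or streaming, implements a single interface, so new feeds can be added without changing downstream analytics. All class and method names here are hypothetical.

```python
# Illustrative extensible ingestion layer: new sources plug in behind one
# interface, so downstream analytics code never changes.
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List


class DataSource(ABC):
    """Anything that can yield records for the analytics pipeline."""

    @abstractmethod
    def records(self) -> Iterable[Dict]:
        ...


class ClaimsFileSource(DataSource):
    """Batch source: claims data already in electronic form."""

    def __init__(self, rows: List[Dict]):
        self.rows = rows

    def records(self) -> Iterable[Dict]:
        yield from self.rows


class TelematicsStreamSource(DataSource):
    """Streaming-style source added later without touching the pipeline."""

    def __init__(self, events: Iterable[Dict]):
        self.events = events

    def records(self) -> Iterable[Dict]:
        yield from self.events


def ingest(sources: List[DataSource]) -> List[Dict]:
    # Downstream code sees one uniform stream of records.
    return [record for source in sources for record in source.records()]


combined = ingest([
    ClaimsFileSource([{"claim_id": "C1", "amount": 5000}]),
    TelematicsStreamSource(iter([{"driver": "D7", "hard_brakes": 3}])),
])
print(len(combined), "records ingested")
```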
Recommended Big Data platform
- Utilise an IaaS offering
- Explore the MapR and IBM BigInsights offerings further.
IBM BigInsights example:
IBM BigInsights is based on 100 percent open source Hadoop. It extends Hadoop with enterprise-grade technology, including administration and integration capabilities, visualization and discovery tools, as well as security, audit history and performance management.
According to IBM, the BigInsights platform offers:
- Increased performance: An average 4 times performance gain over open source Hadoop.
- Usability: BigInsights is optimized for a wide range of roles, including integration developers,
administrators, data scientists, analysts and line-of-business contacts.
- Integrated with IBM Watson™ Foundations big data platform: BigInsights comes bundled with
search and streaming analytics capabilities.
- Analytics: Built-in Hadoop analytics capabilities for machine data, social data, text and Big R
enable you to locate actionable insights from data in the Hadoop cluster rather than having
to move the data around.
Figure 4: Three-year Costs for Use of IBM InfoSphere BigInsights and Open Source Apache Hadoop for Major Applications – Averages for All Installations. Copyright 2013 by the International Technology Group. Reprinted with permission.
Conclusion
Big Data is having a substantive impact on the P&C insurance industry. Insurers are combining Big Data
and analytics to overcome many of the challenges confronting the industry, and to support new
capabilities. Although implementing a Big Data platform is not without its challenges, through careful consideration an organisation should be able to generate an appreciable return on its Big Data and analytics initiative. The availability of IaaS platforms for Big Data reduces many of the initial risks that would traditionally be associated with such projects. In addition, based on initial research, the Big Data offerings from MapR Technologies and IBM appear to be strong candidates for evaluation.
References
Diversity Limited. (2010). Moving your infrastructure to the cloud. [pdf]. Retrieved from
http://diversity.net.nz/wp-content/uploads/2011/01/Moving-to-the-Clouds.pdf
Economist Intelligence Unit. (2013). Fostering a data-driven culture. [pdf].
Retrieved from
http://www.economistinsights.com/search/node/sites%20default%20files%20downloads%20Tableau%20DataCu
lture%20130219%20pdf
Gartner. (2013). Characteristics of the traditional versus the big data approach. [Table]. Retrieved from Gartner. (2013).
Big data business benefits are hampered by 'culture clash'. [pdf]. Retrieved from
https://www.gartner.com/doc/2588415
Gartner. (2013). Use big data to solve fraud and security problems. [pdf]. Retrieved from
https://www.gartner.com/doc/2397715
Gartner. (2013). How IT should deepen big data analysis to support customer-centricity. [pdf].
Retrieved from https://www.gartner.com/doc/2531116
Gartner. (2013). Consistent view of the customer for big data. [Diagram]. Retrieved from Gartner. (2013). How IT should
deepen big data analysis to support customer-centricity. [pdf].
Retrieved from https://www.gartner.com/doc/2531116
Gartner. (2014). Agenda overview for P&C and life insurance. [pdf].
Retrieved from https://www.gartner.com/doc/2643327
HBR. (2012). Big Data: The management revolution. [pdf].
Retrieved from https://hbr.org/2012/10/big-data-the-management-revolution/ar
Hortonworks. (2013). Application enrichment with hadoop. [Diagram]. Retrieved from Hortonworks. (2013).
Apache Hadoop patterns of use. [pdf]. Retrieved from http://hortonworks.com/blog/apache-hadoop-patterns-of-
use-refine-enrich-and-explore/
IBM. (2012). Four dimensions of big data. [Diagram]. Retrieved from IBM. (2012). Analytics: the real-world use of big
data. [pdf]. Retrieved from
http://public.dhe.ibm.com/common/ssi/ecm/en/gbe03519usen/GBE03519USEN.PDF
IBM. (2012). Analytics: the real-world use of big data. [pdf].
Retrieved from http://public.dhe.ibm.com/common/ssi/ecm/en/gbe03519usen/GBE03519USEN.PDF
IBM. (2012). Insurance bureau of Canada. [pdf]. Retrieved from
http://www-01.ibm.com/common/ssi/cgi-
bin/ssialias?subtype=AB&infotype=PM&appname=SWGE_IM_IM_USEN&htmlfid=IMC14775USEN&attachment=I
MC14775USEN.PDF
IBM. (2013). A better assessment of the data around and connected to a single piece of information enables a more
complete, in-context understanding. [Diagram]. Retrieved from IBM. (2013). The future of insurance. [pdf].
Retrieved from http://public.dhe.ibm.com/common/ssi/ecm/en/imw14671usen/IMW14671USEN.PDF
IBM. (2013). Harnessing the power of big data and analytics for insurance. [pdf]. Retrieved from
http://public.dhe.ibm.com/common/ssi/ecm/en/imw14672usen/IMW14672USEN.PDF
IBM. (2014). Analytics: The speed advantage. [pdf].
Retrieved from http://www-935.ibm.com/services/us/gbs/thoughtleadership/2014analytics/
IBM. (2014). IBM expands hadoop commitment with support for spark. [blog].
Retrieved from http://www.ibmbigdatahub.com/blog/ibm-expands-hadoop-commitment-support-spark
IBM. (2015). Analytics: What is mapreduce. [web page].
Retrieved from http://www-01.ibm.com/software/data/infosphere/hadoop/mapreduce/
IBM. (2015). BigInsights for apache hadoop quick start edition. [pdf].
Retrieved from http://www-01.ibm.com/common/ssi/cgi-
bin/ssialias?infotype=PM&subtype=BR&htmlfid=IMB14164USEN#loaded
IBM. (2015). Making the case for hadoop and big data in the enterprise. [pdf].
Retrieved from http://www-01.ibm.com/common/ssi/cgi-
bin/ssialias?infotype=PM&subtype=BK&htmlfid=IMM14161USEN#loaded
ITG. (2013). Business case for enterprise big data deployments. [pdf].
Retrieved from http://www-01.ibm.com/common/ssi/cgi-
bin/ssialias?htmlfid=IME14028USEN&appname=skmwww
ITG. (2013). Three-year Costs for Use of IBM InfoSphere BigInsights and Open Source Apache
Hadoop for Major Applications. [Diagram].
Retrieved from ITG. (2013). Business case for enterprise big data deployments. [pdf].
Retrieved from http://www-01.ibm.com/common/ssi/cgi-
bin/ssialias?htmlfid=IME14028USEN&appname=skmwww
Intel. (2015). Big data cloud technology. [pdf].
Retrieved from http://www.intel.co.za/content/dam/www/public/us/en/documents/product-briefs/big-data-
cloud-technologies-brief.pdf
McKinsey. (2011). Big data: The next frontier for innovation, competition, and productivity. [pdf].
Retrieved from
http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation
Ordnance Survey. (2013). The big data rush: how data analytics can yield underwriting gold. [pdf].
Retrieved from http://events.marketforce.eu.com/big-data-underwriting-report-email
Planet Cassandra. (2015). Nosql databases defined and explained. [web page].
Retrieved from http://www.planetcassandra.org/what-is-nosql/
Standish, J. (2013). Speed to detection - strategically leveraging advanced analytics for insurance fraud. [blog]. Retrieved
from
http://www.johnstandishconsultinggroup.com/JohnStandishConsultingGroup.com/Blog/Entries/2013/8/9_Speed
_to_Detection_-_Strategically_Leveraging_Advanced_Analytics_for_Insurance_Fraud.html
Strategy Meets Action. (2012). Data and analytics in insurance. [pdf].
Retrieved from https://www.acord.org/library/Documents/2012_SMA_Data_Analytics.pdf
Strategy Meets Action. (2013). Data and analytics in insurance: p&c plans and priorities for 2013 and beyond. [pdf].
Retrieved from https://strategymeetsaction.com/data-and-analytics-in-insurance-p-and-c-plans-and-priorities-
for-2013-and-beyond/
Zikopoulos, P., deRoos, D., Bienko, C., Buglio, R., Andrews, M. (2014). Big data beyond the hype. [pdf].
Retrieved from
https://www.ibm.com/developerworks/community/blogs/SusanVisser/entry/big_data_beyond_the_hype_a_gui
de_to_conversations_for_today_s_data_center?lang=en