The document discusses how a semantic "data lake" can help organizations extract meaning and insights from large amounts of digital data. A data lake combines data from different sources and uses semantic models, tagging, and algorithms to help users more quickly find relevant data relationships and insights. It describes how semantic technology plays a key role in data ingestion, management, modeling of different views, querying, and exposing analytics as web services to create personalized customer experiences.
Learn the basics of business intelligence, including common terms, how to implement solutions, and what it can do for your company. For even more insight into how business intelligence can benefit your work, visit: http://bit.ly/GuideToBI
To find a custom business intelligence solution that fits the specific needs of your work, visit: http://bit.ly/GetBI1
DATA VIRTUALIZATION FOR DECISION MAKING IN BIG DATA (ijseajournal)
Data analytics and Business Intelligence (BI) are essential components of decision support technologies that gather and analyze data for faster and better strategic and operational decision making in an organization. Data analytics emphasizes algorithms that uncover relationships in data and turn them into insights. The major difference between BI and analytics is that analytics has predictive capability, which helps in making future predictions, whereas Business Intelligence supports informed decision-making built on the analysis of past data. Business Intelligence solutions are among the most valued data management tools; their main objective is to enable interactive access to real-time data, allow manipulation of that data, and provide business organizations with appropriate analysis. Business Intelligence solutions leverage software and services to collect and transform raw data into useful information that enables more informed, higher-quality business decisions regarding customers, market competitors, internal operations and so on. Data needs to be integrated from disparate sources in order to derive valuable insights. Extract-Transform-Load (ETL) tools, traditionally employed by organizations, help in extracting data from different sources, transforming and aggregating it, and finally loading large volumes of data into warehouses. Recently, data virtualization has been used to speed up the data integration process. Data virtualization and ETL often serve distinct and complementary purposes in performing complex, multi-pass data transformation and cleansing operations and bulk-loading the data into a target data store. In this paper we provide an overview of the data virtualization techniques used for data analytics and BI.
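The extract-transform-load flow described in this abstract can be sketched in a few lines. This is a minimal illustration only: the feed, column names, and warehouse table are all hypothetical, not the paper's actual pipeline.

```python
# Minimal ETL sketch (hypothetical feed and table names).
import csv
import io
import sqlite3

# Extract: read raw rows from a source; an in-memory CSV stands in
# for a real production feed here.
raw = io.StringIO("region,amount\nEMEA,100\nEMEA,250\nAPAC,75\n")
rows = list(csv.DictReader(raw))

# Transform: aggregate amounts per region before loading.
totals = {}
for r in rows:
    totals[r["region"]] = totals.get(r["region"], 0) + int(r["amount"])

# Load: bulk-insert the aggregates into a warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales_by_region (region TEXT, total INTEGER)")
warehouse.executemany("INSERT INTO sales_by_region VALUES (?, ?)",
                      totals.items())

loaded = dict(warehouse.execute("SELECT region, total FROM sales_by_region"))
```

Data virtualization, by contrast, would leave the source data in place and expose a virtual view that performs the aggregation on demand.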
Slow Data Kills Business eBook - Improve the Customer Experience (InterSystems)
We live in an era where customer experience trumps product features and functions. How do you exceed customers' expectations every time they interact with your organization? By leveraging more information and applying the insights you have learned over time. Turning data-driven power into delightful experiences will give you the advantages required to succeed in today's climate of one-click shopping and crowd-sourced feedback. Whether you are a retailer, a banker, a care provider, or a policy maker, your organization must harness the power of growing data volumes, data types, and data sources to foster experiences that matter.
Understanding the DSR Market looks at the differences between a team and enterprise solution for handling multiple data sources in the consumer goods industry.
Rapid digitization has resulted in the production of large volumes of unstructured data. This trend is expected to provide significant opportunities for the graph database market in the coming years.
Financial Markets Data & Analytics Led Transformation (Gianpaolo Zampol)
How are big data, advanced analytics and cognitive computing disrupting traditional business and operating models in financial markets? New competitors, powered by social, mobile, analytics, and cloud computing, are causing new business models to emerge rapidly. Wealth Management, Corporate Banking, and Transaction Banking & Payments are significant sources of growth in financial markets. How can firms take advantage of these new technologies to face this new scenario?
Data-Ed Online Presents: Data Warehouse Strategies (DATAVERSITY)
Integrating data across systems has been a perpetual challenge. Unfortunately, current technology-focused solutions have not helped IT improve its dismal project success statistics. Data warehouses, BI implementations, and general analytical efforts achieve the same levels of success as other IT projects: approximately one third are considered successes when measured against price, schedule, or functionality objectives. The first step is determining the appropriate analysis approach to the data system integration challenge. The second step is understanding the strengths and weaknesses of the various approaches. It turns out that proper analysis at this stage makes actual technology selection far more accurate. Only when these are accomplished can proper matching between problem and capabilities be achieved as the third step, and true business value be delivered. This webinar will illustrate that good systems development more often depends on at least three data management disciplines to provide a solid foundation.
Takeaways:
Data system integration challenge analysis
Understanding of a range of data system-integration technologies, including: problem space (BI, Analytics, Big Data); data (Warehousing, Vault, Cube); and alternative approaches (Virtualization, Linked Data, Portals, Meta-models)
Understanding foundational data warehousing & BI concepts based on the Data Management Body of Knowledge (DMBOK)
How to utilize data warehousing & BI in support of business strategy
Reference data management in financial services industry (NIIT Technologies)
This white paper analyses the need for Reference Data Management (RDM) in the financial services industry and elucidates the challenges associated with its implementation. The paper also focuses on the critical elements of RDM implementation and some of the major benefits an organization can derive by implementing robust Reference Data Management in its IT infrastructure.
The purpose of business intelligence is to support better business decision making. BI systems provide historical, current, and predictive views of business operations, most often using data that has been gathered into a data warehouse or a data mart and occasionally working from operational data.
Role of business intelligence in knowledge management (Shakthi Fernando)
This study is fundamentally based on the most common components of a Business Intelligence system, namely data warehouses, ETL tools, OLAP techniques and data mining, which support the decision-making function. It further describes the role of each component in a Business Intelligence system and how Business Intelligence systems can be used for better business decision making at each level of management.
The opportunity of the business data lake (Capgemini)
The Pivotal Business Data Lake is a new way to deliver information for the enterprise based around four simple principles:
- Store everything
- Encourage local
- Govern only the common
- Treat global as a local view
These principles match the way business works today, and they can now be delivered efficiently in technology using the Pivotal Business Data Lake and Capgemini's information governance and delivery methods.
What is business intelligence? Where have we been, where are we now, and where are we going? These slides provide a brief history of business intelligence, enjoy.
What is BI? Definition, examples, the BI industry, solutions, evolution, categories, key stages of BI, BI significance, BI technologies, tools, and the future of BI.
Learn about addressing storage challenges to support business analytics and big data workloads. Storage teams, IT executives, and business users will benefit by recognizing that deploying appropriate storage infrastructure to support a wide range of business analytics workloads requires constant evaluation and a willingness to adjust the infrastructure as needed. For more information on IBM Storage Systems, visit http://ibm.co/LIg7gk.
Visit the official Scribd Channel of IBM India Smarter Computing at http://bit.ly/VwO86R to get access to more documents.
The winners among vendors will have exceptional capability in implementing large projects, pulling together capabilities in reporting and querying, multidimensional analysis, analytics, data management, visualization, and business process management.
Semantic 'Radar' Steers Users to Insights in the Data Lake (Thomas Kelly, PMP)
By infusing information with intelligence, users can discover meaning in the digital data that envelops people, organizations, processes, products and things.
Evolving Big Data Strategies: Bringing Data Lake and Data Mesh Vision to Life (SG Analytics)
New data technologies, alongside legacy infrastructure, are driving market innovations like personalized offers, real-time alerts, and predictive maintenance. However, these technical additions, ranging from data lakes to analytics platforms to stream processing and data mesh, have increased the complexity of data architectures, significantly hampering an organization's ongoing ability to deliver new capabilities while ensuring the integrity of artificial intelligence (AI) models. https://us.sganalytics.com/blog/evolving-big-data-strategies-with-data-lakehouses-and-data-mesh/
"Implementing Data Mesh: Six Ways That Can Improve the Odds of Your Success" is a whitepaper authored by Ranganath Ramakrishna of LTIMindtree. It introduces Data Mesh, a socio-technical paradigm that aims to help organizations fully leverage the value of their analytical data.
Data analytics can immensely impact and improve a business’s decision-making processes. From better strategies to profits, explore the full scope of analytics.
Businesses of all sizes can benefit from better use of their data to gain insights. Learn how the cloud can help overcome common data challenges and accelerate transformation with cloud technology.
https://www.rapyder.com/cloud-data-analytics-services/
Companies are now in the middle of a transformation that forces them to become analytics-driven in order to remain competitive. Data analysis provides complete insight into their business and gives them noteworthy advantages over their competitors. Analytics-driven insights prompt businesses to act on service innovation, enhance the client experience, detect irregularities in processes, and free up time for product or service marketing. To work on analytics-driven activities, companies need to gather, analyse, and store information from all possible sources. Companies should put appropriate tools and workflows into practice to analyse data rapidly and continuously. They should derive insight from the results of data analysis and change their business processes and practices on that basis. This would help them be more agile than before.
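The gather-analyse-act loop sketched above can be illustrated with a small example. The readings and the 1.5-standard-deviation threshold below are invented for illustration, not taken from any real process.

```python
import statistics

# Gather: measurements collected from a hypothetical process.
readings = [10.1, 9.8, 10.3, 10.0, 20.0, 9.9]

# Analyse: compute summary statistics over the gathered data.
mean = statistics.mean(readings)
stdev = statistics.stdev(readings)

# Act: flag irregularities (values far from the mean) for follow-up,
# e.g. a process change or an alert to the operations team.
irregular = [x for x in readings if abs(x - mean) > 1.5 * stdev]
```

Running this continuously over fresh data, rather than once, is what makes the workflow "analytics-driven" in the sense described above.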
Running head Database and Data Warehousing design1Database and.docx (healdkathaleen)
Running head: Database and Data Warehousing design
Database and Data Warehousing Design
Thien Thai
CIS599
Professor Wade M. Poole
Strayer University
Feb 20, 2020
Database and Data Warehousing Design
Introduction
Technology has revolutionized the world of business, presenting both challenges and opportunities. Companies that fail to embrace and incorporate technology in their operations risk being edged out of the market due to the stiff competition witnessed today. On the flip side, cloud-based technology allows businesses to "easily retrieve and store valuable data about their customers, products, and employees." Data is an important component that helps to support core business decisions. In today's highly competitive and constantly evolving business world, embracing cloud-based technology gives business managers an opportunity to make informed and result-oriented decisions regarding day-to-day organizational operations (Dimitriu & Matei, 2015).
Notably, a business's growth and competitiveness depend on its ability to transform data into information. Data warehousing and the adoption of relational databases are cloud-based technologies that have positively impacted businesses. The two technologies have strategic value to companies, helping them gain an extra edge over their competitors. Both data warehousing and relational databases help businesses to "take smart decisions in a smarter manner." Conversely, failure to adopt these cloud-based technologies has hindered business executives' ability to make the experience-based and fact-based decisions that are vital to business survival. Both "databases and data warehouses are relational data systems" that serve different and equally crucial roles within an organization. For instance, data warehousing helps to support management decisions, while relational databases perform ongoing business transactions in real time. Embracing cloud-based technologies within the organization will help give the company a competitive advantage in the market. However, the adoption and maintenance of such technologies require the full support and endorsement of business management. Organizational management must understand the feasibility, functionality, and importance of embracing such technologies. Movement towards relational databases and data warehousing requires significant funding, hence the need to convince management to support and fund it. This paper explores the concepts of data warehousing and relational databases, their importance to the business, as well as their design.
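The division of labour described above, with transactional systems recording events in real time and warehouses summarizing them for management decisions, can be sketched as follows. The schema, customers, and amounts are hypothetical, and a single in-memory database stands in for both systems.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")

# Transactional role: record individual business events as they occur.
db.execute("INSERT INTO orders VALUES (1, 'acme', 120.0)")
db.execute("INSERT INTO orders VALUES (2, 'acme', 80.0)")
db.execute("INSERT INTO orders VALUES (3, 'globex', 45.0)")
db.commit()

# Analytical role: summarize accumulated history for decision support;
# in practice this query would run against a separate warehouse.
summary = dict(db.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer"))
```

Separating the two workloads keeps long-running analytical queries from slowing down the real-time transaction path.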
“Importance of Data Warehousing and Relational Databases”
Today, technology has changed the market landscape. Businesses are striving to adopt cloud-based technology in order to improve efficiency in business functions, among them analytical queries as well as transactional operations. Both relational databases a ...
Big Data is Here for Financial Services White Paper (Experian)
Conquering Big Data Challenges
Financial institutions have invested in Big Data for many years, and new advances in technology infrastructure have opened the door for leveraging data in ways that can make an even greater impact on your business.
Learn how Big Data challenges are easier to overcome and how to find opportunities in your existing data and scale for the future.
I have been drinking from a virtual fire hose since joining my most recent technology company, Anametrix, a cloud-based digital analytics innovator. A whole new book opened for me on how digital analytics can both increase top line revenue and reduce spend by shining a very bright flashlight into marketing efforts.
We are all painfully aware of the data explosion problem. In 2011, the Gartner Group stated that information volume collected by businesses today is growing at a minimum 59% annually. The rapid adoption of social media has also caused customer data to explode in the last few years, creating entirely new challenges for marketers. It is now imperative for organizations to think differently to accommodate the variety, volume, and velocity of their growing customer-related data.
This is where my recent experiences come in: I have personally seen how digital analytics can harness the power of massive amounts of customer-related data. It can simplify the accelerating complexity by providing deep visibility, as well as clarity, into the effectiveness of various marketing efforts, across both online and offline channels.
I will now outline the roles of IT and the CFO in adopting cloud-based digital analytics solutions, discuss the benefits as well as the challenges of moving to this emerging category, and provide some illustrative examples of how digital analytics can transform your marketing organization.
An efficient data science team is crucial for deriving value from the enormous amounts of data a business collects. Learn how the data science team can help in this regard.
Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr... (Cognizant)
Organizations rely on analytics to make intelligent decisions and improve business performance, which sometimes requires reproducing business processes from a legacy application in a digital-native state to reduce functional, technical and operational debt. Adaptive Scrum can reduce the complexity of the reproduction process iteratively as well as provide transparency in data analytics projects.
It Takes an Ecosystem: How Technology Companies Deliver Exceptional Experiences (Cognizant)
Experience is evolving into a strategy that reaches across technology companies. We offer guidance on the rise of experience and its role in business modernization, with details on how organizations can build the ecosystem to support it.
The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic... (Cognizant)
The T&L industry appears poised to accelerate its long-overdue modernization drive, as the pandemic spurs an increased need for agility and resilience, according to our study.
Enhancing Desirability: Five Considerations for Winning Digital Initiatives (Cognizant)
To be a modern digital business in the post-COVID era, organizations must be fanatical about the experiences they deliver to an increasingly savvy and expectant user community. Getting there requires a mastery of human-design thinking, compelling user interface and interaction design, and a focus on functional and nonfunctional capabilities that drive business differentiation and results.
The Work Ahead in Manufacturing: Fulfilling the Agility Mandate (Cognizant)
According to our research, manufacturers are well ahead of other industries in their IoT deployments but need to marshal the investment required to meet today’s intensified demands for business resilience.
The Work Ahead in Higher Education: Repaving the Road for the Employees of To... (Cognizant)
Higher-ed institutions expect pandemic-driven disruption to continue, especially as hyperconnectivity, analytics and AI drive personalized education models over the lifetime of the learner, according to our recent research.
Engineering the Next-Gen Digital Claims Organisation for Australian General I... (Cognizant)
In recent years, insurers have invested in technology platforms and process improvements to improve claims outcomes. Leaders will build on this foundation across the claims landscape, spanning experience, operations, customer service and the overall supply chain, with market-differentiating capabilities to achieve sustainable results.
Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and... (Cognizant)
Amid constant change, industry leaders need an upgraded IT infrastructure capable of adapting to audience expectations while proactively anticipating ever-evolving business requirements.
Green Rush: The Economic Imperative for Sustainability (Cognizant)
Green business is good business, according to our recent research, whether for companies monetizing tech tools used for sustainability or for those that see the impact of these initiatives on business goals.
Policy Administration Modernization: Four Paths for Insurers (Cognizant)
The pivot to digital is fraught with numerous obstacles but with proper planning and execution, legacy carriers can update their core systems and keep pace with the competition, while proactively addressing customer needs.
The Work Ahead in Utilities: Powering a Sustainable Future with Digital (Cognizant)
Utilities are starting to adopt digital technologies to eliminate slow processes, elevate customer experience and boost sustainability, according to our recent study.
AI in Media & Entertainment: Starting the Journey to Value (Cognizant)
Up to now, the global media & entertainment industry (M&E) has been lagging most other sectors in its adoption of artificial intelligence (AI). But our research shows that M&E companies are set to close the gap over the coming three years, as they ramp up their investments in AI and reap rising returns. The first steps? Getting a firm grip on data – the foundation of any successful AI strategy – and balancing technology spend with investments in AI skills.
Operations Workforce Management: A Data-Informed, Digital-First ApproachCognizant
As #WorkFromAnywhere becomes the rule rather than the exception, organizations face an important question: How can they increase their digital quotient to engage and enable a remote operations workforce to work collaboratively to deliver onclient requirements and contractual commitments?
Five Priorities for Quality Engineering When Taking Banking to the CloudCognizant
As banks move to cloud-based banking platforms for lower costs and greater agility, they must seamlessly integrate technologies and workflows while ensuring security, performance and an enhanced user experience. Here are five ways cloud-focused quality assurance helps banks maximize the benefits.
Getting Ahead With AI: How APAC Companies Replicate Success by Remaining FocusedCognizant
Changing market dynamics are propelling Asia-Pacific businesses to take a highly disciplined and focused approach to ensuring that their AI initiatives rapidly scale and quickly generate heightened business impact.
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...Cognizant
Intelligent automation continues to be a top driver of the future of work, according to our recent study. To reap the full advantages, businesses need to move from isolated to widespread deployment.
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...
Semantic 'Radar' Steers Users to Insights in the Data Lake

By infusing information with intelligence, users can discover meaning in the digital data that envelops people, organizations, processes, products and things.
KEEP CHALLENGING, September 2014
Executive Summary
In the past several years, we've seen Amazon grow to nearly $75 billion in sales without operating a single storefront, in part by using information about customers' past purchases to recommend new offers to them. More recently, Airbnb managed 10 million stays in 2013 without owning a single hotel; instead, it encourages members to offer space in their own homes to rent out to others and publishes online ratings of guests and hosts.

These are only two examples of how businesses are transforming customer experiences, establishing new business models and redefining their markets. At the heart of their success is an innate ability to continuously understand and manage the digital data generated by people, organizations, processes, products and things1 — what we call a Code Halo™.

In an increasingly "data-rich, meaning-poor" business world, companies everywhere are attempting to learn the value of signal and the cost of noise as they come to grips with their big data and analytics initiatives. Extracting meaning from this data empowers companies, brands and employers to better understand and meet customer needs, and to create more personalized experiences for them. Doing so requires new tools to manage and understand the torrents of data that will come from the projected tens of billions of devices as the "Internet of Things" links everything from cars to clothing.2 It also requires algorithms that filter data "noise," generate new insights and create personalized and enriching experiences that help Code Halo pioneers like Pandora and Netflix treat us as if we were individual markets of one.

Combining streams of data from different sources into a "data lake"3 can help steer business decision-makers to the insights they need more quickly and efficiently. At its core is high-performance, low-cost data storage technology to store and manage frequently used, large volumes of data. Multiple analytics engines speed the delivery of a personalized experience to each customer at the point of interaction. Semantic models standardize the definition and mapping of internal and external data. Domain expertise, such as knowledge of which patterns in the data represent risk or opportunity, is embedded in the semantic models4 to guide business users to insights. Finally, self-service features help business users rapidly find and analyze data without delay.

This white paper highlights the challenges that organizations face when using today's data architectures to find and act on business-changing big data insights. It describes how a "data lake" can help, and the central role that semantic technology plays by slashing the time required to find, integrate, share and analyze the data. Finally, it describes how enterprises can begin creating their own data lakes.
Today’s Information Architectures:
Too Slow, Expensive and Hard to Understand
A data lake eliminates many of the roadblocks that now stand between business
users and business insights. Among these are:
• Onboarding and integrating data is slow and expensive. Creating personalized customer experiences and tracking complex markets requires collecting, integrating, analyzing and responding to data from many different sources. However, the ETL (extract, transform and load) function requires custom processes and technologies for each source system, causing delays when skilled staff is busy on other projects.

• Optimizing data for different users is difficult and costly. Departments such as finance, sales and manufacturing may organize and define data differently. Finance, for example, may organize customers by their annual purchases or credit ratings, while sales organizes them by geography, and manufacturing by the products they purchase. The cost and delay of building data marts tailored to the needs of each unit can make it impossible to quickly uncover vital insights.

• Moving and reorganizing data introduces delays and cost. Data storage is becoming steadily less expensive, but it still represents a capital and operational expense. Hampered by insufficient funding and the high costs of onboarding data, organizations often wait until there is a defined need for a set of data before loading it into an analytics database. By then, a competitor may have already captured the data and acted on the analysis.

• Data provenance and meaning are often lost in translation. Before knowing what data to query, a user must understand what that data represents and how it is used. One example: "This number (X) represents defect rate per thousand, which we use to justify our premium pricing for a customer." As data knowledge is transferred from expert to designer to developer, meaning is often replaced with technical syntax and rules that the business user cannot understand, effectively removing all-important context from "corporate memory." Maintaining provenance and meaning allows the data to tell a story that matters to a real-life decision-maker.

• Who transformed what data, and why? To ensure that users aren't making decisions based on faulty data, it is essential to understand who changed the format of the data, cleansed it to eliminate inconsistencies or changed values to comply with standards. Such management artifacts are often tracked only in spreadsheets and don't capture the latest changes made to live data. This problem extends to tracking changes to data after it is onboarded, increasing the cost and complexity of data analysis.

• Models are too rigid or incomplete. Excessive standardization of the vocabulary used to define and organize data may make it harder to conduct specific types of analysis. On the other hand, the lack of a model makes it difficult or impossible to identify what data is available, how to access it and how to integrate it with other data. That means delays while users search for the required data, determine who owns it, request permission to access it and integrate it with other data.

• The speed of change. With every new competitor, product offering, business model or distribution channel, an organization must rapidly assimilate and query new types of data, organized and optimized in new and innovative ways. Lengthy data design and standardization processes are not effective in today's business environment.
All of these issues are solvable with a "smart" data lake strategy. Figure 1 depicts best practices that overcome key points of friction that organizations encounter when using traditional approaches.

Figure 1: The Data Lake Defined

Onboard new data
  Traditional technologies and practices: Data is prioritized based on the current business case for analytics. Comprehensive analysis creates rigid structures that are difficult to change. Minimal definition of data organization during onboarding requires users to have a detailed understanding of data contents.
  Data lake technologies and practices: Data is prioritized based on the data's potential ability to drive insights. Flexible data models can be revised or extended without redesign of the database. Agile, evolutionary refinement of the data organization makes it easier to use the data and leverage new insights. A catalog of data assets captures all details of onboarded data.

Load data
  Traditional technologies and practices: Comprehensive, custom ETL processes capture, cleanse and load the data into the data repository.
  Data lake technologies and practices: Model-based ETL processing is defined and generated automatically, with data loaded unfiltered and untransformed. Self-service features speed time to business value.

Connect external data to the data repository
  Traditional technologies and practices: Source data is collected and loaded into the data repository. The data is refreshed on a scheduled frequency.
  Data lake technologies and practices: A common model maps to all internal and external data assets. Frequently queried data is collected and loaded to provide predictable query performance. Internal and external data is queried through common methods, with real-time values delivered at query time.

Organize data for specific purposes
  Traditional technologies and practices: Data is copied to a separate region of the data repository, with little documentation, organization or management. Data silos proliferate.
  Data lake technologies and practices: Automation facilitates data organization and synchronization within the repository. All data remains connected to data definitions, sources and provenance, and it is updated automatically. Automation manages data retention.

Discover data and data relationships
  Traditional technologies and practices: Users must create projects to mine and analyze data for currently unknown data relationships.
  Data lake technologies and practices: Semantic search helps end users ask questions about the data as needed to improve their understanding of it. Data discovery tools continuously mine and analyze the data for currently unknown data relationships and recommend updates to the models that increase their business value.

Data Lake Essentials

Filling the data lake to generate unique customer experiences and unlock market insights requires new processes and technologies across the data and query management lifecycle (see Figure 2). At each step, semantic technology plays a central role (see sidebar, "Semantics Maps the Data Lake").

• Data ingestion: This involves finding and importing the data needed for Code Halo analysis. Structured and unstructured data can be collected through traditional scheduled batch loading, data streaming or real-time, on-demand queries for internally- and externally-hosted data.
Organizations can quickly analyze the data needed without loading every conceivably useful bit of data beforehand. The combination of semantic models and automated data movement greatly reduces the need for manual coding of ETL processes by defining all the available data, how it is organized, where it is located and what data movement/transformations must be performed. During ingestion, the data is semantically tagged and categorized so it can be easily identified for future needs.

Pre-modeling also lets business users perform self-service data mapping and queries of new data that will only be needed for a short time or that is awaiting integration with formal models. They can then use this data for rapid, on-demand Code Halo analysis to meet urgent needs without waiting for help from IT.
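The tagging step described above can be sketched in plain Python. This is an illustration only: the concept names and field mappings below are invented, and a real implementation would drive them from the semantic model rather than a hard-coded dictionary.

```python
# Sketch: semantic tagging during ingestion. Incoming records are matched
# against a (hypothetical) semantic model so they can later be found by
# concept rather than by source-specific field names. Raw values are kept
# unfiltered and untransformed, as the data lake approach prescribes.

SEMANTIC_MODEL = {
    # concept           -> source fields that map to it (invented examples)
    "customer:email":   {"email", "email_addr", "contact_email"},
    "customer:region":  {"region", "geo", "sales_territory"},
    "finance:revenue":  {"annual_purchases", "yearly_spend"},
}

def tag_record(record):
    """Attach semantic tags without transforming the raw record."""
    tags = {concept
            for concept, fields in SEMANTIC_MODEL.items()
            if fields & record.keys()}
    return {"raw": record, "tags": sorted(tags)}

incoming = {"email_addr": "buyer@example.com", "geo": "EMEA", "sku": "A-42"}
tagged = tag_record(incoming)
print(tagged["tags"])  # ['customer:email', 'customer:region']
```

Because the raw record is stored untouched alongside its tags, later consumers can still apply their own transformation rules at query time.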
Figure 2: Data Lake Components

[Figure: Structured and unstructured data from desktop and mobile devices, social media and cloud, operational systems and the Internet of Things enters through a data ingestion layer (scheduled batch load, streaming, on-demand query, self-service, semantic tagging). A data management layer of storage and analytics engines (semantic graph, HPC, columnar, in-memory, linked data, streaming, NoSQL, MapReduce, HDFS storage) holds the data. A model-driven data lake management layer (models, governance, workflow, access/security, data assets catalog, data movement, provenance) and a query management layer (semantic search, data discovery, analytics directed to the best query engine; capture, embed and share analytics expertise; query data, metadata and provenance) span the stack.]
Quick Take

Semantics Maps the Data Lake

A smart semantic model can help organizations make use of their Code Halos by capturing meaning, as well as the related domain expertise, from structured, unstructured or semi-structured data. The building blocks for such a model are standards and technologies such as:

• The Resource Description Framework (RDF), which organizes data in a graph structure, reducing development time and cost while delivering business value sooner.

• The Web Ontology Language (OWL), which provides a comprehensive model of data definitions and relationships that is both human- and machine-readable.

• The SPARQL Query Language, a SQL-like query language for semantic data that can leverage ontological relationships to ask smarter questions across multiple databases in a single query.

• Inferencing, which makes it easier for users to construct queries by capturing and embedding expertise in the ontology model.
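To make the graph-and-pattern idea behind RDF and SPARQL concrete, here is a minimal plain-Python sketch of triples and a wildcard query. The data and prefixes (`ex:AcmeCorp` and so on) are invented, and a real data lake would use an RDF store and actual SPARQL rather than this toy matcher.

```python
# Minimal sketch of RDF-style triples and SPARQL-like pattern matching.
# Hypothetical data; production systems use an RDF store, not this.

triples = {
    ("ex:AcmeCorp", "rdf:type",           "ex:Customer"),
    ("ex:AcmeCorp", "ex:annualPurchases", "1200000"),
    ("ex:AcmeCorp", "ex:region",          "ex:NorthAmerica"),
    ("ex:BetaLtd",  "rdf:type",           "ex:Customer"),
    ("ex:BetaLtd",  "ex:region",          "ex:International"),
}

def match(pattern):
    """Return triples matching an (s, p, o) pattern.
    None acts as a wildcard, like a ?variable in SPARQL."""
    s, p, o = pattern
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Roughly: SELECT ?customer WHERE { ?customer rdf:type ex:Customer }
customers = sorted(t[0] for t in match((None, "rdf:type", "ex:Customer")))
print(customers)  # ['ex:AcmeCorp', 'ex:BetaLtd']
```

Because everything is a triple, new facts and new relationship types can be added without redesigning a schema, which is the flexibility the sidebar attributes to RDF.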
By performing a semantic analysis of the data during data loading, organizations
can capture an understanding of the content, which makes it easier to embed
intelligence in the data. This intelligence might include which characteristics
or patterns indicate the likelihood that a customer will “churn” from one cable
provider to another, or that a homeowner is at increased risk of falling behind in
loan payments. A trigger can generate an offer to customers that meets their
needs and enhances the relationship.
• Data management: Data lakes benefit from modern, low-cost, high-performance storage technologies, such as the Hadoop Distributed File System (HDFS), that provide high availability and automated failover. Specialized database and analytic engines, such as high-performance computing, NoSQL and semantic, can be configured to speed Code Halo analytics at the lowest possible cost.

The intelligent data management layer directs the model-driven movement of data, while provenance capabilities capture and track details about its ingestion, transformation, movement and the queries run against it.

The data management layer supports the various engines required for different types of analytics, including future technologies, as an organization's use of Code Halos matures. It also allows the engines to access and analyze data directly from high-performance storage, eliminating the need for custom ETL processes.
• Data lake management: Like any other corporate resource, the data lake requires management and governance to ensure that it operates as efficiently, consistently and securely as possible. The management layer directs data ingestion and movement workflows while managing access and balancing performance, cost and security to deliver the best results at the lowest cost.

The model manager allows analysts to embed expert knowledge, such as a complex calculation or set of logic, into the ontology model for use by non-experts. For example, a researcher can easily share a new risk-assessment model for home loans with underwriters in the field. A genomics specialist can share a model that maps new biomarker indications to patients' genetic backgrounds to help researchers at clinical trial sites test a new cancer treatment.
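The model-manager idea, an expert publishing logic that non-experts then invoke by name, can be sketched as a simple registry. The risk formula, thresholds and model name below are invented for illustration and bear no relation to real underwriting logic.

```python
# Sketch: a registry through which an expert shares a calculation
# (an invented loan-risk score) for reuse by non-experts.

MODEL_REGISTRY = {}

def publish(name):
    """Decorator an expert uses to register a model under a shared name."""
    def register(fn):
        MODEL_REGISTRY[name] = fn
        return fn
    return register

@publish("home-loan-risk")
def loan_risk(loan_to_value, missed_payments):
    # Illustrative rule only, not a real underwriting formula.
    score = loan_to_value * 100 + missed_payments * 15
    return "high" if score > 95 else "low"

# A field underwriter invokes the shared model by name, without
# needing to know the logic inside it.
assess = MODEL_REGISTRY["home-loan-risk"]
print(assess(0.9, 2))  # high  (0.9*100 + 2*15 = 120)
print(assess(0.6, 0))  # low   (60)
```

In a semantic data lake the registry would live in the ontology model itself, so the shared logic stays attached to the data definitions it operates on.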
Proper governance is required to ensure the data is accurate and consistent and that it complies with organizational and industry standards. This should cover both master and reference data, and take advantage of industry standards such as the Financial Industry Business Ontology (FIBO)5 in banking/financial services and CDISC/RDF6 in life sciences. The model manager captures governance-based standards to ensure consistency across models that reference shared data.

Business units will organize views of required data that include enterprise- and business unit-specific data. The business unit-level data models will inherit the definitions of enterprise standard models while adding business unit-specific data relationships and terminology (see Figure 3).

The model manager enables organizations to construct data views to support a specific analytic or class of analytics, without the cost of building data marts or custom data movement processing. By capturing the details of data relationships and data movement that directly support the analytics process, organizations can enable automation that reduces delays and costs.

The data catalog manager captures the details of both internal and external data assets, describing the content, definitions and organization of the data and supporting multiple views of shared data. End users can define workflows for scheduled data movement.
Figure 3: Different Data Models for Different Users

[Figure: Business unit ontologies, such as a North America market ontology and an international market ontology, inherit definitions from a shared enterprise ontology, which in turn draws on an industry ontology.]
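The inheritance relationship shown in Figure 3 can be sketched with Python's `ChainMap`: a business-unit vocabulary resolves terms locally first, then falls back to the enterprise model. The terms and definitions below are invented for illustration.

```python
# Sketch: business-unit models inherit enterprise-standard definitions
# and layer on their own terms, as in Figure 3.
from collections import ChainMap

enterprise = {
    "Customer": "A party that has purchased a product or service",
    "Region":   "Enterprise-standard sales geography",
}

north_america = ChainMap(
    {"Territory": "NA-specific grouping of states and provinces"},
    enterprise,   # inherited definitions
)

international = ChainMap(
    {"Market": "Country-level grouping used outside North America",
     "Region": "ISO country groupings"},   # overrides the shared term
    enterprise,
)

print(north_america["Customer"])  # inherited from the enterprise model
print(international["Region"])    # local override: ISO country groupings
```

Because the enterprise dictionary is shared rather than copied, a change to an enterprise-standard definition is immediately visible in every business-unit view, which is the consistency benefit the model manager provides.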
Access management features help users find data assets available in the data lake, automate registration and access to new data assets, and provide secure access to the data. Monitoring functions ensure that everything from the storage technology to the analytics and reporting functions is working properly, with tools to support intervention when necessary.

• The query management layer: This helps users understand what data is available and which data is most likely to deliver the richest Code Halo insights. Semantic search uses natural language queries to analyze the data and metadata to identify the data that will best support an analytical process. To reduce the time and effort required to find and model business-critical data, data discovery functions analyze the data to find and link related data concepts. Optimized data structures speed analytics, with queries directed to the analytics engine that delivers the best performance.

The query management layer thus enables "smart" analytics in which users can define not only which data to use, how to transform it, which analytics engine to use and which analysis to run, but also which type of chart to plot the results in, which file format to save the results in and whom to mail the results to. Workflow tools allow users to automate the analytics process.
• Rules-based data sourcing: This helps users fine-tune their queries using context such as metadata and provenance. For example, a user could specify a query that says, "Read the data from the data lake, unless that data is more than X hours old, in which case it should be read directly from the source system."
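The freshness rule quoted above might look like the following plain-Python sketch. The key names, values and six-hour threshold are invented for illustration; a real implementation would evaluate such rules from the provenance metadata in the lake.

```python
# Sketch of rules-based sourcing: read from the lake unless the copy is
# older than max_age_hours, then fall back to the source system.
from datetime import datetime, timedelta

def read(key, lake, source, max_age_hours=6, now=None):
    now = now or datetime.now()
    entry = lake.get(key)
    if entry and now - entry["loaded_at"] <= timedelta(hours=max_age_hours):
        return entry["value"], "lake"
    return source[key], "source"

now = datetime(2014, 9, 1, 12, 0)
lake = {"fx:EURUSD": {"value": 1.31, "loaded_at": now - timedelta(hours=2)},
        "fx:GBPUSD": {"value": 1.66, "loaded_at": now - timedelta(hours=9)}}
source = {"fx:EURUSD": 1.32, "fx:GBPUSD": 1.65}

print(read("fx:EURUSD", lake, source, now=now))  # (1.31, 'lake')
print(read("fx:GBPUSD", lake, source, now=now))  # (1.65, 'source')
```

The provenance timestamp captured at ingestion is what makes the rule checkable at query time.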
While the data lake is most often used by analysts or business decision-makers, true Code Halo value is achieved when it responds to a customer's click with an immediate, personalized response. This is enabled by features that expose analytics processes as a Web service, or through an application programming interface (API) that is callable from a Web page or mobile app.
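Reduced to its essentials, such a click-time call is a JSON-in/JSON-out function. The sketch below assumes invented field names and a trivial stand-in for the analytics engine; a real deployment would sit behind a Web service endpoint.

```python
# Sketch: an analytics process wrapped as a JSON-in/JSON-out handler,
# the shape a Web-service or mobile-app API call would take at click time.
# Field names and offer logic are invented for illustration.
import json

def recommend_offer(payload: str) -> str:
    request = json.loads(payload)
    clicks = request.get("recent_clicks", [])
    # Trivial stand-in for the real analytics-engine call.
    offer = "sports-package" if "sports" in clicks else "standard-package"
    return json.dumps({"customer": request["customer_id"], "offer": offer})

# The same call a Web page or mobile app would make when a customer clicks:
response = recommend_offer(
    '{"customer_id": "C-1001", "recent_clicks": ["news", "sports"]}')
print(response)
```

Keeping the handler stateless, with all context in the request and the semantic models behind it, is what lets the same analytics process serve both analysts and point-of-interaction callers.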
Moving Forward

Building an intelligent semantic model on top of your current information architecture may sound daunting. But it can and must be done if you hope to discover and act on the next game-changing insight before your competitors do. You can get started with these important but gradual changes:

• Prioritize the onboarding of data by its ability to create truly individualized customer experiences or business-changing insights into market needs or operations.

• Onboard data assets as they become available, without waiting for a specific use case. Those uses will emerge when you least expect them, and when they do, you'll need the data immediately. At a minimum, map the data to the semantic model for easier access.

• Load the data without filtering or transforming it, since new data rules may override old rules. Filtering and transformation rules can be applied as the data is moved to an analytics engine or during query execution.

• Model the data using familiar terminology. This makes it easier to change the model as needed without physically moving the data. Customize models for specific business groups, encouraging them to ensure the models' accuracy and completeness.

• Enable search mechanisms that make it easier for business users to see what data is available and accessible.

• Balance legal and compliance needs for security with the imperative to improve the customer experience through analytics.
Code Halos are powerful sources of industry-changing insights. Too often, this
knowledge is difficult to find within the volume and complexity of the data. A data
lake provides the radar that allows end users to quickly see past the chaos of too
much data and focus on the actionable insights they need.
Footnotes

1. For more on Code Halo thinking, see Malcolm Frank, Paul Roehrig and Ben Pring, "Code Rules: A Playbook for Managing at the Crossroads," Cognizant Technology Solutions, June 2013, http://www.cognizant.com/Futureofwork/Documents/code-rules.pdf, or read our book Code Halos: How the Digital Lives of People, Things, and Organizations are Changing the Rules of Business, by Malcolm Frank, Paul Roehrig and Ben Pring, John Wiley & Sons, April 2014, http://www.wiley.com/WileyCDA/WileyTitle/productCd-1118862074.html.

2. "Gartner Says the Internet of Things Installed Base Will Grow to 26 Billion Units by 2020," Gartner, Inc., Dec. 13, 2013, http://www.gartner.com/newsroom/id/2636073.

3. James Dixon, "Pentaho, Big Data, and Data Lakes," Pentaho, October 2010, http://jamesdixon.wordpress.com/2010/10/14/pentaho-hadoop-and-data-lakes/.

4. Thomas Kelly, "Semantic Technology Drives Agile Business," Cognizant Technology Solutions, August 2013, http://www.cognizant.com/InsightsWhitepapers/How-Semantic-Technology-Drives-Agile-Business.pdf.

5. "Financial Industry Business Ontology," Object Management Group, http://www.omg.org/hot-topics/fibo.htm.

6. "Representing Regulations and Guidance in RDF," PhUSE Wiki, http://bit.ly/18WAxxf.
Note: The logos and company names presented throughout this white paper are the
property of their respective trademark owners, are not affiliated with Cognizant
Technology Solutions, and are displayed for illustrative purposes only. Use of the
logo does not imply endorsement of the organization by Cognizant, nor vice versa.
About the Authors
Thomas Kelly is a Director in Cognizant’s Enterprise Information Management (EIM)
Practice and heads its Semantic Technology Center of Excellence, a technology
specialty of Cognizant Business Consulting. He has 20-plus years of technology
consulting experience in leading data warehousing, business intelligence and big
data projects, focused primarily on the life sciences and healthcare industries. He
can be reached at Thomas.Kelly@cognizant.com.
Robert Hoyle Brown is an Associate Vice President in Cognizant’s Center for the
Future of Work, and drives strategy and market outreach for the Business Process
Services Practice. Robert is responsible for looking at how Code Halos will change
enterprise business process services in the coming years and is a regular contributor
to the blog www.unevenlydistributed.com, “Signals from the Future of Work.” Prior
to joining Cognizant, he was Managing Vice President of the Business and Applications Services team at Gartner, and as a research analyst, he was a recognized
subject matter expert in BPO, cloud services/BPaaS and HR services. He also held
roles at Hewlett-Packard and G2 Research, a boutique outsourcing research firm in
Silicon Valley. He holds a Bachelor of Arts degree from the University of California
at Berkeley and, prior to his graduation, attended the London School of Economics
as a Hansard Scholar. He can be reached at Robert.H.Brown@cognizant.com.
The authors would like to thank Editorial Director Alan Alper for his insights and
contributions to this white paper.