Introducing:
Trillium DQ for Big Data
Harald Smith, Director Product Marketing
Housekeeping
Webcast Audio
• Today’s webcast audio is streamed through your computer speakers.
• If you need technical assistance with the web interface or audio,
please reach out to us using the chat window.
Questions Welcome
• Submit your questions at any time during the presentation
using the chat window.
• We will answer them during our Q&A session following the
presentation.
Recording and slides
• This webcast is being recorded. You will receive an
email following the webcast with a link to download
both the recording and the slides.
Speaker
Harald Smith
• Director of Product Marketing, Syncsort
• 20+ years in Information Management with a focus
on data quality, integration, and governance
• Co-author of Patterns of Information Management
• Author of two Redbooks on Information Governance
and Data Integration
• Blog on InfoWorld: “Data Democratized”
Data challenges across the business
Business Leaders
Lack trust in data needed to
make rapid, accurate
decisions that grow business
Business Analysts
Can’t access or understand
data and spend excessive
time on investigating
Information Leaders
Must facilitate business
collaboration and data
transparency and governance
Chief Data Officers
Make data a strategic
business asset utilizing
scientific skills from basic
spreadsheet knowledge
Only 35% of senior
executives have a high
level of trust in the
accuracy of their Big
Data Analytics
92% of executives are
concerned about the
negative impact of data
and analytics on
corporate reputation
New survey indicates
nearly 80% of AI/ML
projects stalling due to
poor data quality
84% of CEOs
are concerned about
the quality of the data
they’re basing
decisions on
Big Data Needs
Data Quality
Data Quality Challenges of Big Data
Profiling Data
• Organizations are storing vast amounts of data in data lakes and the Cloud –
from many different sources – but that data isn’t usable unless it is understood,
and to understand it, the business users who work with the data must be able
to access and profile it without constant IT help
Matching Entities Accurately
• Distinguishing matches that indicate a single specific entity across so much data
requires sophisticated multi-field matching algorithms – that need to be
understandable by business users to be meaningful
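As an illustration of multi-field matching that stays readable to a business user, the sketch below scores two records field by field and combines the scores with weights. It is a minimal sketch, not Trillium's algorithm: the field names, the weights, and the use of Python's standard `difflib.SequenceMatcher` as the comparison are all assumptions made for this example.

```python
from difflib import SequenceMatcher

# Hypothetical field weights, chosen only for illustration (they sum to 1).
FIELD_WEIGHTS = {"name": 0.5, "address": 0.3, "city": 0.2}


def field_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two strings, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def explain_match(rec_a: dict, rec_b: dict) -> dict:
    """Per-field similarities, so a reviewer can see why two records matched."""
    return {
        field: field_similarity(rec_a.get(field, ""), rec_b.get(field, ""))
        for field in FIELD_WEIGHTS
    }


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted sum of the per-field similarities."""
    per_field = explain_match(rec_a, rec_b)
    return sum(FIELD_WEIGHTS[field] * sim for field, sim in per_field.items())


a = {"name": "Acme Corp", "address": "12 Main St", "city": "Boston"}
b = {"name": "ACME Corp.", "address": "12 Main Street", "city": "Boston"}
c = {"name": "Zenith Ltd", "address": "99 Ocean Ave", "city": "Miami"}

print(match_score(a, b), explain_match(a, b))
print(match_score(a, c), explain_match(a, c))
```

Returning the per-field breakdown alongside the total is what keeps the result explainable: a reviewer can see which fields drove a match instead of trusting a single opaque number.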
Scalability
• Distinguishing matches across massive datasets requires a lot of compute
power – everything has to be compared to everything else, multiple
times in multiple ways
• Taking advantage of Big Data processing for scalability requires specialized skills
and takes a long time – and requires tuning, re-writing as technology changes
• Traditional data quality tools are not designed to work on that scale of data
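To make the scale problem concrete: comparing every record to every other record means n(n-1)/2 comparisons, which grows quadratically. The sketch below counts those pairs and shows blocking, a common general technique in which only records sharing a cheap key are compared. Blocking is an assumption brought in for illustration; the slides do not say how the product limits comparisons, and the first-letter key is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def all_pairs_count(n: int) -> int:
    """Number of unordered pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def blocked_pairs(records, key):
    """Yield only the pairs of records that share the same blocking key."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[key(rec)].append(rec)
    for group in blocks.values():
        yield from combinations(group, 2)


# One million records already imply roughly half a trillion comparisons.
print(all_pairs_count(1_000_000))

records = ["acme", "acme corp", "zenith", "zen", "beta"]
# Hypothetical blocking key: the first letter of the name.
pairs = list(blocked_pairs(records, key=lambda r: r[0]))
print(len(pairs), "pairs instead of", all_pairs_count(len(records)))
```

The trade-off is that a key that is too coarse saves little work, while one that is too strict can separate true matches into different blocks.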
Trillium DQ for Big Data
Understand, Evaluate, and Resolve Big Data Quality Problems
Trillium Discovery for Big Data
Data Profiling
Gain a complete picture of your data before
use
• Understand the data
• Analyze the data
• Find data quality problems
• Build and evaluate data quality rules
Trillium DQ for Big Data
On Premises or via Trillium Cloud
Deploy any or all products to the cloud - Completely managed SaaS in AWS or Azure
Trillium Quality for Big Data
Data Cleansing and Matching
Cleanse, standardize, and connect
data in accordance with your predefined
standards
• Entity matching and resolution
• Data cleansing and correction
• Data record enrichment
Feature-rich data profiling and data quality processing engines
• Leveraging over two decades of data quality expertise
An efficient orchestration of this engine in Big Data distributed
frameworks
• Powered by an architecture that has been in production with very large
(2000+ node) environments running natively across the cluster
• Close partnership with Cloudera and Hortonworks, with native integration with the stack
• Syncsort has been a major contributor to the Apache Hadoop open source project
• With efficient orchestration, we can process any number of attributes with a handful
of MapReduce jobs
• Same architecture is used for Apache Spark
“Design once, deploy anywhere” architecture
• Native connectivity providing breadth and performance
• “Intelligent Execution” to optimize process execution at run-time
(MapReduce, Spark 1.x, Spark 2.x)
• On-premises and in the cloud (e.g. Amazon EMR)
Data Quality for Your Big Data Needs
Key Outcomes
• Reduce the time for business analysts to discover and understand
data on Big Data platforms
• Allow business analysts who understand the data but have little
technical expertise to quickly find data and run data profiling in
three steps
• Let analysts explore results and drill down to details within 2-5
seconds per view, to review and then report on data issues to
business leaders
• Scale to large volumes of data sources & attributes so that business
analysts can understand the contents of any data source needed for
business decisions
• Keep data secured in process and at rest, and available only to
authorized users, to comply with regulations and avoid fines
Trillium Discovery for Big Data
Trillium Discovery for Big Data
• Delivers enterprise trusted Trillium Discovery on distributed big data
platforms (e.g. Hadoop, Spark) for high-volume, scalable data profiling
• Provides complete Trillium Discovery data profiling for analysis & review
• Attribute metadata, value & pattern frequencies, key & dependency analysis,
cross-source join analysis, drill down to any outlier or issue, and more…
• Provides easily configured native connectivity for Big Data sources
• Provides management and monitoring of task execution
• Integrates with the security frameworks (Kerberos, AD, LDAP) of
Big Data platforms
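As a rough illustration of what value and pattern frequencies mean in a column profile, the sketch below profiles one column in plain Python. It is a toy under stated assumptions, not the Trillium Discovery implementation: the pattern notation (`9` for a digit, `A` for a letter) and the function names are hypothetical.

```python
from collections import Counter


def pattern_of(value: str) -> str:
    """Map digits to '9' and letters to 'A'; keep other characters as they are."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)


def profile_column(values: list) -> dict:
    """Basic profile: counts, distinct values, value and pattern frequencies."""
    non_null = [v for v in values if v not in (None, "")]
    return {
        "count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "value_freq": Counter(non_null),
        "pattern_freq": Counter(pattern_of(v) for v in non_null),
    }


# Hypothetical postal-code column with a short value, a typo (letter O), and a null.
postal_codes = ["02110", "02110", "2110", "O2110", None]
profile = profile_column(postal_codes)
print(profile["pattern_freq"].most_common())
```

Rare patterns are where an analyst would drill down: here the four-digit value and the value starting with a letter stand out against the dominant five-digit pattern.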
Run Profiling (diagram: tasks 1 through n)
Trillium Discovery for Big Data – Data Profiling at Scale
Select Source
Explore Profiles
Run Profiling
Stored Profiling Results
▪ Metadata & Statistics
▪ Frequency Distributions
▪ Drilldown Indices
Share &
Govern
Results
Integration
(APIs)
Notification
Collaboration
Native Connectors
▪ HDFS source directories
▪ …
Drilldown to Issues
Evaluate Business Rules
Key Outcomes
• Match and link any data entity – customers, suppliers, products, etc. –
into a trusted single view to support a broad array of business-critical
use cases (e.g. Customer 360, fraud, AML)
• Parse and standardize complex multi-domain data, extended with
enrichment and verification of critical address and geolocation data –
all leveraging out-of-the-box templates
• Utilize “design once, deploy anywhere” approach to speed time-to-
value and focus on building data quality business logic while letting the
product handle the technical aspects of framework execution with no
coding or tuning required
• Leverage the high-performance compute power of distributed Big Data
frameworks including Hadoop MapReduce and Spark to process high
volumes within targeted time windows to meet critical Service Level
Agreements (SLAs)
Trillium Quality for Big Data
Trillium Quality for Big Data
• Integrate, parse, standardize, and match new and legacy customer data
from multiple disparate sources.
• Provide high-quality entity resolution through multi-domain deduplication
and matching with the most comprehensive set of match comparisons
available, including fuzzy matching, distance comparisons, and more.
• Standardize, enhance, and match international data sets with postal and
country-code validation.
• Deploy data quality workflows as native, parallel MapReduce or Spark
processes for optimal efficiency.
• Process hundreds of millions of records of data.
• Increase processing efficiency.
• Support failover through Hadoop’s fault-tolerant design; during a node
failure, processing is redirected to another node.
Trillium Quality for Big Data – Data Cleansing at Scale
Boost effectiveness of machine learning, AI with complete, standardized, matched data.
1. Visually create and test data
quality processes locally
2. Execute in MapReduce or Spark
On premises or in the Cloud
Big Data Platform
Syncsort Trillium Delivers Data You Can Trust
Data Profiling
Business Rules & Data Quality Assessment
Data Validation, Standardization, Enrichment & more
Matching, Entity Resolution & Verification
• Customer 360
• AI/ML
• Operational Integrations
• Analytics & Reporting
• Data Governance
Trillium Discovery for Big Data
Trillium Quality for Big Data
+ Global Address Verification
Trillium DQ for Big Data
Trillium DQ for Big Data
Use Cases
Turn your Big Data
into a trusted view
of your customers,
products and more
Power machine
learning and
advanced analytics
with reliable, fit-for-
purpose data
Gain actionable
business insights
from high-volume
disparate data sets
from across the
enterprise
Deploy industry-
leading data quality
processes at massive
scale, with no coding
or Big Data skills
required
Trillium DQ for Big
Data evaluates &
transforms your Big
Data for trusted
business insights
Anti-Money
Laundering on
Hadoop at
Global Bank
CHALLENGE
• Must provide highly accurate
entity resolution
• Must be secure – Kerberos, LDAP
• Must have lineage – data origin
to end point
• Massive data volumes
• Scattered data – Mainframe,
RDBMS, Cloud, …
• Must archive unaltered
mainframe data
SOLUTION
Full Anti-Money Laundering
regulatory compliance with
financial crimes data lake –
high performance results at
massive scale.
• Full end-to-end data lineage
supplied to Apache Atlas
and ASG Data Intelligence
• Cluster-native data
verification, enrichment,
and demanding multi-field
entity resolution on Spark
• Unmodified mainframe
“Golden Records” stored
on Hadoop
Bank must monitor transactions
to detect Money Laundering for
FCA compliance.
Machine learning can detect
patterns, but …
Requires large amounts of
current, clean data.
• Trillium DQ for Big Data
• Connect CDC
• Connect for Big Data
Trillium DQ for
Big Data Cleanses
Credit Data for
Creditsafe
CHALLENGE
Ensure ALL DATA on each company is
analyzed – and NO DATA from another
company is accidentally included –
to get accurate corporate credit ratings.
• Need to profile, cleanse and enhance
data to evaluate credit ratings for
80 million companies in U.S. alone
• Existing solution lacked flexible
de-dupe matching rules, scalability
• Millions of records to analyze per
company, in multiple inconsistent
data sources, about 800 million/day
total and growing
• Solution must scale!
SOLUTION
• Amazon EMR Cloud
• Trillium DQ for Big Data cleansed,
standardized and matched over
130 million recs/hour on basic
10-node test cluster – met the
business SLA with room to grow
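The "room to grow" claim can be checked with simple arithmetic on the two figures from this slide. The calculation below is a back-of-the-envelope sketch: it assumes the test-cluster throughput could be sustained around the clock and that both figures count the same kind of record, and it treats "over 130 million" as a lower bound.

```python
# Figures taken from the slide.
RECORDS_PER_DAY = 800_000_000    # "about 800 million/day total and growing"
RECORDS_PER_HOUR = 130_000_000   # "over 130 million recs/hour" on the 10-node test cluster

# Hours of processing needed to clear one day's volume.
hours_needed = RECORDS_PER_DAY / RECORDS_PER_HOUR

# Records the cluster could process in 24 hours at the quoted rate.
daily_capacity = RECORDS_PER_HOUR * 24

# How many times the daily volume fits into that capacity.
headroom = daily_capacity / RECORDS_PER_DAY

print(f"hours needed per day: {hours_needed:.2f}")
print(f"daily capacity: {daily_capacity:,} records")
print(f"headroom factor: {headroom:.1f}x")
```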
96% Address Matching Accuracy
after Trillium cleansing,
standardization
Saved software costs – Replaced
multiple solutions and tools
Saved Amazon cluster costs and
left room for company growth
“We can’t afford to miss
information, or mix up information
about businesses with similar
names. Companies count on our
highly accurate predictive scoring
to provide fast, accurate ratings
for their potential customers
and vendors.”
Next Steps
For more information on Trillium DQ for Big Data and our other
Syncsort Trillium data quality solutions, please visit:
https://www.syncsort.com/en/products/trillium-dq-for-big-data
And:
https://www.syncsort.com/en/integrate
Q & A
Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for the Data Lake

AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity Webinar
Precisely
 
Optimisez la fonction financière en automatisant vos processus SAP
Optimisez la fonction financière en automatisant vos processus SAPOptimisez la fonction financière en automatisant vos processus SAP
Optimisez la fonction financière en automatisant vos processus SAP
Precisely
 
SAPS/4HANA Migration - Transformation-Management + nachhaltige Investitionen
SAPS/4HANA Migration - Transformation-Management + nachhaltige InvestitionenSAPS/4HANA Migration - Transformation-Management + nachhaltige Investitionen
SAPS/4HANA Migration - Transformation-Management + nachhaltige Investitionen
Precisely
 

More from Precisely (20)

AI-Ready Data - The Key to Transforming Projects into Production.pptx
AI-Ready Data - The Key to Transforming Projects into Production.pptxAI-Ready Data - The Key to Transforming Projects into Production.pptx
AI-Ready Data - The Key to Transforming Projects into Production.pptx
 
Building a Multi-Layered Defense for Your IBM i Security
Building a Multi-Layered Defense for Your IBM i SecurityBuilding a Multi-Layered Defense for Your IBM i Security
Building a Multi-Layered Defense for Your IBM i Security
 
Optimierte Daten und Prozesse mit KI / ML + SAP Fiori.pdf
Optimierte Daten und Prozesse mit KI / ML + SAP Fiori.pdfOptimierte Daten und Prozesse mit KI / ML + SAP Fiori.pdf
Optimierte Daten und Prozesse mit KI / ML + SAP Fiori.pdf
 
Chaining, Looping, and Long Text for Script Development and Automation.pdf
Chaining, Looping, and Long Text for Script Development and Automation.pdfChaining, Looping, and Long Text for Script Development and Automation.pdf
Chaining, Looping, and Long Text for Script Development and Automation.pdf
 
Revolutionizing SAP® Processes with Automation and Artificial Intelligence
Revolutionizing SAP® Processes with Automation and Artificial IntelligenceRevolutionizing SAP® Processes with Automation and Artificial Intelligence
Revolutionizing SAP® Processes with Automation and Artificial Intelligence
 
Navigating the Cloud: Best Practices for Successful Migration
Navigating the Cloud: Best Practices for Successful MigrationNavigating the Cloud: Best Practices for Successful Migration
Navigating the Cloud: Best Practices for Successful Migration
 
Unlocking the Power of Your IBM i and Z Security Data with Google Chronicle
Unlocking the Power of Your IBM i and Z Security Data with Google ChronicleUnlocking the Power of Your IBM i and Z Security Data with Google Chronicle
Unlocking the Power of Your IBM i and Z Security Data with Google Chronicle
 
How to Build Data Governance Programs That Last - A Business-First Approach.pdf
How to Build Data Governance Programs That Last - A Business-First Approach.pdfHow to Build Data Governance Programs That Last - A Business-First Approach.pdf
How to Build Data Governance Programs That Last - A Business-First Approach.pdf
 
Zukuntssichere SAP Prozesse dank automatisierter Massendaten
Zukuntssichere SAP Prozesse dank automatisierter MassendatenZukuntssichere SAP Prozesse dank automatisierter Massendaten
Zukuntssichere SAP Prozesse dank automatisierter Massendaten
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
Crucial Considerations for AI-ready Data.pdf
Crucial Considerations for AI-ready Data.pdfCrucial Considerations for AI-ready Data.pdf
Crucial Considerations for AI-ready Data.pdf
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Justifying Capacity Managment Webinar 4/10
Justifying Capacity Managment Webinar 4/10Justifying Capacity Managment Webinar 4/10
Justifying Capacity Managment Webinar 4/10
 
Automate Studio Training: Materials Maintenance Tips for Efficiency and Ease ...
Automate Studio Training: Materials Maintenance Tips for Efficiency and Ease ...Automate Studio Training: Materials Maintenance Tips for Efficiency and Ease ...
Automate Studio Training: Materials Maintenance Tips for Efficiency and Ease ...
 
Leveraging Mainframe Data in Near Real Time to Unleash Innovation With Cloud:...
Leveraging Mainframe Data in Near Real Time to Unleash Innovation With Cloud:...Leveraging Mainframe Data in Near Real Time to Unleash Innovation With Cloud:...
Leveraging Mainframe Data in Near Real Time to Unleash Innovation With Cloud:...
 
Testjrjnejrvnorno4rno3nrfnfjnrfnournfou3nfou3f
Testjrjnejrvnorno4rno3nrfnfjnrfnournfou3nfou3fTestjrjnejrvnorno4rno3nrfnfjnrfnournfou3nfou3f
Testjrjnejrvnorno4rno3nrfnfjnrfnournfou3nfou3f
 
Data Innovation Summit: Data Integrity Trends
Data Innovation Summit: Data Integrity TrendsData Innovation Summit: Data Integrity Trends
Data Innovation Summit: Data Integrity Trends
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity Webinar
 
Optimisez la fonction financière en automatisant vos processus SAP
Optimisez la fonction financière en automatisant vos processus SAPOptimisez la fonction financière en automatisant vos processus SAP
Optimisez la fonction financière en automatisant vos processus SAP
 
SAPS/4HANA Migration - Transformation-Management + nachhaltige Investitionen
SAPS/4HANA Migration - Transformation-Management + nachhaltige InvestitionenSAPS/4HANA Migration - Transformation-Management + nachhaltige Investitionen
SAPS/4HANA Migration - Transformation-Management + nachhaltige Investitionen
 

Recently uploaded

Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
UiPathCommunity
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 

Recently uploaded (20)

Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 

Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for the Data Lake

  • 1. Introducing: Trillium DQ for Big Data Harald Smith, Director Product Marketing
  • 2. Housekeeping Webcast Audio • Today’s webcast audio is streamed through your computer speakers. • If you need technical assistance with the web interface or audio, please reach out to us using the chat window. Questions Welcome • Submit your questions at any time during the presentation using the chat window. • We will answer them during our Q&A session following the presentation. Recording and slides • This webcast is being recorded. You will receive an email following the webcast with a link to download both the recording and the slides.
  • 3. Speaker Harald Smith • Director of Product Marketing, Syncsort • 20+ years in Information Management with a focus on data quality, integration, and governance • Co-author of Patterns of Information Management • Author of two Redbooks on Information Governance and Data Integration • Blog on InfoWorld: “Data Democratized”
  • 4. Data challenges across the business • Business Leaders: Lack trust in data needed to make rapid, accurate decisions that grow business • Business Analysts: Can’t access or understand data and spend excessive time on investigating • Information Leaders: Must facilitate business collaboration and data transparency and governance • Chief Data Officers: Make data a strategic business asset utilizing scientific skills from basic spreadsheet knowledge
  • 5. Big Data Needs Data Quality • Only 35% of senior executives have a high level of trust in the accuracy of their Big Data Analytics • 92% of executives are concerned about the negative impact of data and analytics on corporate reputation • New survey indicates nearly 80% of AI/ML projects stalling due to poor data quality • 84% of CEOs are concerned about the quality of the data they’re basing decisions on
  • 6. Data Quality Challenges of Big Data Profiling Data • Organizations are storing vast amounts of data in data lakes and the Cloud – from many different sources – but that data isn’t usable unless it is understood, and to understand it, the business users who work with the data must be able to access and profile it without constant IT help Matching Entities Accurately • Distinguishing matches that indicate a single specific entity across so much data requires sophisticated multi-field matching algorithms – that need to be understandable by business users to be meaningful Scalability • Distinguishing matches across massive datasets requires a lot of compute power - everything has to be compared to everything else, multiple times in multiple ways • Taking advantage of Big Data processing for scalability requires specialized skills and takes a long time – and requires tuning and rewriting as technology changes • Traditional data quality tools are not designed to work on that scale of data
  • 7. Trillium DQ for Big Data: Understand, Evaluate, and Resolve Big Data Quality Problems Trillium Discovery for Big Data (Data Profiling): Gain a complete picture of your data before use • Understand the data • Analyze the data • Find data quality problems • Build and evaluate data quality rules Trillium Quality for Big Data (Data Cleansing and Matching): Cleanse, standardize, and connect data in accordance with your predefined standards • Entity matching and resolution • Data cleansing and correction • Data record enrichment On Premises or via Trillium Cloud: Deploy any or all products to the cloud - Completely managed SaaS in AWS or Azure
  • 8. Data Quality for Your Big Data Needs Feature-rich data profiling and data quality processing engines • Leveraging over two decades of data quality expertise An efficient orchestration of this engine in Big Data distributed frameworks • Powered by an architecture that has been in production with very large (2000+ node) environments running natively across the cluster • Partnered closely with Cloudera and Hortonworks for native integration with the stack • Syncsort has been a major contributor to the Apache Hadoop open source project • With efficient orchestration, we can process any number of attributes with a handful of MapReduce jobs • Same architecture is used for Apache Spark “Design once, deploy anywhere” architecture • Native connectivity providing breadth and performance • “Intelligent Execution” to optimize process execution at run-time (MapReduce, Spark 1.x, Spark 2.x) • On premises and in the cloud (e.g. Amazon EMR)
  • 9. Trillium Discovery for Big Data: Key Outcomes • Reduce the time for business analysts to discover and understand data on Big Data platforms • Allow business analysts who understand the data but have little technical expertise to quickly find data and run data profiling in three steps • Let analysts explore results and drill down to details within 2-5 seconds per view to review and then report on data issues to business leaders • Scale to large volumes of data sources & attributes so that business analysts can understand the contents of any data source needed for business decisions • Data is always secured in process and at rest and only available to authorized users to comply with regulations and avoid fines
  • 10. Trillium Discovery for Big Data • Delivers enterprise-trusted Trillium Discovery on distributed big data platforms (e.g. Hadoop, Spark) for high-volume, scalable data profiling • Provides complete Trillium Discovery data profiling for analysis & review • Attribute metadata, value & pattern frequencies, key & dependency analysis, cross-source join analysis, drill down to any outlier or issue, and more… • Provides easily configured native connectivity for Big Data sources • Provides managing and monitoring for task execution • Integrates with the security frameworks (Kerberos, AD, LDAP) of Big Data platforms
  • 11. Trillium Discovery for Big Data – Data Profiling at Scale [Diagram labels: Select Source • Run Profiling (tasks 1 … n) • Explore Profiles • Native Connectors ▪ HDFS source directories ▪ … • Stored Profiling Results ▪ Metadata & Statistics ▪ Frequency Distributions ▪ Drilldown Indices • Evaluate Business Rules • Drilldown to Issues • Share & Govern Results: Integration (APIs), Notification, Collaboration]
  • 12. Trillium Quality for Big Data: Key Outcomes • Match and link any data entity – customers, suppliers, products, etc. – into a trusted single view to support a broad array of business-critical use cases (e.g. Customer 360, fraud, AML) • Parse and standardize complex multi-domain data, extended with enrichment and verification of critical address and geolocation data – all leveraging out-of-the-box templates • Utilize “design once, deploy anywhere” approach to speed time-to-value and focus on building data quality business logic while letting the product handle the technical aspects of framework execution with no coding or tuning required • Leverage the high-performance compute power of distributed Big Data frameworks including Hadoop MapReduce and Spark to process high volumes within targeted time windows to meet critical Service Level Agreements (SLAs)
  • 13. Trillium Quality for Big Data • Integrate, parse, standardize, and match new and legacy customer data from multiple disparate sources. • Provide high-quality entity resolution through multi-domain deduplication and matching with the most comprehensive set of match comparisons available, including fuzzy matching, distance comparisons, and more. • Standardize, enhance, and match international data sets with postal and country-code validation. • Deploy data quality workflows as native, parallel MapReduce or Spark processes for optimal efficiency. • Process hundreds of millions of records of data. • Increase processing efficiency. • Support failover through Hadoop’s fault-tolerant design; during a node failure, processing is redirected to another node.
  • 14. Trillium Quality for Big Data – Data Cleansing at Scale Boost effectiveness of machine learning and AI with complete, standardized, matched data. 1. Visually create and test data quality processes locally 2. Execute in MapReduce or Spark on a Big Data platform, on premises or in the Cloud
  • 15. Syncsort Trillium Delivers Data You Can Trust • Data Profiling • Business Rules & Data Quality Assessment • Data Validation, Standardization, Enrichment & more • Matching, Entity Resolution & Verification • Customer 360 • AI/ML • Operational Integrations • Analytics & Reporting • Data Governance • Trillium Discovery for Big Data • Trillium Quality for Big Data + Global Address Verification • Trillium DQ for Big Data
  • 16. Trillium DQ for Big Data Use Cases
  • 17. Trillium DQ for Big Data evaluates & transforms your Big Data for trusted business insights • Turn your Big Data into a trusted view of your customers, products and more • Power machine learning and advanced analytics with reliable, fit-for-purpose data • Gain actionable business insights from high-volume disparate data sets from across the enterprise • Deploy industry-leading data quality processes at massive scale, with no coding or Big Data skills required
  • 18. Anti-Money Laundering on Hadoop at Global Bank Challenge: Bank must monitor transactions to detect Money Laundering for FCA compliance. Machine learning can detect patterns, but … Requires large amounts of current, clean data. • Must provide highly accurate entity resolution • Must be secure – Kerberos, LDAP • Must have lineage – data origin to end point • Massive data volumes • Scattered data – Mainframe, RDBMS, Cloud, … • Must archive unaltered mainframe data Solution: • Trillium DQ for Big Data • Connect CDC • Connect for Big Data Full Anti-Money Laundering regulatory compliance with financial crimes data lake – high performance results at massive scale. • Full end-to-end data lineage supplied to Apache Atlas and ASG Data Intelligence • Cluster-native data verification, enrichment, and demanding multi-field entity resolution on Spark • Unmodified mainframe “Golden Records” stored on Hadoop
  • 19. Trillium DQ for Big Data Cleanses Credit Data for Creditsafe Challenge: Ensure ALL DATA on each company is analyzed – and NO DATA from another company is accidentally included – to get accurate corporate credit ratings. • Need to profile, cleanse and enhance data to evaluate credit ratings for 80 million companies in U.S. alone • Existing solution lacked flexible de-dupe matching rules, scalability • Millions of records to analyze per company, in multiple inconsistent data sources, about 800 million/day total and growing • Solution must scale! Solution: • Amazon EMR Cloud • Trillium DQ for Big Data cleansed, standardized and matched over 130 million records/hour on basic 10-node test cluster – met the business SLA with room to grow • 96% Address Matching Accuracy after Trillium cleansing, standardization • Saved software costs – Replaced multiple solutions and tools • Saved Amazon cluster costs and left room for company growth “We can’t afford to miss information, or mix up information about businesses with similar names. Companies count on our highly accurate predictive scoring to provide fast, accurate ratings for their potential customers and vendors.”
  • 20. Next Steps For more information on Trillium DQ for Big Data and our other Syncsort Trillium data quality solutions, please visit: https://www.syncsort.com/en/products/trillium-dq-for-big-data And: https://www.syncsort.com/en/integrate
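
Slide 10 lists value and pattern frequencies among the profiling outputs. The sketch below shows, in plain Python, what such statistics could look like for a single attribute. It is only an illustration: the function names (`value_pattern`, `profile_column`), the pattern alphabet (`A` for letters, `9` for digits), and the sample postcodes are assumptions made here and are not taken from Trillium Discovery.

```python
from collections import Counter


def value_pattern(value: str) -> str:
    """Reduce a value to its character-class pattern.

    Hypothetical convention for this sketch: letters become 'A',
    digits become '9', everything else is kept unchanged.
    """
    out = []
    for ch in value:
        if ch.isalpha():
            out.append("A")
        elif ch.isdigit():
            out.append("9")
        else:
            out.append(ch)
    return "".join(out)


def profile_column(values):
    """Compute basic profiling statistics for one attribute (column)."""
    non_null = [v for v in values if v is not None and v != ""]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "value_freq": Counter(non_null),
        "pattern_freq": Counter(value_pattern(v) for v in non_null),
    }


# Made-up sample data: mostly 5-digit codes plus a few outliers.
postcodes = ["02139", "10001", "1000A", None, "02139", "SW1A 1AA"]
profile = profile_column(postcodes)

# Rare patterns are candidates for drill-down as possible quality issues.
for pattern, freq in profile["pattern_freq"].most_common():
    print(pattern, freq)
```

A pattern that occurs only once among many values of a dominant pattern is the kind of outlier an analyst would drill down to.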
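
Slides 6 and 13 describe multi-field fuzzy matching and note that comparing everything to everything else is expensive. The following is a minimal sketch of one common way to reduce that cost, often called blocking: only records that share a key are compared. This is a generic technique shown for illustration, not a description of how Trillium Quality works internally. The field names, weights, threshold, and sample records are all assumptions, and the similarity measure is Python's standard `difflib.SequenceMatcher` rather than any product-specific comparison.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def all_pairs_count(n: int) -> int:
    """Number of comparisons if every record is compared with every other."""
    return n * (n - 1) // 2


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1] from the standard library."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_pairs(records, key):
    """Yield only pairs of records that share the same blocking key."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[key(rec)].append(rec)
    for block in blocks.values():
        yield from combinations(block, 2)


def is_match(a, b, threshold=0.7):
    """Weighted multi-field comparison; weights and threshold are illustrative."""
    score = (0.6 * similarity(a["name"], b["name"])
             + 0.4 * similarity(a["street"], b["street"]))
    return score >= threshold


# Made-up sample records.
records = [
    {"id": 1, "name": "Acme Corp", "street": "12 Main Street", "postcode": "02139"},
    {"id": 2, "name": "ACME Corporation", "street": "12 Main St", "postcode": "02139"},
    {"id": 3, "name": "Apex Ltd", "street": "9 Hill Road", "postcode": "10001"},
    {"id": 4, "name": "Acme Corp", "street": "77 Lake Ave", "postcode": "60601"},
]

pairs = list(candidate_pairs(records, key=lambda r: r["postcode"]))
matches = [(a["id"], b["id"]) for a, b in pairs if is_match(a, b)]

print("all pairs:", all_pairs_count(len(records)))
print("pairs after blocking:", len(pairs))
print("matches:", matches)
```

The quadratic growth is the point: one million records imply 499,999,500,000 pairwise comparisons without blocking. The trade-off is that a blocking key can hide true matches whose key values differ, which is one reason matching is typically run multiple times in multiple ways.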
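
The Creditsafe figures on slide 19 (about 800 million records per day, over 130 million records per hour on the test cluster) can be checked with simple arithmetic. The sketch below assumes the test-cluster throughput is sustained and that processing may be spread evenly over 24 hours; the slide does not state the actual SLA window, so the results are only indicative.

```python
DAILY_VOLUME = 800_000_000          # records per day, from slide 19
TEST_THROUGHPUT = 130_000_000       # records per hour on the 10-node test cluster


def required_rate_per_hour(daily_volume: int, window_hours: float = 24.0) -> float:
    """Average hourly rate needed to finish the daily volume within the window."""
    return daily_volume / window_hours


def hours_to_process(daily_volume: int, throughput_per_hour: int) -> float:
    """Hours needed for one day's volume at a given sustained throughput."""
    return daily_volume / throughput_per_hour


def headroom_factor(daily_volume: int, throughput_per_hour: int,
                    window_hours: float = 24.0) -> float:
    """How many times the daily volume fits into the window at this throughput."""
    return throughput_per_hour * window_hours / daily_volume


print(f"required rate: {required_rate_per_hour(DAILY_VOLUME):,.0f} records/hour")
print(f"time for one day's data: {hours_to_process(DAILY_VOLUME, TEST_THROUGHPUT):.2f} hours")
print(f"headroom: {headroom_factor(DAILY_VOLUME, TEST_THROUGHPUT):.1f}x")
```

Under these assumptions the daily volume needs roughly 33 million records per hour, so the quoted test throughput would leave about 3.9 times headroom, consistent with the slide's "room to grow".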