1 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Hortonworks Inc.: Ali Bajwa, Partner Solutions; Srikanth Venkat, Product Management
Dataguise: Subra Ramesh, SVP
Syncsort: Marco de Jong, Director
SynerScope: Jan-Kees Buenen, CEO
DataWorks Summit - Berlin
April 2018
Partner Ecosystem Showcase For
Apache Ranger And Apache Atlas
Agenda
Apache Ranger & Apache Atlas
Journey, Ecosystem & Partners
Hortonworks Partner Certification Program
SEC Ready & GOV Ready program
Partner Technology Showcase
Community Snapshot
May 2014: XASecure acquisition
July 2014: Enters Apache Incubation
Nov 2014: Ranger 0.4.0 release
July 2015: Ranger 0.5 / HDP 2.3
Aug 2016: Ranger 0.6 / HDP 2.5
Nov 2016: Ranger 0.6.2 / HDP 2.5.3
Jan 2017: Ranger TLP graduation!
Apr 2017: Ranger 0.7 / HDP 2.6
Jun 2017: Ranger 0.7.1 / HDP 2.6.1
Aug 2017: Ranger 0.7.1+ / HDP 2.6.2
Oct 2017: Ranger 0.7.1++ / HDP 2.6.3
Q3 2018: Ranger 1.0.0
• Committers: 27
• Contributors from: eBay, MSFT, Huawei, Pandora, Accenture, ING, Talend, ZTE
Ranger 0.7.1 / HDP 2.6.1 - HDP 2.6.4
Ranger 0.7 / HDP 2.6
• Export/import of Policies
• $User and macros
• Plugin status tab
• “Show columns” and “describe extended support”
• Incremental LDAP Sync
• SmartSense Metrics
• User Sync Nested LDAP Support
• Tag based Masking
• Tag Attribute Based Policy
• Hive Replication Authorization
• Hive kill query Authorization
• Default Admin Group Mapping
Apache Ranger: Ecosystem
Partner Integrations
Apache Ranger
Apache Kafka
Native Hadoop Service Authorizers
Azure Data Lake Store (ADLS)* (Future)
Authorizer Extensions for Non-Hadoop Filesystems & Stores
Community Snapshot
Dec 2014: DGI group kickoff
May 2015: Apache Atlas Incubation
Aug 2016: Apache Atlas 0.7 Foundation Release
Apr 2017: Apache Atlas 0.8 Release
Jun 2017: Atlas becomes TLP!
Q4 2017: Apache Atlas 0.8.1 Release
Q3 2018: Apache Atlas 1.0 Release
Global Financial Company
• Committers: 37
• Code contributors: Hortonworks, IBM, Aetna, Merck, Target
Apache Atlas 0.8.2 / HDP 2.6.1 - 2.6.4
• Business User Friendly Search & Filtering
• Knox Token Based Auth. Support
• Knox Proxying of Atlas UI
• Tag Deletion
Apache Atlas 0.8 / HDP 2.6.0
• Simplified Search UI
• Simplified API
• Classification-based security for HDFS, Kafka, HBase
• Knox SSO
• Performance/scalability improvements
Apache Atlas: Current Connectors and Ecosystem
Custom Integration
Partner
Apache Atlas
RDBMS
Apache Kafka
Pending:
Dataguise, Hortonworks, and GDPR
Subra Ramesh,
SVP Products, Dataguise, Inc.
©2017 Dataguise, Inc.
Confidential and Proprietary
Dataguise Company Background
• Based in Silicon Valley
• 10 years since founding
• Company focus is on sensitive, personal data
• Pioneers in data-centric security in Big Data (first deployed Hadoop customer 2012)
• Wide coverage of cloud technologies over the last 3-4 years
• Product: DgSecure, Service: DgSecure On-Demand
Proven technology, trusted by the world’s largest brands
Dataguise Technology
Where does DgSecure work? On-premises, Cloud
What does DgSecure do? Detection; Mask/Encrypt/Access Control; Integration; Breach Detection; Right of Access; Right of Erasure; Right to Restrict Processing; Reporting
What Hadoop Distributions does it support?
What Databases can it support? SQL Server
What Cloud Stores does it support?
What other Repositories does it support?
Dataguise has the widest automated coverage of key back-end GDPR requirements
• Finding and cataloging personal data – structured and unstructured – in a wide range of data stores
• Pseudonymizing – and where needed, anonymizing – data
• Breach detection with a focus on personal data access
• Right of access
• Right to be forgotten
• Right to restrict processing
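The pseudonymizing item above can be illustrated with a small sketch. The slides do not describe how DgSecure pseudonymizes data, so this shows one common, generic technique (keyed hashing with HMAC-SHA256), not the product's method. The function names, the field names, and the 16-character truncation are assumptions made for illustration only.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a stable keyed pseudonym for one personal-data value.

    The same value and key always give the same pseudonym, so joins
    across datasets still work, while the original value cannot be
    read back without a separate lookup. Truncation to 16 hex
    characters is an illustrative choice, not a recommendation.
    """
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def pseudonymize_record(record: dict, personal_fields: set, secret_key: bytes) -> dict:
    """Replace only the fields flagged as personal data; copy the rest."""
    return {
        name: pseudonymize(str(val), secret_key) if name in personal_fields else val
        for name, val in record.items()
    }


# Hypothetical record and key, for illustration only.
key = b"demo-key-keep-real-keys-in-a-vault"
row = {"email": "alice@example.com", "country": "DE"}
safe_row = pseudonymize_record(row, {"email"}, key)
```

Because the pseudonym depends on the key, destroying or rotating the key is one way to move from pseudonymized toward anonymized data. Whether that satisfies a given GDPR requirement is a legal question outside this sketch.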
Dataguise support of the Hortonworks Ecosystem
• Support for Detection of Personal Data in all the latest Hortonworks distributions
• Support for Masking (35+ different options), AES, Format Preserving Encryption in HDP
• Same level of support via direct HDFS and Hive APIs
• Support for YARN, Tez, Spark, and classical MapReduce execution engines
• Integration with Kerberos, Hortonworks TDE
• Monitoring of sensitive data access in both Hive and Hadoop
• HDInsight support
• Integration with Atlas and Ranger
Access Control Integration: DgSecure -> Atlas -> Ranger
• DgSecure runs detection continuously
• DgSecure pipes its results to Atlas, marking elements or columns that contain personal data via Tags
• Ranger Tag-Based Policies can be used to protect access at the level of personal data types
• Optionally, Ranger Tag-based Masking can be used to hide data selectively
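The flow above (detected tags on columns, then tag-based access and masking) can be modeled in a few lines. This is a simplified, self-contained sketch and not Ranger's policy engine, which also supports deny conditions, resource-based policies and more. The names TagPolicy and decide, the tag "SSN", and the group names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TagPolicy:
    """Toy stand-in for a tag-based policy: who may read, who sees masked data."""
    tag: str
    allowed_groups: set = field(default_factory=set)
    masked_groups: set = field(default_factory=set)
    enabled: bool = True


def decide(column: str, user_groups: set, column_tags: dict, policies: list) -> str:
    """Return 'allow', 'mask' or 'deny' for one column.

    Every tag on the column must be satisfied by an enabled policy;
    the most restrictive outcome across the tags wins. Untagged
    columns are treated as ungoverned in this sketch.
    """
    outcome = "allow"
    for tag in column_tags.get(column, set()):
        matching = [p for p in policies if p.enabled and p.tag == tag]
        if any(user_groups & p.allowed_groups for p in matching):
            continue
        if any(user_groups & p.masked_groups for p in matching):
            outcome = "mask"
            continue
        return "deny"
    return outcome


# Tags as a detection tool might have recorded them (hypothetical data).
column_tags = {"customers.ssn": {"SSN"}}
policies = [TagPolicy("SSN", allowed_groups={"compliance"}, masked_groups={"analysts"})]

full_access = decide("customers.ssn", {"compliance"}, column_tags, policies)
masked_access = decide("customers.ssn", {"analysts"}, column_tags, policies)
no_access = decide("customers.ssn", {"interns"}, column_tags, policies)
```

Setting enabled=False on the policy makes every request for the tagged column come back as 'deny' in this model, which mirrors the disabled-policy and enabled-policy Hive queries in the back-up slides.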
DgSecure -> Atlas -> Ranger Integration
DgSecure Detection -> Atlas Populated with Personal Data Tags -> Ranger Policies based on tags -> Access Control based on type of Personal Data
Demo
Thank You
Back-up Slides
DgSecure Policies
DgSecure Sensitive Types
DgSecure Dashboard
DgSecure Hadoop Results
DgSecure Hive Results
DgSecure Monitor Overview
DgSecure GDPR DSAR Overview
DgSecure Atlas Tags
DgSecure Atlas Tags - Detail
Ranger Tag-Based Policy (Disabled)
Hive Query – fails, user doesn’t have Ranger permission
Hive Query – failure details
Ranger Policy – enabled
Hive Query Executes Successfully
Hive Query Results
Lineage in DMX-h – ingestion to the cluster
DMX-h job executes
• In the cluster Sources/Targets: HDFS, Hive, S3
• Out of the cluster Sources/Targets: Mainframe, DBMSs, local and remote FS –
Syncsort External Datasets
DMX-h job collects lineage information
• Source/Target File or Table level
DMX-h job lineage is published into Apache Atlas
• Connect with lineage published from other tools (REST)
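To make the "lineage is published into Apache Atlas" step concrete, here is a sketch of what a source-to-target lineage record could look like as a JSON payload. The slides do not show how DMX-h builds its request, so this is an assumption based on the general shape of the Atlas v2 REST entity model (a Process entity whose inputs and outputs reference datasets by qualifiedName). The job name, cluster name, dataset names and helper functions are hypothetical, and the exact type names and endpoint should be checked against the Atlas version in use. The code only builds the payload and does not send it.

```python
import json


def object_ref(type_name: str, qualified_name: str) -> dict:
    """Reference an existing Atlas entity by its unique qualifiedName."""
    return {
        "typeName": type_name,
        "uniqueAttributes": {"qualifiedName": qualified_name},
    }


def lineage_process(job_name: str, cluster: str, inputs: list, outputs: list) -> dict:
    """Build a Process entity linking source datasets to target datasets.

    inputs and outputs are lists of (type_name, qualified_name) pairs.
    """
    return {
        "entity": {
            "typeName": "Process",
            "attributes": {
                "qualifiedName": f"{job_name}@{cluster}",
                "name": job_name,
                "inputs": [object_ref(t, q) for t, q in inputs],
                "outputs": [object_ref(t, q) for t, q in outputs],
            },
        }
    }


# Hypothetical ingestion job: one landing file in, one Hive table out.
payload = lineage_process(
    "ingest_customers",
    "demo_cluster",
    inputs=[("hdfs_path", "hdfs://demo/landing/customers.csv")],
    outputs=[("hive_table", "sales.customers@demo_cluster")],
)
print(json.dumps(payload, indent=2))
```

Because the datasets are referenced by qualifiedName rather than by internal ID, lineage published by different tools can connect on the same dataset entities, which is the point of the last bullet above.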
Syncsort Confidential and Proprietary - do not copy or distribute
Syncsort DMX-h Atlas Integration
Govern and Track Everything for Compliance
• Metadata and data lineage for Hive, Avro and
Parquet through HCatalog
• Metadata lineage export and API from DMX/DMX-h
– Simplify audits, analytics dashboards, metrics
– Integrate with enterprise metadata repositories
• Apache Ambari integration
– Native LDAP and Kerberos support
– Secure mainframe data access through FTPS and
Connect:Direct
• Apache Atlas ingestion lineage integration
– Audit and track data from source to cluster
– Lineage & tagging of Metadata for GDPR
Compliance
End-to-End Data Lineage in Apache Atlas
• Data Sources: Syncsort accesses data from sources outside cluster.
• Syncsort onboards data, modifies on-the-fly to match Hadoop storage model.
• Data Hub: Syncsort changes, enhances, joins data in cluster with MapReduce or Spark.
• Syncsort passes source-to-cluster data lineage info to Atlas.
• Analytics, Visualization: Analytics and visualizations get complete data.
• Data analyst gets end-to-end data lineage info from Atlas.
Syncsort: High Performance Import from Existing Databases
Benefits
• Connect to virtually any data source, including mainframe and MPP databases.
• Move data into and out of Hadoop up to 6x faster without the need for manual scripts.
• Develop ETL processes without writing code.
• Seamlessly accelerate Hadoop performance and scalability for ETL operations in both MapReduce and Spark.
Syncsort + Hortonworks Advantages
Technical Benefits
• Apache Ambari Integration
• Deploy DMX-h across cluster
• Monitor DMX-h jobs
• Process in MapReduce or Spark
• Source relational and non-relational data (including mainframes)
• Out-of-the-box integration, interoperability & certifications
• Kerberos-secured clusters
• Apache Ranger security certified
• Early beta, release certification
• Metadata lineage export from DMX
• Supports easy identification and management of GDPR relevant Metadata
Demo: Apache Atlas
GDPR
Be transparent with all PII data
Why not turn GDPR into a new
customer experience?
Dataworks Summit Berlin 2018
Jan-Kees Buenen, CEO
(C) 2018 SynerScope
“6 Steps for GDPR” expanded to unstructured enterprise data
• Discover and classify data content in full context
• Know the entire data infrastructure
• Know the entire data flow patterns
• Establish and execute remediation policies
• Apply same governance to processing
• Monitor through certified audits
Know who and what application produces and uses PII data
Know the PII data that rests in your unstructured data
Know its exact location, expiry date, consent status
Set and execute your policies based on your granular knowledge of the content
Log every event touching your data -> Atlas and Ranger are integrated in fully automated processes in SynerScope
Have the data instantly available at individual record level for external (certified) audit purposes (Big4 love sampling)
GDPR compliance for all content
Transparency for governance
• Data Discovery
• Data Search
• Data Matching
• Data Context
• Data Quality
• Data Use patterns
• Audit Ready (Big Four endorsed)
Numbers, Text, IoT, Video, Audio
Ecosystem
Include the “other” 80% of the enterprise data
SynerScope Product position for GDPR and IFRS
FAST, FLEXIBLE AND TRANSPARENT
o Fast and flexible with raw data, no cleaning, no upfront modeling
o Fast and flexible for complex new combinations of data brought in from many different silos
o Transparency at individual cell record level, with data presented in full context, allows for certified audits
o The big audit firms will play an important role between the enterprise, regulators and supervisors
o Ecosystem demands independent certification of data operations
Unstructured data is the Achilles Heel for true GDPR compliance
“SynerScope’s Intelligence Augmentation (IA) can handle the most complex data situations fast and reliably” (Big 4 accounting firm)
How SynerScope Ixiwa uses Atlas
[Architecture diagram; labels shown: Atlas, MongoDB, Ixiwa API, Push, Ranger, GraphDB, YARN, Launch, Spark content scan, Tags, Spark, Hive, HDFS]
LIVE DEMO
CONTACTS
WWW.SYNERSCOPE.COM
CEO jan-kees.Buenen@synerscope.com
FS Lead David.de.jong@synerscope.com
Thank You
HDP SEC READY & GOV READY Programs
✔ Choice: Customers choose features that they want to deploy—a la carte
✔ Curated & Fast: Partners to provide rich, complementary and complete features ready to deploy
✔ Agile: Faster deployment and accelerated innovation
✔ Centralized: Open metadata/governance and security infrastructure
✔ Flexibility: Portfolio of partner reference architectures and integration patterns
✔ Safe: HDP at core to provide stability and interoperability
Hortonworks Certified Technology Program
HDP YARN Ready: Integrates with YARN (native, Tez, Slider) or uses/runs on a YARN Ready engine
HDP Operations Ready: Integrates with Ambari APIs, Stacks, Blueprints, or Views
HDP Governance Ready: Integrates with Atlas
HDP Security Ready: Integrates with Ranger, Knox, or other security features
Sign up to be a partner and request certification kit!
http://hortonworks.com/partners/product-integration-certification/
Questions

End-to-End, Source to Analytics, Data Lineage with Syncsort DMX-hEnd-to-End, Source to Analytics, Data Lineage with Syncsort DMX-h
End-to-End, Source to Analytics, Data Lineage with Syncsort DMX-h
 
Meetup Oracle Database MAD: 2.1 Data Management Trends: SQL, NoSQL y Big Data
Meetup Oracle Database MAD: 2.1 Data Management Trends: SQL, NoSQL y Big Data Meetup Oracle Database MAD: 2.1 Data Management Trends: SQL, NoSQL y Big Data
Meetup Oracle Database MAD: 2.1 Data Management Trends: SQL, NoSQL y Big Data
 
Trusted Analytics as a Service (BDT209) | AWS re:Invent 2013
Trusted Analytics as a Service (BDT209) | AWS re:Invent 2013Trusted Analytics as a Service (BDT209) | AWS re:Invent 2013
Trusted Analytics as a Service (BDT209) | AWS re:Invent 2013
 

GDPR-focused partner community showcase for Apache Ranger and Apache Atlas

  • 1. © Hortonworks Inc. 2011 – 2017. All Rights Reserved. Partner Ecosystem Showcase for Apache Ranger and Apache Atlas. DataWorks Summit - Berlin, April 2018. Hortonworks Inc.: Ali Bajwa, Partner Solutions; Srikanth Venkat, Product Management. Dataguise: Subra Ramesh, SVP. Syncsort: Marco de Jong, Director. Synerscope: Jan-Kees Buenen, CEO.
  • 2. Agenda: Apache Ranger & Apache Atlas; Journey, Ecosystem & Partners; Hortonworks Partner Certification Program; SEC Ready & GOV Ready program; Partner Technology Showcase.
  • 3. Community Snapshot. Timeline: May 2014, XASecure acquisition; July 2014, enters Apache Incubation; Nov 2014, Ranger 0.4.0 release; July 2015, Ranger 0.5/HDP2.3; Aug 2016, Ranger 0.6/HDP2.5; Nov 2016, Ranger 0.6.2/HDP2.5.3; Jan 2017, Ranger TLP graduation; Apr 2017, Ranger 0.7/HDP2.6; Jun 2017, Ranger 0.7.1/HDP2.6.1; Aug 2017, Ranger 0.7.1+/HDP2.6.2; Oct 2017, Ranger 0.7.1++/HDP2.6.3; Q3 2018, 1.0.0. Committers: 27. Contributors from: eBay, MSFT, Huawei, Pandora, Accenture, ING, Talend, ZTE. Features listed under Ranger 0.7/HDP2.6 and Ranger 0.7.1/HDP2.6.1 - HDP 2.6.4: • Export/import of Policies • $User and macros • Plugin status tab • “Show columns” and “describe extended” support • Incremental LDAP Sync • SmartSense Metrics • User Sync Nested LDAP Support • Tag based Masking • Tag Attribute Based Policy • Hive Replication Authorization • Hive kill query Authorization • Default Admin Group Mapping
  • 4. Apache Ranger: Ecosystem. Diagram labels: Partner; Partner Integrations; Apache Ranger; Apache Kafka; Native Hadoop Service Authorizers; Azure Data Lake Store (ADLS)* (Future); Authorizer Extensions for Non-Hadoop Filesystems & Stores.
  • 5. Community Snapshot. Timeline: Dec 2014, DGI group kickoff; May 2015, Apache Atlas Incubation; Aug 2016, Apache 0.7 Foundation Release; Apr 2017, Apache 0.8 Release; Jun 2017, Atlas becomes TLP; Q4 2017, Apache 0.8.1 Release; Q3 2018, Apache 1.0 Release. Also labeled on the timeline: Global Financial Company. Apache Atlas 0.8.2/HDP2.6.1-2.6.4: • Business User Friendly Search & Filtering • Knox Token Based Auth. Support • Knox Proxying of Atlas UI • Tag Deletion. Apache Atlas 0.8/HDP2.6.0: • Simplified Search UI • Simplified API • Classification-based security for HDFS, Kafka, HBase • Knox SSO • Performance/scalability improvements. Committers: 37. Code contributors: Hortonworks, IBM, Aetna, Merck, Target.
  • 6. Apache Atlas: Current Connectors and Ecosystem. Diagram labels: Custom Integration; Partner; Apache Atlas; RDBMS; Apache Kafka; Pending.
  • 7. Dataguise, Hortonworks, and GDPR. Subra Ramesh, SVP Products, Dataguise, Inc.
  • 8. ©2017 Dataguise, Inc. Confidential and Proprietary. Dataguise Company Background • Based in Silicon Valley • 10 years since founding • Company Focus is on Sensitive, Personal Data • Pioneers in Data-Centric Security in Big Data (first deployed Hadoop customer 2012) • Wide coverage of Cloud Technologies over the last 3-4 years • Product: DgSecure, Service: DgSecure On-Demand
  • 9. Proven technology, trusted by the world’s largest brands
  • 10. Dataguise Technology. Where does DgSecure work? On-premises; Cloud. What does DgSecure do? What Hadoop Distributions does it support? What Databases can it support? What Cloud Stores does it support? What other Repositories does it support? Diagram labels: Detection; Mask/Encrypt/Access Control Integration; Right of Access; Right of Erasure; Right to Restrict Processing; Reporting; Breach Detection; SQL Server.
  • 11. Dataguise has the widest automated coverage of key back-end GDPR requirements • Finding and cataloging personal data – structured and unstructured – in a wide range of data stores • Pseudonymizing – and where needed, anonymizing – data • Breach detection with a focus on personal data access • Right of access • Right to be forgotten • Right to restrict processing
  • 12. Dataguise support of the Hortonworks Ecosystem • Support for Detection of Personal Data in all the latest Hortonworks distributions • Support for Masking (35+ different options), AES, Format Preserving Encryption in HDP • Same level of support via direct HDFS and Hive APIs • Support for YARN, Tez, Spark, and classical MapReduce execution engines • Integration with Kerberos, Hortonworks TDE • Monitoring of sensitive data access in both Hive and Hadoop • HDInsight support • Integration with Atlas and Ranger
  • 13. Access Control Integration: DgSecure -> Atlas -> Ranger • DgSecure does detection on a continuous basis • DgSecure pipes its results to Atlas, marking elements or columns containing personal data via Tags • Ranger Tag-Based Policies can be used to protect access at the level of personal data types • Optionally, Ranger Tag-based Masking can be used to hide data selectively.
  • 14. DgSecure -> Atlas -> Ranger Integration: DgSecure Detection -> Atlas Populated with Personal Data Tags -> Ranger Policies based on tags -> Access Control based on type of Personal Data
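The DgSecure -> Atlas -> Ranger flow on slides 13 and 14 can be sketched as a toy model. The Python below is a minimal illustration only: it does not call DgSecure, Atlas, or Ranger, and every name in it (`TagPolicy`, `tag_columns`, `is_allowed`, the sample columns and groups) is hypothetical. It assumes that detection yields (column, personal-data type) pairs, that each type becomes a tag on the column, and that a tagged column is readable only when an enabled policy for each of its tags grants one of the user's groups. Untagged columns are simply allowed here, which is a simplification.

```python
from dataclasses import dataclass, field


@dataclass
class TagPolicy:
    """Hypothetical stand-in for a tag-based allow policy."""
    tag: str
    allowed_groups: set = field(default_factory=set)
    enabled: bool = True


def tag_columns(detections):
    """Turn (column, personal-data type) detections into a column -> tags catalog."""
    catalog = {}
    for column, data_type in detections:
        catalog.setdefault(column, set()).add(data_type)
    return catalog


def is_allowed(column, user_groups, catalog, policies):
    """Allow access only if every tag on the column is granted by an enabled policy.

    Assumption: untagged columns are allowed; resource-based rules are out of scope.
    """
    for tag in catalog.get(column, set()):
        granted = any(
            p.enabled and p.tag == tag and (p.allowed_groups & set(user_groups))
            for p in policies
        )
        if not granted:
            return False
    return True


# Hypothetical detection results and a single policy for the EMAIL tag.
catalog = tag_columns([("customers.email", "EMAIL"), ("customers.ssn", "SSN")])
email_policy = TagPolicy(tag="EMAIL", allowed_groups={"analysts"}, enabled=False)

denied_while_disabled = is_allowed("customers.email", {"analysts"}, catalog, [email_policy])
email_policy.enabled = True
allowed_when_enabled = is_allowed("customers.email", {"analysts"}, catalog, [email_policy])
```

In this sketch, disabling the policy removes the grant, which mirrors the demo sequence in which a Hive query fails while the Ranger tag-based policy is disabled and succeeds once it is enabled.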
  • 15. Demo
  • 18. DgSecure Policies
  • 19. DgSecure Sensitive Types
  • 20. DgSecure Dashboard
  • 21. DgSecure Hadoop Results
  • 22. DgSecure Hive Results
  • 23. DgSecure Monitor Overview
  • 24. DgSecure GDPR DSAR Overview
  • 25. DgSecure Atlas Tags
  • 26. DgSecure Atlas Tags - Detail
  • 27. Ranger Tag-Based Policy (Disabled)
  • 28. Hive Query – fails, user doesn’t have Ranger permission
  • 29. Hive Query – failure details
  • 30. Ranger Policy – enabled
  • 31. Hive Query Executes Successfully
  • 32. Hive Query Results
  • 33. Syncsort Confidential and Proprietary - do not copy or distribute. Lineage in DMX-h – ingestion to the cluster. DMX-h job executes • In the cluster Sources/Targets: HDFS, Hive, S3 • Out of the cluster Sources/Targets: Mainframe, DBMSs, local and remote FS – Syncsort External Datasets. DMX-h job collects lineage information • Source/Target File or Table level. DMX-h job lineage is published into Apache Atlas • Connect with lineage published from other tools (REST)
  • 34. Syncsort DMX-h Atlas Integration
  • 35. Govern and Track Everything for Compliance • Metadata and data lineage for Hive, Avro and Parquet through HCatalog • Metadata lineage export and API from DMX/DMX-h – Simplify audits, analytics dashboards, metrics – Integrate with enterprise metadata repositories • Apache Ambari integration – Native LDAP and Kerberos support – Secure mainframe data access through FTPS and Connect:Direct • Apache Atlas ingestion lineage integration – Audit and track data from source to cluster – Lineage & tagging of Metadata for GDPR Compliance
  • 36. End-to-End Data Lineage in Apache Atlas. Data Sources
  • 37. End-to-End Data Lineage in Apache Atlas. Data Sources: Syncsort accesses data from sources outside cluster.
  • 38. End-to-End Data Lineage in Apache Atlas. Data Sources: Syncsort accesses data from sources outside cluster. Syncsort onboards data, modifies on-the-fly to match Hadoop storage model.
  • 39. End-to-End Data Lineage in Apache Atlas. Data Sources: Syncsort accesses data from sources outside cluster. Syncsort onboards data, modifies on-the-fly to match Hadoop storage model. Data Hub: Syncsort changes, enhances, joins data in cluster with MapReduce or Spark.
  • 40. End-to-End Data Lineage in Apache Atlas. Data Sources: Syncsort accesses data from sources outside cluster. Syncsort onboards data, modifies on-the-fly to match Hadoop storage model. Data Hub: Syncsort changes, enhances, joins data in cluster with MapReduce or Spark. Syncsort passes source-to-cluster data lineage info to Atlas.
  • 41. End-to-End Data Lineage in Apache Atlas. Data Sources: Syncsort accesses data from sources outside cluster. Syncsort onboards data, modifies on-the-fly to match Hadoop storage model. Data Hub: Syncsort changes, enhances, joins data in cluster with MapReduce or Spark. Syncsort passes source-to-cluster data lineage info to Atlas. Analytics, Visualization: Analytics and visualizations get complete data. Data analyst gets end-to-end data lineage info from Atlas.
  • 42. Syncsort: High Performance Import from Existing Databases. Benefits: • Connect to virtually any data source, including mainframe and MPP databases. • Move data into and out of Hadoop up to 6x faster without the need for manual scripts. • Develop ETL processes without writing code. • Seamlessly accelerate Hadoop performance and scalability for ETL operations in both MapReduce and Spark.
  • 43. Syncsort + Hortonworks Advantages. Technical Benefits: • Apache Ambari integration • Deploy DMX-h across cluster • Monitor DMX-h jobs • Process in MapReduce or Spark • Source relational and non-relational data (including mainframes) • Out-of-the-box integration, interoperability & certifications • Kerberos-secured clusters • Apache Ranger security certified • Early beta, release certification • Metadata lineage export from DMX • Supports easy identification and management of GDPR-relevant metadata
  • 44-61. Demo: Apache Atlas
  • 62. GDPR: Be transparent with all PII data. Why not turn GDPR into a new customer experience? DataWorks Summit Berlin 2018. Jan-Kees Buenen, CEO. (C) 2018 SynerScope
  • 63. “6 Steps for GDPR” expanded to unstructured enterprise data. Steps: • Discover and classify data content in full context • Know the entire data infrastructure • Know the entire data flow patterns • Establish and execute remediation policies • Apply same governance to processing • Monitor through certified audits. Details: • Know who and what application produces and uses PII data • Know the PII data that rests in your unstructured data • Know its exact location, expiry date, consent status • Set and execute your policies based on your granular knowledge of the content • Log every event touching your data • Have the data instantly available at individual record level for external (certified) audit purposes (Big4 love sampling). Callout: Atlas and Ranger are integrated in fully automated processes in SynerScope.
  • 64. GDPR compliance for all content. Transparency for governance: • Data Discovery • Data Search • Data Matching • Data Context • Data Quality • Data Use patterns • Audit Ready (Big Four endorsed). Content types: Numbers; Text; IoT; Video, Audio; Ecosystem. Include the “other” 80% of the enterprise data.
  • 65. SynerScope product position for GDPR and IFRS: FAST, FLEXIBLE AND TRANSPARENT o Fast and flexible with raw data: no cleaning, no upfront modeling o Fast and flexible for complex new combinations of data brought in from many different silos o Transparency at individual cell record level, with data presented in full context, allows for certified audits o The big audit firms will play an important role between the enterprise, regulators and supervisory authorities o The ecosystem demands independent certification of data operations. Unstructured data is the Achilles heel for true GDPR compliance. “SynerScope’s Intelligence Augmentation (IA) can handle the most complex data situations fast and reliably” (Big 4 accounting firm)
  • 66. How SynerScope Ixiwa uses Atlas. [Architecture diagram; labels: Ixiwa, MongoDB, API push, Atlas, Ranger, GraphDB, YARN launch, Spark content scan, tags, Spark, Hive, HDFS]
  • 67. LIVE DEMO
  • 68. CONTACTS: WWW.SYNERSCOPE.COM. CEO: jan-kees.Buenen@synerscope.com. FS Lead: David.de.jong@synerscope.com. Thank You
  • 69. HDP SEC READY & GOV READY Programs ✔ Choice: Customers choose the features they want to deploy, a la carte ✔ Curated & Fast: Partners provide rich, complementary and complete features ready to deploy ✔ Agile: Faster deployment and accelerated innovation ✔ Centralized: Open metadata/governance and security infrastructure ✔ Flexibility: Portfolio of partner reference architectures and integration patterns ✔ Safe: HDP at core to provide stability and interoperability
  • 70. Hortonworks Certified Technology Program • HDP YARN Ready: Integrates with YARN (native, Tez, Slider) or uses/runs on a YARN Ready engine • HDP Operations Ready: Integrates with Ambari APIs, Stacks, Blueprints, or Views • HDP Governance Ready: Integrates with Atlas • HDP Security Ready: Integrates with Ranger, Knox, or other security features. Sign up to be a partner and request a certification kit: http://hortonworks.com/partners/product-integration-certification/
  • 71. Questions
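
Slide 66 shows Ixiwa pushing tags into Atlas through an API, and slide 63 lists location, expiry date and consent status as things to know about PII. As a rough illustration only, the sketch below prepares a request that attaches one classification to an existing entity through the Atlas v2 REST endpoint `POST /api/atlas/v2/entity/guid/{guid}/classifications`. The classification name `PII`, its attributes (`expiry_date`, `consent_status`), the host, the GUID and the credentials are all hypothetical; the classification type would have to be defined in Atlas beforehand, and nothing here is taken from the SynerScope implementation.

```python
import json
from urllib import request


def build_classification(tag_name, attributes):
    """Return the JSON-ready request body: a list holding one classification."""
    return [{"typeName": tag_name, "attributes": dict(attributes)}]


def build_tag_request(atlas_url, entity_guid, classifications, auth_header):
    """Prepare (but do not send) the POST that attaches the classifications."""
    url = (
        f"{atlas_url.rstrip('/')}"
        f"/api/atlas/v2/entity/guid/{entity_guid}/classifications"
    )
    return request.Request(
        url,
        data=json.dumps(classifications).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": auth_header,
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical tag, attribute names, host, GUID and credentials.
    body = build_classification(
        "PII",
        {"expiry_date": "2020-05-25", "consent_status": "granted"},
    )
    req = build_tag_request(
        "http://atlas.example.com:21000",
        "00000000-0000-0000-0000-000000000000",
        body,
        "Basic <credentials>",
    )
    print(req.get_method(), req.full_url)
    print(json.dumps(body, indent=2))
    # request.urlopen(req) would send it; omitted so the sketch runs offline.
```

Once an entity carries a tag like this, the tag-based policies listed for Ranger earlier in the deck could presumably key off it, which is one plausible reading of the Atlas-to-Ranger arrow on slide 66.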

Editor's Notes

  1. How fast? 7 months!
  2. What does the ecosystem look like? There are connectors for Sqoop, Hive, Storm and Kafka, as well as a custom integration method for building your own connector via a highly scalable REST API. For example, although there is no first-class connector for Spark, you can hook a snippet of code at the end of your Spark job to report lineage/metadata info into Atlas. More native connectors are being worked on for future releases: NiFi and HBase. We also have a partner program for ‘Gov Ready’ certification, and you can see a list of partners who have already built integrations. Some interesting ones: Talend: data pipelining done in their canvas gets faithfully converted into an Atlas lineage graph, so we’re able to capture all the steps/transformations/metadata for each of the processes/entities in that chain. Dataguise/Waterline do data discovery and are able to publish classifications in bulk into Atlas; the same can be done for lineage. IGC is special: it’s joined at the hip with Atlas. They will have one-to-one model equivalency in the backend and will be able to query each other for metadata, lineage, etc.
  3. The slide shows the high-level control flow (the title on the first line of each of the three boxes): a DMX-h job runs and produces lineage info, which is later published in Atlas. More details for each box: DMX-h job executes: currently in the product we look at lineage at the source/target level. From the perspective of Atlas, we need to distinguish sources/targets that are standardized in Atlas (e.g. Hive, HDFS) from those that are not, so that DMX-h can later publish the lineage around these sources/targets as Atlas expects. DMX-h job produces lineage information: currently this is done for ingestion only, not for distributed executions and not at field level. DMX-h job lineage is published into Atlas: DMX-h publishes lineage using the (existing) HDFS file and Hive table entities in Atlas, as they are standardized. Other tools (e.g. Hive SQL queries) can use the same HDFS/Hive entities to publish their own lineage, thereby “connecting” to ours from DMX-h. We use the REST API to publish the DMX-h lineage. The product currently uses v1 of the API, which is now legacy; v2 is the most current, so we need to update our product for v2.
  4. This is a simple DMX-h job that ingests an EBCDIC file into the cluster and converts it to ASCII on the fly.
  5. Syncsort DMX-h is highly efficient software with a small footprint, yet it packages the comprehensive support you need to manage, secure and govern your modern data architecture. Manage: full integration with Apache Ambari. Secure: native LDAP and Kerberos support; integration with Apache Ranger; secure mainframe data access through FTPS and Connect:Direct. Govern: tight integration with HCatalog for metadata management and data lineage; work directly with mainframe data in its native format, preserving data lineage; can tag metadata that contains Personally Identifiable Information, which is critical for GDPR compliance (i.e. knowing where personal data is stored).
  6-11. A better way is needed – so that, just like the chef, we can have a complete view of our data, from the origin to the data hub – and know what has happened to it at every step of the way
  12-13. Syncsort/Hortonworks reference architecture: deployed by Ambari on every node; data movement and transformation in MapReduce or Spark.
  14. First AI-powered all-in-one big data solution. Solves the big data myth once and for all. Data ingest, Organize, Search, Analyze, Extract: all in one. Ultra-fast big data visual analytics. Unlock the big data complexity. Interactive and dynamic user interface fusing Deep Learning with a scalable Data Lake into a ready-to-go Big Data solution.
  15. DSR: Data Subject Rights, such as consent, tracked as data flows through the data assets.
  16. First AI-powered all-in-one big data solution. Solves the big data myth once and for all. Data ingest, Organize, Search, Analyze, Extract: all in one. Ultra-fast big data visual analytics. Unlock the big data complexity. Interactive and dynamic user interface fusing Deep Learning with a scalable Data Lake into a ready-to-go Big Data solution.
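
Notes 2 and 3 describe publishing lineage to Atlas over REST by reusing the standardized HDFS and Hive entities. A minimal sketch of what such a call might look like against the v2 API follows. It builds a generic `Process` entity whose `inputs` and `outputs` refer to an `hdfs_path` and a `hive_table` by `qualifiedName`, then prepares a `POST /api/atlas/v2/entity` request. The job name, qualified names, host and credentials are hypothetical, the referenced entities are assumed to exist in Atlas already, and this is not DMX-h's actual code (which, per note 3, used the v1 API at the time).

```python
import json
from urllib import request


def object_ref(type_name, qualified_name):
    """Reference an existing Atlas entity by its unique qualifiedName."""
    return {
        "typeName": type_name,
        "uniqueAttributes": {"qualifiedName": qualified_name},
    }


def build_lineage_entity(job_name, qualified_name, inputs, outputs):
    """Build a Process entity; its inputs/outputs form the lineage edges."""
    return {
        "entity": {
            "typeName": "Process",
            "attributes": {
                "qualifiedName": qualified_name,
                "name": job_name,
                "inputs": list(inputs),
                "outputs": list(outputs),
            },
        }
    }


def build_publish_request(atlas_url, payload, auth_header):
    """Prepare (but do not send) the POST that creates or updates the entity."""
    return request.Request(
        f"{atlas_url.rstrip('/')}/api/atlas/v2/entity",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": auth_header,
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical ingestion job: one HDFS landing file feeds one Hive table.
    payload = build_lineage_entity(
        job_name="ingest_customers",
        qualified_name="ingest_customers@mycluster",
        inputs=[object_ref("hdfs_path", "hdfs://namenode:8020/landing/customers.dat")],
        outputs=[object_ref("hive_table", "default.customers@mycluster")],
    )
    req = build_publish_request(
        "http://atlas.example.com:21000", payload, "Basic <credentials>"
    )
    print(req.get_method(), req.full_url)
    print(json.dumps(payload, indent=2))
    # request.urlopen(req) would send it; omitted so the sketch runs offline.
```

Because the inputs and outputs point at shared `hdfs_path` and `hive_table` entities rather than private copies, lineage published by another tool against the same entities would join up with this process in the Atlas graph, which is the "connecting" behaviour note 3 describes.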