NFS and ODBC
- 3. The MapR Distribution for Apache Hadoop
The open, enterprise-grade distribution for Apache Hadoop
– Open source components
• Hive, Pig, Cascading, HBase, ZooKeeper, Oozie, Flume, Sqoop, Whirr, …
– Enhancements to make Hadoop more open and enterprise-grade
Fastest growing distribution
– Thousands of clusters deployed
Now available as a service with Amazon Elastic MapReduce (EMR)
– http://aws.amazon.com/elasticmapreduce/mapr
©MapR Technologies - Confidential 3
- 4. Recent News
Amazon selects MapR to provide the enterprise-grade Hadoop distribution in EMR
Google selects MapR to provide Hadoop on Google Compute Engine
MapR launched open source Apache Drill project inspired by Google Dremel
– Low latency queries
- 5. MapR
Make Hadoop more open
– This presentation
Make Hadoop enterprise-grade
- 6. Not All Applications Use the Hadoop APIs
Applications and libraries that use files and/or SQL
– 30 years
– 100,000s of applications
– 10,000s of libraries
– 10s of programming languages
Applications and libraries that use the Hadoop APIs
- 7. Hadoop Needs Industry-Standard Interfaces
Hadoop API
• MapReduce and HBase applications
• Mostly custom-built
NFS
• File-based applications
• Supported by most operating systems
ODBC
• SQL-based tools
• Supported by most BI applications and query builders
- 9. Your Data is Your Data
HDFS-based Hadoop distributions do not (cannot) support NFS
Your data is your data – make sure you can access it
– Why store your data in a system which cannot be accessed by 95% of the world’s applications and libraries?
- 10. The NFS Protocol
RFC 1813
Very simple protocol
Random reads/writes
– Read count bytes from offset offset of file file
– Write buffer data to offset offset of a file file
HDFS does not support random writes so it cannot support NFS

WRITE3res NFSPROC3_WRITE(WRITE3args) = 7;
struct WRITE3args {
    nfs_fh3 file;
    offset3 offset;
    count3 count;
    stable_how stable;
    opaque data<>;
};

READ3res NFSPROC3_READ(READ3args) = 6;
struct READ3args {
    nfs_fh3 file;
    offset3 offset;
    count3 count;
};
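The read and write calls on this slide correspond closely to the positional I/O that any POSIX program can issue against a mounted file. The following is a minimal sketch, assuming a Unix-like system; the helper names `write_at`, `read_at`, and `demo` and the file name `records.dat` are hypothetical, and the demo runs against a temporary local directory rather than a real NFS mount.

```python
import os
import tempfile


def write_at(path: str, offset: int, data: bytes) -> int:
    """Write `data` at byte `offset`, in the spirit of NFSPROC3_WRITE."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        # pwrite writes at an explicit offset without moving a file pointer.
        return os.pwrite(fd, data, offset)
    finally:
        os.close(fd)


def read_at(path: str, offset: int, count: int) -> bytes:
    """Read up to `count` bytes from byte `offset`, like NFSPROC3_READ."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, count, offset)
    finally:
        os.close(fd)


def demo(directory: str) -> bytes:
    """Overwrite two bytes in the middle of a file, then read around them."""
    path = os.path.join(directory, "records.dat")  # hypothetical file name
    write_at(path, 0, b"AAAAAAAAAA")
    write_at(path, 4, b"BB")  # random write: update in place, no append
    return read_at(path, 2, 6)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(demo(scratch))
```

The second `write_at` call is the in-place update that an append-only file system cannot serve, which is the point the slide makes about HDFS.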
- 11. Hadoop Was Designed to Support Multiple Storage Layers
MapReduce
o.a.h.fs.FileSystem Interface
– Local File System: o.a.h.fs.LocalFileSystem
– HDFS: o.a.h.hdfs.DistributedFileSystem
– S3: o.a.h.fs.s3native.NativeS3FileSystem
– FTP: o.a.h.fs.ftp.FTPFileSystem
– MapR storage layer: com.mapr.fs.MapRFileSystem
  • Hadoop FileSystem API
  • NFS interface
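As an illustration of how one interface can sit in front of several storage layers, the sketch below picks an implementation class from the scheme of a path URI. The class names come from the slide (with `o.a.h` expanded to `org.apache.hadoop`); the scheme names, the `FS_IMPLEMENTATIONS` table, and `resolve_filesystem` are assumptions made for this example and are not Hadoop code.

```python
from urllib.parse import urlparse

# Class names are from the slide. The scheme keys are the ones commonly
# associated with these classes (an assumption; check your own configuration).
FS_IMPLEMENTATIONS = {
    "file": "org.apache.hadoop.fs.LocalFileSystem",
    "hdfs": "org.apache.hadoop.hdfs.DistributedFileSystem",
    "s3n": "org.apache.hadoop.fs.s3native.NativeS3FileSystem",
    "ftp": "org.apache.hadoop.fs.ftp.FTPFileSystem",
    "maprfs": "com.mapr.fs.MapRFileSystem",
}


def resolve_filesystem(uri: str, default_scheme: str = "file") -> str:
    """Return the implementation class name that would serve `uri`."""
    # A bare path such as /tmp/x has no scheme, so fall back to the default.
    scheme = urlparse(uri).scheme or default_scheme
    try:
        return FS_IMPLEMENTATIONS[scheme]
    except KeyError:
        raise ValueError(f"no FileSystem registered for scheme {scheme!r}") from None


if __name__ == "__main__":
    for uri in ("maprfs:///user/alice/data", "hdfs://namenode/logs", "/tmp/x"):
        print(uri, "->", resolve_filesystem(uri))
```

Because callers only see the common interface, a MapReduce job can switch storage layers by changing the path it is given.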
- 16. Customer Examples: Import/Export Data
Network security vendor
– Network packet captures from switches are streamed into the cluster
– New pattern definitions are loaded into online IPS via NFS
Online measurement company
– Clickstreams from application servers are streamed into the cluster
SaaS company
– Exporting a database to Hadoop over NFS
Ad exchange
– Bids and transactions are streamed into the cluster
- 17. Customer Examples: Productivity and Operations
Retailer
– Operational scripts are easier with NFS than DFS + MapReduce
• chmod/chown, file system searches/greps, make, tab-complete
– Consolidate object store with analytics
Credit card company
– User and project home directories on Linux gateways
• Local files, scripts, source code, …
• Administrators manage quotas, snapshots/backups, …
Large Internet company
– Web servers serve MapReduce results (item relationships) directly from the cluster
Email marketing company
– Object store with HBase and NFS
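To make the operational-scripts point concrete, here is a minimal sketch of a grep-style search and a permission change written with nothing but the Python standard library. It assumes the cluster is NFS-mounted somewhere on the local machine; the function names and the idea of passing the mount point as `root` are hypothetical, and the code works on any directory tree.

```python
import os
import stat
import tempfile


def grep_tree(root, needle):
    """Yield (path, line_number, line) for each line containing `needle`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going over non-text bytes.
            with open(path, "r", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if needle in line:
                        yield path, lineno, line.rstrip("\n")


def make_group_readable(path):
    """Add the group-read bit, the equivalent of `chmod g+r path`."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_IRGRP)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:  # stand-in for the mount point
        log = os.path.join(root, "job.log")
        with open(log, "w") as handle:
            handle.write("ok\nERROR disk full\n")
        for hit in grep_tree(root, "ERROR"):
            print(hit)
        make_group_readable(log)
```

No Hadoop client library or MapReduce job is involved; the same script would run unchanged against a local disk.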
- 21. ODBC
ODBC – Open DataBase Connectivity
– Open standard API for accessing a SQL-based backend
– Developed by Microsoft and Simba Technologies in 1992
Flagship API for SQL-based BI and reporting
– Excel, Tableau, MicroStrategy, Crystal Reports, …
Advanced ODBC drivers use the latest 3.52 specification
- 22. MapR ODBC Driver
MapR provides a Hive ODBC 3.52 driver
– Developed in partnership with ODBC inventor Simba Technologies
– Compliant with latest ODBC 3.52 specification
• 32- and 64-bit platform support
• Windows and Linux
Enables direct SQL access to MapR-stored data by translating SQL to HiveQL
SQLizer enables seamless connectivity
– Provides ANSI SQL-92 front-end
– Targeted for existing apps that generate standard SQL queries
– Transforms SQL query into HiveQL query
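From an application's point of view, an ODBC driver is reached through a data source name and ordinary SQL. The sketch below shows roughly what that could look like from Python using the third-party `pyodbc` package. The DSN name, the `clicks` table and its `page` column, and both function names are hypothetical; nothing here is taken from MapR's driver documentation.

```python
def build_connection_string(dsn: str, user: str = "", password: str = "") -> str:
    """Assemble an ODBC connection string from standard keywords."""
    parts = [f"DSN={dsn}"]
    if user:
        parts.append(f"UID={user}")
    if password:
        parts.append(f"PWD={password}")
    return ";".join(parts)


def top_pages(dsn: str, limit: int = 10):
    """Run an aggregate query through the ODBC data source named `dsn`."""
    import pyodbc  # third-party; imported here so the helper above needs nothing

    # autocommit=True because this sketch assumes a backend without transactions.
    conn = pyodbc.connect(build_connection_string(dsn), autocommit=True)
    try:
        cursor = conn.cursor()
        # `clicks` and `page` are hypothetical names used for illustration.
        cursor.execute(
            "SELECT page, COUNT(*) AS hits FROM clicks "
            "GROUP BY page ORDER BY hits DESC LIMIT " + str(int(limit))
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(build_connection_string("MapRHive", user="analyst"))
```

A BI tool does the same thing behind its connection dialog: it names a data source, sends SQL, and reads back rows.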
- 27. Time for Questions
Download slides or send me an email
– http://info.mapr.com/Japan-HUG-8-2012
Download MapR to learn more
– www.mapr.com/download
Contact EMC Greenplum Japan
– Yoshiaki Hirabayashi – Yoshiaki.Hirabayashi@emc.com
– Akihiko Kusanagi – Akihiko.Kusanagi@emc.com