Why Hadoop and SQL 
just want to be friends 
Simon Elliston Ball 
@sireb
ETL 
OLTP 
EDW 
Archive 
ETL
ETL 
OLTP 
EDW 
Archive 
ETL
ETL 
OLTP 
ETL EDW 
Archive
ETL 
More data 
Shorter windows 
Wider queries
ETL 
OLTP 
EDW 
Archive 
ETL Sqoop 
Pig 
Hive 
Oozie 
Falcon
ETL 
OLTP 
EDW 
Archive 
ETL 
Less 
structured 
Sqoop
ELT: saving the T for later 
2012-01-06 09:22:27 W3SVC1273337584 RD00155D360166 10.211.146.27 GET 
/ustensiles - 80 Test0001 94.245.127.11 HTTP/1.1 
Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) 
__RequestVerificationToken_Lw__=KLZ1dz1Aa4o2UdwJVwr0JhzSwmmSHmID9i/gutMvQkZ 
WX9Q4QDktFHHiBhF8mSd6Cg5oIEeUpy/KNF7VLRFkrqN28raL8PfNuv0IfuKXxgl5s+uZpcvfGE 
6Olfsu7uNLg2bWwLZkrqXjv9cpRGaiXelmaM8=;+.ASPXAUTH=D5796612E924B60496C115914 
CC8F93239E99EEF4B3D6ED74BDD5C8C38D8C115D3021AB7F3B06E563EDE612BFBCBBE756803 
C85DECFACCA080E890C5DA6B4CA00A51792D812C93101F648505133C9E2C10779FA3E5AC19E 
E5E2B7E130C72C18F6309AEB736ABD06C87A7D636976A20534833E20160EC04B6B6617B3788 
45AE627979EE54 
http://site.supersimple.fr/Users/Account/LogOn?ReturnUrl=%2Fustensiles 
site.supersimple.fr 200 0 0 7136 849 1249
ELT: saving the T for later 
Schema on write: 
Model Parse Store Query 
● Keep going back to the drawing board 
● Reprocessing all the data
ELT: saving the T for later 
Schema on read: 
Store Query Model Parse 
● Only model what you need 
● Agile Data Modelling 
● Don’t move the data
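
A minimal Python sketch of the schema-on-read order (Store, Query, Model, Parse), using the log line from the earlier slide. The field names are an assumption based on the usual IIS W3C extended log layout, since the slides do not show the `#Fields` header; the long cookie value is replaced with `-` for brevity, and `read_with_schema` is a hypothetical helper, not part of any Hadoop tool.

```python
# Assumed field order (IIS W3C extended log format); not given in the slides.
FIELDS = [
    "date", "time", "s_sitename", "s_computername", "s_ip", "cs_method",
    "cs_uri_stem", "cs_uri_query", "s_port", "cs_username", "c_ip",
    "cs_version", "cs_user_agent", "cs_cookie", "cs_referer", "cs_host",
    "sc_status", "sc_substatus", "sc_win32_status", "sc_bytes", "cs_bytes",
    "time_taken",
]

# Store: raw lines are kept untouched. The cookie field is shortened to "-".
RAW_LINES = [
    "2012-01-06 09:22:27 W3SVC1273337584 RD00155D360166 10.211.146.27 GET "
    "/ustensiles - 80 Test0001 94.245.127.11 HTTP/1.1 "
    "Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) "
    "- "
    "http://site.supersimple.fr/Users/Account/LogOn?ReturnUrl=%2Fustensiles "
    "site.supersimple.fr 200 0 0 7136 849 1249",
]


def read_with_schema(lines, wanted):
    """Parse each raw line at query time, keeping only the wanted fields."""
    for line in lines:
        values = line.split(" ")
        if len(values) != len(FIELDS):
            continue  # skip lines that do not match the assumed layout
        record = dict(zip(FIELDS, values))
        yield {name: record[name] for name in wanted}


# Query: only the fields this question needs are modelled and parsed.
rows = list(read_with_schema(RAW_LINES, ["cs_uri_stem", "sc_status", "time_taken"]))
print(rows)
```

Asking a different question later only means passing a different `wanted` list; the stored lines are not reprocessed or moved.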
Cost per TB...
Come for the cheap storage... 
The Data Lake 
https://www.flickr.com/photos/msvg/5891279010
...stay for the analytics 
Machine learning libraries 
Recommendation systems 
Batch Big Data
Summary 
Hadoop can: 
● Improve your ETL processing 
● Help you with unstructured data 
● Save you money
Thank you! 
Simon Elliston Ball 
@sireb

Why Hadoop and SQL just want to be friends - lightning talk NoSQL Matters Dublin 2014

  • 1.
    Why Hadoop and SQL just want to be friends Simon Elliston Ball @sireb
  • 2.
    ETL OLTP EDW Archive ETL
  • 3.
    ETL OLTP EDW Archive ETL
  • 4.
    ETL OLTP ETL EDW Archive
  • 5.
    ETL More data Shorter windows Wider queries
  • 6.
    ETL OLTP EDW Archive ETL Sqoop Pig Hive Oozie Falcon
  • 7.
    ETL OLTP EDW Archive ETL Less structured Sqoop
  • 8.
    ELT: saving the T for later 2012-01-06 09:22:27 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /ustensiles - 80 Test0001 94.245.127.11 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) __RequestVerificationToken_Lw__=KLZ1dz1Aa4o2UdwJVwr0JhzSwmmSHmID9i/gutMvQkZ WX9Q4QDktFHHiBhF8mSd6Cg5oIEeUpy/KNF7VLRFkrqN28raL8PfNuv0IfuKXxgl5s+uZpcvfGE 6Olfsu7uNLg2bWwLZkrqXjv9cpRGaiXelmaM8=;+.ASPXAUTH=D5796612E924B60496C115914 CC8F93239E99EEF4B3D6ED74BDD5C8C38D8C115D3021AB7F3B06E563EDE612BFBCBBE756803 C85DECFACCA080E890C5DA6B4CA00A51792D812C93101F648505133C9E2C10779FA3E5AC19E E5E2B7E130C72C18F6309AEB736ABD06C87A7D636976A20534833E20160EC04B6B6617B3788 45AE627979EE54 http://site.supersimple.fr/Users/Account/LogOn?ReturnUrl=%2Fustensiles site.supersimple.fr 200 0 0 7136 849 1249
  • 9.
    ELT: saving the T for later Schema on write: Model Parse Store Query ● Keep going back to the drawing board ● Reprocessing all the data
  • 10.
    ELT: saving the T for later Schema on read: Store Query Model Parse ● Only model what you need ● Agile Data Modelling ● Don’t move the data
  • 11.
    Cost per TB...
  • 12.
    Come for the cheap storage... The Data Lake https://www.flickr.com/photos/msvg/5891279010
  • 13.
    ...stay for the analytics Machine learning libraries Recommendation systems Batch Big Data
  • 14.
    Summary Hadoop can: ● Improve your ETL processing ● Help you with unstructured data ● Save you money
  • 15.
    Thank you! Simon Elliston Ball @sireb

Editor's Notes

  • #14 But because you’ve got all this data stored, you can query it directly, unlike a tape archive. The low cost means you can keep it all online.