Big Data: Big Picture

Topics: big data, Hadoop, Flume, Spark, Cloudera, Oracle Big Data Appliance, Apache, Oracle Loader for Hadoop, big data copy (Exadata to Big Data Appliance). Bilginc IT Academy.

Published in: Data & Analytics

  1. ZEKERIYA BEŞIROĞLU ▸ BILGINC IT ACADEMY ▸ ORACLE CLOUD DAY 19-11-2015 ▸ TROUG (TURKISH ORACLE USER GROUP) ▸ BIG DATA: BIG PICTURE
  2. ZEKERIYA BEŞIROĞLU ▸ 18+ years in IT ▸ 15+ years in Oracle DB & DWH ▸ 3+ years in big data ▸ Leader of TROUG ▸ Instructor & Consultant ▸ http://zekeriyabesiroglu.com ▸ @zbesiroglu TROUG @ZBESIROGLU BILGINC IT ACADEMY BIG DATA BIG PICTURE
  3. TROUG NEWS 2015 ▸ WWW.TROUG.ORG
  4. BILGINC IT ACADEMY ▸ WWW.BILGINC.COM
  5. BIG DATA ▸ Social networks ▸ Banking and financial services ▸ E-commerce services ▸ Web-centric services ▸ Internet search indexes ▸ Scientific and document searches ▸ Medical records ▸ Web logs
  6. BIG DATA ▸ VOLUME ▸ VELOCITY ▸ VARIETY
  7. COMPANIES HAVE TO ANALYZE THEIR CUSTOMERS' DNA. Zekeriya Beşiroğlu, TROUG
  8. WHAT IS THE GOAL IN BIG DATA? HOW SHOULD IT BE DONE? ▸ How can I add value to the business by using big data technologies? Can I reduce some costs? ▸ How will I integrate big data with the traditional database? Combining structured, semi-structured, and unstructured data ▸ Reaching results with analytics tools: Oracle Advanced Analytics, BI, and DW technologies
  9. DATA ▸ Today we do schema on write ▸ Let's do schema on read
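The contrast on slide 9 can be illustrated with a minimal Python sketch. It is not taken from the talk: the field names (`user_id`, `amount`, `device`) and both helper functions are hypothetical, chosen only to show where the schema is applied in each approach.

```python
import json

# Hypothetical schema for illustration only; not from the slides.
SCHEMA = {"user_id": int, "amount": float}

def write_with_schema(store, record):
    """Schema on write: validate and coerce before the record is stored."""
    row = {name: cast(record[name]) for name, cast in SCHEMA.items()}
    store.append(row)

def read_with_schema(raw_lines, schema):
    """Schema on read: raw lines stay untouched; structure is applied per query."""
    for line in raw_lines:
        doc = json.loads(line)
        yield {name: cast(doc[name]) for name, cast in schema.items() if name in doc}

# Schema on write: the table only ever holds rows that fit SCHEMA.
table = []
write_with_schema(table, {"user_id": "7", "amount": "12.5"})

# Schema on read: the raw JSON lines are kept as they arrived.
raw = [
    '{"user_id": "7", "amount": "12.5", "device": "mobile"}',
    '{"user_id": "8", "device": "web"}',
]
rows = list(read_with_schema(raw, SCHEMA))
devices = list(read_with_schema(raw, {"device": str}))
```

With schema on read, the same raw lines answer a second question (`devices`) without reloading anything, which is the flexibility the slide argues for.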
  10. BIG DATA PROJECT PHASES ▸ Data acquisition and storage ▸ Data access and processing ▸ Data unification and analysis
  11. DATA ACQUISITION AND STORAGE: HADOOP DISTRIBUTED FILE SYSTEM (HDFS) ▸ Petabyte-scale distributed file system ▸ Linearly scalable on commodity hardware ▸ Schema on read ▸ Cheaper ▸ Low security ▸ Write once, read many
  12. DATA ACQUISITION AND STORAGE: HADOOP DISTRIBUTED FILE SYSTEM (HDFS) ▸ Basic file system operations ▸ A JSON log file can be loaded into HDFS (hadoop fs -put)
  13. DATA ACQUISITION AND STORAGE: WHAT IS FLUME? ▸ Avro source ▸ Memory channel ▸ HDFS sink
  14. DATA ACQUISITION AND STORAGE: ORACLE NOSQL DATABASE ▸ Key-value database ▸ Accessed through a Java API ▸ Stores unstructured or semi-structured data as byte arrays ▸ Highly reliable ▸ Scalable throughput and predictable latency
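The "byte arrays" point on slide 14 can be sketched in a few lines of Python. This is a toy in-memory stand-in, not the Oracle NoSQL Database API (which the slide says is accessed from Java); the class name, key format, and sample record are all hypothetical.

```python
import json

class ToyKeyValueStore:
    """In-memory stand-in for a key-value store; values are opaque bytes."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        # The store never inspects the value, it only requires raw bytes.
        if not isinstance(value, bytes):
            raise TypeError("value must be bytes")
        self._data[key] = value

    def get(self, key: str):
        # Returns None when the key is absent.
        return self._data.get(key)

store = ToyKeyValueStore()

# Hypothetical semi-structured record, serialized by the application.
profile = {"name": "Ayse", "tags": ["sports", "travel"]}
store.put("user/42/profile", json.dumps(profile).encode("utf-8"))

# Reading back: the application, not the store, decides how to decode.
restored = json.loads(store.get("user/42/profile").decode("utf-8"))
```

Because the store treats every value as bytes, the application is free to change the record's shape without any change on the storage side.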
  15. DATA ACQUISITION AND STORAGE: RDBMS & NOSQL
  16. DATA ACQUISITION AND STORAGE: HDFS & NOSQL
  17. DATA ACQUISITION AND STORAGE: APPLICATION DATABASE TECHNOLOGY ▸ High volume with low value? ▸ Dynamic application schema? ▸ If the answer is yes: NoSQL
  18. DATA ACQUISITION AND STORAGE: NOSQL EXAMPLE
  19. DATA ACCESS AND PROCESSING: MAPREDUCE ▸ Write applications that process vast amounts of data in parallel on large clusters of commodity hardware in a reliable, fault-tolerant manner ▸ Storing data in HDFS is low cost, fault tolerant, and scalable ▸ Integrates with HDFS to provide parallel data processing ▸ Batch-oriented
  20. DATA ACCESS AND PROCESSING: MAPREDUCE EXAMPLE
      map(String input_key, String input_value)
        foreach word w in input_value:
          emit(w, 1)
      reduce(String output_key, Iterator<int> intermediate_vals)
        set count = 0
        foreach v in intermediate_vals:
          count += v
        emit(output_key, count)
      Input records:
        (1000, 'Galatasaray sampiyon olur')
        (2000, 'beşiktas sampiyon olur')
        (2200, 'Galatasaray Türkiyedir')
        (3000, 'fenerbahce sampiyon olur')
  21. DATA ACCESS AND PROCESSING: MAPREDUCE EXAMPLE
      Mapper output:
        ('Galatasaray', 1), ('sampiyon', 1), ('olur', 1),
        ('beşiktas', 1), ('sampiyon', 1), ('olur', 1),
        ('Galatasaray', 1), ('Türkiyedir', 1),
        ('fenerbahce', 1), ('sampiyon', 1), ('olur', 1)
      Intermediate data sent to the reducer:
        ('Galatasaray', [1, 1]) ('sampiyon', [1, 1, 1]) ('olur', [1, 1, 1])
        ('beşiktas', [1]) ('fenerbahce', [1]) ('Türkiyedir', [1])
      Final reducer output:
        ('sampiyon', 3) ('olur', 3) ('Galatasaray', 2)
        ('fenerbahce', 1) ('beşiktas', 1) ('Türkiyedir', 1)
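The word-count example on slides 20 and 21 can be replayed on a single machine with a short Python sketch. It assumes whitespace tokenization and simulates the shuffle step with a dictionary; `map_fn`, `shuffle`, and `reduce_fn` are illustrative names, not Hadoop APIs, and nothing here runs on a cluster.

```python
from collections import defaultdict

def map_fn(input_key, input_value):
    """Emit (word, 1) for every whitespace-separated word in the value."""
    for word in input_value.split():
        yield (word, 1)

def shuffle(pairs):
    """Group mapper output by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_fn(output_key, intermediate_vals):
    """Sum the counts collected for one word."""
    return (output_key, sum(intermediate_vals))

# Input records copied from slide 20.
records = [
    (1000, "Galatasaray sampiyon olur"),
    (2000, "beşiktas sampiyon olur"),
    (2200, "Galatasaray Türkiyedir"),
    (3000, "fenerbahce sampiyon olur"),
]

mapped = [pair for key, value in records for pair in map_fn(key, value)]
grouped = shuffle(mapped)
counts = dict(reduce_fn(word, ones) for word, ones in grouped.items())
```

`mapped` corresponds to the mapper output, `grouped` to the intermediate data sent to the reducer, and `counts` to the final reducer output listed on slide 21.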
  22. DATA ACCESS AND PROCESSING: HIVE ▸ SQL access to data in HDFS through HiveQL (a SQL-like language) ▸ Hive transforms HiveQL queries into standard MapReduce jobs ▸ Schema on read via InputFormat and SerDe ▸ Not ideal for ad hoc queries (slow) ▸ Immature optimizer
  23. DATA ACCESS AND PROCESSING: HIVE ▸ Log processing ▸ Text mining ▸ Document indexing ▸ Business analytics ▸ Predictive modeling ▸ Not ideal for ad hoc queries
  24. DATA ACCESS AND PROCESSING: PIG ▸ Open source data flow system ▸ Simple language for queries and data manipulation, compiled into MapReduce jobs that run on Hadoop ▸ Provides common operations such as join, group, and sort ▸ Works on files in HDFS ▸ Ad hoc queries across large data sets ▸ Log analysis
  25. DATA ACCESS AND PROCESSING: CLOUDERA IMPALA ▸ Database-like SQL layer on top of Hadoop ▸ Distributed, massively parallel processing database engine ▸ SQL is the primary development language ▸ Open source; Impala processes data in the Hadoop cluster WITHOUT using MapReduce ▸ Interactive analysis on data stored in HDFS and HBase
  26. DATA ACCESS AND PROCESSING: ORACLE XQUERY FOR HADOOP ▸ A transformation engine for semi-structured data stored in Apache Hadoop ▸ Translates XQuery transformations into a series of MapReduce jobs ▸ Loads data efficiently into Oracle Database by using Oracle Loader for Hadoop ▸ Provides read and write support for Oracle NoSQL Database
  27. DATA ACCESS AND PROCESSING: ORACLE XQUERY FOR HADOOP
  28. DATA ACCESS AND PROCESSING: APACHE SPARK ▸ Open source parallel data processing ▸ Fast development ▸ Online streaming ▸ Interactive analytics ▸ Machine learning ▸ Speed
  29. DATA ACCESS AND PROCESSING: APACHE SPARK EXAMPLE
  30. DATA UNIFICATION AND ANALYSIS: APACHE SQOOP ▸ Batch loading ▸ Transfers bulk data between structured data stores and Apache Hadoop ▸ Data import and export between external data stores and Hadoop ▸ Parallelizes data transfer for fast performance
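The "parallelizes data transfer" bullet on slide 30 can be pictured with a small Python sketch: split a numeric key range into contiguous slices and copy each slice with its own worker. This is only an illustration of the idea, loosely modeled on splitting a table by a key column; it is not Sqoop's code, and the `SOURCE` table, `split_ranges`, and `copy_range` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def split_ranges(lo, hi, parts):
    """Divide the inclusive range [lo, hi] into up to `parts` contiguous ranges."""
    total = hi - lo + 1
    parts = max(1, min(parts, total))
    size, extra = divmod(total, parts)
    ranges, start = [], lo
    for i in range(parts):
        # The first `extra` ranges take one additional key each.
        end = start + size - 1 + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end + 1
    return ranges

# Hypothetical source table keyed by a numeric id.
SOURCE = {i: f"row-{i}" for i in range(1, 11)}

def copy_range(bounds):
    """Copy every row whose id falls inside one slice of the key range."""
    lo, hi = bounds
    return [SOURCE[i] for i in range(lo, hi + 1) if i in SOURCE]

ranges = split_ranges(min(SOURCE), max(SOURCE), parts=4)
with ThreadPoolExecutor(max_workers=4) as pool:
    chunks = list(pool.map(copy_range, ranges))
copied = [row for chunk in chunks for row in chunk]
```

Each worker touches a disjoint slice, so the slices can be transferred independently and concatenated afterwards.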
  31. DATA UNIFICATION AND ANALYSIS: ORACLE LOADER FOR HADOOP ▸ Batch loading ▸ High-performance loader for fast movement of data from Hadoop into a table in Oracle Database ▸ Loading in online and offline modes ▸ Offloads expensive data processing from the database server to Hadoop
  32. DATA UNIFICATION AND ANALYSIS: COPY TO BDA ▸ Batch loading
  33. DATA UNIFICATION AND ANALYSIS: ORACLE SQL CONNECTOR FOR HADOOP ▸ Generates an external table in the database pointing to HDFS data ▸ Load into the database or query data in place on HDFS ▸ Fine-grained control over type mapping ▸ Parallel load with automatic load balancing
  34. DATA UNIFICATION AND ANALYSIS: ORACLE TECHNOLOGIES
  35. DATA UNIFICATION AND ANALYSIS: ORACLE ADVANCED ANALYTICS ▸ OAA = Oracle Data Mining + Oracle R Enterprise ▸ Performance ▸ Predictive analytics ▸ Easy
  36. ORACLE BDA BENEFITS ▸ Ships with the leading Hadoop distribution (Cloudera) ▸ HDFS, HBase, Hive, Flume, Kafka, Spark … ▸ Cloudera Manager ▸ Ships with great connectivity to Oracle DB ▸ Big Data SQL ▸ Big Data Connectors & ODI
  37. THANK YOU ▸ ZEKERIYA BEŞIROĞLU ▸ BILGINC IT ACADEMY
