Big data hadoop-no sql and graph db-final

Presentation Transcript

  • Big Data - Hadoop - NoSQL and Graph Database
    Ramazan FIRIN, 20.11.2012
    This document is intended only for AVEA İletişim Hizmetleri A.Ş. ("AVEA"), its dealers, employees and/or others specifically authorised. The contents of this document are confidential and any disclosure, copying, distribution and/or taking any action in reliance on the content of this document is prohibited. AVEA is not liable for the transmission of this document in any manner to any third parties that are not authorised to receive it.
  • Agenda
    • Big Data
    • Hadoop
    • NoSQL
    • Graph DB and Neo4j
    • Possible Usage in Telco
    • Demo
  • Executive Summary
    • Big Data is a new IT trend
    • Hadoop and NoSQL can be used to process Big Data
    • Possible usage areas in Telco:
      - Prevent churn
      - Offer customer-specific campaigns
      - Gain more customers
    AVEA R&D / MW Development
  • What is Big Data? Datasets that are too awkward to work with using traditional, hands-on database management tools.
  • Big Data - 3V Concept (Volume, Velocity, Variety)
  • Big Data Sources
    1. Social network profiles - Facebook, LinkedIn, Yahoo, Google
    2. Social influencers - blog comments, user forums, review sites
    3. Activity-generated data - application logs, sensor data
    4. Public - Wikipedia, IMDb, etc.
    5. Data warehouse appliances - transactional data
    6. Network and in-stream monitoring
    7. Legacy documents
  • Big Data to Smart Data (cover of The Economist)
  • Volume
  • New Data Sources - Internet
    • 2 billion internet users by 2011
    • Twitter processes 7 terabytes of data every day
    • Facebook processes 10 terabytes of data every day
    • 4.6 billion mobile phones
    • Google processes 24 petabytes of data every day
  • Big Data Approach
  • Big Data Design
  • Big Data Usage by Sector
  • Sample Usage - 360° View of the Customers
  • Sample Usage - Customer Sentiment
  • Sample Usage - Detect Churn Pattern
  • Sample Usage - Health
  • Big Data Market
  • Big Data Solutions - Oracle Big Data Appliance
  • Big Data Solutions - IBM PureData
  • Top 10 Technology Trends 2012 from CSC
  • Gartner: Top 10 IT Trends for 2013
  • Gartner: 10 Critical IT Trends for the Next Five Years
    • The third trend is bigger data and storage:
    • By 2015, big data demand will generate 1 million jobs in the Global 1000,
    • but only one-third of those jobs will be filled due to a shortage of talent.
    • Analytics and pattern recognition are key.
    • New specialized ARM-based servers are appearing for specialty analytics.
  • Hadoop
  • What is Hadoop? The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.
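The "simple programming model" mentioned above is usually MapReduce. The following is a minimal single-process sketch of that model in plain Python, not Hadoop's actual API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made for illustration. On a real cluster the framework would run the map and reduce steps on many machines and perform the shuffle itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group emitted values by key (done by the framework on a cluster)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical input; each string stands for one line of a large file.
    print(word_count(["big data hadoop", "hadoop nosql", "big graph"]))
```

Because each map call sees only one line and each reduce call sees only one key, both phases can be spread across machines without coordination, which is what makes the model suitable for large data sets.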
  • History
  • Hadoop Components
  • Hadoop Ecosystem
    • Pig - a data processing language that simplifies Hadoop programming
    • Hive - SQL-like queries
    • HBase - random read/write over billions of rows and millions of columns (NoSQL)
  • Other Google Research
  • NoSQL
  • RDBMS Performance
  • Join is a killer...
  • What is NoSQL?
    • Stands for Not Only SQL
    • Non-relational
    • Cheap, easy to implement
    • Scalability
      - Vertically - add more resources to a single server
      - Horizontally - add more servers (nodes)
    • No pre-defined schema
    • No join operations
    • Typically not ACID; designed around the CAP theorem
  • NoSQL DB Types
    1. Key-Value Stores
    2. Document Databases
    3. Column Family Stores
    4. Graph Databases
  • Key-Value Stores - Redis, Voldemort
  • Document Databases - CouchDB, MongoDB
  • Column Family Stores - Cassandra, HBase
  • Graph Databases - Neo4j, InfoGrid, InfiniteGraph
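To make the four types above more concrete, here is a rough sketch of how the same record might be shaped in each model, using only plain Python structures. The subscriber data, key names and the `CALLS` relationship type are invented for illustration; real products (Redis, MongoDB, Cassandra, Neo4j and so on) each have their own storage formats and APIs, which this does not attempt to reproduce.

```python
import json

# 1. Key-value store: an opaque value looked up by a single key.
key_value = {"subscriber:1001": json.dumps({"name": "Ayse", "plan": "prepaid"})}

# 2. Document database: one nested, self-describing document per record.
document = {
    "_id": 1001,
    "name": "Ayse",
    "plan": "prepaid",
    "devices": [{"type": "phone", "os": "Android"}],
}

# 3. Column-family store: row key -> column family -> column -> value.
column_family = {
    "1001": {
        "profile": {"name": "Ayse", "plan": "prepaid"},
        "usage": {"2012-11": "320 min"},
    }
}

# 4. Graph database: nodes plus typed relationships between them.
graph = {
    "nodes": {1001: {"name": "Ayse"}, 1002: {"name": "Mehmet"}},
    "relationships": [(1001, "CALLS", 1002)],
}

if __name__ == "__main__":
    # The same fact ("plan") is reached differently in each model.
    print(json.loads(key_value["subscriber:1001"])["plan"])
    print(document["plan"])
    print(column_family["1001"]["profile"]["plan"])
    print(graph["relationships"][0])
```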
  • RDBMSs Support ACID
    • Atomicity - a transaction is all or nothing
    • Consistency - only valid data is written to the database
    • Isolation - concurrent transactions behave as if they were executed serially, so the data stays correct
    • Durability - once a transaction is committed, what you wrote is what you get, even after a failure
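Atomicity and consistency can be illustrated with a small sketch using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper and the sample balances are assumptions made for this example; the point is only that when the second statement of a transaction violates a constraint, the first statement is rolled back as well.

```python
import sqlite3
from typing import Dict


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move credit between two accounts atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # The CHECK constraint rejects a negative balance (consistency),
            # which makes the whole transaction roll back (atomicity).
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn: sqlite3.Connection) -> Dict[str, int]:
    """Return the current balance of every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


# Hypothetical in-memory database with two accounts.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 50), ("bob", 10)])
conn.commit()

if __name__ == "__main__":
    print(transfer(conn, "alice", "bob", 30), balances(conn))
    print(transfer(conn, "alice", "bob", 100), balances(conn))
```

In the failing transfer the credit to the destination account is applied first and then undone, so no partial update is ever visible after the call returns.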
  • NoSQL and the CAP Theorem
  • NoSQL and the CAP Theorem
    • Consistency - each client always has the same view of the data.
    • Availability - all clients can always read and write.
    • Partition tolerance - the system still works if one or more nodes fail or cannot reach each other.
    You can pick only two...
  • Visual Guide to NoSQL Systems
  • NoSQL Complexity
  • NoSQL Performance
  • Job Trends
  • Graph DB and Neo4j
  • Graph DB
    A graph database uses graph structures with nodes, edges, and properties to represent and store data.
  • Graph DB Usage Areas
    • Recommendations
    • Business Intelligence
    • Social networking
    • MDM
    • System Management
    • Time series data
    • Product Catalogue
    • Web Analytics
    • Scientific Computing
    • Indexing your slow RDBMS
  • Relational Databases are Graphs!
  • Neo4j
    • Leading graph database
    • Open source
    • Transaction support (ACID)
    • Traversal framework
    • Indexing
    • High performance (traverses 1,000,000+ relationships per second)
    • Querying
    • REST support
    • Robust (in 24/7 operation since 2003)
    • Disk based
    • Massive scalability
  • Neo4j Data Model
    Neo4j has nodes and relationships. Nodes and relationships have properties.
    Example (from the diagram): Node1 (properties: name, surname) is connected to Node2 (properties: name, surname) by a relationship of type "knows" with the property "date of meeting".
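The data model described above (nodes and relationships, both carrying properties) can be sketched in a few lines of plain Python. This is an in-memory illustration only and does not use Neo4j or its API; the class names, the `traverse` method and the sample people are assumptions. The breadth-first walk stands in for what a traversal framework does when it follows one relationship type for a limited number of hops.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class Node:
    node_id: int
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    start: int
    end: int
    rel_type: str
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Tiny in-memory property graph: nodes and typed, directed relationships."""

    def __init__(self) -> None:
        self.nodes: Dict[int, Node] = {}
        self.relationships: List[Relationship] = []

    def add_node(self, node_id: int, **properties: Any) -> Node:
        node = Node(node_id, properties)
        self.nodes[node_id] = node
        return node

    def relate(self, start: int, end: int, rel_type: str, **properties: Any) -> Relationship:
        rel = Relationship(start, end, rel_type, properties)
        self.relationships.append(rel)
        return rel

    def neighbours(self, node_id: int, rel_type: str) -> List[int]:
        """Nodes reached from node_id over one outgoing relationship of rel_type."""
        return [r.end for r in self.relationships
                if r.start == node_id and r.rel_type == rel_type]

    def traverse(self, start: int, rel_type: str, max_depth: int) -> List[int]:
        """Breadth-first traversal following rel_type for at most max_depth hops."""
        seen: Set[int] = {start}
        order: List[int] = []
        queue = deque([(start, 0)])
        while queue:
            current, depth = queue.popleft()
            if depth == max_depth:
                continue
            for nxt in self.neighbours(current, rel_type):
                if nxt not in seen:
                    seen.add(nxt)
                    order.append(nxt)
                    queue.append((nxt, depth + 1))
        return order


# Hypothetical data mirroring the slide: two "knows" relationships with a date property.
g = PropertyGraph()
g.add_node(1, name="Ali", surname="Kaya")
g.add_node(2, name="Veli", surname="Demir")
g.add_node(3, name="Zeynep", surname="Aydin")
g.relate(1, 2, "knows", date_of_meeting="2012-01-15")
g.relate(2, 3, "knows", date_of_meeting="2012-06-02")

if __name__ == "__main__":
    print(g.traverse(1, "knows", max_depth=2))
```

A friend-of-a-friend query is then just a traversal with `max_depth=2`, with no join between tables involved.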
  • Neo4j Performance into-neo4j-on-ec2/
  • Who uses Neo4j?
    • Cisco - Master Data Management
    • Telenor Group - customer organization structure (203 million subscribers)
    • Deutsche Telekom - social football site (150 million subscribers)
  • Cypher for Query
  • Sample Code
  • Spring Data Neo4j
  • Neoclipse
  • Product Catalog
  • Sample OM Data Model
  • Hardware Calculating Tool
  • Hardware Calculating Tool Result
    Calculation result for the production environment:
    • 4 physical machines
    • 3 nodes on every machine
    • 1024 MHz CPU
    • 65536 MB RAM
  • OrientDB
    • The document-graph database
    • ACID support
    • SQL and native queries
    • Schema-less, schema-full and schema-mixed modes
    • Roles + security
    • Functions
    • HTTP / RESTful / JSON / binary support
    • Hooks
    • Fetch plans
    • Inheritance
    • 200,000 inserts per second (6 M node traversals with cache)
  • FluxGraph
    • Temporal graph database
    • Has checkpoints
    • Compatible with Neo4j
  • Examples for Telcos
    • CDR
    • Routing
    • Social graphs
    • Master Data Management
    • Spatial and LBS
    • Network topology analysis
    • Neo4j and Android
  • CDR Analysis
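As a rough illustration of CDR analysis on a graph, the sketch below aggregates call detail records into a weighted call graph and lists subscribers who are in direct contact with someone who has already churned. Everything here is assumed for the example: the CDR layout `(caller, callee, duration_seconds)`, the function names, and the idea of treating contacts of churned subscribers as candidates for a retention campaign. The slides do not describe the actual analysis.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical CDR layout: (caller, callee, duration_seconds).
CDR = Tuple[str, str, int]


def build_call_graph(cdrs: Iterable[CDR]) -> Dict[str, Counter]:
    """Aggregate CDRs into a directed graph: caller -> callee -> total seconds."""
    graph: Dict[str, Counter] = defaultdict(Counter)
    for caller, callee, seconds in cdrs:
        graph[caller][callee] += seconds
    return graph


def contacts(graph: Dict[str, Counter], subscriber: str) -> Set[str]:
    """Everyone the subscriber called or was called by."""
    outgoing = set(graph.get(subscriber, {}))
    incoming = {caller for caller, callees in graph.items() if subscriber in callees}
    return outgoing | incoming


def at_risk_contacts(graph: Dict[str, Counter], churned: Set[str]) -> List[str]:
    """Subscribers directly connected to at least one churned subscriber."""
    risky: Set[str] = set()
    for subscriber in churned:
        risky |= contacts(graph, subscriber)
    return sorted(risky - churned)


# Made-up sample records.
sample_cdrs: List[CDR] = [
    ("A", "B", 120),
    ("A", "B", 60),
    ("B", "C", 30),
    ("D", "A", 45),
]
call_graph = build_call_graph(sample_cdrs)

if __name__ == "__main__":
    print(call_graph["A"]["B"])
    print(at_risk_contacts(call_graph, {"C"}))
```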
  • Master Data Management
  • Network Management
  • Cell Network Analysis
  • Sample Scenarios
    • Customer-specific campaigns
    • Prevent churn
    • Gain more customers
    • Special offers for campaigns
  • Thanks