Neptune Distributed Data System
Neptune is a distributed, large-scale structured data store and an open source project implementing Google's Bigtable.
    Neptune Distributed Data System: Presentation Transcript

    • Neptune
      hjkim
      http://www.openneptune.com
      http://dev.naver.com/projects/neptune(korean)
      babokim@gmail.com
    • Neptune
      Distributed Data Storage
      Semi-structured data store (not a file system)
      Uses a distributed file system for data files
      Supports real-time and batch processing
      Google Bigtable clone
      Data Model, Architecture, Features
      Open source
      http://dev.naver.com/projects/neptune(korean)
      http://www.openneptune.com
      Goal
      500 nodes
      200 GB or more per node, petabytes in total
    • Features
      Schema Management
      Create, drop, modify table schema
      Real-time Transaction
      Single row operation(no join, group by, order by)
      Multi row operation: like, between
      Batch Transaction
      Scanner, Direct Uploader, MapReduce Adapter
      Scalability
      Automatic table split & re-assignment
      Reliability
      Data file stored in Distributed File System
      Commit log stored in ChangeLog Cluster
      Failover
      Tablet takeover time: max 1 min.
      Utility
      Web Console, Shell(simple query), Data Verifier
    • Architecture
      User Application, MapReduce
      Neptune Master
      Neptune TabletServer #1, TabletServer #2, ... TabletServer #n (serving tables)
      Distributed File System (Hadoop or other)
      Physical storage
    • Components
      Master: Neptune Master (more than one instance shown)
      Lock Server: ZooKeeper, Pleidas, NChubby (failover / event)
      Neptune Client: NTable, Scanner, Shell
      Connections: Data/Control, Control
      Node #1 ... #n, each with: TabletServer (Neptune), LogServer, DFS (DataNode), Computing (Map&Reduce), Local disk
    • Data Model
      Table: rows identified by Rowkey, with columns Column#1 ... Column#n
      Rows row #1 ... row #n are partitioned into tablets TabletA-1, TabletA-2, ... TabletA-n
      - Sorted by rowKey
      - Sorted by columnKey
      Row: Row.Key plus Column1, Column2, ... Column-n
      Column: a list of cells (Cell1, Cell2, Cell3, ... Cell-n)
      Cell: Cell.Key plus timestamped values Cell.Value(t1), Cell.Value(t2), ... Cell.Value(tn)
      Example: row rk-1 holds cells ck-1, ck-2, ... ck-n with (value, timestamp) pairs such as (v1, t1), (v2, t2), (v3, t2), (v4, t3), (v5, t4), ... (vn, tn)
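
The data model above can be read as a set of nested sorted maps. The following is a minimal sketch of that reading, not Neptune's actual implementation: the class `DataModelSketch` and its methods are hypothetical names chosen for illustration, and the only library used is `java.util.TreeMap`.

```java
import java.util.TreeMap;

/** Hypothetical in-memory model: rowKey -> column -> cellKey -> timestamp -> value. */
final class DataModelSketch {
    // TreeMap keeps row keys and cell keys sorted, matching "Sorted by rowKey / columnKey".
    private final TreeMap<String, TreeMap<String, TreeMap<String, TreeMap<Long, byte[]>>>> rows =
            new TreeMap<>();

    /** Stores one timestamped value under (rowKey, column, cellKey). */
    void put(String rowKey, String column, String cellKey, long timestamp, byte[] value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(column, k -> new TreeMap<>())
            .computeIfAbsent(cellKey, k -> new TreeMap<>())
            .put(timestamp, value);
    }

    /** Returns the value with the highest timestamp, or null if the cell does not exist. */
    byte[] getLatest(String rowKey, String column, String cellKey) {
        TreeMap<String, TreeMap<String, TreeMap<Long, byte[]>>> columns = rows.get(rowKey);
        if (columns == null) return null;
        TreeMap<String, TreeMap<Long, byte[]>> cells = columns.get(column);
        if (cells == null) return null;
        TreeMap<Long, byte[]> versions = cells.get(cellKey);
        return versions == null ? null : versions.lastEntry().getValue();
    }

    /** Smallest row key currently stored. */
    String firstRowKey() {
        return rows.firstKey();
    }
}
```

A tablet would then correspond to a contiguous range of row keys in the outermost map.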
    • Index
      Root Index: index of Meta Index (e.g. M.T1.1000:M1, M.T1.2000:M2, ... Max:mn)
      Meta Tablet (m1, m2, ... mn): index of User Tablet (e.g. T1.100:U1, T1.200:U2, ... T1.1000:UN; T1.1100:U1, T1.1200:U2, ... T1.2000:UN)
      User defined Tablet (U1, U2, ...): row keys 10, 20, ... 100; 110, 120, ... 200
      Index Record format:
      . Key - TableName.MaxRowKey
      . Value - Tablet Name, assigned host
      TableMapFile (physical file, sorted by rowkey, columnkey)
      Index of TableMapFile's blocks (max-key, file-offset); a 64KB block is scanned
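
Because each index record is keyed by the maximum row key of the tablet it points to, finding the tablet for a row is a "smallest key greater than or equal to" search at each level. The sketch below illustrates that idea for a single table with a root level and a meta level. It is an assumption-laden illustration, not Neptune's code: `TabletIndexSketch`, `Location`, `addUserTablet`, and `locate` are hypothetical names, and the example assumes fixed-width row keys so that string order matches numeric order.

```java
import java.util.Map;
import java.util.TreeMap;

final class TabletIndexSketch {
    /** Value side of an index record: tablet name and the host it is assigned to. */
    static final class Location {
        final String tabletName;
        final String host;

        Location(String tabletName, String host) {
            this.tabletName = tabletName;
            this.host = host;
        }
    }

    // Root level: max row key covered by a meta tablet -> that meta tablet's index.
    // Meta level: max row key of a user tablet -> location of that user tablet.
    private final TreeMap<String, TreeMap<String, Location>> root = new TreeMap<>();

    /** Registers a user tablet under the meta tablet identified by metaMaxKey. */
    void addUserTablet(String metaMaxKey, String tabletMaxKey, Location location) {
        root.computeIfAbsent(metaMaxKey, k -> new TreeMap<>()).put(tabletMaxKey, location);
    }

    /** Returns the tablet whose range contains rowKey, or null if rowKey is past the last tablet. */
    Location locate(String rowKey) {
        // ceilingEntry returns the entry with the smallest key >= rowKey.
        Map.Entry<String, TreeMap<String, Location>> meta = root.ceilingEntry(rowKey);
        if (meta == null) return null;
        Map.Entry<String, Location> user = meta.getValue().ceilingEntry(rowKey);
        return user == null ? null : user.getValue();
    }
}
```

With user tablets ending at `0100`, `0200`, and `1000` under one meta tablet, a lookup for row key `0150` lands on the tablet whose maximum key is `0200`.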
    • Data/index file in HDFS
      TabletA
      TabletB
      Column
      Data/index file
    • TabletServer
      Data Operation
      put(key, value): written to the ChangeLog (ChangeLogServer) and the MemoryTable
      get(key): the Searcher reads the MemoryTable and the MapFiles
      Minor Compaction: MemoryTable to MapFile#1, MapFile#2, ... MapFile #n (HDFS)
      Major Compaction: MapFiles to a Merged MapFile (HDFS)
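
The put/get path and the two compactions can be sketched with in-memory stand-ins. This is a simplified illustration under stated assumptions, not the TabletServer's real code: `TabletStoreSketch` and its members are hypothetical, sorted maps stand in for MapFiles on HDFS, a list stands in for the ChangeLogServer, and truncating the log after a flush is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class TabletStoreSketch {
    private final List<String> changeLog = new ArrayList<>();                 // stand-in for the ChangeLogServer
    private TreeMap<String, String> memoryTable = new TreeMap<>();            // recent writes, sorted by key
    private final List<TreeMap<String, String>> mapFiles = new ArrayList<>(); // stand-ins for MapFiles, oldest first
    private final int flushThreshold;

    TabletStoreSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void put(String key, String value) {
        changeLog.add(key + "=" + value); // log first so the write could be replayed after a crash
        memoryTable.put(key, value);
        if (memoryTable.size() >= flushThreshold) minorCompaction();
    }

    String get(String key) {
        String value = memoryTable.get(key);
        if (value != null) return value;
        for (int i = mapFiles.size() - 1; i >= 0; i--) { // newest file first
            value = mapFiles.get(i).get(key);
            if (value != null) return value;
        }
        return null;
    }

    /** Minor compaction: freeze the MemoryTable into a new immutable "MapFile". */
    void minorCompaction() {
        if (memoryTable.isEmpty()) return;
        mapFiles.add(memoryTable);
        memoryTable = new TreeMap<>();
        changeLog.clear(); // assumption: logged entries are no longer needed once flushed
    }

    /** Major compaction: merge all "MapFiles" into one, newer values winning. */
    void majorCompaction() {
        TreeMap<String, String> merged = new TreeMap<>();
        for (TreeMap<String, String> file : mapFiles) merged.putAll(file); // oldest to newest
        mapFiles.clear();
        if (!merged.isEmpty()) mapFiles.add(merged);
    }

    int mapFileCount() {
        return mapFiles.size();
    }
}
```

Reads check the MemoryTable before any file, and newer files before older ones, so the most recent write for a key is always the one returned.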
    • Failover
      Master failure
      Only table schema management and tablet split are disabled
      Multiple masters can be run (multi-master)
      TabletServer failure
      The master reassigns its tablets to other TabletServers
      within 2 minutes
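
The slides do not say how the master chooses a new TabletServer for each orphaned tablet. As one possible policy, the sketch below moves each tablet of the failed server to whichever surviving server currently holds the fewest tablets. The class `ReassignSketch`, its methods, and the least-loaded rule are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ReassignSketch {
    // TabletServer name -> tablets it currently serves.
    private final Map<String, List<String>> assignment = new TreeMap<>();

    void addServer(String server) {
        assignment.putIfAbsent(server, new ArrayList<>());
    }

    void assign(String server, String tablet) {
        assignment.computeIfAbsent(server, k -> new ArrayList<>()).add(tablet);
    }

    /** Moves every tablet of the failed server to the least-loaded surviving server. */
    void handleFailure(String failedServer) {
        List<String> orphans = assignment.remove(failedServer);
        if (orphans == null || orphans.isEmpty()) return;
        if (assignment.isEmpty()) throw new IllegalStateException("no live TabletServer left");
        for (String tablet : orphans) {
            String target = null;
            for (Map.Entry<String, List<String>> e : assignment.entrySet()) {
                if (target == null || e.getValue().size() < assignment.get(target).size()) {
                    target = e.getKey();
                }
            }
            assignment.get(target).add(tablet);
        }
    }

    /** Tablets served by the given server, or null if the server is not known. */
    List<String> tabletsOn(String server) {
        return assignment.get(server);
    }
}
```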
    • MapReduce with Neptune
      Hadoop TaskTrackers run the Map Tasks and Reduce Tasks
      Input: TableA (TabletA-1, Tablet A-2, Tablet A-3, ... Tablet A-N), read by Map Tasks through TabletInputFormat
      Map output is partitioned by key and sent to the Reduce Tasks
      Output: TableB (Tablet B-1, Tablet B-2), DBMS or HDFS
      META Table
    • Client
      Client API
      Single row operation: put/get
      Multi row operation: like, between
      Batch operation: scanner/uploader
      MapReduce: TabletInputFormat
      Command line Shell
      NQL(Neptune Query Language)
      JDBC support
      Web Console
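
Since rows are kept sorted by row key, the multi-row operations map naturally onto range scans. The sketch below illustrates this on a plain sorted map. It is not the Neptune client API: `RowRangeSketch` is a hypothetical class, and it assumes that "between" is inclusive on both ends and that "like" means a row-key prefix match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class RowRangeSketch {
    private final TreeMap<String, String> rows = new TreeMap<>(); // row key -> value, sorted by key

    void put(String rowKey, String value) {
        rows.put(rowKey, value);
    }

    /** "between": row keys in [startKey, endKey], both ends inclusive (assumed). */
    List<String> between(String startKey, String endKey) {
        return new ArrayList<>(rows.subMap(startKey, true, endKey, true).keySet());
    }

    /** "like": row keys starting with the given prefix (assumed semantics). */
    List<String> like(String prefix) {
        List<String> result = new ArrayList<>();
        // Start at the first key >= prefix and stop as soon as the prefix no longer matches.
        for (String key : rows.tailMap(prefix, true).keySet()) {
            if (!key.startsWith(prefix)) break;
            result.add(key);
        }
        return result;
    }
}
```

Both operations touch only the matching key range rather than the whole table, which is the reason a sorted layout suits them.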
    • Client API Example
      // Create a table with two columns
      TableSchema tableSchema =
      new TableSchema("T_TEST", new String[]{"col1", "col2"});
      NTable.createTable(tableSchema);
      // Open the table and write one row
      NTable ntable = NTable.openTable("T_TEST");
      Row row = new Row(new Row.Key("RK1"));
      row.addCell("col1", new Cell(new Cell.Key("CK1"), "test_value".getBytes()));
      ntable.put(row);
      // Read the row back by its key
      Row selectedRow = ntable.get(new Row.Key("RK1"));
      System.out.println(selectedRow.getCellList("col1").get(0));
      // Scan col1 across the table
      TableScanner scanner = ntable.openScanner(ntable, new String[]{"col1"});
      Row scanRow = null;
      while ((scanRow = scanner.next()) != null) {
      System.out.println(scanRow.getCellList("col1").get(0));
      }
      scanner.close();
    • Neptune Shell
      Data Definition
      CREATE TABLE
      DROP TABLE
      SHOW TABLES
      DESC
      Data Manipulation
      SELECT
      DELETE
      INSERT
      TRUNCATE COLUMN
      TRUNCATE TABLE
      Cluster Monitoring
      PING TABLETSERVER
      REPORT TABLE
    • Web Console
    • Performance
      Number of 1000-byte values read/written per second
    • HBase/Bigtable Comparison
    • Powered by Neptune
      http://searcus.com/nosql
      Twitter search service