Neptune Distributed Data System

Neptune is a distributed, large-scale structured data store and an open source project implementing Google's Bigtable.


Transcript

  • 1. Neptune
    hjkim
    http://www.openneptune.com
    http://dev.naver.com/projects/neptune (Korean)
    babokim@gmail.com
  • 2. Neptune
    Distributed data storage
    Semi-structured data store (not a file system)
    Uses a distributed file system for data files
    Supports real-time and batch processing
    Google Bigtable clone: data model, architecture, features
    Open source
    http://dev.naver.com/projects/neptune (Korean)
    http://www.openneptune.com
    Goal
    500 nodes
    200 GB or more per node; petabyte scale
  • 3. Features
    Schema management: create, drop, modify table schema
    Real-time transaction
    Single-row operation (no join, group by, order by)
    Multi-row operation: like, between
    Batch transaction: Scanner, Direct Uploader, MapReduce Adapter
    Scalability: automatic table split and re-assignment
    Reliability
    Data files stored in the distributed file system
    Commit log stored in the ChangeLog cluster
    Failover: tablet takeover time of at most 1 minute
    Utility: Web Console, Shell (simple query), Data Verifier
  • 4. Architecture
    [Diagram labels: User Application and MapReduce on top; Neptune Master and Neptune TabletServers #1 to #n serving a Table; Distributed File System (Hadoop or other) underneath; physical storage at the bottom.]
  • 5. Components
    [Diagram labels: Neptune Client (NTable, Scanner, Shell); Neptune Master (shown twice); Lock Server (ZooKeeper, Pleidas, NChubby) with failover/event links; TabletServers #1 to #n (Neptune); LogServers #1 to #n; DFS #1 to #n (DataNode) and Computing #1 to #n (Map&Reduce) on local disks. Links are labeled Data/Control and Control.]
  • 6. Data Model
    [Diagram labels: a Table has a Rowkey and Column#1 to Column#n. Rows are sorted by rowKey and cells within a column are sorted by columnKey. Row ranges are grouped into tablets TabletA-1 to TabletA-n. A Row has a Row.Key and columns Column1 to Column-n; each column holds cells Cell1 to Cell-n; each Cell has a Cell.Key and timestamped values Cell.Value(t1) to Cell.Value(tn).]
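
The data model on this slide can be read as a nested sorted map. The following is a minimal sketch under that reading, not Neptune's own classes: `SketchTable`, its methods, and the use of `TreeMap` are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.TreeMap;

// Hypothetical in-memory model of one table; not part of the Neptune code base.
class SketchTable {
    // rowKey -> column name -> cell key -> timestamp -> value; every level is sorted.
    private final TreeMap<String, TreeMap<String, TreeMap<String, TreeMap<Long, byte[]>>>> rows =
            new TreeMap<>();

    // Stores one timestamped value for a cell, creating the row and column as needed.
    void put(String rowKey, String column, String cellKey, long timestamp, byte[] value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(column, k -> new TreeMap<>())
            .computeIfAbsent(cellKey, k -> new TreeMap<>())
            .put(timestamp, value);
    }

    // Returns the newest version of a cell, or null if the cell does not exist.
    byte[] getLatest(String rowKey, String column, String cellKey) {
        TreeMap<String, TreeMap<String, TreeMap<Long, byte[]>>> row = rows.get(rowKey);
        if (row == null) return null;
        TreeMap<String, TreeMap<Long, byte[]>> cells = row.get(column);
        if (cells == null) return null;
        TreeMap<Long, byte[]> versions = cells.get(cellKey);
        return versions == null ? null : versions.lastEntry().getValue();
    }

    // Row keys in sorted order, the order in which row ranges would be cut into tablets.
    Iterable<String> rowKeys() {
        return rows.keySet();
    }
}

class DataModelDemo {
    public static void main(String[] args) {
        SketchTable table = new SketchTable();
        table.put("rk-1", "Column#1", "ck-1", 1L, "v1".getBytes(StandardCharsets.UTF_8));
        table.put("rk-1", "Column#1", "ck-1", 2L, "v2".getBytes(StandardCharsets.UTF_8));
        byte[] latest = table.getLatest("rk-1", "Column#1", "ck-1");
        System.out.println(new String(latest, StandardCharsets.UTF_8));
    }
}
```

Keeping every level sorted is what makes range scans and splitting a table into contiguous tablets cheap in this kind of model.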
  • 7. Index
    [Diagram labels: the Root Index is the index of the Meta tablets (entries such as M.T1.1000:M1, M.T1.2000:M2, Max:mn); each Meta Tablet (m1, m2, ..., mn) is the index of user tablets (entries such as T1.100:U1, T1.200:U2, T1.1000:UN); user-defined tablets (U1, U2, ...) hold the row ranges.]
    Index record format:
    Key: TableName.MaxRowKey
    Value: tablet name, assigned host
    TableMapFile: physical file, sorted by row key, then column key
    Index of a TableMapFile's blocks: (max-key, file-offset); the diagram labels a 64KB scan within a block
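
Because index records are keyed by `TableName.MaxRowKey`, locating the tablet for a row amounts to finding the first record whose max row key is not smaller than the requested key. The sketch below shows one index level only, under that assumption; `TabletIndex`, `TabletLocation`, and the host names are hypothetical, and the same lookup would be repeated from the root index to a meta tablet and from there to a user tablet.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical value of an index record: tablet name and assigned host.
class TabletLocation {
    final String tabletName;
    final String host;

    TabletLocation(String tabletName, String host) {
        this.tabletName = tabletName;
        this.host = host;
    }
}

// Hypothetical single index level keyed by "TableName.MaxRowKey".
class TabletIndex {
    private final TreeMap<String, TabletLocation> records = new TreeMap<>();

    void add(String table, String maxRowKey, String tabletName, String host) {
        records.put(table + "." + maxRowKey, new TabletLocation(tabletName, host));
    }

    // Returns the first tablet of the table whose max row key is >= rowKey,
    // or null if no tablet of that table covers the key.
    TabletLocation locate(String table, String rowKey) {
        Map.Entry<String, TabletLocation> entry = records.ceilingEntry(table + "." + rowKey);
        if (entry == null || !entry.getKey().startsWith(table + ".")) {
            return null;
        }
        return entry.getValue();
    }
}

class TabletIndexDemo {
    public static void main(String[] args) {
        TabletIndex index = new TabletIndex();
        index.add("T1", "g", "U1", "host1");
        index.add("T1", "p", "U2", "host2");
        TabletLocation location = index.locate("T1", "hello");
        System.out.println(location.tabletName + " on " + location.host);
    }
}
```

Note that keys compare as strings here, so numeric row keys such as those on the slide would need a fixed-width encoding to sort numerically.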
  • 8. Data/index file in HDFS
    TabletA
    TabletB
    Column
    Data/index file
  • 9. TabletServer
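    [Diagram labels: Data Operation put(key, value) goes to the ChangeLog (ChangeLogServer) and the MemoryTable; Minor Compaction turns the MemoryTable into MapFiles #1 to #n (HDFS); get(key) goes through a Searcher over the MemoryTable and the MapFiles; Major Compaction produces a merged MapFile (HDFS).]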
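
The TabletServer slide follows the Bigtable-style write path: log the change, update the in-memory table, flush it to an immutable file (minor compaction), and periodically merge the files (major compaction). The following is a rough single-process sketch of that flow. `SketchTabletServer`, the flush threshold, and the in-memory lists standing in for the ChangeLog servers and HDFS MapFiles are all assumptions, not Neptune's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical, simplified tablet server; deletes and timestamps are left out.
class SketchTabletServer {
    private final List<String> changeLog = new ArrayList<>();                 // stands in for the ChangeLog servers
    private TreeMap<String, String> memoryTable = new TreeMap<>();            // MemoryTable
    private final List<TreeMap<String, String>> mapFiles = new ArrayList<>(); // stands in for MapFiles in HDFS, oldest first
    private final int flushThreshold;

    SketchTabletServer(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void put(String key, String value) {
        changeLog.add(key + "=" + value); // log first so the write can be replayed after a crash
        memoryTable.put(key, value);
        if (memoryTable.size() >= flushThreshold) {
            minorCompaction();
        }
    }

    // Searches the MemoryTable first, then the MapFiles from newest to oldest.
    String get(String key) {
        String value = memoryTable.get(key);
        if (value != null) return value;
        for (int i = mapFiles.size() - 1; i >= 0; i--) {
            value = mapFiles.get(i).get(key);
            if (value != null) return value;
        }
        return null;
    }

    // Minor compaction: freeze the MemoryTable as a new sorted file.
    void minorCompaction() {
        if (memoryTable.isEmpty()) return;
        mapFiles.add(memoryTable);
        memoryTable = new TreeMap<>();
        changeLog.clear(); // flushed entries no longer need to be replayed
    }

    // Major compaction: merge all files into one; newer files overwrite older ones.
    void majorCompaction() {
        TreeMap<String, String> merged = new TreeMap<>();
        for (TreeMap<String, String> file : mapFiles) {
            merged.putAll(file);
        }
        mapFiles.clear();
        if (!merged.isEmpty()) mapFiles.add(merged);
    }

    int mapFileCount() {
        return mapFiles.size();
    }
}
```

Reads get slower as files accumulate, since each `get` may probe every file; that is the cost major compaction is meant to bound.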
  • 10. Failover
    Master failure
    Only table schema management and tablet split are disabled
    Multiple masters can be run
    TabletServer failure
    The master reassigns its tablets to other TabletServers
    Within 2 minutes
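
The TabletServer failover case can be illustrated with a small reassignment routine. This is only a sketch of the idea on the slide: `SketchMaster`, its round-robin policy, and the host names are assumptions, and failure detection (which the slides attribute to the lock server) is not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical master that tracks which host serves which tablet.
class SketchMaster {
    private final Map<String, String> assignment = new TreeMap<>(); // tablet name -> host
    private final List<String> liveServers = new ArrayList<>();

    void addServer(String host) {
        liveServers.add(host);
    }

    void assign(String tablet, String host) {
        assignment.put(tablet, host);
    }

    String hostOf(String tablet) {
        return assignment.get(tablet);
    }

    // Called once a TabletServer is known to be down: its tablets are spread
    // over the surviving servers in round-robin order (an assumed policy).
    void onServerFailure(String failedHost) {
        liveServers.remove(failedHost);
        if (liveServers.isEmpty()) {
            throw new IllegalStateException("no live TabletServer left");
        }
        int next = 0;
        for (Map.Entry<String, String> entry : assignment.entrySet()) {
            if (entry.getValue().equals(failedHost)) {
                entry.setValue(liveServers.get(next % liveServers.size()));
                next++;
            }
        }
    }
}

class FailoverDemo {
    public static void main(String[] args) {
        SketchMaster master = new SketchMaster();
        master.addServer("ts1");
        master.addServer("ts2");
        master.assign("TabletA-1", "ts1");
        master.onServerFailure("ts1");
        System.out.println(master.hostOf("TabletA-1"));
    }
}
```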
  • 11. MapReduce with Neptune
    [Diagram labels: Hadoop TaskTrackers run Map Tasks over the tablets of TableA (Tablet A-1 to Tablet A-N) through TabletInputFormat; map output is partitioned by key; Reduce Tasks write to the tablets of TableB (Tablet B-1, Tablet B-2), or to a DBMS or HDFS; the META Table is also shown.]
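
The "partitioned by key" step can be pictured as range partitioning: each reduce task receives the keys that fall into one target tablet's row range. The sketch below is plain Java and deliberately does not use Hadoop's `Partitioner` API; `TabletRangePartitioner` and the boundary keys are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical range partitioner: one partition per target tablet.
class TabletRangePartitioner {
    private final String[] maxRowKeys; // sorted max row key of each target tablet

    TabletRangePartitioner(String[] maxRowKeys) {
        this.maxRowKeys = maxRowKeys.clone();
        Arrays.sort(this.maxRowKeys);
    }

    // Index of the first tablet whose max row key is >= key; keys beyond the
    // last boundary fall into the last partition.
    int partitionFor(String key) {
        int pos = Arrays.binarySearch(maxRowKeys, key);
        if (pos < 0) {
            pos = -pos - 1; // binarySearch encodes the insertion point as -(point) - 1
        }
        return Math.min(pos, maxRowKeys.length - 1);
    }
}

class PartitionDemo {
    public static void main(String[] args) {
        TabletRangePartitioner partitioner =
                new TabletRangePartitioner(new String[] {"m", "z"});
        for (String key : new String[] {"apple", "m", "n", "zz"}) {
            System.out.println(key + " -> partition " + partitioner.partitionFor(key));
        }
    }
}
```

Partitioning by tablet range rather than by hash keeps each reducer's output sorted within a single tablet, which fits a store whose files are sorted by row key.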
  • 12. Client
    Client API
    Single row operation: put/get
    Multi row operation: like, between
    Batch operation: scanner/uploader
    MapReduce: TabletInputFormat
    Command line Shell
    NQL(Neptune Query Language)
    JDBC support
    Web Console
  • 13. Client API Example
    // Create a table with two columns.
    TableSchema tableSchema =
        new TableSchema("T_TEST", new String[]{"col1", "col2"});
    NTable.createTable(tableSchema);
    // Open the table and write one cell into row "RK1".
    NTable ntable = NTable.openTable("T_TEST");
    Row row = new Row(new Row.Key("RK1"));
    row.addCell("col1", new Cell(new Cell.Key("CK1"), "test_value".getBytes()));
    ntable.put(row);
    // Read the row back by its key.
    Row selectedRow = ntable.get(new Row.Key("RK1"));
    System.out.println(selectedRow.getCellList("col1").get(0));
    // Scan column "col1" over the whole table.
    TableScanner scanner = ntable.openScanner(ntable, new String[]{"col1"});
    Row scanRow = null;
    while ((scanRow = scanner.next()) != null) {
        System.out.println(scanRow.getCellList("col1").get(0));
    }
    scanner.close();
  • 14. Neptune Shell
    Data Definition
    CREATE TABLE
    DROP TABLE
    SHOW TABLES
    DESC
    Data Manipulation
    SELECT
    DELETE
    INSERT
    TRUNCATE COLUMN
    TRUNCATE TABLE
    Cluster Monitoring
    PING TABLETSERVER
    REPORT TABLE
  • 15. Web Console
  • 16. Performance
    Number of 1000-byte values read/written per second
  • 17. HBase/Bigtable Comparison
  • 18. Powered by Neptune
    http://searcus.com/nosql
    Twitter search service