HBaseCon 2012 | Storing and Manipulating Graphs in HBase

Cloudera, Inc.
Storing and Manipulating Graphs in HBase

Dan Lynn
CTO & Co-Founder
dan@fullcontact.com
@danklynn

Keeps Contact Information Current and Complete
Based in Denver, Colorado
Turn Partial Contacts Into Full Contacts
Refresher: Graph Theory

Vertex

Edge
Social Networks

Tweets

[Diagram: @danklynn "follows" @xorlev; @xorlev is the "author" of the tweet “#HBase rocks”; @danklynn "retweeted" that tweet]
Web Links

[Diagram: http://fullcontact.com/blog/ links to http://techstars.com/ through the anchor <a href="...">TechStars</a>]
Why should you care?

Vertex Influence
- PageRank

- Social Influence

- Network bottlenecks

Identifying Communities
Storage Options
neo4j

- Very expressive querying (e.g. Gremlin)
- Transactional
- Data must fit on a single machine :-(
FlockDB

- Scales horizontally
- Very fast
- No multi-hop query support :-(
RDBMS (e.g. MySQL, Postgres, et al.)

- Transactional
- Huge amounts of JOINing :-(
HBase

- Massively scalable
- Data model well-suited
- Multi-hop querying?
Modeling Techniques
Adjacency Matrix

[Diagram: three vertices, 1, 2, and 3, each connected to the other two]

    1   2   3

1   0   1   1

2   1   0   1

3   1   1   0

- Can use vectorized libraries
- Requires O(n^2) memory (n = number of vertices)
- Hard(er) to distribute
Adjacency List

[Diagram: the same three vertices, 1, 2, and 3]

1           2,3

2           1,3

3           1,2
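The two representations above can be compared side by side with a small sketch. It is plain Java with no HBase involved, and the class and method names (GraphRepresentations, toMatrix, toAdjacencyList) are made up for illustration. It builds both structures for the three-vertex example graph.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: names are hypothetical, not taken from the talk.
final class GraphRepresentations {

    // Adjacency matrix: cell [i][j] is 1 when vertices i+1 and j+1 share an edge.
    // Always allocates n * n cells, however few edges exist.
    static int[][] toMatrix(int vertexCount, int[][] edges) {
        int[][] matrix = new int[vertexCount][vertexCount];
        for (int[] e : edges) {
            matrix[e[0] - 1][e[1] - 1] = 1;
            matrix[e[1] - 1][e[0] - 1] = 1; // undirected graph: mirror the cell
        }
        return matrix;
    }

    // Adjacency list: each vertex maps to the list of its neighbours.
    // Memory grows with the number of edges rather than n * n.
    static Map<Integer, List<Integer>> toAdjacencyList(int[][] edges) {
        Map<Integer, List<Integer>> adjacency = new TreeMap<>();
        for (int[] e : edges) {
            adjacency.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
            adjacency.computeIfAbsent(e[1], k -> new ArrayList<>()).add(e[0]);
        }
        return adjacency;
    }

    public static void main(String[] args) {
        int[][] edges = {{1, 2}, {1, 3}, {2, 3}};
        System.out.println(Arrays.deepToString(toMatrix(3, edges)));
        System.out.println(toAdjacencyList(edges));
    }
}
```

The list form maps naturally onto a keyed store: one entry per vertex, holding only the neighbours that actually exist.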
Adjacency List Design in HBase

[Diagram: vertices e:dan@fullcontact.com, p:+13039316251, and t:danklynn, each connected to the other two]

      row key               “edges” column family

e:dan@fullcontact.com   p:+13039316251= ...

                        t:danklynn= ...


p:+13039316251          e:dan@fullcontact.com= ...

                        t:danklynn= ...


t:danklynn              e:dan@fullcontact.com= ...

                        p:+13039316251= ...

What to store?
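As a rough mental model of the layout above, the table can be pictured as a sorted map of row keys, each holding a sorted map of column qualifiers to values. The sketch below is an in-memory stand-in only, not HBase client code. EdgeTableModel and its methods are hypothetical names, and the "1.0" values are placeholders for whatever edge data is stored.

```java
import java.util.Collections;
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory stand-in for the "edges" column family; not an HBase API.
final class EdgeTableModel {

    // row key (source vertex) -> column qualifier (destination vertex) -> edge value
    private final NavigableMap<String, NavigableMap<String, String>> rows = new TreeMap<>();

    // One cell per outbound edge, stored in the source vertex's row.
    void putEdge(String from, String to, String value) {
        rows.computeIfAbsent(from, k -> new TreeMap<>()).put(to, value);
    }

    // An undirected edge is written twice, once in each endpoint's row.
    void putUndirectedEdge(String a, String b, String value) {
        putEdge(a, b, value);
        putEdge(b, a, value);
    }

    // Reading all edges of a vertex is a single-row lookup.
    NavigableMap<String, String> edgesOf(String vertex) {
        NavigableMap<String, String> edges = rows.get(vertex);
        return edges == null
                ? Collections.<String, String>emptyNavigableMap()
                : Collections.unmodifiableNavigableMap(edges);
    }

    public static void main(String[] args) {
        EdgeTableModel table = new EdgeTableModel();
        table.putUndirectedEdge("e:dan@fullcontact.com", "p:+13039316251", "1.0");
        table.putUndirectedEdge("e:dan@fullcontact.com", "t:danklynn", "1.0");
        table.putUndirectedEdge("p:+13039316251", "t:danklynn", "1.0");
        System.out.println(table.edgesOf("t:danklynn"));
    }
}
```

Because the destination vertex is the column qualifier, adding an edge never requires rewriting the rest of the row.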
Custom Writables
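package org.apache.hadoop.io;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

public interface Writable {

    // serialize this object's fields to the output
    void write(DataOutput dataOutput) throws IOException;

    // populate this object's fields from the input
    void readFields(DataInput dataInput) throws IOException;
}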
                                                    java
Custom Writables
import org.apache.hadoop.io.Writable

class EdgeValueWritable implements Writable {

    EdgeValue edgeValue

    // serialize only the edge weight
    void write(DataOutput dataOutput) {
        dataOutput.writeDouble edgeValue.weight
    }

    // rebuild the EdgeValue from the serialized weight
    void readFields(DataInput dataInput) {
        double weight = dataInput.readDouble()
        edgeValue = new EdgeValue(weight)
    }

    // ...
}
                                                  groovy
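To see the write/readFields pair round-trip a value without a Hadoop installation, the sketch below uses only java.io. SketchWritable is a hypothetical local interface that mirrors the two methods of Hadoop's Writable, and WeightWritable with its serialize and deserialize helpers is likewise made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in with the same two methods as Hadoop's Writable,
// so the sketch compiles without Hadoop on the classpath.
interface SketchWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

final class WeightWritable implements SketchWritable {
    double weight;

    WeightWritable() { }

    WeightWritable(double weight) { this.weight = weight; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeDouble(weight); // 8 bytes, big-endian IEEE 754
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        weight = in.readDouble(); // must read fields in the order they were written
    }

    // Helper: run write() against an in-memory buffer.
    static byte[] serialize(SketchWritable writable) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            writable.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Helper: run readFields() against previously serialized bytes.
    static WeightWritable deserialize(byte[] bytes) {
        WeightWritable result = new WeightWritable();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            result.readFields(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

The no-argument constructor matters in practice: frameworks that deserialize such objects typically create an empty instance first and then call readFields on it.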
Don’t get fancy with byte[]
class EdgeValueWritable implements Writable {
    EdgeValue edgeValue

    byte[] toBytes() {
        // prefer a string encoding where possible
    }

    static EdgeValueWritable fromBytes(byte[] bytes) {
        // prefer a string encoding where possible
    }
}
                                                     groovy
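One way to follow the advice above is to store the edge weight as its decimal string in UTF-8, which presumably keeps cell values readable when inspected by hand. This is a minimal sketch under that assumption. EdgeValueCodec is a hypothetical name, and a real edge value may carry more than a weight.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical codec: stores the weight as readable text instead of raw IEEE 754 bytes.
final class EdgeValueCodec {

    // Encode the weight as its decimal string, e.g. 0.5 becomes the bytes of "0.5".
    static byte[] toBytes(double weight) {
        return Double.toString(weight).getBytes(StandardCharsets.UTF_8);
    }

    // Decode by parsing the string back into a double.
    static double fromBytes(byte[] bytes) {
        return Double.parseDouble(new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] stored = toBytes(0.5);
        System.out.println(new String(stored, StandardCharsets.UTF_8));
        System.out.println(fromBytes(stored));
    }
}
```

The trade-off is size: a text encoding can take more bytes than a fixed 8-byte double, in exchange for values that are easy to read and debug.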
Querying by vertex
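import org.apache.hadoop.hbase.client.Get
import org.apache.hadoop.hbase.client.Result

// fetch every edge stored in the vertex's row
def get = new Get(vertexKeyBytes)
get.addFamily(edgesFamilyBytes)

Result result = table.get(get)
result.noVersionMap.each { family, data ->

    // construct edge objects as needed
    // data is a Map<byte[],byte[]> of column qualifier to value
}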
Adding edges to a vertex
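def put = new Put(vertexKeyBytes)

// column qualifier = destination vertex, cell value = serialized edge data
// (Put.add is the HBase API as of 2012; newer clients use addColumn instead)
put.add(
    edgesFamilyBytes,
    destinationVertexBytes,
    edgeValue.toBytes() // your own implementation here
)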

// if writing directly
table.put(put)


// if using TableReducer
context.write(NullWritable.get(), put)
Distributed Traversal / Indexing

[Diagram: vertices e:dan@fullcontact.com, p:+13039316251, and t:danklynn; p:+13039316251 is the pivot vertex]

- MapReduce over outbound edges
- Emit vertexes and edge data grouped by the pivot
- Reduce key: p:+13039316251
- “Out” vertex: e:dan@fullcontact.com
- “In” vertex: t:danklynn
- Reducer emits higher-order edge: e:dan@fullcontact.com to t:danklynn

[Diagrams: Iteration 0, Iteration 1, Iteration 2, Iteration 3]

Reuse edges created during previous iterations

... hops requires only ... iterations
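The pivot step can be simulated in memory to see why reusing earlier edges pays off. The sketch below is not MapReduce or HBase code. PivotTraversal and its methods are hypothetical names, and it assumes undirected edges. Each pass treats every vertex as a pivot and connects every pair of that pivot's neighbours, which is roughly what the reducer does when it emits a higher-order edge. Under these assumptions the reach of an edge can double on each pass, so a path of n hops would need about log2(n) iterations. That figure comes from this sketch, not from the slide text.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// In-memory simulation of the pivot/reduce step; names are hypothetical.
final class PivotTraversal {

    static void addUndirected(Map<String, Set<String>> graph, String a, String b) {
        graph.computeIfAbsent(a, k -> new TreeSet<>()).add(b);
        graph.computeIfAbsent(b, k -> new TreeSet<>()).add(a);
    }

    // One iteration: returns a new graph holding the existing edges plus
    // one higher-order edge for each pair of neighbours of each pivot.
    static Map<String, Set<String>> iterate(Map<String, Set<String>> graph) {
        Map<String, Set<String>> next = new TreeMap<>();
        for (Map.Entry<String, Set<String>> pivot : graph.entrySet()) {
            for (String in : pivot.getValue()) {
                addUndirected(next, pivot.getKey(), in); // keep the existing edge
                for (String out : pivot.getValue()) {
                    if (!in.equals(out)) {
                        addUndirected(next, in, out); // edge "through" the pivot
                    }
                }
            }
        }
        return next;
    }

    // Counts iterations until a direct edge from -> to exists, or -1 after max passes.
    static int iterationsUntilConnected(Map<String, Set<String>> graph,
                                        String from, String to, int max) {
        int iterations = 0;
        while (!(graph.containsKey(from) && graph.get(from).contains(to))) {
            if (iterations == max) {
                return -1;
            }
            graph = iterate(graph);
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> chain = new TreeMap<>();
        addUndirected(chain, "a", "b");
        addUndirected(chain, "b", "c");
        addUndirected(chain, "c", "d");
        addUndirected(chain, "d", "e");
        // "a" and "e" are 4 hops apart in the original chain
        System.out.println(iterationsUntilConnected(chain, "a", "e", 10));
    }
}
```

In a real job each pass would be a MapReduce run whose output edges are written back and read by the next pass. The simulation only shows the growth pattern.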
Tips / Gotchas
Do implement your own comparator
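public static class Comparator extends WritableComparator {

    public Comparator() {
        super(VertexKeyWritable.class);
    }

    // compare the serialized keys directly, without deserializing them
    @Override
    public int compare(
        byte[] b1, int s1, int l1,
        byte[] b2, int s2, int l2) {

        // .....
    }
}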
                                              java

static {
    // register the comparator so it is used for VertexKeyWritable keys
    WritableComparator.define(VertexKeyWritable.class,
         new VertexKeyWritable.Comparator());
}

                                                   java
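The body of compare is elided on the slide. A common choice for a raw comparator is a lexicographic comparison of the two byte ranges with bytes treated as unsigned, sketched below in plain Java. RawBytesComparison is a hypothetical name, and whether this ordering suits a given key format is an assumption. Hadoop's WritableComparator also offers a static compareBytes helper with the same six parameters.

```java
// Hypothetical helper showing one possible body for a raw compare method.
final class RawBytesComparison {

    // Lexicographic comparison of b1[s1 .. s1+l1) and b2[s2 .. s2+l2),
    // treating each byte as an unsigned value in 0..255.
    static int compare(byte[] b1, int s1, int l1,
                       byte[] b2, int s2, int l2) {
        int common = Math.min(l1, l2);
        for (int i = 0; i < common; i++) {
            int left = b1[s1 + i] & 0xff;
            int right = b2[s2 + i] & 0xff;
            if (left != right) {
                return left - right;
            }
        }
        // all shared bytes are equal: the shorter range sorts first
        return l1 - l2;
    }
}
```

Working on the serialized bytes avoids creating a key object for every comparison during the sort, which is the usual motivation for a custom comparator.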
MultiScanTableInputFormat

MultiScanTableInputFormat.setTable(conf,
   "graph");

MultiScanTableInputFormat.addScan(conf,
   new Scan());

job.setInputFormatClass(
   MultiScanTableInputFormat.class);


                                          java
TableMapReduceUtil



TableMapReduceUtil.initTableReducerJob(
    "graph", MyReducer.class, job);

                                      java
Elastic MapReduce

HFiles -> Copy to S3 -> SequenceFiles -> Elastic MapReduce -> SequenceFiles

SequenceFiles -> HFileOutputFormat.configureIncrementalLoad(job, outputTable) -> HFiles

HFiles -> $ hadoop jar hbase-VERSION.jar completebulkload -> HBase
Additional Resources
Google Pregel: BSP-based graph processing system

Apache Giraph: Implementation of Pregel for Hadoop

MultiScanTableInputFormat: (code to appear on GitHub)

Apache Mahout: Distributed machine learning on Hadoop
Thanks!
dan@fullcontact.com
1 of 71

Recommended

HBaseCon 2012 | HBase Schema Design - Ian Varley, Salesforce by
HBaseCon 2012 | HBase Schema Design - Ian Varley, SalesforceHBaseCon 2012 | HBase Schema Design - Ian Varley, Salesforce
HBaseCon 2012 | HBase Schema Design - Ian Varley, SalesforceCloudera, Inc.
41.7K views189 slides
Hadoop World 2011: Advanced HBase Schema Design by
Hadoop World 2011: Advanced HBase Schema DesignHadoop World 2011: Advanced HBase Schema Design
Hadoop World 2011: Advanced HBase Schema DesignCloudera, Inc.
17.9K views33 slides
Introduction To HBase by
Introduction To HBaseIntroduction To HBase
Introduction To HBaseAnil Gupta
87.8K views18 slides
Data Evolution in HBase by
Data Evolution in HBaseData Evolution in HBase
Data Evolution in HBaseHBaseCon
5K views38 slides
Apache Drill - Why, What, How by
Apache Drill - Why, What, HowApache Drill - Why, What, How
Apache Drill - Why, What, Howmcsrivas
7.9K views90 slides
6.hive by
6.hive6.hive
6.hivePrashant Gupta
1.8K views61 slides

More Related Content

What's hot

Chicago Data Summit: Apache HBase: An Introduction by
Chicago Data Summit: Apache HBase: An IntroductionChicago Data Summit: Apache HBase: An Introduction
Chicago Data Summit: Apache HBase: An IntroductionCloudera, Inc.
22.5K views31 slides
HBase for Architects by
HBase for ArchitectsHBase for Architects
HBase for ArchitectsNick Dimiduk
33.7K views21 slides
Apache Spark on Apache HBase: Current and Future by
Apache Spark on Apache HBase: Current and Future Apache Spark on Apache HBase: Current and Future
Apache Spark on Apache HBase: Current and Future HBaseCon
2.8K views23 slides
Introduction to Apache HBase, MapR Tables and Security by
Introduction to Apache HBase, MapR Tables and SecurityIntroduction to Apache HBase, MapR Tables and Security
Introduction to Apache HBase, MapR Tables and SecurityMapR Technologies
4.2K views57 slides
Hive Training -- Motivations and Real World Use Cases by
Hive Training -- Motivations and Real World Use CasesHive Training -- Motivations and Real World Use Cases
Hive Training -- Motivations and Real World Use Casesnzhang
20.2K views46 slides
Using Apache Drill by
Using Apache DrillUsing Apache Drill
Using Apache DrillChicago Hadoop Users Group
5.3K views23 slides

What's hot(20)

Chicago Data Summit: Apache HBase: An Introduction by Cloudera, Inc.
Chicago Data Summit: Apache HBase: An IntroductionChicago Data Summit: Apache HBase: An Introduction
Chicago Data Summit: Apache HBase: An Introduction
Cloudera, Inc.22.5K views
HBase for Architects by Nick Dimiduk
HBase for ArchitectsHBase for Architects
HBase for Architects
Nick Dimiduk33.7K views
Apache Spark on Apache HBase: Current and Future by HBaseCon
Apache Spark on Apache HBase: Current and Future Apache Spark on Apache HBase: Current and Future
Apache Spark on Apache HBase: Current and Future
HBaseCon2.8K views
Introduction to Apache HBase, MapR Tables and Security by MapR Technologies
Introduction to Apache HBase, MapR Tables and SecurityIntroduction to Apache HBase, MapR Tables and Security
Introduction to Apache HBase, MapR Tables and Security
MapR Technologies4.2K views
Hive Training -- Motivations and Real World Use Cases by nzhang
Hive Training -- Motivations and Real World Use CasesHive Training -- Motivations and Real World Use Cases
Hive Training -- Motivations and Real World Use Cases
nzhang20.2K views
Apache HBase - Just the Basics by HBaseCon
Apache HBase - Just the BasicsApache HBase - Just the Basics
Apache HBase - Just the Basics
HBaseCon4.6K views
Apache Hadoop and HBase by Cloudera, Inc.
Apache Hadoop and HBaseApache Hadoop and HBase
Apache Hadoop and HBase
Cloudera, Inc.36.7K views
HBase Read High Availability Using Timeline-Consistent Region Replicas by HBaseCon
HBase Read High Availability Using Timeline-Consistent Region ReplicasHBase Read High Availability Using Timeline-Consistent Region Replicas
HBase Read High Availability Using Timeline-Consistent Region Replicas
HBaseCon4.1K views
Hw09 Practical HBase Getting The Most From Your H Base Install by Cloudera, Inc.
Hw09   Practical HBase  Getting The Most From Your H Base InstallHw09   Practical HBase  Getting The Most From Your H Base Install
Hw09 Practical HBase Getting The Most From Your H Base Install
Cloudera, Inc.10.3K views
Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C... by The Hive
Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...
Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...
The Hive10.9K views
HBaseCon 2012 | Mignify: A Big Data Refinery Built on HBase - Internet Memory... by Cloudera, Inc.
HBaseCon 2012 | Mignify: A Big Data Refinery Built on HBase - Internet Memory...HBaseCon 2012 | Mignify: A Big Data Refinery Built on HBase - Internet Memory...
HBaseCon 2012 | Mignify: A Big Data Refinery Built on HBase - Internet Memory...
Cloudera, Inc.4.1K views
Apache HBase Application Archetypes by Cloudera, Inc.
Apache HBase Application ArchetypesApache HBase Application Archetypes
Apache HBase Application Archetypes
Cloudera, Inc.5.2K views
HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T... by Simplilearn
HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...
HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...
Simplilearn2.6K views
Big Data Fundamentals in the Emerging New Data World by Jongwook Woo
Big Data Fundamentals in the Emerging New Data WorldBig Data Fundamentals in the Emerging New Data World
Big Data Fundamentals in the Emerging New Data World
Jongwook Woo3.3K views

Viewers also liked

Bulk Loading in the Wild: Ingesting the World's Energy Data by
Bulk Loading in the Wild: Ingesting the World's Energy DataBulk Loading in the Wild: Ingesting the World's Energy Data
Bulk Loading in the Wild: Ingesting the World's Energy DataHBaseCon
3.5K views26 slides
HBaseCon 2012 | Real-time Analytics with HBase - Sematext by
HBaseCon 2012 | Real-time Analytics with HBase - SematextHBaseCon 2012 | Real-time Analytics with HBase - Sematext
HBaseCon 2012 | Real-time Analytics with HBase - SematextCloudera, Inc.
8K views40 slides
Design Patterns for Building 360-degree Views with HBase and Kiji by
Design Patterns for Building 360-degree Views with HBase and KijiDesign Patterns for Building 360-degree Views with HBase and Kiji
Design Patterns for Building 360-degree Views with HBase and KijiHBaseCon
4.3K views37 slides
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase by
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBaseHBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBaseHBaseCon
8.8K views49 slides
Storing and manipulating graphs in HBase by
Storing and manipulating graphs in HBaseStoring and manipulating graphs in HBase
Storing and manipulating graphs in HBaseDan Lynn
12.5K views71 slides
HBaseCon 2012 | HBase powered Merchant Lookup Service at Intuit by
HBaseCon 2012 | HBase powered Merchant Lookup Service at IntuitHBaseCon 2012 | HBase powered Merchant Lookup Service at Intuit
HBaseCon 2012 | HBase powered Merchant Lookup Service at IntuitCloudera, Inc.
2.6K views13 slides

Viewers also liked(20)

Bulk Loading in the Wild: Ingesting the World's Energy Data by HBaseCon
Bulk Loading in the Wild: Ingesting the World's Energy DataBulk Loading in the Wild: Ingesting the World's Energy Data
Bulk Loading in the Wild: Ingesting the World's Energy Data
HBaseCon3.5K views
HBaseCon 2012 | Real-time Analytics with HBase - Sematext by Cloudera, Inc.
HBaseCon 2012 | Real-time Analytics with HBase - SematextHBaseCon 2012 | Real-time Analytics with HBase - Sematext
HBaseCon 2012 | Real-time Analytics with HBase - Sematext
Cloudera, Inc.8K views
Design Patterns for Building 360-degree Views with HBase and Kiji by HBaseCon
Design Patterns for Building 360-degree Views with HBase and KijiDesign Patterns for Building 360-degree Views with HBase and Kiji
Design Patterns for Building 360-degree Views with HBase and Kiji
HBaseCon4.3K views
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase by HBaseCon
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBaseHBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon8.8K views
Storing and manipulating graphs in HBase by Dan Lynn
Storing and manipulating graphs in HBaseStoring and manipulating graphs in HBase
Storing and manipulating graphs in HBase
Dan Lynn12.5K views
HBaseCon 2012 | HBase powered Merchant Lookup Service at Intuit by Cloudera, Inc.
HBaseCon 2012 | HBase powered Merchant Lookup Service at IntuitHBaseCon 2012 | HBase powered Merchant Lookup Service at Intuit
HBaseCon 2012 | HBase powered Merchant Lookup Service at Intuit
Cloudera, Inc.2.6K views
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera by Cloudera, Inc.
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, ClouderaHBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
Cloudera, Inc.5.5K views
Building a geospatial processing pipeline using Hadoop and HBase and how Mons... by DataWorks Summit
Building a geospatial processing pipeline using Hadoop and HBase and how Mons...Building a geospatial processing pipeline using Hadoop and HBase and how Mons...
Building a geospatial processing pipeline using Hadoop and HBase and how Mons...
DataWorks Summit14.5K views
Facebook Messages & HBase by 强 王
Facebook Messages & HBaseFacebook Messages & HBase
Facebook Messages & HBase
强 王39.2K views
Graphs in the Database: Rdbms In The Social Networks Age by Lorenzo Alberton
Graphs in the Database: Rdbms In The Social Networks AgeGraphs in the Database: Rdbms In The Social Networks Age
Graphs in the Database: Rdbms In The Social Networks Age
Lorenzo Alberton118.1K views
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb... by Cloudera, Inc.
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
Cloudera, Inc.3.2K views
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics by Cloudera, Inc.
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
Cloudera, Inc.4.8K views
HBaseCon 2013: Being Smarter Than the Smart Meter by Cloudera, Inc.
HBaseCon 2013: Being Smarter Than the Smart MeterHBaseCon 2013: Being Smarter Than the Smart Meter
HBaseCon 2013: Being Smarter Than the Smart Meter
Cloudera, Inc.4.3K views
HBaseCon 2013: Rebuilding for Scale on Apache HBase by Cloudera, Inc.
HBaseCon 2013: Rebuilding for Scale on Apache HBaseHBaseCon 2013: Rebuilding for Scale on Apache HBase
HBaseCon 2013: Rebuilding for Scale on Apache HBase
Cloudera, Inc.3.9K views
HBaseCon 2013: Apache HBase on Flash by Cloudera, Inc.
HBaseCon 2013: Apache HBase on FlashHBaseCon 2013: Apache HBase on Flash
HBaseCon 2013: Apache HBase on Flash
Cloudera, Inc.4.3K views
HBaseCon 2012 | Building Mobile Infrastructure with HBase by Cloudera, Inc.
HBaseCon 2012 | Building Mobile Infrastructure with HBaseHBaseCon 2012 | Building Mobile Infrastructure with HBase
HBaseCon 2012 | Building Mobile Infrastructure with HBase
Cloudera, Inc.2.6K views
HBaseCon 2012 | Leveraging HBase for the World’s Largest Curated Genomic Data... by Cloudera, Inc.
HBaseCon 2012 | Leveraging HBase for the World’s Largest Curated Genomic Data...HBaseCon 2012 | Leveraging HBase for the World’s Largest Curated Genomic Data...
HBaseCon 2012 | Leveraging HBase for the World’s Largest Curated Genomic Data...
Cloudera, Inc.3.5K views
HBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUpon by Cloudera, Inc.
HBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUponHBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUpon
HBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUpon
Cloudera, Inc.3.4K views
Cross-Site BigTable using HBase by HBaseCon
Cross-Site BigTable using HBaseCross-Site BigTable using HBase
Cross-Site BigTable using HBase
HBaseCon3.5K views
HBaseCon 2012 | Scaling GIS In Three Acts by Cloudera, Inc.
HBaseCon 2012 | Scaling GIS In Three ActsHBaseCon 2012 | Scaling GIS In Three Acts
HBaseCon 2012 | Scaling GIS In Three Acts
Cloudera, Inc.3.6K views

Similar to HBaseCon 2012 | Storing and Manipulating Graphs in HBase

Apache Flink & Graph Processing by
Apache Flink & Graph ProcessingApache Flink & Graph Processing
Apache Flink & Graph ProcessingVasia Kalavri
1.8K views97 slides
Storm - As deep into real-time data processing as you can get in 30 minutes. by
Storm - As deep into real-time data processing as you can get in 30 minutes.Storm - As deep into real-time data processing as you can get in 30 minutes.
Storm - As deep into real-time data processing as you can get in 30 minutes.Dan Lynn
22.7K views61 slides
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX by
Introduction into scalable graph analysis with Apache Giraph and Spark GraphXIntroduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphXrhatr
3.3K views48 slides
Performing Data Science with HBase by
Performing Data Science with HBasePerforming Data Science with HBase
Performing Data Science with HBaseWibiData
1.5K views51 slides
Tech Days Paris Intoduction F# and Collective Intelligence by
Tech Days Paris Intoduction F# and Collective IntelligenceTech Days Paris Intoduction F# and Collective Intelligence
Tech Days Paris Intoduction F# and Collective IntelligenceRobert Pickering
1.1K views48 slides
Generics Past, Present and Future (Latest) by
Generics Past, Present and Future (Latest)Generics Past, Present and Future (Latest)
Generics Past, Present and Future (Latest)RichardWarburton
594 views51 slides

Similar to HBaseCon 2012 | Storing and Manipulating Graphs in HBase(20)

Apache Flink & Graph Processing by Vasia Kalavri
Apache Flink & Graph ProcessingApache Flink & Graph Processing
Apache Flink & Graph Processing
Vasia Kalavri1.8K views
Storm - As deep into real-time data processing as you can get in 30 minutes. by Dan Lynn
Storm - As deep into real-time data processing as you can get in 30 minutes.Storm - As deep into real-time data processing as you can get in 30 minutes.
Storm - As deep into real-time data processing as you can get in 30 minutes.
Dan Lynn22.7K views
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX by rhatr
Introduction into scalable graph analysis with Apache Giraph and Spark GraphXIntroduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
rhatr3.3K views
Performing Data Science with HBase by WibiData
Performing Data Science with HBasePerforming Data Science with HBase
Performing Data Science with HBase
WibiData 1.5K views
Tech Days Paris Intoduction F# and Collective Intelligence by Robert Pickering
Tech Days Paris Intoduction F# and Collective IntelligenceTech Days Paris Intoduction F# and Collective Intelligence
Tech Days Paris Intoduction F# and Collective Intelligence
Robert Pickering1.1K views
Generics Past, Present and Future (Latest) by RichardWarburton
Generics Past, Present and Future (Latest)Generics Past, Present and Future (Latest)
Generics Past, Present and Future (Latest)
RichardWarburton594 views
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm by DataStax
C*ollege Credit: CEP Distribtued Processing on Cassandra with StormC*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
DataStax5.8K views
Behm Shah Pagerank by gothicane
Behm Shah PagerankBehm Shah Pagerank
Behm Shah Pagerank
gothicane3K views
Seattle Scalability Mahout by Jake Mannix
Seattle Scalability MahoutSeattle Scalability Mahout
Seattle Scalability Mahout
Jake Mannix2.6K views
Generics Past, Present and Future by RichardWarburton
Generics Past, Present and FutureGenerics Past, Present and Future
Generics Past, Present and Future
RichardWarburton2.8K views
Data Analysis with R (combined slides) by Guy Lebanon
Data Analysis with R (combined slides)Data Analysis with R (combined slides)
Data Analysis with R (combined slides)
Guy Lebanon2K views
Introduction to source{d} Engine and source{d} Lookout by source{d}
Introduction to source{d} Engine and source{d} Lookout Introduction to source{d} Engine and source{d} Lookout
Introduction to source{d} Engine and source{d} Lookout
source{d}141 views
Fishing Graphs in a Hadoop Data Lake by Jörg Schad and Max Neunhoeffer at Big... by Big Data Spain
Fishing Graphs in a Hadoop Data Lake by Jörg Schad and Max Neunhoeffer at Big...Fishing Graphs in a Hadoop Data Lake by Jörg Schad and Max Neunhoeffer at Big...
Fishing Graphs in a Hadoop Data Lake by Jörg Schad and Max Neunhoeffer at Big...
Big Data Spain1.3K views
cppt-170218053903 (1).pptx by WatchDog13
cppt-170218053903 (1).pptxcppt-170218053903 (1).pptx
cppt-170218053903 (1).pptx
WatchDog1321 views
Automatic Task-based Code Generation for High Performance DSEL by Joel Falcou
Automatic Task-based Code Generation for High Performance DSELAutomatic Task-based Code Generation for High Performance DSEL
Automatic Task-based Code Generation for High Performance DSEL
Joel Falcou1.4K views

More from Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx by
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
107 views55 slides
Cloudera Data Impact Awards 2021 - Finalists by
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
6.4K views34 slides
2020 Cloudera Data Impact Awards Finalists by
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
6.3K views43 slides
Edc event vienna presentation 1 oct 2019 by
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
4.5K views67 slides
Machine Learning with Limited Labeled Data 4/3/19 by
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
3.6K views36 slides
Data Driven With the Cloudera Modern Data Warehouse 3.19.19 by
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
2.5K views21 slides

More from Cloudera, Inc.(20)

Partner Briefing_January 25 (FINAL).pptx by Cloudera, Inc.
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
Cloudera, Inc.107 views
Cloudera Data Impact Awards 2021 - Finalists by Cloudera, Inc.
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
Cloudera, Inc.6.4K views
2020 Cloudera Data Impact Awards Finalists by Cloudera, Inc.
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
Cloudera, Inc.6.3K views
Edc event vienna presentation 1 oct 2019 by Cloudera, Inc.
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
Cloudera, Inc.4.5K views
Machine Learning with Limited Labeled Data 4/3/19 by Cloudera, Inc.
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
Cloudera, Inc.3.6K views
Data Driven With the Cloudera Modern Data Warehouse 3.19.19 by Cloudera, Inc.
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Cloudera, Inc.2.5K views
Introducing Cloudera DataFlow (CDF) 2.13.19 by Cloudera, Inc.
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
Cloudera, Inc.4.9K views
Introducing Cloudera Data Science Workbench for HDP 2.12.19 by Cloudera, Inc.
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Cloudera, Inc.2.7K views
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19 by Cloudera, Inc.