FluidDB
  Terry Jones

  terry@jon.es
  @terrycojones
Proposal

  FluidDB is a hosted database
 we (Fluidinfo) will launch - by
 hook or by crook - into Alpha
shortly before EuroPython 2009
Agenda

51% me: Motivations, high-level view of FluidDB
49% you: Questions, details, API, architecture, etc.
Motivations

Contrasts in how we work with information

User interface and API difficulties / restrictions

Personalization

The computational world is not very writable!
Concepts
             Are not owned.
         No formal structure.
       No permission is needed.
     Partial or full disorganization.
     No predefined set of qualities.
  No special central piece of content.
Easy to reorganize or multiply organize.

To engineer this kind of flexibility,
    we must rethink control
UIs and APIs
Where did my information go?
Can I extract, re-use, add, delete?
Is there an API?
What am I allowed to do?
Can I search? On what?
Special pleading
Wouldn't it be cool if...
Personalization
In our hands, not just on our behalf

   Add anything to anything, and search on it

   Protect, share, combine, delete as you wish

   Organize things as you please
Read ⊳ Search ⊳ Write ?

 The web is readable (c. 1990)
 And searchable (c. 1995)
 But not generally writable

 Plus, search is not very interactive:
   Dull: refine query or ask for more results
   Can’t change results
   Can’t organize
Why don’t our architectures
let us work with information
        more flexibly?
What would it take
to build something that did?
How can we address
all these problems at once?
Alan Kay
At PARC we had a slogan: “Point of view is worth 80 IQ
points.” It was based on a few things from the past like
how smart you had to be in Roman times to multiply two
numbers together; only geniuses did it.

We haven’t gotten any smarter, we’ve just changed our
representation system. We think better generally by
inventing better representations; that’s something that we
as computer scientists recognize as one of the main
things that we try to do.
Decimal        Binary
      1             1
      2            10
      4           100
      8          1000
     16         10000
     32        100000
     64       1000000
    128      10000000
    256     100000000
    512    1000000000
   1024 + 10000000000 +
 ??????   11111111111
VIII
 XVII
 XLIV
LXXX
XCVI
CCLV +
   D
8 queens problem
Use a poor representation with
64^8 (281,474,976,710,656) states:

Look for a smart algorithm!


              Or...

Use a good representation with 8! (40,320) states
and exhaustive search as an algorithm!
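
A minimal Python sketch of the "good representation" idea. It assumes a board is encoded as a permutation of column indices, one queen per row, so rows and columns never clash and only the diagonals need checking. The function name is illustrative, not from the talk.

```python
from itertools import permutations


def solve_eight_queens():
    """Exhaustively search the 8! permutation representation."""
    states = 0
    solutions = []
    for cols in permutations(range(8)):
        states += 1
        # cols[r] is the column of the queen in row r. Rows and columns are
        # distinct by construction, so only the two diagonal directions
        # can still conflict.
        down = {r + c for r, c in enumerate(cols)}
        up = {r - c for r, c in enumerate(cols)}
        if len(down) == 8 and len(up) == 8:
            solutions.append(cols)
    return states, solutions


states, solutions = solve_eight_queens()
print(states, len(solutions))
```

The search visits every one of the 8! candidate boards, with no clever pruning.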
A humble FluidDB object

         digg.com/date May 21, 2009
            meg/web/rating 6
            tim/seen True
      about http://digg.com/news.html
   mike/opinion “half-baked nonsense”
A humble FluidDB object: NO OWNER!
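
A rough in-memory sketch of such an object, assuming it can be modelled as a plain mapping from tag path to value. The class and method names are hypothetical and are not the FluidDB API.

```python
class FluidObject:
    """Hypothetical stand-in for a FluidDB object: a bag of tags, no owner."""

    def __init__(self, about=None):
        self.tags = {}
        if about is not None:
            self.tags["about"] = about

    def tag(self, namespace, name, value=None):
        """Attach a tag such as meg/web/rating; a value is optional."""
        path = f"{namespace}/{name}"
        self.tags[path] = value
        return path

    def has(self, path):
        return path in self.tags


obj = FluidObject(about="http://digg.com/news.html")
obj.tag("digg.com", "date", "May 21, 2009")
obj.tag("meg", "web/rating", 6)
obj.tag("tim", "seen", True)
obj.tag("mike", "opinion", "half-baked nonsense")
print(sorted(obj.tags))
```

Nothing on the object records an owner: each tag carries its own namespace, and any user can add one.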
Consequences
Where to put related information / metadata?
How to get permission?
No need to anticipate
How to organize?
How to personalize?
Continue to own your data
Level playing field: no rules, & it’s OK to be late
Makes the world writable by default
A handy object
In FluidDB
  europython.eu/2009/title     Twisted, AMQP and Thrift
  europython.eu/2009/speaker   Esteve Fernández
  europython.eu/2009/date      June 29, 2009
  europython.eu/2009/time      11:30am
  europython.eu/2009/duration  30
  europython.eu/2009/room      Arena Foyer
  terry/rating                 8
  terry/will-attend
  russell/will-attend
Double-click to edit

   Can you add to this?
   Why not?
   Are you blogging/tweeting about it?
A folder
  Object 1: digg.com/date, meg/web/rating, tim/seen, about, mike/opinion,
            folder-manager/tag-F5283AC21,
            folder-manager/name-624CA19 chapter 1
  Object 2: russell/seen, james/stars, tim/own, about, sally/read-it,
            folder-manager/tag-F5283AC21,
            folder-manager/name-624CA19 chapter 2
  Object 3: jack/interesting, amazon.com/price, sam/wishlist,
            folder-manager/tag-F5283AC21

  Folder object:
            folder-manager/tag folder-manager/tag-F5283AC21
            folder-manager/name folder-manager/name-624CA19
What’s FluidDB about?


   Whatever you like
A voting box
    russell/vote
                   pat/vote

                     tim/vote

                   james/vote

              sally/vote
sofia/vote
A voting box
    russell/vote
                   pat/vote

                     tim/vote



              sally/vote
sofia/vote
Inter-process comms

producer/data/01    consumer1/seq 02
 producer/data/02

producer/data/03
                    consumer2/seq 03
Inter-process comms

                    consumer1/seq 02
 producer/data/02

producer/data/03
                    consumer2/seq 03
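
One way to read these slides, sketched in Python. The sketch assumes that each consumer's seq tag holds the number of the next item it will read, and that an item can be deleted once every consumer has moved past it. The slides do not spell out these semantics, and the object is just a dict here; all function names are hypothetical.

```python
def publish(obj, seq, payload):
    """Producer attaches a new data tag to the shared object."""
    obj[f"producer/data/{seq:02d}"] = payload


def consume(obj, consumer):
    """Read the next item for this consumer and advance its seq tag."""
    key = f"{consumer}/seq"
    seq = obj.get(key, 1)
    path = f"producer/data/{seq:02d}"
    if path not in obj:
        return None  # nothing new yet
    obj[key] = seq + 1
    return obj[path]


def collect(obj, consumers):
    """Delete items that every consumer has already read."""
    low = min(obj.get(f"{c}/seq", 1) for c in consumers)
    for path in [p for p in obj if p.startswith("producer/data/")]:
        if int(path.rsplit("/", 1)[1]) < low:
            del obj[path]


box = {}
publish(box, 1, "a")
publish(box, 2, "b")
consume(box, "consumer1")   # consumer1/seq becomes 2
publish(box, 3, "c")
consume(box, "consumer2")
consume(box, "consumer2")   # consumer2/seq becomes 3
collect(box, ["consumer1", "consumer2"])
print(sorted(box))
```

Under these assumptions the run ends in the state on the last slide: producer/data/01 is gone, 02 and 03 remain, and the two seq tags read 02 and 03.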
Social data

Data that increases in value when it’s co-located
All data is social, once we begin to interact with it
Info in context
Information is more useful in context:
  Google
  Wikipedia
FluidDB can do a similar thing, but for DBs & apps
  Data is held in silos by apps
  Behind restrictive APIs
FluidDB

Database of these simple objects

Enables sharing between applications, people

Single-instance, hosted (like SimpleDB)

Distributed storage and query processing
Sack the golden towns of Montezuma!

  “My dear fellow,” Burlingame said caustically, “we sit on a
  blind rock careening through space; we are all of us
  rushing headlong to the grave. Think you the worms will
  care, when anon they make a meal of you, whether you
  spent your moment sighing wigless in your chamber, or
  sacked the golden towns of Montezuma? Lookee, the day’s
  nigh spent; ‘tis gone careening into time forever. Not a
  tale’s length since we lined our bowels with dinner, and
  already they growl for more. We are dying men Ebenezer:
  i’faith, there’s time for nought but bold resolves!”

                             John Barth, The Sot-Weed Factor
Challenge
Engineer a system that can efficiently hold an
effectively unlimited number of these objects

Let each object have an effectively unlimited
number of tags

Design a query language, permissions, and API

Make simple and common things fast
Make it realizable
OK, your 49%
Questions        Apps

HTTP API         Storage

Query language   Architecture

Python           Data

Permissions      Demo
Access           Release!?
Design goals
Simple data model
Simple permissions model
Simple, easily parallelizable, query language
Horizontally scalable
Fast for common tasks
Implementable!
Permissions

For each action on a namespace or tag:

  There’s a policy: ‘open’ or ‘closed’

  And a (possibly empty) list of exceptions
Permissions
  tag or namespace   action   policy   exceptions
  tim/seen           read     closed   tim, meg
  mike/opinion       update   open
  mike/              create   closed   mike
  meg/rating         see      open
  meg/rating         read     closed   meg
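
A small sketch of how a policy plus exception list might be evaluated. It assumes the natural reading: under an open policy everyone is allowed except the listed users, and under a closed policy only the listed users are allowed. The table is copied from the slide; the function name and lookup structure are illustrative.

```python
# (tag or namespace, action) -> (policy, exceptions), as in the slide's table.
PERMISSIONS = {
    ("tim/seen", "read"): ("closed", {"tim", "meg"}),
    ("mike/opinion", "update"): ("open", set()),
    ("mike/", "create"): ("closed", {"mike"}),
    ("meg/rating", "see"): ("open", set()),
    ("meg/rating", "read"): ("closed", {"meg"}),
}


def allowed(user, path, action):
    """Apply the policy, then flip the answer for users in the exceptions."""
    policy, exceptions = PERMISSIONS[(path, action)]
    if policy == "open":
        return user not in exceptions
    return user in exceptions


print(allowed("meg", "tim/seen", "read"))       # meg is an exception to 'closed'
print(allowed("mike", "tim/seen", "read"))      # mike is not
print(allowed("sally", "mike/opinion", "update"))
```

With this reading, anyone can see that meg/rating exists on an object, but only meg can read its value.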
Query language
Numeric: tag value (=, <, etc.)
Textual: tag text match
Presence: has attribute
Grouping/logic: (...), and, or, not
Designed to bring back object ids
Data buffet
username              seen, lastvisited, rating, goingtoread, comment
username              me, FBfriend, linkedInContact, met, family
username/myusername   name, password
flickr.com            longitude, latitude, owner, date, camera/make
username/flickr       title, description
VCspotter             fredwilson, bradburnham
nasdaq.com            name, symbol, type, outstanding, value, price
username/stocks       shares, date
google.com            pagerank
tracks                album, artist, name, year
username/music        count, favorite, lastPlay, stars, bestOf2007
Data buffet 2
email                 messageId, fromId, toId
username/email        from, to, subject, date
alexa.com             rank
digg.com              title, description, date, diggs
mahalo.com            appeared, category
readwriteweb.com      appeared
reddit.com            date, score
techcrunch.com        appeared, URI
attribute             description, name, path
namespace             description, name, path
Example queries
terry/rating > 5 and has reddit.com/score

has goingtoread and seen > "January 1, 2008"

has FBfriend and has linkedInContact

has james/FBfriend and not has anne/FBfriend

alexa.com/rank < 50 and fred/comment ~ cool

has reddit.com/score and not has digg.com/diggs

has readwriteweb.com/appeared and not has techcrunch.com/appeared
More queries
terry/seen > "July 1, 2007"

has russell/myusername/name and not has terry/myusername/name

flickr.com/latitude > 52.15 and flickr.com/latitude < 52.35 and flickr.com/camera/make ~ Sony and has sally/seen

amazon.com/stars > 3 and amazon.com/price < 20 and amazon.com/title ~ chess and peter/bookrating > 3 sort by amazon.com/publication-date
Architecture

 Software
 Communications
 Functional
 Storage
 Query processing
 Per box
Software
Python

Twisted

PostgreSQL

Thrift

AMQP (RabbitMQ)

Lucene
Functional
  [Architecture diagram, built up over several slides. Components, left to right:]
  apps
  load
  http, xmpp, other
  facade, coord, control
  sets, amqp, memcache
  objects, namespace, tag, text
  db
  kv
Tag Storage
meg/rating
object id   user id   value
1234567     667       26
6527527     667       188
2876281     17        207
7628876     667       1225

tim/books/opinion
object id   user id   value
526141      362       nice
726483      362       fun
635378      362       boring
477582      362       sexy
362782      362       long

  PostgreSQL
  Tall tables
  Independent (column store)
  Backed by key/value store (Amazon S3, for now)
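
A sketch of the "one tall table per tag" layout, using Python's built-in sqlite3 as a stand-in for PostgreSQL. The column names and types are guessed from the slide's headers, and the rows are the ones shown for meg/rating.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One narrow table per tag; the tag path is used as a quoted table name.
conn.execute(
    'CREATE TABLE "meg/rating" '
    "(object_id INTEGER, user_id INTEGER, value INTEGER)"
)
conn.executemany(
    'INSERT INTO "meg/rating" VALUES (?, ?, ?)',
    [
        (1234567, 667, 26),
        (6527527, 667, 188),
        (2876281, 17, 207),
        (7628876, 667, 1225),
    ],
)

# A query such as "meg/rating > 100" touches only this one table
# and brings back object ids.
rows = conn.execute(
    'SELECT object_id FROM "meg/rating" WHERE value > ? ORDER BY object_id',
    (100,),
).fetchall()
object_ids = [row[0] for row in rows]
print(object_ids)
```

Because each tag lives in its own table, tables stay independent of one another and can be placed on different machines.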
Query processing

                       and

digg/date > “Monday”     or

        meg/rating > 5        has tim/seen
Query processing
                       tag   set ops

digg/date > “Monday”


meg/rating > 5


has tim/seen
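
A sketch of the idea in Python, assuming each leaf of the query tree is answered from a single tag's data as a set of object ids and the results are then combined with set operations. The object ids, tag values and ISO date format are invented for the example.

```python
# tag path -> {object id: value}; all data here is made up for illustration.
TAGS = {
    "digg/date": {1: "2009-05-21", 2: "2009-05-10", 3: "2009-05-22"},
    "meg/rating": {1: 6, 2: 9, 4: 3},
    "tim/seen": {3: True, 4: True},
}


def compare(tag, predicate):
    """Leaf: ids of objects whose value for this tag satisfies the predicate."""
    return {oid for oid, value in TAGS.get(tag, {}).items() if predicate(value)}


def has(tag):
    """Leaf: ids of objects carrying this tag at all."""
    return set(TAGS.get(tag, {}))


# digg/date > X and (meg/rating > 5 or has tim/seen)
result = compare("digg/date", lambda d: d > "2009-05-18") & (
    compare("meg/rating", lambda r: r > 5) | has("tim/seen")
)
print(sorted(result))
```

Each leaf needs only one tag's data, so leaves can be evaluated independently and in parallel; "and" becomes intersection and "or" becomes union.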
Tag affinity
                       attr   set ops

digg/date > “Monday”




meg/rating > 5
has tim/seen
Per box
A controller service, launched on boot

The controller launches new services (processes)

All services talk AMQP as well as pure Thrift

A coordinator brings up new boxes & services

We use Amazon EC2, for now

Introducing FluidDB

  • 1. FluidDB Terry Jones terry@jon.es @terrycojones
  • 2. Proposal FluidDB is a hosted database we (Fluidinfo) will launch - by hook or by crook - into Alpha shortly before EuroPython 2009
  • 3. Agenda 51% me: Motivations, high-level view of FluidDB 49% you: Questions, details, API, architecture, etc.
  • 4. Motivations Contrasts in how we work with information User interface and API difficulties / restrictions Personalization The computational world is not very writable!
  • 5. Concepts Are not owned. No formal structure. No permission is needed. Partial or full disorganization. No predefined set of qualities. No special central piece of content. Easy to reorganize or multiply organize.
  • 6. Concepts Are not owned. No formal structure. No permission is needed. Partial or full disorganization. No predefined set of qualities. No special central piece of content. Easy to reorganize or multiply organize. To engineer this kind of flexibility, we must rethink control
  • 7. UIs and APIs Where did my information go? Can I extract, re-use, add, delete? Is there an API? What am I allowed to do? Can I search? On what? Special pleading Wouldn't it be cool if...
  • 8. Personalization In our hands, not just on our behalf Add anything to anything, and search on it Protect, share, combine, delete as you wish Organize things as you please
  • 9. Read ⊳ Search ⊳ Write ? The web is readable (c. 1990) And searchable (c. 1995) But not generally writable
  • 10. Read ⊳ Search ⊳ Write ? The web is readable (c. 1990) And searchable (c. 1995) But not generally writable Plus, search is not very interactive: Dull: refine query or ask for more results Can’t change results Can’t organize
  • 11. Why don’t our architectures let us work with information more flexibly?
  • 12. What would it take to build something that did?
  • 13. How can we address all these problems at once ?
  • 14. Alan Kay At PARC we had a slogan: “Point of view is worth 80 IQ points.” It was based on a few things from the past like how smart you had to be in Roman times to multiply two numbers together; only geniuses did it. We haven’t gotten any smarter, we’ve just changed our representation system. We think better generally by inventing better representations; that’s something that we as computer scientists recognize as one of the main things that we try to do.
  • 16. Decimal: 1 + 2 + 4 + 8 + 16 + 32 + 64 + 128 + 256 + 512 + 1024 = ?????? Binary: 1 + 10 + 100 + 1000 + 10000 + 100000 + 1000000 + 10000000 + 100000000 + 1000000000 + 10000000000 = 11111111111
  • 21. 8 queens problem Use a poor representation with 64^8 (281,474,976,710,656) states: Look for a smart algorithm! Or... Use a good representation with 8! (40,320) states and exhaustive search as an algorithm!
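The "good representation" can be sketched in a few lines of Python. This is an illustration, not code from the talk: each candidate is a permutation giving the queen's column for each row, so rows and columns never clash and only the diagonals need checking. The function name `solve_queens` is made up for this sketch.

```python
from itertools import permutations


def solve_queens(n=8):
    """Exhaustively search the n! placements with one queen per row and column."""
    candidates = 0
    solutions = []
    for cols in permutations(range(n)):
        candidates += 1
        # cols[row] is the queen's column in that row. Rows and columns are
        # distinct by construction, so only the two diagonal directions
        # can still produce an attack.
        rising = {cols[row] + row for row in range(n)}
        falling = {cols[row] - row for row in range(n)}
        if len(rising) == n and len(falling) == n:
            solutions.append(cols)
    return candidates, solutions


candidates, solutions = solve_queens()
print(candidates, len(solutions))  # 40320 92
```

Brute force over 40,320 candidates finishes almost instantly, which is the point of the slide: the representation did the work, not the algorithm.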
  • 22. A humble FluidDB object digg.com/date May 21, 2009 meg/web/rating 6 tim/seen True about http://digg.com/news.html mike/opinion “half-baked nonsense”
  • 23. A humble FluidDB object NO OWNER! digg.com/date May 21, 2009 meg/web/rating 6 tim/seen true about http://digg.com/news.html mike/opinion “half-baked nonsense”
  • 24. Consequences Where to put related information / metadata? How to get permission? No need to anticipate How to organize? How to personalize? Continue to own your data Level playing field: no rules, & it’s OK to be late Makes the world writable by default
  • 32. In FluidDB europython.eu/2009/time 11:30am europython.eu/2009/date June 29, 2009 europython.eu/2009/duration 30 europython.eu/2009/speaker Esteve Fernández europython.eu/2009/title Twisted, AMQP and Thrift europython.eu/2009/room Arena Foyer
  • 33. In FluidDB terry/rating 8 terry/will-attend europython.eu/2009/time 11:30am europython.eu/2009/date June 29, 2009 europython.eu/2009/duration 30 europython.eu/2009/speaker Esteve Fernández europython.eu/2009/title Twisted, AMQP and Thrift europython.eu/2009/room Arena Foyer
  • 34. In FluidDB terry/rating 8 terry/will-attend europython.eu/2009/time 11:30am europython.eu/2009/date June 29, 2009 europython.eu/2009/duration 30 europython.eu/2009/speaker Esteve Fernández europython.eu/2009/title Twisted, AMQP and Thrift europython.eu/2009/room Arena Foyer russell/will-attend
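One way to picture the object on slides 32 to 34 is as an ownerless bag of tags, where each tag name starts with the namespace of whoever attached it. The sketch below is an in-memory stand-in only; the class `FluidObject` and its `tag` method are hypothetical names, not the FluidDB API.

```python
class FluidObject:
    """Hypothetical stand-in for a FluidDB object: tags only, no owner field."""

    def __init__(self):
        self.tags = {}

    def tag(self, namespace, name, value=None):
        # The full tag name is prefixed by the tagger's namespace,
        # e.g. "terry/rating". A tag may carry no value at all.
        self.tags[f"{namespace}/{name}"] = value


talk = FluidObject()
# Tags the conference site might attach (values taken from the slide).
talk.tag("europython.eu", "2009/title", "Twisted, AMQP and Thrift")
talk.tag("europython.eu", "2009/room", "Arena Foyer")
talk.tag("europython.eu", "2009/duration", 30)
# Anyone else can add their own tags to the same object.
talk.tag("terry", "rating", 8)
talk.tag("terry", "will-attend")
talk.tag("russell", "will-attend")

print(sorted(talk.tags))
```

Nothing in the object records who "owns" it; control attaches to the individual tags instead, which is what the later Permissions slides build on.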
  • 35. Can you add to this? Why not? Are you blogging/tweeting about it?
  • 36. A folder jack/interesting digg.com/date russell/seen amazon.com/price meg/web/rating james/stars sam/wishlist tim/seen tim/own about about mike/opinion sally/read-it
  • 37. A folder jack/interesting digg.com/date russell/seen amazon.com/price meg/web/rating james/stars sam/wishlist tim/seen tim/own about about mike/opinion sally/read-it folder-manager/tag folder-manager/tag-F5283AC21
  • 38. A folder jack/interesting digg.com/date russell/seen amazon.com/price meg/web/rating james/stars sam/wishlist tim/seen tim/own about about mike/opinion sally/read-it folder-manager/tag-F5283AC21 folder-manager/tag folder-manager/tag-F5283AC21
  • 39. A folder jack/interesting digg.com/date russell/seen amazon.com/price meg/web/rating james/stars sam/wishlist tim/seen tim/own about about mike/opinion sally/read-it folder-manager/tag-F5283AC21 folder-manager/tag-F5283AC21 folder-manager/tag folder-manager/tag-F5283AC21
  • 40. A folder jack/interesting digg.com/date russell/seen amazon.com/price meg/web/rating james/stars sam/wishlist tim/seen tim/own folder-manager/tag-F5283AC21 about about mike/opinion sally/read-it folder-manager/tag-F5283AC21 folder-manager/tag-F5283AC21 folder-manager/tag folder-manager/tag-F5283AC21
  • 41. A folder jack/interesting digg.com/date russell/seen amazon.com/price meg/web/rating james/stars sam/wishlist tim/seen tim/own folder-manager/tag-F5283AC21 about about mike/opinion sally/read-it folder-manager/tag-F5283AC21 folder-manager/tag-F5283AC21 folder-manager/tag folder-manager/tag-F5283AC21 folder-manager/name folder-manager/name-624CA19
  • 42. A folder folder-manager/name-624CA19 chapter 1 jack/interesting digg.com/date russell/seen amazon.com/price meg/web/rating james/stars sam/wishlist tim/seen tim/own folder-manager/tag-F5283AC21 about about mike/opinion sally/read-it folder-manager/tag-F5283AC21 folder-manager/tag-F5283AC21 folder-manager/tag folder-manager/tag-F5283AC21 folder-manager/name folder-manager/name-624CA19
  • 43. A folder folder-manager/name-624CA19 chapter 1 jack/interesting digg.com/date russell/seen amazon.com/price meg/web/rating james/stars sam/wishlist tim/seen tim/own folder-manager/tag-F5283AC21 about about mike/opinion sally/read-it folder-manager/tag-F5283AC21 folder-manager/tag-F5283AC21 folder-manager/name-624CA19 chapter 2 folder-manager/tag folder-manager/tag-F5283AC21 folder-manager/name folder-manager/name-624CA19
  • 44. What’s FluidDB about? Whatever you like
  • 46. A voting box pat/vote
  • 47. A voting box pat/vote james/vote
  • 48. A voting box pat/vote james/vote sally/vote
  • 49. A voting box pat/vote tim/vote james/vote sally/vote
  • 50. A voting box russell/vote pat/vote tim/vote james/vote sally/vote
  • 51. A voting box russell/vote pat/vote tim/vote james/vote sally/vote sofia/vote
  • 52. A voting box russell/vote pat/vote tim/vote sally/vote sofia/vote
  • 56. Inter-process comms producer/data/01 consumer1/seq 02 producer/data/02
  • 57. Inter-process comms producer/data/01 consumer1/seq 02 producer/data/02 producer/data/03
  • 58. Inter-process comms producer/data/01 consumer1/seq 02 producer/data/02 producer/data/03 consumer2/seq 03
  • 59. Inter-process comms consumer1/seq 02 producer/data/02 producer/data/03 consumer2/seq 03
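Slides 56 to 59 can be read as a queue built from tags on one shared object. The sketch below assumes, and this is a guess rather than something the slides state, that each `consumerN/seq` tag holds the lowest item number that consumer still needs, and that the producer may delete any `producer/data/NN` tag below the smallest such number. The function name and payload strings are invented for illustration.

```python
def prune_consumed(tags):
    """Return a copy of tags without producer items every consumer has passed."""
    seqs = [int(value) for name, value in tags.items() if name.endswith("/seq")]
    if not seqs:
        return dict(tags)  # no consumers yet, so nothing is safe to drop
    lowest_needed = min(seqs)
    kept = {}
    for name, value in tags.items():
        if name.startswith("producer/data/"):
            item = int(name.rsplit("/", 1)[1])
            if item < lowest_needed:
                continue  # every consumer has moved past this item
        kept[name] = value
    return kept


# State on slide 58 (payloads are placeholders, the slide shows none).
shared = {
    "producer/data/01": "payload-1",
    "producer/data/02": "payload-2",
    "producer/data/03": "payload-3",
    "consumer1/seq": "02",
    "consumer2/seq": "03",
}
remaining = prune_consumed(shared)
print(sorted(remaining))
```

Under that assumption the result matches slide 59: `producer/data/01` is gone, while items 02 and 03 and both consumer counters remain.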
  • 60. Social data Data that increases in value when it’s co-located All data is social, once we begin to interact with it
  • 61. Info in context Information is more useful in context: Google Wikipedia FluidDB can do a similar thing, but for DBs & apps Data is held in silos by apps Behind restrictive APIs
  • 62. FluidDB Database of these simple objects Enables sharing between applications, people Single-instance, hosted (like SimpleDB) Distributed storage and query processing
  • 63. Sack the golden towns of Montezuma! “My dear fellow,” Burlingame said caustically, “we sit on a blind rock careening through space; we are all of us rushing headlong to the grave. Think you the worms will care, when anon they make a meal of you, whether you spent your moment sighing wigless in your chamber, or sacked the golden towns of Montezuma? Lookee, the day’s nigh spent; ’tis gone careening into time forever. Not a tale’s length since we lined our bowels with dinner, and already they growl for more. We are dying men Ebenezer: i’faith, there’s time for nought but bold resolves!” John Barth, The Sot-Weed Factor
  • 65. Challenge Engineer a system that can efficiently hold an effectively unlimited number of these objects Let each object have an effectively unlimited number of tags Design a query language, permissions, and API Make simple and common things fast Make it realizable
  • 66. OK, your 49% Questions Apps HTTP API Storage Query language Architecture Python Data Permissions Demo Access Release!?
  • 67. Design goals Simple data model Simple permissions model Simple, easily parallelizable, query language Horizontally scalable Fast for common tasks Implementable!
  • 68. Permissions For each action on a namespace or tag: There’s a policy: ‘open’ or ‘closed’ And a (possibly empty) list of exceptions
  • 69. Permissions
        tag or namespace   action   policy   exceptions
        tim/seen           read     closed   tim, meg
        mike/opinion       update   open     (none)
        mike/              create   closed   mike
        meg/rating         see      open     (none)
        meg/rating         read     closed   meg
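The policy-plus-exceptions rule from slides 68 and 69 is small enough to sketch directly. The reading assumed here is that an `open` policy allows everyone except the listed users, and a `closed` policy allows only the listed users. The names `is_allowed`, `can`, and `PERMISSIONS` are illustrative, not part of FluidDB.

```python
def is_allowed(user, policy, exceptions=()):
    """Apply an open/closed policy with a (possibly empty) exception list."""
    if policy == "open":
        return user not in exceptions  # everyone except the exceptions
    if policy == "closed":
        return user in exceptions      # nobody except the exceptions
    raise ValueError(f"unknown policy: {policy!r}")


# The rows from the slide, keyed by (tag or namespace, action).
PERMISSIONS = {
    ("tim/seen", "read"): ("closed", {"tim", "meg"}),
    ("mike/opinion", "update"): ("open", set()),
    ("mike/", "create"): ("closed", {"mike"}),
    ("meg/rating", "see"): ("open", set()),
    ("meg/rating", "read"): ("closed", {"meg"}),
}


def can(user, path, action):
    policy, exceptions = PERMISSIONS[(path, action)]
    return is_allowed(user, policy, exceptions)


print(can("meg", "tim/seen", "read"))    # True: meg is a listed exception
print(can("sally", "tim/seen", "read"))  # False: closed, sally not listed
print(can("sally", "meg/rating", "see")) # True: open, no exceptions
```

Keeping one rule per (path, action) pair means a check is a single lookup plus a set membership test, which fits the "simple permissions model" design goal.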
  • 70. Query language Numeric: tag value (=, <, etc.) Textual: tag text match Presence: has attribute Grouping/logic: (...), and, or, not
  • 71. Query language Numeric: tag value (=, <, etc.) Textual: tag text match Presence: has attribute Grouping/logic: (...), and, or, not Designed to bring back object ids
  • 72. Data buffet username: seen, lastvisited, rating, goingtoread, comment; username: me, FBfriend, linkedInContact, met, family; username/myusername: name, password; flickr.com: longitude, latitude, owner, date, camera/make; username/flickr: title, description; VCspotter: fredwilson, bradburnham; nasdaq.com: name, symbol, type, outstanding, value, price; username/stocks: shares, date; google.com: pagerank; tracks: album, artist, name, year; username/music: count, favorite, lastPlay, stars, bestOf2007
  • 73. Data buffet 2 email: messageId, fromId, toId; username/email: from, to, subject, date; alexa.com: rank; digg.com: title, description, date, diggs; mahalo.com: appeared, category; readwriteweb.com: appeared; reddit.com: date, score; techcrunch.com: appeared, URI; attribute: description, name, path; namespace: description, name, path
  • 74. Example queries
        terry/rating > 5 and has reddit.com/score
        has goingtoread and seen > "January 1, 2008"
        has FBfriend and has linkedInContact
        has james/FBfriend and not has anne/FBfriend
        alexa.com/rank < 50 and fred/comment ~ cool
        has reddit.com/score and not has digg.com/diggs
        has readwriteweb.com/appeared and not has techcrunch.com/appeared
  • 75. More queries
        terry/seen > "July 1, 2007"
        has russell/myusername/name and not has terry/myusername/name
        flickr.com/latitude > 52.15 and flickr.com/latitude < 52.35 and flickr.com/camera/make ~ Sony and has sally/seen
        amazon.com/stars > 3 and amazon.com/price < 20 and amazon.com/title ~ chess and peter/bookrating > 3 sort by amazon.com/publication-date
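To make the query forms concrete, here is a small in-memory sketch of how two of the example queries might be evaluated. It does not parse the query language; the helpers `has`, `gt`, `lt`, `matches`, and `all_of` are hypothetical Python equivalents of the operators, and the sample objects are invented. As slide 71 says, the result is a list of object ids.

```python
# Invented sample data: object id -> tags on that object.
objects = {
    1: {"terry/rating": 8, "reddit.com/score": 120},
    2: {"terry/rating": 3, "reddit.com/score": 45},
    3: {"terry/rating": 9},
    4: {"alexa.com/rank": 12, "fred/comment": "very cool site"},
}


def has(tag):
    return lambda tags: tag in tags


def gt(tag, limit):
    return lambda tags: tag in tags and tags[tag] > limit


def lt(tag, limit):
    return lambda tags: tag in tags and tags[tag] < limit


def matches(tag, word):
    # Stand-in for the "~" text match: a plain substring test.
    return lambda tags: tag in tags and word in str(tags[tag])


def all_of(*predicates):
    return lambda tags: all(p(tags) for p in predicates)


def query(predicate):
    """Return the ids of matching objects, not the objects themselves."""
    return sorted(oid for oid, tags in objects.items() if predicate(tags))


# terry/rating > 5 and has reddit.com/score
print(query(all_of(gt("terry/rating", 5), has("reddit.com/score"))))  # [1]
# alexa.com/rank < 50 and fred/comment ~ cool
print(query(all_of(lt("alexa.com/rank", 50), matches("fred/comment", "cool"))))  # [4]
```

A comparison on a tag that an object lacks simply fails to match, so objects without `terry/rating` never appear in the first result.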
  • 76. Architecture Software Communications Functional Storage Query processing Per box
  • 78-86. Functional [architecture diagram, built up over slides 78 to 86. Labels on slide 78: apps, control, http facade, coord, objects, sets, namespace, tag, text. Labels added on later slides, in order: amqp, xmpp, other, load, memcache, db, kv]
  • 87. Tag Storage
        meg/rating
          object id   user id   value
          1234567     667       26
          6527527     667       188
          2876281     17        207
          7628876     667       1225
        tim/books/opinion
          object id   user id   value
          526141      362       nice
          726483      362       fun
          635378      362       boring
          477582      362       sexy
          362782      362       long
        PostgreSQL; Tall tables; Independent (column store); Backed by key/value store (Amazon S3, for now)
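A tall table per tag is easy to try out locally. The sketch below uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, with the `meg/rating` rows as read from the slide. The table and column names are assumptions made for this example, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One narrow, tall table per tag: (object id, user id, value).
conn.execute(
    "CREATE TABLE meg_rating (object_id INTEGER, user_id INTEGER, value INTEGER)"
)
conn.executemany(
    "INSERT INTO meg_rating VALUES (?, ?, ?)",
    [
        (1234567, 667, 26),
        (6527527, 667, 188),
        (2876281, 17, 207),
        (7628876, 667, 1225),
    ],
)

# A numeric query such as "meg/rating > 100" touches only this one table
# and brings back object ids.
rows = conn.execute(
    "SELECT object_id FROM meg_rating WHERE value > ? ORDER BY object_id",
    (100,),
).fetchall()
object_ids = [row[0] for row in rows]
print(object_ids)  # [2876281, 6527527, 7628876]
conn.close()
```

Because each tag lives in its own table, a query on one tag never scans another tag's values, which is the "independent (column store)" property the slide mentions.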
  • 88. Query processing and digg/date > “Monday” or meg/rating > 5 has tim/seen
  • 89. Query processing tag set ops digg/date > “Monday” meg/rating > 5 has tim/seen
  • 90. Tag affinity attr set ops digg/date > “Monday” meg/rating > 5 has tim/seen
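Slides 88 to 90 suggest that each leaf condition is answered by the service holding that tag, and that the results are then combined with set operations. Here is a minimal sketch of the combining step. The nesting of `and` and `or` below is one possible reading of the slide's tree, and the per-leaf id sets are invented.

```python
# Invented results: the object ids each tag service returns for its condition.
leaf_results = {
    'digg/date > "Monday"': {1, 2, 3},
    "meg/rating > 5": {3, 4},
    "has tim/seen": {2, 4, 5},
}


def evaluate(node):
    """Evaluate a query tree of ("and" | "or", left, right) nodes and leaf strings."""
    if isinstance(node, str):
        return leaf_results[node]
    op, left, right = node
    left_ids, right_ids = evaluate(left), evaluate(right)
    if op == "and":
        return left_ids & right_ids  # intersection
    if op == "or":
        return left_ids | right_ids  # union
    raise ValueError(f"unknown operator: {op!r}")


# Assumed nesting: (digg/date > "Monday" or meg/rating > 5) and has tim/seen
tree = ("and", ("or", 'digg/date > "Monday"', "meg/rating > 5"), "has tim/seen")
result = evaluate(tree)
print(sorted(result))  # [2, 4]
```

Since every leaf returns a plain set of ids, the leaves can be computed independently and in parallel, which fits the "easily parallelizable" design goal from slide 67.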
  • 91. Per box A controller service, launched on boot The controller launches new services (processes) All services talk AMQP as well as pure Thrift A coordinator brings up new boxes & services We use Amazon EC2, for now