Gofer
   A scalable stateless proxy for DBI




Tim.Bunce@pobox.com - September 2008
DBI Layer Cake
Application
    DBI
 DBD::Foo
Database API



                 Database
New Layers
 Application          Gofer Transport

     DBI                    DBI
 DBD::Gofer               DBD::Foo
Gofer...
Gofer gives you...
                               Few
        Application         round-trips           Gofer Transport

 ...
DBD::Proxy vs DBD::Gofer
                              DBD::Proxy   DBD::Gofer
  Supports transactions           ✓        ...
Gofer, logically
•   Gofer is
    - A scalable stateless proxy architecture for DBI
    - Transport independent
    - High...
Gofer, structurally
•   Gofer is
    - A simple stateless request/response protocol
    - A DBI proxy driver: DBD::Gofer
 ...
Gofer Protocol
•   DBI::Gofer::Request & DBI::Gofer::Response
•   Simple blessed hashes
    - Request contains all require...
Using DBD::Gofer
•   Via DSN
    - By adding a prefix
    $dsn = “dbi:Driver:dbname”;
    $dsn = “dbi:Gofer:transport=foo;d...
Gofer Transports
•   DBI::Gofer::Transport::null
    - The ‘null’ transport.
    - Serializes request object,
      transp...
Gofer Transports
•   DBD::Gofer::Transport::stream (ssh)
    - Can ssh to remote system to self-start server
       ssh -x...
Gofer Transports
•   DBD::Gofer::Transport::http
    - Sends requests as http POST requests
    - Server typically Apache ...
Gofer Transports
•   DBD::Gofer::Transport::gearman
    - Distributes requests to a pool of workers
    - Gearman a lightw...
Pooling via gearman vs http
• I haven’t compared them in use myself yet
+ Gearman may have lower latency
+ Gearman spreads...
DBD::Gofer
• A proxy driver
• Aims to be as ‘transparent’ as possible
• Accumulates details of DBI method calls
• Delays f...
DBD::Gofer::Policy::*
•   Three policies supplied: pedantic, classic, and
    rush. Classic is the default.
•   Policies a...
Round-trips per Policy
                             pedantic classic       rush
$dbh = DBI->connect_cached   connect()    ...
Error Handling
•   DBD::Gofer can automatically retry on failure
DBI_AUTOPROXY=“dbi:Gofer:transport=...;
retry_limit=3”
• ...
Client-side Caching
•   DBD::Gofer can automatically cache results
DBI_AUTOPROXY=“dbi:Gofer:transport=null
;cache=CacheMod...
Server-side Caching
•   DBI::Gofer::Transport::* can automatically
    cache results
•   Use any cache module with standar...
Gofer Caveats
•   Adds latency time cost per network round-trip
    - Can minimize round-trips by fine-tuning a policy
    ...
An Example
Using Gofer for Connection Pooling,
   Scaling, and High Availability
The Problem
    Apache
               40 worker processes per server
     Worker
     Worker
     Worker
     Worker
     ...
A Solution
                        A ‘database proxy’ that
    Apache
     Worker
                        does connection ...
An Implementation
                 Standard apache+mod_perl
    Apache
     Worker
     Worker
     Worker
     Worker
   ...
Load Balance and Cache
                                                                                                   ...
Gofer’s Future
•   Caching for http transport
•   Optional JSON serialization
•   Caching in DBD::Gofer (now done)
•   Mak...
Future: http caching
• Potential big win
• DBD::Gofer needs to indicate cache-ability
 - via appropriate http headers
• Se...
Future: JSON
• Turns DBI into a web service!
 - Service Oriented Architecture anyone?
• Accessible to anything that can ta...
Future: Transactions
• State-less-ness could be made optional
 - If transport layer being used agrees
 - Easiest to implem...
Questions?
DBD::Gofer 200809
Upcoming SlideShare
Loading in...5
×

DBD::Gofer 200809

2,911

Published on

DBD::Gofer is the scalable stateless proxy driver for Perl DBI.

These are the slides for my lightning talk on DBD::Gofer given at the Italian Perl Workshop in 2008 (with a few extra slides added).

Published in: Business, Technology
1 Comment
0 Likes
Statistics
Notes
  • Be the first to like this

No Downloads
Views
Total Views
2,911
On Slideshare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
13
Comments
1
Likes
0
Embeds 0
No embeds

No notes for slide

DBD::Gofer 200809

  1. 1. Gofer A scalable stateless proxy for DBI Tim.Bunce@pobox.com - September 2008
  2. 2. DBI Layer Cake Application DBI DBD::Foo Database API Database
  3. 3. New Layers Application Gofer Transport DBI DBI DBD::Gofer DBD::Foo Gofer Transport Database API Database
  4. 4. Gofer gives you... Few Application round-trips Gofer Transport DBI DBI DBD::Gofer DBD::Foo Gofer Transport Stateless protocol Server-side Database DBD caching Client-side only needed caching on server Pluggable Retry on Configure transports Flexible and easy to extend error via DSN ssh, http, etc... e.g. DBIx::Router
  5. 5. DBD::Proxy vs DBD::Gofer DBD::Proxy DBD::Gofer Supports transactions ✓ ✗(not yet) Supports very large results ✓ ✗(memory) Large test suite ✗ ✓ Minimal round-trips ✗ ✓ Automatic retry on error ✗ ✓ Modular & Pluggable classes ✗ ✓ Tunable via Policies ✗ ✓ Scalable ✗ ✓ Connection pooling ✗ ✓ Client-side caching ✗ ✓ Server-side caching ✗ ✓
  6. 6. Gofer, logically • Gofer is - A scalable stateless proxy architecture for DBI - Transport independent - Highly configurable on client and server side - Efficient, in CPU time and minimal round-trips - Well tested - Scalable - Cacheable - Simple and reliable
  7. 7. Gofer, structurally • Gofer is - A simple stateless request/response protocol - A DBI proxy driver: DBD::Gofer - A request executor module - A set of pluggable transport modules - An extensible client configuration mechanism • Development sponsored by Shopzilla.com
  8. 8. Gofer Protocol • DBI::Gofer::Request & DBI::Gofer::Response • Simple blessed hashes - Request contains all required information to connect and execute the requested methods. - Response contains results from methods calls, including result sets (rows of data). • Serialized and transported by transport modules like DBI::Gofer::Transport::http
  9. 9. Using DBD::Gofer • Via DSN - By adding a prefix $dsn = “dbi:Driver:dbname”; $dsn = “dbi:Gofer:transport=foo;dsn=$dsn”; • Via DBI_AUTOPROXY environment variable - Automatically applies to all DBI connect calls $ export DBI_AUTOPROXY=“dbi:Gofer:transport=foo”; - No code changes required!
  10. 10. Gofer Transports • DBI::Gofer::Transport::null - The ‘null’ transport. - Serializes request object, transports it nowhere, then deserializes and passes it to DBI::Gofer::Execute to execute - Serializes response object, transports it nowhere, then deserializes and returns it to caller - Very useful for testing. DBI_AUTOPROXY=“dbi:Gofer:transport=null”
  11. 11. Gofer Transports • DBD::Gofer::Transport::stream (ssh) - Can ssh to remote system to self-start server ssh -xq user@host.domain perl -MDBI::Gofer::Transport::stream -e run_stdio_hex - Automatically reconnects if required - ssh gives you security and optional compression DBI_AUTOPROXY=’dbi:Gofer:transport=stream ;url=ssh:user@host.domain’
  12. 12. Gofer Transports • DBD::Gofer::Transport::http - Sends requests as http POST requests - Server typically Apache mod_perl running DBI::Gofer::Transport::http - Very flexible server-side configuration options - Can use https for security - Can use web techniques for scaling and high- availability. Will support web caching. DBI_AUTOPROXY=’dbi:Gofer:transport=http ;url=http://example.com/gofer’
  13. 13. Gofer Transports • DBD::Gofer::Transport::gearman - Distributes requests to a pool of workers - Gearman a lightweight distributed job queue http://www.danga.com/gearman - Gearman is implemented by the same people who wrote memcached, perlbal, mogileFS, & DJabberd DBI_AUTOPROXY=’dbi:Gofer:transport=gearman ;url=http://example.com/gofer’
  14. 14. Pooling via gearman vs http • I haven’t compared them in use myself yet + Gearman may have lower latency + Gearman spreads load over multiple machines without need for load-balancer + Gearman coalescing may be beneficial + Gearman async client may work well with POE - Gearman clients need to be told about servers • More gearman info http://danga.com/words/ 2007_04_linuxfest_nw/linuxfest.pdf (p61+)
  15. 15. DBD::Gofer • A proxy driver • Aims to be as ‘transparent’ as possible • Accumulates details of DBI method calls • Delays forwarding request for as long as possible • execute_array() is a single round-trip • Policy mechanism allows fine-grained tuning to trade transparency for speed
  16. 16. DBD::Gofer::Policy::* • Three policies supplied: pedantic, classic, and rush. Classic is the default. • Policies are implemented as classes • Currently 22 individual items within a policy • Policy items can be dynamic methods • Policy is selected via DSN: DBI_AUTOPROXY=“dbi:Gofer:transport=null ;policy=pedantic”
  17. 17. Round-trips per Policy pedantic classic rush $dbh = DBI->connect_cached connect() ✓ $dbh->ping ✓ if not if not $dbh->quote ✓ default default $sth = $dbh->prepare ✓ $sth->execute ✓ ✓ ✓ $sth->{NUM_OF_FIELDS} $sth->fetchrow_array cached $dbh->tables ✓ ✓ after first
  18. 18. Error Handling • DBD::Gofer can automatically retry on failure DBI_AUTOPROXY=“dbi:Gofer:transport=...; retry_limit=3” • Default behaviour is to retry if $request->is_idemponent is true - looks at SQL, returns true for most SELECTs • Default is retry_limit=0, so disabled • You can define your own behaviour: DBI->connect(..., { go_retry_hook => sub { ... }, });
  19. 19. Client-side Caching • DBD::Gofer can automatically cache results DBI_AUTOPROXY=“dbi:Gofer:transport=null ;cache=CacheModuleName” • Default behaviour is to cache if $request->is_idemponent is true looks at SQL returns true for most SELECTs • Use any cache module with standard interface set($k,$v) and get($k) • You can define your own behaviour by subclassing
  20. 20. Server-side Caching • DBI::Gofer::Transport::* can automatically cache results • Use any cache module with standard interface - set($k,$v) and get($k) • You can define your own behaviour by subclassing
  21. 21. Gofer Caveats • Adds latency time cost per network round-trip - Can minimize round-trips by fine-tuning a policy - and/or using client-side caching • State-less-ness has implications... - No transactions, AutoCommit only, at the moment. - Can’t alter $dbh attributes after connect - Can’t use temp tables, locks, and other per-connection persistent state, except within a stored procedure - Code using last_insert_id needs a (simple) change - See the docs for a few other very obscure caveats
  22. 22. An Example Using Gofer for Connection Pooling, Scaling, and High Availability
  23. 23. The Problem Apache 40 worker processes per server Worker Worker Worker Worker Worker +vers Worker Worker Worker Worker Worker 150 ser Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Database Worker Worker Server Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker = Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Workers 1 overloaded database -
  24. 24. A Solution A ‘database proxy’ that Apache Worker does connection pooling Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Database Worker Worker Server Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Holds a few Worker Worker Worker Worker connections open Worker Worker Worker Worker Worker Workers Repeat for all servers - Uses them to ser vice DBI requests
  25. 25. An Implementation Standard apache+mod_perl Apache Worker Worker Worker Worker Worker Worker Worker Worker Worker DBI + DBD::* Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Apache Worker Database Worker Worker Server Worker Worker Worker !"#$%#& Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker DBI::Gofer::Execute and Worker Worker Worker Worker DBI::Gofer::Transport::http Workers modules implement stateless proxy - DBD::Gofer with http transport
  26. 26. Load Balance and Cache Load balancing server and fail-over Apache running DBI Gofer with 40 workers Far fewer db HTTP Load Balancer and Cache connections Workers x Many = 6000 db 150 web Gofer Database connections servers Servers Server servers Apache running DBI Gofer Workers Caching - New middle tier Caching
  27. 27. Gofer’s Future • Caching for http transport • Optional JSON serialization • Caching in DBD::Gofer (now done) • Make state-less-ness of transport optional to support transactions • Patches welcome!
  28. 28. Future: http caching • Potential big win • DBD::Gofer needs to indicate cache-ability - via appropriate http headers • Server side needs to agree - and respond with appropriate http headers • Caching then happens just like for web pages - if there’s a web cache between client and server • Patches welcome!
  29. 29. Future: JSON • Turns DBI into a web service! - Service Oriented Architecture anyone? • Accessible to anything that can talk JSON • Clients could be JavaScript, Java, ... - and various languages that begin with P ... • Patches welcome!
  30. 30. Future: Transactions • State-less-ness could be made optional - If transport layer being used agrees - Easiest to implement for stream / ssh • Would be enabled by AutoCommit => 0 $dbh->begin_work • Patches welcome!
  31. 31. Questions?
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×