NoSQL: Why, When, and How

Transcript

  1. 1. NoSQL Why, When, and How starring CouchDB
  2. 2. aka built rebuilding on hack at contribute to work at
  3. 3. One More thing...
      • Co-organizing REST Fest 2011 w/Mike Amundsen
      • restfest.org
  4. 4. What is Why NoSQL ^
  5. 5. NoSQL is...
      • Not Only SQL or SQL
      • I vote for this one
  6. 6. Origins of the Name
      • Carlo Strozzi - NoSQL app
      • started in 1998. RDBMS sans SQL
      • applied to non-relational DBs around 2008-ish
      • no:sql(east)
      • select fun, profit from real_world where relational=false;
  7. 7. Origin of the Species
      • non-relational databases pre-date relational ones
      • it’s sorta like AJAX
      • been doin’ it for awhile
      • gets a name
      • now it’s cool!!1!
  8. 8. Types of NoSQL DBs
      • graph (RDF/Semantic Web/triples)
      • key-value (just what it says)
      • document (k/v + queryability)
      • object (big in the 1980s)
      • multivalue (old tech...like 1960s)
      • NewSQL?
  9. 9. NewSQL?
      • Recently (April, 2011) coined term
      • by The 451 Group
      • mostly means SQL DBs + better scalability
      • or NoSQL DBs + SQL layers
      • yeah! more keywords!!
      • keeps marketing happy...
  10. 10. ...on to specifics
  11. 11. Graph (RDF)
      • pretty heady stuff
      • Web 3.0? Maybe...
      • simple concept, complicated execution (often)
      • Queried with: SPARQL, Java
      • FlockDB (from Twitter), AllegroGraph, Cytoscape
  12. 12. Key Value Stores
      • scalable caches
      • generally no query language
      • get key(s)
      • return value(s)
      • Big Co need driven
      • Project Voldemort, Hibari
  13. 13. Document Databases key value + querying Lotus Notes Amazon SimpleDB
  14. 14. object
  15. 15. multivalue
      • pretty antique
      • lots of legacy rollouts
  16. 16. NewSQL
      • any of the previously mentioned DBs + query layers...maybe
      • expect any and all SQL DBs to jump on this train and/or slip in the next few years
      • worth a look if you *must* have normalized storage
      • but who needs normalization?
  17. 17. Why
  18. 18. One Reason
  19. 19. options
  20. 20. denormalization
      • schemaless
      • graph/object schema
      • closer match to business logic
      • generally faster/more scalable
      • certainly more distributable
  21. 21. alt.queries
      • Map/Reduce: thanks, Google (paper from 2004)
      • XPath and/or XQuery
      • SPARQL
      • SQL...or something quite similar
      • Linq
  22. 22. licensing
      • Apache License 2.0: the favorite
      • AGPLv3: the commercial favorite
      • others include LGPL (v2/3), GPL (v2/3), BSD, MIT, custom commercial or open source
      • ...here there be dragons...
  23. 23. (un)expected extras (at least in CouchDB)
      • Built in Web Server
      • or App Server
      • Geospatial bounding box queries
      • n-master replication
      • binary file storage
      • scales up and down
  24. 24. When
  25. 25. Scenarios
      • Scalability: caching, sharding
      • Analytics: Data Warehousing
      • Ubiquitous/Distributed Data: mobile, desktop, server
  26. 26. Scalability
      • “cache” style DBs
      • data served from RAM (mostly)
      • Membase, Elastic Couchbase, memcached, MongoDB, Cassandra
      • horizontal scalability
      • add more servers, not more server
  27. 27. Analytics
      • Hadoop
      • HBase
      • Cassandra
      • this bleeds over into data warehousing quickly
  28. 28. Ubiquitous/Distributed Data
      • CouchDB
      • server <=> desktop <=> mobile
      • Riak Mobile?
      • go to Erlang Factory in June
  29. 29. FALE Scenarios
      • power outages
      • data loss
      • failed persistence, no persistence
      • network unavailability
      • special thanks to @coats who runs fale.ca for the FALE stamp
  30. 30. Solution: CouchDB
  31. 31. How with CouchDB
  32. 32. Time to Relax
  33. 33. Time to Relax
      • That’s Damien. He built CouchDB.
  34. 34. Time to Relax
      • That’s the CouchDB gang sign.
      • That’s Damien. He built CouchDB.
  35. 35. Time to Relax
      • That’s the CouchDB gang sign. Learn it!
      • That’s Damien. He built CouchDB.
  36. 36. and here’s why
  37. 37. CouchDB has Super Powers!
      • Schemaless
      • Document centric
      • Replication/Sync
      • Fail Fast Architecture
      • stateless API
      • append only file storage
  38. 38. Document centric
      • “natural” data model
      • store data like it exists everywhere else: as a document
      • map/reduce vs. sql
      • sorting documents out of a drawer vs. reassembling them from bits of data
  39. 39. Replication/Sync
      • MVCC-based transactions
      • versioning...but only meant for transactions
      • safely merge databases
      • documents aren’t compared, only UUIDs & revision IDs
      • conflicting documents are marked and a winner is picked
  40. 40. Fail Fast
      • append only database file
      • everything goes on the end of the file
      • queries are cached there too
      • bounce back from errors rather than spin wheels indefinitely
  41. 41. C.O.U.C.H Collection Of Unreliable Commodity Hardware
  42. 42. CouchDB Scaling
      • scales up and down
      • server <=> desktop <=> mobile
      • thanks to n-master replication
      • BigCouch plugin for sharding
      • HTTP API can be load balanced
      • reverse proxies and caching
  43. 43. Get CouchDB
      • http://www.couchbase.com/downloads/couchbase-server/community
      • http://iriscouch.com/
      • both have GeoCouch built in!
      • https://cloudant.com/ (BigCouch/sharded)
      • your system’s package installer...
      • http://wiki.apache.org/couchdb/Installation
  44. 44. Time to Relax (command line)
        Apache CouchDB 1.0.2 (LogLevel=info) is starting.
        Apache CouchDB has started. Time to relax.
        [info] [<0.31.0>] Apache CouchDB has started on http://127.0.0.1:5984/
  45. 45. PHP & CouchDB
  46. 46. HTTP Clients
      • All you really need to get started
      • Does require a better understanding of CouchDB’s API...but that’s A Good Thing!
      • Will improve your HTTP skillz...another Good Thing!
      • requires more work...the part you won’t like
      • but it’s worth it to learn HTTP & REST
  47. 47. CouchDB Clients
      • CouchDB’s API is just HTTP
      • but...helper libraries can...help:
      • auto (en|de)code JSON
      • handle base64-ing “inline” attachments
      • manage authentication, cookies, OAuth token exchange
      • caching!!!
      • _changes feed watching
  48. 48. CouchDB Clients
      • Sag for CouchDB (Apache License 2.0)
      • PHP On Couch (GPLv2 or v3)
      • Beyond here, there be giants...
      • PHPillow (LGPL 3)
      • PHP Object_Freezer (BSD)
      • PHP CouchDB Extension (PHP License 3.0)
      • Doctrine2 CouchDB ODM
  49. 49. HTTP Clients
      • curl: ugh...messy
      • pecl_http: lovely (next to curl), but takes some install time, lacks examples
      • Zend_HTTP & PEAR HTTP_Request2
      • Most major frameworks have their own
  50. 50. Client Suggestions
      • HTTP: pick one that’s flexible (can handle COPY)
      • CouchDB: I use Sag currently. Caching, Cookie Auth, nice name.
      • PHP-on-Couch seems great, but watch the license (GPL)
  51. 51. Today’s Stack
      • In PHP: Sag - saggingcouch.com/
      • For HTTP API Demoing/Testing:
      • Resty - github.com/micha/resty
      • Poster for Firefox - code.google.com/p/poster-extension/
  52. 52. Other Handy HTTP Clients
      • HTTPClient for Mac OS X
      • Charles Proxy ($$)
      • http-twiddle for Emacs
      • Fiddler for Windows
      • Solex for Eclipse
  53. 53. CouchDB HTTP API we’ll be back to PHP in a bit
  54. 54. JSON Documents
      • all responses are valid JSON
      • JSON support is built into PHP 5.2+
      • pecl & “pure” PHP (de|en)code for older versions
  55. 55. A JSON Document
        {
          "json": "key/value pairs",
          "_id" : "some uuid",
          "_rev": "mvcc key",
          "string keys": [1,2,3,"four",null],
          "schema free": {"so it's":"flexible"}
        }
  56. 56. JSON to PHP Object
        $json = '{"json":"document","with":["an", "array"]}';
        $j = json_decode($json);
        // $j
        // stdClass Object
        // (
        //     [json] => document
        //     [with] => Array
        //         (
        //             [0] => an
        //             [1] => array
        //         )
        // )
        echo $j->with[1];
  57. 57. JSON to PHP array
        $json = '{"json":"document","with":["an", "array"]}';
        $j = json_decode($json, true);
        // $j
        // Array
        // (
        //     [json] => document
        //     [with] => Array
        //         (
        //             [0] => an
        //             [1] => array
        //         )
        // )
        echo $j['with'][1];
  58. 58. I got tired of having to pick -> or []
        $j = new ArrayObject(
            json_decode($json),
            ArrayObject::ARRAY_AS_PROPS
        );
        print_r($j['with'][0]);
        print_r($j->with[0]);
        // "an" -- same result! no errors!
  59. 59. HTTP / REST basics
      • GET: read
      • PUT: create or update
      • DELETE: delete
      • POST: bulk operation
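A minimal sketch of how those four verbs could be sent from plain PHP, using only the built-in HTTP stream wrapper. The helper names `couchContextOptions()` and `couchRequest()` and the `http://localhost:5984` base URL are assumptions made for this illustration; they are not part of CouchDB, Sag, or the talk's code.

```php
<?php
// Hypothetical helper: builds stream-context options for one CouchDB request.
function couchContextOptions(string $method, ?array $doc = null): array
{
    $http = [
        'method'        => strtoupper($method),
        'header'        => "Accept: application/json\r\n",
        'ignore_errors' => true, // still read the JSON body on 404/409 responses
    ];
    if ($doc !== null) {
        $http['header'] .= "Content-Type: application/json\r\n";
        $http['content'] = json_encode($doc);
    }
    return ['http' => $http];
}

// Hypothetical helper: sends the request and decodes the JSON response.
// Assumes CouchDB is listening on http://localhost:5984 (its default port).
function couchRequest(string $method, string $path, ?array $doc = null): ?array
{
    $context = stream_context_create(couchContextOptions($method, $doc));
    $raw = @file_get_contents('http://localhost:5984' . $path, false, $context);
    return $raw === false ? null : json_decode($raw, true);
}
```

Nothing goes over the wire until `couchRequest()` is called; with a server running, `couchRequest('PUT', '/pouch/tek', ['php' => 'tek'])` would be the "create or update" case.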
  60. 60. Resty command line RESTful good times
  61. 61. Setup Resty
      • install (see resty page)
        $ . resty
        # GET, POST, PUT, DELETE & HEAD
        # are scripts now!
        $ resty http://localhost:5984/
        # set Resty to default to CouchDB
  62. 62. Create a Database
        $ PUT /pouch/
        {"ok":true}
        $ GET /pouch/
        {"db_name":"pouch",
         "doc_count":0,"doc_del_count":0,
         "update_seq":0,"purge_seq":0,
         "compact_running":false,
         "disk_size":79,
         "instance_start_time":"1289923325819422",
         "disk_format_version":5,
         "committed_update_seq":0}
  63. 63. DELETE a Database
        $ DELETE /pouch/
        {"ok":true}
        # ^^^ be careful with that one!
        # let's recreate it
        $ PUT /pouch/
  64. 64. Create Document
        PUT /pouch/tek \
          '{"php":"tek"}'
        {"ok":true, "id":"tek",
         "rev":"1-89af21439a03933bc3fc8c14cbeb496e"}
  65. 65. GET the Document
        GET /pouch/tek
        {"_id":"tek",
         "_rev":"1-89af21439a03933bc3fc8c14cbeb496e",
         "php":"tek"}
        // we'll need that _rev value to update this doc
  66. 66. PUT (failure) on purpose...
        PUT /pouch/tek '{"php":"tek"}'
        {"error":"conflict","reason":"Document update conflict."}
        # we need that _rev value now
  67. 67. PUT things right
        PUT /pouch/tek \
          '{"_rev": "1-89af21439a03933bc3fc8c14cbeb496e", "php":"tek"}'
        # that \ represents a new line
        # doc could also contain the "id"
        {"ok":true, "id":"tek",
         "rev":"2-fff750985c2c2602e859fe38cd1d347e"}
  68. 68. PUT binary attachments
        PUT /pouch/tek/photo?rev=2-fff750985c2c2602e859fe38cd1d347e -Q filename.png -H "Content-Type: image/png"
        {"ok":true,"id":"tek",
         "rev":"3-18d519e58b569e43a6fd5e87491f0c4c"}
        GET /pouch/tek
        {"_id":"tek",
         "_rev":"3-18d519e58b569e43a6fd5e87491f0c4c",
         "_attachments":{"photo":{"content_type":"image/png",
         "revpos":1,"length":18,"stub":true}}}
  69. 69. A bit about attachments
      • Each attachment to a doc has its own URL:
        /pouch/tek/photo
        /pouch/tek/schedule.pdf
      • each attachment has its own mimetype
      • attachments can be added/updated via their own URLs or inline
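The "inline" route can be sketched as follows: the attachment rides along inside the document's `_attachments` member as base64 text plus a content type. The helper name `withInlineAttachment()` and the sample file are made up for this example; the resulting array would still have to be PUT to the document's URL.

```php
<?php
// Hypothetical helper: returns a copy of $doc with one inline attachment added.
function withInlineAttachment(array $doc, string $name, string $bytes, string $mimeType): array
{
    $doc['_attachments'][$name] = [
        'content_type' => $mimeType,
        'data'         => base64_encode($bytes), // inline attachments travel as base64
    ];
    return $doc;
}

// Sample data only: a tiny text "file" standing in for a real upload.
$doc = withInlineAttachment(['php' => 'tek'], 'schedule.txt', 'Tuesday: CouchDB', 'text/plain');
echo json_encode($doc, JSON_UNESCAPED_SLASHES), "\n";
```

For large binaries the attachment's own URL is probably the better fit, since base64 inflates the payload by roughly a third.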
  70. 70. DELETE (failure) on purpose...again
        DELETE /pouch/tek
        {"error":"conflict",
         "reason":"Document update conflict."}
        # you always have to send a "rev" when changing a doc in any way
  71. 71. DELETE
        DELETE '/pouch/tek?rev=3-18d519e58b569e43a6fd5e87491f0c4c' -Q
        # that -Q tells Resty not to urlencode
        {"ok":true,"id":"tek",
         "rev":"4-3e8c21e8e610c4ea7f8e247a45c6eb04"}
        # what?!? another rev?...but the
        # document should be dead?!
  72. 72. GETing 404’s
        GET /pouch/tek
        {"error":"not_found", "reason":"deleted"}
        GET /pouch/tek12
        {"error":"not_found", "reason":"missing"}
        # RESTifarian note: would be a 410 (Gone)
        # if caches were built better
  73. 73. That was just the very basics
      • GET /_stats (server stats)
      • GET /_all_dbs (list all DBs)
      • GET /db/_all_docs (list all docs)
      • GET /db/_changes (list recent changes)
      • super powers!
  74. 74. Map/Reduce Queries (searching the file drawer)
      • essentially stored queries
      • a.k.a. “Views”
      • written in JavaScript (or Python or Ruby or Erlang or PHP?)
      • similar to array_map()/array_reduce(), but scalable
      • no ad-hoc queries
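To make the array_map()/array_reduce() comparison concrete, here is a rough in-memory sketch in PHP. It only shows the shape of the idea: a real CouchDB view is a function stored in a design document, and its rows are indexed on disk rather than recomputed on each request. The documents and field names below are invented for the example.

```php
<?php
// Sample documents, loosely shaped like photo metadata (field names are made up).
$docs = [
    ['_id' => 'a', 'camera' => 'Canon', 'size' => 120],
    ['_id' => 'b', 'camera' => 'Nikon', 'size' => 300],
    ['_id' => 'c', 'camera' => 'Canon', 'size' => 80],
];

// "map": emit one [key, value] row per document.
$rows = array_map(fn(array $doc) => [$doc['camera'], $doc['size']], $docs);

// View rows are kept sorted by key; sort() compares the row arrays
// element by element, so rows end up ordered by key first.
sort($rows);

// "reduce": sum the values per key, like a grouped reduce.
$totals = array_reduce($rows, function (array $carry, array $row) {
    [$key, $value] = $row;
    $carry[$key] = ($carry[$key] ?? 0) + $value;
    return $carry;
}, []);

print_r($totals);
```

The "no ad-hoc queries" point shows up here too: the map and reduce steps are fixed up front, and asking a different question means writing a different pair.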
  75. 75. Pouch put your media in the Couch starring PHP!
  76. 76. i want to cover
      • schema-less JSON docs
      • Binary Attachments
      • Map/Reduce: Javascript, several examples
      • replication
      • security: _security, validate_doc_update
      • server side rendering stuff: _show, _list
      • URL Rewriting: _rewrite, vhosts
  77. 77. Pouch is...
      • a Database of JSON docs of file meta-data, with the file attached!
      • a filesystem importer
      • a Web-based CouchApp for browsing
  78. 78. 0.1 Files into CouchDB
      • gather EXIF data
      • add attachment
      • PUT to CouchDB
      • let us know how it went
  79. 79. code preamble
        #!/usr/bin/php
        <?php
        if ($argc < 2) die('I need a file name.'."\n");
        // CouchDB config
        $user = 'admin';
        $password = 'passwd';
        // gather and clean EXIF data
        $exif = exif_read_data($argv[1]);
        unset($exif['MakerNote']);
        unset($exif['ComponentsConfiguration']);
        unset($exif['JPEGThumbnail']);
        unset($exif['TIFFThumbnail']);
  80. 80.
        require_once dirname(realpath(__FILE__)).'/../libs/Sag/src/Sag.php';
        $sag = new Sag('localhost', 5984);
        $sag->setDatabase('pouch');
        // PUT the binary attachment first
        $sag->setAttachment('original',                        // the attachment name
            file_get_contents($argv[1]),                       // the file
            image_type_to_mime_type(exif_imagetype($argv[1])),
            $id = md5_file($argv[1]));                         // the doc id
        // then add the EXIF data
        // GET the full doc as we need to add our info to what's there
        // atomicity is at a document level in CouchDB
        $doc = $sag->get($id)->body;
        $doc->exif = $exif;
        print_r($sag->put(md5_file($argv[1]), $doc));
  81. 81. handle updates: GET/PUT or HEAD/PUT or PUT/error_handle/PUT
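A sketch of the GET/PUT option with a retry, run against an in-memory stand-in so it works without a server. `FakeCouch` and `updateWithRetry()` are hypothetical names invented here; the stand-in only imitates one behaviour, rejecting a write whose `_rev` does not match the stored one, and its revision hashes are not computed the way CouchDB computes them.

```php
<?php
// In-memory stand-in for a CouchDB database, for illustration only:
// it rejects writes whose _rev does not match the stored revision.
class FakeCouch
{
    private array $docs = [];

    public function get(string $id): ?array
    {
        return $this->docs[$id] ?? null;
    }

    // Returns the new revision string, or null to signal a 409-style conflict.
    public function put(string $id, array $doc): ?string
    {
        $current = $this->docs[$id]['_rev'] ?? null;
        if (($doc['_rev'] ?? null) !== $current) {
            return null; // conflict: caller sent a stale or missing _rev
        }
        // The leading number of "N-hash" is the generation; bump it by one.
        $generation = $current === null ? 1 : (int) $current + 1;
        $doc['_id'] = $id;
        $doc['_rev'] = $generation . '-' . md5(json_encode($doc));
        $this->docs[$id] = $doc;
        return $doc['_rev'];
    }
}

// GET/PUT with retry: re-read the doc and re-apply the change after a conflict.
function updateWithRetry(FakeCouch $db, string $id, callable $change, int $attempts = 3): ?string
{
    for ($i = 0; $i < $attempts; $i++) {
        $doc = $db->get($id) ?? [];
        $rev = $db->put($id, $change($doc));
        if ($rev !== null) {
            return $rev;
        }
    }
    return null;
}
```

The `$change` callback receives the freshly read document (including its `_rev`) and returns the modified copy, so a retry always starts from the latest revision rather than from stale data.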
  82. 82. 3 Tier UI for viewing browser <-> php <-> couchdb
  83. 83. - GET attachments
  84. 84. _all_docs for doc list
  85. 85. loading CouchDB results into JSON and then into template/PHP output
  86. 86. Just certain documents ala Map/Reduce
  87. 87. _view API & query params
  88. 88. include_docs
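A small sketch of building a view query path with parameters such as `key` and `include_docs`. The helper name `viewPath()`, the design document name `photos`, and the view name `by_camera` are assumptions for illustration. The one detail worth showing is that key-style parameters are JSON-encoded before being URL-encoded.

```php
<?php
// Hypothetical helper: builds the path for querying a view.
// key, startkey and endkey must be JSON-encoded; other parameters pass through.
function viewPath(string $db, string $designDoc, string $view, array $params = []): string
{
    foreach (['key', 'startkey', 'endkey'] as $name) {
        if (array_key_exists($name, $params)) {
            $params[$name] = json_encode($params[$name]);
        }
    }
    // http_build_query() would turn booleans into 1/0, so spell them out.
    foreach ($params as $name => $value) {
        if (is_bool($value)) {
            $params[$name] = $value ? 'true' : 'false';
        }
    }
    $path = sprintf(
        '/%s/_design/%s/_view/%s',
        rawurlencode($db),
        rawurlencode($designDoc),
        rawurlencode($view)
    );
    $query = http_build_query($params);
    return $query === '' ? $path : $path . '?' . $query;
}

echo viewPath('pouch', 'photos', 'by_camera', ['key' => 'Canon', 'include_docs' => true]), "\n";
```

With `include_docs=true` each row also carries the full document, which saves a second round of GETs at the cost of a larger response.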
  89. 89. thumbnail creation from PHP async or during upload if we want to wrap CouchDB
  90. 90. disadvantages of non-async operations
  91. 91. 2.5 Tier refactoring browser <-> couchdb ^ php
  92. 92. find photos sans thumbnails and add them
  93. 93. _changes feed could check this first before hitting the view: _changes is “lighter”
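One way a cron job might use the feed, sketched under assumptions: read a decoded `_changes` response, collect the ids of documents that changed and still exist, and remember the sequence number to pass as `since` next time. The helper name `changedDocIds()` is hypothetical, and the sample response is hand-written in the shape CouchDB 1.x returns, not captured from a server.

```php
<?php
// Hypothetical helper: given a decoded _changes response, return the ids of
// documents that changed and were not deleted, plus the sequence to resume from.
function changedDocIds(array $changes): array
{
    $ids = [];
    foreach ($changes['results'] ?? [] as $row) {
        if (empty($row['deleted'])) {
            $ids[] = $row['id'];
        }
    }
    return [
        'ids'   => array_values(array_unique($ids)),
        'since' => $changes['last_seq'] ?? 0,
    ];
}

// A hand-written sample response (not real server output).
$sample = json_decode('{"results":[
    {"seq":1,"id":"a1","changes":[{"rev":"1-abc"}]},
    {"seq":3,"id":"b2","changes":[{"rev":"2-def"}],"deleted":true},
    {"seq":4,"id":"c3","changes":[{"rev":"1-123"}]}
],"last_seq":4}', true);

print_r(changedDocIds($sample));
```

If the list of ids comes back empty, the job can skip the view query entirely, which is the sense in which the feed is "lighter".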
  94. 94. JS app to replace the PHP-built UI
  95. 95. PHP cronjob handles thumbs/metadata async
  96. 96. CouchDB setup on public port
  97. 97. couchapp script http://couchapp.org
  98. 98. Replicating the App
  99. 99. big advantage: data & app stay together this is huge!
  100. 100. small disadvantages: dynamic data (thumbs) won’t always be there; no real data loss in this case, as they can be re-created later
  101. 101. _replicate
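Replication is triggered by POSTing a small JSON body to `/_replicate` naming a source and a target. A sketch of building that body follows; the helper name `replicationRequest()` and the `backup.example.com` host are placeholders, not anything from the talk.

```php
<?php
// Hypothetical helper: builds the JSON body for a POST to /_replicate.
function replicationRequest(string $source, string $target, bool $continuous = false): string
{
    $body = ['source' => $source, 'target' => $target];
    if ($continuous) {
        $body['continuous'] = true; // keep replicating as new changes arrive
    }
    return json_encode($body, JSON_UNESCAPED_SLASHES);
}

// Placeholder target host; a local database name or a full URL can be used on either side.
echo replicationRequest('pouch', 'http://backup.example.com:5984/pouch'), "\n";
```

Because a CouchApp lives in a design document inside the database, the same request carries the app along with the data.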
  102. 102. Removing Images just DELETE the docs
  103. 103. Compaction (for space cleanup); _deleted docs; _stats (for CouchDB-wide space usage)
  104. 104. Securing the CouchApp
  105. 105. _security API
  106. 106. Users/Roles on docs
  107. 107. CouchDB permissions
  108. 108. Cookie Authentication (in the CouchApp)
  109. 109. Basic Authentication (for the cronjob) Sag, HTTP API access, etc.
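For a cron job talking to the HTTP API directly, Basic Authentication comes down to one request header. A minimal sketch, with the helper name `basicAuthHeader()` invented here and the credentials reused from the placeholder config in the import script:

```php
<?php
// Hypothetical helper: builds an HTTP Basic Authorization header line.
function basicAuthHeader(string $user, string $password): string
{
    return 'Authorization: Basic ' . base64_encode($user . ':' . $password);
}

// Placeholder credentials, matching the sample config earlier in the talk.
echo basicAuthHeader('admin', 'passwd'), "\n";
```

The value is only base64-encoded, not encrypted, so this suits a job on the same host or one that goes over HTTPS.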
  110. 110. OAuth it’s an option, but a whole ‘nother talk
  111. 111. Replicating with security
  112. 112. Adding a form for User Data Entry
  113. 113. - Server-side validation
  114. 114. validate_doc_update: document validation for additional security
  115. 115. Making the App “Static” (less AJAX dependent)
  116. 116. _show “templating” for a single doc
  117. 117. _list templating for view output
  118. 118. Migrating existing MySQL-based Gallery data to Pouch
  119. 119. writing JSON docs pulled from old Gallery
  120. 120. requesting those docs, PUTing them, PUTing the attachment
  121. 121. CouchDB as REST API server
      • Building another API for Pouch
      • In addition to (or instead of) the standard CouchDB API
      • Putting that into CouchDB
      • _list - index pages
      • _show - single document page
      • _update - document modification
      • _rewrite - URL Rewriting
  122. 122. Any questions?
  123. 123. Other CouchApps
      • demo/review other CouchApps?
      • discuss scalability/load balancing?
      • deeper dive into Map/Reduce joins?
