An overview of Ruby, jRuby, Rails, Torquebox, and PostgreSQL, presented as a 3-hour class to other programmers at The Ironyard (http://theironyard.com) in Greenville, SC in July 2013. The Rails-specific sections are mostly code samples that were explained during the session, so the real focus of the slides is Ruby, "the Rails way" (workflow and differentiators), and PostgreSQL.
2. Who am I?
• I’m Barry Jones
• Application Developer since ’98
– Java, PHP, Groovy, Ruby, Perl, Python
– MySQL, PostgreSQL, SQL Server, Oracle, MongoDB
• Efficiency and infrastructure nut
• Believer in “right tool for the job”
– There is no silver bullet; programming is about tradeoffs
4. What do we look for in a language?
• Balance
– Can it do what I need it to do?
• Web: Ruby/Python/PHP/Perl/Java/C#/C/C++
– Efficient to develop with it?
• Ruby/Python/PHP
– Libraries/tools/ecosystem to avoid reinventing the wheel?
• Ruby/Python/PHP/Java/Perl
– Is it fast?
• Ruby/Python/Java/C#/C/C++
– Is it stable?
• Ruby/Python/PHP/Perl/Java/C#/C/C++
– Do other developers use it?
• At my company? In the area? Globally?
– Cost effective?
• Ruby/Python/PHP/Perl/C/C++
– Can it handle my architectural approach well?
• Ruby/Python/Java/C# handle just about everything
• CGI languages (PHP/Perl/C/C++) are very bad fits for frameworks, long polling, evented programming
– Will it scale?
• Yes. This is a subjective question because web servers scale horizontally naturally
– Will my boss let me use it?
• .NET shop? C#
• Java shop? Java (Groovy, Clojure, Scala), jRuby, Jython
• *nix shop? Ruby, Python, Perl, PHP, C, C++
• Probable Winners: Ruby and Python
5. What stands out about Ruby?
• Malleability
– Everything is an object
– Objects can be monkey patched
• Great for writing Domain Specific Languages
– Puppet
– Chef
– Capistrano
– Rails
"this is a string object".length

class String
  def palindrome?
    self == self.reverse
  end
end

"radar".palindrome?
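The DSL point can be made concrete with a small sketch. It assumes nothing beyond core Ruby: the class name `ServerConfig` and the keywords `host`, `port`, and `role` are hypothetical and are not taken from Puppet, Chef, Capistrano, or Rails. The idea is that `instance_eval` runs a block with `self` set to a configuration object, so bare words inside the block become method calls.

```ruby
# Hypothetical mini-DSL; every name here is illustrative only.
class ServerConfig
  attr_reader :settings

  def initialize
    @settings = { roles: [] }
  end

  def host(name)
    @settings[:host] = name
  end

  def port(number)
    @settings[:port] = number
  end

  def role(name)
    @settings[:roles] << name
  end

  # Evaluate the block with self set to a new ServerConfig,
  # so bare calls like `host "..."` reach the methods above.
  def self.define(&block)
    config = new
    config.instance_eval(&block)
    config
  end
end

config = ServerConfig.define do
  host "app1.example.com"
  port 8080
  role :web
  role :worker
end

puts config.settings.inspect
```

The block reads like a configuration file, yet it is ordinary Ruby, which is roughly why the language is popular for tools of this kind.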
6. How is monkey patching good?
• Rails adds web specific capabilities to Ruby
– " ".blank? == true
• Makes using 3rd party libraries much easier
– Aspect Oriented Development
• Not dependent on built in hooks
– Queued processing
record = Record.find(id)
record.delay.some_intense_logic
• DelayedJob
• Resque
• Sidekiq
• Stalker
– Cross integrations
Email.deliver
• MailHopper – Automatically deliver all email in the background
• Gems that specifically enhance other gems
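The `record.delay.some_intense_logic` pattern above can be sketched in plain Ruby. This is a toy illustration under stated assumptions, not how DelayedJob, Resque, or Sidekiq are implemented: the names `DelayProxy`, `JOB_QUEUE`, and `Report` are hypothetical, and real libraries persist jobs and run them in separate worker processes instead of an in-memory array.

```ruby
# Toy in-memory queue; real libraries store jobs in a database or Redis.
JOB_QUEUE = []

# Records method calls instead of running them.
class DelayProxy < BasicObject
  def initialize(target)
    @target = target
  end

  def method_missing(name, *args)
    ::JOB_QUEUE << [@target, name, args]
    nil
  end
end

# The monkey patch: every object gains a `delay` method.
class Object
  def delay
    DelayProxy.new(self)
  end
end

# Hypothetical class standing in for an ActiveRecord model.
class Report
  attr_reader :built

  def build(title)
    @built = "Report: #{title}"
  end
end

report = Report.new
report.delay.build("July")        # queued, not executed yet
built_before_worker = report.built

# Later, a "worker" drains the queue and performs the real calls.
JOB_QUEUE.each { |target, name, args| target.public_send(name, *args) }
puts report.built
```

Because `delay` is patched onto `Object`, no class has to provide a hook for queuing in advance, which is the "not dependent on built in hooks" point.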
7. How is monkey patching…bad?
• If any behavior is modified by a monkey patch there is a chance something will break
• On a positive note, if you’re writing tests and following TDD or BDD the tests should catch any problems
• On another positive note, the Ruby community is very big on testing
8. Why was Ruby created?
• Created by Yukihiro Matsumoto
• "I wanted a scripting language that was more powerful than Perl, and more object-oriented than Python."
• "I hope to see Ruby help every programmer in the world to be productive, and to enjoy programming, and to be happy. That is the primary purpose of Ruby language."
– Google Tech Talk in 2008
9. Ruby Version Manager
• cd into directory autoselects correct version of ruby and gemset
• Makes running multiple projects with multiple versions of ruby and gem dependencies on one machine dead simple
.rvmrc file
rvm rubytype-version-patch@gemset
Examples:
rvm ruby-1.9.3-p327@myproject
rvm jruby-1.7.4@myjrubyproject
rvm ree-1.8.7@oldproject
10. Bundler and Gemfile
$ bundle install
Using rake (10.0.4)
Using i18n (0.6.1)
Using multi_json (1.7.2)
Using activesupport (3.2.13)
Using builder (3.0.4)
Using activemodel (3.2.13)
Using erubis (2.7.0)
Using journey (1.0.4)
Using rack (1.4.5)
Using rack-cache (1.2)
Using rack-test (0.6.2)
…
Your bundle is complete! Use `bundle show [gemname]` to see where a bundled gem is installed.
source 'https://rubygems.org'
source 'http://gems.github.com'
# Application infrastructure
gem 'rails', '3.2.13'
gem 'devise'
gem 'simple_form'
gem 'slim'
gem 'activerecord-jdbc-adapter'
gem 'activerecord-jdbcpostgresql-adapter'
gem 'jdbc-postgres'
gem 'jruby-openssl'
gem 'jquery-rails'
gem 'torquebox', '2.3.0'
gem 'torquebox-server', '~> 2.3.0'
11. Foreman
Not Ruby specific but written in ruby
Used with Heroku
Drop in a Procfile
$ foreman start
CTRL + C to stop everything
Procfile
web: bundle exec thin start -p $PORT
worker: bundle exec rake resque:work QUEUE=*
clock: bundle exec rake resque:scheduler
12. jRuby: Why?
Ruby isn’t perfect
• Some gems can create memory leaks
– esp. if they were written with native C
• No truly parallel threads
– The Global Interpreter Lock lets only one thread run Ruby code at a time
• Everything is an object, so unnecessary processing happens when doing things like adding numbers, leading to a performance hit
13. jRuby: So how does it fix things?
I hate writing Java…but the JVM is a work of art
• Java infrastructure is virtually bulletproof
– Most mature way to deploy a web application
– Enterprisey
• JVM’s garbage collector is best of breed and eliminates the potential memory leak issues
• JVM’s Just-In-Time compiler continually optimizes code the longer it runs, making it faster
• JVM gives Ruby kernel level threading
• jRuby inspects your Ruby code to see if you’re doing anything it would prefer you didn’t…and turns it off if you’re not
– E.g. if you aren’t overloading the + operator on ints, it will convert them to basic types instead of running as objects
• Include and use very mature Java libraries directly in your Ruby code
– Significantly expands your toolbelt
– Allows easy integration into existing Java environments
14. The Sidekiq Test
Sidekiq is a multithreaded background worker that provides tremendous concurrency benefits
Creating 1,000,000 objects in 50 concurrent threads
[Charts: results for Ruby and jRuby]
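The kind of workload described above can be approximated with plain threads. The following is a rough sketch, not the benchmark behind the slide: the constants are assumptions, and the object count is scaled down so it runs quickly. Running the same file under MRI and under jRuby gives a feel for how the interpreter's threading model affects the timing.

```ruby
require "benchmark"

THREADS = 50
OBJECTS_PER_THREAD = 2_000   # assumed value, scaled down; raise it for a real comparison

counts = Array.new(THREADS, 0)

elapsed = Benchmark.realtime do
  workers = THREADS.times.map do |i|
    Thread.new do
      created = 0
      OBJECTS_PER_THREAD.times do
        Object.new
        created += 1
      end
      counts[i] = created    # each thread writes only its own slot
    end
  end
  workers.each(&:join)       # wait for every thread to finish
end

total = counts.inject(:+)
puts "#{total} objects in #{THREADS} threads: #{(elapsed * 1000).round} ms"
```

No timing is quoted here because the result depends entirely on the interpreter, its version, and the machine.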
19. Update and Clarification
• As of this posting to Slideshare, Torquebox has a mature version 3 and a prototype version 4 that operates in a “web server only” mode. Ruby is at version 2.1.0 with dramatic improvements to memory performance with forking, which allows higher concurrency.
• At this time, jRuby still wins but it’s much closer. Based on chatter from the #jruby IRC channels, major new releases of both jRuby and Torquebox are expected to dramatically improve their performance thanks to recent Java updates. The expected timeline was late 2014 last I heard.
• Independent benchmarks can be found here: http://www.techempower.com/benchmarks/#section=data-r9&hw=peak&test=json
21. What do we look for in a framework?
• Please don’t suck
– Rails does not suck
• Does it follow Model-View-Controller?
– Yes
– Since Rails 1 it’s been the standard bearer for how to do MVC on the web, copied in almost every language
• Does it help me avoid repeating myself (DRY)?
– Yes
• Is it self documenting?
– Yes, it has a set of rules that generally make most documentation unnecessary
• Is it flexible enough to bend to my application needs?
– Yes
• Do other people use it?
– Good gosh yes
• Will it work with my database?
– Yes
• Is it still going to be around in X years?
– Ruby has Rails
– Python has Django
– Groovy has Grails
– C# has ASP.NET MVC
– PHP has fragmented framework Hell (aka – who knows?)
– Java has a few major players (Struts 2, Play, etc)
23. Rails: Rack
Watch this excellent walkthrough of Rack Middleware:
http://railscasts.com/episodes/151-rack-middleware
Summary:
It’s a layer of Ruby code that passes requests into your app and sends responses back out. You can add layers to do pre/post processing on all requests prior to beginning ANY of your application code.
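The layering described in the summary can be sketched without installing anything, because a Rack application is any object that responds to `call(env)` and returns a `[status, headers, body]` triple. The middleware class `ResponseTimer` and the `X-Runtime-Ms` header below are hypothetical names chosen for illustration.

```ruby
# The innermost "application": a lambda that answers every request.
app = lambda do |env|
  [200, { "Content-Type" => "text/plain" }, ["Hello from #{env['PATH_INFO']}"]]
end

# Hypothetical middleware: wraps another app and adds a timing header.
class ResponseTimer
  def initialize(app)
    @app = app
  end

  def call(env)
    started = Time.now                        # pre-processing
    status, headers, body = @app.call(env)    # hand off to the next layer
    elapsed_ms = ((Time.now - started) * 1000).round(2)
    # post-processing: return the response with one extra header
    [status, headers.merge("X-Runtime-Ms" => elapsed_ms.to_s), body]
  end
end

stack = ResponseTimer.new(app)
status, headers, body = stack.call({ "PATH_INFO" => "/posts", "REQUEST_METHOD" => "GET" })

puts status
puts headers.inspect
puts body.join
```

In a real project the stack is assembled with `use` and `run` in a `config.ru` file, and Rails inserts its own middleware in the same way.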
24. Rails: Models / ActiveRecord
class Post < ActiveRecord::Base
  belongs_to :category
  has_many :posts_tags                   # join association needed by :through
  has_many :tags, through: :posts_tags
  validates :title, presence: true
  before_create :create_slug             # runs only when the record is first saved

  # Lambda scopes are evaluated at call time; they work in Rails 3.2
  # and are required from Rails 4 onward.
  scope :newest_first, -> { order('created_at DESC') }
  scope :active, -> { where('active = ?', true) }
  scope :newest_active, -> { newest_first.active }
  scope :search, ->(text) { where('title LIKE ?', "%#{text}%") }

  def create_slug
    # gsub replaces every space; sub would replace only the first one
    self.slug = title.downcase.squish.gsub(' ', '-')
  end
end

post = Post.new(title: 'Some title')
post.save!
# OR
post = Post.create(title: 'Some title')
post.slug # some-title
post.id # 1
post.created_at # Created datetime
post.updated_at # Updated datetime
post.title = 'New title'
post.save!

# Relations
post.tags.first
post.tags.count
post.category.name
posts = Post.includes(:tags) # Eager load
post = Post.search('some').newest_active.first
25. Rails: Migrations
class CreateInitialTables < ActiveRecord::Migration
  def up
    create_table :posts do |t|
      t.string :title
      t.text :body
      t.string :slug
      t.integer :category_id
      t.timestamps
    end
    # … create more tables…
    add_index :tags, [:name, :something], unique: true
    execute "UPDATE posts SET field = 'value' WHERE stuff = 'happens'"
  end

  def down
    drop_table :posts
  end
end

# Reversible changes can use `change` instead of up/down.
# Use one style or the other per migration (this class name is illustrative).
class AddUserIdToPosts < ActiveRecord::Migration
  def change
    add_column :posts, :user_id, :integer
  end
end

$ rake db:migrate
26. Rails: Controllers
class PostsController < ApplicationController
  before_filter :authenticate, only: :destroy

  def index # GET /posts
  end

  def new # GET /posts/new
  end

  def create # POST /posts
  end

  def show # GET /posts/:id
  end

  def edit # GET /posts/:id/edit
  end

  def update # PUT /posts/:id
  end

  def destroy # DELETE /posts/:id
  end
end

# Routes
resources :posts
# OR limit it
resources :posts, only: [:create, :new]
28. Rails: Testing with rspec
describe Post do
  describe 'a basic test' do
    subject { FactoryGirl.build(:post, title: 'Some title') }

    it 'should be valid' do
      should_not be_nil
      subject.valid?.should be_true
    end
  end

  describe 'something with a complicated dependency' do
    before do
      Post.stub(:function_to_override) { true }
    end
  end

  describe 'a test with API hits' do
    use_vcr_cassette 'all_a_twitter', record: :new_episodes
  end
end
30. How do you pronounce it?
Answer             Responses   Percentage
post-gres-q-l           2379          45%
post-gres               1611          30%
pahst-grey                24           0%
pg-sequel                 50           0%
post-gree                350           6%
postgres-sequel          574          10%
p-g                       49           0%
database                 230           4%
Total                   5267
31. What IS PostgreSQL?
• Fully ACID compliant
• Feature rich and extensible
• Fast, scalable and leverages multicore processors very well
• Enterprise class with quality corporate support options
• Free as in beer
• It’s kind of nifty
32. Laundry List of Features
• Multi-version Concurrency Control (MVCC)
• Point in Time Recovery
• Tablespaces
• Asynchronous replication
• Nested Transactions
• Online/hot backups
• Genetic query optimizer
• Multiple index types
• Write ahead logging (WAL)
• Internationalization: character sets, locale-aware sorting, case sensitivity, formatting
• Full subquery support
• Multiple index scans per query
• ANSI-SQL:2008 standard conformant
• Table inheritance
• LISTEN / NOTIFY event system
• Ability to make a PowerPoint slide run out of room
33. What are we covering today?
• Full text-search
• Built in data types
• User defined data types
• Automatic data compression
• A look at some other cool features and extensions, depending on how we’re doing on time
34. Full-text Search
• What about…?
– Solr
– Elastic Search
– Sphinx
– Lucene
– MySQL
• All have their purpose
– Distributed search of multiple document types
• Sphinx
– Client search performance is all that matters
• Solr
– Search constantly incoming data with streaming index updates
• Elastic Search excels
– You really like Java
• Lucene
– You want terrible search results that don’t even make sense to you, much less your users
• MySQL full text search = the worst thing in the world
35. Full-text Search
• Complications of stand alone search engines
– Data synchronization
• Managing deltas, index updates
• Filtering/deleting/hiding expired data
• Search server outages, redundancy
– Learning curve
– Character sets match up with my database?
– Additional hardware / servers just for search
– Can feel like a black box when you get a support question asking “why is/isn’t this showing up?”
36. Full-text Search
• But what if your needs are more like:
– Search within my database
– Avoid syncing data with outside systems
– Avoid maintaining outside systems
– Less black box, more control
37. Full-text Search
• tsvector
– The text to be searched
• tsquery
– The search query
• to_tsvector('the church is AWESOME') @@ to_tsquery(SEARCH)
• @@ to_tsquery('church') == true
• @@ to_tsquery('churches') == true
• @@ to_tsquery('awesome') == true
• @@ to_tsquery('the') == false
• @@ to_tsquery('churches & awesome') == true
• @@ to_tsquery('church & okay') == false
• to_tsvector('the church is awesome')
– 'awesom':4 'church':2
• to_tsvector('simple','the church is awesome')
– 'awesome':4 'church':2 'is':3 'the':1
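To make the tsvector / tsquery behaviour above less of a black box, here is a toy Ruby model of what the default English configuration does: lowercase the text, drop stop words, reduce words to a stem, and remember positions. Everything in it is an assumption for illustration. The stop word list is tiny, `toy_stem` is a crude stand-in for PostgreSQL's Snowball stemmer, and `toy_match?` only handles the `&` operator.

```ruby
STOP_WORDS = %w[the is a an and of].freeze

# Crude stand-in for a real stemmer: strips a few English suffixes.
def toy_stem(word)
  word.sub(/(es|s|e)\z/, "")
end

# Rough analogue of to_tsvector: stem => positions, stop words dropped.
def toy_tsvector(text)
  vector = Hash.new { |hash, key| hash[key] = [] }
  text.downcase.scan(/[a-z]+/).each_with_index do |word, index|
    next if STOP_WORDS.include?(word)
    vector[toy_stem(word)] << index + 1   # positions are 1-based
  end
  vector
end

# Rough analogue of `vector @@ to_tsquery('a & b')`: every term must be present.
def toy_match?(vector, query)
  terms = query.downcase.split("&").map(&:strip)
  terms = terms.reject { |term| STOP_WORDS.include?(term) }
  return false if terms.empty?            # a query of only stop words matches nothing
  terms.all? { |term| vector.key?(toy_stem(term)) }
end

vector = toy_tsvector("the church is AWESOME")
puts vector.inspect

["church", "churches", "the", "churches & awesome", "church & okay"].each do |query|
  puts "#{query.inspect} => #{toy_match?(vector, query)}"
end
```

The point of the model is only that both the document and the query pass through the same normalization, which is why 'churches' finds 'church' while 'the' finds nothing.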
38. Full-text Search
• ALTER TABLE mytable ADD COLUMN search_vector tsvector
• UPDATE mytable
SET search_vector = to_tsvector('english', coalesce(title,'') || ' ' || coalesce(body,'') || ' ' || coalesce(tags,''))
• CREATE INDEX search_text ON mytable USING gin(search_vector)
• SELECT some, columns, we, need
FROM mytable
WHERE search_vector @@ to_tsquery('english','Jesus & awesome')
ORDER BY ts_rank(search_vector, to_tsquery('english','Jesus & awesome')) DESC
• CREATE TRIGGER search_update BEFORE INSERT OR UPDATE
ON mytable FOR EACH ROW EXECUTE PROCEDURE
tsvector_update_trigger(search_vector, 'pg_catalog.english', title, body, tags)
39. Full-text Search
• CREATE FUNCTION search_trigger() RETURNS trigger AS $$
begin
  new.search_vector :=
    setweight(to_tsvector('english',coalesce(new.title,'')),'A') ||
    setweight(to_tsvector('english',coalesce(new.body,'')),'D') ||
    setweight(to_tsvector('english',coalesce(new.tags,'')),'B');
  return new;
end
$$ LANGUAGE plpgsql;
• CREATE TRIGGER search_vector_update
BEFORE INSERT OR UPDATE OF title, body, tags ON mytable
FOR EACH ROW EXECUTE PROCEDURE search_trigger();
40. Full-text Search
• A variety of dictionaries
– Various Languages
– Thesaurus
– Snowball, Stem, Ispell, Synonym
– Write your own
• ts_headline
– Snippet extraction and highlighting
41. Datatypes: ranges
• int4range, int8range, numrange, tsrange, tstzrange, daterange
• SELECT int4range(10,20) @> 3 == false
• SELECT numrange(11.1,22.2) && numrange(20.0,30.0) == true
• SELECT int4range(10,20) * int4range(15,25) == [15,20)
• CREATE INDEX res_index ON schedule USING gist(during)
• ALTER TABLE schedule ADD EXCLUDE USING gist (during WITH &&)
ERROR: conflicting key value violates exclusion constraint "schedule_during_excl"
DETAIL: Key (during)=([ 2010-01-01 14:45:00, 2010-01-01 15:45:00 )) conflicts with existing key (during)=([ 2010-01-01 14:30:00, 2010-01-01 15:30:00 )).
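For comparison, this is roughly what the overlap operator and the exclusion constraint check, written as application code. It is a sketch under assumptions: the `Schedule` class and `overlaps?` helper are hypothetical, and the ranges are half-open (upper bound excluded), matching the `[ … )` bounds in the error above.

```ruby
require "time"

# Half-open ranges [start, finish): two ranges overlap only if each
# starts before the other ends.
def overlaps?(a, b)
  a.begin < b.end && b.begin < a.end
end

# Hypothetical in-memory schedule that refuses overlapping bookings,
# an application-level analogue of EXCLUDE USING gist (during WITH &&).
class Schedule
  def initialize
    @bookings = []
  end

  def book(range)
    conflict = @bookings.find { |existing| overlaps?(existing, range) }
    raise ArgumentError, "conflicts with existing booking #{conflict}" if conflict
    @bookings << range
  end
end

schedule = Schedule.new
schedule.book(Time.parse("2010-01-01 14:30")...Time.parse("2010-01-01 15:30"))

rejected = false
begin
  schedule.book(Time.parse("2010-01-01 14:45")...Time.parse("2010-01-01 15:45"))
rescue ArgumentError => e
  rejected = true
  puts "Rejected: #{e.message}"
end

# Back-to-back bookings are fine because the upper bound is exclusive.
adjacent = schedule.book(Time.parse("2010-01-01 15:30")...Time.parse("2010-01-01 16:00"))
puts "Bookings stored: #{adjacent.size}"
```

A check like this in application code is open to race conditions when two processes book at the same moment; the database constraint enforces the rule for every writer, which is the argument for letting PostgreSQL do it.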
42. Datatypes: hstore
• properties
– {"author" => "John Grisham", "pages" => 535}
– {"director" => "Jon Favreau", "runtime" => 126}
• SELECT … FROM mytable
WHERE properties -> 'director' LIKE '%Favreau'
– Does not use an index
• WHERE properties ? 'author' AND properties -> 'author' LIKE '%Grisham'
– Uses an index to only check properties with an 'author'
• CREATE INDEX table_properties ON mytable USING gin(properties)
44. Datatypes: JSON
• Validate JSON structure
• Convert row to JSON
• Functions and operators very similar to hstore
45. Datatypes: XML
• Validates well-formed XML
• Stores like a TEXT field
• XML operations like XPath
• Can’t index XML column but you can index the result of an XPath function
46. Data compression with TOAST
• TOAST = The Oversized Attribute Storage Technique
• TOASTable data is automatically TOASTed
• Example:
– stored a 2.2m XML document
– storage size was 81k
47. User created datatypes
• Built in types
– Numerics, monetary, binary, time, date, interval, boolean, enumerated, geometric, network address, bit string, text search, UUID, XML, JSON, array, composite, range
– Add-ons for more, such as UPC and ISBN
• Create your own types
– Address (contains 2 streets, city, state, zip, country)
– Define how your datatype is indexed
– GIN and GiST indexes are used by custom datatypes
48. Further exploration: PostGIS
• Adds Geographic datatypes
• Distance, area, union, intersection, perimeter
• Spatial indexes
• Tools to load available geographic data
• Distance, Within, Overlaps, Touches, Equals, Contains, Crosses
• SELECT name, ST_AsText(geom)
FROM nyc_subway_stations
WHERE name = 'Broad St'
• SELECT name, boroname
FROM nyc_neighborhoods
WHERE ST_Intersects(geom,
ST_GeomFromText('POINT(583571 4506714)',26918))
• SELECT sub.name, nh.name, nh.boroname
FROM nyc_neighborhoods AS nh
JOIN nyc_subway_stations AS sub
ON ST_Contains(nh.geom, sub.geom)
WHERE sub.name = 'Broad St'
49. Further exploration: Functions
• Can be used in queries
• Can be used in stored procedures and triggers
• Can be used to build indexes
• Can be used as table defaults
• Can be written in PL/pgSQL, PL/Tcl, PL/Perl, PL/Python out of the box
• PL/V8 is available as an extension to use JavaScript
50. Further exploration: PLV8
• CREATE OR REPLACE FUNCTION plv8_test(keys text[], vals text[])
RETURNS text AS $$
  var o = {};
  for(var i = 0; i < keys.length; i++) {
    o[keys[i]] = vals[i];
  }
  return JSON.stringify(o);
$$ LANGUAGE plv8 IMMUTABLE STRICT;
SELECT plv8_test(ARRAY['name','age'],ARRAY['Tom','29']);
• CREATE TYPE rec AS (i integer, t text);
CREATE FUNCTION set_of_records() RETURNS SETOF rec AS $$
  plv8.return_next({"i": 1, "t": "a"});
  plv8.return_next({"i": 2, "t": "b"});
$$ LANGUAGE plv8;
SELECT * FROM set_of_records();
51. Further exploration: Async commands / indexes
• Fine grained control within functions
– PQsendQuery
– PQsendQueryParams
– PQsendPrepare
– PQsendQueryPrepared
– PQsendDescribePrepared
– PQgetResult
– PQconsumeInput
• Per connection asynchronous commits
– set synchronous_commit = off
• Concurrent index creation to avoid blocking large tables
– CREATE INDEX CONCURRENTLY big_index ON mytable (things)
53. Biggest Issue with Frameworks
• Framework Dependency
• Trying to do everything in application code
• Race conditions
• Package dependency
54. Old School
• Service Oriented Architecture
– Getting more popular because of REST
– Had been happening for years prior with WSDL
• Database managed your data
– Constraints, triggers, functions, stored procedures
– If it was in the database…it was valid
• Nothing has changed…this is still the best way
55. If you really leverage your database…
• You can easily break your application into
logical parts
• You don’t need to create APIs through your core code base when direct DB access is there
• You can use a different language for certain
things if it makes sense to do so
– Node.js is great for APIs
– Using a library that only runs on Windows
• Database can provide granular access controls
60. Credits / Sources
• NOTE: Some code samples in this presentation have minor alterations for presentation clarity (such as leaving out dictionary specifications on some search calls, etc)
• http://www.postgresql.org/docs/9.2/static/index.html
• http://workshops.opengeo.org/postgis-intro/
• http://stackoverflow.com/questions/15983152/how-can-i-find-out-how-big-a-large-text-field-is-in-postgres
• https://devcenter.heroku.com/articles/heroku-postgres-extensions-postgis-full-text-search
• http://railscasts.com/episodes/345-hstore?view=asciicast
• http://www.slideshare.net/billkarwin/full-text-search-in-postgresql
• http://sourceforge.net/apps/mediawiki/postgres-xc/index.php?title=Main_Page
• http://railscasts.com/episodes/151-rack-middleware
• http://joshrendek.com/2012/11/sidekiq-vs-resque/
• http://torquebox.org/news/2011/10/06/torquebox-2x-performance/
• http://jruby.org/
• https://rvm.io/
• http://ddollar.github.io/foreman/
• http://en.wikipedia.org/wiki/Ruby_(programming_language)
• http://bundler.io/
• http://www.techempower.com/benchmarks/#section=data-r9&hw=peak&test=json
Editor's Notes
This is a bad diagram because I tried to use the built in tools in power point. I need to update it. In reality the View arrow should be going back through Rack to the Browser.