Exploring Ruby on Rails and PostgreSQL


An overview of Ruby, jRuby, Rails, Torquebox, and PostgreSQL that was presented as a 3-hour class to other programmers at The Ironyard (http://theironyard.com) in Greenville, SC in July of 2013. The Rails-specific sections are mostly code samples that were explained during the session, so the real focus of the slides is Ruby, "the Rails way" (workflow and differentiators), and PostgreSQL.

  • Speaker note on the "Rails: The Basics" diagram: this is a bad diagram because I tried to use the built-in tools in PowerPoint, and I need to update it. In reality the View arrow should go back through Rack to the Browser.

    1. 1. Exploring Ruby on Rails and PostgreSQL
    2. 2. Who am I? • I’m Barry Jones • Application Developer since ’98 – Java, PHP, Groovy, Ruby, Perl, Python – MySQL, PostgreSQL, SQL Server, Oracle, MongoDB • Efficiency and infrastructure nut • Believer in “right tool for the job” – There is no silver bullet, programming is about tradeoffs
    3. 3. The Silver Bullet Ruby on Rails and PostgreSQL j/k  But it’s close
    4. 4. What do we look for in a language? • Balance – Can it do what I need it to do? • Web: Ruby/Python/PHP/Perl/Java/C#/C/C++ – Efficient to develop with it? • Ruby/Python/PHP – Libraries/tools/ecosystem to avoid reinventing the wheel? • Ruby/Python/PHP/Java/Perl – Is it fast? • Ruby/Python/Java/C#/C/C++ – Is it stable? • Ruby/Python/PHP/Perl/Java/C#/C/C++ – Do other developers use it? • At my company? In the area? Globally? – Cost effective? • Ruby/Python/PHP/Perl/C/C++ – Can it handle my architectural approach well? • Ruby/Python/Java/C# handle just about everything • CGI languages (PHP/Perl/C/C++) are very bad fits for frameworks, long polling, evented programming – Will it scale? • Yes. This is a subjective question because web servers scale horizontally naturally – Will my boss let me use it? • .NET shop? C# • Java shop? Java (Groovy, Clojure, Scala), jRuby, jython • *nix shop? Ruby, Python, Perl, PHP, C, C++ • Probable Winners: Ruby and Python
    5. 5. What stands out about Ruby? • Malleability – Everything is an object – Objects can be monkey patched • Great for writing Domain Specific Languages – Puppet – Chef – Capistrano – Rails
        # Everything is an object
        "this is a string object".length

        # Monkey patching: reopen String and add a method
        class String
          def palindrome?
            self == reverse
          end
        end

        "radar".palindrome? # => true
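The slide above names DSLs as one of Ruby's strengths. As a rough sketch of why, the snippet below builds a tiny configuration DSL with `instance_eval`. The names `ServerConfig`, `server`, `port`, and `package` are hypothetical and made up for illustration; they are not taken from Puppet, Chef, Capistrano, or Rails.

```ruby
# Hypothetical mini-DSL for describing a server (illustration only).
class ServerConfig
  attr_reader :settings

  def initialize(name)
    @settings = { name: name, packages: [] }
  end

  # Each DSL "keyword" is just an ordinary method.
  def port(number)
    @settings[:port] = number
  end

  def package(name)
    @settings[:packages] << name
  end
end

# Entry point: evaluates the block with a ServerConfig as self,
# so bare calls like `port 8080` resolve to the methods above.
def server(name, &block)
  config = ServerConfig.new(name)
  config.instance_eval(&block)
  config.settings
end

web = server "web01" do
  port 8080
  package "nginx"
  package "postgresql"
end

p web
```

Because method calls need no parentheses and blocks can be re-evaluated against any object, the block reads like a configuration file while remaining plain Ruby.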
    6. 6. How is monkey patching good? • Rails adds web-specific capabilities to Ruby – " ".blank? == true • Makes using 3rd party libraries much easier – Aspect Oriented Development • Not dependent on built-in hooks – Queued processing • DelayedJob • Resque • Sidekiq • Stalker – Cross integrations • MailHopper – Automatically deliver all email in the background • Gems that specifically enhance other gems
        # Queued processing
        record = Record.find(id)
        record.delay.some_intense_logic

        # Cross integrations
        Email.deliver
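To make the `record.delay.some_intense_logic` idea concrete, here is a minimal plain-Ruby sketch of how a monkey-patched `delay` method could capture a call for later execution. It is not how DelayedJob, Resque, or Sidekiq are actually implemented; `JOB_QUEUE`, `DelayProxy`, and this stand-in `Record` class are assumptions made up for the example.

```ruby
# Hypothetical in-memory queue; real libraries persist jobs to a database or Redis.
JOB_QUEUE = []

# Records a method call instead of running it.
class DelayProxy
  def initialize(target)
    @target = target
  end

  def method_missing(name, *args)
    return super unless @target.respond_to?(name)
    JOB_QUEUE << [@target, name, args]
    nil
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

# The monkey patch: every object gains a `delay` method.
class Object
  def delay
    DelayProxy.new(self)
  end
end

# Stand-in for an ActiveRecord model.
class Record
  attr_reader :processed

  def some_intense_logic
    @processed = true
  end
end

record = Record.new
record.delay.some_intense_logic   # returns immediately; nothing has run yet
queued_before = record.processed  # still nil at this point

# A worker would normally do this in another process or thread.
JOB_QUEUE.each { |target, name, args| target.public_send(name, *args) }
```

The calling code never changes shape: inserting `.delay` into the call chain is the whole integration, which is the convenience the slide is pointing at.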
    7. 7. How is monkey patching…bad? • If any behavior is modified by a monkey patch there is a chance something will break • On a positive note, if you’re writing tests and following TDD or BDD, the tests should catch any problems • On another positive note, the Ruby community is very big on testing
    8. 8. Why was Ruby created? • Created by Yukihiro Matsumoto • "I wanted a scripting language that was more powerful than Perl, and more object-oriented than Python.” • "I hope to see Ruby help every programmer in the world to be productive, and to enjoy programming, and to be happy. That is the primary purpose of Ruby language.” – Google Tech Talk in 2008
    9. 9. Ruby Version Manager • cd into a directory autoselects the correct version of Ruby and gemset • Makes running multiple projects with multiple versions of Ruby and gem dependencies on one machine dead simple
        .rvmrc file:
          rvm rubytype-version-patch@gemset
        Examples:
          rvm ruby-1.9.3-p327@myproject
          rvm jruby-1.7.4@myjrubyproject
          rvm ree-1.8.7@oldproject
    10. 10. Bundler and Gemfile
        $ bundle install
        Using rake (10.0.4)
        Using i18n (0.6.1)
        Using multi_json (1.7.2)
        Using activesupport (3.2.13)
        Using builder (3.0.4)
        Using activemodel (3.2.13)
        Using erubis (2.7.0)
        Using journey (1.0.4)
        Using rack (1.4.5)
        Using rack-cache (1.2)
        Using rack-test (0.6.2)
        …
        Your bundle is complete! Use `bundle show [gemname]` to see where a bundled gem is installed.

        Gemfile:
          source 'https://rubygems.org'
          source 'http://gems.github.com'

          # Application infrastructure
          gem 'rails', '3.2.13'
          gem 'devise'
          gem 'simple_form'
          gem 'slim'
          gem 'activerecord-jdbc-adapter'
          gem 'activerecord-jdbcpostgresql-adapter'
          gem 'jdbc-postgres'
          gem 'jruby-openssl'
          gem 'jquery-rails'
          gem 'torquebox', '2.3.0'
          gem 'torquebox-server', '~> 2.3.0'
    11. 11. Foreman • Not Ruby specific but written in Ruby • Used with Heroku • Drop in a Procfile • $ foreman start • CTRL + C to stop everything
        Procfile:
          web: bundle exec thin start -p $PORT
          worker: bundle exec rake resque:work QUEUE=*
          clock: bundle exec rake resque:scheduler
    12. 12. jRuby: Why? Ruby isn’t perfect • Some gems can create memory leaks – esp. if they were written with native C extensions • No parallel thread execution – the Global Interpreter Lock (GIL) lets only one thread run Ruby code at a time • “Everything is an object” means unnecessary processing happens when doing things like adding numbers, leading to a performance hit
    13. 13. jRuby: So how does it fix things? I hate writing Java…but the JVM is a work of art • Java infrastructure is virtually bulletproof – Most mature way to deploy a web application – Enterprisey • JVM’s garbage collector is best of breed and eliminates the potential memory leak issues • JVM’s Just-In-Time compiler continually optimizes code the longer it runs, making it faster • JVM gives Ruby kernel-level threads that can run in parallel • jRuby inspects your Ruby code to see if you’re doing anything it would prefer you didn’t…and turns it off if you’re not – E.g. if you aren’t overloading the + operator on integers, it will convert them to primitive types instead of running them as objects • Include and use very mature Java libraries directly in your Ruby code – Significantly expands your toolbelt – Allows easy integration into existing Java environments
    14. 14. The Sidekiq Test Sidekiq is a multithreaded background worker that provides tremendous concurrency benefits Creating 1,000,000 objects in 50 concurrent threads Ruby jRuby
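The charts from this slide did not survive in the transcript. As a rough stand-in, the script below sketches the kind of workload the slide describes (allocating objects from 50 concurrent threads) so it can be timed under both MRI and jRuby. It is not the original benchmark; the per-thread count is an assumption and is scaled far down so the script finishes quickly.

```ruby
require "benchmark"

THREADS = 50
OBJECTS_PER_THREAD = 2_000 # assumption: scaled down from the slide's 1,000,000

total = 0
elapsed = Benchmark.realtime do
  threads = THREADS.times.map do
    Thread.new do
      created = 0
      OBJECTS_PER_THREAD.times do
        Object.new # the allocation being measured
        created += 1
      end
      created # becomes the thread's value
    end
  end
  # Thread#value joins the thread and returns the block's result.
  total = threads.map(&:value).sum
end

puts "#{total} objects in #{THREADS} threads: #{elapsed.round(3)}s on #{RUBY_ENGINE} #{RUBY_VERSION}"
```

Running the same file with `ruby` and with `jruby` gives a like-for-like comparison; raise `OBJECTS_PER_THREAD` to make the difference between a GIL-bound runtime and JVM threads more visible.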
    15. 15. The App Server Test CPU Usage
    16. 16. The App Server Test Free Memory
    17. 17. The App Server Test Latency
    18. 18. The App Server Test Throughput
    19. 19. Update and Clarification • As of this posting to Slideshare, Torquebox has a mature version 3 and a prototype version 4 that operates in a “web server only” mode. Ruby is at version 2.1.0, with dramatic improvements to memory performance when forking, which allows higher concurrency. • At this time, jRuby still wins but it’s much closer. Based on chatter from the #jruby IRC channels, major new releases of both jRuby and Torquebox are expected to dramatically improve their performance thanks to recent Java updates. The expected timeline was late 2014 last I heard. • Independent benchmarks can be found here: http://www.techempower.com/benchmarks/#section=data-r9&hw=peak&test=json
    20. 20. RUBY ON RAILS Let’s take a break before covering…
    21. 21. What do we look for in a framework? • Please don’t suck – Rails does not suck • Does it follow Model-View-Controller? – Yes – Since Rails 1 it’s been the standard bearer for how to do MVC on the web, copied in almost every language • Does it help me avoid repeating myself (DRY)? – Yes • Is it self documenting? – Yes, it has a set of rules that generally make most documentation unnecessary • Is it flexible enough to bend to my application needs? – Yes • Do other people use it? – Good gosh yes • Will it work with my database? – Yes • Is it still going to be around in X years? – Ruby has Rails – Python has Django – Groovy has Grails – C# has MVC – PHP has fragmented framework Hell (aka – who knows?) – Java has a few major players (Struts 2, Play, etc)
    22. 22. Rails: The Basics Browser Rack Router Controller + Models View
    23. 23. Rails: Rack Watch this excellent walkthrough of Rack Middleware: http://railscasts.com/episodes/151-rack-middleware Summary: It’s a layer of Ruby code that passes requests into your app and sends responses back out. You can add layers to do pre/post processing on all requests prior to beginning ANY of your application code.
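The summary above can be sketched without Rails or even the rack gem, because the Rack interface is just an object that responds to `call(env)` and returns `[status, headers, body]`. The `Timer` middleware and the `x-runtime` header below are hypothetical examples, not something the slides define.

```ruby
# The innermost "application": a lambda satisfies the Rack interface.
app = lambda do |env|
  [200, { "content-type" => "text/plain" }, ["Hello from #{env['PATH_INFO']}"]]
end

# Hypothetical middleware: wraps another app and runs code before and after it.
class Timer
  def initialize(app)
    @app = app
  end

  def call(env)
    started = Time.now                               # pre-processing
    status, headers, body = @app.call(env)           # pass the request inward
    headers["x-runtime"] = (Time.now - started).to_s # post-processing
    [status, headers, body]
  end
end

# Layers are stacked by wrapping: the outermost layer sees the request first.
stack = Timer.new(app)

env = { "REQUEST_METHOD" => "GET", "PATH_INFO" => "/posts" }
status, headers, body = stack.call(env)
puts status, headers.inspect, body.inspect
```

In a real application the web server builds `env` from the HTTP request, and Rails itself sits at the bottom of a stack of layers shaped like `Timer`.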
    24. 24. Rails: Models / ActiveRecord
        class Post < ActiveRecord::Base
          belongs_to :category
          has_many :posts_tags                 # join model required by the :through below
          has_many :tags, through: :posts_tags

          validates :title, presence: true
          before_create :create_slug           # only runs when the record is first created

          # Rails 3.2 scope syntax
          scope :newest_first, order('created_at DESC')
          scope :active, where('active = ?', true)
          scope :newest_active, newest_first.active
          scope :search, lambda { |text| where('title LIKE ?', "%#{text}%") }

          def create_slug
            self.slug = title.downcase.squish.gsub(' ', '-')
          end
        end

        post = Post.new(title: 'Some title')
        post.save!
        # OR
        post = Post.create(title: 'Some title')

        post.slug       # some-title
        post.id         # 1
        post.created_at # Created datetime
        post.updated_at # Updated datetime

        post.title = 'New title'
        post.save!

        # Relations
        post.tags.first
        post.tags.count
        post.category.name

        posts = Post.includes(:tags) # Eager load
        post = Post.search('some').newest_active.first
    25. 25. Rails: Migrations
        class CreateInitialTables < ActiveRecord::Migration
          def up
            create_table :posts do |t|
              t.string :title
              t.text :body
              t.string :slug
              t.integer :category_id
              t.timestamps
            end
            # … create more tables…
            add_index :tags, [:name, :something], unique: true
            execute "UPDATE posts SET field = 'value' WHERE stuff = 'happens'"
          end

          def down
            drop_table :posts
          end

          # Alternative style: a migration normally defines either up/down or change
          def change
            add_column :posts, :user_id, :integer
          end
        end

        $ rake db:migrate
    26. 26. Rails: Controllers
        class PostsController < ApplicationController
          before_filter :authenticate, only: :destroy

          def index
            # GET /posts
          end

          def new
            # GET /posts/new
          end

          def create
            # POST /posts
          end

          def show
            # GET /posts/:id
          end

          def edit
            # GET /posts/:id/edit
          end

          def update
            # PUT /posts/:id
          end

          def destroy
            # DELETE /posts/:id
          end
        end

        # Routes
        resources :posts
        # OR limit it
        resources :posts, only: [:create, :new]
    27. 27. Rails: Views
        /app/views
          /layouts
            /application.html.erb
          /posts
            /new.html.slim
            /new.json.rabl
            /index.xml.erb
            /_widget.html.erb

        # slim example
        .post
          h2=post.title
          .body.grid-8=post.body

        # erb example
        <div class="post">
          <h2><%=post.title%></h2>
          <div class="body grid-8">
            <%=post.body%>
          </div>
        </div>
    28. 28. Rails: Testing with rspec
        describe Post do
          describe 'a basic test' do
            subject { FactoryGirl.build(:post, title: 'Some title') }

            it 'should be valid' do
              should_not be_nil
              subject.valid?.should be_true
            end
          end

          describe 'something with a complicated dependency' do
            before do
              Post.stub(:function_to_override) { true }
            end
          end

          describe 'a test with API hits' do
            use_vcr_cassette 'all_a_twitter', record: :new_episodes
          end
        end
    29. 29. POSTGRESQL Let’s take a break before we talk about…
    30. 30. How do you pronounce it?
        Answer             Response   Percentage
        post-gres-q-l          2379          45%
        post-gres              1611          30%
        pahst-grey               24           0%
        pg-sequel                50           0%
        post-gree               350           6%
        postgres-sequel         574          10%
        p-g                      49           0%
        database                230           4%
        Total                  5267
    31. 31. What IS PostgreSQL? • Fully ACID compliant • Feature rich and extensible • Fast, scalable and leverages multicore processors very well • Enterprise class with quality corporate support options • Free as in beer • It’s kind of nifty
    32. 32. Laundry List of Features • Multi-version Concurrency Control (MVCC) • Point in Time Recovery • Tablespaces • Asynchronous replication • Nested Transactions • Online/hot backups • Genetic query optimizer • Multiple index types • Write ahead logging (WAL) • Internationalization: character sets, locale-aware sorting, case sensitivity, formatting • Full subquery support • Multiple index scans per query • ANSI-SQL:2008 standard conformant • Table inheritance • LISTEN / NOTIFY event system • Ability to make a PowerPoint slide run out of room
    33. 33. What are we covering today? • Full text-search • Built in data types • User defined data types • Automatic data compression • A look at some other cool features and extensions, depending how we’re doing on time
    34. 34. Full-text Search • What about…? – Solr – Elastic Search – Sphinx – Lucene – MySQL • All have their purpose – Distributed search of multiple document types • Sphinx – Client search performance is all that matters • Solr – Search constantly incoming data with streaming index updates • Elastic Search – You really like Java • Lucene – You want terrible search results that don’t even make sense to you, much less your users • MySQL full text search = the worst thing in the world
    35. 35. Full-text Search • Complications of stand alone search engines – Data synchronization • Managing deltas, index updates • Filtering/deleting/hiding expired data • Search server outages, redundancy – Learning curve – Character sets match up with my database? – Additional hardware / servers just for search – Can feel like a black box when you get a support question asking “why is/isn’t this showing up?”
    36. 36. Full-text Search • But what if your needs are more like: – Search within my database – Avoid syncing data with outside systems – Avoid maintaining outside systems – Less black box, more control
    37. 37. Full-text Search • tsvector – The text to be searched • tsquery – The search query
        to_tsvector('the church is AWESOME') @@ to_tsquery(SEARCH)
          @@ to_tsquery('church') == true
          @@ to_tsquery('churches') == true
          @@ to_tsquery('awesome') == true
          @@ to_tsquery('the') == false
          @@ to_tsquery('churches & awesome') == true
          @@ to_tsquery('church & okay') == false
        to_tsvector('the church is awesome')
          – 'awesom':4 'church':2
        to_tsvector('simple', 'the church is awesome')
          – 'awesome':4 'church':2 'is':3 'the':1
    38. 38. Full-text Search
        ALTER TABLE mytable ADD COLUMN search_vector tsvector;

        UPDATE mytable SET search_vector =
          to_tsvector('english', coalesce(title,'') || ' ' || coalesce(body,'') || ' ' || coalesce(tags,''));

        CREATE INDEX search_text ON mytable USING gin(search_vector);

        SELECT some, columns, we, need
          FROM mytable
          WHERE search_vector @@ to_tsquery('english', 'Jesus & awesome')
          ORDER BY ts_rank(search_vector, to_tsquery('english', 'Jesus & awesome')) DESC;

        -- The built-in trigger function expects a schema-qualified configuration name
        CREATE TRIGGER search_update BEFORE INSERT OR UPDATE ON mytable
          FOR EACH ROW EXECUTE PROCEDURE
          tsvector_update_trigger(search_vector, 'pg_catalog.english', title, body, tags);
    39. 39. Full-text Search
        CREATE FUNCTION search_trigger() RETURNS trigger AS $$
        begin
          new.search_vector :=
            setweight(to_tsvector('english', coalesce(new.title,'')), 'A') ||
            setweight(to_tsvector('english', coalesce(new.body,'')), 'D') ||
            setweight(to_tsvector('english', coalesce(new.tags,'')), 'B');
          return new;
        end
        $$ LANGUAGE plpgsql;

        CREATE TRIGGER search_vector_update
          BEFORE INSERT OR UPDATE OF title, body, tags ON mytable
          FOR EACH ROW EXECUTE PROCEDURE search_trigger();
    40. 40. Full-text Search • A variety of dictionaries – Various Languages – Thesaurus – Snowball, Stem, Ispell, Synonym – Write your own • ts_headline – Snippet extraction and highlighting
    41. 41. Datatypes: ranges • int4range, int8range, numrange, tsrange, tstzrange, daterange
        SELECT int4range(10,20) @> 3 == false
        SELECT numrange(11.1,22.2) && numrange(20.0,30.0) == true
        SELECT int4range(10,20) * int4range(15,25) == [15,20)
        CREATE INDEX res_index ON schedule USING gist(during)
        ALTER TABLE schedule ADD EXCLUDE USING gist (during WITH &&)
        Inserting a row that overlaps an existing one then fails:
          ERROR: conflicting key value violates exclusion constraint "schedule_during_excl"
          DETAIL: Key (during)=([ 2010-01-01 14:45:00, 2010-01-01 15:45:00 )) conflicts with existing key (during)=([ 2010-01-01 14:30:00, 2010-01-01 15:30:00 )).
    42. 42. Datatypes: hstore • properties – {"author" => "John Grisham", "pages" => 535} – {"director" => "Jon Favreau", "runtime" => 126}
        SELECT … FROM mytable WHERE properties -> 'director' LIKE '%Favreau'
          – Does not use an index
        WHERE properties ? 'author' AND properties -> 'author' LIKE '%Grisham'
          – Uses an index to only check properties with an 'author'
        CREATE INDEX table_properties ON mytable USING gin(properties)
    43. 43. Datatypes: arrays
        CREATE TABLE sal_emp (name text, pay_by_quarter integer[], schedule text[][])
        CREATE TABLE tictactoe (squares integer[3][3])
        INSERT INTO tictactoe VALUES ('{{1,2,3},{4,5,6},{7,8,9}}')
        SELECT squares[1:2][1:1] FROM tictactoe == {{1},{4}}
        SELECT squares[2:3][2:3] FROM tictactoe == {{5,6},{8,9}}
    44. 44. Datatypes: JSON • Validate JSON structure • Convert row to JSON • Functions and operators very similar to hstore
    45. 45. Datatypes: XML • Validates well-formed XML • Stores like a TEXT field • XML operations like XPath • Can’t index an XML column, but you can index the result of an XPath function
    46. 46. Data compression with TOAST • TOAST = The Oversized Attribute Storage Technique • TOASTable data is automatically TOASTed • Example: – stored a 2.2m XML document – storage size was 81k
    47. 47. User created datatypes • Built in types – Numerics, monetary, binary, time, date, interval, boolean, enumerated, geometric, network address, bit string, text search, UUID, XML, JSON, array, composite, range – Add-ons for more such as UPC, ISBN and more • Create your own types – Address (contains 2 streets, city, state, zip, country) – Define how your datatype is indexed – GIN and GiST indexes are used by custom datatypes
    48. 48. Further exploration: PostGIS • Adds Geographic datatypes • Distance, area, union, intersection, perimeter • Spatial indexes • Tools to load available geographic data • Distance, Within, Overlaps, Touches, Equals, Contains, Crosses
        SELECT name, ST_AsText(geom)
          FROM nyc_subway_stations
          WHERE name = 'Broad St'

        SELECT name, boroname
          FROM nyc_neighborhoods
          WHERE ST_Intersects(geom, ST_GeomFromText('POINT(583571 4506714)', 26918))

        SELECT sub.name, nh.name, nh.boroname
          FROM nyc_neighborhoods AS nh
          JOIN nyc_subway_stations AS sub ON ST_Contains(nh.geom, sub.geom)
          WHERE sub.name = 'Broad St'
    49. 49. Further exploration: Functions • Can be used in queries • Can be used in stored procedures and triggers • Can be used to build indexes • Can be used as table defaults • Can be written in PL/pgSQL, PL/Tcl, PL/Perl, PL/Python out of the box • PL/V8 is available as an extension to use JavaScript
    50. 50. Further exploration: PLV8
        CREATE OR REPLACE FUNCTION plv8_test(keys text[], vals text[]) RETURNS text AS $$
          var o = {};
          for (var i = 0; i < keys.length; i++) {
            o[keys[i]] = vals[i];
          }
          return JSON.stringify(o);
        $$ LANGUAGE plv8 IMMUTABLE STRICT;

        SELECT plv8_test(ARRAY['name','age'], ARRAY['Tom','29']);

        CREATE TYPE rec AS (i integer, t text);
        CREATE FUNCTION set_of_records() RETURNS SETOF rec AS $$
          plv8.return_next({"i": 1, "t": "a"});
          plv8.return_next({"i": 2, "t": "b"});
        $$ LANGUAGE plv8;

        SELECT * FROM set_of_records();
    51. 51. Further exploration: Async commands / indexes • Fine grained control within functions – PQsendQuery – PQsendQueryParams – PQsendPrepare – PQsendQueryPrepared – PQsendDescribePrepared – PQgetResult – PQconsumeInput • Per connection asynchronous commits – set synchronous_commit = off • Concurrent index creation to avoid blocking large tables – CREATE INDEX CONCURRENTLY big_index ON mytable (things)
    52. 52. ARCHITECTURE And finally…
    53. 53. Biggest Issue with Frameworks • Framework Dependency • Trying to do everything in application code • Race conditions • Package dependency
    54. 54. Old School • Service Oriented Architecture – Getting more popular because of REST – Had been happening for years prior with WSDL • Database managed your data – Constraints, triggers, functions, stored procedures – If it was in the database…it was valid • Nothing has changed…this is still the best way
    55. 55. If you really leverage your database… • You can easily break your application into logical parts • You don’t need to create APIs through your core code base when direct DB access is available • You can use a different language for certain things if it makes sense to do so – Node.js is great for APIs – Using a library that only runs on Windows • Database can provide granular access controls
    56. 56. Architecture: Before
    57. 57. Architecture: After
    58. 58. Architecture: Scaled
    59. 59. THANKS!
    60. 60. Credits / Sources • NOTE: Some code samples in this presentation have minor alterations for presentation clarity (such as leaving out dictionary specifications on some search calls, etc) • http://www.postgresql.org/docs/9.2/static/index.html • http://workshops.opengeo.org/postgis-intro/ • http://stackoverflow.com/questions/15983152/how-can-i-find-out-how-big-a-large-text-field-is-in-postgres • https://devcenter.heroku.com/articles/heroku-postgres-extensions-postgis-full-text-search • http://railscasts.com/episodes/345-hstore?view=asciicast • http://www.slideshare.net/billkarwin/full-text-search-in-postgresql • http://sourceforge.net/apps/mediawiki/postgres-xc/index.php?title=Main_Page • http://railscasts.com/episodes/151-rack-middleware • http://joshrendek.com/2012/11/sidekiq-vs-resque/ • http://torquebox.org/news/2011/10/06/torquebox-2x-performance/ • http://jruby.org/ • https://rvm.io/ • http://ddollar.github.io/foreman/ • http://en.wikipedia.org/wiki/Ruby_(programming_language) • http://bundler.io/ • http://www.techempower.com/benchmarks/#section=data-r9&hw=peak&test=json