Inside a modern RIA powered by Solr

                       Andrew Montalenti
                            Co-Founder &
                          Technology Lead
                      andrew@cogtree.com

Mainstream: 30,000
Blogs: 900,000 [1]

[1] From Technorati’s 2008 State of the Blogosphere
But,



What about your interests?
your interests,
your web



What is Parse.ly?
• Your unique interests
• … create a filtered, prioritized, and personalized news feed
• … built just for you!
• 120K+ news and blog sources tracked
• The most personally relevant items at the top


• Bottom line:
  You spend less time skimming headlines,
  and more time reading relevant content.
Demo!
(if possible)
Let’s pop open the hood!



The RIA
       ExtJS + jQuery
       JS REST Binding
          REST API
       django-piston
Solr                 Postgres
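
The REST API layer in this stack sits between the browser and Solr. As a rough sketch of what that layer has to do for a search request, the function below builds a Solr select URL with the standard `q`, `start`, `rows`, and `wt` parameters. The function name, host, and port are assumptions for illustration; the slides do not show the actual django-piston handlers.

```python
from urllib.parse import urlencode


def build_select_url(solr_base, query, start=0, rows=10):
    """Build a Solr select URL asking for a JSON response."""
    params = {
        "q": query,      # standard Solr query parameter
        "start": start,  # offset of the first hit to return
        "rows": rows,    # page size
        "wt": "json",    # JSON response writer
    }
    return solr_base.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical local Solr instance; the deck does not name a host.
url = build_select_url("http://localhost:8983/solr", "title:solr", rows=5)
print(url)
```

A REST handler would fetch this URL, decode the JSON, and pass the documents on to the JS REST binding in the browser.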
[Chart: per-doc processing vs. batch size; IO-bound work]
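
The chart relates batch size to per-document processing and IO-bound work. One common way to act on that trade-off is to group documents into fixed-size batches so each IO round trip carries many documents. The sketch below shows that idea only; `chunked`, `index_in_batches`, and the `send_batch` callback are hypothetical names, not code from the talk.

```python
from itertools import islice


def chunked(iterable, batch_size):
    """Yield lists of up to batch_size items from iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(iterable)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def index_in_batches(docs, send_batch, batch_size=100):
    """Call send_batch once per batch; return the number of batches sent."""
    sent = 0
    for batch in chunked(docs, batch_size):
        send_batch(batch)  # hypothetical IO-bound call, e.g. one HTTP POST
        sent += 1
    return sent


# Stand-in for a real indexing call: just record what would be sent.
received = []
batches_sent = index_in_batches(range(5), received.append, batch_size=2)
print(batches_sent, received)
```

The batch size is the tuning knob: larger batches mean fewer round trips but more memory held per request.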




Solr in the Real World
•   Storage of "canonical data"
•   Relational vs. Search Index
•   Complexity of custom relevancy scoring
•   "Near-Real-Time" updates
•   Solr in a pipeline
•   Pushing bits and marshalling cost
•   Index size, corruption, and stability
•   Administrability
Scaling Up Parse.ly
      • Custom scoring
      • Multicore
      • Distributed search
      • Celery / Disco
      • User-Article Binding Problem
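
For the distributed search bullet, Solr accepts a `shards` request parameter listing the shard addresses (host:port/path, comma-separated, without the scheme) that a query should fan out to. The sketch below only assembles such parameters; the helper name and shard hosts are assumptions, and the slides do not describe how Parse.ly actually laid out its shards.

```python
def distributed_params(query, shard_hosts, rows=10):
    """Return Solr query parameters for a search fanned out over shards."""
    if not shard_hosts:
        raise ValueError("at least one shard is required")
    return {
        "q": query,
        "rows": rows,
        "wt": "json",
        # Solr expects a comma-separated list such as host1:8983/solr,host2:8983/solr
        "shards": ",".join(shard_hosts),
    }


# Hypothetical shard addresses for illustration only.
params = distributed_params("topic:python", ["solr1:8983/solr", "solr2:8983/solr"])
print(params["shards"])
```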




Basic             Almost There       Comprehensive
solr.py           collective.solr    haystack
pysolr            solango            python-solr
solrpy
json/py output


Comprehensive. Pythonic. Solr.
• Batching
• Context Lib
• Caching / Memoization
• Multicore Proxies
• Pagination Iterators
• Web Framework: Django
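
To illustrate the pagination iterator item, here is a minimal sketch of a generator that walks a Solr result set page by page. It assumes a hypothetical `fetch_page(start, rows)` callable returning a dict shaped like the `response` section of Solr's JSON output (`numFound` and `docs`). It is not the API of any of the libraries named in this deck.

```python
def iter_results(fetch_page, rows=10):
    """Yield every document, requesting one page at a time."""
    start = 0
    while True:
        page = fetch_page(start, rows)
        docs = page["docs"]
        for doc in docs:
            yield doc
        start += len(docs)
        # Stop on an empty page or once all reported hits were seen.
        if not docs or start >= page["numFound"]:
            return


# Fake backend standing in for a real Solr request.
ALL_DOCS = [{"id": i} for i in range(5)]
calls = []


def fake_fetch(start, rows):
    calls.append(start)
    return {"numFound": len(ALL_DOCS), "docs": ALL_DOCS[start:start + rows]}


ids = [doc["id"] for doc in iter_results(fake_fetch, rows=2)]
print(ids, calls)
```

Callers loop over results as if they were one sequence, and pages are fetched only as the loop reaches them.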



powered by



Didier   Sachin
Andrew




Quick Plug

Does your company or enterprise
need our services?




Andrew Montalenti
                  andrew@cogtree.com



Twitter            Website                   Sign up now!
@amontalenti       http://parse.ly           It’s Free!

Product Twitter    Team Blog                 Promo Code
@parse_ly          http://blog.cogtree.com   SLIDES

