Using Solr in Online Travel Shopping to Improve User Experience - By Esteban Donato and Sudhakara Karegowdra

See conference video - http://www.lucidimagination.com/devzone/events/conferences/revolution/2011

In this talk we present three use cases of Solr in the travel industry. First,
we describe how we implemented faceted navigation for hotel shopping. Then we
introduce how we implemented destination search features such as auto-complete and
misspelling correction. Lastly, we show how we integrated Solr to provide a better experience for mobile
users.

Transcript

  • 1. Using Solr in Online Travel to Improve User Experience
    Sudhakar Karegowdra, Esteban Donato
    Travelocity, May 25th, 2011
    {sudhakar.karegowdra, esteban.donato}@travelocity.com
  • 2. What We Will Cover
    Travelocity
    Speakers Background
    Merchandising & Solr
      Challenges
      Solution
      Sizing and performance data
      Take Away
    Location Resolution & Solr
      Challenges
      Solution
      Sizing and performance data
      Take Away
    Q&A
  • 3. First online travel agency (OTA), launched in 1996
    Has grown to 3,000 employees and is one of the largest travel agencies worldwide
    Headquartered in Dallas/Fort Worth, with satellite offices in San Francisco, New York, London, Singapore, Bangalore, and Buenos Aires, to name a few
    In 2004, the Roaming Gnome became the centerpiece of marketing efforts and has become an international pop icon
    Owned by Sabre Holdings; sister companies include Travelocity Business, IgoUgo.com, lastminute.com, and Zuji, among others
  • 4. Speakers Background
    Sudhakar Karegowdra
    Principal Architect
    Travelocity.com
    My experience
    13+ years
    Solr/Lucene: 3 years
    Implementing Hadoop, Pig and Hive for data warehouse.
    Topic: Merchandising
    Esteban Donato
    Lead Architect
    Travelocity.com
    My experience
    10+ years
    Solr: 2 years
    Analyzing Mahout and Carrot2 for document clustering engine.
    Topic: Location Resolution
  • 10. Merchandising
    By Sudhakar Karegowdra
  • 11. The Challenge
    Market Drivers
    Build Landing Pages with Faceted Navigation
    Enable Content Segmentation and delivery
    Support Roll out of Promotions
    Roll up Data to a higher level
    E.g., "All 5-star hotels in California" brings together the 5-star hotels from SFO, LAX, SAN, etc.
    Faster time to market for new ideas
    Rapidly scale to accommodate global brands with disparate data sources
  • 12. The Challenge
    Traditional Database approach
    Longer time to market
    Specialized skill set needed to design and optimize database structures and queries
    Aggregating data and changing structures is quite complex
    Building faceted navigation capabilities needs complex logic, leading to high maintenance cost
  • 13. Solution - Overview
    Data from various sources aggregated and ingested into Solr
    Core per Locale and Product Type
    Wrapper service to combine some data across product cores and manage configuration rules
    Solr’s built-in search and faceting power the navigation
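As a rough illustration of how Solr's faceting can drive a landing page, the sketch below builds a `/select` request that filters on the selected facet values and asks for counts on the remaining facets. The request parameters (`q`, `fq`, `facet`, `facet.field`, `facet.mincount`, `rows`, `wt`) are standard Solr parameters; the host, the core name `hotels_en_US`, and the field names are assumptions made for the example, not the names used at Travelocity.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, core, filters, facet_fields, rows=10):
    """Build a Solr /select URL that filters on `filters` and facets on `facet_fields`."""
    params = [
        ("q", "*:*"),             # match everything; the filters do the narrowing
        ("rows", str(rows)),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", "1"),  # hide facet values with no matching documents
    ]
    # One fq per selected value, so Solr can cache each filter separately.
    for field, value in filters.items():
        params.append(("fq", f'{field}:"{value}"'))
    # One facet.field per navigation block shown on the page.
    for field in facet_fields:
        params.append(("facet.field", field))
    return f"{base_url}/{core}/select?{urlencode(params)}"


# Hypothetical core (one per locale and product type) and hypothetical field names.
url = build_facet_query(
    "http://localhost:8983/solr",
    "hotels_en_US",
    {"state": "CA", "star_rating": "5"},
    ["city", "amenities"],
)
print(url)
```

Keeping each selected value in its own `fq` means a filter such as `star_rating:"5"` can be reused from Solr's filter cache across many different landing pages.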
  • 14. Solution – Architecture View
    UI
    Widgets
    Mobile
    Services/Business Logic
    Solr Slaves (Multi Core)
    Solr Master (Multi Core)
    Offer Management Tool
    Oracle
    ETL
    Products
    Deals
    ……
  • 15. Solution - Achievements
    Millions of unique Long Tail Landing Pages
    E.g., http://www.travelocity.com/hotel-d4980-nevada-las-vegas-hotels_5-star_business-center_green
    Faster search across products
    E.g., Beach Deals under $500
    Segmented Content delivery through tagging
    Scaled well to distribute the content to different brands, partners and advertisers
    Opened up for other innovative applications
    Deals on Map, Deals on Mobile, Wizards, etc.
  • 16. Solution – Road Ahead
    Migration to Solr 3.1
    Geo spatial search
    CSV output format
    Query boosting by search pattern
    Near real-time updates
    Deal and user behavior mining in Hadoop (MapReduce), with Solr serving the content
    Move slaves to the cloud
  • 17. Sizing & Performance
    Index Stats
    Number of Cores : 25
    Number of Documents : ~ 1 Million Records
    Response
    Requests : 70 tps
    Average response time : 0.005 seconds (5 ms)
    Software Versions
    Solr Version 1.4.0
    filterCache size : 30000
    Tomcat 5.5.9
    JDK 1.6
  • 18. Take Away
    Semi Structured Storage in Solr helps aggregate disparate sources easily
    Remember Dynamic fields
    Multiple Cores to manage multiple locale data
    Solr is a great enabler of “Innovations”
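The point about dynamic fields can be sketched as follows: records from different sources carry different attributes, and a type suffix on each field name lets them all be indexed without a schema change per source. The suffixes `_s`, `_i`, `_f` and `_b` follow the dynamic field patterns in Solr's example schema and are an assumption here, as are the sample records; the actual Travelocity schema is not shown in the slides.

```python
def to_solr_doc(record, id_field="id"):
    """Map an arbitrary source record to a Solr document using dynamic-field suffixes."""
    doc = {"id": str(record[id_field])}
    for key, value in record.items():
        if key == id_field:
            continue
        # bool must be tested before int, because bool is a subclass of int.
        if isinstance(value, bool):
            suffix = "_b"
        elif isinstance(value, int):
            suffix = "_i"
        elif isinstance(value, float):
            suffix = "_f"
        else:
            suffix, value = "_s", str(value)
        doc[key + suffix] = value
    return doc


# Two hypothetical sources with different shapes end up in the same index.
hotel = {"id": 4980, "name": "Example Hotel", "stars": 5, "price": 129.0, "green": True}
deal = {"id": "d-17", "theme": "beach", "max_price": 500}

print(to_solr_doc(hotel))
print(to_solr_doc(deal))
```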
  • 19. Location Resolution
    By Esteban Donato
  • 20. The Challenge
    How to develop a global location resolution service?
    Flexibility to change
    General enough to cover everyone's needs
    Multi-language
    Performance and scalability
    Configurable by site
  • 21. Architecture of the solution
    Solr Slave
    Auto-complete
    Resolution
    • Master/Slave architecture
    • Multi-core: each core represents a language
    • Remote Streaming indexing
    • CSV format
    Solr Master
    Location DB
    Batch Job
    Management Tool
    • SolrJ client binary format
    • Solr response cache
  • Auto-complete
    The system has to suggest options as users type their desired location
    Examples: “san” => San Francisco, “veg” => Las Vegas
    Relevancy: not all locations are equally important. “par” => “Paris, France”; “Parana, Argentina”
    Users can search by various fields: location code, location name, city code, city name, state/province code, state/province name, country code, country name.
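The auto-complete behavior described on this slide can be approximated in memory: normalize the input, match it as a prefix of any token in the location name, and order the matches by importance. This is only a sketch of the idea, not the Solr implementation; the rank values, the sample locations, and the token-splitting pattern are assumptions.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents, and drop commas and periods."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return re.sub(r"[,.]", "", without_accents)


def tokens(text):
    """Split a normalized name on slashes, hyphens and whitespace (assumed pattern)."""
    return [t for t in re.split(r"[/\-\s]+", normalize(text)) if t]


# Hypothetical data: (display name, rank), where a higher rank is more important.
LOCATIONS = [
    ("Paris, France", 100),
    ("Paraná, Argentina", 20),
    ("Las Vegas, NV", 90),
    ("San Francisco, CA", 95),
]


def suggest(prefix, locations=LOCATIONS, limit=5):
    """Return names with a token starting with `prefix`, most important first."""
    p = normalize(prefix).strip()
    matches = [
        (rank, name)
        for name, rank in locations
        if any(tok.startswith(p) for tok in tokens(name))
    ]
    matches.sort(key=lambda m: (-m[0], m[1]))
    return [name for _, name in matches[:limit]]


print(suggest("par"))  # Paris is listed before Paraná because of its higher rank
print(suggest("veg"))  # matches the second token of "Las Vegas"
```

This sketch only handles single-token prefixes; an input such as "san f" would need matching across token boundaries.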
  • 26. Solr schema
    <dynamicField name="RANK*" type="int" required="false" indexed="true" stored="true" />
    <field name="GLS_FULL_SEARCH" type="glsSearchField" required="false" indexed="true" stored="false" multiValued="true"/>
    <fieldType name="glsSearchField" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.PatternTokenizerFactory" pattern="[/-t ]+" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.TrimFilterFactory" />
        <filter class="solr.ISOLatin1AccentFilterFactory" />
        <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
        <filter class="solr.PatternReplaceFilterFactory" pattern="[,.]" replacement="" replace="all"/>
      </analyzer>
    </fieldType>
  • 27. Resolution
    The system has to resolve the location requested by the user.
    Handles aliases. Big Apple => New York
    Handles ambiguities.
    Handles misspellings. Lomdon => London
    NGramDistance algorithm.
    How to combine distance with relevancy
    Problem: suggesting the correct location when the input is only a prefix. Lond => London
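One way to think about combining string distance with relevancy is a weighted score. The sketch below uses a padded-bigram Dice coefficient as a simple stand-in for Lucene's NGramDistance (it is not the same algorithm), and blends it with a popularity value. The candidate list, the popularity values, the weight `alpha`, and the threshold are all assumptions for illustration.

```python
def bigrams(word):
    """Character bigrams of the lowercased word, padded with a space on each side."""
    padded = f" {word.lower()} "
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def dice(a, b):
    """Dice coefficient over bigram multisets: 1.0 for identical strings, 0.0 for disjoint ones."""
    grams_a, remaining = bigrams(a), bigrams(b)
    total = len(grams_a) + len(remaining)
    overlap = 0
    for gram in grams_a:
        if gram in remaining:
            remaining.remove(gram)  # each bigram in b can be matched only once
            overlap += 1
    return 2 * overlap / total


# Hypothetical candidates: name -> popularity in [0, 1].
CANDIDATES = {"London": 1.0, "Loudon": 0.1, "Lisbon": 0.7}


def resolve(query, candidates=CANDIDATES, alpha=0.8, min_similarity=0.5):
    """Return the candidate with the best blend of similarity and popularity, or None."""
    best_name, best_score = None, 0.0
    for name, popularity in candidates.items():
        similarity = dice(query, name)
        if similarity < min_similarity:
            continue  # too different to be a plausible misspelling
        score = alpha * similarity + (1 - alpha) * popularity
        if score > best_score:
            best_name, best_score = name, score
    return best_name


print(resolve("Lomdon"))
```

In this toy data "London" and "Loudon" are equally similar to "Lomdon", so the popularity term decides between them, which is the kind of trade-off the slide raises.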
  • 28. Spellchecker configuration
    <fieldType name="spellcheckType" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.TrimFilterFactory" />
        <filter class="solr.ISOLatin1AccentFilterFactory" />
        <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
        <filter class="solr.PatternReplaceFilterFactory" pattern="[,.]" replacement="" replace="all"/>
      </analyzer>
    </fieldType>
  • 29. Sizing & Performance
    4 cores with ~ 500,000 documents indexed each
    Response times
    Auto-complete: 15ms, 20 TPS
    Resolution: 10ms, 2 TPS
    Cache configuration
    queryResultCache: maxSize=1024
    documentCache: maxSize=1024
    fieldValueCache & filterCache disabled
  • 30. Wrap Up
    Performance is always the top priority
    Develop simple but robust services
    Provide a simple API
  • 31. Q&A
  • 32. Contact
    Esteban Donato
    Esteban.donato@travelocity.com
    Twitter: @eddonato
    Sudhakar Karegowdra
    Sudhakar.karegowdra@travelocity.com
    Twitter: @skaregowdra
    https://www.facebook.com/travelocity
    Twitter: @travelocity and @RoamingGnome