2. Background
● Cloud-hosted E-learning Platform (LMS)
● www.edu20.org for academia
● www.edu20.com for businesses
● 1,000,000+ users
● 15,000 new users a week
● Customers include Disney, large
universities, California school districts,
small kindergartens, etc.
4. Search
● Originally using Sphinx/ThinkingSphinx
● All data came from RDS/MySQL
● One Sphinx per app server
● Full indexing once a day
● Noticeable slowdown during indexing
● Sphinx daemons would sometimes fail
5. CloudSearch
● Decided to move to CloudSearch
● Simple, scalable
● No Sphinx servers to manage
● Reliable, fast, delta-indexing
● Easy to index anything, including DynamoDB
6. Migration
● 14 different types were being indexed
● Indexed just one item type first, for testing
● Used a script to upload the initial contents
● Then indexed everything except the
high-volume types (messages, postings)
● Finally, indexed messages and postings
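The deck does not show the upload script. As a rough sketch of how an initial upload might be chunked, the helper below groups documents into batches that stay under a byte limit. The name each_batch, the document shape, and the 5 MB cap are assumptions for illustration; check the CloudSearch documentation for the actual batch limits.

```ruby
require 'json'

# Assumed cap on one upload batch; verify against the CloudSearch docs.
MAX_BATCH_BYTES = 5 * 1024 * 1024

# Yield arrays of documents whose JSON encoding fits within max_bytes.
# A single document larger than max_bytes is still yielded on its own.
def each_batch(documents, max_bytes = MAX_BATCH_BYTES)
  return enum_for(:each_batch, documents, max_bytes) unless block_given?

  batch = []
  size = 2 # the enclosing "[" and "]"
  documents.each do |doc|
    doc_bytes = JSON.generate(doc).bytesize + 1 # +1 for the separating comma
    if !batch.empty? && size + doc_bytes > max_bytes
      yield batch
      batch = []
      size = 2
    end
    batch << doc
    size += doc_bytes
  end
  yield batch unless batch.empty?
end
```

Running low-volume types through this first and leaving messages and postings for the final pass matches the order described above.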
7. Configuration
● Two search domains, one for each site
● 20 index fields (only 2 text)
● Truncate messages/postings to 1000 bytes
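A minimal sketch of the 1000-byte truncation, assuming the text is UTF-8 and that a multi-byte character should never be split. The helper name truncate_bytes is hypothetical; the deck does not show how the truncation was done.

```ruby
# Truncate text to at most max_bytes bytes without cutting a
# multi-byte UTF-8 character in half. Assumes the input is valid UTF-8.
def truncate_bytes(text, max_bytes = 1000)
  return text if text.bytesize <= max_bytes

  cut = text.byteslice(0, max_bytes)
  # Drop trailing bytes until what remains is valid UTF-8 again.
  cut = cut.byteslice(0, cut.bytesize - 1) until cut.valid_encoding?
  cut
end
```

Counting bytes rather than characters keeps the field size predictable for non-ASCII content.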
8. Rails Integration
● Used aws_cloud_search gem
● Added hooks into object model to add
search update records to database
● Separate workers update search every 15
minutes with records from database
● Had issues with XML characters
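The deck does not say what the XML character issue was. One common cause, assumed here, is user text containing control characters that XML 1.0 does not allow, which can cause an upload to be rejected. A hedged sketch of a filter (the names INVALID_XML_CHARS and strip_invalid_xml_chars are hypothetical):

```ruby
# Matches any character outside the XML 1.0 "Char" production:
# tab, LF, CR, U+0020..U+D7FF, U+E000..U+FFFD, U+10000..U+10FFFF.
INVALID_XML_CHARS =
  /[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\u{10000}-\u{10FFFF}]/

# Remove invalid byte sequences first, then characters XML 1.0 forbids.
def strip_invalid_xml_chars(text)
  text.scrub('').gsub(INVALID_XML_CHARS, '')
end
```

Applying a filter like this to text fields before they are queued would keep a single bad character from failing a whole batch.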
9. Example Hooks
def after_create
  super
  update_search
end

# Queue a search update when the material type is searchable and in scope.
def update_search
  type = material_class
  if type.searchable? && scope != 'None'
    SearchUpdate.add(type, material_id)
  end
end
10. Example DB update
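# Record a pending search update; ids may be a single id or an array of ids.
def self.add(type, ids)
  search_update = SearchUpdate.new(
    :class_name => type.name,
    :ids => (ids.kind_of?(Array) ? ids.join(',') : ids),
    :operation => 'Update'
  )
  search_update.save!
rescue StandardError => exception
  # Log and continue so a failed search update never breaks the caller.
  puts "SearchUpdate.add exception: #{exception.message}"
end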
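The worker side is not shown in the deck. As a sketch of what a periodic worker might do with the queued rows, the code below collapses them into one de-duplicated id list per class and operation. PendingUpdate is a plain Struct standing in for the SearchUpdate model, and coalesce is a hypothetical helper.

```ruby
# Stand-in for a SearchUpdate row (the real model is a database record).
PendingUpdate = Struct.new(:class_name, :ids, :operation)

# Collapse queued rows into { [class_name, operation] => unique id strings }.
# ids may be a comma-separated string or a single id, as in SearchUpdate.add.
def coalesce(updates)
  grouped = Hash.new { |hash, key| hash[key] = [] }
  updates.each do |update|
    grouped[[update.class_name, update.operation]].concat(update.ids.to_s.split(','))
  end
  grouped.transform_values(&:uniq)
end
```

Coalescing before upload means an object edited several times within one 15-minute window is sent to search only once.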
11. End Result
● Migrated all of search in about two weeks,
spending 1-2 hours a day.
● edu20.org: 7,000,000 documents, m2.xlarge
● edu20.com: 500,000 documents, m1.small
● Simplified architecture, positioned for
scalability and DynamoDB
● Only downside: $500/month for search