Downtown SF Lucene/Solr Meetup: Developing Scalable Search for User Generated Content at PlayStation
1. Developing Scalable Search for User Generated Content at PlayStation
Alvin Peng
Sr. Software Engineer
Sony Interactive Entertainment
2. User Generated Content (UGC) in PlayStation
• PlayStation users can easily share awesome media
• Media types
  – Broadcasts
  – Screenshots
  – Videos
• Media are posted to third-party networks
  – Facebook
  – Twitter
  – YouTube
  – Dailymotion
  – Twitch
  – Ustream
  – Niconico
3. However…
• There was no central place to show or search for all this awesome content
• Only shown in users’ Activity Feed and Profile
• Only sent to friends
• Basically not visible to the majority of our millions of users
4. Difficulties of UGC System
• Searchable
• Scalability
• Performance
• Dynamic content
• A lot of reads
• A lot of writes
• Various searching requirements
6. Why Solr?
• Widely used open source search platform
• Scalable
• Stable
• Feature rich
• Not just a search platform
• Great Solr community, both individuals and companies
7. Developers of UGC backend system
• Alvin Peng
• David Herrera Rosales
20. UGC SolrCloud System Design
• Solr 5.2.1
• SolrJ CloudSolrClient
• Single collection
• 3 clusters in production environment
  – Broadcasts
  – Screenshots
  – Videos
• 5 ZooKeeper nodes
• Single shard
• 16 Solr nodes per cluster
21. Solr Schema
• Field types
  – Class
    • StrField
    • TextField
    • TrieLongField
    • TrieDateField
    • etc.
  – Analyzer
    • Char filter
      – MappingCharFilterFactory
      – HTMLStripCharFilterFactory
      – PatternReplaceCharFilterFactory
      – etc.
    • Tokenizer
      – StandardTokenizerFactory
      – NGramTokenizerFactory
      – KeywordTokenizerFactory
      – etc.
    • Filter
      – LowerCaseFilterFactory
      – PorterStemFilterFactory
      – StopFilterFactory
      – etc.
  – Index Analyzer and Query Analyzer
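
The char filter, tokenizer, and filter stages listed above run in that order. As a rough illustration only, the chain can be imitated in a few lines of plain Python. This is not Solr's implementation: the regular expressions, the stop-word list, and every function name here are assumptions made for the sketch.

```python
import re

# Illustrative stop-word list; Solr's StopFilterFactory reads its own word file.
STOP_WORDS = {"a", "an", "and", "at", "for", "in", "of", "the", "to"}


def strip_html(text):
    """Char filter stage: replace anything that looks like an HTML tag with a space."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenize(text):
    """Tokenizer stage: split the cleaned text into runs of word characters."""
    return re.findall(r"\w+", text)


def lowercase(tokens):
    """Token filter stage: normalize case."""
    return [token.lower() for token in tokens]


def remove_stop_words(tokens):
    """Token filter stage: drop very common words."""
    return [token for token in tokens if token not in STOP_WORDS]


def analyze(text):
    """Run char filter, then tokenizer, then each token filter in order."""
    tokens = tokenize(strip_html(text))
    for token_filter in (lowercase, remove_stop_words):
        tokens = token_filter(tokens)
    return tokens


print(analyze("<b>Awesome</b> Screenshots of the Final Boss"))
```

An index analyzer and a query analyzer are two such chains attached to the same field type; they may be identical or may differ.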
22. Solr Schema
• Fields
  – Number of fields
  – Field type
  – Indexed
  – Stored
  – etc.
• copyField
  – <copyField source="*_t" dest="anything" maxChars="25000" />
• dynamicField
  – <dynamicField name="*_t" type="text" indexed="true" stored="true"/>
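
To make the two declarations above concrete, here is a small Python sketch of what they imply for an incoming document: any field whose name ends in `_t` is treated as type `text`, and its value is copied (up to 25,000 characters) into the `anything` field. The function names are hypothetical, `fnmatchcase` is only a stand-in for Solr's simpler leading-or-trailing-asterisk matching, and the sketch assumes the client does not send an `anything` field itself.

```python
from fnmatch import fnmatchcase

# Mirrors the two schema lines on the slide.
DYNAMIC_FIELDS = [("*_t", "text")]           # (name pattern, field type)
COPY_FIELDS = [("*_t", "anything", 25000)]   # (source pattern, dest, maxChars)


def resolve_type(field_name):
    """Return the field type a dynamicField rule assigns, or None if no rule matches."""
    for pattern, field_type in DYNAMIC_FIELDS:
        if fnmatchcase(field_name, pattern):
            return field_type
    return None


def apply_copy_fields(doc):
    """Return a copy of doc with copyField destinations filled in."""
    result = dict(doc)
    for source, dest, max_chars in COPY_FIELDS:
        copied = [
            str(value)[:max_chars]
            for name, value in doc.items()
            if fnmatchcase(name, source)
        ]
        if copied:
            # Assumes dest is multi-valued and absent from the incoming document.
            result.setdefault(dest, []).extend(copied)
    return result


doc = {"id": "1", "title_t": "Boss fight", "description_t": "x" * 30000}
indexed = apply_copy_fields(doc)
print(resolve_type("title_t"), resolve_type("id"), len(indexed["anything"][1]))
```

The practical effect is that a single catch-all field can be searched without the query having to name every `*_t` field.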
23. UGC Multilingual Support
• Supports about 20 languages
  – English
  – Spanish
  – Japanese
  – etc.
• Different field types for different languages
• Different tokenizers and filters
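
One common way to give each language its own field type is to route text into a language-specific field at index time. The slides do not say how the UGC system does this, so the sketch below is only an assumption: the suffixes, the fallback field, and the function names are all hypothetical.

```python
# Hypothetical mapping from language code to a field-name suffix. Each suffix
# would correspond to a field type with its own tokenizer and filters.
FIELD_SUFFIX_BY_LANGUAGE = {
    "en": "_txt_en",
    "es": "_txt_es",
    "ja": "_txt_ja",
}
DEFAULT_SUFFIX = "_txt_general"  # hypothetical fallback for unlisted languages


def localized_field(base_name, language):
    """Pick the field name for a language tag such as 'ja' or 'en-US'."""
    primary = language.split("-")[0].lower()
    return base_name + FIELD_SUFFIX_BY_LANGUAGE.get(primary, DEFAULT_SUFFIX)


def build_document(doc_id, title, language):
    """Build an index document whose title lands in the language-specific field."""
    return {"id": doc_id, localized_field("title", language): title}


print(build_document("42", "Final boss speedrun", "en-US"))
```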
24. UGC Solr Configuration
• Hard commit: 15 minutes
  – Hard commits are about durability
• Soft commit: 1 minute
  – Soft commits are about visibility
  – Less expensive, but not free
  – Use the longest soft commit interval that’s acceptable for best performance
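
In `solrconfig.xml` these intervals are expressed in milliseconds under `autoCommit` and `autoSoftCommit`. The Python sketch below builds such a fragment from the two values on the slide. The helper names are hypothetical, and `openSearcher=false` is an assumption (a common pairing with soft commits), not something the slides state.

```python
import xml.etree.ElementTree as ET


def minutes_to_ms(minutes):
    """Solr's maxTime is given in milliseconds."""
    return int(minutes * 60 * 1000)


def build_update_handler(hard_commit_minutes, soft_commit_minutes):
    """Build an <updateHandler> element with hard and soft auto-commit intervals."""
    handler = ET.Element("updateHandler", {"class": "solr.DirectUpdateHandler2"})

    auto_commit = ET.SubElement(handler, "autoCommit")
    ET.SubElement(auto_commit, "maxTime").text = str(minutes_to_ms(hard_commit_minutes))
    # Assumption: hard commits flush to disk without opening a new searcher,
    # leaving visibility to the soft commit.
    ET.SubElement(auto_commit, "openSearcher").text = "false"

    auto_soft_commit = ET.SubElement(handler, "autoSoftCommit")
    ET.SubElement(auto_soft_commit, "maxTime").text = str(minutes_to_ms(soft_commit_minutes))
    return handler


fragment = build_update_handler(hard_commit_minutes=15, soft_commit_minutes=1)
print(ET.tostring(fragment, encoding="unicode"))
```

With the slide's values, the hard commit `maxTime` comes out as 900000 and the soft commit `maxTime` as 60000.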
25. UGC Stats
• Online since last Sept.
• Number of documents
  – Broadcasts: 26K
  – Screenshots: 5M
  – Videos: 20M
• Average request RPS
  – Total UGC query requests per day > 1B
  – Average Solr query RPS:
    • Broadcasts: 1600
    • Screenshots: 250
    • Videos: 250
  – Average Solr update RPS:
    • Broadcasts: 500
    • Screenshots: 250
    • Videos: 500
• Average query latency
  – Average Solr query latency:
    • Broadcasts: 4ms (16ms for leader)
    • Screenshots: 14ms (16ms for leader)
    • Videos: 60ms (210ms for leader)
  – Average Solr update latency:
    • Broadcasts: 8ms (60ms for leader)
    • Screenshots: 1ms (10ms for leader)
    • Videos: 2ms (24ms for leader)
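
To put the per-second figures on the same scale as the per-day figure, the short calculation below converts between the two. It only does arithmetic on the numbers from the slide. The slides do not say whether the Solr RPS values are per node or per cluster, so the comparison is indicative at best.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Average Solr requests per second, as listed on the slide.
query_rps = {"broadcasts": 1600, "screenshots": 250, "videos": 250}
update_rps = {"broadcasts": 500, "screenshots": 250, "videos": 500}

# Lower bound quoted on the slide for total UGC query requests per day.
ugc_queries_per_day = 1_000_000_000


def per_day(rps):
    """Convert an average requests-per-second figure to requests per day."""
    return rps * SECONDS_PER_DAY


total_query_rps = sum(query_rps.values())
total_update_rps = sum(update_rps.values())

print("Solr queries/day:", per_day(total_query_rps))
print("Solr updates/day:", per_day(total_update_rps))
print("Average RPS implied by 1B/day:", round(ugc_queries_per_day / SECONDS_PER_DAY))
```

The listed Solr query rates sum to 2,100 RPS, or about 181 million queries per day, while 1 billion requests per day corresponds to an average of roughly 11,574 RPS.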