曾勇 Elastic search-intro



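#LAMP人# Issue 12: "Challenges and Optimization of Next-Generation Internet Behavioral Targeting Advertising Technology - 品友互动 Special Session" – anyshare by 曾勇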








    曾勇 Elastic search-intro: Presentation Transcript

    • LAMP人 topic sharing meetup, issue 12: "Challenges and Optimization of Next-Generation Internet Behavioral Targeting Advertising Technology" - 品友互动 special session. www.LAMPER.cn, QQ group: 83304912, http://weibo.com/lampercn
    • ElasticSearch: a search engine "ready to fly". Medcl, 2012/2/18
    • About me
      • Medcl
      • medcl@sina
      • medcl@github
      • m@medcl.net
      • log.medcl.net
    • Why am I here?
      • Good things should be shared with everyone!
    • What's elasticsearch
      • "Distributed, (Near) Real Time, Search Engine"
      • Open source (Apache 2.0)
      • RESTful
      • Schema-free (dynamic)
      • Multi-tenant
      • Scalable
      • High availability
      • Rich search features
      • Good extensibility
      • ……
    • First impression
    • Let’s start the trip
    • Debug Tools
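    • Index a document
      curl -XPOST http://localhost:9200/myindex/share/1 -d '{
        "url" : "http://www.lamper.cn/",
        "date" : "2012-02-18 13:00:00",
        "location" : "beijing,北京"
      }'
      • The URL is the RESTful address of the document.
      • Each entry in the body is a field name followed by the field content.
      • The body passed with -d is the document to index, in JSON format.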
    • Index Response
      {
        "ok": true,
        "_index": "myindex",
        "_type": "share",
        "_id": "1",
        "_version": 1
      }
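The indexing call on the two slides above can be sketched in Python. This is a minimal illustration that only builds the HTTP request and does not send it; the helper name `build_index_request` and the `Content-Type` header are assumptions added here, not part of the slides.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_type, doc_id, document):
    """Build (but do not send) the POST request that indexes one document.

    Hypothetical helper: it mirrors the curl command from the slide.
    """
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    # ensure_ascii=False keeps "北京" readable in the JSON body.
    body = json.dumps(document, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        # Assumption: the slide's curl command sets no explicit content type.
        headers={"Content-Type": "application/json"},
    )


doc = {
    "url": "http://www.lamper.cn/",
    "date": "2012-02-18 13:00:00",
    "location": "beijing,北京",
}
request = build_index_request("http://localhost:9200", "myindex", "share", "1", doc)

# Sending it requires a server listening on localhost:9200, for example:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes the URL and body easy to inspect before any server is involved.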
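    • Explain the URL: http://localhost:9200/myindex/share/1 (the address used to index a document)
      • localhost: server IP address
      • 9200: HTTP port
      • myindex: index name
      • share: index type name
      • 1: unique identifier of the document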
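As a small illustration of the URL layout described on this slide, the sketch below splits a document URL back into its parts. The function name `explain_document_url` is hypothetical, and it assumes the path has exactly the three segments index, type, and id.

```python
from urllib.parse import urlsplit


def explain_document_url(url):
    """Split a document URL into host, port, index, type, and id.

    Assumes the path looks like /<index>/<type>/<id>, as on the slide.
    """
    parts = urlsplit(url)
    index, doc_type, doc_id = parts.path.strip("/").split("/")
    return {
        "host": parts.hostname,
        "port": parts.port,
        "index": index,
        "type": doc_type,
        "id": doc_id,
    }


explained = explain_document_url("http://localhost:9200/myindex/share/1")
```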
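    • Query the document
      curl -XGET 'http://localhost:9200/myindex/share/_search?q=location:beijing'
      • localhost:9200: ES server address
      • myindex: index name
      • share: type name
      • _search: the RESTful search interface
      • q=location:beijing: the query condition, written as field name:value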
    • Search Response
      {
        "took": 12,
        "timed_out": false,
        "_shards": {
          "total": 5,
          "successful": 5,
          "failed": 0
        },
        "hits": {
          "total": 1,
          "max_score": 0.5,
          "hits": [
            {
              "_index": "myindex",
              "_type": "share",
              "_id": "1",
              "_score": 0.5,
              "_source": {
                "url": "http://www.lamper.cn/",
                "date": "2012-02-18 13:00:00",
                "location": "beijing,北京"
              }
            }
          ]
        }
      }
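To show how a client might read the search response above, here is a short sketch that parses the same JSON and pulls out the hit count and the stored documents. The helper name `summarize_hits` is an assumption, and the code relies only on the response layout shown on the slide.

```python
import json

# The response text is copied from the slide.
response_text = """
{
  "took": 12,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 1,
    "max_score": 0.5,
    "hits": [
      {
        "_index": "myindex",
        "_type": "share",
        "_id": "1",
        "_score": 0.5,
        "_source": {
          "url": "http://www.lamper.cn/",
          "date": "2012-02-18 13:00:00",
          "location": "beijing,北京"
        }
      }
    ]
  }
}
"""


def summarize_hits(response):
    """Return the total hit count and a list of (id, score, source) tuples."""
    hits = response["hits"]
    rows = [(hit["_id"], hit["_score"], hit["_source"]) for hit in hits["hits"]]
    return hits["total"], rows


response = json.loads(response_text)
total, rows = summarize_hits(response)
all_shards_ok = response["_shards"]["failed"] == 0
```

The original document comes back unchanged under `_source`, so no second lookup is needed to display a result.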
    • Queries
      • http://localhost:9200/myindex/share/_search?q=beijing (one type in one index)
      • http://localhost:9200/myindex/share,conf/_search?q=beijing (several types in one index)
      • http://localhost:9200/myindex/_search?q=beijing (all types in one index)
      • http://localhost:9200/myindex,myindex2/_search?q=beijing (several indices)
      • http://localhost:9200/_search?q=beijing (all indices)
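The URL patterns on this slide follow one rule: comma-separated index names, then optional comma-separated type names, then `_search`. A minimal sketch of that rule follows; the function `search_url` is hypothetical, and rejecting types without an index is a simplification made here.

```python
from urllib.parse import urlencode


def search_url(base_url, indices=(), types=(), q=None):
    """Compose a search URL for the given indices and types.

    Empty `indices` means "search everything", as in the last URL on the slide.
    """
    segments = [base_url.rstrip("/")]
    if indices:
        segments.append(",".join(indices))
    if types:
        if not indices:
            # Simplification for this sketch: types are only scoped under indices.
            raise ValueError("types require at least one index in this sketch")
        segments.append(",".join(types))
    segments.append("_search")
    url = "/".join(segments)
    if q is not None:
        url += "?" + urlencode({"q": q})
    return url


base = "http://localhost:9200"
urls = [
    search_url(base, ["myindex"], ["share"], "beijing"),
    search_url(base, ["myindex"], ["share", "conf"], "beijing"),
    search_url(base, ["myindex"], q="beijing"),
    search_url(base, ["myindex", "myindex2"], q="beijing"),
    search_url(base, q="beijing"),
]
```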
    • QueryDSL
      curl -XPOST http://localhost:9200/myindex/_search -d '{
        "query": {
          "term": {
            "location": "beijing"
          }
        }
      }'
      • Why QueryDSL? Filters, caching, highlighting, facets, complex queries, ……
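Because a Query DSL request is plain JSON, it can be built as an ordinary data structure before being serialized. The sketch below reproduces the term query from the slide; the helper names `term_query` and `build_search_request` are assumptions, and the request is built but not sent.

```python
import json
import urllib.request


def term_query(field, value):
    """Return the Query DSL body for a single term query."""
    return {"query": {"term": {field: value}}}


def build_search_request(base_url, index, body):
    """Build (but do not send) a POST to the index's _search endpoint."""
    return urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        # Assumption: the slide's curl command sets no explicit content type.
        headers={"Content-Type": "application/json"},
    )


body = term_query("location", "beijing")
search_request = build_search_request("http://localhost:9200", "myindex", body)
```

Keeping the query as a dictionary makes it straightforward to nest it inside larger structures later, which is the main reason to prefer the DSL over the `q=` shortcut.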
    • Scalability & HA
    • Distributed Lucene Directory
      • Each index is fully sharded with a configurable number of shards.
      • Each shard can have zero or more replicas.
      • Read and search operations can be served by either the primary shard or any of its replicas.
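The relationship between shards and replicas on this slide comes down to simple arithmetic: every shard exists once as a primary plus once per replica. A small sketch follows; the function name is hypothetical, the parameter names follow the `number_of_shards` and `number_of_replicas` index settings, and the replica count of 1 is an assumed example value.

```python
def total_shard_copies(number_of_shards, number_of_replicas):
    """Count every shard copy in one index: primaries plus their replicas."""
    return number_of_shards * (1 + number_of_replicas)


# 5 shards matches "_shards.total" in the search response shown earlier;
# one replica per shard is an assumption for this example.
copies_with_one_replica = total_shard_copies(5, 1)
copies_without_replicas = total_shard_copies(5, 0)
```

With one replica, the five shards become ten copies, and any of the two copies of a shard can answer a read for that shard.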
    • Automatic shard allocation
      From: http://www.slideshare.net/elasticsearch/elasticsearch-at-berlinbuzzwords-2010#
    • Scalability
      • There are nodes that hold data and nodes that do not.
      • There is no need for a load balancer in elasticsearch: each node can receive a request, and if it can't handle it, it automatically delegates it to the appropriate node(s).
      • To scale out search, simply add more replicas per shard.
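The delegation described on this slide can be pictured with a toy model: any node accepts a request, works out which shard the document belongs to, and forwards the request if another node owns that shard. The sketch below is only an illustration of that idea. The CRC32 hash is a stand-in and is not the hash function elasticsearch uses, and the node names and shard-to-node table are invented for the example.

```python
import zlib


def pick_shard(doc_id, number_of_shards):
    """Map a document id to a shard number (stand-in hash, not the real one)."""
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_shards


def handle_request(receiving_node, doc_id, shard_to_node):
    """Decide whether the receiving node serves the request or delegates it."""
    shard = pick_shard(doc_id, len(shard_to_node))
    owner = shard_to_node[shard]
    if owner == receiving_node:
        return shard, "handled locally"
    return shard, f"delegated to {owner}"


# Hypothetical cluster: 5 shards spread over two nodes.
shard_to_node = ["node-a", "node-b", "node-a", "node-b", "node-a"]
shard, action = handle_request("node-a", "1", shard_to_node)
```

Since every node can compute the target shard itself, a client may send its request to any node, which is why no separate load balancer is needed.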
    • Transaction log
      • Indexed and deleted documents are fully persistent.
      • No need for a Lucene IndexWriter#commit.
      • Managed using a transaction log / WAL.
      • Full single-node durability (survives kill -9).
      • Utilized when doing hot relocation of shards.
      • Periodically "flushed" (calling IW#commit).
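The write-ahead-log idea behind this slide can be shown with a toy store: each operation is appended to a log file and synced to disk before it is applied, so the state can be rebuilt by replaying the log after a crash. This is a sketch of the general technique, not elasticsearch's implementation. The class `TinyTranslog` and its file format are invented here, the periodic flush step is left out, and a half-written last line after a crash is not handled.

```python
import json
import os
import tempfile


class TinyTranslog:
    """Toy key-value store that logs every operation before applying it."""

    def __init__(self, path):
        self.path = path
        self.docs = {}
        self._replay()  # rebuild state from any existing log
        self._log = open(path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as log:
            for line in log:
                self._apply(json.loads(line))

    def _apply(self, op):
        if op["op"] == "index":
            self.docs[op["id"]] = op["doc"]
        else:
            self.docs.pop(op["id"], None)

    def _append(self, op):
        # Write and fsync first, so the operation survives a process kill.
        self._log.write(json.dumps(op) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        self._apply(op)

    def index(self, doc_id, doc):
        self._append({"op": "index", "id": doc_id, "doc": doc})

    def delete(self, doc_id):
        self._append({"op": "delete", "id": doc_id})

    def close(self):
        self._log.close()


log_path = os.path.join(tempfile.mkdtemp(), "translog.jsonl")
store = TinyTranslog(log_path)
store.index("1", {"location": "beijing"})
store.index("2", {"location": "shanghai"})
store.delete("2")
store.close()  # stand-in for a crash: the in-memory state is discarded

recovered = TinyTranslog(log_path)  # replays the log on startup
recovered_docs = dict(recovered.docs)
recovered.close()
```

A real system would also truncate the log after each flush so that replay time stays bounded.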
    • BASE
      • Each document you index is there once the index operation is done.
      • No need to commit or do anything similar to get everything persisted.
      • A shard can have 1 or more replicas for HA.
      • Gateway persistence is done asynchronously in the background.
    • Not Mentioned Here…
      • Versioning
      • Template
      • River
      • Percolator
      • PartialUpdate
      • Routing
      • Parent-Child Type
      • Scripting
      • ……
      That's too much, discover it yourself.
    • Community & Support
      • http://github.com/elasticsearch
      • http://groups.google.com/group/elasticsearch
      • IRC: #elasticsearch on freenode
      • QQ group: 190605846
      • http://doc.elasticsearch.cn
      • http://s.medcl.net/
    • BTW
      • My company is hiring! Areas:
        – Distributed systems
        – High performance
        – Massive data processing
        – Personalized recommendation
        – Search engines
      • If any of the above interests you:
        – You are welcome to join our gang!
    • Thank you!