Maziyar PANAHI
Big Data engineer / Cloud Architect
ARGOS - NoSQL / Big Data

Université Paris-Sud, LAL

25 November 2015
Engineer at CNRS
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Let’s Use It!
• Wikipedia uses Elastic...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | RESTful API with JSON over HTTP
• VERB...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | RESTful API with JSON over HTTP
curl -...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Clarification
Relational DB Databases T...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Indexing a document
PUT /cnrs/employee...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Retrieving a document
GET /cnrs/employ...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Deleting a document
DELETE /cnrs/emplo...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Search
GET /cnrs/employee/_search
{
"t...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Search with Query DSL
GET /cnrs/employ...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Search with Query DSL
GET /cnrs/employ...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Full-text Search
GET /cnrs/employee/_s...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Phrase Search
GET /cnrs/employee/_sear...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Highlighting Searches
GET /cnrs/employ...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Search with Query DSL
• Full text quer...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Search with Query DSL
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Aggregations
GET /cnrs/employee/_searc...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Aggregations
GET /cnrs/employee/_searc...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Aggregations
{
"aggs" : {
“my_ip_range...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Aggregations
{
"aggs" : {
"articles_ov...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Aggregations
• Metrics Aggregations
• ...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Plugins
• Plugins Types
• Java plugins...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Plugins
• API extension Plugins
• Aler...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Plugins - kopf
Web administration tool...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Full-text search and ...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
The Elastic Platform ...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Logs!
• Check Nginx a...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Logs!
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Logs!
• Old School to...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Logs!
• Symantec Secu...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Logstash | Collect, E...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
An input plugin enabl...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
• Output Plugins
• bo...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Logstash-forwarder: s...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Logstash Server: sysl...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Logstash-forwarder: s...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Logstash: Suricata | ...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Suricata.yml
- eve-lo...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Logstash-forwarder: S...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Logstash server: Suri...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Logstash: Suricata
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Logstash: Suricata
• ...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Logstash: Suricata
• ...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Kibana | Explore & Vi...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
Kibana 4.2.1 | Compat...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | ISCPIF Use Cases
The ELK Stack | Elast...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Elasticsearch Mapping
• Which string fi...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Elasticsearch Mapping
• Dynamic mappin...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Mapping Explicit
PUT my_index 

{

"ma...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Mapping Dynamic templates
PUT my_index...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Mapping Dynamic templates
PUT my_index...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Production
• Hardware
• Memory

• 64 G...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Security
• No built-in authentication
...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Elasticsearch 2.0 | Lots of things!
• Analyzer and tokeniz...
“Launch your
GIANT idea”
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB v3.2 is coming soon. Lear...
• Document Database
• Documents (i.e. objects)

• Embedded documents and arrays (no expensive joins!)

• Dynamic schema su...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | Platforms
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | Install MongoDB on Ubuntu
echo "deb http://repo....
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | SQL to MongoDB
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | SQL to MongoDB
https://docs.mongodb.org/manual/r...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | Insert Documents
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | Query Documents
db.inventory.find( { type: "snack...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | Query Documents
db.tweets.find(
{
"coordinates.co...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | Aggregation
{

"_id": "10280",

"city": "NEW YOR...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
db.tweets.aggregate([

{

$geoNear: {

near: [ 2.348730, 4...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | MapReduce
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | Index
• Creating an index 

• db.ships.ensureInd...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | Replica Set
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | Replica Set
• Replica Set Members

• Replica Set...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | Shardings
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | 3.0
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | 3.0
• 7-10x Better Performance
• Up to 80% Less ...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | 3.0
Datacenter in France and Italy
60M/week - 8M/day - 360K/hour
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPI...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | ISCPIF Use Cases
Avg. usage of Twitter in Paris ...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | ISCPIF Use Cases
24HOURS NEWS: Real-time Breakin...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | ISCPIF Use Cases
TIMOTHY: Real-time Dashboards
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | ISCPIF Use Cases
HIGH THROUGHPUT: Real-time aggr...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | ISCPIF Use Cases
Most generic
Most specific Calcu...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | ISCPIF Use Cases
130K
inserts /s
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | ISCPIF Use Cases
10K-80K
inserts /s
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | ISCPIF Use Cases
9K queries /s
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | ISCPIF Use Cases
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | ISCPIF Use Cases
Gephi Streaming
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | Monitoring System
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | Monitoring System Network and Cache
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | Monitoring System Hardware
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | ISCPIF Use Cases
Monitoring NewRelic
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | Monitoring System Objects
3.02 Billion Documents...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
MongoDB | Monitoring System Objects
“in-memory data structure store”
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
used as database, cache a...
• Data structures
• strings
• hashes
• lists
• sets
• sorted sets
• bitmaps

• hyperloglogs
• geospatial indexes

• Built-in
...
Redis KEYS:
> set mykey somevalue
OK

> get mykey
"somevalue"
Redis LISTS:
> rpush mylist A
(integer) 1

> rpush mylist B
...
maziyar-beautiful-MacBook$ redis-benchmark -q -n 100000
PING_INLINE: 85178.88 requests per second

PING_BULK: 80000.00 req...
maziyar-beautiful-MacBook$ redis-benchmark -n 1000000 -t set,get -P 16 -q
PING_INLINE: 735294.12 requests per second

PING...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Redis | ISCPIF Use Cases
Scientific Games
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Redis | ISCPIF Use Cases
• Scientific Operations
• Occurren...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Redis | ISCPIF Use Cases
Monitoring NewRelic
“Messaging that just works”
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
well actually it’s more than t...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
RabbitMQ | Feature List
A messaging broker
• Highlights

•...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
RabbitMQ | Feature List
A messaging broker
Type
Topic
Q1
Q...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
RabbitMQ | ISCPIF Use Cases
Monitoring NewRelic
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
RabbitMQ | ISCPIF Use Cases
Monitoring RabbitMQ Management...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
RabbitMQ | ISCPIF Use Cases
• Distributed Computations
• P...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
RabbitMQ | ISCPIF Use Cases
• Parsing
• 225 file

• 10m-20m...
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
ISCPIF Big Data
• Multivac (Open Data Platform)
• ISCPIF A...
Twitter
Storing
Data
Real-time
Streaming
System
Data
Analytics
Real-time
Processing
Web
Mobile
Wearable
Devices
Text
Minin...
HighPerformanceInfrastructure
HighlyAvailableInfrastructure
• Multivac (Open Data)
• ISCPIF APIs
• Scientific Dashboards
• Science en Poche
• Distributed Computing
• Climatique (COP21...
–just a regular Geek :)
“You are your best benchmark!”
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
SHOWCASE
FRANCE 2014
Real-time processing and visualizing Twitter in France
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE ...
News Tracking
Real-time tracking news with highest impact of networks
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRAN...
Aviation Accidents
50K retweets/10min
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Aviation Accidents
120K retweets/10min
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Robin Williams
180K retweets/10min
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
#Ferguson michealBROWN
75K retweets/10min
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
Paris
13 Novembre
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
22 VMs
Distributed Systems
320K Ops /S
130K Insert /S
64K Index /S
…
8 WEB SERVERS
4 API SERVERS
Search Engine Cluster
900...
• Starting 2016 with CouchBase in parallel

• Graph Databases

• Spark Streaming / machine learning

• Clustering and cate...
–The Blacklist: Lord Baltimore (No. 104)
“Every piece of information is worth something to somebody. And
in the hands of t...
Thanks!
maziyar.panahi@iscpif.fr
25 November 2015
INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
http://i...
HOW TO SCALE FROM ZERO TO BILLIONS!
How the Big Data platform at ISCPIF (CNRS) scaled from zero to billions of documents within 6 months.
This talk covers our use of Elasticsearch, MongoDB, Redis and RabbitMQ, and the scalable, highly available web services built on top of the Big Data architecture.

The talk was given at Université Paris-Sud, LAL, Bâtiment 200, at an event organized by ARGOS. https://indico.mathrice.fr/event/2/overview

ISCPIF: http://iscpif.fr
Big Data at ISCPIF: http://bigdata.iscpif.fr
Climate at ISCPIF: http://climate.iscpif.fr
Playground for climate: http://climate.iscpif.fr/playground
Tweetoscope: http://tweetoscope.iscpif.fr

Published in: Technology

HOW TO SCALE FROM ZERO TO BILLIONS!

  1. 1. Maziyar PANAHI, Big Data engineer / Cloud Architect, Engineer at CNRS. ARGOS - NoSQL / Big Data, Université Paris-Sud, LAL, 25 November 2015
  2. 2. ISCPIF INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE, CNRS UNIT UPS3611 - HTTP://ISCPIF.FR Creative commons, open science, open data, shared resources
  3. 3. ISCPIF INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE, CNRS UNIT UPS3611 - HTTP://ISCPIF.FR Creative commons, open science, open data, shared resources
  4. 4. ISCPIF INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE http://iscpif.fr/services/
  5. 5. ISCPIF INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE http://iscpif.fr/services/
  6. 6. ISCPIF INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE http://iscpif.fr/services/
      • Core Services
        • ROOM RESERVATION
        • EVENT ANNOUNCEMENT
        • PROJECT HOSTING AND RESIDENCIES
        • HIGH PERFORMANCE COMPUTING
        • TRAINING SESSIONS
        • COMMUNITY EXPLORER
      • Open Platforms
        • OpenMole
        • Gargantext
        • Big Data
        • Linkrbrain
  7. 7. ISCPIF INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE External Collaborators: ISCPIF Partners:
  8. 8. HOW TO SCALE FROM ZERO TO BILLIONS! INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
      • Elasticsearch
      • MongoDB
      • Redis
      • RabbitMQ
      • Big Data Platform
  9. 9. “You Know, for Search!” - Elasticsearch
  10. 10. Elasticsearch
  11. 11. Elasticsearch | Real-Time Data
  12. 12. Elasticsearch | Real-Time Advanced Analytics
  13. 13. Elasticsearch | Massively Distributed
  14. 14. Elasticsearch | High Availability
  15. 15. Elasticsearch | Multi-tenancy Host Index
  16. 16. Elasticsearch | Full-Text Search
  17. 17. Elasticsearch | Document-Oriented & Schema-Free
  18. 18. Elasticsearch | Developer-Friendly, RESTful API
      • Single document APIs
        • Index API
        • Get API
        • Delete API
        • Update API
      • Multi-document APIs
        • Multi Get API
        • Bulk API
        • Bulk UDP API
        • Delete By Query API
  19. 19. Elasticsearch | Developer-Friendly, RESTful API: Index, Type, ID, Document
  20. 20. Elasticsearch | Search & Analyze Data in Real Time. Apache 2 Open Source License. Built on top of Apache Lucene™
  21. 21. Elasticsearch 2.0 | Installation - Package
      1. curl -L -O https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.0.0/elasticsearch-2.0.0.tar.gz
      2. tar -xvf elasticsearch-2.0.0.tar.gz
      3. cd elasticsearch-2.0.0/bin
      4. ./elasticsearch
  22. 22. Elasticsearch 2.0 | Installation - Repositories
      Download and install the Public Signing Key:
        wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
      Repository definition (APT -> /etc/apt/sources.list.d/elasticsearch-2.x.list):
        echo "deb http://packages.elastic.co/elasticsearch/2.x/debian stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch-2.x.list
      Install Elasticsearch 2.0:
        sudo apt-get update && sudo apt-get install elasticsearch
  23. 23. Elasticsearch 2.0 | Installation - That’s it! Simply run: curl 'http://localhost:9200/?pretty'
  24. 24. Elasticsearch 2.0 | Configuration - System
      curl 'localhost:9200/_nodes/process?pretty'
      • File descriptors
        • Setting the limit to 32k or even 64k is recommended
      • Memory settings
        • Disable swap
          • sudo swapoff -a (lasts until the next reboot)
          • /etc/fstab (comment out the swap entries to make it permanent)
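As a rough illustration of the file-descriptor advice, the Python sketch below reads the open-file limit of the current process and compares it with the 32k figure from the slide. The helper name `fd_limit_ok` and the reading of "32k" as 32,000 are assumptions made for this example. The limit that matters in practice is the one of the user running Elasticsearch, which the `_nodes/process` call reports.

```python
import resource  # Unix-only standard library module

RECOMMENDED_MIN = 32_000  # assumption: "32k" from the slide read as 32,000


def fd_limit_ok(soft_limit, recommended=RECOMMENDED_MIN):
    """Return True when the soft open-file limit meets the recommendation."""
    # RLIM_INFINITY means "no limit", which trivially satisfies the advice.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= recommended


if __name__ == "__main__":
    # Limits of the process running this script, not of the Elasticsearch user.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"soft={soft} hard={hard} ok={fd_limit_ok(soft)}")
```

Running the script as the Elasticsearch user gives a quick check before starting the node.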
  25. 25. Elasticsearch 2.0 | Configuration - System
      -> /etc/default/elasticsearch
      • ES_HEAP_SIZE
        • Leave enough for the OS
        • Leave enough for the
        • Never ever go over 30.1 GB!!
        • I’ll go with half of the RAM, staying under 30 GB
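The heap rule of thumb on this slide can be written down as a small calculation. The sketch below is only an illustration: the function name `recommended_heap_gb` and the use of a 30 GB ceiling as the cap are assumptions derived from the "half, under 30 GB" wording, not an official formula.

```python
def recommended_heap_gb(total_ram_gb, ceiling_gb=30):
    """Half of the physical RAM (whole GB), capped at the ceiling."""
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    # The other half is left to the operating system, as the slide advises.
    return min(total_ram_gb // 2, ceiling_gb)


if __name__ == "__main__":
    for ram in (8, 32, 64, 128):
        print(f"{ram} GB RAM -> ES_HEAP_SIZE={recommended_heap_gb(ram)}g")
```

With these assumptions a 32 GB machine gets a 16 GB heap, while 64 GB and 128 GB machines both stop at 30 GB.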
  26. 26. Elasticsearch 2.0 | Configuration - Elasticsearch
      curl 'localhost:9200/_nodes/process?pretty'
      # /etc/elasticsearch/elasticsearch.yml
      network:
        host: <MACHINE IP ADDRESS>
      path:
        logs: /var/log/elasticsearch
        data: /var/data/elasticsearch
      cluster:
        name: <NAME OF YOUR CLUSTER>
      node:
        name: <NAME OF YOUR NODE>
  27. 27. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Configuration - Elasticsearch Node.name Cluster.name mlockall # Elasticsearch performs poorly when JVM starts swapping: you should ensure that it _never_ swaps.
  28. 28. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Configuration - Elasticsearch #Shards #Replicas Remember this? • 1 cluster • 3 nodes • 6 shards • 1 replica
  29. 29. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Shards and Replicas Why Shards and Replicas? • ES has built-in clustering • Scaling out an index: shards • Parallel work on an index: shards • Increasing availability: replicas • The number of replicas can be changed anytime! • The number of shards cannot be changed after index creation! (must reindex)
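Why the shard count is fixed can be illustrated with a short sketch. Elasticsearch routes a document with a formula of the form `hash(routing) % number_of_primary_shards`. The code below uses CRC32 as a stand-in hash (an assumption for illustration, not the hash Elasticsearch uses), and `shard_for` is a hypothetical helper name.

```python
import zlib


def shard_for(doc_id: str, number_of_primary_shards: int) -> int:
    """Pick a shard from a hash of the routing value (here: the document ID)."""
    # Stand-in hash for illustration; Elasticsearch uses its own hash function.
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards


ids = [str(i) for i in range(1, 1001)]
# Count documents whose shard would change if the index went from 6 to 7 shards.
moved = sum(1 for d in ids if shard_for(d, 6) != shard_for(d, 7))
print(f"{moved} of {len(ids)} documents would land on a different shard")
```

Because most documents would map to a different shard, changing the shard count means reindexing, while replicas are plain copies and can be added or removed freely.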
  30. 30. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Shards and Replicas What is a shard? • You can't actually split an index! • ES uses multiple Lucene indexes (AKA shards) • Simply, a shard is a Lucene index! • Overhead: a query hits all shards for scoring • So they don’t come free!
  31. 31. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | How many Shards and Replicas? • Replicas: • More replicas = more availability = longer indexing! • Shards: • How much data? • How many queries? • How complex are those queries? • How many resources does each node have? • Number of nodes in your cluster • Don’t know? Over-allocate a few shards (but not too many, they are not free!)
  32. 32. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | How many Shards and Replicas? • Changing number of replicas: easy curl -XPUT 'localhost:9200/my_index/_settings' -d ' { "index" : { "number_of_replicas" : 4 } }' • Changing number of shards: must be re-indexed • For some, not a big deal. • For some, it is a big deal!
  33. 33. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | How many Shards and Replicas? • StackOverflow http://stackexchange.com/performance • Scaling out: • More shards than the #nodes • Multiple shards in one node
  34. 34. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | How many Shards and Replicas? • 3 nodes - 3 shards - 2 replicas • Losing 2 nodes: the remaining node still holds a copy of every shard, so all data stays available • Storage: the two replica sets add twice the index size (3x in total) • each replica shard = 1/3 of the index size • Storage is cheap, a small price to pay for availability
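The arithmetic behind this layout can be checked with a short sketch. The 90 GB index size and the name `cluster_storage` are illustrative assumptions. The failure count assumes that copies of the same shard always sit on different nodes.

```python
def cluster_storage(index_size_gb, primary_shards, replicas, nodes):
    """Storage and failure-tolerance figures for one index."""
    copies = 1 + replicas  # each primary plus its replicas
    return {
        "shard_size_gb": index_size_gb / primary_shards,
        "total_shards": primary_shards * copies,
        "total_storage_gb": index_size_gb * copies,
        # Data survives as long as one copy of every shard is left.
        "node_failures_tolerated": min(replicas, nodes - 1),
    }


layout = cluster_storage(index_size_gb=90, primary_shards=3, replicas=2, nodes=3)
print(layout)
```

For the 3 nodes, 3 shards, 2 replicas case this gives 9 shards of 30 GB each, 270 GB in total, and two tolerated node failures.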
  35. 35. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Let’s Use It! Full-text search, structured search, analytics, and all three in combination: • Wikipedia uses Elasticsearch to provide full-text search with highlighted search snippets, and search-as-you-type and did-you-mean suggestions. • The Guardian uses Elasticsearch to combine visitor logs with social-network data to provide real-time feedback to its editors about the public’s response to new articles. • Stack Overflow combines full-text search with geolocation queries and uses more-like-this to find related questions and answers. • GitHub uses Elasticsearch to query 130 billion lines of code!
  36. 36. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | RESTful API with JSON over HTTP • VERB: GET, POST, PUT, HEAD, or DELETE. • PROTOCOL: http or https • HOST: hostname of any node • PORT: Elasticsearch HTTP service, which defaults to 9200 • PATH: API Endpoint (_count, _cluster/stats, _nodes/stats/jvm, etc.) • QUERY_STRING: any optional query-string parameters for example ?pretty • BODY: A JSON-encoded request body (if the request needs one.) curl -X<VERB> '<PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING>' -d '<BODY>'
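The request template above can be assembled programmatically. The sketch below only builds the curl command string and sends nothing. `build_curl` is a hypothetical helper, and `pretty=true` is used as the explicit form of `?pretty`.

```python
import json
import shlex
from urllib.parse import urlencode, urlunsplit


def build_curl(verb, host, path, protocol="http", port=9200, query=None, body=None):
    """Assemble: curl -X<VERB> '<PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING>' -d '<BODY>'."""
    url = urlunsplit(
        (protocol, f"{host}:{port}", "/" + path.lstrip("/"), urlencode(query or {}), "")
    )
    parts = ["curl", f"-X{verb}", shlex.quote(url)]
    if body is not None:
        # The body is JSON-encoded and shell-quoted.
        parts += ["-d", shlex.quote(json.dumps(body))]
    return " ".join(parts)


cmd = build_curl(
    "GET", "localhost", "_count",
    query={"pretty": "true"},
    body={"query": {"match_all": {}}},
)
print(cmd)
```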
  37. 37. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | RESTful API with JSON over HTTP curl -XGET 'http://localhost:9200/_count?pretty' -d ' { "query": { "match_all": {} } } ' { "count" : 0, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 } }
  38. 38. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Clarification • Relational DB: Databases -> Tables -> Rows -> Columns • Elasticsearch: Indices -> Types -> Documents -> Fields • Index (noun) • Like a database in a traditional relational database: the place to store related documents. • Index (verb) • To index a document is to store it in an index (noun) so that it can be retrieved and queried (like INSERT in SQL). • Inverted index • Relational databases add an index, such as a B-tree index, to specific columns to speed up data retrieval; Elasticsearch and Lucene use a structure called an inverted index for the same purpose.
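A toy inverted index makes the idea concrete. This is a minimal sketch that splits on whitespace and lowercases. Real analyzers do much more, and `build_inverted_index` is a hypothetical name. The two texts are the "about" fields used in the indexing examples of this deck.

```python
from collections import defaultdict

docs = {
    1: "I love to go rock climbing",
    2: "I like to collect rock albums",
}


def build_inverted_index(documents):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


index = build_inverted_index(docs)
print(sorted(index["rock"]))      # [1, 2]
print(sorted(index["climbing"]))  # [1]
```

A lookup by term then returns the matching documents directly, without scanning every document.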
  39. 39. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Indexing a document PUT /cnrs/employee/1 { "first_name" : "John", "last_name" : "Smith", "age" : 25, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ] } • cnrs • The index name • employee • The type name • /1 • The ID of this particular employee PUT /cnrs/employee/2 { "first_name" : "Jane", "last_name" : "Smith", "age" : 32, "about" : "I like to collect rock albums", "interests": [ "music" ] }
  40. 40. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Retrieving a document GET /cnrs/employee/1 { "_index" : "cnrs", "_type" : "employee", "_id" : "1", "_version" : 1, "found" : true, "_source" : { "first_name" : "John", "last_name" : "Smith", "age" : 25, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ] } }
  41. 41. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Deleting a document DELETE /cnrs/employee/1
  42. 42. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Search GET /cnrs/employee/_search { "took": 6, "timed_out": false, "_shards": { ... }, "hits": { "total": 2, "max_score": 1, "hits": [ { "_index": "cnrs", "_type": "employee", "_id": "1", "_score": 1, "_source": { "first_name": "John", "last_name": "Smith", "age": 25, "about": "I love to go rock climbing", "interests": [ "sports", "music" ] } }, { "_index": "cnrs", "_type": "employee", "_id": "2", "_score": 1, "_source": { "first_name": "Jane", "last_name": "Smith", "age": 32, "about": "I like to collect rock albums", "interests": [ "music" ] } } ] } } GET /cnrs/employee/_search?q=last_name:Smith { ... "hits": { "total": 2, "max_score": 0.30685282, "hits": [ { ... "_source": { "first_name": "John", "last_name": "Smith", "age": 25, "about": "I love to go rock climbing", "interests": [ "sports", "music" ] } }, { ... "_source": { "first_name": "Jane", "last_name": "Smith", "age": 32, "about": "I like to collect rock albums", "interests": [ "music" ] } } ] } }
  43. 43. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Search with Query DSL GET /cnrs/employee/_search { "query" : { "match" : { "last_name" : "Smith" } } } Elasticsearch provides a rich, flexible, query language called the query DSL, which allows us to build much more complicated, robust queries.
  44. 44. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Search with Query DSL GET /cnrs/employee/_search { "query" : { "filtered" : { "filter" : { "range" : { "age" : { "gt" : 30 } } }, "query" : { "match" : { "last_name" : "smith" } } } } } { ... "hits": { "total": 1, "max_score": 0.30685282, "hits": [ { ... "_source": { "first_name": "Jane", "last_name": "Smith", "age": 32, "about": "I like to collect rock albums", "interests": [ "music" ] } } ] } }
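In Elasticsearch 2.0 the `filtered` query is deprecated in favor of a `bool` query with `must` and `filter` clauses. The sketch below rewrites the request body from this slide into that form. `filtered_to_bool` is a hypothetical helper, and it only handles the simple query-plus-filter shape shown here.

```python
import json


def filtered_to_bool(query):
    """Rewrite {"filtered": {"query": ..., "filter": ...}} as a bool query."""
    inner = query["filtered"]
    bool_body = {}
    if "query" in inner:
        bool_body["must"] = inner["query"]     # scored part
    if "filter" in inner:
        bool_body["filter"] = inner["filter"]  # yes/no part, not scored
    return {"bool": bool_body}


old_query = {
    "filtered": {
        "filter": {"range": {"age": {"gt": 30}}},
        "query": {"match": {"last_name": "smith"}},
    }
}
new_query = filtered_to_bool(old_query)
print(json.dumps({"query": new_query}, indent=2))
```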
  45. 45. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Full-text Search GET /cnrs/employee/_search { "query" : { "match" : { "about" : "rock climbing" } } } { ... "hits": { "total": 2, "max_score": 0.16273327, "hits": [ { ... "_score": 0.16273327, "_source": { "first_name": "John", "last_name": "Smith", "age": 25, "about": "I love to go rock climbing", "interests": [ "sports", "music" ] } }, { ... "_score": 0.016878016, "_source": { "first_name": "Jane", "last_name": "Smith", "age": 32, "about": "I like to collect rock albums", "interests": [ "music" ] } } ] } }
  46. 46. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Phrase Search GET /cnrs/employee/_search { "query" : { "match_phrase" : { "about" : "rock climbing" } } } { ... "hits": { "total": 1, "max_score": 0.23013961, "hits": [ { ... "_score": 0.23013961, "_source": { "first_name": "John", "last_name": "Smith", "age": 25, "about": "I love to go rock climbing", "interests": [ "sports", "music" ] } } ] } }
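The difference between `match` and `match_phrase` on the two sample documents can be imitated in a few lines. This is a deliberately simplified model (whitespace tokens, no scoring, no slop) rather than what Lucene actually does. The helper names are hypothetical.

```python
def tokens(text):
    return text.lower().split()


def match(field, query):
    """True if any query term occurs in the field (like a default match query)."""
    return any(term in tokens(field) for term in tokens(query))


def match_phrase(field, query):
    """True only if the query terms occur next to each other, in order."""
    f, q = tokens(field), tokens(query)
    return any(f[i:i + len(q)] == q for i in range(len(f) - len(q) + 1))


john = "I love to go rock climbing"
jane = "I like to collect rock albums"
print(match(john, "rock climbing"), match(jane, "rock climbing"))                # True True
print(match_phrase(john, "rock climbing"), match_phrase(jane, "rock climbing"))  # True False
```

This mirrors the results above: the full-text search returns both employees, the phrase search only John.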
  47. 47. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Highlighting Searches GET /cnrs/employee/_search { "query" : { "match_phrase" : { "about" : "rock climbing" } }, "highlight": { "fields" : { "about" : {} } } } { ... "hits": { "total": 1, "max_score": 0.23013961, "hits": [ { ... "_score": 0.23013961, "_source": { "first_name": "John", "last_name": "Smith", "age": 25, "about": "I love to go rock climbing", "interests": [ "sports", "music" ] }, "highlight": { "about": [ "I love to go <em>rock</em> <em>climbing</em>" ] } } ] } } The highlighted fragment from the original text
  48. 48. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Search with Query DSL • Full text queries • Match Query • Multi Match Query • Common Terms Query • Query String Query • Simple Query String Query • Term level queries • Term Query • Terms Query • Range Query • Exists Query • Missing Query • Prefix Query • Wildcard Query • Regexp Query • Fuzzy Query • Type Query • Ids Query • Compound queries • Constant Score Query • Bool Query • Dis Max Query • Function Score Query • Boosting Query • Indices Query • And Query • Not Query • Or Query • Filtered Query • Limit Query • Joining queries • Nested Query • Has Child Query • Has Parent Query • Geo queries • GeoShape Query • Geo Bounding Box Query • Geo Distance Query • Geo Distance Range Query • Geo Polygon Query • Geohash Cell Query • Specialized queries • More Like This Query • Template Query • Script Query
  50. 50. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Aggregations GET /cnrs/employee/_search { "query": { "match": { "last_name": "smith" } }, "aggs": { "all_interests": { "terms": { "field": "interests" } } } } ... "all_interests": { "buckets": [ { "key": "music", "doc_count": 2 }, { "key": "sports", "doc_count": 1 } ] }
  51. 51. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Aggregations GET /cnrs/employee/_search { "aggs" : { "all_interests" : { "terms" : { "field" : "interests" }, "aggs" : { "avg_age" : { "avg" : { "field" : "age" } } } } } } ... "all_interests": { "buckets": [ { "key": "music", "doc_count": 2, "avg_age": { "value": 28.5 } }, { "key": "sports", "doc_count": 1, "avg_age": { "value": 25 } } ] }
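What the nested aggregation computes can be reproduced locally on the two sample employees. This is a plain-Python sketch of the bucketing logic, not a call to Elasticsearch. `terms_with_avg` is a hypothetical name.

```python
from collections import defaultdict

employees = [
    {"first_name": "John", "age": 25, "interests": ["sports", "music"]},
    {"first_name": "Jane", "age": 32, "interests": ["music"]},
]


def terms_with_avg(docs, terms_field, avg_field):
    """One bucket per distinct term, with a document count and an average."""
    groups = defaultdict(list)
    for doc in docs:
        for key in doc[terms_field]:
            groups[key].append(doc[avg_field])
    buckets = [
        {"key": k, "doc_count": len(v), f"avg_{avg_field}": {"value": sum(v) / len(v)}}
        for k, v in groups.items()
    ]
    # Largest buckets first, ties broken by key.
    return sorted(buckets, key=lambda b: (-b["doc_count"], b["key"]))


buckets = terms_with_avg(employees, "interests", "age")
for bucket in buckets:
    print(bucket)
```

The "music" bucket averages 25 and 32 to 28.5, and the "sports" bucket holds only John, as in the response above.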
  52. 52. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Aggregations { "aggs" : { "my_ip_ranges" : { "ip_range" : { "field" : "ip", "ranges" : [ { "to" : "10.0.0.5" }, { "from" : "10.0.0.5" } ] } } } } { ... "aggregations": { "my_ip_ranges": { "buckets" : [ { "to": 167772165, "to_as_string": "10.0.0.5", "doc_count": 4 }, { "from": 167772165, "from_as_string": "10.0.0.5", "doc_count": 6 } ] } } }
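The numeric `to` and `from` values in the response are the IPv4 address written as a 32-bit integer. The sketch below checks that with the standard library and assigns an address to one of the two buckets, treating `from` as inclusive and `to` as exclusive. The helper names are hypothetical.

```python
import ipaddress


def ip_to_long(ip: str) -> int:
    """IPv4 address as the integer shown in the aggregation response."""
    return int(ipaddress.IPv4Address(ip))


def bucket_for(ip: str, boundary: str = "10.0.0.5") -> str:
    """Which of the two ranges an address falls into."""
    return "to" if ip_to_long(ip) < ip_to_long(boundary) else "from"


print(ip_to_long("10.0.0.5"))   # 167772165
print(bucket_for("10.0.0.4"))   # to
print(bucket_for("10.0.0.5"))   # from
```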
  53. 53. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Aggregations { "aggs" : { "articles_over_time" : { "date_histogram" : { "field" : "date", "interval" : "month" } } } } { "aggs" : { "articles_over_time" : { "date_histogram" : { "field" : "date", "interval" : "1.5h" } } } }
  54. 54. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Aggregations • Metrics Aggregations • Avg Aggregation • Cardinality Aggregation • Extended Stats Aggregation • Geo Bounds Aggregation • Max Aggregation • Min Aggregation • Percentiles Aggregation • Percentile Ranks Aggregation • Scripted Metric Aggregation • Stats Aggregation • Sum Aggregation • Top hits Aggregation • Value Count Aggregation • Bucket Aggregations • Children Aggregation • Date Histogram Aggregation • Date Range Aggregation • Filter Aggregation • Geo Distance Aggregation • GeoHash grid Aggregation • Histogram Aggregation • IPv4 Range Aggregation • Missing Aggregation • Nested Aggregation • Range Aggregation • Reverse nested Aggregation • Sampler Aggregation • Significant Terms Aggregation • Terms Aggregation
  55. 55. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Plugins • Plugins Types • Java plugins • JAR files • Must be installed on all nodes in the cluster • Each node must be restarted • Site plugins • Web content: JS, HTML, CSS etc. • Can be only on one node • Do not require a restart • Mixed plugins • Both JAR files and web content to enhance the core Elasticsearch functionality
  56. 56. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Plugins • API extension Plugins • Alerting Plugins • Analysis Plugins • Discovery Plugins • Management and Site Plugins • Mapper Plugins • Scripting Plugins • Security Plugins • Snapshot/Restore Plugins • Transport Plugins • Integrations to enhance the core Elasticsearch functionality
  57. 57. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Plugins - kopf Web administration tool for Elasticsearch cluster sudo [/usr/share/elasticsearch/]bin/plugin install [plugin_name] sudo [/usr/share/elasticsearch/]bin/plugin install lmenezes/elasticsearch-kopf sudo [/usr/share/elasticsearch/]bin/plugin install lmenezes/elasticsearch-kopf/2.x open http://localhost:9200/_plugin/kopf
  63. 63. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Plugins - kopf Web administration tool for Elasticsearch cluster index 5k/s larger document (full tweets) index 23k/s smaller document (tweet date, text, etc.) index 67k/s just 140 characters and ID
  65. 65. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Plugins - kopf Web administration tool for Elasticsearch cluster sudo service elasticsearch stop
  66. 66. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Plugins - kopf Web administration tool for Elasticsearch cluster sudo service elasticsearch start
  69. 69. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Full-text search and Data Analytics • Web of Science • Text Mining and NLP • Keyword Extractions • Phrase Occurrence • Phrase Co-Occurrence • Keyword Analytics • Date Histogram • Significant Terms • N-Grams • etc.
  75. 75. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Full-text search and Data Analytics • Boolean • Query String • Aggregation • Date histogram • Query cache • nope!
  76. 76. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Full-text search and Data Analytics { "bool": { "must": { "match": { "title": "how to make millions" }}, "must_not": { "match": { "tag": "spam" }}, "should": [ { "match": { "tag": "starred" }}, { "range": { "date": { "gte": "2014-01-01" }}} ] } } • The title field must match “how to make millions” • The document must not be tagged as spam • Documents that are starred or dated from 2014 onward rank higher • Documents that match both conditions rank even higher
  77. 77. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Full-text search and Data Analytics • Filters • for binary yes/no searches • for queries on exact values • Exists • just the ones with abstract != null • Query cache • true!
  78. 78. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Full-text search and Data Analytics My bool vs. filter = ~500ms vs. ~50ms - ~120ms
  80. 80. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Full-text search and Data Analytics • Query DSL: Filters • And Filter • Bool Filter • Exists Filter • Geo Bounding Box Filter • Geo Distance Filter • Geo Distance Range Filter • Geo Polygon Filter • GeoShape Filter • Geohash Cell Filter • Has Child Filter • Has Parent Filter • Ids Filter • Indices Filter • Limit Filter • Match All Filter • Missing Filter • Nested Filter • Not Filter • Or Filter • Prefix Filter • Query Filter • Range Filter • Regexp Filter • Script Filter • Term Filter • Terms Filter • Type Filter
  83. 83. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Full-text search and Data Analytics • Significant Terms Aggregation • JLH score • Mutual information • Chi square • Google normalized distance • Percentage • Scripted: "script_heuristic": { "script": "_subset_freq/(_superset_freq - _subset_freq + 1)" } See Yang and Pedersen, "A Comparative Study on Feature Selection in Text Categorization", 1997 (http://courses.ischool.berkeley.edu/i256/f06/papers/yang97comparative.pdf) for a study on using significant terms for feature selection in text classification.
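The scripted heuristic on this slide is easy to evaluate by hand. In the sketch below, `subset_freq` stands for the number of documents containing the term in the subset (the query results) and `superset_freq` for the number in the superset (the background set). The sample counts are made up for illustration.

```python
def script_heuristic(subset_freq: int, superset_freq: int) -> float:
    """_subset_freq / (_superset_freq - _subset_freq + 1), as on the slide."""
    return subset_freq / (superset_freq - subset_freq + 1)


# A term found in 10 result documents and almost nowhere else scores high...
rare_elsewhere = script_heuristic(10, 12)
# ...while a term that is common everywhere scores low.
common_everywhere = script_heuristic(10, 1000)
print(rare_elsewhere, common_everywhere)
```

The score grows when a term appears in the subset but rarely outside it, which is what makes a term "significant".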
  85. 85. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases The Elastic Platform | Make Sense of Your Data
  88. 88. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Logs! • Check Nginx access logs between 2015-11-23T10:23:10 and 2015-11-24T21:53:30 • Check Suricata alert >= 2 between 2015-11-23T10:23:10 and 2015-11-24T21:53:30 and with type of DNS • Now correlate the results!!!
  90. 90. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Logs! • Old School tools • grep/sed/awk/cut/sort • Manually analyze the output • Different formats • Customized fields and details • Not centralized • Modern way (the right way!) • Define endpoints (input/output) • Correlate patterns • Store data (searchable and visualizable)
  91. 91. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Logs! • Symantec Security Information Manager • Splunk • HP / ArcSight • Tripwire • NetIQ • Quest Software • IBM/Q1 Labs • Novell • graylog • fluentd
  92. 92. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Logstash | Collect, Enrich & Transport Data • Process Any Data, From Any Source • Centralize data processing of all types • Normalize varying schema and formats • Quickly extend to custom log formats • Easily add plugins for custom data sources • https://github.com/elastic/logstash
  93. 93. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Logstash | Collect, Enrich & Transport Data input { stdin { } } filter { grok { match => { "message" => "%{COMBINEDAPACHELOG}" } } date { match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ] } } output { elasticsearch { hosts => ["localhost:9200"] } stdout { codec => rubydebug } }
  94. 94. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Logstash | Collect, Enrich & Transport Data 127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] "GET /xampp/status.php HTTP/1.1" 200 3891 "http://cadenza/xampp/navi.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0" { "message" => "127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/status.php HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"", "@timestamp" => "2013-12-11T08:01:45.000Z", "@version" => "1", "host" => "cadenza", "clientip" => "127.0.0.1", "ident" => "-", "auth" => "-", "timestamp" => "11/Dec/2013:00:01:45 -0800", "verb" => "GET", "request" => "/xampp/status.php", "httpversion" => "1.1", "response" => "200", "bytes" => "3891", "referrer" => "\"http://cadenza/xampp/navi.php\"", "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"" }
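What the grok and date filters do to this log line can be approximated in plain Python. The regular expression below is a simplified stand-in for the `COMBINEDAPACHELOG` pattern, written for this one example rather than taken from Logstash. Unlike the event above, it captures the referrer and agent without their surrounding quotes.

```python
import re
from datetime import datetime, timezone

# Simplified approximation of a combined access log line.
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

line = (
    '127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] '
    '"GET /xampp/status.php HTTP/1.1" 200 3891 '
    '"http://cadenza/xampp/navi.php" '
    '"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"'
)

event = COMBINED.match(line).groupdict()
# Equivalent of the date filter: parse "dd/MMM/yyyy:HH:mm:ss Z" and convert to UTC.
parsed = datetime.strptime(event["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
event["@timestamp"] = parsed.astimezone(timezone.utc).isoformat()
print(event["verb"], event["request"], event["response"], event["@timestamp"])
```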
  95. 95. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases An input plugin enables a specific source of events to be read by Logstash. • Input Plugins • beats • couchdb_changes • drupal_dblog • elasticsearch • exec • eventlog • file • ganglia • gelf • generator • graphite • github • heartbeat • heroku • http • http_poller • irc • imap • jdbc • jmx • kafka • log4j • lumberjack • meetup • pipe • puppet_facter • relp • rss • rackspace • rabbitmq • redis • salesforce • snmptrap • stdin • sqlite • s3 • sqs • stomp • syslog • tcp • twitter • unix • udp • varnishlog • wmi • websocket • xmpp • zenoss • zeromq
  96. 96. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases An output plugin sends event data to a particular destination. • Output Plugins • boundary • circonus • csv • cloudwatch • datadog • datadog_metrics • email • elasticsearch • elasticsearch_java • exec • file • google_bigquery • google_cloud_storage • ganglia • gelf • graphtastic • graphite • hipchat • http • irc • influxdb • juggernaut • jira • kafka • lumberjack • librato • loggly • mongodb • metriccatcher • nagios • null • nagios_nsca • opentsdb • pagerduty • pipe • riemann • redmine • rackspace • rabbitmq • redis • riak • s3 • sqs • stomp • statsd • solr_http • sns • syslog • stdout • tcp • udp • webhdfs • websocket • xmpp • zabbix • zeromq
  97. 97. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Logstash-forwarder: syslog, auth, ufw and nginx { "network": { "servers": [ "10.0.0.2:5000" ], "timeout": 15, "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt" }, "files": [ { "paths": [ "/var/log/syslog", "/var/log/auth.log" ], "fields": { "type": "syslog" } }, { "paths": [ "/var/log/ufw.log" ], "fields": {"type": "firewall"} }, { "paths": [ "/var/log/nginx/*.log" ], "exclude": ["*.gz", "err*.log", "*.log.*"], "fields": { "type": "nginx-api" } } ] }
  98. 98. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Logstash Server: syslog, auth, ufw and nginx input { lumberjack { port => 5000 type => "logs" ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt" ssl_key => "/etc/pki/tls/private/logstash-forwarder.key" } }
  99. 99. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Logstash Server: syslog, auth, ufw and nginx filter { if [type] == "syslog" { grok { match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" } add_field => [ "received_at", "%{@timestamp}" ] add_field => [ "received_from", "%{host}" ] } syslog_pri { } date { match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ] } } }
  100. 100. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Logstash Server: syslog, auth, ufw and nginx output { elasticsearch { host => "10.0.0.25" port => "9300" cluster => "iscpif-es" } }
  101. 101. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Logstash: Suricata | Open Source IDS / IPS / NSM engine • Highly Scalable • Suricata is multi-threaded • Protocol Identification • Suricata is a malware command-and-control (CnC) channel hunter. • It catches off-port HTTP CnC channels, which normally slide right by most IDS systems • Thanks to dedicated keywords you can match on protocol fields, which range from an HTTP URI to an SSL certificate identifier. • File Identification, MD5 Checksums, and File Extraction • Identify thousands of file types as they cross your network!
  102. 102. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Suricata.yml - eve-log: enabled: yes type: file #file|syslog|unix_dgram|unix_stream filename: eve.json types: - alert - http: extended: yes - dns - tls: extended: yes # enable this for extended logging information - files: force-magic: yes # force logging magic on all logged files force-md5: yes # force logging of md5 checksums - ssh
  103. 103. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Logstash-forwarder: Suricata { # The network section covers network configuration :) "network": { "servers": [ "10.0.0.2:5000" ], "timeout": 15, "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt" }, "files": [{ "paths": ["/var/log/suricata/eve.json"], "fields": { "type": "suricata" }, "sincedb_path": "/var/logstash/suricata.db", "sincedb_write_interval": 1, "codec": "json", "type":"suricata" }] }
  104. 104. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Logstash server: Suricata filter { if [type] == "suricata" { json{ source => "message" } if [src_ip] { geoip { source => "src_ip" target => "geoip" database => "/etc/logstash/GeoLiteCity.dat" add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ] add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ] } mutate { convert => [ "[geoip][coordinates]", "float" ] remove_field => [ "timestamp" ] } } } }
  106. 106. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Logstash: Suricata • Logstash daily index • index template • easy to retire an index • close/delete • 22 machines • only 2 with a public IP • Logs • between 1 and 3 million per day
  107. 107. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Logstash: Suricata • 73 million docs in < 2 days! • during Mongodump • transferring remotely • Watch out for Suricata • Stream events! • SURICATA STREAM Packet with invalid ack • And lots of other stream alerts! • I disabled it! Maybe I am wrong :)
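Daily indices are easy to retire because the date is part of the index name. The sketch below assumes the default Logstash naming scheme `logstash-YYYY.MM.dd`. The 7-day retention period and the helper names are illustrative assumptions. It only selects names and leaves the actual close or delete call to the reader.

```python
from datetime import date, timedelta


def daily_index(day: date, prefix: str = "logstash") -> str:
    """Index name for one day, e.g. logstash-2015.11.25."""
    return f"{prefix}-{day:%Y.%m.%d}"


def indices_to_retire(existing, today: date, keep_days: int):
    """Names older than the retention window; zero-padded dates sort as strings."""
    cutoff = daily_index(today - timedelta(days=keep_days))
    return sorted(name for name in existing if name < cutoff)


existing = ["logstash-2015.11.10", "logstash-2015.11.18", "logstash-2015.11.24"]
old = indices_to_retire(existing, today=date(2015, 11, 25), keep_days=7)
print(old)  # ['logstash-2015.11.10']
```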
  108. 108. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Kibana | Explore & Visualize Your Data • Seamless Integration with Elasticsearch • Give Shape to Your Data • Sophisticated Analytics • Empower More Team Members • Flexible Interface, Easy to Share • Easy Setup • Visualize Data from Many Sources • Simple Data Export
  109. 109. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases Kibana 4.2.1 | Compatible with Elasticsearch 2.x
  110. 110. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | ISCPIF Use Cases The ELK Stack | Elasticsearch, Logstash and Kibana Elasticsearch 1.7.3 (2.0.0) Logstash 1.5.5 (2.0.0) Kibana 3 (4.2.1)
  120. 120. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Elasticsearch Mapping • Which string fields should be full text fields. • Which fields contain numbers, dates, or geolocations. • The format of date values. • a simple type like string, date, long, double, boolean or ip. • a type which supports the hierarchical nature of JSON such as object or nested. • or a specialized type like geo_point, geo_shape, or completion. • multi-fields • a string field could be indexed as an analyzed field for full-text search, and as a not_analyzed field for sorting or aggregations. • Alternatively, you could index a string field with the standard analyzer, the english analyzer, and the french analyzer.
  121. 121. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Elasticsearch Mapping • Dynamic mapping • Fields and mapping types do not need to be defined before being used. • Explicit mappings • You can create mapping types and field mappings when you create an index • Updating existing mappings • Existing type and field mappings cannot be updated • Create a new index with the correct mappings and reindex your data
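Dynamic mapping can be pictured as a type-guessing step applied to each new field. The sketch below is a simplified Python approximation, not Elasticsearch's actual implementation: `infer_field_type` and `infer_mapping` are hypothetical helpers, the sample document is made up, and date detection is reduced to a single `YYYY-MM-DD` pattern.

```python
import json
from datetime import datetime


def infer_field_type(value):
    """Guess an Elasticsearch 2.x field type from a JSON value (simplified)."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, str):
        # Assumption: a single date pattern stands in for real date detection.
        try:
            datetime.strptime(value, "%Y-%m-%d")
            return "date"
        except ValueError:
            return "string"
    raise TypeError(f"unsupported JSON value: {value!r}")


def infer_mapping(document):
    """Map every top-level field of a document to its guessed type."""
    return {field: infer_field_type(value) for field, value in document.items()}


# Made-up document, for illustration only.
doc = json.loads(
    '{"title": "Hello", "age": 42, "score": 1.5, "active": true,'
    ' "created": "2015-11-25", "user": {"name": "x"}}'
)
mapping = infer_mapping(doc)
```

Because a guessed type cannot be changed afterwards, a wrong guess (for example `long` where `integer` was wanted) is the usual reason to switch to explicit mappings or dynamic templates.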
  122. 122. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Mapping Explicit PUT my_index { "mappings": { "user": { "_all": { "enabled": false }, "properties": { "title": { "type": "string" }, "name": { "type": "string" }, "age": { "type": "integer" } } }, "blogpost": { "properties": { "title": { "type": "string" }, "body": { "type": "string" }, "user_id": { "type": "string", "index": "not_analyzed" }, "created": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis" } } } } }
  123. 123. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Mapping Dynamic templates PUT my_index { "mappings": { "my_type": { "dynamic_templates": [ { "integers": { "match_mapping_type": "long", "mapping": { "type": "integer" } } }, { "strings": { "match_mapping_type": "string", "mapping": { "type": "string", "fields": { "raw": { "type": "string", "index": "not_analyzed", "ignore_above": 256 } } } } } ] } } }
  124. 124. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Mapping Dynamic templates PUT _template/my_template { "template": "logs-*", "settings": { "index.number_of_replicas": "0", "index.number_of_shards": "3" }, "mappings": { "my_type": { "dynamic_templates": [ { "integers": { "match_mapping_type": "long", "mapping": { "type": "integer" } } }, { "strings": { "match_mapping_type": "string", "mapping": { "type": "string", "fields": { "raw": { "type": "string", "index": "not_analyzed", "ignore_above": 256 } } } } } ] } } }
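The `"template": "logs-*"` entry is a wildcard pattern: the settings and mappings apply to every newly created index whose name matches it. A minimal sketch of that matching, assuming shell-style globbing from Python's `fnmatch` as a stand-in (it also accepts `?` and `[...]`, which is broader than the simple `*` wildcard used here); `template_applies` and the index names are hypothetical.

```python
from fnmatch import fnmatchcase

# Pattern taken from the slide above.
TEMPLATE_PATTERN = "logs-*"


def template_applies(index_name, pattern=TEMPLATE_PATTERN):
    """Return True when a new index name falls under the template pattern."""
    return fnmatchcase(index_name, pattern)


# Hypothetical index names, for illustration only.
candidates = ["logs-2015.11.25", "logs-nginx", "metrics-2015.11.25", "logs"]
matching = [name for name in candidates if template_applies(name)]
```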
  125. 125. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Production • Hardware • Memory • 64 GB of RAM is the ideal sweet spot • 16 GB of RAM for heap size on a machine with 32 GB total • And don’t let the heap cross 30.5 GB! • CPU • 2-8 CPU cores • Faster CPUs vs. more cores = choose more cores • Disk • SSDs (monitor the I/O) • High-performance server disks (15k RPM) • RAID 0 (ES is highly available through replicas)
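The memory bullets can be read as a sizing rule. A minimal sketch, assuming the common guidance of giving half of the machine's RAM to the JVM heap (which is what 16 GB of heap on a 32 GB machine amounts to) and capping it at the 30.5 GB limit named on the slide; `recommended_heap_gb` is a hypothetical helper, not an Elasticsearch API.

```python
# Upper bound taken from the slide ("don't cross 30.5 GB").
MAX_HEAP_GB = 30.5


def recommended_heap_gb(total_ram_gb):
    """Half of the RAM for the heap, never more than MAX_HEAP_GB."""
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    # The other half is left to the operating system's file cache.
    return min(total_ram_gb / 2, MAX_HEAP_GB)


heap_for_32 = recommended_heap_gb(32)
heap_for_64 = recommended_heap_gb(64)
```

Under this rule, adding RAM beyond 64 GB per node no longer grows the heap, which is one reason that size is described as the sweet spot.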
  126. 126. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Security • No built-in authentication • Do not expose Elasticsearch to the world • Watch out for denial of service • Do not let users define index names (like ",*") • Turn off dynamic scripting (off by default) • Control HTTP methods (DELETE, PUT, etc.) • Nginx (reverse proxy, SSL and auth) • Apache
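Two of these rules, restricting HTTP methods and refusing user-supplied index names, are usually enforced in the reverse proxy or in the API layer in front of Elasticsearch. The Python sketch below only illustrates that filtering logic: `is_request_allowed`, the read-only method list and the index names are assumptions, not part of Elasticsearch or Nginx.

```python
# Assumption: the public API is read-only and exposes two known indices.
ALLOWED_METHODS = {"GET", "HEAD"}
ALLOWED_INDICES = {"tweets", "news"}  # hypothetical index names


def is_request_allowed(method, path):
    """Accept a request only for allow-listed methods and index names."""
    if method.upper() not in ALLOWED_METHODS:
        return False
    # The first path segment carries the index name(s), comma-separated.
    first_segment = path.lstrip("/").split("/", 1)[0]
    requested = first_segment.split(",")
    # Wildcards, "_all" and unknown names are all rejected by the allow-list.
    return all(name in ALLOWED_INDICES for name in requested)
```

An allow-list is used rather than a block-list so that patterns nobody anticipated are rejected by default.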
  127. 127. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Elasticsearch 2.0 | Lots of things! • Analyzer and tokenizer (human language) • Log slow ops • Index settings • refresh interval • flush interval • Differentiate your nodes • Data nodes • Master nodes • Client nodes • Cluster health • Heap size • Thread pools • Merging time • etc.
  128. 128. “Launch your GIANT idea” INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB v3.2 is coming soon. Learn more.
  129. 129. • Document Database • Documents (i.e. objects) • Embedded documents and arrays (no expensive joins!) • Dynamic schema supports fluent polymorphism. • High Availability • automatic failover • data redundancy • Automatic Scaling • Automatic sharding INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | Key Features
  130. 130. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | Platforms
  131. 131. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | Install MongoDB on Ubuntu • Import the public key used by the package management system: sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10 • Create a list file for MongoDB: echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.0.list • Install MongoDB 3.0.7: sudo apt-get update && sudo apt-get install -y mongodb-org • sudo service mongod start
  132. 132. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | SQL to MongoDB
  133. 133. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | SQL to MongoDB https://docs.mongodb.org/manual/reference/sql-comparison/
  134. 134. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | Insert Documents
  135. 135. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | Query Documents • All documents: db.inventory.find( {} ) • Equality: db.inventory.find( { type: "snacks" } ) • Query operator: db.inventory.find( { type: { $in: [ 'food', 'snacks' ] } } ) • AND condition: db.inventory.find( { type: 'food', price: { $lt: 9.95 } } ) • OR condition: db.inventory.find( { $or: [ { qty: { $gt: 100 } }, { price: { $lt: 9.95 } } ] } )
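The semantics of these five queries can be reproduced on a plain Python list. The sketch below is a toy matcher, not the MongoDB driver: `matches` and `find` are hypothetical helpers that support only equality, `$in`, `$lt`, `$gt` and `$or`, and the inventory documents are made up.

```python
def matches(doc, query):
    """Return True if doc satisfies every clause of query (implicit AND)."""
    for key, condition in query.items():
        if key == "$or":
            # At least one sub-query must match.
            if not any(matches(doc, sub) for sub in condition):
                return False
        elif isinstance(condition, dict):
            value = doc.get(key)
            for op, operand in condition.items():
                if op == "$in":
                    ok = value in operand
                elif op == "$lt":
                    ok = value is not None and value < operand
                elif op == "$gt":
                    ok = value is not None and value > operand
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
        elif doc.get(key) != condition:
            return False
    return True


def find(collection, query):
    """Return the documents of collection that match query."""
    return [doc for doc in collection if matches(doc, query)]


# Made-up inventory, for illustration only.
inventory = [
    {"item": "chips", "type": "snacks", "qty": 150, "price": 2.50},
    {"item": "rice", "type": "food", "qty": 50, "price": 12.00},
    {"item": "bread", "type": "food", "qty": 20, "price": 3.00},
    {"item": "pen", "type": "office", "qty": 10, "price": 15.00},
]

cheap_food = find(inventory, {"type": "food", "price": {"$lt": 9.95}})
bulk_or_cheap = find(
    inventory, {"$or": [{"qty": {"$gt": 100}}, {"price": {"$lt": 9.95}}]}
)
```

Several fields in one query document are combined with AND, which is why only `$or` needs an explicit operator.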
  136. 136. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | Query Documents db.tweets.find( { "coordinates.coordinates": { $near : { $geometry: { type: "Point", coordinates: [ 2.3325923, 48.8537095] }, $minDistance: 1, $maxDistance: 500 } } } )
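With a GeoJSON point, `$minDistance` and `$maxDistance` are expressed in meters and results come back nearest first. The sketch below approximates that filter in plain Python with the haversine formula on a spherical Earth, so distances will differ slightly from MongoDB's; `haversine_m`, `near` and the sample points other than the query center are assumptions for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius, spherical approximation


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two (lon, lat) points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def near(points, center, min_m, max_m):
    """Keep points whose distance to center is in [min_m, max_m], nearest first."""
    hits = []
    for lon, lat in points:
        distance = haversine_m(center[0], center[1], lon, lat)
        if min_m <= distance <= max_m:
            hits.append(((lon, lat), distance))
    return sorted(hits, key=lambda hit: hit[1])


# Query center from the slide, in GeoJSON order: [longitude, latitude].
center = (2.3325923, 48.8537095)
# Made-up points: the center itself, one 0.001 degree north, one 0.01 degree north.
points = [center, (2.3325923, 48.8547095), (2.3325923, 48.8637095)]
hits = near(points, center, min_m=1, max_m=500)
```

Only the middle point survives: the center is closer than the 1 m minimum and the last point is more than 500 m away.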
  137. 137. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | Aggregation • Sample document: { "_id": "10280", "city": "NEW YORK", "state": "NY", "pop": 5574, "loc": [ -74.016323, 40.710537 ] } • Pipeline: db.zipcodes.aggregate( [ { $group: { _id: "$state", totalPop: { $sum: "$pop" } } }, { $match: { totalPop: { $gte: 10*1000*1000 } } } ] ) • Document after the $group stage (before $match): { "_id" : "AK", "totalPop" : 550043 } • SQL equivalent: SELECT state, SUM(pop) AS totalPop FROM zipcodes GROUP BY state HAVING totalPop >= (10*1000*1000)
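The two pipeline stages map onto a group-then-filter pass over the documents. A minimal Python sketch of that logic, assuming made-up population figures (only the first document comes from the slide); `total_pop_by_state` is a hypothetical helper and the result is sorted by state only to make the output deterministic, which MongoDB does not guarantee.

```python
from collections import defaultdict


def total_pop_by_state(zipcodes, min_total):
    """Sum 'pop' per state ($group), then keep large totals ($match)."""
    totals = defaultdict(int)
    for doc in zipcodes:
        totals[doc["state"]] += doc["pop"]  # $sum: "$pop" grouped by "$state"
    return [
        {"_id": state, "totalPop": total}
        for state, total in sorted(totals.items())
        if total >= min_total               # equivalent of SQL HAVING
    ]


zipcodes = [
    {"_id": "10280", "city": "NEW YORK", "state": "NY", "pop": 5574},
    # The documents below are invented for the example.
    {"_id": "00001", "city": "EXAMPLE CITY", "state": "NY", "pop": 12000000},
    {"_id": "00002", "city": "EXAMPLE TOWN", "state": "AK", "pop": 550043},
]
big_states = total_pop_by_state(zipcodes, min_total=10 * 1000 * 1000)
```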
  138. 138. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) db.tweets.aggregate([ { $geoNear: { near: [ 2.348730, 48.840982 ], distanceField: "dist.calculated", maxDistance: 100, includeLocs: "dist.location", query: {"coordinates.type": "Point"}, limit: 100 } }, { $group: { _id: "$user.id", count: { $sum: 1 }, name: { $addToSet: "$user.name" }, date: { $addToSet: "$created_at" }, text: { $addToSet: "$text" }, coordinates: { $addToSet: "$coordinates" } } }, {$sort: {"count": -1}}, {$limit: 10} ]); MongoDB | Aggregation
  139. 139. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | MapReduce
  140. 140. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | Index • Creating an index • db.ships.ensureIndex({name : 1}) • Dropping an index • db.ships.dropIndex({name : 1}) • Creating a compound index • db.ships.ensureIndex({name : 1, operator : 1, class : -1}) • Dropping a compound index • db.ships.dropIndex({name : 1, operator : 1, class : -1})
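In an index specification each field carries a direction, 1 for ascending or -1 for descending, and a compound index orders its entries by the fields from left to right. The Python sketch below reproduces that ordering on a small list; `compound_sort` is a hypothetical helper and the ship documents are made up.

```python
def compound_sort(docs, index_spec):
    """Order docs the way a compound index on index_spec would."""
    for field, direction in index_spec.items():
        if direction not in (1, -1):
            raise ValueError(f"direction for {field!r} must be 1 or -1")
    ordered = list(docs)
    # Sort by the least significant field first; Python's sort is stable,
    # so earlier (more significant) fields end up dominating the order.
    for field, direction in reversed(list(index_spec.items())):
        ordered.sort(key=lambda doc: doc[field], reverse=(direction == -1))
    return ordered


# Made-up documents, for illustration only.
ships = [
    {"name": "Voyager", "operator": "Starfleet", "class": "Intrepid"},
    {"name": "Enterprise", "operator": "Starfleet", "class": "Galaxy"},
    {"name": "Enterprise", "operator": "Starfleet", "class": "Constitution"},
]
ordered = compound_sort(ships, {"name": 1, "operator": 1, "class": -1})
```

The two `Enterprise` entries stay together and are ordered by descending `class`, which is the order a query sorting on `{name: 1, operator: 1, class: -1}` can read straight from the index.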
  141. 141. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | Replica Set
  142. 142. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | Replica Set • Replica Set Members • Replica Set Primary • Accepts write operations • Replica Set Secondary Members • Replicate the primary’s data set and accept read operations • Priority 0 Replica Set Members • Priority 0 members are secondaries that cannot become the primary. • Hidden Replica Set Members • Invisible to applications • Replica Set Arbiter
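The member roles differ mainly in two properties: whether a member may be elected primary and whether applications can read from it. The sketch below models just those rules in Python; the `Member` class, its field names and the hostnames are assumptions for illustration and do not mirror the real replica set configuration document in full.

```python
from dataclasses import dataclass


@dataclass
class Member:
    host: str
    priority: float = 1
    hidden: bool = False
    arbiter_only: bool = False

    def __post_init__(self):
        # Hidden members must not be electable, so their priority is 0.
        if self.hidden and self.priority != 0:
            raise ValueError("a hidden member must have priority 0")


def can_become_primary(member):
    """Arbiters hold no data and priority 0 members never stand for election."""
    return not member.arbiter_only and member.priority > 0


def serves_reads(member):
    """Hidden members are invisible to applications; arbiters hold no data."""
    return not member.hidden and not member.arbiter_only


# Hypothetical five-member replica set.
members = [
    Member("db1.example.net"),
    Member("db2.example.net"),
    Member("db3.example.net", priority=0),
    Member("db4.example.net", priority=0, hidden=True),
    Member("arb.example.net", arbiter_only=True),
]
electable = [m.host for m in members if can_become_primary(m)]
readable = [m.host for m in members if serves_reads(m)]
```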
  143. 143. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | Sharding
  144. 144. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | 3.0
  145. 145. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | 3.0 • 7-10x Better Performance • Up to 80% Less Storage • Reduce Operational Overhead By Up to 95% • Pluggable Storage Optimized For Your Workload • Low Latency Across the Globe • Enhancements That Make You More Productive • Faster Loading and Export • Easier Query Optimization • Faster Debugging • Richer Geospatial Apps • Better Time-Series Analytics
  146. 146. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | 3.0
  147. 147. Datacenter in France and Italy 60M/week - 8M/day - 360K/hour INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | ISCPIF Use Cases
  148. 148. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | ISCPIF Use Cases Avg. usage of Twitter in Paris in October
  149. 149. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | ISCPIF Use Cases 24HOURS NEWS: Real-time Breaking News
  150. 150. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | ISCPIF Use Cases 24HOURS NEWS: Real-time Breaking News • 50-100 Updates /s • Time-series Queries • Grid-FS • FTS (full-text search) • Tokenizes and stems • Scoring • 140 characters/small dataset! • TTL Index • Session store
  151. 151. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | ISCPIF Use Cases TIMOTHY: Real-time Dashboards
  154. 154. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | ISCPIF Use Cases HIGH THROUGHPUT: Real-time aggregations, over 50K inserts /s
  155. 155. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | ISCPIF Use Cases Most generic Most specific Calculating the new graph HIGH THROUGHPUT: Real-time aggregations, over 50K inserts /s
  156. 156. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | ISCPIF Use Cases 130K inserts /s
  157. 157. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | ISCPIF Use Cases 10K-80K inserts /s
  158. 158. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | ISCPIF Use Cases 9K queries /s
  159. 159. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | ISCPIF Use Cases
  160. 160. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | ISCPIF Use Cases Gephi Streaming
  161. 161. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | Monitoring System
  162. 162. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | Monitoring System Network and Cache
  163. 163. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | Monitoring System Hardware
  164. 164. MongoDB | Monitoring System Hardware INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
  165. 165. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | ISCPIF Use Cases Monitoring NewRelic
  166. 166. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | ISCPIF Use Cases Monitoring NewRelic
  167. 167. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | Monitoring System Objects 3.02 Billion Documents 64 Collections 195 Indexes
  168. 168. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) MongoDB | Monitoring System Objects
  169. 169. “in-memory data structure store” INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) used as database, cache and message broker
  170. 170. • Data structures • strings • hashes • lists • sets • sorted sets • bitmaps • hyperloglogs • geospatial indexes • Built-in • replication • Lua scripting • transactions • on-disk persistence INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Redis | Key Features
  171. 171. Redis KEYS: > set mykey somevalue OK > get mykey "somevalue" Redis LISTS: > rpush mylist A (integer) 1 > rpush mylist B (integer) 2 > lpush mylist first (integer) 3 > lrange mylist 0 -1 1) "first" 2) "A" 3) "B" INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Redis | Data Structures just a little!
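A Redis list is a double-ended sequence: `RPUSH` appends at the tail, `LPUSH` at the head, and `LRANGE` reads an inclusive index range in which -1 means the last element. The Python sketch below imitates those three commands in memory to replay the session above; `MiniList` is a hypothetical class, not a Redis client.

```python
from collections import deque


class MiniList:
    """In-memory imitation of a Redis list (three commands only)."""

    def __init__(self):
        self._items = deque()

    def rpush(self, value):
        self._items.append(value)        # add at the tail
        return len(self._items)          # Redis replies with the new length

    def lpush(self, value):
        self._items.appendleft(value)    # add at the head
        return len(self._items)

    def lrange(self, start, stop):
        items = list(self._items)
        size = len(items)
        # Negative indexes count from the end; stop is inclusive.
        if start < 0:
            start = max(size + start, 0)
        if stop < 0:
            stop = size + stop
        return items[start:stop + 1]


mylist = MiniList()
replies = [mylist.rpush("A"), mylist.rpush("B"), mylist.lpush("first")]
everything = mylist.lrange(0, -1)
```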
  172. 172. maziyar-beautiful-MacBook$ redis-benchmark -q -n 100000 PING_INLINE: 85178.88 requests per second PING_BULK: 80000.00 requests per second SET: 86580.09 requests per second GET: 83263.95 requests per second INCR: 83963.05 requests per second LPUSH: 86880.97 requests per second LPOP: 90252.70 requests per second SADD: 84388.19 requests per second SPOP: 92936.80 requests per second LPUSH (needed to benchmark LRANGE): 87336.24 requests per second LRANGE_100 (first 100 elements): 25614.75 requests per second LRANGE_300 (first 300 elements): 10455.88 requests per second LRANGE_500 (first 450 elements): 7125.04 requests per second LRANGE_600 (first 600 elements): 5369.13 requests per second MSET (10 keys): 50000.00 requests per second INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Redis | How fast is Redis?
  173. 173. maziyar-beautiful-MacBook$ redis-benchmark -n 1000000 -t set,get -P 16 -q PING_INLINE: 735294.12 requests per second PING_BULK: 988142.31 requests per second SET: 681198.88 requests per second GET: 831255.25 requests per second INCR: 778210.12 requests per second LPUSH: 682593.81 requests per second LPOP: 713775.88 requests per second SADD: 732600.75 requests per second SPOP: 885739.62 requests per second LPUSH (needed to benchmark LRANGE): 656598.81 requests per second INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Redis | How fast is Redis? Pipelining of 16 commands
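Pipelining sends a batch of commands (16 here) per network round trip instead of one, so throughput rises even though the server does the same work per command. The Python lines below only do the arithmetic on the SET and GET figures reported in the two benchmark runs; the comparison is approximate because the runs also differ in the total number of requests.

```python
# Requests per second reported without pipelining and with -P 16.
baseline = {"SET": 86580.09, "GET": 83263.95}
pipelined = {"SET": 681198.88, "GET": 831255.25}

# Throughput ratio per command between the two runs.
speedup = {cmd: pipelined[cmd] / baseline[cmd] for cmd in baseline}

for cmd, ratio in speedup.items():
    print(f"{cmd}: {ratio:.2f}x faster with pipelining")
```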
  174. 174. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Redis | ISCPIF Use Cases Scientific Games
  177. 177. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Redis | ISCPIF Use Cases • Scientific Operations • Occurrence • Co-Occurrence • Scientific Games • Pub/Sub • Rate Limiter (IP-based with TTL) • Chat rooms • TTL
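An IP-based rate limiter with TTL is commonly built as one counter per IP address that is incremented on every request and expires after a fixed window. The sketch below imitates that pattern in memory with Python rather than calling Redis; the class name, the limit, the window and the injectable clock are all assumptions for illustration.

```python
import time


class FixedWindowRateLimiter:
    """Per-IP counter with an expiry, imitating INCR plus a key TTL."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._counters = {}  # ip -> (count, expires_at)

    def allow(self, ip):
        now = self.clock()
        entry = self._counters.get(ip)
        if entry is None or now >= entry[1]:
            # No key yet, or its TTL elapsed: start a new window.
            count, expires_at = 0, now + self.window_seconds
        else:
            count, expires_at = entry
        count += 1  # the INCR step
        self._counters[ip] = (count, expires_at)
        return count <= self.limit


# Fake clock so the example does not have to sleep.
fake_now = [0.0]
limiter = FixedWindowRateLimiter(limit=2, window_seconds=10,
                                 clock=lambda: fake_now[0])
first_three = [limiter.allow("203.0.113.7") for _ in range(3)]
fake_now[0] = 10.0  # the window has expired
after_expiry = limiter.allow("203.0.113.7")
```

With a real Redis server the expiry is handled by the key's TTL, so no cleanup code is needed for idle addresses.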
  178. 178. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Redis | ISCPIF Use Cases Monitoring NewRelic
  181. 181. “Messaging that just works” INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) well actually it’s more than that, but OK!
  182. 182. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) RabbitMQ | Feature List A messaging broker • Highlights • Reliability • Flexible Routing • Clustering • Federation • Highly Available Queues • Multi-protocol • Many Clients • Management UI • Plugin System • For what? • Data delivery • Non-blocking operations • Push notifications • Publish / subscribe • Asynchronous processing (work queues)
  183. 183. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) RabbitMQ | Feature List A messaging broker Type Topic Q1 Q2 Q3 climate.* risk.* news.* RabbitMQ Routing
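In a topic exchange the routing key is a dot-separated list of words, and in a binding pattern `*` stands for exactly one word while `#` stands for zero or more. The Python sketch below implements that matching rule without a broker. It assumes that Q1, Q2 and Q3 are bound to `climate.*`, `risk.*` and `news.*` in that order; `topic_matches`, `route` and the routing keys are hypothetical.

```python
def topic_matches(pattern, routing_key):
    """Return True if routing_key matches an AMQP-style topic pattern."""
    p_words = pattern.split(".")
    k_words = routing_key.split(".")

    def match(i, j):
        if i == len(p_words):
            return j == len(k_words)
        if p_words[i] == "#":
            # '#' may swallow zero or more words.
            return any(match(i + 1, nxt) for nxt in range(j, len(k_words) + 1))
        if j < len(k_words) and p_words[i] in ("*", k_words[j]):
            # '*' or a literal word consumes exactly one word.
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


# Assumed bindings, read off the routing diagram.
BINDINGS = {"Q1": "climate.*", "Q2": "risk.*", "Q3": "news.*"}


def route(routing_key):
    """List the queues that would receive a message with this routing key."""
    return [queue for queue, pattern in BINDINGS.items()
            if topic_matches(pattern, routing_key)]


delivered = route("climate.cop21")
dropped = route("climate.cop21.paris")
```

The second key reaches no queue because `*` matches a single word; binding with `climate.#` instead would accept keys of any depth under `climate`.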
  184. 184. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) RabbitMQ | ISCPIF Use Cases Monitoring NewRelic
  185. 185. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) RabbitMQ | ISCPIF Use Cases Monitoring RabbitMQ Management UI
  189. 189. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) RabbitMQ | ISCPIF Use Cases • Distributed Computations • Parsing text files • Scientific calculations • Realtime Processing • Text mining • NLP • Annotation • Keyword extractions • Job Queues • RPC (Remote procedure call) • Topic based routing
  190. 190. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) RabbitMQ | ISCPIF Use Cases • Parsing • 225 files • 10M-20M lines each • Avg. total of 3.3 billion lines • RPC • Post-process each document • Output • MongoDB • Elasticsearch • Redis
  191. 191. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) ISCPIF Big Data • Multivac (Open Data Platform) • ISCPIF APIs • Science en Poche • Climatique (COP21) • Risk (AXA research fund) • Scientific Dashboards • Distributed Computing • Nobel Game (scientific game) • Twitter streaming (UN, France, Climate Change, Risk, etc) • Instagram streaming (Paris)
  192. 192. Twitter Storing Data Real-time Streaming System Data Analytics Real-time Processing Web Mobile Wearable Devices Text Mining Sensor-based devices Mobile Devices Wearable Devices Instagram Foursquare Data Streams Real-time Streaming System Web Socket XMLJSON Authorization Authentication Identification Flash Socket xhr-polling jsonp-polling Backend Architecture Facebook Files End User Indexing Data RPC System NLP Annotation Extraction Streaming Data Crowd Sourcing INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Real-Time Data Stream Processing
  193. 193. HighPerformanceInfrastructure
  194. 194. HighlyAvailableInfrastructure
  195. 195. • Multivac (Open Data) • ISCPIF APIs • Scientific Dashboards • Science en Poche • Distributed Computing • Climatique (COP21) • Risk (AXA research fund) • Nobel Game (scientific game) • Twitter streaming (UN, France, Climate Change, Risk) • Instagram streaming (Paris) Python Scala Script Java Erlang iOS Node JS current projects
  196. 196. –just a regular Geek :) “You are your best benchmark!” INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
  197. 197. INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) SHOWCASE
  198. 198. FRANCE 2014 Real-time processing and visualizing Twitter in France INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
  199. 199. FRANCE 2014 Real-time processing and visualizing Twitter in France INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
  200. 200. News Tracking Real-time tracking of the news with the highest impact on networks INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
  201. 201. Aviation Accidents 50K retweets/10min INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
  202. 202. Aviation Accidents 120K retweets/10min INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
  203. 203. Robin Williams 180K retweets/10min INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
  204. 204. #Ferguson Michael Brown 75K retweets/10min INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
  205. 205. Paris 13 Novembre INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
  218. 218. 22 VMs Distributed Systems 320K Ops /S 130K Insert /S 64K Index /S … 8 WEB SERVERS 4 API SERVERS Search Engine Cluster 900 million data 45% Database 2.9 billion data 22% 120 Cores 2.5TB RAM 30 TB SSD +8000 Lines Code 14 Web Apps 4 Mobile Apps INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
  219. 219. • Starting 2016 with Couchbase in parallel • Graph Databases • Spark Streaming / machine learning • Clustering and categorizing in real-time • Creative Hardware • SlipStream, StratusLab and EGI Cloud • Healthcare and Wearable devices • No, no drones! ;-) What’s next for Big Data at ISCPIF INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF)
  220. 220. “Every piece of information is worth something to somebody. And in the hands of the wrong person, that could be deadly.” –The Blacklist: Lord Baltimore (No. 104) INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) Reddington: “People love to decry Big Brother, the NSA, the government listening in on their most private lives, yet they all willingly go online and hand over the most intimate details of those lives to big data.” Elizabeth: “Most people don't care that Google knows their search history.” Reddington: “They know more than that. They know your habits, the banks you use, the pills you pop, the men or women you sleep with.”
  221. 221. Thanks! maziyar.panahi@iscpif.fr 25 November 2015 INSTITUT DES SYSTÈMES COMPLEXES DE PARIS ÎLE-DE-FRANCE (ISCPIF) http://iscpif.fr/maziyar
