Streaming using Kafka Flink & Elasticsearch


A basic streaming example using Flink to read from Kafka and write to Elasticsearch. The whole example runs on a laptop.


1. Keira Zhou, May 11, 2016
2. § Install on your laptop
   § Kafka 0.9
   § Flink 1.0.2
   § Elasticsearch 2.3.2
3. § Create a topic
   § bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic viper_test
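
The same topic can also be created from Java instead of the CLI. Below is a minimal sketch calling Kafka's (Scala) AdminUtils helper, assuming ZooKeeper is on localhost:2181 as above and the Kafka 0.9 (kafka_2.10) jar is on the classpath; the CreateTopic class name is just for illustration.

     import java.util.Properties;
     import kafka.admin.AdminUtils;
     import kafka.utils.ZkUtils;

     public class CreateTopic {
         public static void main(String[] args) {
             // Topic administration goes through ZooKeeper, not the broker
             ZkUtils zkUtils = ZkUtils.apply("localhost:2181", 30000, 30000, false);
             // Same settings as the CLI call: 1 partition, replication factor 1
             AdminUtils.createTopic(zkUtils, "viper_test", 1, 1, new Properties());
             zkUtils.close();
         }
     }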
4. § Create an index
   § curl -XPUT 'http://localhost:9200/viper-test/' -d '{
       "settings": {
         "index": {
           "number_of_shards": 1,
           "number_of_replicas": 0
         }
       }
     }'
   § Put the mapping of a doc type within the index (the field name "IP" matches what the Flink sink writes on slide 7)
   § curl -XPUT 'localhost:9200/viper-test/_mapping/viper-log' -d '{
       "properties": {
         "IP": { "type": "string", "index": "not_analyzed" },
         "info": { "type": "string" }
       }
     }'
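
The index and mapping can likewise be created from Java rather than curl. A sketch using the Elasticsearch 2.x TransportClient (the CreateIndex class name is illustrative; the cluster name es_keira matches the sink configuration on slide 7):

     import java.net.InetAddress;
     import org.elasticsearch.client.Client;
     import org.elasticsearch.client.transport.TransportClient;
     import org.elasticsearch.common.settings.Settings;
     import org.elasticsearch.common.transport.InetSocketTransportAddress;

     public class CreateIndex {
         public static void main(String[] args) throws Exception {
             Settings settings = Settings.settingsBuilder()
                     .put("cluster.name", "es_keira")
                     .build();
             Client client = TransportClient.builder().settings(settings).build()
                     .addTransportAddress(new InetSocketTransportAddress(
                             InetAddress.getByName("127.0.0.1"), 9300));
             // Mirrors the two curl calls above: index settings plus the viper-log mapping
             client.admin().indices().prepareCreate("viper-test")
                     .setSettings("{ \"index\": { \"number_of_shards\": 1, \"number_of_replicas\": 0 } }")
                     .addMapping("viper-log", "{ \"properties\": { "
                             + "\"IP\": { \"type\": \"string\", \"index\": \"not_analyzed\" }, "
                             + "\"info\": { \"type\": \"string\" } } }")
                     .get();
             client.close();
         }
     }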
5. § More info:
   § https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/connectors/kafka.html
   § Maven dependency
     <dependency>
       <groupId>org.apache.flink</groupId>
       <artifactId>flink-connector-kafka-0.9_2.10</artifactId>
       <version>1.0.2</version>
     </dependency>
   § Example Java code
     public static DataStream<String> readFromKafka(StreamExecutionEnvironment env) {
         env.enableCheckpointing(5000);
         // Kafka consumer configuration
         Properties properties = new Properties();
         properties.setProperty("bootstrap.servers", "localhost:9092");
         properties.setProperty("group.id", "test");
         // Consume the topic created on slide 3
         DataStream<String> stream = env.addSource(
                 new FlinkKafkaConsumer09<>("viper_test", new SimpleStringSchema(), properties));
         return stream;
     }
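
To smoke-test the consumer on its own before adding the Elasticsearch sink, readFromKafka can be wired into a job that simply prints each record. A minimal sketch of such a driver (the job name is arbitrary):

     public static void main(String[] args) throws Exception {
         StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
         DataStream<String> stream = readFromKafka(env);
         stream.print(); // echo every Kafka record to stdout
         env.execute("Kafka smoke test");
     }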
6. § More info:
   § https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/connectors/elasticsearch2.html
   § Maven dependency
     <dependency>
       <groupId>org.apache.flink</groupId>
       <artifactId>flink-connector-elasticsearch2_2.10</artifactId>
       <version>1.1-SNAPSHOT</version>
     </dependency>
   § Example Java code
   § Next page…
7. § Example Java code
     public static void writeElastic(DataStream<String> input) {
         Map<String, String> config = new HashMap<>();
         // This instructs the sink to emit after every element; otherwise they would be buffered
         config.put("bulk.flush.max.actions", "1");
         config.put("cluster.name", "es_keira");
         try {
             // Add Elasticsearch hosts on startup
             List<InetSocketAddress> transports = new ArrayList<>();
             // port is 9300, not 9200, for the ES TransportClient
             transports.add(new InetSocketAddress("127.0.0.1", 9300));
             ElasticsearchSinkFunction<String> indexLog = new ElasticsearchSinkFunction<String>() {
                 public IndexRequest createIndexRequest(String element) {
                     // Input records are tab-separated: IP<TAB>info
                     String[] logContent = element.trim().split("\t");
                     Map<String, String> esJson = new HashMap<>();
                     esJson.put("IP", logContent[0]);
                     esJson.put("info", logContent[1]);
                     return Requests
                             .indexRequest()
                             .index("viper-test")
                             .type("viper-log")
                             .source(esJson);
                 }
                 @Override
                 public void process(String element, RuntimeContext ctx, RequestIndexer indexer) {
                     indexer.add(createIndexRequest(element));
                 }
             };
             ElasticsearchSink<String> esSink = new ElasticsearchSink<>(config, transports, indexLog);
             input.addSink(esSink);
         } catch (Exception e) {
             System.out.println(e);
         }
     }
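
A driver that wires the two halves together might look like the sketch below: the stream returned by readFromKafka (slide 5) feeds writeElastic, and env.execute() launches the job (the job name is arbitrary).

     public static void main(String[] args) throws Exception {
         StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
         DataStream<String> stream = readFromKafka(env); // Kafka source, slide 5
         writeElastic(stream);                           // Elasticsearch sink, above
         env.execute("Kafka to Elasticsearch");
     }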
8. § https://github.com/keiraqz/KafkaFlinkElastic/blob/master/src/main/java/viper/KafkaFlinkElastic.java
9. § Start your Flink program in your IDE
   § Start the Kafka producer CLI:
   § bin/kafka-console-producer.sh --broker-list localhost:9092 --topic viper_test
   § In your terminal, type (it's tab-separated):
     10.20.30.40    test
   § Afterwards, in Elasticsearch:
   § curl 'localhost:9200/viper-test/viper-log/_search?pretty'
   § You should see:
     {
       "took" : 1,
       "timed_out" : false,
       "_shards" : { "total" : 1, "successful" : 1, "failed" : 0 },
       "hits" : {
         "total" : 1,
         "max_score" : 1.0,
         "hits" : [ {
           "_index" : "viper-test",
           "_type" : "viper-log",
           "_id" : "SOMETHING",
           "_score" : 1.0,
           "_source" : { "IP" : "10.20.30.40", "info" : "test" }
         } ]
       }
     }
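
The console-producer step can also be scripted with the Kafka 0.9 producer API. A sketch that sends the same tab-separated test line (the SendTestLine class name is illustrative):

     import java.util.Properties;
     import org.apache.kafka.clients.producer.KafkaProducer;
     import org.apache.kafka.clients.producer.Producer;
     import org.apache.kafka.clients.producer.ProducerRecord;

     public class SendTestLine {
         public static void main(String[] args) {
             Properties props = new Properties();
             props.put("bootstrap.servers", "localhost:9092");
             props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
             props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
             Producer<String, String> producer = new KafkaProducer<>(props);
             // Tab-separated, matching the split("\t") in writeElastic
             producer.send(new ProducerRecord<>("viper_test", "10.20.30.40\ttest"));
             producer.close();
         }
     }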
10. § https://github.com/keiraqz/KafkaFlinkElastic
