Deploying and Operating KSQL

ONLINE TALK

Neil is a senior engineer and technologist at Confluent, the
company founded by the creators of Apache Kafka®. He has over
20 years of experience working on distributed computing,
messaging and stream processing. He has built or redesigned
commercial messaging platforms, distributed caching products as
well as developed large scale bespoke systems for tier-1 banks.
After a period at ThoughtWorks, he went on to build some of the
first distributed risk engines in financial services. In 2008 he
launched a startup that specialised in distributed data analytics and
visualization. Then prior to joining Confluent he was the CTO at a
fintech consultancy.
Neil Avery
Senior Engineer and Technologist, Confluent

Housekeeping Items
● This session will last about an hour.
● This session will be recorded.
● You can submit your questions by entering them into the GoToWebinar panel.
● The last 10-15 minutes will consist of Q&A.
● The slides and recording will be available after the talk.

Agenda
• Deployment
• Configuration
• Scaling
• Monitoring

First things first
• Getting KSQL binaries is easy
Download: https://www.confluent.io/download/
Confluent Open Source 4.1+ (free)
Confluent Enterprise 4.1+ (30-day trial)
• Links to downloads, docs, news, examples, etc.
https://confluent.io/ksql
• KSQL code is open source (Apache license)
https://github.com/confluentinc/ksql

Deploying KSQL – Getting Started
• For development, e.g. on your laptop, use the Confluent CLI:
$ confluent start
• Starts up a full set of services:
ZooKeeper & Kafka Broker
Schema Registry
KSQL Server
REST Proxy
Kafka Connect
Control Center

Deploying KSQL – Starting KSQL Server
• KSQL Server acts as a Kafka client
Run it on nodes separate from the Kafka Brokers
• Provide a configuration file of settings
• From your installation directory:
$ bin/ksql-server-start config/ksql-server.properties

KSQL Server Configuration
• The configuration file has only a few mandatory options:
bootstrap.servers – where to find the Kafka Broker(s)
listeners – ports on which to listen for connections from the KSQL CLI
• Optional:
ksql.service.id – a name to group together a pool of KSQL Servers
• Optionally, add any property the embedded Kafka consumers and producers or Kafka Streams API would understand
e.g. security configurations
• Example:
bootstrap.servers=broker1:9092
listeners=http://localhost:8088
ksql.streams.commit.interval.ms=1000
producer.interceptor.classes=io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor
consumer.interceptor.classes=io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor
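
The two mandatory settings can be checked before a server is started. The following is a minimal shell sketch, not something shipped with KSQL: `check_ksql_config` is a hypothetical helper name, and it assumes keys are written without surrounding whitespace, as in the example above.

```shell
# Hypothetical pre-flight check (not part of KSQL): verify that the
# mandatory settings named on this slide are present in a properties file.
check_ksql_config() {
  config_file="$1"
  rc=0
  for key in bootstrap.servers listeners; do
    # Exact match on the text before the first "=", so commented-out
    # lines such as "#listeners=..." do not count as present.
    if ! awk -F= -v k="$key" '$1 == k { found = 1 } END { exit !found }' "$config_file"; then
      echo "missing mandatory setting: $key" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Usage, from the installation directory:
#   check_ksql_config config/ksql-server.properties \
#     && bin/ksql-server-start config/ksql-server.properties
```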

Connecting to a Secured Kafka cluster
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=\
  org.apache.kafka.common.security.plain.PlainLoginModule required \
  username="<name of the user KSQL should use>" \
  password="<the password>";
The exact settings you need will vary depending on which SASL mechanism your
Kafka cluster uses and how your SSL certificates are signed. For full details,
refer to the Security section of the Kafka documentation:
http://kafka.apache.org/documentation.html#security

Connecting to a Schema Registry (Optional)
• Add Schema Registry address in the same configuration file
• If your Schema Registry is secured, you will also need to set a KSQL_OPTS environment variable when starting KSQL
Server to specify the connection credentials
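
As a sketch of the first bullet: `ksql.schema.registry.url` is the KSQL Server property that holds the Schema Registry address. The helper name `add_schema_registry` and the address `http://localhost:8081` used in the usage note are illustrative assumptions.

```shell
# Hypothetical helper: append the Schema Registry address to a KSQL Server
# properties file, unless the property is already set there.
add_schema_registry() {
  config_file="$1"
  registry_url="$2"
  if grep -q '^ksql\.schema\.registry\.url=' "$config_file" 2>/dev/null; then
    echo "ksql.schema.registry.url already set in $config_file" >&2
    return 1
  fi
  printf 'ksql.schema.registry.url=%s\n' "$registry_url" >> "$config_file"
}

# Usage:
#   add_schema_registry config/ksql-server.properties http://localhost:8081
```

For a secured Schema Registry, KSQL_OPTS carries JVM options. For an HTTPS registry these would typically be the standard `javax.net.ssl.trustStore` and `javax.net.ssl.trustStorePassword` system properties, though the exact options depend on how the registry is secured.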

Starting KSQL CLI
• KSQL CLI interfaces with a KSQL Server over HTTP
• Start by specifying the address of the target KSQL Server

Starting KSQL (preview) web interface
1. Download https://s3.amazonaws.com/ksql-experimental-ui/ksql-experimental-ui-0.1.war
2. Copy the war into the ksql/ui folder
3. Run ksql-server-start, then open http://localhost:8080/index.html

Log Files
• See config/log4j.properties or config/log4j-rolling.properties

Patterns & Best Practices
• KSQL Server pools
– per team / project / use-case
• “headless” vs. “interactive”
$ bin/ksql-server-start config/my.properties --queries-file /path/to/foo.sql
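
A headless deployment could be sketched as follows. The stream names, columns, topic, threshold and helper names are hypothetical and chosen only for illustration; the `--queries-file` option is the one from this slide.

```shell
# Write a hypothetical queries file. A headless server takes its
# statements from this file instead of from the CLI.
write_queries_file() {
  cat > "$1" <<'EOF'
CREATE STREAM txns (account VARCHAR, amount DOUBLE) WITH (KAFKA_TOPIC='txns', VALUE_FORMAT='JSON');
CREATE STREAM suspicious_txns AS SELECT account, amount FROM txns WHERE amount > 10000;
EOF
}

# Start a headless server; run from the installation directory.
start_headless() {
  bin/ksql-server-start "$1" --queries-file "$2"
}

# Usage:
#   write_queries_file /path/to/foo.sql
#   start_headless config/my.properties /path/to/foo.sql
```

Keeping the queries file in version control gives each team or project pool a reviewable definition of what its servers run.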

Scaling
• KSQL Servers are Kafka clients
• Queries act as Consumer Groups
• Partitions are the limit of scale-out
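
The partition limit can be illustrated with a little arithmetic. This sketch assumes a query that reads a single source topic and that every server runs the same number of stream threads (Kafka Streams' `num.stream.threads` setting). The function name is hypothetical.

```shell
# usable_parallelism PARTITIONS SERVERS THREADS_PER_SERVER
# Prints how many threads can actually be busy: within a consumer group each
# partition is read by one member, so threads beyond the partition count idle.
usable_parallelism() {
  partitions="$1"
  servers="$2"
  threads="$3"
  total_threads=$((servers * threads))
  if [ "$total_threads" -lt "$partitions" ]; then
    echo "$total_threads"
  else
    echo "$partitions"
  fi
}

# Usage:
#   usable_parallelism 12 2 4   # prints 8: adding servers still helps
#   usable_parallelism 12 4 4   # prints 12: 4 of the 16 threads sit idle
```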

Scaling your data model
• Partitions – one topic can feed multiple queries; the number of partitions determines horizontal scale
• Query performance (200k/second), with resilience provided by the Kafka log
• Partitions are the limit of scale-out – over 30k per typical server
• Query throughput is determined by serialization
• Latency considerations

Scaling KSQL Server
• Using K8s & Docker (create pods of Server instances – deploy using an application.sql file)
• Monitor using Control Center, Datadog and others
• Latency considerations

Example Application
[Diagram labels: Ecommerce Site, Event Stream, Hadoop, Support]

Example Application
https://github.com/bluemonk3y/ksql-recipe-fraudulent-txns

Show & Explain Queries
ksql> show queries;
ksql> explain CSAS_SUSPICIOUS_TXNS;
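
Because the CLI talks to the server over HTTP, the same statements can also be sent from a script. This is a hedged sketch against the KSQL REST API's `/ksql` endpoint. The helper names, the default address `http://localhost:8088` and the `application/json` content type are assumptions to check against the documentation for your release.

```shell
# Build the JSON request body for the /ksql endpoint.
# Assumes the statement contains no double quotes or backslashes.
ksql_request_body() {
  printf '{"ksql": "%s", "streamsProperties": {}}' "$1"
}

# Send one statement to a KSQL Server; KSQL_URL overrides the assumed default.
ksql_run() {
  curl -s -X POST "${KSQL_URL:-http://localhost:8088}/ksql" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d "$(ksql_request_body "$1")"
}

# Usage, against a running interactive server:
#   ksql_run "SHOW QUERIES;"
#   ksql_run "EXPLAIN CSAS_SUSPICIOUS_TXNS;"
```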

KSQL-Server JMX metrics
$ export JMX_PORT=1099 && bin/ksql-server-start config/nicks-ksqlserver.properties
• Attach JConsole or tool of choice
• OR
• jstatd – run it on every host!
• Remotely connect and use VisualVM
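
To attach a tool to the port exported above, a JMX service URL can be built from the host and port. The helper name and the host `ksql-host-1` are placeholders. The URL layout is the standard RMI connector form, which JConsole accepts as a connection argument.

```shell
# jmx_service_url HOST PORT
# Prints the standard JMX-over-RMI service URL for a remote JVM.
jmx_service_url() {
  printf 'service:jmx:rmi:///jndi/rmi://%s:%s/jmxrmi\n' "$1" "$2"
}

# Usage (requires a JDK on the machine running the tool):
#   jconsole "$(jmx_service_url ksql-host-1 1099)"
```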

Transient vs Persistent Queries

Drivers of Load and Throughput
• Messages
- message size (same as for any Kafka Client)
- message format (JSON is more expensive in CPU)
• Message (de)serialization is the most CPU-intensive aspect of any query
- in throughput testing, all queries are CPU-bound
- start with 4 cores minimum
• Use SSD if any joins or aggregations
• Relative resource demand:
  Query Type        CPU    RAM      DISK
  Project, filter   n/a    Medium   None
  Join              n/a    High     Medium
  Aggregate         n/a    High     High
