3. Joblift – KSQL live Bot Detection – 08/14/19 3
Live Bot Detection
It's not only about bot traffic.
It's more about understanding traffic quality.
Image: https://www.pexels.com/photo/flat-lay-photography-of-macbook-pro-beside-paper-1509428/
4. Joblift – KSQL live Bot Detection – 08/14/19 4
Introduction
What are bots causing at Joblift
1 Usage of resources
2 Falsified statistics
3 Revenue blocking
How could we identify bots
1 technical hints (user agent, browser, ip, ...)
2 scheduled interaction patterns
3 many interactions
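As an illustration of the "technical hints" idea, a filter on the user agent could be sketched in KSQL roughly as follows. The `requests` stream, its topic, its columns and the matched substrings are assumptions made for this sketch, not Joblift's actual schema or rules.

```sql
-- Hypothetical input stream; topic and column names are assumptions.
CREATE STREAM requests
  (userid VARCHAR,
   useragent VARCHAR,
   ip VARCHAR)
  WITH (KAFKA_TOPIC='requests',
        VALUE_FORMAT='JSON');

-- Keep only requests whose user agent contains a typical bot marker.
-- The substrings 'bot' and 'crawler' are illustrative, not a complete list.
CREATE STREAM suspected_agent_bots AS
  SELECT userid,
         useragent,
         ip
  FROM requests
  WHERE LCASE(useragent) LIKE '%bot%'
     OR LCASE(useragent) LIKE '%crawler%';
```

A filter like this only catches bots that identify themselves, which is why the behavioural hints (patterns and interaction counts) are listed alongside it.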
5. Joblift – KSQL live Bot Detection – 08/14/19 5
Introduction
Definition of many interactions in terms of Joblift
1 one search every 30 seconds
2 visits a certain page very often (> 3x/h)
3 clicks many job items in a short time (> 20x/min)
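The two count-based thresholds above (> 3x/h and > 20x/min) map naturally onto windowed aggregations. The following is a minimal KSQL sketch under stated assumptions: the streams `page_visits` and `jobitem_clicks`, their topics and their columns are hypothetical and only serve to show the shape of such a query.

```sql
-- Hypothetical input streams; topic and column names are assumptions.
CREATE STREAM page_visits
  (userid VARCHAR,
   pageid VARCHAR)
  WITH (KAFKA_TOPIC='page_visits',
        VALUE_FORMAT='JSON');

CREATE STREAM jobitem_clicks
  (userid VARCHAR,
   jobitemid VARCHAR)
  WITH (KAFKA_TOPIC='jobitem_clicks',
        VALUE_FORMAT='JSON');

-- Rule 2: more than 3 visits to the same page per user within one hour.
CREATE TABLE suspected_page_bots AS
  SELECT userid,
         pageid,
         COUNT(*) AS visits_per_hour
  FROM page_visits
  WINDOW TUMBLING (SIZE 1 HOUR)
  GROUP BY userid, pageid
  HAVING COUNT(*) > 3;

-- Rule 3: more than 20 job item clicks per user within one minute.
CREATE TABLE suspected_click_bots AS
  SELECT userid,
         COUNT(*) AS clicks_per_minute
  FROM jobitem_clicks
  WINDOW TUMBLING (SIZE 1 MINUTE)
  GROUP BY userid
  HAVING COUNT(*) > 20;
```

Tumbling windows count per fixed interval, so a burst that straddles a window boundary can slip through; a hopping window would be one way to reduce that, at the cost of more state.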
6. Joblift – KSQL live Bot Detection – 08/14/19 6
Live Bot Detection
Possible Solution
[Diagram: several options joined by OR, "and many more ...", OR Own Solution (Backend + Frontend) = Solution]
Next Question
Solution cost vs. effort and purpose => often you need more than just detecting bots
7. Joblift – KSQL live Bot Detection – 08/14/19 7
Live Bot Detection
What is "more"?
- Reporting
- Budgets
- Alerting / Monitoring
- Partner Management
- Invoices
8. Joblift – KSQL live Bot Detection – 08/14/19 8
Why you need a 'live' Bot Detection
Why you need this live: here is a counter-question
1 You identify bots in batch and prepare a list at night
2 You check all user data, search and click behaviour
3 You have known patterns to distinguish between user and bot
Which problems do you solve with that?
Which problems do you NOT solve with that?
9. Joblift – KSQL live Bot Detection – 08/14/19 9
Time for a decision
How do we solve this problem?
Image: https://www.pexels.com/photo/man-wearing-black-and-white-stripe-shirt-looking-at-white-printer-papers-on-the-wall-212286/
10. Joblift – KSQL live Bot Detection – 08/14/19 10
KSQL
What is KSQL?
- a streaming SQL engine
- process data without writing code in a programming language
- scalable, elastic, fault-tolerant
- process data just with SQL
- filtering, transformations, aggregations, joins, windowing, and sessionization
Image: https://www.confluent.io/product/ksql/
Source: https://docs.confluent.io/current/ksql/docs/index.html
11. Joblift – KSQL live Bot Detection – 08/14/19 11
KSQL Example
Table
CREATE TABLE users
  (registertime BIGINT,
   gender VARCHAR,
   regionid VARCHAR,
   userid VARCHAR,
   interests ARRAY<VARCHAR>,
   contactinfo MAP<VARCHAR, VARCHAR>)
  WITH (KAFKA_TOPIC='users',
        VALUE_FORMAT='JSON',
        KEY = 'userid');
Stream
CREATE STREAM pageviews
  (viewtime BIGINT,
   userid VARCHAR,
   pageid VARCHAR)
  WITH (KAFKA_TOPIC='pageviews',
        VALUE_FORMAT='DELIMITED');
KSQL Server
12. Joblift – KSQL live Bot Detection – 08/14/19 12
KSQL Example
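Join
-- Enrich each pageview with the user's gender and region from the users table.
-- Note: pv.timestring is not declared in the pageviews stream shown on the
-- previous slide; the source stream must provide that column.
CREATE STREAM pageviews_enriched AS
  SELECT pv.viewtime,
         pv.userid AS userid,
         pv.pageid,
         pv.timestring,
         u.gender,
         u.regionid
  FROM pageviews pv
  LEFT JOIN users u ON pv.userid = u.userid;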
KSQL Server
13. Joblift – KSQL live Bot Detection – 08/14/19 13
Test Workflow
How to test a KSQL application
1 start a Kafka instance via Docker
2 start an integration framework like FitNesse
3 start KSQL
4 start consumer for the target topics
5 produce test events
6 compare the target events with the expected ones
14. Joblift – KSQL live Bot Detection – 08/14/19 14
Make it production ready
Steps to get your KSQL into production:
1 start headless: `bin/sql-node query-file=foo/bar.sql`
2 distribute load (exclusively) with `ksql.service.id`
3 monitor via JMX-exposed metrics
15. Joblift – KSQL live Bot Detection – 08/14/19 15
Conclusion
KSQL: Kafka Streams applications with simple filtering, aggregation or transformation
Same scaling and monitoring capabilities, while the implementation is just a SQL query
Kafka knowledge is helpful
16. Joblift – KSQL live Bot Detection – 08/14/19 16
ANDRÉ EBERHARDT HASAN GÜRCAN