Joblift – KSQL live Bot Detection – 08/14/19
Live Bot Detection: Identify them with KSQL
08/14/19, Berlin
Agenda
1 Introduction
2 Live Bot Detection
3 KSQL
4 What’s next
Live Bot Detection
It's not only about bot traffic;
it's more about understanding the quality of the traffic.
Image: https://www.pexels.com/photo/flat-lay-photography-of-macbook-pro-beside-paper-1509428/
Introduction
What do bots cause at Joblift?
1 Usage of resources
2 Falsified statistics
3 Revenue blocking
How could we identify bots?
1 technical hints (user agent, browser, IP, ...)
2 scheduled interaction patterns
3 many interactions
Introduction
Definition of "many interactions" in terms of Joblift
1 one search every 30 seconds
2 visits a certain page very often (> 3x/h)
3 clicks many job items in a short time (> 20x/min)
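The third criterion above (more than 20 job item clicks per minute) maps naturally onto a windowed aggregation. The following is a minimal KSQL sketch, not the implementation used at Joblift: the stream `jobitem_clicks`, its topic, and its columns are assumptions made for illustration, and exact syntax may differ between KSQL versions.

```sql
-- Hypothetical stream of job item clicks; topic and column names are assumptions.
CREATE STREAM jobitem_clicks
  (userid VARCHAR,
   jobitemid VARCHAR)
  WITH (KAFKA_TOPIC='jobitem_clicks',
        VALUE_FORMAT='JSON');

-- Count clicks per user in one-minute tumbling windows and keep only
-- the windows that exceed the "> 20x/min" threshold.
CREATE TABLE many_clicks_per_minute AS
  SELECT userid,
         COUNT(*) AS clicks
  FROM jobitem_clicks
  WINDOW TUMBLING (SIZE 1 MINUTE)
  GROUP BY userid
  HAVING COUNT(*) > 20;
```

Tumbling windows are aligned to fixed boundaries, so a burst of clicks that straddles two windows can stay below the threshold in each of them. A hopping window with a shorter advance interval is one way to reduce that blind spot, at the cost of more state.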
Live Bot Detection
Possible Solution
Backend + Frontend (and many more ...) = Solution
OR
Own Solution
Next Question
Solution cost vs. effort and purpose => often you need more than just detecting bots
Live Bot Detection
What is “more”?
- Reporting
- Budgets
- Alerting / Monitoring
- Partner Management
- Invoices
Why you need a ‘live’ Bot Detection
Why do you need this live? Here is a counter-question:
1 You identify bots in batch and prepare a list at night
2 You do all checking of user data, search and click behaviour
3 You have known patterns to distinguish between user and bot
Which problems do you solve with that?
Which problems do you NOT solve with that?
How do we solve this problem?
Time for a decision
Image: https://www.pexels.com/photo/man-wearing-black-and-white-stripe-shirt-looking-at-white-printer-papers-on-the-wall-212286/
KSQL
What is KSQL?
- a streaming SQL engine
- process data without code in a programming language
- scalable, elastic, fault-tolerant
- process data just with SQL
- filtering, transformations, aggregations, joins, windowing, and sessionization
Image: https://www.confluent.io/product/ksql/
Source: https://docs.confluent.io/current/ksql/docs/index.html
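As a rough illustration of two of the capabilities listed above, the following KSQL sketch shows a filter and a session window. The stream `requests`, its topic, its columns, and the 5-minute inactivity gap are assumptions made for this example and do not come from the talk; syntax details may vary between KSQL versions.

```sql
-- Hypothetical request stream; topic and column names are assumptions.
CREATE STREAM requests
  (userid VARCHAR,
   useragent VARCHAR,
   pageid VARCHAR)
  WITH (KAFKA_TOPIC='requests',
        VALUE_FORMAT='JSON');

-- Filtering: keep only requests whose user agent mentions "bot".
CREATE STREAM requests_with_bot_agent AS
  SELECT userid, useragent, pageid
  FROM requests
  WHERE LCASE(useragent) LIKE '%bot%';

-- Sessionization: count requests per user and session, where a session
-- closes after 5 minutes without activity from that user.
CREATE TABLE requests_per_session AS
  SELECT userid,
         COUNT(*) AS request_count
  FROM requests
  WINDOW SESSION (5 MINUTES)
  GROUP BY userid;
```

A user-agent filter only catches bots that identify themselves, which is why it would typically be combined with behavioural signals such as the per-session counts.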
KSQL Example
Table
-- Table over the 'users' topic, keyed by userid
CREATE TABLE users
  (registertime BIGINT,
   gender VARCHAR,
   regionid VARCHAR,
   userid VARCHAR,
   interests ARRAY<VARCHAR>,
   contactinfo MAP<VARCHAR, VARCHAR>)
  WITH (KAFKA_TOPIC='users',
        VALUE_FORMAT='JSON',
        KEY='userid');
Stream
-- Stream over the 'pageviews' topic
CREATE STREAM pageviews
  (viewtime BIGINT,
   userid VARCHAR,
   pageid VARCHAR)
  WITH (KAFKA_TOPIC='pageviews',
        VALUE_FORMAT='DELIMITED');
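KSQL Example
Join
-- Enrich each pageview with the user's gender and region.
-- Note: pv.timestring requires a timestring column on pageviews,
-- which the stream declaration on the previous slide does not include.
CREATE STREAM pageviews_enriched AS
  SELECT pv.viewtime,
         pv.userid AS userid,
         pv.pageid,
         pv.timestring,
         u.gender,
         u.regionid
  FROM pageviews pv
  LEFT JOIN users u ON pv.userid = u.userid;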
How to test a KSQL-Application
Test Workflow
1 start a Kafka instance via Docker
2 start an integration framework like FitNesse
3 start KSQL
4 start consumer for the target topics
5 produce test events
6 compare the target events with the expected events
Make it production ready
Steps to get your KSQL into production:
1 start headless: `bin/sql-node query-file=foo/bar.sql`
2 distribute load (exclusively) with `ksql.service.id`
3 monitor with JMX-exposed metrics
Conclusion
KSQL: Kafka Streams applications with simple filtering, aggregation or transformation
Same scaling and monitoring capabilities, while the implementation is just a SQL query
Kafka knowledge is helpful
ANDRÉ EBERHARDT HASAN GÜRCAN

Live Bot Detection - Identify them with KSQL - Data Wholphins Meetup #2
