Workshop Sentiment, Emotion, Demography, and Bot Detection
1. WORKSHOP: SENTIMENT, EMOTION, DEMOGRAPHY, AND BOT DETECTION
Ismail Fahmi, Ph.D.
Director PT. Media Kernels Indonesia
(a.k.a Drone Emprit)
Ismail.fahmi@gmail.com
SEMINAR & WORKSHOP DEA
UII YOGYAKARTA
19 November 2019
UNIVERSITAS ISLAM
INDONESIA
3. ABOUT DRONE EMPRIT ACADEMIC
Drone Emprit Academic is a big data system that captures and
analyzes conversations on social media, particularly Twitter. It was
developed by PT Media Kernels Indonesia and is installed in the data
center of the Badan Sistem Informasi (BSI) at Universitas Islam
Indonesia. Drone Emprit uses Twitter's API (Application Programming
Interface) to capture conversations in near real time via
streaming.
Ismail Fahmi, 2017. Drone Emprit: Konsep dan Teknologi, IT
Camp on Big data and Data Mining, Jakarta
9. KEYWORDS: A VERY LIMITED RESOURCE
Diagram: TWITTER feeds several servers (IP Addr 1, IP Addr 2, ..., IP Addr n), each limited to a maximum of 400 keywords. All servers feed the DRONE EMPRIT ACADEMIC DATA LAKE.
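The 400-keyword cap per server means a longer keyword list has to be split across several servers, each with its own IP address. The sketch below illustrates one simple way to do that split. It is only an illustration under stated assumptions: the function name `assign_keywords`, the server labels, and the chunk-in-order strategy are hypothetical and are not taken from Drone Emprit itself.

```python
from typing import Dict, List

# Per-server limit taken from the slide ("Max 400 keywords").
MAX_KEYWORDS_PER_SERVER = 400


def assign_keywords(keywords: List[str], servers: List[str],
                    limit: int = MAX_KEYWORDS_PER_SERVER) -> Dict[str, List[str]]:
    """Split keywords into chunks of at most `limit`, one chunk per server.

    Hypothetical helper: assumes each server tracks its chunk independently
    and that the order of keywords does not matter.
    """
    # Drop duplicate keywords while preserving their original order.
    unique = list(dict.fromkeys(keywords))
    capacity = limit * len(servers)
    if len(unique) > capacity:
        raise ValueError(
            f"{len(unique)} keywords exceed total capacity {capacity}"
        )
    # Server i receives the i-th slice of `limit` keywords (possibly empty).
    return {
        server: unique[i * limit:(i + 1) * limit]
        for i, server in enumerate(servers)
    }


if __name__ == "__main__":
    keywords = [f"keyword{i}" for i in range(1000)]
    plan = assign_keywords(keywords, ["ip-addr-1", "ip-addr-2", "ip-addr-3"])
    for server, chunk in plan.items():
        print(server, len(chunk))
```

With this scheme the total number of trackable keywords grows linearly with the number of servers, which is why keywords remain a scarce resource.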
10. Drone Emprit Big Data Architecture
Architecture diagram. Labels recoverable from the figure:
Sources: News Crawler, Twitter Crawler, Twitter Streaming, FB Page Crawler, Google Custom Search, Other sources
Data Ingest: Data Pipeline, Management & Queue
Data: SOLR Indexer 1, SOLR Indexer 2, SOLR Indexer 3, SOLR Indexer 4, Database Framework, Hadoop Framework, Physical Hardware
Processing: Realtime Job Processing, Scheduled Job Processing, Map Reduce, Sentiment Analysis, Other Processings
Data & Workflow Management
Insight: Access, Visualization, Analytics UI
38. DEMOGRAPHY ANALYSIS: DEA
• This feature is about 80% developed and will soon be added to the
Drone Emprit Academic dashboard.
42. HOW IT WORKS
• Botometer is a machine learning algorithm trained to classify an
account as bot or human based on tens of thousands of labeled
examples.
• When you check an account, its public profile and hundreds of its
public tweets and mentions are fetched using the Twitter API.
• This data is passed to the Botometer API, which extracts about
1,200 features to characterize the account's profile, friends,
social network structure, temporal activity patterns, language,
and sentiment.
• Finally, the features are used by various machine learning
models to compute the bot scores.
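The flow described above (fetch account data, extract features, score with several models) can be sketched in a few lines. This is a toy illustration only: the `Account` fields, the three features, and the two hand-written scoring rules are hypothetical stand-ins, not Botometer's actual 1,200 features or its trained models, and simple averaging is just one assumed way to combine model outputs.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Callable, Dict, List

Features = Dict[str, float]


@dataclass
class Account:
    """Hypothetical, already-fetched account data."""
    followers: int
    friends: int
    tweet_timestamps: List[float]  # seconds, ascending order


def extract_features(acc: Account) -> Features:
    """Compute a few illustrative profile and temporal features."""
    ts = acc.tweet_timestamps
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    return {
        "friend_follower_ratio": acc.friends / max(acc.followers, 1),
        "mean_gap": mean(gaps) if gaps else 0.0,
        "gap_stdev": pstdev(gaps) if gaps else 0.0,
    }


def regular_posting_model(f: Features) -> float:
    """Toy rule: very evenly spaced tweets look more automated."""
    if f["mean_gap"] == 0:
        return 0.5  # not enough temporal evidence either way
    variation = f["gap_stdev"] / f["mean_gap"]
    return max(0.0, 1.0 - variation)


def follow_ratio_model(f: Features) -> float:
    """Toy rule: following far more accounts than follow back looks bot-like."""
    ratio = f["friend_follower_ratio"]
    return ratio / (1.0 + ratio)


def bot_score(acc: Account, models: List[Callable[[Features], float]]) -> float:
    """Average the per-model scores into one value between 0 and 1."""
    features = extract_features(acc)
    return mean(model(features) for model in models)


if __name__ == "__main__":
    models = [regular_posting_model, follow_ratio_model]
    suspect = Account(followers=10, friends=1000,
                      tweet_timestamps=[0, 60, 120, 180])
    print(round(bot_score(suspect, models), 3))
```

In a real system the hand-written rules would be replaced by classifiers trained on labeled bot and human accounts, as the first bullet describes.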