BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015
1. BE ready. BE safe. BE secure.
BSides Lisbon 2015
Security Data Metrics and Measurements at Scale
2. binaryedge.io
Who
TIAGO HENRIQUES
• BSc Software Engineering / University of Brighton
• MSc Computer Security and Forensics / University of Bedfordshire
• 8 years of experience in information security consultancy, leadership and research
CEO and Founder @ BinaryEdge
TIAGO MARTINS
• BSc and MSc Computer Science / University of Lisbon
• 7 years of experience developing real-time systems and high-volume data processing
CTO and Co-Founder @ BinaryEdge
ROBERTO BARBOSA
• More than 20 years in the IT sector
• Ex-engineer at Sun Microsystems
• Former Philip Morris corporate auditor
• Expert in high scalability and availability in the finance sector (UBS, Citigroup and Leonteq) and at a mobile startup
COO and Head of Data Science @ BinaryEdge
7. DATA BUSINESS MODEL
• collect / supply: Gather and sell raw data
• store / host: Hold onto someone else’s data for them
• filter / refine: Strip out problematic records or data fields, or release interesting data subsets
• enhance / enrich: Blend in other datasets to create a new and interesting picture
• simplify / access: Help people cherry-pick the data they want in the format they prefer
• consult / advise: Provide guidance on others’ data efforts
9. BECOMING AN EXPONENTIAL ORGANIZATION
MASSIVE TRANSFORMATIVE PURPOSE (MTP)
MARKET
SCALE
• Staff on Demand
• Community and Crowd
• Algorithms
• Lease Assets
• Engagement
IDEAS
• Interfaces
• Dashboards
• Experimentation
• Autonomy
• Social
LEFT BRAIN: order, control, stability
RIGHT BRAIN: creativity, growth, uncertainty
14. Very young startup
• No legacy to maintain
• Lots of experience in the team
• Lots of technologies to pick from
• Micro service based approach
• Metrics collection at large scale
But where to start?
• Technologies?
• Architecture?
• Prototype?
18. architecture - job request
API oriented
• HTTP API
• Command line clients
• Modules: Python, NodeJS, Go
• Third-party APIs
job types
• Data Collection
• Data Processing / Analytics
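A job request coming from a Python module might look roughly like the sketch below. The endpoint URL, the field names and the job-type strings are all hypothetical; the slides do not show the actual API. Nothing is sent over the network here, the request object is only built.

```python
import json
import urllib.request

# Hypothetical job types, mirroring the two families named on the slide.
JOB_TYPES = {"data_collection", "data_processing"}

def build_job_request(job_type, target, options=None):
    """Build the JSON body for a job request (field names are assumptions)."""
    if job_type not in JOB_TYPES:
        raise ValueError(f"unknown job type: {job_type}")
    return {"type": job_type, "target": target, "options": options or {}}

def to_http_request(job, base_url="https://api.example.invalid/v1/jobs"):
    """Wrap the job in a POST request object; nothing is sent here."""
    body = json.dumps(job).encode("utf-8")
    return urllib.request.Request(
        base_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

job = build_job_request("data_collection", "192.0.2.0/24", {"ports": [80, 443]})
req = to_http_request(job)
```

Validating the job type on the client side keeps malformed requests away from the agents, whichever of the three module languages is used.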
19. architecture - job execution
Agents
• Agents listen for work in channels
• Multiple types of agents
technologies
• Go
• Python
• NodeJS
• Scala
• Java
job control
• RabbitMQ
• NSQ
• Redis
• Apollo
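The "agents listen for work in channels" idea can be sketched with in-process queues standing in for a broker such as RabbitMQ, NSQ or Redis. The channel names, the handlers and the stop sentinel are made up for illustration; a real deployment would use the broker's own client library.

```python
import queue
import threading

# One in-memory queue per channel; a real deployment would use a broker.
channels = {"portscan": queue.Queue(), "geolocate": queue.Queue()}
results = queue.Queue()
STOP = object()  # sentinel telling an agent to shut down

def agent(channel_name, handler):
    """Block on one channel, run the handler on each job, publish the result."""
    channel = channels[channel_name]
    while True:
        job = channel.get()
        if job is STOP:
            break
        results.put((channel_name, handler(job)))

def run_demo():
    """Start one agent per channel, submit one job to each, collect results."""
    workers = [
        threading.Thread(target=agent, args=("portscan", lambda j: f"scanned {j}")),
        threading.Thread(target=agent, args=("geolocate", lambda j: f"located {j}")),
    ]
    for w in workers:
        w.start()
    channels["portscan"].put("192.0.2.1")
    channels["geolocate"].put("192.0.2.1")
    for ch in channels.values():
        ch.put(STOP)
    for w in workers:
        w.join()
    out = []
    while not results.empty():
        out.append(results.get())
    return sorted(out)
```

Each agent only knows its own channel, which is what lets different agent types be written in different languages and scaled independently.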
22. architecture - job execution
Messaging Cloud
• Amazon
• Microsoft
• Google
• realtime.co
• …
23. architecture - data enrichment
Agents can feed other agents
Different types of enrichment
• Clean data
• Process data
• Alarms
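Chaining agents for enrichment can be pictured as a small pipeline in which each stage consumes the previous stage's output. The record fields and the watched-port list below are invented for the example and are not taken from the talk.

```python
def clean(record):
    """Clean data: trim whitespace and drop empty fields."""
    return {k: v.strip() for k, v in record.items() if v and v.strip()}

def process(record):
    """Process data: convert the port field to an integer when present."""
    out = dict(record)
    if "port" in out:
        out["port"] = int(out["port"])
    return out

def alarm(record, watched_ports=(23, 3389)):
    """Alarms: flag records that touch a watched port (list is made up)."""
    out = dict(record)
    out["alarm"] = out.get("port") in watched_ports
    return out

def run_pipeline(record, stages=(clean, process, alarm)):
    """Feed the output of each agent into the next one."""
    for stage in stages:
        record = stage(record)
    return record

enriched = run_pipeline({"ip": " 192.0.2.7 ", "port": "23", "banner": ""})
```

Because every stage takes a record and returns a new one, stages can be reordered or swapped without touching the others.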
24. architecture - store
All information is stored
• RAW data
• Processed data
Geolocation of information
Encrypted data for each client
Data Storage
Cloud Services
• Amazon S3
• Amazon DynamoDB
• Azure DocumentDB
• Azure Storage
• Google Cloud Storage
• Google BigQuery
• Rackspace Cloud Files
• Constant Cloud Storage
• Skylable
• RunAbove
Database Solutions
• MongoDB
• Elasticsearch
• Cassandra
• Riak
• Lucene
25. architecture - serving
Delivering data
• Realtime - Streaming
• Storage for Analytics
• API
• Raw
Data Analytics
• Kibana
• InfluxDB
• Druid
26. architecture - serving
Data Processing
• Apache Spark
• Hadoop
• Amazon Kinesis
Data Intelligence
• Amazon Machine Learning/EMR
• Google Prediction API
• Azure Machine Learning
27. agents / minions
Our agents are very simple
• Simple tasks
• Easy to maintain and adapt
Agents can be located/run anywhere
• Geo distribution
• Clouds
• Dedicated Servers
• Raspberry Pis on Tiago Henriques’ dual gigabit connection
31. CHALLENGES IN DATA MINING
MODELLING LARGE SCALE NETWORKS
DISCOVERY OF THREATS
Network dynamics and Cyberattacks
Privacy Preservation in data mining
35. Measurements on our own data
Support - Indicates what percentage of the stored data shows the correlation
Confidence - Indicates the probability that our assumption is correct
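Reading these two measures as the standard association-rule definitions (an assumption, since the slide gives no formula), they could be computed as in the sketch below. The observation tags are invented sample data.

```python
def support(records, items):
    """Fraction of records that contain every item in `items`."""
    items = set(items)
    if not records:
        return 0.0
    return sum(1 for r in records if items <= set(r)) / len(records)

def confidence(records, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the stored records."""
    base = support(records, antecedent)
    if base == 0:
        return 0.0
    return support(records, set(antecedent) | set(consequent)) / base

# Invented observations: each record is the set of tags seen for one host.
observations = [
    {"port_23_open", "default_banner"},
    {"port_23_open", "default_banner"},
    {"port_23_open"},
    {"port_80_open"},
]

rule_support = support(observations, {"port_23_open", "default_banner"})
rule_confidence = confidence(observations, {"port_23_open"}, {"default_banner"})
```

Here two of four records show both tags, so support is 0.5, and two of the three records with port 23 open also show the banner, so confidence is 2/3.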
36. Improving our own data
• Kalman Filter
• AdaBoost (Adaptive Boost)
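The slide does not say how the Kalman filter is applied. As a minimal sketch, assuming a single noisy measurement of a value that stays roughly constant, a scalar version looks like this; the variances and starting values are arbitrary defaults chosen for the example.

```python
def kalman_1d(measurements, process_var=1e-3, measurement_var=0.5,
              initial_estimate=0.0, initial_var=1.0):
    """Scalar Kalman filter for a value assumed to be roughly constant."""
    estimate, var = initial_estimate, initial_var
    history = []
    for z in measurements:
        # Predict: the state model is "unchanged", so only uncertainty grows.
        var += process_var
        # Update: blend prediction and measurement by the Kalman gain.
        gain = var / (var + measurement_var)
        estimate += gain * (z - estimate)
        var *= (1.0 - gain)
        history.append(estimate)
    return history

smoothed = kalman_1d([1.0] * 50)
```

With constant input the estimate climbs from the initial guess toward the measured value, and the gain shrinks as the filter grows more certain.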
SAMPLE 1, SAMPLE 1.2, SAMPLE 1.3, SAMPLE 1.4
DEEPER DATA POINT SUPPORT
MANUAL CLASSIFICATION
• Weight 6/10: Portscan
• Weight 3/10: Geolocation
• Weight 5/10: OCR screenshot
• Weight 9/10: Previously known correct data
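One way to read the weights above is as a weighted vote across data sources. The sketch below assumes that each weight belongs to the source named next to it, which is a guess about the slide layout. The labels "webcam" and "router" are invented.

```python
# Weights as read from the slide (pairing of weight to source is an assumption).
SOURCE_WEIGHTS = {
    "portscan": 6,
    "geolocation": 3,
    "ocr_screenshot": 5,
    "previous_known_correct": 9,
}

def weighted_label(votes, weights=SOURCE_WEIGHTS):
    """Return (label, share) for the label with the largest weighted vote."""
    totals = {}
    for source, label in votes.items():
        totals[label] = totals.get(label, 0) + weights[source]
    best = max(totals, key=totals.get)
    return best, totals[best] / sum(totals.values())

label, share = weighted_label({
    "portscan": "webcam",
    "geolocation": "router",
    "ocr_screenshot": "webcam",
    "previous_known_correct": "webcam",
})
```

In this example "webcam" collects 20 of the 23 weight points, so the low-weight geolocation source is outvoted.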
38. CYBER INNOVATION LOOP
[Figure: OODA-style loop. Stages: observe (Observations), ORIENT, DECIDE (Decisions), act (Action). Orient elements: cultural identity, new information, previous experience, analysis & synthesis. Flow labels: feed forward, feedback, guide & control, interaction with environment. Inputs shown: SECURITY FEEDS, REAL WORLD DATA (RWD), MODELS.]
39. CYBER INNOVATION LOOP
[Figure: cycle linking facts, INFORMATION, HYPOTHESIS and directives through classification, ASSESSMENT, resolution and enactment, with the labels knowledge and data.]
• classification knowledge transforms fact to information
• assessment knowledge transforms information to hypothesis
• resolution knowledge transforms hypothesis to directive
• enactment knowledge transforms directive to fact
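The four transforms above form a closed cycle. A minimal sketch of that cycle as a lookup table is shown below. The function and variable names are illustrative only and do not come from the talk.

```python
# Each kind of knowledge maps one artifact type to the next, closing a loop.
TRANSFORMS = {
    "classification": ("fact", "information"),
    "assessment": ("information", "hypothesis"),
    "resolution": ("hypothesis", "directive"),
    "enactment": ("directive", "fact"),
}

def apply_knowledge(kind, artifact_type):
    """Return the artifact type produced by applying `kind` of knowledge."""
    source, target = TRANSFORMS[kind]
    if artifact_type != source:
        raise ValueError(f"{kind} knowledge expects a {source}, got {artifact_type}")
    return target

def run_loop(start="fact"):
    """Walk the four transforms once and return the visited artifact types."""
    state, visited = start, [start]
    for kind in ("classification", "assessment", "resolution", "enactment"):
        state = apply_knowledge(kind, state)
        visited.append(state)
    return visited
```

One pass starts and ends on a fact, which is what makes the loop repeatable: the fact produced by enactment becomes the input to the next round of classification.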
40. CYBERSECURITY DATA SCIENCE
[Figure labels. Inputs: TELEMETRY, SENSOR DATA, CONTEXTUAL DATA, HISTORICAL DATA, collected by agents as REAL WORLD DATA (RWD). Pipeline: data, data engineering, features, MODELS, VALUE, INTELLIGENCE. Model types: RECOMMENDER, CLASSIFIER, SOCIAL, THREAT, FRAUD. Outputs: REAL TIME PREDICTIONS AND DECISIONS, custom dashboard.]