© 2015 Proofpoint, Inc.
threat protection | compliance | archiving & governance | secure communication
Cassandra at Proofpoint (& Nexgate)
Harold Nguyen, Data Scientist, Nexgate division of Proofpoint
Slides created with the help of Proofpoint colleagues:
Bryan Burns, Brian Hawkins, Wayne Lewis, Andy Maas, Anand Somani,
Grey Saylor, and Rich Sutton
Outline of This Talk
whoami
About Proofpoint
Cassandra use cases for Proofpoint email security
Targeted Attack Protection
Threat-event correlation
General-purpose infrastructure
Clustering email topics
Cassandra use cases for Proofpoint Nexgate social media security and compliance
Spam multiplicity
Trending topics
Archive Search
Data integrity and connectedness across the globe
Few Words About Me
Data engineer/scientist
Responsible for content classification, fraud
detection, and security research
Work with engineering, marketing, and research
teams
About Proofpoint
Security and compliance for enterprise
messaging (email, social, and mobile)
Founded 2002
1100 employees worldwide
$2.5B public company: PFPT
$200M revenue
Cassandra used all over the organization
Cassandra for Email Security and
Compliance
Use cases of Cassandra for Email Security
Targeted Attack Protection (TAP)
What is a targeted attack?
An attack aimed at a specific user or organization, designed to breach a
specific target
What is TAP?
Combats targeted threats by monitoring suspicious messages
containing malicious URLs and attachments, and analyzing user
clicks
Predictive defense: uses machine learning techniques to
determine what could likely be malicious and takes preemptive
steps
Insight into threats: determines whether an organization is under attack,
who is being targeted, what threats are received, and whether they are still
valid threats
Cassandra with TAP - (Use Case 1)
C* use case with TAP
Uses Cassandra as an indexer - index URLs (row key) to email messages
(columns) that contain them
Store a blob of email message to display on dashboard for malicious alerting
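The indexing idea above can be sketched in plain Python. This is only an in-memory stand-in for the wide-row layout, not Proofpoint's schema: the names `url_index`, `message_blobs`, `index_message` and `messages_for_url` are hypothetical, and storing a received-at time as the column value is an assumption.

```python
from collections import defaultdict

# In-memory stand-in for a wide-row table: row key = URL, columns = message IDs.
# The column value (a received-at time) is an assumption for illustration.
url_index = defaultdict(dict)   # url -> {message_id: received_at}
message_blobs = {}              # message_id -> raw message bytes

def index_message(message_id, received_at, raw_message, urls):
    """Store the message blob once and add one column per URL it contains."""
    message_blobs[message_id] = raw_message
    for url in urls:
        url_index[url][message_id] = received_at

def messages_for_url(url):
    """Return the IDs of messages that contained the URL, oldest first."""
    columns = url_index.get(url, {})
    return sorted(columns, key=columns.get)

index_message("msg-1", 100, b"first", ["http://bad.example/a"])
index_message("msg-2", 50, b"second", ["http://bad.example/a", "http://ok.example/"])
print(messages_for_url("http://bad.example/a"))
```

Once a URL is judged malicious, a single row lookup yields every message that carried it, and the stored blobs can then be shown on a dashboard.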
C* infrastructure
40-node cluster in AWS, c3.2xlarge nodes
About 2 TB of EBS storage on each
Replication factor of 4
Data volume has doubled over the past year
KairosDB and C*
JMX metrics are inserted into KairosDB, where they are read and monitored
Across the 3 clusters (9, 6, 6), 14 billion metrics a month from thousands of machines
Has become critical to Proofpoint's ability to track metrics from its systems
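For a sense of scale, the monthly metric volume can be converted to an average ingest rate. This is a back-of-the-envelope sketch that assumes a 30-day month and a perfectly even load, neither of which the slides state.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month

def average_rate_per_second(metrics_per_month):
    """Average ingest rate if the load were spread evenly over the month."""
    return metrics_per_month / SECONDS_PER_MONTH

rate = average_rate_per_second(14_000_000_000)
print(round(rate))  # roughly 5,400 metrics per second on average
```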
Threat Database (Use Case 2)
Problem:
• Proofpoint collects billions of threat data points a day that aren’t being correlated
Solution:
• Build a custom graph database on top of C*
• Key is vertex, wide rows are edges
• 18 nodes, 24 TB of data, ingest peaks of 1M events per second
Benefits:
• Security researchers can now identify relationships between hosts, actors and threats
that they couldn’t before
• Dridex campaign, detection of numerous targeted attacks
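The "key is vertex, wide rows are edges" layout can be illustrated with a small in-memory model. This is a sketch under assumptions, not the team's actual schema: the edge labels, the vertex naming scheme and the function names are hypothetical.

```python
from collections import defaultdict

# Partition key = source vertex; each column in the wide row is one edge,
# identified by (edge label, destination vertex). Names are illustrative only.
edges = defaultdict(dict)

def add_edge(src, label, dst, properties=None):
    """Upsert an edge; writing the same edge twice leaves a single column."""
    edges[src][(label, dst)] = properties or {}

def neighbors(src, label=None):
    """Read one partition and return neighbors, optionally filtered by label."""
    row = edges.get(src, {})
    return sorted(dst for (lbl, dst) in row if label is None or lbl == label)

add_edge("host:198.51.100.7", "serves", "url:http://bad.example/a")
add_edge("host:198.51.100.7", "serves", "url:http://bad.example/a")  # duplicate write
add_edge("host:198.51.100.7", "used_by", "actor:example-actor")
print(neighbors("host:198.51.100.7", "serves"))
```

Because all edges of a vertex sit in one row, finding what a host is related to is a single partition read.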
Why not TitanDB ?
Proofpoint security research team created a graph database on top of Cassandra (CQL application)
Why didn’t we use TitanDB (or another existing graph DB)?
These DBs want to generate their own IDs, which causes unnecessary querying for us
This killed insert performance
We created our own ID generation scheme so an ID could be generated deterministically without
querying the DB
Cassandra allowed us to overwrite the same data multiple times if needed, without needing to
query the DB to reconcile duplicates
Titan could be “hacked” to use a hash-based ID and not call Cassandra for ID generation, but
its keys were constrained to 64-bit integers (too small for us)
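A deterministic, hash-based ID scheme of the kind described can be sketched as follows. The slides do not say which hash or ID width was used; SHA-256 truncated to 128 bits, the `type:key` input format and the names `vertex_id` and `upsert_vertex` are all assumptions for illustration.

```python
import hashlib

def vertex_id(vertex_type, natural_key):
    """Derive a 128-bit ID from the vertex's own content; no database lookup."""
    digest = hashlib.sha256(f"{vertex_type}:{natural_key}".encode("utf-8")).digest()
    return digest[:16].hex()  # assumption: 128 bits, wider than a 64-bit integer

store = {}  # stand-in for the table: id -> properties

def upsert_vertex(vertex_type, natural_key, properties):
    """Blind write: the same input always lands on the same key."""
    vid = vertex_id(vertex_type, natural_key)
    store[vid] = properties  # last write wins, like a Cassandra overwrite
    return vid

a = upsert_vertex("host", "198.51.100.7", {"seen": 1})
b = upsert_vertex("host", "198.51.100.7", {"seen": 2})
print(a == b, len(store))
```

Since the ID depends only on the data, importers can write without first reading, and a repeated write simply overwrites the earlier one instead of creating a duplicate.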
Other design differences from Titan:
A key cache is used in the import application, so we avoid having to write the vertex key over and
over
Data is sharded into many subgraphs, so queries can include time ranges and
compaction overhead is reduced
Edges design is similar to Titan - edges of a vertex are kept in the same data partition
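One way to picture sharding into time-bounded subgraphs is to make a time bucket part of the partition key. The sketch below assumes daily buckets and hypothetical names (`day_bucket`, `neighbors_between`); the slides do not state the actual bucket size.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def day_bucket(ts):
    """Assumption: one subgraph per calendar day."""
    return ts.strftime("%Y-%m-%d")

# Partition key = (vertex, day bucket); the value holds that day's edges.
subgraphs = defaultdict(set)

def add_edge(src, dst, ts):
    subgraphs[(src, day_bucket(ts))].add(dst)

def neighbors_between(src, start, end):
    """Union the edges of every daily partition touched by [start, end].

    Granularity is a whole day, so edges anywhere in the end day are included.
    """
    found = set()
    day = start.date()
    while day <= end.date():
        found |= subgraphs.get((src, day.isoformat()), set())
        day += timedelta(days=1)
    return found

add_edge("host:a", "url:x", datetime(2015, 9, 1, 12))
add_edge("host:a", "url:y", datetime(2015, 9, 3, 8))
add_edge("host:a", "url:z", datetime(2015, 9, 10))
print(sorted(neighbors_between("host:a", datetime(2015, 9, 1), datetime(2015, 9, 3))))
```

A time-ranged query reads only the partitions for the days it covers, and old buckets stop receiving writes, which is one plausible reason compaction work drops.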
Email General Purpose
Infrastructure (Use Case 3)
We also have a 6-node cluster in 2 datacenters (3 nodes in each DC)
Stores email and attachments as large encrypted blobs (from 20 MB to
2 GB) for “SecureShare”, a product that securely shares emails
Also serves as an identity database (users, customers, etc.)
Chosen over SQL because of its distributed / multi-DC nature
Email Topics Clustering (Use case 4)
Uses Cassandra as a store for clustered email topics
Uses the Word2Vec algorithm with 100-dimensional vectors, and applies the
Spark Streaming MLlib k-means clustering algorithm to the incoming stream
of email subjects
Tried k = 20, 50, and 100
Word2Vec maps words into a vector space in which words used in similar
contexts land close together
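The clustering step can be illustrated with a sequential (online) k-means update in plain Python. This is a stand-in written for illustration, not the Spark Streaming MLlib implementation and not its API. It assumes each email subject has already been turned into a vector, and uses 2-dimensional toy vectors and k = 2 instead of 100 dimensions.

```python
import math

def nearest(centroids, vector):
    """Index of the centroid closest to the vector (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(centroids[i], vector))

def update(centroids, counts, vector):
    """Assign one vector to its nearest centroid and move that centroid toward it."""
    i = nearest(centroids, vector)
    counts[i] += 1
    step = 1.0 / counts[i]  # running-mean update
    centroids[i] = [c + step * (x - c) for c, x in zip(centroids[i], vector)]
    return i

# Toy starting state: two centroids that have each absorbed one vector.
centroids = [[0.0, 0.0], [10.0, 10.0]]
counts = [1, 1]

first = update(centroids, counts, [2.0, 0.0])
second = update(centroids, counts, [10.0, 12.0])
print(first, second, centroids)
```

Each incoming vector nudges only its nearest centroid, so cluster centers can follow a stream without reprocessing earlier subjects.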
Use Cases for Social Media Security
and Compliance
Why Cassandra
Content classification is what we do.
The completeness of any classification
system is predicated on the breadth of
the corpus of data upon which it is
built.
We wanted to keep everything. Forever.
Deployment: Current production
2.1 TB of data, 150 writes & 15 reads per second
Deployment: Evolution
Start – 2012:
• Cassandra 1.1.6, Ubuntu 10.04
• One datacenter, three nodes
Finish – today:
• DataStax Enterprise 4.5.7, Ubuntu 12.04
• Three datacenters
• Nine nodes
• Solr deployed
• Half a billion content items stored
Never Gone Down
Going Global (Use Case 5)
Objectives:
• Scale system horizontally
• Allow customers to keep data in a region (EU), while still benefiting from data
gathered in other data centers (e.g. known spammy users)
Dedicated “Global” C* cluster, with each Nexgate app instance
having its own “Local” C* cluster
Regions: us-west, us-east, eu-central
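The split between a shared "Global" cluster and per-region "Local" clusters can be pictured with a small routing sketch. Which data lands in which cluster is an assumption made here for illustration (customer content stays local, spammer reputation is shared), and the function names are hypothetical.

```python
REGIONS = ("us-west", "us-east", "eu-central")

local_clusters = {region: {} for region in REGIONS}  # region -> {item_id: content}
global_cluster = {}                                   # user_id -> regions that reported it

def store_content(region, item_id, content):
    """Customer content is written only to the cluster of its home region."""
    local_clusters[region][item_id] = content

def report_spammer(region, user_id):
    """Reputation signals go to the shared global cluster."""
    global_cluster.setdefault(user_id, set()).add(region)

def is_known_spammer(user_id):
    """Any region can consult the shared reputation data."""
    return user_id in global_cluster

store_content("eu-central", "item-1", "hello from the EU")
report_spammer("us-west", "user-42")
print(is_known_spammer("user-42"))
```

In this picture an EU customer's content never leaves eu-central, yet the EU instance still learns about a spammer first seen in us-west.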
Global data center
2 nodes in each datacenter across the globe, 60 writes/sec, 1
read/sec
Spam multiplicity (Use Case 6)
Problem:
• Spammers on social media repeat messages across accounts
Solution:
• Cassandra data model to query repeat-content spammers in real time
Benefits: Efficiently get a count of times we’ve seen content, while
retaining detail data, supporting real-time analysis
Spam Multiplicity Data Modeling
CREATE TABLE item_cnt (
    content text,
    column1 text,
    value text,
    PRIMARY KEY ((content), column1)
)
Hash of Content (Partition Key) | Column1 | Value
d131dd02c5e6eec4… | property_native_id | “itemId_timestamp”
Look up content quickly (by hitting hashed index)
Number of columns = number of times content was seen
Value provides information for offline analysis (time series,
patterns in content, etc…)
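The counting behaviour of this model can be mimicked in memory. The sketch below is an illustration under assumptions: SHA-256 stands in for the unspecified content hash, the strip-and-lowercase normalization is invented for the example, and the `record` and `multiplicity` helpers are hypothetical.

```python
import hashlib
from collections import defaultdict

# Stand-in for item_cnt: content hash -> {column1: value}
item_cnt = defaultdict(dict)

def content_key(text):
    """Hash of the content; the hash choice and normalization are assumptions."""
    return hashlib.sha256(text.strip().lower().encode("utf-8")).hexdigest()

def record(text, property_native_id, item_id, timestamp):
    """Add one column for this sighting; the value keeps detail for offline analysis."""
    item_cnt[content_key(text)][property_native_id] = f"{item_id}_{timestamp}"

def multiplicity(text):
    """Number of columns in the partition = number of times the content was seen."""
    return len(item_cnt.get(content_key(text), {}))

record("Win a free phone!", "acct-1:post-9", 9, 1000)
record("  win a FREE phone! ", "acct-2:post-4", 4, 1010)
print(multiplicity("Win a free phone!"))
```

The count comes from a single partition read, while the per-sighting values remain available for time-series or pattern analysis later.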
Trending topics (Use Case 7)
Problem:
• Detect when the conversation radically changes on a social account
Solution:
• Use Cassandra data model to detect social mob and alert when it occurs
Benefits: Efficiently get bi-gram counts from adjacent date ranges
and analyze them
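Comparing bi-gram counts from two adjacent windows might look like the sketch below. The slides do not describe the actual detection rule; flagging bi-grams that appear at least `min_count` times now and never before is an assumption, as are the helper names.

```python
from collections import Counter

def bigrams(text):
    """Adjacent word pairs of a message, lowercased."""
    words = text.lower().split()
    return list(zip(words, words[1:]))

def bigram_counts(messages):
    counts = Counter()
    for message in messages:
        counts.update(bigrams(message))
    return counts

def emerging(previous, current, min_count=2):
    """Bi-grams seen at least min_count times now and never in the previous window."""
    return sorted(bg for bg, n in current.items() if n >= min_count and previous[bg] == 0)

previous = bigram_counts(["great new product", "love the new product"])
current = bigram_counts(["total scam alert", "scam alert everyone", "new product"])
print(emerging(previous, current))
```

A sudden set of new, frequently repeated bi-grams on an account is the kind of signal that could trigger an alert.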
Trending Topics Data Modeling
CREATE TABLE trending_topics (
    account_id int,
    year_month text,
    minute_bucket timestamp,
    topic text,
    item_id int,
    counter_value counter,
    PRIMARY KEY ((account_id, year_month),
        minute_bucket, topic, item_id)
)
Goal: Get back the number of times a topic has been seen for any range
of minutes
(account_id, year_month) is the “composite partition key”: data with the
same account ID and month lives on the same node
Within a partition, rows are sorted by minute_bucket, topic, and item_id
account_id and year_month are the minimum values needed for a query
Flexibility for analysis
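How this key layout supports range counts can be shown with an in-memory stand-in. The `"%Y-%m"` format for `year_month`, the helper names and the client-side topic filter are assumptions; the real table would be read with CQL.

```python
from collections import defaultdict
from datetime import datetime

# (account_id, year_month) -> {(minute_bucket, topic, item_id): counter_value}
partitions = defaultdict(lambda: defaultdict(int))

def increment(account_id, ts, topic, item_id, by=1):
    """Bump the counter for the minute that ts falls in."""
    minute_bucket = ts.replace(second=0, microsecond=0)
    year_month = ts.strftime("%Y-%m")  # assumption: format of the year_month column
    partitions[(account_id, year_month)][(minute_bucket, topic, item_id)] += by

def topic_count(account_id, year_month, start, end, topic):
    """Sum a topic's counters over the minute range [start, end] in one partition."""
    rows = partitions.get((account_id, year_month), {})
    return sum(v for (minute, t, _), v in rows.items()
               if start <= minute <= end and t == topic)

increment(7, datetime(2015, 9, 1, 10, 0, 30), "outage", 101)
increment(7, datetime(2015, 9, 1, 10, 1, 10), "outage", 102)
increment(7, datetime(2015, 9, 1, 10, 5, 0), "outage", 103)
increment(7, datetime(2015, 9, 1, 10, 1, 45), "promo", 104)
print(topic_count(7, "2015-09", datetime(2015, 9, 1, 10, 0), datetime(2015, 9, 1, 10, 1), "outage"))
```

Only the account and month are needed to locate the partition; the minute range then narrows the read to a contiguous slice of sorted rows.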
Archive search (Use Case 8)
Problem:
• Allow customers to identify arbitrary compliance problems in social content with an open-ended
search feature
Solution:
• Cassandra column family that contains the content
• DataStax Enterprise Solr with a core on that column family
Benefits: Near-real-time index updates make new content searchable from the same
infrastructure
Combined with trending topics, this can be used to easily look up and remove
inappropriate content from a social media account
Summary
8 use cases that take advantage of Cassandra:
Data modeling
Distributed nature
Other tools plug in easily (Solr, Spark)
Ease of Use
Community’s amazing support
Q&A
threat protection | compliance | archiving & governance | secure communication

More Related Content

What's hot

Tsinghua University: Two Exemplary Applications in China
Tsinghua University: Two Exemplary Applications in ChinaTsinghua University: Two Exemplary Applications in China
Tsinghua University: Two Exemplary Applications in ChinaDataStax Academy
 
Cassandra CLuster Management by Japan Cassandra Community
Cassandra CLuster Management by Japan Cassandra CommunityCassandra CLuster Management by Japan Cassandra Community
Cassandra CLuster Management by Japan Cassandra CommunityHiromitsu Komatsu
 
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...DataStax Academy
 
Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra
Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable CassandraCassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra
Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandraaaronmorton
 
The Last Pickle: Distributed Tracing from Application to Database
The Last Pickle: Distributed Tracing from Application to DatabaseThe Last Pickle: Distributed Tracing from Application to Database
The Last Pickle: Distributed Tracing from Application to DatabaseDataStax Academy
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeItai Yaffe
 
GumGum: Multi-Region Cassandra in AWS
GumGum: Multi-Region Cassandra in AWSGumGum: Multi-Region Cassandra in AWS
GumGum: Multi-Region Cassandra in AWSDataStax Academy
 
Deep dive into event store using Apache Cassandra
Deep dive into event store using Apache CassandraDeep dive into event store using Apache Cassandra
Deep dive into event store using Apache CassandraAhmedabadJavaMeetup
 
Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...
Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...
Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...DataStax
 
C* Capacity Forecasting (Ajay Upadhyay, Jyoti Shandil, Arun Agrawal, Netflix)...
C* Capacity Forecasting (Ajay Upadhyay, Jyoti Shandil, Arun Agrawal, Netflix)...C* Capacity Forecasting (Ajay Upadhyay, Jyoti Shandil, Arun Agrawal, Netflix)...
C* Capacity Forecasting (Ajay Upadhyay, Jyoti Shandil, Arun Agrawal, Netflix)...DataStax
 
Instaclustr webinar 2017 feb 08 japan
Instaclustr webinar 2017 feb 08   japanInstaclustr webinar 2017 feb 08   japan
Instaclustr webinar 2017 feb 08 japanHiromitsu Komatsu
 
Solving Hybrid Cloud Data Replication with Apache Cassandra
Solving Hybrid Cloud Data Replication with Apache CassandraSolving Hybrid Cloud Data Replication with Apache Cassandra
Solving Hybrid Cloud Data Replication with Apache CassandraAaron Ploetz
 
Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...
Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...
Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...DataStax
 
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd KnownCassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd KnownDataStax
 
Petabridge: The New .NET Enterprise Stack
Petabridge: The New .NET Enterprise StackPetabridge: The New .NET Enterprise Stack
Petabridge: The New .NET Enterprise StackDataStax Academy
 
Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...
Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...
Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...DataStax
 
Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017
Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017
Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017Big Data Spain
 
Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7DataStax
 

What's hot (20)

Tsinghua University: Two Exemplary Applications in China
Tsinghua University: Two Exemplary Applications in ChinaTsinghua University: Two Exemplary Applications in China
Tsinghua University: Two Exemplary Applications in China
 
Cassandra CLuster Management by Japan Cassandra Community
Cassandra CLuster Management by Japan Cassandra CommunityCassandra CLuster Management by Japan Cassandra Community
Cassandra CLuster Management by Japan Cassandra Community
 
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
 
Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra
Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable CassandraCassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra
Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra
 
The Last Pickle: Distributed Tracing from Application to Database
The Last Pickle: Distributed Tracing from Application to DatabaseThe Last Pickle: Distributed Tracing from Application to Database
The Last Pickle: Distributed Tracing from Application to Database
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real time
 
GumGum: Multi-Region Cassandra in AWS
GumGum: Multi-Region Cassandra in AWSGumGum: Multi-Region Cassandra in AWS
GumGum: Multi-Region Cassandra in AWS
 
Deep dive into event store using Apache Cassandra
Deep dive into event store using Apache CassandraDeep dive into event store using Apache Cassandra
Deep dive into event store using Apache Cassandra
 
Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...
Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...
Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...
 
C* Capacity Forecasting (Ajay Upadhyay, Jyoti Shandil, Arun Agrawal, Netflix)...
C* Capacity Forecasting (Ajay Upadhyay, Jyoti Shandil, Arun Agrawal, Netflix)...C* Capacity Forecasting (Ajay Upadhyay, Jyoti Shandil, Arun Agrawal, Netflix)...
C* Capacity Forecasting (Ajay Upadhyay, Jyoti Shandil, Arun Agrawal, Netflix)...
 
Instaclustr webinar 2017 feb 08 japan
Instaclustr webinar 2017 feb 08   japanInstaclustr webinar 2017 feb 08   japan
Instaclustr webinar 2017 feb 08 japan
 
Solving Hybrid Cloud Data Replication with Apache Cassandra
Solving Hybrid Cloud Data Replication with Apache CassandraSolving Hybrid Cloud Data Replication with Apache Cassandra
Solving Hybrid Cloud Data Replication with Apache Cassandra
 
Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...
Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...
Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...
 
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd KnownCassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
 
Cassandra & Spark for IoT
Cassandra & Spark for IoTCassandra & Spark for IoT
Cassandra & Spark for IoT
 
Petabridge: The New .NET Enterprise Stack
Petabridge: The New .NET Enterprise StackPetabridge: The New .NET Enterprise Stack
Petabridge: The New .NET Enterprise Stack
 
Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...
Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...
Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...
 
Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017
Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017
Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017
 
In Flux Limiting for a multi-tenant logging service
In Flux Limiting for a multi-tenant logging serviceIn Flux Limiting for a multi-tenant logging service
In Flux Limiting for a multi-tenant logging service
 
Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7
 

Viewers also liked

Webinar: Proofpoint, a pioneer in security-as-a-service protects people, info...
Webinar: Proofpoint, a pioneer in security-as-a-service protects people, info...Webinar: Proofpoint, a pioneer in security-as-a-service protects people, info...
Webinar: Proofpoint, a pioneer in security-as-a-service protects people, info...DataStax
 
Institucional proofpoint
Institucional proofpointInstitucional proofpoint
Institucional proofpointvoliverio
 
Adapted from an ESG report - Seeing Is Securing - Protecting Against Advanced...
Adapted from an ESG report - Seeing Is Securing - Protecting Against Advanced...Adapted from an ESG report - Seeing Is Securing - Protecting Against Advanced...
Adapted from an ESG report - Seeing Is Securing - Protecting Against Advanced...Proofpoint
 
Customer Success and Security Technology
Customer Success and Security TechnologyCustomer Success and Security Technology
Customer Success and Security TechnologyGainsight
 
E-FILE_Proofpoint_Uberflip_120915_optimized
E-FILE_Proofpoint_Uberflip_120915_optimizedE-FILE_Proofpoint_Uberflip_120915_optimized
E-FILE_Proofpoint_Uberflip_120915_optimizedLynn Feltner
 
Proofpoint Outbound/DLP Survey Results
Proofpoint Outbound/DLP Survey ResultsProofpoint Outbound/DLP Survey Results
Proofpoint Outbound/DLP Survey Resultsshapetech
 
Social media – issues and trends caus 2014
Social media – issues and trends   caus 2014Social media – issues and trends   caus 2014
Social media – issues and trends caus 2014Dan Michaluk
 
Compliant Practices for Social Media in Financial Services
Compliant Practices for Social Media in Financial ServicesCompliant Practices for Social Media in Financial Services
Compliant Practices for Social Media in Financial ServicesLinkedIn Sales Solutions
 
Governança de Dados nas empresas - BI Summit 2017
Governança de Dados nas empresas - BI Summit 2017Governança de Dados nas empresas - BI Summit 2017
Governança de Dados nas empresas - BI Summit 2017BLRDATA
 
Tecnoset curitiba printing services
Tecnoset curitiba   printing servicesTecnoset curitiba   printing services
Tecnoset curitiba printing servicesFernando Misato
 
Fraud Analytics Techniques Moving Into Security
Fraud Analytics Techniques Moving Into SecurityFraud Analytics Techniques Moving Into Security
Fraud Analytics Techniques Moving Into SecurityBruno Motta Rego
 
Using Social Media for Security Monitoring
Using Social Media for Security MonitoringUsing Social Media for Security Monitoring
Using Social Media for Security MonitoringSysomos
 
Cisco amp everywhere
Cisco amp everywhereCisco amp everywhere
Cisco amp everywhereCisco Canada
 
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetchDataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetchDataStax Academy
 
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...DataStax Academy
 

Viewers also liked (15)

Webinar: Proofpoint, a pioneer in security-as-a-service protects people, info...
Webinar: Proofpoint, a pioneer in security-as-a-service protects people, info...Webinar: Proofpoint, a pioneer in security-as-a-service protects people, info...
Webinar: Proofpoint, a pioneer in security-as-a-service protects people, info...
 
Institucional proofpoint
Institucional proofpointInstitucional proofpoint
Institucional proofpoint
 
Adapted from an ESG report - Seeing Is Securing - Protecting Against Advanced...
Adapted from an ESG report - Seeing Is Securing - Protecting Against Advanced...Adapted from an ESG report - Seeing Is Securing - Protecting Against Advanced...
Adapted from an ESG report - Seeing Is Securing - Protecting Against Advanced...
 
Customer Success and Security Technology
Customer Success and Security TechnologyCustomer Success and Security Technology
Customer Success and Security Technology
 
E-FILE_Proofpoint_Uberflip_120915_optimized
E-FILE_Proofpoint_Uberflip_120915_optimizedE-FILE_Proofpoint_Uberflip_120915_optimized
E-FILE_Proofpoint_Uberflip_120915_optimized
 
Proofpoint Outbound/DLP Survey Results
Proofpoint Outbound/DLP Survey ResultsProofpoint Outbound/DLP Survey Results
Proofpoint Outbound/DLP Survey Results
 
Social media – issues and trends caus 2014
Social media – issues and trends   caus 2014Social media – issues and trends   caus 2014
Social media – issues and trends caus 2014
 
Compliant Practices for Social Media in Financial Services
Compliant Practices for Social Media in Financial ServicesCompliant Practices for Social Media in Financial Services
Compliant Practices for Social Media in Financial Services
 
Governança de Dados nas empresas - BI Summit 2017
Governança de Dados nas empresas - BI Summit 2017Governança de Dados nas empresas - BI Summit 2017
Governança de Dados nas empresas - BI Summit 2017
 
Tecnoset curitiba printing services
Tecnoset curitiba   printing servicesTecnoset curitiba   printing services
Tecnoset curitiba printing services
 
Fraud Analytics Techniques Moving Into Security
Fraud Analytics Techniques Moving Into SecurityFraud Analytics Techniques Moving Into Security
Fraud Analytics Techniques Moving Into Security
 
Using Social Media for Security Monitoring
Using Social Media for Security MonitoringUsing Social Media for Security Monitoring
Using Social Media for Security Monitoring
 
Cisco amp everywhere
Cisco amp everywhereCisco amp everywhere
Cisco amp everywhere
 
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetchDataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
 
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...
 

Similar to Proofpoint: Fraud Detection and Security on Social Media

Wicsa2011 cloud tutorial
Wicsa2011 cloud tutorialWicsa2011 cloud tutorial
Wicsa2011 cloud tutorialAnna Liu
 
Data Virtualization to Survive a Multi and Hybrid Cloud World
Data Virtualization to Survive a Multi and Hybrid Cloud WorldData Virtualization to Survive a Multi and Hybrid Cloud World
Data Virtualization to Survive a Multi and Hybrid Cloud WorldDenodo
 
Lean Enterprise, Microservices and Big Data
Lean Enterprise, Microservices and Big DataLean Enterprise, Microservices and Big Data
Lean Enterprise, Microservices and Big DataStylight
 
IBM CDS Overview
IBM CDS OverviewIBM CDS Overview
IBM CDS OverviewJean Tan
 
Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...
Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...
Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...DataStax Academy
 
Declare Victory with Big Data
Declare Victory with Big DataDeclare Victory with Big Data
Declare Victory with Big DataJ On The Beach
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise AnalyticsDATAVERSITY
 
AWS Summit Berlin 2013 - Big Data Analytics
AWS Summit Berlin 2013 - Big Data AnalyticsAWS Summit Berlin 2013 - Big Data Analytics
AWS Summit Berlin 2013 - Big Data AnalyticsAWS Germany
 
Social Security Company Nexgate's Success Relies on Apache Cassandra
Social Security Company Nexgate's Success Relies on Apache CassandraSocial Security Company Nexgate's Success Relies on Apache Cassandra
Social Security Company Nexgate's Success Relies on Apache CassandraDataStax Academy
 
How much money do you lose every time your ecommerce site goes down?
How much money do you lose every time your ecommerce site goes down?How much money do you lose every time your ecommerce site goes down?
How much money do you lose every time your ecommerce site goes down?DataStax
 
(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014
(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014
(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014Amazon Web Services
 
Building and Maintaining Bulletproof Systems with DataStax
Building and Maintaining Bulletproof Systems with DataStaxBuilding and Maintaining Bulletproof Systems with DataStax
Building and Maintaining Bulletproof Systems with DataStaxDataStax
 
AWS Sydney Summit 2013 - Big Data Analytics
AWS Sydney Summit 2013 - Big Data AnalyticsAWS Sydney Summit 2013 - Big Data Analytics
AWS Sydney Summit 2013 - Big Data AnalyticsAmazon Web Services
 
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...MSAdvAnalytics
 
Cloudera federal summit
Cloudera federal summitCloudera federal summit
Cloudera federal summitMatt Carroll
 
IRJET- Secured Hadoop Environment
IRJET- Secured Hadoop EnvironmentIRJET- Secured Hadoop Environment
IRJET- Secured Hadoop EnvironmentIRJET Journal
 
System Security on Cloud
System Security on CloudSystem Security on Cloud
System Security on CloudTu Pham
 
Distributed Data Processing for Real-time Applications
Distributed Data Processing for Real-time ApplicationsDistributed Data Processing for Real-time Applications
Distributed Data Processing for Real-time ApplicationsScyllaDB
 
Applying Auto-Data Classification Techniques for Large Data Sets
Applying Auto-Data Classification Techniques for Large Data SetsApplying Auto-Data Classification Techniques for Large Data Sets
Applying Auto-Data Classification Techniques for Large Data SetsPriyanka Aash
 

Similar to Proofpoint: Fraud Detection and Security on Social Media (20)

Wicsa2011 cloud tutorial
Wicsa2011 cloud tutorialWicsa2011 cloud tutorial
Wicsa2011 cloud tutorial
 
Data Virtualization to Survive a Multi and Hybrid Cloud World
Data Virtualization to Survive a Multi and Hybrid Cloud WorldData Virtualization to Survive a Multi and Hybrid Cloud World
Data Virtualization to Survive a Multi and Hybrid Cloud World
 
Lean Enterprise, Microservices and Big Data
Lean Enterprise, Microservices and Big DataLean Enterprise, Microservices and Big Data
Lean Enterprise, Microservices and Big Data
 
IBM CDS Overview
IBM CDS OverviewIBM CDS Overview
IBM CDS Overview
 
Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...
Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...
Cassandra Summit 2014: Social Media Security Company Nexgate Relies on Cassan...
 
Declare Victory with Big Data
Declare Victory with Big DataDeclare Victory with Big Data
Declare Victory with Big Data
 
2022 Trends in Enterprise Analytics
Proofpoint: Fraud Detection and Security on Social Media

  • 1. Cassandra at Proofpoint (& Nexgate)
     Harold Nguyen, Data Scientist, Nexgate division of Proofpoint
     Slides created with the help of Proofpoint colleagues: Bryan Burns, Brian Hawkins, Wayne Lewis, Andy Maas, Anand Somani, Grey Saylor, and Rich Sutton
     © 2015 Proofpoint, Inc. threat protection | compliance | archiving & governance | secure communication
  • 2. Outline of This Talk
     whoami
     About Proofpoint
     Cassandra use cases for Proofpoint email security:
       • Targeted Attack Protection
       • Threat-event correlation
       • General-purpose infrastructure
       • Clustering email topics
     Cassandra use cases for Proofpoint Nexgate social media security and compliance:
       • Spam multiplicity
       • Trending topics
       • Archive search
       • Data integrity and connectedness across the globe
  • 3. A Few Words About Me
     Data engineer/scientist
     Responsible for content classification, fraud detection, and security research
     Work with engineering, marketing, and research teams
  • 4. About Proofpoint
     Security and compliance for enterprise messaging (email, social, and mobile)
     Founded 2002
     1100 employees worldwide
     $2.5B public company: PFPT
     $200M revenue
     Cassandra used all over the organization
  • 5. Cassandra for Email Security and Compliance
     Use cases of Cassandra for email security
  • 6. Targeted Attack Protection (TAP)
     What is a targeted attack?
       • An attack aimed at a specific user or organization, designed to breach a specific target
     What is TAP?
       • Combats targeted threats by monitoring suspicious messages containing malicious URLs and attachments, and analyzing user clicks
       • Predictive defense: uses machine learning techniques to determine what "could likely" be malicious and takes preemptive steps
       • Insight into threats: determines whether an organization is under attack, who is being targeted, what threats are received, and whether they are still valid threats
  • 7. Cassandra with TAP (Use Case 1)
     C* use case with TAP:
       • Uses Cassandra as an indexer: maps URLs (row key) to the email messages (columns) that contain them
       • Stores a blob of each email message to display on the dashboard for alerts about malicious messages
     C* infrastructure:
       • 40-node cluster in AWS, c3.2xlarge nodes
       • About 2 TB of EBS storage on each node
       • Replication factor of 4
       • Data volume has doubled over the past year
     KairosDB and C*:
       • JMX metrics are inserted into KairosDB, where they are read and monitored
       • Across the 3 clusters (9, 6, 6), 14 billion metrics a month from 1000s of machines
       • Has become critical to Proofpoint's ability to track metrics from its systems
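     The URL-to-message index described on this slide can be illustrated with a small in-memory model. This is only a sketch: the dictionary stands in for a Cassandra wide row, and the names `url_index`, `index_message`, and `messages_containing` are hypothetical and do not come from the talk.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the wide-row index:
# row key = URL, columns = IDs of the messages that contain it.
url_index = defaultdict(set)


def index_message(message_id, urls):
    """Add one column per (URL, message) pair, as the indexer would."""
    for url in urls:
        url_index[url].add(message_id)


def messages_containing(url):
    """Read back the whole row for a URL; unknown URLs give an empty list."""
    return sorted(url_index.get(url, set()))


index_message("msg-1", ["http://bad.example/a", "http://ok.example/"])
index_message("msg-2", ["http://bad.example/a"])
print(messages_containing("http://bad.example/a"))
```

     When a URL is later judged malicious, a single row read returns every message that carried it, which is what makes the row-key-per-URL layout attractive for alerting.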
  • 8. Threat Database (Use Case 2)
     Problem:
       • Proofpoint collects billions of threat data points a day that aren't being correlated
     Solution:
       • Build a custom graph database on top of C*
       • Key is vertex, wide rows are edges
       • 18 nodes, 24 TB of data, ingest peaks of 1M events per second
     Benefits:
       • Security researchers can now identify relationships between hosts, actors, and threats that they couldn't before
       • Dridex campaign, detection of numerous targeted attacks
  • 9. Why not TitanDB?
     The Proofpoint security research team created a graph database on top of Cassandra (CQL application)
     Why didn't we use TitanDB (or other existing graph DBs)?
       • These DBs want to generate their own IDs, which causes unnecessary querying for us
       • This killed insert performance
       • We created our own ID generation scheme so an ID could be generated deterministically without querying the DB
       • Cassandra allowed us to overwrite the same data multiple times if needed, without querying the DB to reconcile duplicates
       • Titan could be "hacked" to use a hash-based ID and not call Cassandra for ID generation, but its keys were constrained to 64-bit integers (too small for us)
     Other design differences from Titan:
       • A key cache is used in the import application, so we avoid having to write the vertex key over and over
       • Data is sharded into many subgraphs, so queries can include time ranges and compaction overhead is reduced
     Edge design is similar to Titan: the edges of a vertex are kept in the same data partition
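     A minimal sketch of the idea behind deterministic IDs, under stated assumptions: the talk does not describe the actual scheme, so the SHA-256 hash, the 128-bit truncation, the `kind:value` encoding, and the names `vertex_id` and `add_edge` are all hypothetical. The dictionary models "key is vertex, wide row is edges".

```python
import hashlib


def vertex_id(kind, value):
    """Derive a 128-bit ID from the vertex's own content (hypothetical scheme).

    No database lookup is needed: the same (kind, value) always hashes
    to the same ID, and 128 bits is wider than a 64-bit integer key.
    """
    digest = hashlib.sha256(f"{kind}:{value}".encode("utf-8")).digest()
    return digest[:16].hex()


# In-memory stand-in for the table: vertex ID -> {(label, neighbor ID): props}
graph = {}


def add_edge(src, label, dst, props=None):
    """Write an edge into the source vertex's row; rewriting it is harmless."""
    src_id, dst_id = vertex_id(*src), vertex_id(*dst)
    graph.setdefault(src_id, {})[(label, dst_id)] = props or {}
    return src_id, dst_id


# Importing the same event twice overwrites the same cell instead of duplicating it.
add_edge(("host", "bad.example"), "served", ("url", "http://bad.example/a"))
add_edge(("host", "bad.example"), "served", ("url", "http://bad.example/a"))
print(len(graph[vertex_id("host", "bad.example")]))
```

     Because the ID depends only on the vertex content, the importer never has to read before it writes, and duplicate writes collapse onto the same cell.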
  • 10. Email General-Purpose Infrastructure (Use Case 3)
     We also have a 6-node cluster in 2 datacenters (3 nodes in each DC)
       • Stores email and attachments as large encrypted blobs (from 20 MB to 2 GB) for "SecureShare", a product that securely shares emails
       • Serves as an identity database: users, customers, etc.
       • Chosen over SQL because of its distributed, multi-DC nature
  • 11. Email Topics Clustering (Use Case 4)
     Uses Cassandra as a store for clustered email topics
     Uses the Word2Vec algorithm with 100-dimensional vectors, and applies the Spark Streaming MLlib k-means clustering algorithm to the incoming stream of email subjects
     Tried k = 20, 50, and 100
     Word2Vec maps words into a shared vector space in which synonymous words lie close together
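     To make the clustering step concrete, here is a plain batch k-means (Lloyd's algorithm) in pure Python. It is a simplified stand-in, not the Spark Streaming MLlib implementation named on the slide, and it assumes each subject has already been turned into a vector; the slides do not say how that is done. The toy vectors are 2-dimensional rather than 100-dimensional, and all names are hypothetical.

```python
import math


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


def kmeans(points, k, iterations=10):
    """Batch k-means with a naive initialization from the first k points."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids


# Toy "subject vectors": two obvious groups.
subjects = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9]]
centroids = kmeans(subjects, k=2)
labels = [nearest(s, centroids) for s in subjects]
print(labels)
```

     In a setup like the one on the slide, the cluster label per subject is the kind of result that would be written back to Cassandra.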
  • 12. Use Cases for Social Media Security and Compliance
  • 13. Why Cassandra
     Content classification is what we do. The completeness of any classification system is predicated on the breadth of the corpus of data upon which it is built.
     We wanted to keep everything. Forever.
  • 14. Deployment: Current Production
     2.1 TB of data, 150 writes and 15 reads per second
  • 15. Deployment: Evolution
     Start (2012):
       • Cassandra 1.1.6, Ubuntu 10.04
       • One datacenter, three nodes
     Finish (today):
       • DataStax Enterprise 4.5.7, Ubuntu 12.04
       • Three datacenters
       • Nine nodes
       • Solr deployed
       • Half a billion content items stored
  • 16. Never Gone Down
  • 17. Going Global (Use Case 5)
     Objectives:
       • Scale the system horizontally
       • Allow customers to keep data in a region (EU) while benefiting from other data centers (e.g., spammy users)
     Dedicated "Global" C* cluster, with each Nexgate app instance having its own "Local" C* cluster
     Regions: us-west, us-east, eu-central
  • 18. Global Data Center
     2 nodes in each datacenter across the globe, 60 writes/sec, 1 read/sec
  • 19. Spam Multiplicity (Use Case 6)
     Problem:
       • Spammers on social media repeat messages across accounts
     Solution:
       • Cassandra data model to query repeat-content spammers in real time
     Benefits:
       • Efficiently get a count of the times we've seen a piece of content, while retaining detail data and supporting real-time analysis
  • 20. Spam Multiplicity Data Modeling
     CREATE TABLE item_cnt (
       content text,
       column1 text,
       value text,
       PRIMARY KEY ((content), column1)
     )
     Hash of content (partition key) | column1            | value
     d131dd02c5e6eec4 …              | property_native_id | "itemId_timestamp"
       • Look up content quickly (by hitting the hashed index)
       • Number of columns = number of times the content was seen
       • Value provides information for offline analysis (time series, patterns in content, etc.)
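     The counting behavior of this table can be sketched with a dictionary of dictionaries. This is an illustrative model only: the normalization step, the choice of SHA-256, and the function names are assumptions, since the slide shows only a hex hash as the partition key. Note that, as with a CQL upsert, writing the same `column1` twice overwrites the cell rather than adding a second one.

```python
import hashlib
from collections import defaultdict

# In-memory stand-in for item_cnt: content hash -> {column1: value}
item_cnt = defaultdict(dict)


def content_key(text):
    """Hash lightly normalized content (normalization and hash are assumptions)."""
    return hashlib.sha256(text.strip().lower().encode("utf-8")).hexdigest()


def record(text, property_native_id, item_id, timestamp):
    """Add one column to the content's row, valued "itemId_timestamp"."""
    item_cnt[content_key(text)][property_native_id] = f"{item_id}_{timestamp}"


def multiplicity(text):
    """Number of columns in the row = number of times the content was seen."""
    return len(item_cnt.get(content_key(text), {}))


record("Buy now!", "native-1", 10, 1000)
record("buy now! ", "native-2", 11, 1005)
print(multiplicity("BUY NOW!"))
```

     The values kept in each column are what make later offline analysis possible, for example plotting when each repeat of the same content appeared.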
  • 21. Trending Topics (Use Case 7)
     Problem:
       • Detect when the conversation radically changes on a social account
     Solution:
       • Use a Cassandra data model to detect a social mob and alert when it occurs
     Benefits:
       • Efficiently get bi-gram counts from adjacent date ranges and analyze them
  • 22. Trending Topics Data Modeling
     CREATE TABLE trending_topics (
       account_id int,
       year_month text,
       minute_bucket timestamp,
       topic text,
       item_id int,
       counter_value counter,
       PRIMARY KEY ((account_id, year_month), minute_bucket, topic, item_id)
     )
     Goal: get back the number of times a topic has been seen for any range of minutes
       • (account_id, year_month) is a "composite partition key": data with the same account ID and month lives on the same node
       • Rows are sorted by minute_bucket, topic, and item_id
       • account_id and year_month are the only values required for a query
       • Flexibility for analysis
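     The range query this table is designed for can be illustrated with an in-memory model. This is a sketch under assumptions: the "%Y-%m" format for `year_month` and the function names are hypothetical, and a real query would rely on Cassandra's clustering order rather than scanning the partition in Python.

```python
from collections import Counter, defaultdict
from datetime import datetime

# (account_id, year_month) -> {(minute_bucket, topic, item_id): counter_value}
partitions = defaultdict(Counter)


def increment(account_id, ts, topic, item_id):
    """Bump the counter cell for this item in its minute bucket."""
    minute_bucket = ts.replace(second=0, microsecond=0)
    year_month = ts.strftime("%Y-%m")  # assumed text format for year_month
    partitions[(account_id, year_month)][(minute_bucket, topic, item_id)] += 1


def topic_counts(account_id, year_month, start, end):
    """Total count per topic for minute buckets in the range [start, end)."""
    totals = Counter()
    rows = partitions.get((account_id, year_month), {})
    for (minute, topic, _item_id), n in rows.items():
        if start <= minute < end:
            totals[topic] += n
    return totals


increment(7, datetime(2015, 6, 1, 12, 0, 30), "free iphone", 1)
increment(7, datetime(2015, 6, 1, 12, 0, 45), "free iphone", 2)
increment(7, datetime(2015, 6, 1, 12, 5, 0), "free iphone", 3)
increment(7, datetime(2015, 6, 1, 12, 0, 10), "nice photo", 4)
counts = topic_counts(7, "2015-06", datetime(2015, 6, 1, 12, 0), datetime(2015, 6, 1, 12, 1))
print(dict(counts))
```

     Running `topic_counts` over two adjacent ranges and comparing the results is one way to notice that the conversation on an account has shifted.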
  • 23. Archive Search (Use Case 8)
     Problem:
       • Allow customers to identify arbitrary compliance problems in social content with an open-ended search feature
     Solution:
       • Cassandra column family that contains the content
       • DataStax Enterprise Solr with a core on that column family
     Benefits:
       • Near real-time index updates make new content available via search from the same infrastructure
       • Combined with trending topics, it can be used to easily look up and remove inappropriate content from a social media account
  • 24. Summary
     8 use cases that take advantage of Cassandra:
       • Data modeling
       • Distributed nature
       • Other tools can easily plug in (Solr, Spark)
       • Ease of use
       • The community's amazing support
  • 25. Q&A