Apache Metron Meetup May 4, 2016 - Big data cybersecurity | Hortonworks
For more info: http://hortonworks.com/apache/metron/
To ask questions: https://community.hortonworks.com/spaces/111/cybersecurity.html?type=question
To contribute: https://metron.incubator.apache.org/
Co Speaker: Cheryl Biswas
Talk Description:
How about this: a blue team talk given by red teamers. But here’s our rationale - your best defence right now is a strategic offence. The rules of the game have changed and we need to get defence up to speed.
We’ll show you what the key elements are in a good defence strategy; what you can and need to be using to full advantage. We’ll talk about the new “buzzwords” and how they apply: visibility; patterns; big data. There’s a whole lotta data to wrangle, and you aren’t seeing the whole picture if you aren’t doing things right. Threat intel is about getting the big picture as it applies to you. You’ll learn the importance of context and prioritization so that you can manipulate intel feeds to do your bidding. And then we’ll take things further and talk about hunting the adversary, using an update on proven methodologies.
We’ll show you how to understand your data, correlate threats and pinpoint attacks. Attendees will leave with a new understanding of the resources they have on hand, and how to leverage those into an Adaptive Proactive Defense Strategy.
Hortonworks Data In Motion Webinar Series Pt. 2 | Hortonworks
How Hortonworks DataFlow (HDF), powered by Apache NiFi, MiNiFi, Kafka and Storm, and its associated HDF Certification Program make it easier and faster to integrate different systems. Highlights of the latest partner integrations from HPE, SAS, Attunity, Impetus Technologies, Kepware and Midfin Systems.
Watch the webinar on-demand: http://hortonworks.com/webinar/make-big-data-ecosystem-work-better/
HDF Partner certification program: http://hortonworks.com/partners/product-integration-certification/#hdf-integration
Hortonworks Data in Motion Webinar Series - Part 1 | Hortonworks
VIEW THE ON-DEMAND WEBINAR: http://hortonworks.com/webinar/introduction-hortonworks-dataflow/
Learn about Hortonworks DataFlow (HDF™) and how you can easily augment your existing data systems – Hadoop and otherwise. Learn what Dataflow is all about and how Apache NiFi, MiNiFi, Kafka and Storm work together for streaming analytics.
This talk was given by Sean Owen at the 10th meeting on February 3rd, 2014.
Having collected Big Data, organizations are now keen on data science and “Big Learning”. Much of the focus has been on data science as exploratory analytics: offline, in the lab. However, building from that a production-ready large-scale operational analytics system remains a difficult and ad-hoc endeavor, especially when real-time answers are required. Design patterns for effective implementations are emerging, which take advantage of relaxed assumptions, adopt a new tiered "lambda" architecture, and pick the right scale-friendly algorithms to succeed. Drawing on experience from customer problems and the open source Oryx project at Cloudera, this session will provide examples of operational analytics projects in the field, and present a reference architecture and algorithm design choices for a successful implementation.
Talk I gave at StratHadoop in Barcelona on November 21, 2014.
In this talk I discuss the experience we gained with real-time analysis of high-volume event data streams.
Hortonworks Data In Motion Series Part 3 - HDF Ambari | Hortonworks
How To: Hortonworks DataFlow 2.0 with Ambari and Ranger for integrated installation, deployment and operations of Apache NiFi.
On demand webinar with demo: http://hortonworks.com/webinar/getting-goal-big-data-faster-enterprise-readiness-data-motion/
Detecting Hacks: Anomaly Detection on Networking Data | James Sirota
See https://medium.com/@jamessirota for a series of blog entries that goes with this deck...
Defense in Depth for Big Data
Network Anomaly Detection Overview
Volume Anomaly Detection
Feature Anomaly Detection
Model Architecture
Deployment on OpenSOC Platform
Questions
Hortonworks Data In Motion Series Part 4 | Hortonworks
How real-world enterprises leverage Hortonworks DataFlow/Apache NiFi to create real-time data flows in record time, enabling new business opportunities, improving customer retention, and accelerating big data projects from months to minutes through increased efficiency and reduced costs.
On-Demand webinar: http://hortonworks.com/webinar/paradigm-shift-business-usual-real-time-dataflows-record-time/
How to Use Apache Zeppelin with HWX HDB | Hortonworks
Part five in a five-part series, this webcast will be a demonstration of the integration of Apache Zeppelin and Pivotal HDB. Apache Zeppelin is a web-based notebook that enables interactive data analytics. You can make beautiful data-driven, interactive and collaborative documents with SQL, Scala and more. This webinar will demonstrate the configuration of the psql interpreter and the basic operations of Apache Zeppelin when used in conjunction with Hortonworks HDB.
Webinar Series Part 5 New Features of HDF 5 | Hortonworks
Overview of the newest features of Hortonworks DataFlow, highlighting the new processors, new user interface, edge intelligence powered by Apache MiNiFi, new support for multi-tenancy, and the new zero-master clustering architecture.
Agenda:
1. Data Flow Challenges in an Enterprise
2. Introduction to Apache NiFi
3. Core Features
4. Architecture
5. Demo – Simple Lambda Architecture
6. Use Cases
7. Q & A
Achieving Real-time Ingestion and Analysis of Security Events through Kafka a... | Kevin Mao
Strata Hadoop World 2017 San Jose
Today’s enterprise architectures are often composed of a myriad of heterogeneous devices. Bring-your-own-device policies, vendor diversification, and the transition to the cloud all contribute to a sprawling infrastructure, the complexity and scale of which can only be addressed by using modern distributed data processing systems.
Kevin Mao outlines the system that Capital One has built to collect, clean, and analyze the security-related events occurring within its digital infrastructure. Raw data from each component is collected and preprocessed using Apache NiFi flows. This raw data is then written into an Apache Kafka cluster, which serves as the primary communications backbone of the platform. The raw data is parsed, cleaned, and enriched in real time via Apache Metron and Apache Storm and ingested into ElasticSearch, allowing operations teams to detect and monitor events as they occur. The refined data is also transformed into the Apache ORC data format and stored in Amazon S3, allowing data scientists to perform long-term, batch-based analysis.
Kevin discusses the challenges involved with architecting and implementing this system, such as data quality, performance tuning, and the impact of additional financial regulations relating to data governance, and shares the results of these efforts and the value that the data platform brings to Capital One.
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog... | Hortonworks
Apache NiFi, Storm and Kafka augment each other in modern enterprise architectures. NiFi provides a code-free way to get many different formats and protocols in and out of Kafka, and complements Kafka with full audit trails and interactive command and control. Storm complements NiFi with the capability to handle complex event processing.
Join us to learn how Apache NiFi, Storm and Kafka can augment each other for creating a new dataplane connecting multiple systems within your enterprise with ease, speed and increased productivity.
https://www.brighttalk.com/webcast/9573/224063
An immersive workshop that I typically teach at General Assembly, San Francisco. To see a list of my upcoming classes, visit https://generalassemb.ly/instructors/seth-familian/4813
I also teach this workshop as a private lunch-and-learn or half-day immersive session for corporate clients. To learn more about pricing and availability, please contact me at http://familian1.com
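오컴 Clip IT Seminar, Session 1: "The Present and Future of Machine Learning and Artificial Intelligence"
1. Artificial intelligence and machine learning
- Friendly, hostile, and emotional AI as portrayed in films and animation
- The difference between strong AI and weak AI
- The relationship between AI and machine learning
2. Deep learning and reinforcement learning
- A general introduction to deep learning and reinforcement learning, which are key to AI and are subfields of machine learning
3. Our attitude toward AI
- Is AI really perfect?
- Can AI really replace human experts?
- The importance of data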
3. What I prepared for the presentation I wanted to give
What is a genetic algorithm?
Program development log (debugging, etc.)
Analysis of results
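4. What is a genetic algorithm?
"A genetic algorithm is a computational model based on the evolutionary
process of the natural world. It is a global optimization technique
developed by John Holland in 1975 and one of the methods used to solve
optimization problems. It is a representative technique of evolutionary
computation, which imitates biological evolution; it borrows heavily from
the actual process of evolution and includes operations such as mutation
and crossover. Terms such as generation and population are also used in
the problem-solving process." - Wikipedia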
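14. How filtering rules evolve
1. Generate the initial generation of filtering rules
2. Measure the fitness of the generation
3. Check whether the target fitness has been reached
4. Repeat & repeat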
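The four steps above can be sketched as a generic loop. This is a minimal sketch, not the author's program: the bit-string individuals, the `TARGET` pattern, the elitism, and every function name and parameter below are hypothetical stand-ins chosen so the example runs on its own.

```python
import random

# Hypothetical toy problem: evolve a bit tuple until it matches TARGET.
TARGET = (1, 0, 1, 1, 0, 0, 1, 0)

def make_rule(rng):
    """Create one random individual."""
    return tuple(rng.randint(0, 1) for _ in TARGET)

def fitness(rule):
    """Fraction of positions that agree with TARGET (1.0 is a perfect match)."""
    return sum(a == b for a, b in zip(rule, TARGET)) / len(TARGET)

def mutate(rule, rng, rate=0.1):
    """Flip each gene independently with probability `rate`."""
    return tuple(1 - g if rng.random() < rate else g for g in rule)

def crossover(a, b, rng):
    """Single-point crossover of two parents."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(make_rule, fitness, mutate, crossover, population_size=20,
           target_fitness=1.0, max_generations=100, seed=0):
    rng = random.Random(seed)
    # Step 1: generate the initial generation.
    population = [make_rule(rng) for _ in range(population_size)]
    for generation in range(max_generations):
        # Step 2: measure the fitness of the generation.
        scored = sorted(population, key=fitness, reverse=True)
        best = scored[0]
        # Step 3: stop once the target fitness has been reached.
        if fitness(best) >= target_fitness:
            return best, generation
        # Step 4: breed the next generation from the fitter half and repeat.
        parents = scored[: max(2, population_size // 2)]
        population = [best]  # keep the current best unchanged (elitism)
        while len(population) < population_size:
            a, b = rng.sample(parents, 2)
            population.append(mutate(crossover(a, b, rng), rng))
    return max(population, key=fitness), max_generations

if __name__ == "__main__":
    best, generations = evolve(make_rule, fitness, mutate, crossover)
    print(best, fitness(best), generations)
```

For filtering rules, `make_rule` and `fitness` would be replaced by rule generation and by some measure of how well a rule separates the packets of interest; the slides do not say how that fitness is computed.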
15. Elements of a filtering rule
A filtering rule is made up of the following:
IP
Port
Type (src, dst)
Active
DNA elements
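One possible way to represent these elements in code is shown below. It is a sketch under assumptions: the `Gene` class, its field names, the reading of Active as "include this gene in the filter", and the extra `negate` flag are all hypothetical, since the slides do not show the actual encoding.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Gene:
    kind: str             # "ip" or "port"
    value: str            # e.g. "192.168.0.1" or "80"
    direction: str        # "src" or "dst" (the Type element)
    active: bool = True   # assumption: inactive genes are left out of the filter
    negate: bool = False  # assumption: not listed on the slide, added to express "not"

def to_filter(genes):
    """Join the active genes into a pcap-filter expression."""
    parts = []
    for gene in genes:
        if not gene.active:
            continue
        if gene.kind == "ip":
            primitive = f"ip {gene.direction} {gene.value}"
        else:
            primitive = f"{gene.direction} port {gene.value}"
        parts.append(f"not {primitive}" if gene.negate else primitive)
    return " and ".join(parts)

if __name__ == "__main__":
    rule = [
        Gene("ip", "192.168.0.1", "src"),
        Gene("port", "80", "dst", negate=True),
    ]
    print(to_filter(rule))
```

Keeping each gene immutable makes it straightforward to copy genes between rules during crossover without one rule changing another.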
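16. Filtering rule example
ip src 192.168.0.1 and not dst port 80
IP info: source address (ip src 192.168.0.1)
Port info: destination port number (dst port 80)
"Packets whose source IP address is 192.168.0.1 and whose destination port
number is not 80"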
17. Filtering method
Uses the pcapy package
- open_offline function
- setfilter function
Follows the standard Linux pcap-filter syntax
https://linux.die.net/man/7/pcap-filter
Reference: http://genetic.o-r.kr
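A minimal sketch of how the two pcapy calls named above fit together. The helper name `count_matching_packets` and the idea of counting matches are assumptions for illustration; the slides do not show how the filtered packets are used. pcapy is a third-party package and is imported inside the function so the rest of the module loads without it.

```python
def count_matching_packets(pcap_path, filter_expression):
    """Count the packets in a capture file that match a pcap-filter expression."""
    import pcapy  # third-party package; must be installed separately

    # Open a saved capture file rather than a live interface.
    reader = pcapy.open_offline(pcap_path)
    # Apply the filter; only matching packets reach the callback.
    reader.setfilter(filter_expression)

    matched = 0

    def on_packet(header, data):
        nonlocal matched
        matched += 1

    # A negative count asks pcapy to keep reading; for a file this ends at EOF.
    reader.loop(-1, on_packet)
    return matched

if __name__ == "__main__":
    # "capture.pcap" is a placeholder path, not a file from the slides.
    print(count_matching_packets("capture.pcap",
                                 "ip src 192.168.0.1 and not dst port 80"))
```

A count like this could serve as one input to a fitness function, though that is a guess about the design rather than something the slides state.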
18. Creating the first generation
Collect every IP and port
Distinguish between src and dst
→ Reduces irrelevant rules
→ Reduces wasted generations
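The idea of seeding the first generation only with values actually observed, kept separate per direction, might look like the sketch below. The packet tuple layout `(src_ip, src_port, dst_ip, dst_port)`, the function names, and the one-primitive-per-rule simplification are assumptions for illustration.

```python
import random

def collect_candidates(packets):
    """Group observed IPs and ports by direction.

    `packets` is an iterable of (src_ip, src_port, dst_ip, dst_port) tuples.
    """
    seen = {
        ("ip", "src"): set(),
        ("ip", "dst"): set(),
        ("port", "src"): set(),
        ("port", "dst"): set(),
    }
    for src_ip, src_port, dst_ip, dst_port in packets:
        seen[("ip", "src")].add(src_ip)
        seen[("port", "src")].add(src_port)
        seen[("ip", "dst")].add(dst_ip)
        seen[("port", "dst")].add(dst_port)
    # Sort so that a fixed seed always yields the same generation.
    return {key: sorted(values) for key, values in seen.items()}

def first_generation(packets, size, seed=0):
    """Build `size` single-primitive rules from values seen in the capture."""
    rng = random.Random(seed)
    candidates = collect_candidates(packets)
    keys = [key for key, values in candidates.items() if values]
    if not keys:
        return []
    rules = []
    for _ in range(size):
        kind, direction = rng.choice(keys)
        value = rng.choice(candidates[(kind, direction)])
        if kind == "ip":
            rules.append(f"ip {direction} {value}")
        else:
            rules.append(f"{direction} port {value}")
    return rules

if __name__ == "__main__":
    sample = [("192.168.0.1", 51000, "10.0.0.5", 80)]
    print(first_generation(sample, 5))
```

Because an address seen only as a source never becomes a destination rule, no generation is spent on rules that cannot match any packet in the capture.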
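43. Limitations of the study
- Performance testing is insufficient
Test against open-source tools such as Snort and Suricata
- Too few packet-filtering elements
Filtering packets on IP and port alone is very risky.
Adding elements such as Content is expected to give interesting results.
- Optimizing the fitness function of the genetic algorithm
Reach the optimal data in fewer generations. (Ah, math...)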