SlideShare a Scribd company logo
1 of 20
Download to read offline
Apache Eagle
Architecture Evolvement and New Features
Since v0.5
Hao Chen, Co-Creator & PMC of Apache Eagle
Agenda
● Introduction
● New Features and Use Cases
● Architecture Evolvement
● What’s Next
● Q&A
Apache Eagle -Introduction
Apache® Eagle™ analyzes data activities, yarn applications, JMX metrics, and
daemon logs etc., provides state-of-the-art alert engine to identify security
breach, performance issues and shows insights.
Oct, 2015
Apache
Incubation
Apr, 2016
Apache
Eagle v0.3
Release
Jul, 2016
Apache
Eagle v0.4
Release Jan, 2017
Apache Top Level Project
Apache
Eagle v0.5
Release
March,
2017
Apache Eagle -Introduction
Apache® Eagle™ analyzes data activities, yarn applications, JMX metrics, and
daemon logs etc., provides state-of-the-art alert engine to identify security
breach, performance issues and shows insights.
Ingestion
(Metric/Log/Event)
Processing
(Parsing/Enrich/Aggregation)
Alerting
(CEP/Correlation/ML)
Insight
(Storage/Dashboard)
Apache Eagle -Use Cases
Apache Eagle -Typical Use Cases
3
4
Security Monitoring
Instantly identify sensitive data access and malicious operations
Job Performance Monitoring
Hadoop, Spark job profiling and performance analysis
2
Bad Node Detection
Detect soft failure issues, Linux filesystem ACL, disk full
1
Service Health Check
Service and process aliveness, JMX status as well as JVM GC
Apache Eagle -Overview
Daemon
Log
Audit Log
JMX
Metrics
Job Log
Ingested in
real-time
YARN
......
System
Metrics
Service Heath
Job
Performance
Downgrade
Node Failure
Security
Breach
......
......
Flexible
Policy
Definition
Dynamic and
Scalable
Engine
1 2 3
Source processing
and
policy enforcement
in real-time
Extensible
Monitoring
Use Cases
Apache Eagle -Case Onboarding
1. Register a new Monitored Site
2. Choose and Install Application
3. Configure Application
4. Administrate Application
5. Define Alert Policies and
Explore Alerts
6. Analyze with Dashboards and
Insights
Job Performance AppsHadoop Monitoring Apps
Apache Eagle -Components
App Framework
Lifecycle Manage
Alert Engine
Streaming and Real time
Storage Engine
Easy and Fast Query
StaticResourceApp
Dashboard/Analytics
Features/Policies
SchedulingApp
HealthCheck/Anomal
y Detector Jobs
StreamingApp
Ingest/Process/Aggre
gation
Eagle
Interface
UI
Dashboard
API
Eagle Core
Scheduling Engine
Job Workflow Scheduling
Hadoop Security Apps
Eagle Apps
Integration
JMX/System Hardware Metric
Service/Process/Topology
Availability HealthCheck
MR Job Monitoring
Spark Job Monitoring
Cluster Capacity Analytics
Hadoop Audit/Security
HBase Audit/Security
Job Access Security
Apache Eagle -Architecture
Applications
(Process/Job/Policy/View)
Messaging
(Kafka)
Alerting
(Storm)
Storage
(HBase)
Metadata
(Metadata-driven)
Metric/Entity/Log
PolicyConfig
Stream
Onboarding
Administration
Monitoring
Insights
Alerts
Daemon
Log
Audit Log
JMX Metrics
Job Log
YARN
......
System
Metrics
......
System
Metrics
Apache Eagle -Application Framework
An “Application” is case-oriented solution package
● Installation: Application user guide, configuration, management
● Ingestion: Provide data ingestion/collection approaches to integrate any kinds of monitor
data sources
● Process: Analyze data source based on Storm Topology or Spark Streaming App
● Alerting:
○ Stream: Structured stream exported for alerting with eagle alerting engine or
persistence in eagle storage
○ Model: Complex built-in policies or policy templates defined in SQL/Java code/ML
model, etc.
● Insight: Monitoring Analytics UI or Dashboard
Execution RuntimeEagle AppsEagle Server
Apache Eagle -Application Execution
StreamingApp
Ingest/Process/Aggrega
tion
StaticResourceApp
Dashboard/Analytics
Features/Policies
SchedulingApp
HealthCheck/Detector
Jobs and Workflows
Application Manager
● Loader (SPI)
● Lifecycle/Admin
● Configuration
● Monitoring
Eagle UI
REST API
Alert Engine
Apache Eagle -Distributed Alert Engine
from MetricStream[(name ==
'ReplLag') and (value > 1000)]
select * insert into
outputStream;
Messaging
Notification
Slack
Insight
Action
● Real-time Streaming: Apache Storm
(Execution Engine) + Kafka (Messaging)
● Declarative Policy: CEP and Extensible
Alert Model in streaming way
● Dynamical Onboarding & Correlation:
Connect to new stream and change
Stream Grouping in Runtime
● Hot Deploy & No Downtime:
Metadata-driven and lightweight alert
logic assignment
Apache Eagle -Distributed Alert Engine
from hadoopJmxMetricEventStream
[metric == "hadoop.namenode.fsnamesystemstate.capacityused" and value > 0.9] select
metric, host, value, timestamp, component, site insert into alertStream;
Example 1: Alert if hadoop namenode capacity usage exceed 90 percentages
from every
a = hadoopJmxMetricEventStream[metric=="hadoop.namenode.fsnamesystem.hastate"]
->
b = hadoopJmxMetricEventStream[metric==a.metric and b.host == a.host and a.value !=
value)]
within 10 min
select a.host, a.value as oldHaState, b.value as newHaState, b.timestamp as timestamp,
b.metric as metric, b.component as component, b.site as site insert into alertStream;
Example 2: Alert if hadoop namenode HA switches
Apache Eagle -Distributed Alert Engine
Extensible Data Source
Dynamic Sorting/Grouping
Declarative CEP Policy
Elastic Resource Pool
MetadataCoordinator Topology Manager
REST API (Schema/Policy)ZK Notify (Schedule) START/STOP
AlertInsight
Management Services
DataSource User Interface: Register Data Source -> Design Stream Model -> Define Alert Policy
REST API
Apache Eagle -Distributed Alert Engine
MetadataCoordinator Topology Manager
Topology Resource Pool
REST API (Schema/Policy)ZK Notify (Schedule) START/STOP
Stream
Receiver
Stream
Router
Alert
Publisher
Policy
Evaluator
Stream
Receiver
Stream
Router
Alert
Publisher
Policy
Evaluator
Stream
Receiver
Stream
Router
Alert
Publisher
Policy
Evaluator
AlertInsight
Management Services
DataSource
define stream SystemMetricStream (
metric string,
host string,
device string
value double);
from SystemMetricStream
[name = "disk.usage.metric" and value > 0.99 ]
#window.time(30 min)
group by host, device
insert into SystemAlertStream;
Apache Eagle -Distributed Alert Engine
SourceSpec
PartitionSpec
SortSpec
PolicySpec
PublishSpec
From Policy Definition in User View to Engine View
Schedule
Assignment
Source: kafka topic
Schema:
SystemMetricStream
Window: 5min
Margin: 1min
Partition:
Group by host, device
Publish:
SystemAlertStream
Process:
CEP Execution Plan
Apache Eagle -Distributed Alert Engine
3 Rebuild Assignments
2 Trigger ScheduleDefine new Policy
3
21
Stream
Receiver
Stream
Router
Alert
Publisher
Policy
Evaluator
Metadata
Coordinator
Notify with latest version
5. Watch notification
6. Pull notified version of metadata
7. Update components runtime according to metadata changes
SourceSpec
PartitionSpec
PolicySpec
PublishSpec
AlertZKRoot/
receiver/
router/
evaluator/
publisher/
4
6
5
78 Connect and flow stream through
alert engine
Apache Eagle -What's Next
● Eagle Alert Engine on Apache Beam
○ Unified streaming on Spark/Flink
● Eagle Integration with Ambari/Cloudera Manager
○ Seamless connect monitoring data source
● Eagle Agent
○ Unified Log/Metric/Event Shipper
● Eagle on Cloud
○ Support deployment and monitor service on AWS/Aliyun
Thanks -Q&A

More Related Content

What's hot

Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Apache Spark for Cyber Security in an Enterprise Company
Apache Spark for Cyber Security in an Enterprise CompanyApache Spark for Cyber Security in an Enterprise Company
Apache Spark for Cyber Security in an Enterprise Company
Databricks
 

What's hot (20)

Monitoring Error Logs at Databricks
Monitoring Error Logs at DatabricksMonitoring Error Logs at Databricks
Monitoring Error Logs at Databricks
 
Making Nested Columns as First Citizen in Apache Spark SQL
Making Nested Columns as First Citizen in Apache Spark SQLMaking Nested Columns as First Citizen in Apache Spark SQL
Making Nested Columns as First Citizen in Apache Spark SQL
 
Apache Spark Model Deployment
Apache Spark Model Deployment Apache Spark Model Deployment
Apache Spark Model Deployment
 
Tracing your security telemetry with Apache Metron
Tracing your security telemetry with Apache MetronTracing your security telemetry with Apache Metron
Tracing your security telemetry with Apache Metron
 
Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...
Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...
Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...
 
Sumo Logic Search Job API
Sumo Logic Search Job APISumo Logic Search Job API
Sumo Logic Search Job API
 
Sumo Logic Quickstart - Nv 2016
Sumo Logic Quickstart - Nv 2016Sumo Logic Quickstart - Nv 2016
Sumo Logic Quickstart - Nv 2016
 
Sumo Logic QuickStat - Apr 2017
Sumo Logic QuickStat - Apr 2017Sumo Logic QuickStat - Apr 2017
Sumo Logic QuickStat - Apr 2017
 
Introducing apache prediction io (incubating) (bay area spark meetup at sales...
Introducing apache prediction io (incubating) (bay area spark meetup at sales...Introducing apache prediction io (incubating) (bay area spark meetup at sales...
Introducing apache prediction io (incubating) (bay area spark meetup at sales...
 
Hadoop and Big Data Overview
Hadoop and Big Data OverviewHadoop and Big Data Overview
Hadoop and Big Data Overview
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Flock: Data Science Platform @ CISL
Flock: Data Science Platform @ CISLFlock: Data Science Platform @ CISL
Flock: Data Science Platform @ CISL
 
Log Data Analysis Platform
Log Data Analysis PlatformLog Data Analysis Platform
Log Data Analysis Platform
 
Log Data Analysis Platform by Valentin Kropov
Log Data Analysis Platform by Valentin KropovLog Data Analysis Platform by Valentin Kropov
Log Data Analysis Platform by Valentin Kropov
 
Apache Spark and Oracle Stream Analytics
Apache Spark and Oracle Stream AnalyticsApache Spark and Oracle Stream Analytics
Apache Spark and Oracle Stream Analytics
 
A Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLA Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQL
 
It's Java, Jim, but not as we know it
It's Java, Jim, but not as we know itIt's Java, Jim, but not as we know it
It's Java, Jim, but not as we know it
 
The Killer Feature Store: Orchestrating Spark ML Pipelines and MLflow for Pro...
The Killer Feature Store: Orchestrating Spark ML Pipelines and MLflow for Pro...The Killer Feature Store: Orchestrating Spark ML Pipelines and MLflow for Pro...
The Killer Feature Store: Orchestrating Spark ML Pipelines and MLflow for Pro...
 
Apache Spark for Cyber Security in an Enterprise Company
Apache Spark for Cyber Security in an Enterprise CompanyApache Spark for Cyber Security in an Enterprise Company
Apache Spark for Cyber Security in an Enterprise Company
 
Spark Summit EU talk by Reza Karimi
Spark Summit EU talk by Reza KarimiSpark Summit EU talk by Reza Karimi
Spark Summit EU talk by Reza Karimi
 

Similar to Apache Eagle Architecture Evolvement

Real-time Analytics for Data-Driven Applications
Real-time Analytics for Data-Driven ApplicationsReal-time Analytics for Data-Driven Applications
Real-time Analytics for Data-Driven Applications
VMware Tanzu
 
Microsoft SQL Server - StreamInsight Overview Presentation
Microsoft SQL Server - StreamInsight Overview PresentationMicrosoft SQL Server - StreamInsight Overview Presentation
Microsoft SQL Server - StreamInsight Overview Presentation
Microsoft Private Cloud
 

Similar to Apache Eagle Architecture Evolvement (20)

Apache Eagle: Architecture Evolvement and New Features
Apache Eagle: Architecture Evolvement and New FeaturesApache Eagle: Architecture Evolvement and New Features
Apache Eagle: Architecture Evolvement and New Features
 
Apache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real TimeApache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real Time
 
Apache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San JoseApache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San Jose
 
Apache Eagle in Action
Apache Eagle in ActionApache Eagle in Action
Apache Eagle in Action
 
Visualizing Big Data in Realtime
Visualizing Big Data in RealtimeVisualizing Big Data in Realtime
Visualizing Big Data in Realtime
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks
 
Apache Eagle: eBay构建开源分布式实时预警引擎实践
Apache Eagle: eBay构建开源分布式实时预警引擎实践Apache Eagle: eBay构建开源分布式实时预警引擎实践
Apache Eagle: eBay构建开源分布式实时预警引擎实践
 
Scalable AutoML for Time Series Forecasting using Ray
Scalable AutoML for Time Series Forecasting using RayScalable AutoML for Time Series Forecasting using Ray
Scalable AutoML for Time Series Forecasting using Ray
 
Real-time Analytics for Data-Driven Applications
Real-time Analytics for Data-Driven ApplicationsReal-time Analytics for Data-Driven Applications
Real-time Analytics for Data-Driven Applications
 
Atlas ApacheCon 2017
Atlas ApacheCon 2017Atlas ApacheCon 2017
Atlas ApacheCon 2017
 
AWS Webcast - Amazon Kinesis and Apache Storm
AWS Webcast - Amazon Kinesis and Apache StormAWS Webcast - Amazon Kinesis and Apache Storm
AWS Webcast - Amazon Kinesis and Apache Storm
 
Using Istio to Secure & Monitor Your Services
Using Istio to Secure & Monitor Your ServicesUsing Istio to Secure & Monitor Your Services
Using Istio to Secure & Monitor Your Services
 
Apache Eagle - Monitor Hadoop in Real Time
Apache Eagle - Monitor Hadoop in Real TimeApache Eagle - Monitor Hadoop in Real Time
Apache Eagle - Monitor Hadoop in Real Time
 
IBM Rational AppScan Technical Overview
IBM Rational AppScan Technical OverviewIBM Rational AppScan Technical Overview
IBM Rational AppScan Technical Overview
 
Security From The Big Data and Analytics Perspective
Security From The Big Data and Analytics PerspectiveSecurity From The Big Data and Analytics Perspective
Security From The Big Data and Analytics Perspective
 
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
 
Hadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
Hadoop Eagle - Real Time Monitoring Framework for eBay HadoopHadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
Hadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
 
Microsoft SQL Server - StreamInsight Overview Presentation
Microsoft SQL Server - StreamInsight Overview PresentationMicrosoft SQL Server - StreamInsight Overview Presentation
Microsoft SQL Server - StreamInsight Overview Presentation
 
XSEDE14 SciGaP-Apache Airavata Tutorial
XSEDE14 SciGaP-Apache Airavata TutorialXSEDE14 SciGaP-Apache Airavata Tutorial
XSEDE14 SciGaP-Apache Airavata Tutorial
 
Logging, tracing and metrics: Instrumentation in .NET 5 and Azure
Logging, tracing and metrics: Instrumentation in .NET 5 and AzureLogging, tracing and metrics: Instrumentation in .NET 5 and Azure
Logging, tracing and metrics: Instrumentation in .NET 5 and Azure
 

Recently uploaded

TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Recently uploaded (20)

MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
 
JavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuideJavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate Guide
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Quantum Leap in Next-Generation Computing
Quantum Leap in Next-Generation ComputingQuantum Leap in Next-Generation Computing
Quantum Leap in Next-Generation Computing
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
 
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Design and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data ScienceDesign and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data Science
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Navigating Identity and Access Management in the Modern Enterprise
Navigating Identity and Access Management in the Modern EnterpriseNavigating Identity and Access Management in the Modern Enterprise
Navigating Identity and Access Management in the Modern Enterprise
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 

Apache Eagle Architecture Evolvement

  • 1. Apache Eagle Architecture Evolvement and New Features Since v0.5 Hao Chen, Co-Creator & PMC of Apache Eagle
  • 2. Agenda ● Introduction ● New Features and Use Cases ● Architecture Evolvement ● What’s Next ● Q&A
  • 3. Apache Eagle -Introduction Apache® Eagle™ analyzes data activities, yarn applications, JMX metrics, and daemon logs etc., provides state-of-the-art alert engine to identify security breach, performance issues and shows insights. Oct, 2015 Apache Incubation Apr, 2016 Apache Eagle v0.3 Release Jul, 2016 Apache Eagle v0.4 Release Jan, 2017 Apache Top Level Project Apache Eagle v0.5 Release March, 2017
  • 4. Apache Eagle -Introduction Apache® Eagle™ analyzes data activities, yarn applications, JMX metrics, and daemon logs etc., provides state-of-the-art alert engine to identify security breach, performance issues and shows insights. Ingestion (Metric/Log/Event) Processing (Parsing/Enrich/Aggregation) Alerting (CEP/Correlation/ML) Insight (Storage/Dashboard)
  • 6. Apache Eagle -Typical Use Cases 3 4 Security Monitoring Instantly identify sensitive data access and malicious operations Job Performance Monitoring Hadoop, Spark job profiling and performance analysis 2 Bad Node Detection Detect soft failure issues, Linux filesystem ACL, disk full 1 Service Health Check Service and process aliveness, JMX status as well as JVM GC
  • 7. Apache Eagle -Overview Daemon Log Audit Log JMX Metrics Job Log Ingested in real-time YARN ...... System Metrics Service Heath Job Performance Downgrade Node Failure Security Breach ...... ...... Flexible Policy Definition Dynamic and Scalable Engine 1 2 3 Source processing and policy enforcement in real-time Extensible Monitoring Use Cases
  • 8. Apache Eagle -Case Onboarding 1. Register a new Monitored Site 2. Choose and Install Application 3. Configure Application 4. Administrate Application 5. Define Alert Policies and Explore Alerts 6. Analyze with Dashboards and Insights
  • 9. Job Performance AppsHadoop Monitoring Apps Apache Eagle -Components App Framework Lifecycle Manage Alert Engine Streaming and Real time Storage Engine Easy and Fast Query StaticResourceApp Dashboard/Analytics Features/Policies SchedulingApp HealthCheck/Anomal y Detector Jobs StreamingApp Ingest/Process/Aggre gation Eagle Interface UI Dashboard API Eagle Core Scheduling Engine Job Workflow Scheduling Hadoop Security Apps Eagle Apps Integration JMX/System Hardware Metric Service/Process/Topology Availability HealthCheck MR Job Monitoring Spark Job Monitoring Cluster Capacity Analytics Hadoop Audit/Security HBase Audit/Security Job Access Security
  • 11. Apache Eagle -Application Framework An “Application” is case-oriented solution package ● Installation: Application user guide, configuration, management ● Ingestion: Provide data ingestion/collection approaches to integrate any kinds of monitor data sources ● Process: Analyze data source based on Storm Topology or Spark Streaming App ● Alerting: ○ Stream: Structured stream exported for alerting with eagle alerting engine or persistence in eagle storage ○ Model: Complex built-in policies or policy templates defined in SQL/Java code/ML model, etc. ● Insight: Monitoring Analytics UI or Dashboard
  • 12. Execution RuntimeEagle AppsEagle Server Apache Eagle -Application Execution StreamingApp Ingest/Process/Aggrega tion StaticResourceApp Dashboard/Analytics Features/Policies SchedulingApp HealthCheck/Detector Jobs and Workflows Application Manager ● Loader (SPI) ● Lifecycle/Admin ● Configuration ● Monitoring Eagle UI REST API
  • 13. Alert Engine Apache Eagle -Distributed Alert Engine from MetricStream[(name == 'ReplLag') and (value > 1000)] select * insert into outputStream; Messaging Notification Slack Insight Action ● Real-time Streaming: Apache Storm (Execution Engine) + Kafka (Messaging) ● Declarative Policy: CEP and Extensible Alert Model in streaming way ● Dynamical Onboarding & Correlation: Connect to new stream and change Stream Grouping in Runtime ● Hot Deploy & No Downtime: Metadata-driven and lightweight alert logic assignment
  • 14. Apache Eagle -Distributed Alert Engine from hadoopJmxMetricEventStream [metric == "hadoop.namenode.fsnamesystemstate.capacityused" and value > 0.9] select metric, host, value, timestamp, component, site insert into alertStream; Example 1: Alert if hadoop namenode capacity usage exceed 90 percentages from every a = hadoopJmxMetricEventStream[metric=="hadoop.namenode.fsnamesystem.hastate"] -> b = hadoopJmxMetricEventStream[metric==a.metric and b.host == a.host and a.value != value)] within 10 min select a.host, a.value as oldHaState, b.value as newHaState, b.timestamp as timestamp, b.metric as metric, b.component as component, b.site as site insert into alertStream; Example 2: Alert if hadoop namenode HA switches
  • 15. Apache Eagle -Distributed Alert Engine Extensible Data Source Dynamic Sorting/Grouping Declarative CEP Policy Elastic Resource Pool MetadataCoordinator Topology Manager REST API (Schema/Policy)ZK Notify (Schedule) START/STOP AlertInsight Management Services DataSource User Interface: Register Data Source -> Design Stream Model -> Define Alert Policy REST API
  • 16. Apache Eagle -Distributed Alert Engine MetadataCoordinator Topology Manager Topology Resource Pool REST API (Schema/Policy)ZK Notify (Schedule) START/STOP Stream Receiver Stream Router Alert Publisher Policy Evaluator Stream Receiver Stream Router Alert Publisher Policy Evaluator Stream Receiver Stream Router Alert Publisher Policy Evaluator AlertInsight Management Services DataSource
  • 17. define stream SystemMetricStream ( metric string, host string, device string value double); from SystemMetricStream [name = "disk.usage.metric" and value > 0.99 ] #window.time(30 min) group by host, device insert into SystemAlertStream; Apache Eagle -Distributed Alert Engine SourceSpec PartitionSpec SortSpec PolicySpec PublishSpec From Policy Definition in User View to Engine View Schedule Assignment Source: kafka topic Schema: SystemMetricStream Window: 5min Margin: 1min Partition: Group by host, device Publish: SystemAlertStream Process: CEP Execution Plan
  • 18. Apache Eagle -Distributed Alert Engine 3 Rebuild Assignments 2 Trigger ScheduleDefine new Policy 3 21 Stream Receiver Stream Router Alert Publisher Policy Evaluator Metadata Coordinator Notify with latest version 5. Watch notification 6. Pull notified version of metadata 7. Update components runtime according to metadata changes SourceSpec PartitionSpec PolicySpec PublishSpec AlertZKRoot/ receiver/ router/ evaluator/ publisher/ 4 6 5 78 Connect and flow stream through alert engine
  • 19. Apache Eagle -What's Next ● Eagle Alert Engine on Apache Beam ○ Unified streaming on Spark/Flink ● Eagle Integration with Ambari/Cloudera Manager ○ Seamless connect monitoring data source ● Eagle Agent ○ Unified Log/Metric/Event Shipper ● Eagle on Cloud ○ Support deployment and monitor service on AWS/Aliyun