The document discusses event processing using the Esper engine and Event Processing Language (EPL). It provides an overview of Esper's features for efficient event processing, extensibility as middleware, rich user interface, and high availability. The document then demonstrates EPL using a running example of detecting fires based on sensor data. It shows how to define event types, register queries, use various window types, and detect patterns with temporal operators.
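The fire-detection example can be sketched outside Esper as well. The following plain-Python sketch mimics an EPL-style temporal pattern (a smoke event followed by a high-temperature event from the same sensor within a window); the event shapes, the 10-second window, and the 50-degree threshold are assumptions for illustration, not the document's actual queries:

```python
# Illustrative sketch (not Esper itself): detect a "fire" when a smoke event
# is followed by a high-temperature event from the same sensor within a
# 10-second window, roughly what an EPL pattern with temporal operators
# such as "timer:within(10 sec)" expresses.

WINDOW_SECONDS = 10
TEMP_THRESHOLD = 50.0

def detect_fires(events):
    """events: iterable of (timestamp, kind, sensor, value) tuples,
    ordered by timestamp. Returns (sensor, timestamp) pairs for fires."""
    recent_smoke = {}  # sensor -> timestamp of its last smoke event
    fires = []
    for ts, kind, sensor, value in events:
        if kind == "smoke":
            recent_smoke[sensor] = ts
        elif kind == "temperature" and value > TEMP_THRESHOLD:
            smoke_ts = recent_smoke.get(sensor)
            if smoke_ts is not None and ts - smoke_ts <= WINDOW_SECONDS:
                fires.append((sensor, ts))
    return fires

stream = [
    (0, "smoke", "s1", None),
    (3, "temperature", "s1", 62.0),   # within 10s of smoke -> fire
    (5, "smoke", "s2", None),
    (30, "temperature", "s2", 80.0),  # smoke too long ago -> no fire
]
print(detect_fires(stream))  # [('s1', 3)]
```

A CEP engine keeps exactly this kind of per-pattern state for you, with the window expiry handled declaratively instead of by hand.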
This document summarizes Dominik Dorn's presentation on event sourcing microservices with Play and Akka. It introduces event sourcing and actors, uses Akka Persistence to handle commands and events, implements CQRS with separate write and read models, and clusters services with Akka. The presentation provides an overview and examples of building event-sourced microservices with Akka Persistence and the Play Framework in Scala.
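The command/event split at the heart of the talk can be sketched in plain Python (an illustration of the event-sourcing pattern only, not Akka Persistence; the Account aggregate and its command names are invented):

```python
# Minimal event-sourcing sketch: commands are validated against current
# state, accepted commands append events to a log, and state can always
# be rebuilt by replaying that log (the "write model" of CQRS).

class Account:
    def __init__(self):
        self.balance = 0
        self.events = []          # the persisted event log

    def apply(self, event):
        kind, amount = event
        if kind == "deposited":
            self.balance += amount
        elif kind == "withdrawn":
            self.balance -= amount

    def handle(self, command):
        kind, amount = command
        if kind == "withdraw" and amount > self.balance:
            raise ValueError("insufficient funds")  # command rejected, no event
        event = ("deposited" if kind == "deposit" else "withdrawn", amount)
        self.events.append(event)
        self.apply(event)

    @classmethod
    def replay(cls, events):
        acct = cls()
        for e in events:
            acct.apply(e)         # recovery = replaying history
        return acct

acct = Account()
acct.handle(("deposit", 100))
acct.handle(("withdraw", 30))
recovered = Account.replay(acct.events)
print(recovered.balance)  # 70
```

In Akka Persistence the same shape appears as a persistent actor's command handler and event handler; read models for CQRS are built by projecting the same event log separately.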
This document outlines the agenda and logistics for a Sumo Metrics Certified Analyst training course. The course will teach students how to use a unified logs and metrics solution, collect metrics data, analyze metrics using tools and queries, and apply their knowledge through hands-on labs covering common use cases. These include monitoring host metrics, analyzing AWS metrics, working with different metric formats, and converting logs to metrics. Students will learn to visualize metrics using charts and dashboards, and configure metric monitors and alerts. Upon completing the course, students will take an online certification exam to test their mastery of the skills covered.
The Internet of Things (IoT) is set to occupy a substantial component of the future Internet. The IoT connects sensors and devices that record physical observations to applications and services of the Internet [1]. As a successor to technologies such as RFID and Wireless Sensor Networks (WSN), the IoT has stumbled into vertical silos of proprietary systems that provide little or no interoperability with similar systems. As the IoT represents the future state of the Internet, an intelligent and scalable architecture is required to provide connectivity between these silos, enabling discovery of physical sensors and interpretation of messages between things. This paper proposes a gateway and Semantic Web enabled IoT architecture that provides interoperability between systems using established communication and data standards. The Semantic Gateway as Service (SGS) allows translation between messaging protocols such as XMPP, CoAP, and MQTT via a multi-protocol proxy architecture. Use of broadly accepted specifications such as the W3C's Semantic Sensor Network (SSN) ontology for semantic annotation of sensor data provides semantic interoperability between messages and supports semantic reasoning to obtain higher-level actionable knowledge from low-level sensor data.
Link to the paper: http://knoesis.org/library/resource.php?id=2154
Citation:
Pratikkumar Desai, Amit Sheth, Pramod Anantharam, 'Semantic Gateway as a Service architecture for IoT Interoperability', IEEE 4th International Conference on Mobile Services, June 27 - July 2, 2015, New York, USA.
This document summarizes research on analyzing the social media footprint of street gangs. The researchers collected Twitter data related to a specific Chicago gang and 10 Chicago neighborhoods. They analyzed this data using techniques like spatio-temporal analysis, network analysis, sentiment analysis and profile analysis. Their goals were to monitor gang activities, identify influential members, and evaluate community sentiment from social media posts. Challenges included automatically identifying gang members and detecting conflicts between gangs.
Examples of Applied Semantic Technologies to Solve the Variety Challenge of Big Data: Application of the Semantic Sensor Network (SSN) Ontology
Pramod Anantharam - Kno.e.sis
The document demonstrates how to use an ontology to semantically query a relational database using Ontop software. It shows how to create an ontology representing the database schema, map the ontology to the database, load sample data, and run SPARQL queries over the ontology to retrieve and infer additional information from the database thanks to the reasoner. The results demonstrate how ontology-based data access can provide a semantic view of relational data and enable richer querying capabilities than what is natively supported by the database.
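The idea can be illustrated with a toy mapping in Python (this mimics the concept behind ontology-based data access, not Ontop's actual API; the table, class, and property names are invented):

```python
# Toy OBDA sketch: relational rows are mapped to subject-predicate-object
# triples, and a subclass axiom lets a query over the superclass also
# return instances of its subclasses, i.e. the reasoner infers extra facts.

rows = [
    {"id": 1, "name": "Alice", "role": "professor"},
    {"id": 2, "name": "Bob", "role": "student"},
]

# Mapping step: each row becomes triples; the role column picks the type.
triples = set()
for r in rows:
    subj = f"person/{r['id']}"
    triples.add((subj, "rdf:type", r["role"].capitalize()))
    triples.add((subj, "name", r["name"]))

# Ontology (TBox): Professor and Student are subclasses of Person.
# A reasoner materializes the inferred rdf:type triples.
subclass_of = {"Professor": "Person", "Student": "Person"}
for s, p, o in list(triples):
    if p == "rdf:type" and o in subclass_of:
        triples.add((s, "rdf:type", subclass_of[o]))

# SPARQL-like query: who is a Person? Inference makes both rows match,
# even though neither row literally says "Person".
persons = sorted(s for s, p, o in triples if p == "rdf:type" and o == "Person")
print(persons)  # ['person/1', 'person/2']
```

The payoff is exactly what the summary describes: the query is phrased against the ontology's vocabulary, and the reasoner bridges the gap to what the database actually stores.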
Semantics Approach to Big Data and Event Processing: an introduction focused on velocity and variety
Prof Emanuele Della Valle - DEIB Politecnico di Milano
Wenbo Wang defended his PhD dissertation on automatic emotion identification from text. His dissertation focused on three areas: 1) Emotion classification using machine learning techniques to identify emotions from suicide notes and tweets. 2) Creating large self-labeled emotion datasets by leveraging hashtags on Twitter. 3) Adapting emotion identification models to new domains by selecting informative tweets to add to limited labeled data in target domains like blogs. The goal was to improve identification by utilizing large Twitter datasets while addressing challenges of limited labeled data in other domains.
This document discusses mastering the velocity dimension of big data through information flow processing (IFP). It presents an agenda that includes an overview of IFP and a modeling framework for data stream management systems (DSMS) and complex event processing (CEP). The modeling framework consists of functional, processing, deployment, interaction, data, time, rule, and language models to characterize different systems.
This document discusses using semantic technologies to address the variety challenge of big data. It provides examples of applying semantic annotation to social data and metadata. Specifically, it describes how semantic annotation can extract meaningful metadata from social media posts, including information about users, content, relationships between users, and activity networks. The document outlines different types of metadata that can be derived from social media content, users, and networks.
Best Paper Award winning paper presented at ASONAM 2015.
Derek Doran, Samir Yelne, Luisa Massari, Maria-Carla Calzarossa, LaTrelle Jackson, Glen Moriarty. Dept. of CSE and Professional Psych, Wright State University, USA; Dept. of Electrical, Computer, and Biomedical Eng., University of Pavia, Italy
7 Cups of Tea, Inc.
http://knoesis.wright.edu/doran
Krishnaprasad Thirunarayan, Value-Oriented Big Data Processing with Applications, Invited Talk, The 2015 International Conference on Collaboration Technologies and Systems (CTS 2015), June 2015.
Mastering the variety dimension of Big Data with semantic technologies: a high-level introduction to standards, focused on the variety/interoperability dimension. Prof Amit Sheth
Examples of Real-World Big Data Applications: specific examples of the velocity challenge and how it is addressed in a disaster coordination scenario (e.g., the Jammu & Kashmir floods).
Prof Amit Sheth - Kno.e.sis
Revathy Krishnamurthy, Pavan Kapanipathi, Amit P. Sheth, and Krishnaprasad Thirunarayan. Knowledge Enabled Approach to Predict the Location of Twitter Users, Proc. of the Extended Semantic Web Conference, Slovenia, May 3, 2015.
Paper at: http://www.knoesis.org/library/resource.php?id=2039
This document summarizes an approach to entity recommendations using hierarchical knowledge bases. It proposes using spreading activation theory on Wikipedia's category hierarchy to model user interests and generate recommendations. The approach transforms Wikipedia's category structure into a hierarchy and uses spreading activation to calculate interest scores for categories. These scores are then propagated across the hierarchy to score and rank entities. The approach is evaluated on movie recommendations and shows improved recall over baseline methods, particularly when incorporating category priority weights. Future work to improve normalization and validate category priorities is discussed.
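A minimal sketch of the spreading-activation idea, assuming an invented decay factor and a toy two-level category hierarchy (not the paper's actual parameters, weights, or data):

```python
# Spreading activation over a category hierarchy: activation from entities
# the user liked flows upward to parent categories with decay, and the
# resulting category scores are used to rank candidate entities.

DECAY = 0.5

hierarchy = {              # child category -> parent categories
    "SciFiFilms": ["Films"],
    "DramaFilms": ["Films"],
}
entity_categories = {      # entity -> categories it belongs to
    "Interstellar": ["SciFiFilms"],
    "Arrival": ["SciFiFilms"],
    "Amour": ["DramaFilms"],
}

def activate(liked):
    """Propagate activation from liked entities up the hierarchy."""
    scores = {}
    for ent in liked:
        frontier = [(cat, 1.0) for cat in entity_categories[ent]]
        while frontier:
            cat, a = frontier.pop()
            scores[cat] = scores.get(cat, 0.0) + a
            frontier += [(p, a * DECAY) for p in hierarchy.get(cat, [])]
    return scores

def recommend(liked):
    """Rank unseen entities by the summed scores of their categories."""
    scores = activate(liked)
    candidates = [e for e in entity_categories if e not in liked]
    return max(candidates,
               key=lambda e: sum(scores.get(c, 0.0) for c in entity_categories[e]))

print(recommend(["Interstellar"]))  # 'Arrival' shares the strongest category
```

The approach in the document additionally weights categories by priority; here all categories propagate equally, which is the part the paper's priority weights refine.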
Presentation of Hexoskin validation for kHealth's dementia project
The paper is available at: http://www.knoesis.org/library/resource.php?id=2155
Citation for the paper: T. Banerjee, P. Anantharam, W. L. Romine, L. Lawhorne, A. Sheth, 'Evaluating a Potential Commercial Tool for Healthcare Application for People with Dementia' in Proc. of the Intl Conf on Health Informatics and Medical Systems (HIMS), Las Vegas, July 27-30, 2015.
Wide adoption of smartphones and the availability of low-cost sensors have resulted in seamless and continuous monitoring of physiology, environment, and public health notifications. However, personalized digital health and patient empowerment can become a reality only if the complex multisensory and multimodal data is processed within the patient's context. Contextual processing of patient data along with personalized medical knowledge can lead to actionable information for better and more timely decisions. We present a system called kHealth capable of aggregating multisensory and multimodal data from sensors (passive sensing) and answers to questionnaires (active sensing) from patients with asthma. We present our preliminary analysis of data collected from real patients, highlighting the challenges in deploying such an application. The results show strong promise for deriving actionable information, using a combination of physiological indicators from active and passive sensors, that can help doctors determine more precisely the cause, severity, and control level of asthma. Information synthesized from kHealth can be used to alert patients and caregivers to seek timely clinical assistance to better manage asthma and improve their quality of life.
Paper: http://www.knoesis.org/library/resource.php?id=2153
Citation:
Pramod Anantharam, Tanvi Banerjee, Amit Sheth, Krishnaprasad Thirunarayan, Surendra Marupudi, Vaikunth Sridharan, Shalini G. Forbis, 'Knowledge-driven Personalized Contextual mHealth Service for Asthma Management in Children', IEEE 4th International Conference on Mobile Services, June 27 - July 2, 2015, New York, USA.
Social media provides a natural platform for the dynamic emergence of citizen (as) sensor communities, where citizens share information, express opinions, and engage in discussions. Often such an Online Citizen Sensor Community (CSC) has stated or implied goals related to the workflows of organizational actors with defined roles and responsibilities: for example, a community of crisis response volunteers that informs the prioritization of responses to resource needs (e.g., medical) to assist the managers of crisis response organizations. However, in a CSC there are challenges related to information overload for organizational actors, including finding reliable information providers and finding actionable information from citizens. This threatens the awareness and articulation of workflows needed for cooperation between citizens and organizational actors. CSCs supported by Web 2.0 social media platforms offer new opportunities and pose new challenges. This work addresses issues of ambiguity in interpreting unconstrained natural language (e.g., ‘wanna help’ appearing in messages both asking for and offering help during crises), sparsity of user and group behaviors (e.g., expression of specific intent), and diversity of user demographics (e.g., medical or technical professionals) when interpreting the user-generated data of citizen sensors. Interdisciplinary research involving the social and computer sciences is essential to address these socio-technical issues in CSC and to give organizational actors better access to user-generated data at a higher level of information abstraction. This study presents a novel web information processing framework focused on actors and actions in cooperation, called Identify-Match-Engage (IME), which fuses top-down and bottom-up computing approaches to design a cooperative web information system between citizens and organizational actors. It includes a) identification of action-related seeking-offering intent behaviors from short, unstructured text documents using both declarative and statistical knowledge-based classification models, b) matching of seeking and offering intentions, and c) engagement models of users and groups in the CSC to prioritize whom to engage, modeling context with social theories using features of users, their generated content, and their dynamic network connections in user interaction networks. The results show a greater improvement in modeling efficiency from the fusion of top-down knowledge-driven and bottom-up data-driven approaches than from conventional bottom-up approaches alone for modeling intent and engagement. Applications of this work include use of the engagement interface tool during recent crises to enable efficient citizen engagement, spreading critical information about prioritized needs so that citizens donate only the required supplies. The engagement interface application also won the United Nations ICT agency ITU's Young Innovator 2014 award.
Krishnaprasad Thirunarayan, Trust Management: Multimodal Data Perspective, Invited Tutorial, The 2015 International Conference on Collaboration Technologies and Systems (CTS 2015), June 2015.
Ceilometer is a tool that collects usage and performance data, while Heat orchestrates complex deployments on top of OpenStack. Heat aims to autoscale its deployments, scaling up when they're running hot and scaling back when idle.
Ceilometer can access the decisive data and trigger the appropriate actions in Heat. Bringing these two OpenStack projects together creates value in the form of an alarming API in Ceilometer and its consumption in Heat.
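The alarm-evaluation idea behind this integration can be sketched as a simple threshold rule (an illustration only, not Ceilometer's actual alarm API; the sample values, threshold, and period count are assumptions):

```python
# Threshold-alarm sketch: an alarm fires when the metric exceeds a
# threshold for N consecutive evaluation periods; the orchestrator
# (Heat, in the OpenStack case) would react by scaling the deployment.

def evaluate_alarm(samples, threshold, periods):
    """samples: per-period average metric values (e.g. CPU utilisation).
    Returns 'alarm', 'ok', or 'insufficient data'."""
    recent = samples[-periods:]
    if len(recent) < periods:
        return "insufficient data"   # not enough history to decide
    if all(v > threshold for v in recent):
        return "alarm"               # e.g. scale up
    return "ok"                      # e.g. permit scale-down

print(evaluate_alarm([40, 85, 90, 95], threshold=80, periods=3))  # alarm
print(evaluate_alarm([40, 40, 42], threshold=80, periods=3))      # ok
```

Requiring several consecutive breaches, rather than reacting to a single sample, is what keeps autoscaling from flapping on transient spikes.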
Slides presented at the Fall OpenStack Design Summit in Hong Kong
Using the Bluemix predictive analytics service in Node-RED - Lionel Mommeja
This document describes how to use the IBM Bluemix Predictive Analytics service with Node-RED to enable collaboration between data scientists and developers for Internet of Things applications. It provides a step-by-step example of building a predictive model using sensor data from a TI SensorTag to detect failures, deploying the model on the Predictive Analytics service, and calling it from a Node-RED application. This allows data scientists to build models and developers to easily integrate predictive capabilities into their IoT solutions.
This document provides an overview of WSO2 Complex Event Processor (CEP). It discusses key CEP concepts like event streams, queries, and execution plans. It also demonstrates various query patterns for filtering, transforming, enriching, joining, and detecting patterns in event streams. The document outlines the architecture of CEP and shows how to define streams, tables, queries, and adaptors to integrate CEP with external systems. It provides examples of windowing, aggregations, functions, and extensions that can be used in Siddhi queries to process event streams.
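As an illustration of the windowed-aggregation pattern such queries express, here is a plain-Python sketch of a length-3 sliding-window average per room (the stream and field names are invented, and the commented query is only indicative of Siddhi-style syntax, not taken from the document):

```python
from collections import deque

# Sliding-window aggregation sketch, roughly what a Siddhi query like
#   from TempStream#window.length(3)
#   select roomNo, avg(temp) as avgTemp
#   insert into AvgTempStream;
# computes: for each incoming event, the average of the last 3 readings.

WINDOW = 3

def sliding_avg(events):
    """events: (room, temp) tuples. Emits (room, average of the last
    WINDOW temperatures for that room) for every incoming event."""
    windows = {}   # room -> bounded deque of recent temperatures
    out = []
    for room, temp in events:
        w = windows.setdefault(room, deque(maxlen=WINDOW))
        w.append(temp)                       # deque evicts the oldest reading
        out.append((room, round(sum(w) / len(w), 2)))
    return out

print(sliding_avg([("r1", 20), ("r1", 22), ("r1", 27), ("r1", 29)]))
# [('r1', 20.0), ('r1', 21.0), ('r1', 23.0), ('r1', 26.0)]
```

A CEP engine evaluates exactly this per-key windowed state incrementally, so each event costs O(1) work regardless of stream length.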
Azure Functions are great for a wide range of scenarios, including working with data on a transactional or event-driven basis. In this session, we'll look at how you can interact with Azure SQL, Cosmos DB, Event Hubs, and more so you can see how you can take a lightweight but code-first approach to building APIs, integrations, ETL, and maintenance routines.
This document provides an agenda for a Sumo Metrics Analyst certification course. The course covers collecting, analyzing, and monitoring metrics using Sumo Logic. It includes hands-on labs on collecting host and AWS metrics, analyzing metric formats, converting logs to metrics, and creating dashboards and alerts. The course aims to help students master metrics and earn a Sumo Logic certification by passing an online exam at the end.
Flink Forward Berlin 2017: Patrick Gunia - Migration of a realtime stats prod...
Counting things might sound like a trivial thing to do. But counting things consistently at scale can create unique and difficult challenges. At ResearchGate we count things for different reasons. On the one hand we provide numbers to our members to give them insights about their scientific impact and reach. At the same time, we use numbers ourselves as a basis for data-driven product development. We continuously tune our statistics infrastructure to improve our platform, adapt to new business requirements or fix bugs. A milestone in this improvement process has been the strategic decision to move our stats infrastructure from Storm to Flink. This significantly reduced complexity and required resources, including decreasing the load on our database backend by more than 30%. We will discuss the challenges we’ve encountered and overcome on the way, including handling of state and the need for online and offline processing using streaming and batch processors on the same data.
Flink Forward San Francisco 2018: David Reniz & Dahyr Vergara - "Real-time m...
“Customer experience is the next big battle ground for telcos,” Amit Akhelikar, Global Director of Lynx Analytics, recently proclaimed at TM Forum Live! Asia in Singapore. But how to fight in this battle? A common approach has been to keep “under control” some well-known network quality indicators, like dropped calls, radio access congestion, availability, and so on; but this has proven not to be enough to keep customers happy, like a siege weapon is not enough to conquer a city. But what if it were possible to know how customers perceive services, at least the most demanded ones, like web browsing or video streaming? That would be like a squad of archers ready for battle. And even having that, how to extract value from it and take action in no time, giving our skilled archers the right targets? Meet CANVAS (Customer And Network Visualization and AnalyticS), one of the first LATAM implementations of a Flink-based stream processing use case for a telco, which successfully combines leading and innovative technologies like Apache Hadoop, YARN, Kafka, Nifi, Druid and advanced visualizations with Flink core features like non-trivial stateful stream processing (joins, windows and aggregations on event time) and CEP capabilities for alarm generation, delivering a next-generation tool for SOC (Service Operation Center) teams.
Masterclass Webinar: Application Services and Dynamic Dashboard - Amazon Web Services
This document provides an overview of a technical demonstration that uses various AWS services like SNS, SQS, DynamoDB and S3 to mimic an auto-scaling application and generate dynamic dashboard content. It describes how auto scaling events are captured using SNS notifications and persisted to SQS, then processed to update a DynamoDB table and generate JSON files in S3 that power dynamic frontend content through JavaScript. The goal is to illustrate how these services can be integrated to build scalable, event-driven applications.
This document discusses mastering the velocity dimension of big data through information flow processing (IFP). It presents an agenda that includes an overview of IFP and a modeling framework for data stream management systems (DSMS) and complex event processing (CEP). The modeling framework consists of functional, processing, deployment, interaction, data, time, rule, and language models to characterize different systems.
This document discusses using semantic technologies to address the variety challenge of big data. It provides examples of applying semantic annotation to social data and metadata. Specifically, it describes how semantic annotation can extract meaningful metadata from social media posts, including information about users, content, relationships between users, and activity networks. The document outlines different types of metadata that can be derived from social media content, users, and networks.
Best Paper Award winning paper presented at ASONAM 2015.
Derek Doran, Samir Yelne, Luisa Massari, Maria-Carla Calzarossa, LaTrelle Jackson, Glen MoriartyDept. of CSE, Professional Psych, Wright State University, USADept. of Electrical, Computer, and Biomedical Eng., University of Pavia, Italy
7 Cups of Tea, Inc.
http://knoesis.wright.edu/doran
Krishnaprasad Thirunarayan, Value-Oriented Big Data Processing with Applications,
Invited Talk, The 2015 International Conference on Collaboration
Technologies and Systems (CTS 2015), June 2015.
Mastering the variety dimension of Big Data with semantic technologies: high level intro to standards. Focus on variety/interoperability dimension. Prof Amit Sheth
Examples of Real-World Big Data Application Specific examples of velocity challenge and how it is addressed in disaster coordination scenario (e.g., Jammu&Kashmir Floods).
Prof Amit Sheth - Kno.e.sis
Revathy Krishnamurthy, Pavan Kapanipathi, Amit P. Sheth, and Krishnaprasad Thirunarayan. Knowledge Enabled Approach to Predict the Location of Twitter Users, Proc. of the Extended Semantic Web Conference, Slovenia, May 3, 2015.
Paper at: http://www.knoesis.org/library/resource.php?id=2039
This document summarizes an approach to entity recommendations using hierarchical knowledge bases. It proposes using spreading activation theory on Wikipedia's category hierarchy to model user interests and generate recommendations. The approach transforms Wikipedia's category structure into a hierarchy and uses spreading activation to calculate interest scores for categories. These scores are then propagated across the hierarchy to score and rank entities. The approach is evaluated on movie recommendations and shows improved recall over baseline methods, particularly when incorporating category priority weights. Future work to improve normalization and validate category priorities is discussed.
Presentation of Hexoskin Validation for KHealth's Dementia Project
The paper is available at: http://www.knoesis.org/library/resource.php?id=2155
Citation for the paper: T. Banerjee, P. Anantharam, W. L. Romine, L. Lawhorne, A. Sheth, 'Evaluating a Potential Commercial Tool for Healthcare Application for People with Dementia' in Proc. of the Intl Conf on Health Informatics and Medical Systems (HIMS), Las Vegas, July 27-30, 2015.
Wide adoption of smartphones and availability of low-cost sensors has resulted in seamless and continuous monitoring of physiology, environment, and public health notifications. However, personalized digital health and patient empowerment can become a reality only if the complex multisensory and multimodal data is processed within the patient context. Contextual processing of patient data along with personalized medical knowledge can lead to actionable information for better and timely decisions. We present a system called kHealth capable of aggregating multisensory and multimodal data from sensors (passive sensing) and answers to questionnaire (active sensing) from patients with asthma. We present our preliminary data analysis comprising data collected from real patients highlighting the challenges in deploying such an application. The results show strong promise to derive actionable information using a combination of physiological indicators from active and passive sensors that can help doctors determine more precisely the cause, severity, and control level of asthma. Information synthesized from kHealth can be used to alert patients and caregivers for seeking timely clinical assistance to better manage asthma and improve their quality of life.
Paper: http://www.knoesis.org/library/resource.php?id=2153
Citation:
Pramod Anantharam, Tanvi Banerjee, Amit Sheth, Krishnaprasad Thirunarayan, Surendra Marupudi, Vaikunth Sridharan, Shalini G. Forbis, Knowledge-driven Personalized Contextual mHealth Service for Asthma Management in Children , IEEE 4th International Conference on Mobile Services, June 27 - July 2, 2015, New York, USA.
Social media provides a natural platform for dynamic emergence of citizen (as) sensor communities, where the citizens share information, express opinions, and engage in discussions. Often such a Online Citizen Sensor Community (CSC) has stated or implied goals related to workflows of organizational actors with defined roles and responsibilities. For example, a community of crisis response volunteers, for informing the prioritization of responses for resource needs (e.g., medical) to assist the managers of crisis response organizations. However, in CSC, there are challenges related to information overload for organizational actors, including finding reliable information providers and finding the actionable information from citizens. This threatens awareness and articulation of workflows to enable cooperation between citizens and organizational actors. CSCs supported by Web 2.0 social media platforms offer new opportunities and pose new challenges. This work addresses issues of ambiguity in interpreting unconstrained natural language (e.g., ‘wanna help’ appearing in both types of messages for asking and offering help during crises), sparsity of user and group behaviors (e.g., expression of specific intent), and diversity of user demographics (e.g., medical or technical professional) for interpreting user-generated data of citizen sensors. Interdisciplinary research involving social and computer sciences is essential to address these socio-technical issues in CSC, and allow better accessibility to user-generated data at higher level of information abstraction for organizational actors. This study presents a novel web information processing framework focused on actors and actions in cooperation, called Identify-Match-Engage (IME), which fuses top-down and bottom-up computing approaches to design a cooperative web information system between citizens and organizational actors. It includes a.) 
identification of action related seeking-offering intent behaviors from short, unstructured text documents using both declarative and statistical knowledge based classification model, b.) matching of intentions about seeking and offering, and c.) engagement models of users and groups in CSC to prioritize whom to engage, by modeling context with social theories using features of users, their generated content, and their dynamic network connections in the user interaction networks. The results show an improvement in modeling efficiency from the fusion of top-down knowledge-driven and bottom-up data-driven approaches than from conventional bottom-up approaches alone for modeling intent and engagement. Several applications of this work include use of the engagement interface tool during recent crises to enable efficient citizen engagement for spreading critical information of prioritized needs to ensure donation of only required supplies by the citizens. The engagement interface application also won the United Nations ICT agency ITU's Young Innovator 2014 award.
Krishnaprasad Thirunarayan, Trust Management: Multimodal Data Perspective,
Invited Tutorial, The 2015 International Conference on Collaboration
Technologies and Systems (CTS 2015), June 2015
Ceilometer is a tool that collects usage and performance data, while Heat orchestrates complex deployments on top of OpenStack. Heat aims to autoscale its deployments, scaling up when they're running hot and scaling back when idle.
Ceilometer can access decisive data and trigger the appropriate actions in Heat. The result of these two OpenStack projects meeting is value creation in the form of an alarming API in Ceilometer and its consumption in Heat.
Slides presented at the Fall OpenStack Design Summit in Hong Kong
Using Bluemix predictive analytics service in Node-RED, by Lionel Mommeja
This document describes how to use the IBM Bluemix Predictive Analytics service with Node-RED to enable collaboration between data scientists and developers for Internet of Things applications. It provides a step-by-step example of building a predictive model using sensor data from a TI SensorTag to detect failures, deploying the model on the Predictive Analytics service, and calling it from a Node-RED application. This allows data scientists to build models and developers to easily integrate predictive capabilities into their IoT solutions.
This document provides an overview of WSO2 Complex Event Processor (CEP). It discusses key CEP concepts like event streams, queries, and execution plans. It also demonstrates various query patterns for filtering, transforming, enriching, joining, and detecting patterns in event streams. The document outlines the architecture of CEP and shows how to define streams, tables, queries, and adaptors to integrate CEP with external systems. It provides examples of windowing, aggregations, functions, and extensions that can be used in Siddhi queries to process event streams.
Azure Functions are great for a wide range of scenarios, including working with data on a transactional or event-driven basis. In this session, we'll look at how you can interact with Azure SQL, Cosmos DB, Event Hubs, and more so you can see how you can take a lightweight but code-first approach to building APIs, integrations, ETL, and maintenance routines.
This document provides an agenda for a Sumo Metrics Analyst certification course. The course covers collecting, analyzing, and monitoring metrics using Sumo Logic. It includes hands-on labs on collecting host and AWS metrics, analyzing metric formats, converting logs to metrics, and creating dashboards and alerts. The course aims to help students master metrics and earn a Sumo Logic certification by passing an online exam at the end.
Flink Forward Berlin 2017: Patrick Gunia - Migration of a realtime stats prod..., by Flink Forward
Counting things might sound like a trivial thing to do. But counting things consistently at scale can create unique and difficult challenges. At ResearchGate we count things for different reasons. On the one hand we provide numbers to our members to give them insights about their scientific impact and reach. At the same time, we use numbers ourselves as a basis for data-driven product development. We continuously tune our statistics infrastructure to improve our platform, adapt to new business requirements or fix bugs. A milestone in this improvement process has been the strategic decision to move our stats infrastructure from Storm to Flink. This significantly reduced complexity and required resources, including decreasing the load on our database backend by more than 30%. We will discuss the challenges we’ve encountered and overcome on the way, including handling of state and the need for online and offline processing using streaming and batch processors on the same data.
Flink Forward San Francisco 2018: David Reniz & Dahyr Vergara - "Real-time m..., by Flink Forward
“Customer experience is the next big battle ground for telcos,” Amit Akhelikar, Global Director of Lynx Analytics, recently proclaimed at TM Forum Live! Asia in Singapore. But how to fight in this battle? A common approach has been to keep “under control” some well-known network quality indicators, like dropped calls, radio access congestion, availability, and so on; but this has proven not to be enough to keep customers happy, just as a siege weapon is not enough to conquer a city. But what if it were possible to know how customers perceive services, at least the most demanded ones, like web browsing or video streaming? That would be like a squad of archers ready for battle. And even having that, how to extract value from it and take action in no time, giving our skilled archers the right targets? Meet CANVAS (Customer And Network Visualization and AnalyticS), one of the first LATAM implementations of a Flink-based stream processing use case for a telco, which successfully combines leading and innovative technologies like Apache Hadoop, YARN, Kafka, Nifi, Druid and advanced visualizations with Flink core features like non-trivial stateful stream processing (joins, windows and aggregations on event time) and CEP capabilities for alarm generation, delivering a next-generation tool for SOC (Service Operation Center) teams.
Masterclass Webinar: Application Services and Dynamic Dashboard, by Amazon Web Services
This document provides an overview of a technical demonstration that uses various AWS services like SNS, SQS, DynamoDB and S3 to mimic an auto-scaling application and generate dynamic dashboard content. It describes how auto scaling events are captured using SNS notifications and persisted to SQS, then processed to update a DynamoDB table and generate JSON files in S3 that power dynamic frontend content through JavaScript. The goal is to illustrate how these services can be integrated to build scalable, event-driven applications.
IoT Supercharged: Complex event processing for MQTT with Eclipse technologies, by Istvan Rath
Slides for our talk at EclipseCon Europe 2015. More details at https://www.eclipsecon.org/europe2015/session/iot-supercharged-complex-event-processing-mqtt-eclipse-technologies
MongoDB Solution for Internet of Things and Big Data, by Stefano Dindo
The Internet of Things is one of the most important market scenarios to invest in by 2020.
The Internet of Things brings people's real lives onto the Web through interaction with physical objects and spaces, exchanging large volumes of data.
The lab described the architecture needed to support Internet of Things projects, with a focus on how data is organized within MongoDB, the market-leading NoSQL database, to collect and analyze large volumes of data efficiently and in real time.
Hands-on lab on designing MongoDB solutions for the Internet of T..., by festival ICT 2016
Companies must respond quickly to market changes as new scenarios take hold (Internet of Things, social analysis, Industry 4.0, etc.) that increasingly require integration with new technologies. Since these are innovative projects that affect company processes and contexts, it is essential to have flexible solutions for collecting and analyzing data.
MongoDB is a NoSQL database that offers flexibility, scalability, and simplified development. The lab illustrates how to build MongoDB architectures and carry out schema design for managing data in the IoT and Big Data space, with reference to real-world cases based on cloud technologies, needed to face an increasingly global market.
PVS-Studio is a static code analyzer for C, C++, C#, and Java that detects bugs and vulnerabilities. It supports various compilers and IDE plugins. It uses data flow analysis, symbolic execution, pattern matching, and other techniques to detect bugs like buffer overflows, leaks, dead code, and undefined behavior. Over 700 diagnostics are implemented to date across the supported languages. The analyzer produces warnings classified by standard taxonomies. Users can exclude files, suppress warnings, and integrate it with continuous integration systems. Support and documentation is provided through online and PDF references.
Apex Replay Debugger and Salesforce Platform Events.pptx, by mohayyudin7826
Exploring Salesforce Platform Events: Discover how to use Platform Events to create real-time applications that streamline your workflows and enhance collaboration.
Apex Replay Debugger: Learn how to troubleshoot your Apex code like a pro. We'll show you how to identify and fix issues efficiently.
At Jimdo we collect a large number of metrics across all parts of our system, at every level: infrastructure, system, and application. It is important that all developers can inspect the real-time metrics of their services at any time. To guarantee that, we spent some time integrating Prometheus into our systems.
In our talk we cover both operating Prometheus and its integration with the rest of the Jimdo platform. We report on pitfalls and tricks we learned along the way, and give an insight into our tool landscape.
by Lin Chunyong and Ryan Deivert, Airbnb
AWS Data & Analytics Week is an opportunity to learn about Amazon’s family of managed analytics services. These services provide easy, scalable, reliable, and cost-effective ways to manage your data in the cloud. We explain the fundamentals and take a technical deep dive into Amazon Redshift data warehouse; Data Lake services including Amazon EMR, Amazon Athena, & Amazon Redshift Spectrum; Log Analytics with Amazon Elasticsearch Service; and data preparation and placement services with AWS Glue and Amazon Kinesis. You'll learn how to get started, how to support applications, and how to scale.
This document discusses monitoring systems using syslog and EventMachine. It proposes building a lightweight, polyglot system that aggregates syslog events and displays metrics and visualizations using various protocols like WebSockets, Server-Sent Events, and Graphite. Event sources would send syslog messages which an EventMachine server would parse and pass to an EM:Channel. A JavaScript client could subscribe to the channel for real-time updates.
Topic: Discover deep insights with Salesforce Einstein Analytics and Discovery
ImpactSalesforceSaturday Session
by @newdelhisfdcdug
Speaker: Jayant Joshi
AGENDA
a. What is SFDC Einstein Analytics?
b. Let us build great Visualizations using Einstein Analytics
c. Discover Deep Insights with Einstein Discovery
d. Demo and QA
https://newdelhisfdcdug.com/salesforce-einstein-analytics-and-discovery/
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia (GoDataDriven)
Matei Zaharia is an assistant professor of computer science at Stanford University, Chief Technologist and Co-founder of Databricks. He started the Spark project at UC Berkeley and continues to serve as its vice president at Apache. Matei also co-started the Apache Mesos project and is a committer on Apache Hadoop. Matei’s research work on datacenter systems was recognized through two Best Paper awards and the 2014 ACM Doctoral Dissertation Award.
Research project for Digital Systems that helped us better understand how to extend the communication aspects of micro-controllers and their connected interfaces to develop IoT systems that can communicate with an online service without human intervention.
Similar to Walk through Streaming Technologies: EPL (20)
Codeless Generative AI Pipelines
(GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
milvus, unstructured data, vector database, zilliz, cloud, vectors, python, deep learning, generative ai, genai, nifi, kafka, flink, streaming, iot, edge
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake, by Walaa Eldin Moustafa
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
End-to-end pipeline agility - Berlin Buzzwords 2024, by Lars Albertsson
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data, by Kiwi Creative
Harness the power of AI-backed reports, benchmarking and data analysis to predict trends and detect anomalies in your marketing efforts.
Peter Caputa, CEO at Databox, reveals how you can discover the strategies and tools to increase your growth rate (and margins!).
From metrics to track to data habits to pick up, enhance your reporting for powerful insights to improve your B2B tech company's marketing.
- - -
This is the webinar recording from the June 2024 HubSpot User Group (HUG) for B2B Technology USA.
Watch the video recording at https://youtu.be/5vjwGfPN9lw
Sign up for future HUG events at https://events.hubspot.com/b2b-technology-usa/
Learn SQL from basic queries to advanced queries, by manishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
"Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens", by sameer shah
Embark on a captivating financial journey with 'Financial Odyssey,' our hackathon project. Delve deep into the past performance of two companies as we employ an array of financial statement analysis techniques. From ratio analysis to trend analysis, uncover insights crucial for informed decision-making in the dynamic world of finance."
Global Situational Awareness of A.I. and where it's headed, by vikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be unleashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
The Ipsos - AI - Monitor 2024 Report.pdf, by Social Samosa
According to Ipsos AI Monitor's 2024 report, 65% Indians said that products and services using AI have profoundly changed their daily life in the past 3-5 years.
1. BigData
Semantic Approach to Big Data and Event Processing
Walk through Streaming Technologies: EPL
Riccardo Tommasini
PhD Student, Politecnico di Milano
riccardo.tommasini@polimi.it
11. BigData
Event Streams/Complex Event Processing: EPL
The Event Processing Language (EPL) supports:
• Continuous query answering
• Complex pattern matching (causality relationships)
EPL statements are registered into Esper and continuously executed as the live data streams are pushed through.
12. BigData
EPL Continuous Query Processing Model
• Source: push based; provides data tuples from sensors, trace files, etc.
• EPL Query: push based; continuously executed against incoming events
• Listener: receives data tuples from queries and pushes data tuples to other queries
• Subscriber: receives the processed tuples
13. BigData
EPL Continuous Query Processing Model
The EPL query model can be represented as a graph:
• Sources act as input
• EPL queries integrate sources
• Listeners propagate query results
• Subscribers act as output
Graph nodes are functional components, manually connected by the data stream.
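This graph model can be sketched in EPL itself: one statement's output stream can feed other statements through insert into. The sketch below is our own illustration (the HighTempEvent stream name and the thresholds are assumptions), reusing the running example's TemperatureSensorEvent type:

```epl
// Node 1: derive a stream of high-temperature readings
insert into HighTempEvent
select sensor, temperature
from TemperatureSensorEvent
where temperature > 50;

// Node 2: consume the derived stream like any other source
select sensor, count(*)
from HighTempEvent.win:time(10 min)
group by sensor;
```

A listener attached to the second statement would then receive updated per-sensor counts as events flow through the chain.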
15. BigData
DEMO
We start with demo time:
• Java code and data for the running example: https://github.com/riccardotommasini/esper-tutorial
• Online version: http://esper-epl-tryout.appspot.com/epltryout/mainform.html
16. BigData
Running Example
Request: count the number of fires detected using a set of smoke and temperature sensors in the last 10 minutes.
Events:
• Smoke Event: String sensor, boolean state
• Temperature Event: String sensor, double temperature
• Fire Event: String sensor, boolean smoke, double temperature
Condition (Fire): the same sensor detects smoke followed by a temperature > 50°C.
22. BigData
EPL: Query Registration
• Similar to SQL (select, where)
• Views are used instead of tables

[insert into insert_into_def]
select select_list
from stream_def [as name] [, stream_def [as name]] [,...]
[where search_conditions]
[group by grouping_expression_list]
[having grouping_search_conditions]
[output output_specification]
[order by order_by_expression_list]
[limit num_rows]
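As a hedged illustration of the full clause template (the statement name, thresholds, and output rate below are our own choices, not taken from the slides), applied to the running example's TemperatureSensorEvent stream:

```epl
@Name('AvgHighTempBySensor')
select sensor, avg(temperature)
from TemperatureSensorEvent.win:time(10 min)
group by sensor
having avg(temperature) > 50
output every 30 seconds
order by sensor;
```

Note that the clauses appear in exactly the order given by the template: select, from, group by, having, output, order by.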
23. BigData
EPL Queries: Examples

Stream lookup:
@Name('Q0') select *
from TemperatureSensorEvent
where temperature > 50;

Aggregate:
@Name('Q1') select avg(temperature)
from TemperatureSensorEvent;
24. BigData
EPL: Views
• Similar to SQL tables
• Define the data available for querying and filtering
• Sorting, aggregation, and grouping operations are possible
• Usually represented as windows over the streams of events
25. BigData
EPL: Statements and Windows
• Logical sliding, win:time(time period): sliding time window extending the specified time interval into the past
• Logical tumbling, win:time_batch(time period): tumbling window that batches events and releases them every specified time interval, with flow control options
• Physical sliding, win:length(size): sliding length window extending the specified number of events into the past
• Physical tumbling, win:length_batch(size): tumbling window that batches events and releases them when a given minimum number has been collected
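Tying the window views back to the running example (a sketch; the exact statements are ours, assuming the demo's event type names):

```epl
// Sliding time window: fires detected in the last 10 minutes
select count(*) from FireEvent.win:time(10 min);

// Tumbling time window: one count per 10-minute batch
select count(*) from FireEvent.win:time_batch(10 min);

// Sliding length window: average over the last 5 temperature readings
select avg(temperature) from TemperatureSensorEvent.win:length(5);
```

The first statement answers the request from the running example directly: each new FireEvent updates the count, and events older than 10 minutes leave the window.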
36. BigData
EPL: Event Pattern Matching
• Looking for a pattern of events over the incoming data streams
• Patterns can be temporal as well as physical
• Implemented via finite state machines
41. BigData
Simple Example: every (A -> B)
• Pattern
– every ( A -> B )
• Events
– A1 B1 C1 B2 A2 D1 A3 B3 E1 A4 F1 B4
• Results
– Detect an A event followed by a B event. At the time when B occurs the pattern matches, then the pattern matcher restarts and looks for the next A event.
1. Matches on B1 for combination {A1, B1}
2. Matches on B3 for combination {A2, B3}
3. Matches on B4 for combination {A4, B4}
Stream & Complex Event Processing - Introduction
42. BigData
Simple Example: every A -> B
• Pattern
– every A -> B
• Events
– A1 B1 C1 B2 A2 D1 A3 B3 E1 A4 F1 B4
• Results
– The pattern fires for every A event followed by a B event.
1. Matches on B1 for combination {A1, B1}
2. Matches on B3 for combinations {A2, B3} and {A3, B3}
3. Matches on B4 for combination {A4, B4}
43. BigData
Simple Example: A -> every B
• Pattern
– A -> every B
• Events
– A1 B1 C1 B2 A2 D1 A3 B3 E1 A4 F1 B4
• Results
– The pattern fires for an A event followed by every B event.
1. Matches on B1 for combination {A1, B1}
2. Matches on B2 for combination {A1, B2}
3. Matches on B3 for combination {A1, B3}
4. Matches on B4 for combination {A1, B4}
44. BigData
Simple Example: every A -> every B
• Pattern
– every A -> every B
• Events
– A1 B1 C1 B2 A2 D1 A3 B3 E1 A4 F1 B4
• Results
– The pattern fires for every A event followed by every B event.
1. Matches on B1 for combination {A1, B1}
2. Matches on B2 for combination {A1, B2}
3. Matches on B3 for combinations {A1, B3}, {A2, B3} and {A3, B3}
4. Matches on B4 for combinations {A1, B4}, {A2, B4}, {A3, B4} and {A4, B4}
46. BigData
Guards Example 1
• Events
– A1 A2 B1
• Pattern
– every A -> B
• Results
– {A1, B1} and {A2, B1}

• Events
– A1 A2 B1
• Pattern
– every A -> (B and not A)
• Results
– {A2, B1}
– The "and not" operator causes the sub-expression looking for {A1, B?} to end when A2 arrives.
47. BigData
Guards Example 2
• Events
– A1 received at t0 + 1 sec
– A2 received at t0 + 3 sec
– B1 received at t0 + 4 sec
• Pattern
– every A -> B
• Results
– {A1, B1} and {A2, B1}

• Events
– A1 received at t0 + 1 sec
– A2 received at t0 + 2 sec
– B1 received at t0 + 3 sec
• Pattern
– every A -> (B where timer:within(2 sec))
• Results
– {A2, B1}
– The "where timer:within" guard causes the sub-expression looking for {A1, B?} to end after 2 seconds.
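Putting the operators and guards together, the running example's fire condition can be sketched as an EPL pattern. The SmokeSensorEvent type name and the 10-minute guard are our assumptions (chosen to fit the running example's request), not statements taken from the slides:

```epl
// Fire: the same sensor detects smoke, then temperature > 50
insert into FireEvent
select a.sensor as sensor, a.state as smoke, b.temperature as temperature
from pattern [
  every a=SmokeSensorEvent(state = true)
    -> ( b=TemperatureSensorEvent(sensor = a.sensor, temperature > 50)
         where timer:within(10 min) )
];
```

The filter sensor = a.sensor correlates the two events on the same sensor, and the timer:within guard abandons any smoke event that is not followed by a matching temperature reading within 10 minutes.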
53. BigData
Event Declaration: Java POJO

package us.wsu.knoesis.tutorial.events;

public class FireEvent {
    private String sensor;
    private boolean smoke;
}

• Getter and setter naming conventions are relevant to access properties from EPL
• The constructor must be defined as well
• toString is used to display the event; consider redefining it
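For instance (our own two-line illustration), once FireEvent is registered as an event type, a statement such as the following resolves the sensor and smoke properties through the POJO's getSensor() and isSmoke() accessors, per the JavaBeans naming convention:

```epl
select sensor, smoke
from FireEvent;
```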