In planet-scale deployments, the Operation and Maintenance (O&M) of cloud platforms cannot be done any longer manually or simply with off-the-shelf solutions. It requires self-developed automated systems, ideally exploiting the use of AI to provide tools for autonomous cloud operations. This talk will explain how deep learning, distributed traces, and time-series analysis (sequence analysis) can be used to effectively detect anomalous cloud infrastructure behaviors during operations to reduce the workload of human operators. The iForesight system is being used to evaluate this new O&M approach. iForesight 2.0 is the result of 2 years of research with the goal to provide an intelligent new tool aimed at SRE cloud maintenance teams. It enables them to quickly detect and predict anomalies thanks to the use of artificial intelligence when cloud services are slow or unresponsive.
On the Application of AI for Failure Management: Problems, Solutions and Algo...Jorge Cardoso
Artificial Intelligence for IT Operations (AIOps) is a class of software which targets the automation of operational tasks through machine learning technologies. ML algorithms are typically used to support tasks such as anomaly detection, root-causes analysis, failure prevention, failure prediction, and system remediation. AIOps is gaining an increasing interest from the industry due to the exponential growth of IT operations and the complexity of new technology. Modern applications are assembled from hundreds of dependent microservices distributed across many cloud platforms, leading to extremely complex software systems. Studies show that cloud environments are now too complex to be managed solely by humans. This talk discusses various AIOps problems we have addressed over the years and gives a sketch of the solutions and algorithms we have implemented. Interesting problems include hypervisor anomaly detection, root-cause analysis of software service failures using application logs, multi-modal anomaly detection, root-cause analysis using distributed traces, and verification of virtual private cloud networks.
DeepLearning is not just a hype - it outperforms state-of-the-art ML algorithms. One by one. In this talk we will show how DeepLearning can be used for detecting anomalies on IoT sensor data streams at high speed using DeepLearning4J on top of different BigData engines like ApacheSpark and ApacheFlink. Key in this talk is the absence of any large training corpus since we are using unsupervised machine learning - a domain current DL research threats step-motherly. As we can see in this demo LSTM networks can learn very complex system behavior - in this case data coming from a physical model simulating bearing vibration data. Once draw back of DeepLearning is that normally a very large labaled training data set is required. This is particularly interesting since we can show how unsupervised machine learning can be used in conjunction with DeepLearning - no labeled data set is necessary. We are able to detect anomalies and predict braking bearings with 10 fold confidence. All examples and all code will be made publicly available and open sources. Only open source components are used.
On the Application of AI for Failure Management: Problems, Solutions and Algo...Jorge Cardoso
Artificial Intelligence for IT Operations (AIOps) is a class of software which targets the automation of operational tasks through machine learning technologies. ML algorithms are typically used to support tasks such as anomaly detection, root-causes analysis, failure prevention, failure prediction, and system remediation. AIOps is gaining an increasing interest from the industry due to the exponential growth of IT operations and the complexity of new technology. Modern applications are assembled from hundreds of dependent microservices distributed across many cloud platforms, leading to extremely complex software systems. Studies show that cloud environments are now too complex to be managed solely by humans. This talk discusses various AIOps problems we have addressed over the years and gives a sketch of the solutions and algorithms we have implemented. Interesting problems include hypervisor anomaly detection, root-cause analysis of software service failures using application logs, multi-modal anomaly detection, root-cause analysis using distributed traces, and verification of virtual private cloud networks.
DeepLearning is not just a hype - it outperforms state-of-the-art ML algorithms. One by one. In this talk we will show how DeepLearning can be used for detecting anomalies on IoT sensor data streams at high speed using DeepLearning4J on top of different BigData engines like ApacheSpark and ApacheFlink. Key in this talk is the absence of any large training corpus since we are using unsupervised machine learning - a domain current DL research threats step-motherly. As we can see in this demo LSTM networks can learn very complex system behavior - in this case data coming from a physical model simulating bearing vibration data. Once draw back of DeepLearning is that normally a very large labaled training data set is required. This is particularly interesting since we can show how unsupervised machine learning can be used in conjunction with DeepLearning - no labeled data set is necessary. We are able to detect anomalies and predict braking bearings with 10 fold confidence. All examples and all code will be made publicly available and open sources. Only open source components are used.
Presentation for the Softskills Seminar course @ Telecom ParisTech. Topic is the paper by Domings Hulten "Mining high speed data streams". Presented by me the 30/11/2017
Predictive Analytics - Big Data & Artificial IntelligenceManish Jain
Quick overview of the latest in big data and artificial intelligence. A lot of buzzwords being thrown around, hopefully this presentation will demystify many of the terms.
Human Emotion Recognition using Machine Learningijtsrd
It is quite interesting to recognize the human emotions in the field of machine learning. Using a person's facial expression one can know his emotions or what the person wants to express. But at the same time it's not easy to recognize one's emotion easily its quite challenging at times. Facial expression consist of various human emotions such as sad, happy , excited, angry, frustrated and surprise. Few years back Natural language processing was used to detect the sentiment from the text and then it took a step forward towards emotion detection. Sentiments can be positive, negative or neutral where as emotions are more refined categories. There are many techniques used to recognize emotions. This paper provides a review of research work carried out and published in the field of human emotion recognition and various techniques used for human emotions recognition. Prof. Mrs. Dhanamma Jagli | Ms. Pooja Shetty "Human Emotion Recognition using Machine Learning" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5 , August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd25217.pdfPaper URL: https://www.ijtsrd.com/computer-science/artificial-intelligence/25217/human-emotion-recognition-using-machine-learning/prof-mrs-dhanamma-jagli
DETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNINGijcsit
With the advent of the Internet and social media, while hundreds of people have benefitted from the vast sources of information available, there has been an enormous increase in the rise of cyber-crimes, particularly targeted towards women. According to a 2019 report in the [4] Economics Times, India has witnessed a 457% rise in cybercrime in the five year span between 2011 and 2016. Most speculate that this is due to impact of social media such as Facebook, Instagram and Twitter on our daily lives. While these definitely help in creating a sound social network, creation of user accounts in these sites usually needs just an email-id. A real life person can create multiple fake IDs and hence impostors can easily be made. Unlike the real world scenario where multiple rules and regulations are imposed to identify oneself in a unique manner (for example while issuing one’s passport or driver’s license), in the virtual world of social media, admission does not require any such checks. In this paper, we study the different accounts of Instagram, in particular and try to assess an account as fake or real using Machine Learning techniques namely Logistic Regression and Random Forest Algorithm.
Data preprocessing techniques are applied before mining. These can improve the overall quality of the patterns mined and the time required for the actual mining.
Some important data preprocessing that must be needed before applying the data mining algorithm to any data sets are completely described in these slides.
AI Data Summit 2019 Thompson Sampling - Thompson Sampling Tutorial “The redis...Hanan Shteingart
AI Data Summit 2019 Thompson Sampling - Thompson Sampling Tutorial “The rediscovery of a swiss army knife”
aka “Squeezing a 96-page review into a 25 min talk”
Based on Russo, D. J., Van Roy, B., Kazerouni, A., Osband, I., & Wen, Z. (2018). A tutorial on Thompson sampling. Foundations and Trends® inMachine Learning, 11(1), 1-96.
This is a presentation I gave as a short overview of LSTMs. The slides are accompanied by two examples which apply LSTMs to Time Series data. Examples were implemented using Keras. See links in slide pack.
Expectation Maximization (EM) algorithm is a method that is used for finding maximum likelihood or maximum a posteriori (MAP) that is the estimation of parameters in statistical models, and the model depends on unobserved latent variables that is calculated using models. Copy the link given below and paste it in new browser window to get more information on Em Algorithm:- http://www.transtutors.com/homework-help/statistics/em-algorithm.aspx
Very basic Introduction to Big Data. Touches on what it is, characteristics, some examples of Big Data frameworks. Hadoop 2.0 example - Yarn, HDFS and Map-Reduce with Zookeeper.
Microservices architecture involves many services that are being distributed over the network resulting in many more ways of failure. This session will try to cover the available tools that can help you when designing/building such distributed system in Go
Presentation for the Softskills Seminar course @ Telecom ParisTech. Topic is the paper by Domings Hulten "Mining high speed data streams". Presented by me the 30/11/2017
Predictive Analytics - Big Data & Artificial IntelligenceManish Jain
Quick overview of the latest in big data and artificial intelligence. A lot of buzzwords being thrown around, hopefully this presentation will demystify many of the terms.
Human Emotion Recognition using Machine Learningijtsrd
It is quite interesting to recognize the human emotions in the field of machine learning. Using a person's facial expression one can know his emotions or what the person wants to express. But at the same time it's not easy to recognize one's emotion easily its quite challenging at times. Facial expression consist of various human emotions such as sad, happy , excited, angry, frustrated and surprise. Few years back Natural language processing was used to detect the sentiment from the text and then it took a step forward towards emotion detection. Sentiments can be positive, negative or neutral where as emotions are more refined categories. There are many techniques used to recognize emotions. This paper provides a review of research work carried out and published in the field of human emotion recognition and various techniques used for human emotions recognition. Prof. Mrs. Dhanamma Jagli | Ms. Pooja Shetty "Human Emotion Recognition using Machine Learning" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5 , August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd25217.pdfPaper URL: https://www.ijtsrd.com/computer-science/artificial-intelligence/25217/human-emotion-recognition-using-machine-learning/prof-mrs-dhanamma-jagli
DETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNINGijcsit
With the advent of the Internet and social media, while hundreds of people have benefitted from the vast sources of information available, there has been an enormous increase in the rise of cyber-crimes, particularly targeted towards women. According to a 2019 report in the [4] Economics Times, India has witnessed a 457% rise in cybercrime in the five year span between 2011 and 2016. Most speculate that this is due to impact of social media such as Facebook, Instagram and Twitter on our daily lives. While these definitely help in creating a sound social network, creation of user accounts in these sites usually needs just an email-id. A real life person can create multiple fake IDs and hence impostors can easily be made. Unlike the real world scenario where multiple rules and regulations are imposed to identify oneself in a unique manner (for example while issuing one’s passport or driver’s license), in the virtual world of social media, admission does not require any such checks. In this paper, we study the different accounts of Instagram, in particular and try to assess an account as fake or real using Machine Learning techniques namely Logistic Regression and Random Forest Algorithm.
Data preprocessing techniques are applied before mining. These can improve the overall quality of the patterns mined and the time required for the actual mining.
Some important data preprocessing that must be needed before applying the data mining algorithm to any data sets are completely described in these slides.
AI Data Summit 2019 Thompson Sampling - Thompson Sampling Tutorial “The redis...Hanan Shteingart
AI Data Summit 2019 Thompson Sampling - Thompson Sampling Tutorial “The rediscovery of a swiss army knife”
aka “Squeezing a 96-page review into a 25 min talk”
Based on Russo, D. J., Van Roy, B., Kazerouni, A., Osband, I., & Wen, Z. (2018). A tutorial on Thompson sampling. Foundations and Trends® inMachine Learning, 11(1), 1-96.
This is a presentation I gave as a short overview of LSTMs. The slides are accompanied by two examples which apply LSTMs to Time Series data. Examples were implemented using Keras. See links in slide pack.
Expectation Maximization (EM) algorithm is a method that is used for finding maximum likelihood or maximum a posteriori (MAP) that is the estimation of parameters in statistical models, and the model depends on unobserved latent variables that is calculated using models. Copy the link given below and paste it in new browser window to get more information on Em Algorithm:- http://www.transtutors.com/homework-help/statistics/em-algorithm.aspx
Very basic Introduction to Big Data. Touches on what it is, characteristics, some examples of Big Data frameworks. Hadoop 2.0 example - Yarn, HDFS and Map-Reduce with Zookeeper.
Microservices architecture involves many services that are being distributed over the network resulting in many more ways of failure. This session will try to cover the available tools that can help you when designing/building such distributed system in Go
Distributed Trace & Log Analysis using MLJorge Cardoso
The field of AIOps, also known as Artificial Intelligence for IT Operations, uses advanced technologies to dramatically improve the monitoring, operation, and troubleshooting of distributed systems. Its main premise is that operations can be automated using monitoring data to reduce the workload of operators (e.g., SREs or production engineers). Our current research explores how AIOps – and many related fields such as deep learning, machine learning, distributed traces, graph analysis, time-series analysis, sequence analysis, advanced statistics, NLP and log analysis – can be explored to effectively detect, localize, predict, and remediate failures in large-scale cloud infrastructures (>50 regions and AZs) by analyzing service management data (e.g., distributed traces, logs, events, alerts, metrics). In particular, this talk will describe how a particular monitoring data structure, called distributed traces, can be analyzed using deep learning to identify anomalies in its spans. This capability empowers operators to quickly identify which components of a distributed system are faulty.
AIOps: Anomalous Span Detection in Distributed Traces Using Deep LearningJorge Cardoso
The field of AIOps, also known as Artificial Intelligence for IT Operations, uses algorithms and machine learning to dramatically improve the monitoring, operation, and maintenance of distributed systems. Its main premise is that operations can be automated using monitoring data to reduce the workload of operators (e.g., SREs or production engineers). Our current research explores how AIOps – and many related fields such as deep learning, machine learning, distributed traces, graph analysis, time-series analysis, sequence analysis, and log analysis – can be explored to effectively detect, localize, and remediate failures in large-scale cloud infrastructures (>50 regions and AZs). In particular, this lecture will describe how a particular monitoring data structure, called distributed trace, can be analyzed using deep learning to identify anomalies in its spans. This capability empowers operators to quickly identify which components of a distributed system are faulty.
The evolution of machine learning and IoT have made it possible for manufacturers to build more effective applications for predictive maintenance than ever before. Despite the huge potential that machine learning offers for predictive maintenance, it's challenging to build solutions that can handle the speed of IoT data streams and the massively large datasets required to train models that can forecast rare events like mechanical failures. Solving these challenges requires knowledge about state-of-the-art dataware, such as MapR, and cluster computing frameworks, such as Spark, which give developers foundational APIs for consuming and transforming data into feature tables useful for machine learning.
This talk was presented in Startup Master Class 2017 - http://aaiitkblr.org/smc/ 2017 @ Christ College Bangalore. Hosted by IIT Kanpur Alumni Association and co-presented by IIT KGP Alumni Association, IITACB, PanIIT, IIMA and IIMB alumni.
My co-presenter was Biswa Gourav Singh. And contributor was Navin Manaswi.
http://dataconomy.com/2017/04/history-neural-networks/ - timeline for neural networks
Advanced Open IoT Platform for Prevention and Early Detection of Forest FiresIvo Andreev
The session was about open architecture using IoT Edge, Azure Cognitive Services, Mosquitto MQTT, Influx DB and GraphQL web services to develop advanced architecture for early detection of forest fires that integrates sensor networks and mobile (drone) technologies for data collection and processing. Unmanned air vehicles (UAVs) will allow coverage of larger areas to raise the percentage of forest fires detections, monitor areas with high fire weather index and such already affected by forest fires. All information is forwarded and stored in cloud computing platform where near real-time processing and alerting is performed.
Provenance for Data Munging EnvironmentsPaul Groth
Data munging is a crucial task across domains ranging from drug discovery and policy studies to data science. Indeed, it has been reported that data munging accounts for 60% of the time spent in data analysis. Because data munging involves a wide variety of tasks using data from multiple sources, it often becomes difficult to understand how a cleaned dataset was actually produced (i.e. its provenance). In this talk, I discuss our recent work on tracking data provenance within desktop systems, which addresses problems of efficient and fine grained capture. I also describe our work on scalable provence tracking within a triple store/graph database that supports messy web data. Finally, I briefly touch on whether we will move from adhoc data munging approaches to more declarative knowledge representation languages such as Probabilistic Soft Logic.
Presented at Information Sciences Institute - August 13, 2015
Similar to Mastering AIOps with Deep Learning (20)
AIOps: Anomalies Detection of Distributed TracesJorge Cardoso
Introduction to the field of AIOps. large-scale monitoring, and observability. Provides an example illustrating how Deep Learning can be used to analyze distributed traces to reveal exactly which component is causing a problem in microservice applications.
Presentation given at the National University of Ireland, Galway (NUI Galway)
on 2019.08.20.
Thanks to Prof. John Breslin
Cloud Reliability: Decreasing outage frequency using fault injectionJorge Cardoso
Invited Keynote at the 9th International Workshop on Software Engineering for Resilient Systems, September 4-5, 2017, Geneva, Switzerland
Title: Cloud Reliability: Decreasing outage frequency using fault injection
Abstract: In 2016, Google Cloud had 74 minutes of total downtime, Microsoft Azure had 270 minutes, and 108 minutes of downtime for Amazon Web Services (see cloudharmony.com). Reliability is one of the most important properties of a successful cloud platform. Several approaches can be explored to increase reliability ranging from automated replication, to live migration, and to formal system analysis. Another interesting approach is to use software fault injection to test a platform during prototyping, implementation and operation. Fault injection was popularized by Netflix and their Chaos Monkey fault-injection tool to test cloud applications. The main idea behind this technique is to inject failures in a controlled manner to guarantee the ability of a system to survive failures during operations. This talk will explain how fault injection can also be applied to detect vulnerabilities of OpenStack cloud platform and how to effectively and efficiently detect the damages caused by the faults injected.
Cloud Operations and Analytics: Improving Distributed Systems Reliability usi...Jorge Cardoso
Lecture given at the Technical University of Munich, 12 December 2016, on Cloud Operations and Analytics: Improving Distributed Systems Reliability using Fault Injection.
Presentation at the International Industry-Academia Workshop on Cloud Reliability and Resilience. 7-8 November 2016, Berlin, Germany.
Organized by EIT Digital and Huawei GRC, Germany.
Twitter: @CloudRR2016
Presentation at the International Industry-Academia Workshop on Cloud Reliability and Resilience. 7-8 November 2016, Berlin, Germany.
Organized by EIT Digital and Huawei GRC, Germany.
Twitter: @CloudRR2016
Gartner analyzed data centers for a period of 10 years and found that 47% of all problems were caused by cloud services outages. The duration of outages ranged between 40 minutes and five days. Ponemon Institute studied the financial impact and found that on average outages cost US$ 690.204, with an average downtime cost of US$ 6.828 per minute. These results are important due to the economic impact of unplanned outages on cloud operations which calls for higher platform reliability.
The first part of this talk will present the mechanisms that pioneers, such as Amazon, Google, and Netflix, have already developed to increase the reliability of their cloud platforms. The second part of the talk will describe how Huawei Research is exploring the use of fault-injection mechanisms to effectively increase the reliability of the Open Telekom Cloud platform from Deutsche Telekom.
For more than 10 years, research on service descriptions has mainly studied software-based services and provided languages such as WSDL, OWL-S, WSMO for SOAP, and hREST for REST. Nonetheless, recent developments from service management (e.g., ITIL and COBIT) and cloud computing (e.g. Software-as-a-Service) have brought new requirements to service descriptions languages: the need to also model business services and account for the multi-faceted nature of services. Business-orientation, co- creation, pricing, legal aspects, and security issues are all elements which must also be part of service descriptions. While ontologies such as e  service and e  value provided a first modeling attempt to capture a business perspective, concerns on how to contract services and the agreements entailed by a contract also need to be taken into account. This has for the most part been disregarded by the e-family of ontologies. In this paper, we review the evolution and provide an overview of Linked USDL, a comprehensive language which provides a (multi-faceted) description to enable the commercialization of (business and technical) services over the web.
Ten years of service research from a computer science perspectiveJorge Cardoso
…It has been more than 10 years since a strong research stream on services started from the field of computer science. The main trigger was without a doubt the introduction of the Web Service Description Language (WSDL), a specification to represent a piece of software functionally which could be remotely invoked. Nonetheless, this was only the “tipping point”. The generalized interest on this new development was followed by interesting topics of research on the application of semantics to enhance the description of services, the composition of services into processes, the analysis of the quality of services, the complexity of processes supporting services, and the development of comprehensive service description languages. This seminar will provide an overview of the main research topics around services and will glimpse at a new research field on the analysis of service networks...
Cloud Computing Automation: Integrating USDL and TOSCAJorge Cardoso
-- Presented at CAiSE 2013, Valencia, Spain --
Standardization efforts to simplify the management of cloud applications are being conducted in isolation. The objective of this paper is to investigate to which extend two promising specifications, USDL and TOSCA, can be integrated to automate the lifecycle of cloud applications. In our approach, we selected a commercial SaaS CRM platform, modeled it using the service description language USDL, modeled its cloud deployment using TOSCA, and constructed a prototypical platform to integrate service selection with deployment. Our evaluation indicates that a high level of integration is possible. We were able to fully automatize the remote deployment of a cloud service after it was selected by a customer in a marketplace. Architectural decisions emerged during the construction of the platform and were related to global service identification and access, multi-layer routing, and dynamic binding.
Understanding how services operate as part of large scale global networks, the related risks and gains of different network structures and their dynamics is becoming increasingly critical for society. Our vision and research agenda focuses on the particularly challenging task of building, analyzing, and reasoning about global service networks. This paper explains how Service Network Analysis (SNA) can be used to study and optimize the provisioning of complex services modeled as Open Semantic Service Networks (OSSN), a computer-understandable digital structure which represents connected and dependent services.
Open Semantic Service Networks: Modeling and AnalysisJorge Cardoso
A new interesting research area is the representation and analysis of the networked economy using Open Semantic Service Networks (OSSN). OSSN are represented using the service description language
USDL to model nodes and using the service relationship model OSSR to model edges. Nonetheless, in their current form USDL and OSSR do not provide constructs to capture the dynamic behavior of service networks. To bridge this gap, we used the General System Theory (GST) as a framework guiding the extension of USDL and OSSR to model dynamic OSSN. We evaluated the extensions made by applying USDL and OSSR to two distinct types of dynamic OSSN analysis: 1) evolutionary by using a Preferential Attachment (PA) and 2) analytical by using concepts from System Dynamics (SD). Results indicate that OSSN can constitute the rst stepping stones toward the analysis of global service-based economies.
Modeling Service Relationships for Service NetworksJorge Cardoso
The last decade has seen an increased interest in the study of networks in many fields of science. Examples are numerous, from sociology to biology, and to physical systems such as power grids. Nonetheless, the field of service networks has received less attention. Previous research has mainly tackled the modeling of single service systems and service compositions, often focusing only on studying temporal relationships between services. The objective of this paper is to propose a computational model to represent the various types of relationships which can be established between services systems to model service networks. This work acquires a particular importance since the study of service networks can bring new scientific discoveries on how service-based economies operate at a global scale.
Description and portability of cloud services with USDL and TOSCAJorge Cardoso
The provisioning and management of cloud services are major concerns since they bring clear benefits such as elasticity, flexibility, scalability, and high availability of applications for enterprises. Two emerging contributions set semantics and machine-understandable specifications for the description and portability of cloud-based services: USDL and TOSCA. In this talk we will explain how both can be articulated to work in conjunction. The Unified Service Description Language (USDL) was created for describing business or real world services to allow services to become tradable and consumable on marketplaces. On the other hand, the Topology and Orchestration Specification for Cloud Applications (TOSCA) was standardized to enable the portability of complex cloud applications and their management across different cloud providers.
To address the emerging importance of services and the relevance of relationships, we have developed and introduced the concept of Open Semantic Service Network (OSSN). OSSN are networks which relate services with the assumption that firms make the information of their services openly available using suitable models. Services, relationships and networks are said to be open (similar to LOD), when their models are transparently available and accessible by external entities and follow an open world assumption. Networks are said to be semantic when they explicitly describe their capabilities and usage, typically using a conceptual or domain model, and ideally using Semantic Web standards and techniques. One limitation of OSSNs is that they were conceived without accounting for the dynamic behavior of service networks. In other words, they can only capture static snapshots of service-based economies but do not include any mechanism to model reactions and effects that services have on other services and the notion of time
To address the emerging importance of services and the relevance of relationships, we have developed and introduced the concept of Open Semantic Service Network (OSSN). OSSN are networks which relate services with the assumption that firms make the information of their services openly available using suitable models. Services, relationships and networks are said to be open (similar to LOD), when their models are transparently available and accessible by external entities and follow an open world assumption. Networks are said to be semantic when they explicitly describe their capabilities and usage, typically using a conceptual or domain model, and ideally using Semantic Web standards and techniques. One limitation of OSSNs is that they were conceived without accounting for the dynamic behavior of service networks. In other words, they can only capture static snapshots of service-based economies but do not include any mechanism to model reactions and effects that services have on other services and the notion of time
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...Juraj Vysvader
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I didn't get rich from it but it did have 63K downloads (powered possible tens of thousands of websites).
Unleash Unlimited Potential with One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar
The European Union Agency for Law Enforcement Cooperation (Europol) has suffered an alleged data breach after a notorious threat actor claimed to have exfiltrated data from its systems. Infamous data leaker IntelBroker posted on the even more infamous BreachForums hacking forum, saying that Europol suffered a data breach this month.
The alleged breach affected Europol agencies CCSE, EC3, Europol Platform for Experts, Law Enforcement Forum, and SIRIUS. Infiltration of these entities can disrupt ongoing investigations and compromise sensitive intelligence shared among international law enforcement agencies.
However, this is neither the first nor the last activity of IntekBroker. We have compiled for you what happened in the last few days. To track such hacker activities on dark web sources like hacker forums, private Telegram channels, and other hidden platforms where cyber threats often originate, you can check SOCRadar’s Dark Web News.
Stay Informed on Threat Actors’ Activity on the Dark Web with SOCRadar!
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Globus
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data, and applying computations on a different system. As a part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on-demand, capable of applying many data reduction and data analysis to the large ESGF data archives, transferring only the resultant analysis (ex. visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Shahin Sheidaei
Games are powerful teaching tools, fostering hands-on engagement and fun. But they require careful consideration to succeed. Join me to explore factors in running and selecting games, ensuring they serve as effective teaching tools. Learn to maintain focus on learning objectives while playing, and how to measure the ROI of gaming in education. Discover strategies for pitching gaming to leadership. This session offers insights, tips, and examples for coaches, team leads, and enterprise leaders seeking to teach from simple to complex concepts.
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns
Unlocking Business Potential: Tailored Technology Solutions by Prosigns
Discover how Prosigns, a leading technology solutions provider, partners with businesses to drive innovation and success. Our presentation showcases our comprehensive range of services, including custom software development, web and mobile app development, AI & ML solutions, blockchain integration, DevOps services, and Microsoft Dynamics 365 support.
Custom Software Development: Prosigns specializes in creating bespoke software solutions that cater to your unique business needs. Our team of experts works closely with you to understand your requirements and deliver tailor-made software that enhances efficiency and drives growth.
Web and Mobile App Development: From responsive websites to intuitive mobile applications, Prosigns develops cutting-edge solutions that engage users and deliver seamless experiences across devices.
AI & ML Solutions: Harnessing the power of Artificial Intelligence and Machine Learning, Prosigns provides smart solutions that automate processes, provide valuable insights, and drive informed decision-making.
Blockchain Integration: Prosigns offers comprehensive blockchain solutions, including development, integration, and consulting services, enabling businesses to leverage blockchain technology for enhanced security, transparency, and efficiency.
DevOps Services: Prosigns' DevOps services streamline development and operations processes, ensuring faster and more reliable software delivery through automation and continuous integration.
Microsoft Dynamics 365 Support: Prosigns provides comprehensive support and maintenance services for Microsoft Dynamics 365, ensuring your system is always up-to-date, secure, and running smoothly.
Learn how our collaborative approach and dedication to excellence help businesses achieve their goals and stay ahead in today's digital landscape. From concept to deployment, Prosigns is your trusted partner for transforming ideas into reality and unlocking the full potential of your business.
Join us on a journey of innovation and growth. Let's partner for success with Prosigns.
May Marketo Masterclass, London MUG May 22 2024.pdfAdele Miller
Can't make Adobe Summit in Vegas? No sweat because the EMEA Marketo Engage Champions are coming to London to share their Summit sessions, insights and more!
This is a MUG with a twist you don't want to miss.
A Comprehensive Look at Generative AI in Retail App Testing.pdfkalichargn70th171
Traditional software testing methods are being challenged in retail, where customer expectations and technological advancements continually shape the landscape. Enter generative AI—a transformative subset of artificial intelligence technologies poised to revolutionize software testing.
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxrickgrimesss22
Discover the essential features to incorporate in your Winzo clone app to boost business growth, enhance user engagement, and drive revenue. Learn how to create a compelling gaming experience that stands out in the competitive market.
Globus Connect Server Deep Dive - GlobusWorld 2024Globus
We explore the Globus Connect Server (GCS) architecture and experiment with advanced configuration options and use cases. This content is targeted at system administrators who are familiar with GCS and currently operate—or are planning to operate—broader deployments at their institution.
Navigating the Metaverse: A Journey into Virtual Evolution"Donna Lenk
Join us for an exploration of the Metaverse's evolution, where innovation meets imagination. Discover new dimensions of virtual events, engage with thought-provoking discussions, and witness the transformative power of digital realms."
Enhancing Research Orchestration Capabilities at ORNL.pdfGlobus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
top nidhi software solution freedownloadvrstrong314
This presentation emphasizes the importance of data security and legal compliance for Nidhi companies in India. It highlights how online Nidhi software solutions, like Vector Nidhi Software, offer advanced features tailored to these needs. Key aspects include encryption, access controls, and audit trails to ensure data security. The software complies with regulatory guidelines from the MCA and RBI and adheres to Nidhi Rules, 2014. With customizable, user-friendly interfaces and real-time features, these Nidhi software solutions enhance efficiency, support growth, and provide exceptional member services. The presentation concludes with contact information for further inquiries.
Listen to the keynote address and hear about the latest developments from Rachana Ananthakrishnan and Ian Foster who review the updates to the Globus Platform and Service, and the relevance of Globus to the scientific community as an automation platform to accelerate scientific discovery.
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTier1 app
Even though at surface level ‘java.lang.OutOfMemoryError’ appears as one single error; underlyingly there are 9 types of OutOfMemoryError. Each type of OutOfMemoryError has different causes, diagnosis approaches and solutions. This session equips you with the knowledge, tools, and techniques needed to troubleshoot and conquer OutOfMemoryError in all its forms, ensuring smoother, more efficient Java applications.
How Recreation Management Software Can Streamline Your Operations.pptxwottaspaceseo
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Globus
Large Language Models (LLMs) are currently the center of attention in the tech world, particularly for their potential to advance research. In this presentation, we'll explore a straightforward and effective method for quickly initiating inference runs on supercomputers using the vLLM tool with Globus Compute, specifically on the Polaris system at ALCF. We'll begin by briefly discussing the popularity and applications of LLMs in various fields. Following this, we will introduce the vLLM tool, and explain how it integrates with Globus Compute to efficiently manage LLM operations on Polaris. Attendees will learn the practical aspects of setting up and remotely triggering LLMs from local machines, focusing on ease of use and efficiency. This talk is ideal for researchers and practitioners looking to leverage the power of LLMs in their work, offering a clear guide to harnessing supercomputing resources for quick and effective LLM inference.
Software Engineering, Software Consulting, Tech Lead.
Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Security,
Spring Transaction, Spring MVC,
Log4j, REST/SOAP WEB-SERVICES.
Accelerate Enterprise Software Engineering with PlatformlessWSO2
Key takeaways:
Challenges of building platforms and the benefits of platformless.
Key principles of platformless, including API-first, cloud-native middleware, platform engineering, and developer experience.
How Choreo enables the platformless experience.
How key concepts like application architecture, domain-driven design, zero trust, and cell-based architecture are inherently a part of Choreo.
Demo of an end-to-end app built and deployed on Choreo.
Accelerate Enterprise Software Engineering with Platformless
Mastering AIOps with Deep Learning
1. Dr. Jorge Cardoso
Huawei Munich Research Center &
University of Coimbra
--jorge.cardoso@huawei.com--
2018.08.28
Mastering AIOps
using Deep Learning, Time-Series Analysis,
and Distributed Tracing
iForesight iForesight is an intelligent new tool aimed at SRE cloud maintenance
teams. It enables them to quickly detect anomalies thanks to the use of
artificial intelligence when cloud service are slow or unresponsive.
DUSSELDORF, GERMANY
29–31 August
2018
2. 2
Abstract
Artificial Intelligence for IT Operations
In planet-scale deployments, the Operation and Maintenance (O&M) of cloud platforms cannot be done any longer manually
or simply with off-the-shelf solutions. It requires self-developed automated systems, ideally exploiting the use of AI to provide
tools for autonomous cloud operations. This talk will explain how deep learning, distributed traces, and time-series
analysis (sequence analysis) can be used to effectively detect anomalous cloud infrastructure behaviors during
operations to reduce the workload of human operators. The iForesight system is being used to evaluate this new O&M
approach. iForesight 2.0 is the result of 2 years of research with the goal to provide an intelligent new tool aimed at SRE
cloud maintenance teams. It enables them to quickly detect and predict anomalies thanks to the use of artificial intelligence
when cloud services are slow or unresponsive.
Dr. Jorge Cardoso is Chief Architect for Intelligent CloudOps at Huawei’s German Research Center in
Munich. Previously he worked for several major companies such as SAP Research (Germany) on the
Internet of Services and the Boeing Company in Seattle (USA) on Enterprise Application Integration.
He previously gave lectures at the Karlsruhe Institute of Technology (Germany), University of Georgia
(USA), University of Coimbra and University of Madeira (Portugal). He recently published his latest
book titled “Fundamentals of Service Systems” with Springer. His current research involves the
development of the next generation of Cloud Operations and Analytics using AI, Cloud Reliability and
Resilience, and High Performance Business Process Management systems. He has a Ph.D. in
Computer Science from the University of Georgia (USA).
Jorge Cardoso
GitHub | Slideshare.net | GoogleScholar
Positions in Industry
Interests Service Reliability Engineering, AIOps, Cloud Computing, Distributed Systems, Business Process Management
3. 3
AIOps
Artificial Intelligence for IT Operations
Data Source Connectors
Distribution Layer
Time Series Registry
Stability Analysis: Machine Learning and Analytics
Visualization, Exploration, Reporting
Real-time IngestionHistorical Data Storage
Stream ProcessingBatch Processing
Analytical Data Store
DevOpsBusiness Operators Developers
Pattern Matching Anomaly Detection
Structural Analysis RCA Event Causality
Graph Analysis
MetricsTracing Logging
Every year the management of IT Operations is more complex
► Increase in IT size, and event and alert volumes
► Digitalization with cloud, mobile, microservices
► Edge, IoT, serverless, …
To deal with this complexity, businesses are turning to AI to
automate incident management across production stacks,
including application, infrastructure, and monitoring tools.
Reference architecture for AIOps platformsSource: Gartner
4. 4
Objective
Detect Structural Changes of Distributed Traces
N
1
N
2
N3
N4
Distributed Traces [1]
► Scenario Service requests’ response time increased 50% when compared to 12h ago.
► Observation Distributed traces’ structure has changed for the past 30 minutes
► Root cause Traces show a 3*retry to service A, before calling service B
John Miller -> Trace ID = 23T874
SLO (response time) > 50%
keystone -> nova -> glance -> network
Server
uServices
im
age-api
im
age-registry
scheduler
netw
ork
com
pute
com
pute-api
rpc-broker
authentication
Logicaltime
30%
70%
Warning Y
Warning A
Warning B
Warning C
Error Z
A B C C B ZTrace t1
Sequence of spans
[1] B. H. Sigelman, L. A. Barroso, M. Burrows, P. Stephenson, M. Plakal, D. Beaver, S. Jaspan, C. Shanbhag, Dapper, a
Large-Scale Distributed Systems Tracing Infrastructure, 2010, Google, Inc.
5. 5
Objective
Detect Structural Changes of Distributed Traces
alb5g1jksao98aj8ik5e
w21
ab05g1jksao98ajkk5ew
21
ab05g1jksalm8aj8ik5e
w21
ab05g1jksao98aj8i---
---
host list
host show
hypervisor list
hypervisor show
hypervisor stats show
image add project
image create
image delete
image list
image remove project
image save
image set
image show
ip fixed add
flavor create
flavor delete
flavor list
flavor set
flavor show
flavor unset
…
Distributed
Traces
Detect
Normal/Anomalous
Traces
Encoding
alb5g1jksao98aj8ik5e
w21
ab05g1jksao98ajkk5ew
21
ab05g1jksalm8aj8ik5e
w21
ab05g1jksao98aj8i---
---
User
Commands
6. 6
Deep Learning
Long Short Term Memory
Long Short Term Memory (LSTM)
LSTMs [1] are models which capture sequential data by using an internal memory. They are very good for analyzing
sequences of values and predicting the next point of a given time series. This makes LSTMs adequate for Machine
Learning problems that involve sequential data (see [3]) such speech recognition, machine translation, visual
recognition and description, and distributed traces analysis.
Sequential processing using LSTM [2]
[1] Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9.8 (1997): 1735-1780.
[2] Sequential processing in LSTM (from: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
[3] LSTM model description (from Andrej Karpathy. http://karpathy.github.io/2015/05/21/rnn-effectiveness/
A B C C B ZTrace t1
Sequence of spans
7. 7
Structure Formalization
Trace and Span Representation
► Training
▪ Collect traces
▪ Parse and abstract spans
▪ Feed span lists to deep neural network
▪ Train the network with traces
► Operation
▪ Collect real-time traces
▪ Parse and abstract spans
▪ Feed span lists to deep neural network
▪ Retrieve probability matrix
▪ Analyze matrix to decide on the severity of the
anomaly
91 12 43 76 24
04 93 63 25 35
38 18 44 24 61
Trace t1
Trace t2
Trace tn
Trace sampledSequence of spans
91 12 43 76 24
Training
Operation
Predicted reconstruction
91 12 43 76 24
We analyze historical traces and collect the spans generated to create a structure containing span types
(symbols) using label encoding. The network is training with the encoded traces. During operation, an
LSTM network takes as input a sequence of symbols representing a new real-time trace, e.g., 1: “72”, 2:
“83”, 3: “54”, …n: “23”. The output is a matrix with the probability of each symbol to belong to a learned
structure. The analysis of the probability matrix enables to reason on if a trace should be mark are normal
or anomalous.
In this experiment, we processed >1M spans and >100K traces
Partial view of the spans collected
8. 8
Overall Approach
From Encoding to Detection
Keras and Tensorflow
● LSTM network
● dropout layer of 0.2 (high, but necessary to avoid quick
divergence)
● Softmax activation to generate probabilities over the different
categories
● Because it is a classification problem, loss calculation
via categorical cross-entropy compares the output
probabilities against one-one encoding
● ADAM optimizer (instead of the classical stochastic gradient
descent) to update weights
Encoding scheme
● The encoding scheme pads each sequence of inputs with zeros, up to a pre-defined maximum length.
● This allows to pre-allocate a chain of LSTM units of a specific length
● We also pass information about the sequence lengths. This is important for not treating the zero padding as actual
inputs, and from injecting the error signal at the right unit in the sequence during back propagation.
Hot encoding
● A one hot encoding enables to represent categorical variables as binary vectors.
● Categorical values are mapped to integer values.
● Each integer value is represented as a binary vector that is all zero values except the index of the integer, which is
marked with a 1.
Tensorflow tutorial: https://github.com/campdav/text-rnn-tensorflow
Keras tutorial: https://github.com/campdav/text-rnn-keras
Normal (green) and Anomalous (red) Traces Detected
Encoded Traces
Raw Span Events
LSTM Network
Network Training
1
2
3
4
9. 9
Distributed Trace Anomaly Detection
Results
Goal: Classify distributed traces as normal or abnormal
► Benefits
▪ Very high accuracy in detection: 99.73%
▪ Extremely fast detection: O(n)
▪ The technique outperforms existing approaches
e.g., sequence matching, petri nets, clustering
► Limitations
▪ Requires a special (not trivial) handling when using recurrent neural networks, like LSTMs
▪ Long traces may require very long training times
e.g., back propagation across long sequences may result in vanishing gradients...
► Improvements
▪ Truncate traces (remove steps from the beginning or the end)?
e.g., Lower the accuracy of predictions
▪ Summarize traces (remove well-known subtraces)?
e.g., focus on most relevant parts of a trace while reducing their length
▪ Evaluation the prediction of remaining time to completion (see [Tax, 2017])
▪ Operationalize LSTM with other sequential data (see [Du, 2017])
[Tax, 2017] Predictive business process monitoring with LSTM neural networks, Tax, Niek and Verenich, Ilya and La Rosa, Marcello and Dumas,
Marlon, Proceedings of the 29th International Conference on Advanced Information Systems Engineering, 2017, 477--492, Springer.
[Du, 2017] M. Du, F. Li, G. Zheng, and V. Srikumar. 2017. DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning. In
Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security (CCS '17). ACM, 1285-1298.