Rave is an Apache Incubator project that provides tools for building science gateways using open standards. It allows the creation of a downloadable portal with minimal configuration and gives developers ways to customize and extend the portal. Rave uses a model-view-controller architecture and is implemented in JavaScript and Java, with components such as user management, widgets, and configuration files that developers can modify.
Overview of Indiana University's Advanced Science Gateway support activities for drug discovery, computational chemistry, and other Web portals. For a broader overview of the OGCE project, see http://www.collab-ogce.org/ogce/index.php
Python in the Hadoop Ecosystem (Rock Health presentation) - Uri Laserson
A presentation covering the use of Python frameworks in the Hadoop ecosystem. It covers, in particular, Hadoop Streaming, mrjob, luigi, PySpark, and using Numba with Impala.
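Hadoop Streaming, mentioned above, runs any executable that reads lines on stdin and writes key/tab/value lines on stdout. A minimal word-count mapper/reducer pair might look like the following sketch (function names and the two-phase CLI convention are illustrative, not from the deck):

```python
import sys
from itertools import groupby

def mapper(lines):
    # Emit "word<TAB>1" for every word; Hadoop sorts these by key in the shuffle.
    for line in lines:
        for word in line.strip().split():
            yield f"{word}\t1"

def reducer(pairs):
    # Input arrives sorted by key after the shuffle; sum the counts per word.
    keyed = (pair.split("\t") for pair in pairs)
    for word, group in groupby(keyed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"

if __name__ == "__main__":
    # Run as either phase:  cat input | script map | sort | script reduce
    phase = sys.argv[1] if len(sys.argv) > 1 else "map"
    step = mapper if phase == "map" else reducer
    for out in step(sys.stdin):
        print(out)
```

The same pair of scripts would be passed to `hadoop jar hadoop-streaming.jar` as `-mapper` and `-reducer`; piping through `sort` locally mimics the shuffle.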
Opal: Simple Web Services Wrappers for Scientific Applications - Sriram Krishnan
Grid-based infrastructure enables large-scale scientific applications to run on distributed resources and to be coupled in innovative ways. In practice, however, grid resources are not easy to use for end users, who must learn how to generate security credentials, stage inputs and outputs, access grid-based schedulers, and install complex client software. There is a pressing need to provide transparent access to these resources so that end users are shielded from the complicated details and free to concentrate on their domain science. Scientific applications wrapped as Web services alleviate some of these problems by hiding the complexities of the back-end security and computational infrastructure, exposing only a simple SOAP API that application-specific user interfaces can access programmatically. However, writing the application services that access grid resources can be quite complicated, especially if the work has to be replicated for every application. In this presentation, we present Opal, a toolkit for wrapping scientific applications as Web services in a matter of hours, providing features such as scheduling, standards-based grid security, and data management in an easy-to-use and configurable manner.
Data Platform at Twitter: Enabling Real-time & Batch Analytics at Scale - Sriram Krishnan
The Data Platform at Twitter supports engineers and data scientists running batch jobs on Hadoop clusters of several thousand nodes, and real-time jobs on top of systems such as Storm. In this presentation, I discuss the overall Data Platform stack at Twitter. In particular, I talk about enabling real-time and batch analytics at scale with the help of Scalding, a Scala DSL for batch jobs using MapReduce; Summingbird, a framework for combined real-time and batch processing; and Tsar, a framework for real-time time-series aggregations.
Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ... - Databricks
With the continued success of deep learning techniques, there’s been a rapid growth in applications for perception in many modalities, such as image classification, object detection and speech recognition. In response, Intel’s BigDL is an open source distributed deep learning framework for Apache Spark that includes rich deep learning support and Intel Math Kernel Library acceleration, allowing users to quickly develop deep learning applications with extremely high performance on their existing Hadoop ecosystems.
This session will explore several key deep learning applications that Intel successfully built on top of Apache Spark with BigDL. Hear about the technologies they developed and what they learned from building such applications, including: the tool stack and design considerations; an application for image recognition and object detection (Faster R-CNN using VGG and PVANET); and an application for speech recognition with Deep Speech and acoustic feature transformers. The speakers will also share insights and experience Intel gained while building a unified data analytics platform with Apache Spark MLlib and BigDL.
Accelerating TensorFlow with RDMA for high-performance deep learning - DataWorks Summit
Google’s TensorFlow is one of the most popular deep learning (DL) frameworks. In distributed TensorFlow, gradient updates are a critical step governing the total model training time. These updates incur a massive volume of data transfer over the network.
In this talk, we first present a thorough analysis of the communication patterns in distributed TensorFlow. Then we propose a unified way of achieving high performance through enhancing the gRPC runtime with Remote Direct Memory Access (RDMA) technology on InfiniBand and RoCE. Through our proposed RDMA-gRPC design, TensorFlow only needs to run over the gRPC channel and gets the optimal performance. Our design includes advanced features such as message pipelining, message coalescing, zero-copy transmission, etc. The performance evaluations show that our proposed design can significantly speed up gRPC throughput by up to 1.5x compared to the default gRPC design. By integrating our RDMA-gRPC with TensorFlow, we are able to achieve up to 35% performance improvement for TensorFlow training with CNN models.
Speakers
Dhabaleswar K (DK) Panda, Professor and University Distinguished Scholar, The Ohio State University
Xiaoyi Lu, Research Scientist, The Ohio State University
QCon São Paulo: Real-Time Analytics with Spark Streaming - Paco Nathan
"Real-Time Analytics with Spark Streaming" presented at QCon São Paulo, 2015-03-26
http://qconsp.com/presentation/real-time-analytics-spark-streaming
This talk presents an overview of Spark and its history and applications, then focuses on the Spark Streaming component used for real-time analytics. We compare it with earlier frameworks such as MillWheel and Storm, and explore industry motivations for open-source micro-batch streaming at scale.
The talk includes demos of streaming apps with machine-learning examples. We also consider public case studies of production deployments at scale.
We’ll review the use of open-source sketch algorithms and probabilistic data structures that get leveraged in streaming – for example, the trade-off of 4% error bounds on real-time metrics for two orders of magnitude reduction in required memory footprint of a Spark app.
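The memory-for-accuracy trade-off mentioned above can be illustrated with a count-min sketch, one of the probabilistic data structures commonly used in streaming. This standalone Python sketch is illustrative (parameter values are assumptions), not code from the talk:

```python
import hashlib

class CountMinSketch:
    """Approximate frequency counts in fixed memory; only ever overestimates."""

    def __init__(self, width=272, depth=5):
        # width ~ e/epsilon sets the error bound; depth sets the confidence.
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _buckets(self, item):
        # One hashed bucket per row; salted so the rows are independent.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode()).hexdigest()
            yield row, int(digest, 16) % self.width

    def add(self, item):
        for row, col in self._buckets(item):
            self.table[row][col] += 1

    def count(self, item):
        # True count <= estimate; overshoot is bounded by epsilon * total adds.
        return min(self.table[row][col] for row, col in self._buckets(item))
```

A sketch of this size occupies a few kilobytes no matter how many events flow through it, which is the two-orders-of-magnitude memory saving traded against a small bounded error.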
Apache Cassandra is the leading distributed database in use at thousands of sites with the world’s most demanding scalability and availability requirements. Apache Spark is a distributed data analytics computing framework that has gained a lot of traction in processing large amounts of data in an efficient and user-friendly manner. The joining of both provides a powerful combination of real-time data collection with analytics. After a brief overview of Cassandra and Spark, this class will dive into various aspects of the integration.
Bringing it All Together: Apache Metron (Incubating) as a Case Study of a Mod... - DataWorks Summit
There have been many voices discussing how to architect streaming applications on Hadoop, but until now there have been very few worked examples in open source. Apache Metron (Incubating) is a streaming advanced-analytics cybersecurity application that uses the components of the Hadoop stack as its platform.
We will go beyond theoretical discussions of Kappa vs. Lambda architectures and describe the nuts and bolts of a streaming architecture that enables advanced analytics in Hadoop: which components we had to build and which we could reuse, why we made the architectural decisions we did, and how they knit together into a coherent application on top of many different Hadoop ecosystem projects.
We will also discuss the domain-specific language we created out of necessity to provide a pluggable layer for user-defined enrichments, how this made Metron less rigid and easier to use, and, candidly, the mistakes we made early on.
Bringing complex event processing to Spark Streaming - DataWorks Summit
Complex event processing (CEP) is about identifying business opportunities and threats in real time by detecting patterns in data and taking appropriate automated action. Example business use cases for CEP include location-based marketing, smart inventories, targeted ads, Wi-Fi offloading, fraud detection, churn prediction, fleet management, predictive maintenance, security incident event management, and many more. While Spark Streaming provides a distributed, resilient framework for ingesting events in real time, effort is still needed to build CEP applications, because CEP use cases require correlation of events, which in turn requires treating every incoming event as a discrete occurrence in time; Spark Streaming treats the entire batch of events as a single occurrence. Many CEP use cases also require alerts to be fired even when there is no incoming event, for example firing an alert when an order-shipped event is NOT received within the SLA window following an order-received event. At Oracle we have adopted a few techniques, such as running continuous query engines as long-running tasks and using empty batches as triggers, to bring complex event processing to Spark Streaming.
Join us to learn more about CEP for Spark, the fastest growing data processing platform in the world.
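The SLA example in the abstract (alerting on the *absence* of an event) can be sketched outside Spark as a small state machine driven once per micro-batch, including empty ones. Names and the 10-second SLA below are illustrative assumptions, not Oracle's implementation:

```python
# Track orders awaiting shipment; firing on the absence of an event requires
# evaluating state on every batch tick, even when the batch is empty.
SLA_SECONDS = 10  # illustrative SLA window

def process_batch(events, pending, now):
    """events: list of (kind, order_id); pending: order_id -> time received."""
    alerts = []
    for kind, order_id in events:
        if kind == "order-received":
            pending[order_id] = now
        elif kind == "order-shipped":
            pending.pop(order_id, None)
    # This check runs unconditionally, so an empty batch still fires alerts;
    # that is the role the "empty batches as triggers" technique plays.
    for order_id, received_at in list(pending.items()):
        if now - received_at > SLA_SECONDS:
            alerts.append(order_id)
            del pending[order_id]
    return alerts
```

In Spark Streaming the same shape appears as keyed state updated per batch, with the timeout scan happening even for batches that carry no events.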
Speakers
Prabhu Thukkaram, Senior Director, Product Development, Oracle
Hoyong Park, Architect, Oracle
Monitoring at scale: Sensu, Kafka, Kafka Connect, Cassandra, PrestoDB - Leandro Totino Pereira
Open-source and commercial monitoring tools often don't scale well (poller, proxy, and databases).
These systems often rely on a SQL database, which doesn't scale well (sharding, master/slave, no time-series database for metrics).
They are hard to customize for our needs (integrations with other systems and dashboards).
They don't provide any queue layer to avoid overload.
Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca... - Databricks
The speaker will review case studies from real-world projects that built AI systems using Natural Language Processing (NLP) in healthcare. These case studies cover projects that deployed automated patient risk prediction, automated diagnosis, clinical guidelines, and revenue cycle optimization.
Sherlock: an anomaly detection service on top of Druid - DataWorks Summit
Sherlock is an anomaly detection service built on top of Druid. It leverages EGADS (Extensible Generic Anomaly Detection System; github.com/yahoo/egads) to detect anomalies in time-series data. Users can schedule jobs on an hourly, daily, weekly, or monthly basis, view anomaly reports from Sherlock's interface, or receive them via email.
Sherlock has four major components: time-series generation, EGADS anomaly detection, a Redis backend, and a Spark Java UI. Time-series generation involves building, validating, and issuing the Druid query and parsing its response; the parsed response is then fed to the EGADS anomaly detection component, which detects anomalies and generates a report for each input time series. Sherlock uses the Redis backend to store job metadata, generated anomaly reports, and a persistent job queue for scheduling; users can run either clustered or standalone Redis. The user interface, built with Spark Java, lets users submit instant anomaly analyses, create and launch detection jobs, and view anomalies on a heatmap and on a graph.
Jigarkumar Patel, Software Development Engineer I, Oath Inc., and David Servose, Software Systems Engineer, Oath
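To make the detection step concrete, here is a deliberately simplified stand-in for what a library like EGADS does with proper time-series models (trend, seasonality, forecasting): a plain z-score rule over a series. This is an illustrative sketch, not Sherlock's or EGADS's algorithm:

```python
from statistics import mean, stdev

def zscore_anomalies(series, threshold=3.0):
    """Return indices of points more than `threshold` standard
    deviations from the series mean."""
    mu, sigma = mean(series), stdev(series)
    if sigma == 0:
        return []  # constant series: nothing can be anomalous
    return [i for i, x in enumerate(series)
            if abs(x - mu) / sigma > threshold]
```

A real detector would model expected values per timestamp and score deviations from the forecast rather than from a global mean, but the report it produces per time series has the same shape: which points look anomalous, and by how much.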
Don't Let the Spark Burn Your House: Perspectives on Securing Spark - DataWorks Summit
Apache Spark is emerging as a key enabler for various enterprise use cases including customer intelligence applications, data warehousing, real-time streaming, recommendation engines, and log processing. Even the most common use case for Spark, business intelligence (BI) or customer intelligence applications via data science, encompasses the complete data worker lifecycle from file processing, workflows, cleansing, enrichment, model building, and deployments to dashboarding and reporting. However, many aspects of security and governance with Spark are still emerging and pose challenges to enterprise adoption, including authorization, authentication, and comprehensive auditing as well as metadata harvesting and governance. We will demonstrate examples of the current state of the art in open source approaches to Spark security and governance. For example, we will show how Spark technologies can be integrated with enterprise identity providers, how fine-grained access control can be enabled for processes, and how process metadata can be harvested while providing detailed audits. We will also provide best practices and common usage patterns for securing your Spark clusters and for supporting enterprise compliance and governance needs when using Spark.
Large Infrastructure Monitoring At CERN by Matthias Braeger at Big Data Spain... - Big Data Spain
Session presented at Big Data Spain 2015 Conference
15th Oct 2015
Kinépolis Madrid
http://www.bigdataspain.org
Event promoted by: http://www.paradigmadigital.com
Abstract: http://www.bigdataspain.org/program/thu/slot-7.html
GDPR compliance application architecture and implementation using Hadoop and ... - DataWorks Summit
The General Data Protection Regulation (GDPR) is legislation designed to protect the personal data of European Union citizens and residents. The main requirement is to log personal-data accesses and changes in customer-specific applications. These logs can then be audited by the owning entities to provide reporting to end users indicating usage of their personal data. Users have the "right to be forgotten," meaning their personal data can be purged from the system at their request. The regulation goes into effect on May 25, 2018, with significant fines for non-compliance.
This session will provide insight on how to approach and implement a GDPR compliance solution using Hadoop and streaming for any enterprise with heavy data volumes. It will delve into deployment strategies, the chosen architecture (Kafka, NiFi, and Hive ACID with streaming), implementation best practices, configurations, and security requirements. Hortonworks Professional Services System Architects helped the customer on the ground to design, implement, and deploy this application in production.
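Two of the GDPR mechanics the abstract names, logging every personal-data access and purging a user's data on request, can be sketched with in-memory stand-ins for the Kafka/Hive ACID pieces the session builds on. Class and method names here are hypothetical illustrations:

```python
class PersonalDataStore:
    """Toy model of audited personal-data storage with right-to-be-forgotten."""

    def __init__(self):
        self.records = {}     # user_id -> personal data
        self.audit_log = []   # append-only access log (in Kafka/Hive in practice)

    def read(self, user_id, accessor):
        self.audit_log.append(("read", user_id, accessor))
        return self.records.get(user_id)

    def write(self, user_id, data, accessor):
        self.audit_log.append(("write", user_id, accessor))
        self.records[user_id] = data

    def forget(self, user_id):
        # Purge the personal data itself; the log of *events* remains,
        # so the erasure is itself auditable.
        self.audit_log.append(("forget", user_id, "system"))
        self.records.pop(user_id, None)
```

The hard production problems, purging from immutable files and doing all of this at streaming volume, are exactly why the session reaches for Hive ACID and Kafka rather than a dictionary.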
Speaker
Saurabh Mishra, Hortonworks, Systems Architect
Arun Thangamani, Hortonworks, Systems Architect
Open Source Search Tools for www2010 conference - Ted Drake
Presentation by Ted Drake and Rosie Jones for the www2010 conference in North Carolina. It discusses open source search software, APIs, and trends.
Model-Driven Development of Semantic Mashup Applications with the Open-Source... - InfoGrid.org
Talk at Enterprise Data World 2010 in San Francisco.
Outlines the difficulties in developing enterprise mash-up applications that aggregate data semantically and in real time, and gives an overview of how the InfoGrid internet graph database can help.
Essential Java Libraries Every Developer Should Know About - Inexture Solutions
Whether you're new to Java or a seasoned developer, these libraries will help improve your workflow and make your coding life easier. Don't miss out on this knowledge!
What is Maven? Maven is an automation and management tool developed by the Apache Software Foundation and initially released on 13 July 2004. In Yiddish, "maven" means "accumulator of knowledge." Maven is a project management and comprehension tool that provides developers with a complete build-lifecycle framework.
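A Maven project is driven by a `pom.xml` descriptor; a minimal sketch looks like the following (the group/artifact coordinates are placeholder values, not from any real project):

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <!-- Coordinates identifying this artifact (placeholder values) -->
  <groupId>com.example</groupId>
  <artifactId>demo-app</artifactId>
  <version>1.0.0</version>
  <dependencies>
    <!-- Dependencies are declared by coordinates and fetched from repositories -->
  </dependencies>
</project>
```

With this file in place, running `mvn package` walks the default build lifecycle (validate, compile, test, package, and so on) without any further scripting, which is what "complete build-lifecycle framework" refers to.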
Parse recently announced that they are retiring their mobile development service, and their current customers will have until January 28, 2017 to move their apps to alternative services. To help you make the transition, AWS is working together with Parse to provide a migration path to AWS. AWS provides a range of services for building, testing, and monitoring mobile apps. In this session, we will introduce you to AWS mobile services, and take you through the steps required to migrate your mobile apps from Parse to AWS through the following use cases:
Parse Core to AWS
Parse Push to Amazon SNS
AWS DevDay San Francisco, June 21, 2016.
Presenters: Tara Walker, Technical Evangelist, Bob Wall, CTO/Founder, Washio
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti... - Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
UiPath Test Automation using UiPath Test Suite series, part 3 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
2. Apache Rave Overview
• Rave is an Apache incubator project for building a Web portal on the OpenSocial and W3C Widget specifications.
  – A joint effort of MITRE, Hippo Software, SURFnet, and the OGCE project.
  – Replaces the OGCE Gadget Container.
• Goal 1: Provide a usable, packaged, downloadable OpenSocial portal.
  – Get started with minimal hassle.
• Goal 2: Provide a platform for non-invasive developer extensions and customizations.
  – Science gateways, for example.
5. Rave Building Blocks
• Rave is implemented in JavaScript and Java with Spring MVC.
  – Bean initialization is specified in XML configuration files.
  – Inversion of Control makes it easy to swap out implementations.
  – Disciplined MVC through Java annotations.
• Builds on Apache Shindig and Wookie.
  – These provide layout management, user management, administration tools, production backend data systems, etc.
6. Rave Components
• Models: User, Page, Region, RegionWidget. These are interfaces with default implementations.
• Controllers: Associate a specific URL with backing code to render JSP views or provide access to REST and RPC services.
• Services: Internal services that implement a specific action, such as adding a new user to the repository.
• Repositories: Control object-relational mappings between model objects and backend storage.
• Views: User interfaces implemented as JSPs, including welcome pages, layout managers for both standard and mobile views, administration pages, and widget store pages.
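The "interfaces with default implementations" model pattern described above can be sketched in plain Java. The names `User` and `DefaultUser` below are illustrative stand-ins, not Rave's actual model classes:

```java
// Illustrative sketch of the model pattern: an interface plus a default
// implementation that a gateway can swap out. User and DefaultUser are
// hypothetical names, not Rave's real model types.
interface User {
    String getUsername();
    void setUsername(String username);
}

class DefaultUser implements User {
    private String username;

    DefaultUser(String username) {
        this.username = username;
    }

    @Override
    public String getUsername() {
        return username;
    }

    @Override
    public void setUsername(String username) {
        this.username = username;
    }
}
```

Because callers program against the interface, a gateway can substitute its own implementation without changing any code that consumes the model.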
7. Rave Configuration Files
• applicationContext.xml: Instantiates all beans, controllers, and services.
  – Developer modifications: add new Java beans here to support extensions.
• applicationContext-security.xml: Specifies allowed URL patterns, enables OpenID support, and specifies the authentication provider.
  – Developer modifications: change the default authentication module or expose additional REST services.
• dataContext.xml: Sets up the default H2 database and populates it with demo accounts.
  – Developer modifications: override the default data store and initial population methods.
8. Extending Rave
• Rave is designed to be extended.
  – Good design (interfaces, easily pluggable implementations) and code organization are required.
  – It helps to have a diverse, distributed developer community.
    • How can you work on it if we can't?
• Rave is also packaged so that you can extend it without touching the source tree.
9. Rave Developer Dependencies
• rave-portal-dependencies: Maven POM file listing all Rave-produced JARs and third-party dependencies.
• rave-portal-resources: Java WAR file containing all Rave Web resources.
• rave-shindig: Java WAR file containing Rave modifications and extensions to Apache Shindig.
10. Rave Extension General Steps
• Download and install Rave's source.
  – "mvn clean install" puts JARs, WARs, and POMs into your local Apache Maven repository.
• Create a new Apache Maven project.
  – You'll need the rave-portal-dependencies POM in your <dependencies/>.
  – Include any configuration files that you would like to modify.
  – Include the source code for your extensions.
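The extension project's POM would then declare the Rave dependency. A minimal sketch, in which the groupId, artifactId coordinates, and version number are illustrative assumptions rather than Rave's published values:

```xml
<!-- Sketch of an extension project's pom.xml; the coordinates and
     version shown here are illustrative, not authoritative. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.example.gateway</groupId>
  <artifactId>my-rave-portal</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>war</packaging>

  <dependencies>
    <!-- The rave-portal-dependencies POM installed by "mvn clean install" -->
    <dependency>
      <groupId>org.apache.rave</groupId>
      <artifactId>rave-portal-dependencies</artifactId>
      <version>0.1-incubating-SNAPSHOT</version>
      <type>pom</type>
    </dependency>
  </dependencies>
</project>
```

Declaring the dependency with `<type>pom</type>` pulls in the whole dependency list transitively, so the extension project does not have to enumerate Rave's JARs itself.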
11. Case Study: GridShib and Community Credentials
• XSEDE Science Gateways use shared community credentials when accessing backend resources.
  – Many portal users map to one community account.
• GridShib adds attributes to grid credentials:
  – Gateway membership, originating IP address, user email, creation time, etc.
• For Rave, we'll have to change the User service implementation to support this.
12. GridShib Step By Step
• Install Rave in your Maven repository.
• Create a Maven project with the standard directory layout for WAR packaging.
• Create a new user service (ComUserService) for obtaining a community credential and adding GridShib attributes.
• Replace applicationContext-security.xml with your version.
• In the XML, replace the default UserService with ComUserService.
• Place all GridShib resources in src/main/resources.
• Place web.xml in src/main/webapp/WEB-INF.
  – You'll need an additional listener to get the IP address.
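Swapping in the new service is an ordinary Spring bean replacement inside applicationContext-security.xml. A hedged sketch, in which the bean id and class names are hypothetical examples, not Rave's actual configuration:

```xml
<!-- Sketch of overriding the default user service bean. The bean id
     "userService" and both class names are illustrative assumptions. -->
<beans xmlns="http://www.springframework.org/schema/beans">
  <!-- Default definition, removed in the customized file:
  <bean id="userService"
        class="org.apache.rave.portal.service.impl.DefaultUserService"/>
  -->
  <!-- Replacement that obtains a community credential and
       attaches GridShib attributes -->
  <bean id="userService"
        class="org.example.gateway.ComUserService"/>
</beans>
```

Because the rest of Rave resolves the service through the Spring context by id and interface, no consuming code needs to change when the implementation is swapped.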
13. GridShib and Rave Postmortem
• The full example is available from the Rave sandbox SVN.
• It also includes examples of how to build new REST services.
• GridShib's dependence on XML library JARs is a challenge for one-step packaging.
  – These must be in an endorsed directory.
14. The Apache Way
• Apache is an open community, not just open-source licensing or code on the Web.
• Projects start as incubators with one champion and several mentors.
  – Making good choices here is very important.
• The champion and mentors will judge you, and help you, on the following:
  • Good, open engineering practices
    – DEV mailing list design discussions, issue tracking
  • Properly packaged code
    – Builds out of the box
    – Licenses, disclaimers, notices, change logs, etc.
  • Developer diversity
    – Three or more unconnected developers
15. Apache and Science Gateways
• Apache rewards projects for cross-pollination.
  – Connecting with complementary Apache projects strengthens both sides.
  – New requirements, new development methods.
• Apache methods foster sustainability.
  – Building communities of developers, not just users.
• Apache methods provide governance.
  – Incubators learn best practices from mentors.
  – Releases are peer-reviewed.
16. More Information
• As an Apache incubator project, Rave welcomes (and needs) new developer involvement.
• Rave Web Site:
• Rave Developer List (public):
• Rave includes contributions from many individuals. See http://incubator.apache.org/rave/ for a list of champions, mentors, and contributors.