Eberhard Wolff is an architecture and technology manager at adesso, a leading IT consultancy in Germany. In this talk he discusses how cloud computing differs from traditional enterprise computing: many inexpensive instances connected by an unreliable network. To build reliable systems on that foundation, applications need to be stateless, easy to scale, and able to handle failure. He uses Spring Biking, an example e-commerce site for customized bikes, to illustrate an architecture that achieves reliability in the cloud through stateless front ends, elastic scaling, and database replication.
2. About me
• Eberhard Wolff
• Architecture & Technology Manager at adesso
• adesso is a leading IT consultancy in Germany
• Speaker
• Author (e.g. first German Spring book)
• Blog: http://ewolff.com
• Twitter: @ewolff
• http://slideshare.com/ewolff
• eberhard.wolff@adesso.de
6. How is Cloud Different?
• Can easily and cheaply add new resources
  – Prefer starting new instances over highly available instances
  – Prefer adding instances over using a more powerful instance
  – Might end up with lots of instances
• Prefer dealing with failure over providing a highly available network
• So basically lots of non-powerful instances with an unreliable network
• How can you end up with a reliable system then?
7. Enter Spring Biking!
• The revolutionary web site to create customized bikes!
• We got a few million € Venture Capital
• We need...
  – Catalog of all Mountain Bike parts and bikes
  – System to configure custom Mountain Bikes
  – Order system
• Cloud good idea
  – No CapEx
  – Rapid elasticity -> easy to grow
• Focusing on German market
8. Spring Biking: Architecture
[Diagram: Application (Order, Configuration, Catalog) on top of a Database]
• Standard Enterprise Architecture
• Relational database
9. Spring Biking: Architecture
[Same diagram: Application (Order, Configuration, Catalog) on top of a Database]
• Standard Enterprise Architecture
• Relational database
• "Wait, didn’t you say it should run in the Cloud?"
10. How Spring Biking Deals with Cloud Challenges
[Diagram: Application (Order, Configuration, Catalog)]
• No state on the web tier
  – i.e. no session
  – State stored in database
• Easy to automatically start new instances if load increases
• Every PaaS should deal with elastic scaling
• Example: Amazon Elastic Beanstalk
  – Takes a standard Java WAR
  – Deploys it
  – Adds elastic scaling
• Could build something similar yourself with an IaaS
  – Automated deployment
  – Elastic scaling and load balancing available from Amazon IaaS offerings
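How the stateless web tier can look in code: a minimal sketch, assuming Spring MVC. BikeConfiguration and BikeConfigurationRepository are hypothetical stand-ins for the application's persistence layer, not something shown in the talk.

```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.ResponseBody;

// Hypothetical domain class and repository - stand-ins for whatever
// persistence layer (e.g. JPA on Amazon RDS) the application uses.
class BikeConfiguration { /* id, parts, ... */ }

interface BikeConfigurationRepository {
    BikeConfiguration findById(long id);
    BikeConfiguration save(BikeConfiguration configuration);
}

@Controller
public class ConfigurationController {

    private final BikeConfigurationRepository repository;

    @Autowired
    public ConfigurationController(BikeConfigurationRepository repository) {
        this.repository = repository;
    }

    // Every request reads its state from the database - no HttpSession,
    // so any instance behind the load balancer can answer it.
    @RequestMapping(value = "/configuration/{id}", method = RequestMethod.GET)
    @ResponseBody
    public BikeConfiguration get(@PathVariable("id") long id) {
        return repository.findById(id);
    }

    // Writes go straight to the database; nothing is kept in memory,
    // so instances can be started and stopped at will.
    @RequestMapping(value = "/configuration", method = RequestMethod.POST)
    @ResponseBody
    public BikeConfiguration save(@RequestBody BikeConfiguration configuration) {
        return repository.save(configuration);
    }
}
```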
11. How Spring Biking Deals with Cloud Challenges
• Relational database fine for now
  – Example: Amazon RDS (Relational Database Service)
  – MySQL and Oracle Database
  – MySQL: multi-data-center replication
  – Can deal with failure of one data center
• Add Content Delivery Network (CDN)
  – Not too many PaaS in Europe or Germany
  – Latency affects revenue
    • Every 100 ms of latency costs Amazon 1% of revenue
  – So add CDN to lower latency
12. Benefits for the Development Process
• Trivial to get a new version out
• Easy to create a production-like environment for test or staging
  – Take snapshot from production database
  – Set up new database with snapshot
  – Create a new environment with a different release of the software
  – Automated for production
  – Production-like sizing acceptable: You pay by the hour
• This can also be done using Private Clouds!
• Can be more important than cost reduction
• Business Agility is a major driver for (private) Cloud!
13. Next step: Spring Biking Goes Global!
• Global demand for bikes is at an all-time high!
• We need to globalize the offering
• A central RDBMS for the global system is not acceptable
  – Latency
  – Amazon RDS offers one uniform database for a Region (e.g. US-East, EU-West)
  – Need a different solution for a global system
• Just an example
• Traditional Enterprise architecture scales to a certain limit
• The question is which
• We are not all going to build Twitter or Facebook
14. CAP Theorem
• Consistency
  – All nodes see the same data
• Availability
  – Node failures do not prevent survivors from operating
• Partition Tolerance
  – System continues to operate despite arbitrary message loss
• Can have at most two
[Diagram: triangle with C, A, and P at the corners]
15. CAP Theorem
[Diagram: CAP triangle positioning example techniques – RDBMS and 2-Phase Commit toward Consistency, Quorum between the extremes, Replication and DNS toward Availability and Partition Tolerance]
16. CAP Theorem in the Cloud
• Need A – Availability
  – A system that is not available is usually the worst thing
  – Shutting down nodes is no option
• Need P – Partition Tolerance
  – Network is not under your control
  – Lots of nodes -> partitioning even more likely
• No chance for C – Consistency
  – Because we can’t
17. BASE
• Basically Available, Soft state, Eventually consistent
• I.e. trade consistency for availability
• Eventually consistent
  – If no updates are sent for a while, all previous updates will eventually propagate through the system
  – Then all replicas are consistent
  – Can deal with network partitioning: messages will be transferred later
• All replicas are always available
• Pun concerning ACID…
18. BASE in Spring Biking
[Diagram: three Application + Database stacks in EU-West, US-East, and Asia-Pacific; changes to the catalog are eventually propagated between the databases]
19. Network Partitioning
[Diagram: the same three regions (EU-West, US-East, Asia-Pacific) with a network partition between them; data is inconsistent during the partition and eventually consistent once it heals]
20. BASE Using Event Sourcing
[Diagram: Events feeding a Domain Model]
• Do it yourself using a messaging system
  – JMS, RabbitMQ …
  – Easy to duplicate state on nodes
  – Fail safe: Message will eventually be transferred
  – …and high latency is acceptable
• Other reason to use Event Sourcing
  – Capture all changes to an application state as a sequence of events
  – Originates in Domain Driven Design
  – Also used as a log of actions (to replay, reverse etc)
• Might end up with an Event-driven Architecture
  – Might add Complex Event Processing etc.
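A rough sketch of the do-it-yourself variant with a messaging system, using Spring AMQP on RabbitMQ. The exchange name, routing key, and CatalogItemAdded event are made-up example names, not from the talk; consumers in the other regions would apply these events to their local databases, which is exactly the eventual consistency described above.

```java
import org.springframework.amqp.rabbit.connection.CachingConnectionFactory;
import org.springframework.amqp.rabbit.core.RabbitTemplate;

public class CatalogEventPublisher {

    private final RabbitTemplate rabbitTemplate;

    public CatalogEventPublisher(String brokerHost) {
        // Connection to the local RabbitMQ broker; the broker buffers
        // events while a remote region is unreachable (fail safe).
        this.rabbitTemplate = new RabbitTemplate(new CachingConnectionFactory(brokerHost));
    }

    // Publish a catalog change as an event. Remote regions consume the
    // event and update their replicas - eventually consistent.
    public void itemAdded(String itemId, String name) {
        // "catalog-events" exchange and routing key are hypothetical names.
        rabbitTemplate.convertAndSend("catalog-events", "catalog.item.added",
                new CatalogItemAdded(itemId, name));
    }

    // Simple event object; Serializable so the default message
    // converter can put it on the wire.
    public static class CatalogItemAdded implements java.io.Serializable {
        public final String itemId;
        public final String name;

        public CatalogItemAdded(String itemId, String name) {
            this.itemId = itemId;
            this.name = name;
        }
    }
}
```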
21. Implementing BASE Using NoSQL
• Some NoSQL databases include replication
• Example: CouchDB
  – Replication between nodes
  – Master-master replication using versioning
  – Trivial to set up
  – All nodes have the same data
  – Sharding only possible with additional proxies
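To illustrate how trivial the setup is: a sketch that starts continuous replication between two CouchDB nodes through CouchDB's standard _replicate endpoint. The host names and the catalog database are assumptions for the example; posting the same document with source and target swapped on the other node gives master-master replication.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CouchDbReplication {

    public static void main(String[] args) throws Exception {
        // Hypothetical nodes; replace with your real CouchDB hosts.
        String source = "http://eu-west.example.com:5984/catalog";
        String target = "http://us-east.example.com:5984/catalog";

        // "continuous": true keeps the replication running and lets it
        // catch up after a network partition - eventual consistency.
        String body = "{\"source\": \"" + source + "\","
                + " \"target\": \"" + target + "\","
                + " \"continuous\": true}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://eu-west.example.com:5984/_replicate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());

        // For master-master replication, post the same document with
        // source and target swapped on the other node as well.
    }
}
```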
22. More Sophisticated
• Apache Cassandra
• Each node is master for certain data
• Data is replicated to N nodes
• Data is read from R nodes
• After a write, W nodes must acknowledge
• N, R, W are configurable
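The useful property behind configurable N, R, and W: if R + W > N, every read quorum overlaps every write quorum, so a read always sees the latest acknowledged write; smaller R or W trades that consistency for availability and latency. A minimal sketch of the rule:

```java
public class QuorumCheck {

    // R + W > N guarantees that any set of R read replicas intersects
    // any set of W write replicas, so at least one replica in every
    // read has the latest acknowledged write.
    static boolean readsSeeLatestWrite(int n, int r, int w) {
        return r + w > n;
    }

    public static void main(String[] args) {
        // Classic quorum: N=3, R=2, W=2 -> consistent reads.
        System.out.println(readsSeeLatestWrite(3, 2, 2)); // true
        // Tuned for availability: N=3, R=1, W=1 -> reads may be stale.
        System.out.println(readsSeeLatestWrite(3, 1, 1)); // false
    }
}
```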
23. Different Parts Require Different Architecture
[Diagram: Application split into Catalog and Order parts]
• So far: Catalog
  – Data must be available on each node
  – Slight inconsistencies are OK
  – i.e. new item added to catalog
• Stock information must be consistent
  – So customers are not disappointed
  – Might use caching-like structure
• Orders are immediately sent to the back end
  – No local storage at all
• A lot more catalog browsing than ordering
24. More load on catalog -> more instances; less load on order -> fewer instances
[Diagram: Catalog instances, each with a database and a stock cache updated from a central Stock Master; Order instances with no local data – all orders sent to the backend]
25. Handling Log Files
• Business requirements
  – Need to measure hits on web pages
  – Need to measure hits for individual products etc.
• Sounds like a batch
  – File in, statistics out
• But: Data is globally distributed
• Lots of data, i.e. cannot be collected at a central place
• Data should stay where it is
• Some nodes might be offline or not available
• Prefer incomplete answer over no answer at all
26. More Than CAP
• CAP Theorem again
• Consistency, Availability, Network Partitioning
• You can only have two
• But: We want Availability
• …and a flexible trade-off between Consistency and Network Partitioning
• Like Cassandra
27. Harvest and Yield
• Yield: Probability of completing a request
• Harvest: Fraction of data represented in the result
• Harvest and Yield vs. CAP
• Yield = 100% -> Availability
• Harvest = 100% -> Consistency
• Can be used to think about Cassandra configurations
• Can also be used to execute some logic on all data
• …and wait until enough harvest is there to answer a query
• So: Send out a query to all log files
• …and collect the results
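A rough sketch of answering with partial harvest, assuming hypothetical Shard stubs for the per-node log queries: fan the query out to all nodes, collect results until a harvest threshold is reached or a deadline expires, and answer with whatever fraction of the data has arrived.

```java
import java.util.List;
import java.util.concurrent.*;

public class PartialAggregator {

    // Hypothetical per-node query; in reality a remote call to a node.
    interface Shard {
        long countHits(String item) throws Exception;
    }

    // Answer as soon as at least minHarvest of the shards responded,
    // trading completeness (harvest) for availability (yield).
    static long queryWithHarvest(List<Shard> shards, String item,
                                 double minHarvest, long timeoutMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(shards.size());
        CompletionService<Long> results = new ExecutorCompletionService<>(pool);
        for (Shard shard : shards) {
            results.submit(() -> shard.countHits(item));
        }
        long total = 0;
        int answered = 0;
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (answered < shards.size() && System.currentTimeMillis() < deadline) {
            Future<Long> next = results.poll(
                    deadline - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
            if (next == null) break; // deadline: answer with what we have
            try {
                total += next.get();
                answered++;
            } catch (ExecutionException e) {
                // A node failed or is unreachable - lowers harvest, not yield.
            }
            if ((double) answered / shards.size() >= minHarvest) {
                break; // enough harvest to answer the query
            }
        }
        pool.shutdownNow();
        System.out.println("Harvest: " + answered + " of " + shards.size() + " nodes");
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        // Three fake "nodes"; one of them is offline.
        List<Shard> shards = List.of(
                item -> 120L,
                item -> 80L,
                item -> { throw new Exception("node offline"); });
        // Answer once 2 of 3 nodes responded or after 500 ms.
        System.out.println(queryWithHarvest(shards, "mountain-bike-29", 0.66, 500));
    }
}
```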
28. Map / Reduce
• Map: Apply a function to all data
  – Emit (item name, 1) for each log file line
• Master sorts by item name
• Reduce: Add all (item name, 1) to the total score
• Map can be done on any node
• Master collects the data
[Diagram: Map steps on the nodes feeding a Reduce step]
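The same (item name, 1) scheme in miniature, as a plain Java sketch with made-up log lines whose first token is the item name. In a real deployment the map step would run on each node and only the per-item counts would travel to the master for the reduce step.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class HitCount {

    public static void main(String[] args) {
        // Stand-in for the distributed log files; first token = item name.
        List<String> logLines = List.of(
                "mountain-bike-29 GET /catalog/4711",
                "road-bike GET /catalog/0815",
                "mountain-bike-29 GET /catalog/4711");

        // Map: emit the item name for each line.
        // Reduce: count the occurrences per item name.
        Map<String, Long> hitsPerItem = logLines.stream()
                .map(line -> line.split(" ")[0])
                .collect(Collectors.groupingBy(item -> item, Collectors.counting()));

        System.out.println(hitsPerItem);
        // e.g. {road-bike=1, mountain-bike-29=2}
    }
}
```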
29. Another Case Study
• Financials
• Build a Highly Available, High-Throughput, Low-Latency System on Standard Hardware!
• Just like Google and Amazon
• Driver: Standard infrastructure – cheap and stable
• Driver: Even more availability, throughput, scalability and lower latency
• You will need to consider CAP, BASE, Harvest & Yield etc.
• Very likely in a Private Cloud
30. Another Case Study
• Random Project
• Make deployment easier!
• Make it easy to create test environments!
• Driver: Business Agility and Developer Productivity
• Will need to use automated installation + IaaS or PaaS
• Might be in a Public or Private Cloud
32. Conclusion
• Current PaaS allow you to run Enterprise applications unchanged
• At some point you will need to change
• CAP: Consistency, Availability, Partition Tolerance – choose two
• Cloud: AP, no C
• BASE: Basically Available, Soft State, Eventually Consistent
• Can be implemented using Event Sourcing
• …or a NoSQL persistence solution
• Create multiple deployment artifacts to scale each part
• Harvest / Yield: fine-grained CAP
• Map / Reduce to run batches in the Cloud
33. We are looking for you as
! Software Architect (m/f)
! Project Lead (m/f)
! Senior Software Engineer (m/f)
! Come to our booth and win an iPad 2!
jobs@adesso.de
www.AAAjobs.de