Using Groovy? Got lots of stuff to do at the same time? Then you need to take a look at GPars (“Jeepers!”), a library providing support for concurrency and parallelism in Groovy. GPars brings powerful concurrency models from other languages to Groovy and makes them easy to use with custom DSLs:
- Actors (Erlang and Scala)
- Dataflow (Io)
- Fork/join (Java)
- Agent (Clojure agents)
In addition to this support, GPars integrates with standard Groovy frameworks like Grails and Griffon.
Background, comparisons to other languages, and motivating examples will be given for the major GPars features.
Terracotta (an open source technology) provides a clustered, durable virtual heap. Terracotta's goal is to make Java apps scale with as little effort as possible. If you are using Hibernate, there are several patterns that can be used to leverage Terracotta and reduce the load on your database so your app can scale.
First, you can use the Terracotta clustered Hibernate cache. This is a high-performance clustered cache and allows you to avoid hitting the database on all nodes in your cluster. It's suitable, not just for read-only, but also for read-mostly and read-write use cases, which traditionally have not been viewed as good use cases for Hibernate second level cache.
Another high performance option is to disconnect your POJOs from their Hibernate session and manage them entirely in Terracotta shared heap instead. This is a great option for conversational data where the conversational data is not of long-term interest but must be persistent and highly-available. This pattern can significantly reduce your database load but does require more changes to your application than using second-level cache.
This talk will examine the basics of what Terracotta provides and examples of how you can scale your Hibernate application with both clustered second level cache and detached clustered state. Also, we'll take a look at Terracotta's Hibernate-specific monitoring tools.
Caching has been an essential strategy for greater performance in computing since the beginning of the field. Nearly all applications have data access patterns that make caching an attractive technique, but caching also has hidden trade-offs related to concurrency, memory usage, and latency.
As we build larger distributed systems, caching continues to be a critical technique for building scalable, high-throughput, low-latency applications. Large systems tend to magnify the caching trade-offs and have created new approaches to distributed caching. There are unique challenges in testing systems like these as well.
Ehcache and Terracotta provide a unique way to start with simple caching for a small system and grow that system over time with a consistent API while maintaining low-latency, high-throughput caching.
.NET developers have a lot of options when it comes to databases these days. Apache Cassandra is a scalable, fault-tolerant database that has already found its way into more than 25% of the Fortune 100 and continues to grow in popularity. But what makes it different from the myriad of other options available? In this talk, we’ll take a deep dive into Cassandra and learn about:
- Cassandra’s internals and how it works
- CQL (the SQL-like query language for Cassandra)
- Data Modeling like a pro
- Tools available for developers
- Writing .NET code that talks to Cassandra
If there’s time and interest, we’ll finish up with how some companies are already using Cassandra to power services you probably interact with in your daily life. You’ll leave with all the tools you need to start build highly available .NET applications and services on top of Cassandra.
Adam Zegelin is Instaclustr's founding software engineer. This presentation will investigate how using micro-batching for submitting writes to Cassandra can improve throughput and reduce client application CPU load. Micro-batching combines writes for the same partition key into a single network request and ensures they hit the “fast path” for writes on a Cassandra node.
Terracotta (an open source technology) provides a clustered, durable virtual heap. Terracotta's goal is to make Java apps scale with as little effort as possible. If you are using Hibernate, there are several patterns that can be used to leverage Terracotta and reduce the load on your database so your app can scale.
First, you can use the Terracotta clustered Hibernate cache. This is a high-performance clustered cache and allows you to avoid hitting the database on all nodes in your cluster. It's suitable, not just for read-only, but also for read-mostly and read-write use cases, which traditionally have not been viewed as good use cases for Hibernate second level cache.
Another high performance option is to disconnect your POJOs from their Hibernate session and manage them entirely in Terracotta shared heap instead. This is a great option for conversational data where the conversational data is not of long-term interest but must be persistent and highly-available. This pattern can significantly reduce your database load but does require more changes to your application than using second-level cache.
This talk will examine the basics of what Terracotta provides and examples of how you can scale your Hibernate application with both clustered second level cache and detached clustered state. Also, we'll take a look at Terracotta's Hibernate-specific monitoring tools.
Caching has been an essential strategy for greater performance in computing since the beginning of the field. Nearly all applications have data access patterns that make caching an attractive technique, but caching also has hidden trade-offs related to concurrency, memory usage, and latency.
As we build larger distributed systems, caching continues to be a critical technique for building scalable, high-throughput, low-latency applications. Large systems tend to magnify the caching trade-offs and have created new approaches to distributed caching. There are unique challenges in testing systems like these as well.
Ehcache and Terracotta provide a unique way to start with simple caching for a small system and grow that system over time with a consistent API while maintaining low-latency, high-throughput caching.
.NET developers have a lot of options when it comes to databases these days. Apache Cassandra is a scalable, fault-tolerant database that has already found its way into more than 25% of the Fortune 100 and continues to grow in popularity. But what makes it different from the myriad of other options available? In this talk, we’ll take a deep dive into Cassandra and learn about:
- Cassandra’s internals and how it works
- CQL (the SQL-like query language for Cassandra)
- Data Modeling like a pro
- Tools available for developers
- Writing .NET code that talks to Cassandra
If there’s time and interest, we’ll finish up with how some companies are already using Cassandra to power services you probably interact with in your daily life. You’ll leave with all the tools you need to start build highly available .NET applications and services on top of Cassandra.
Adam Zegelin is Instaclustr's founding software engineer. This presentation will investigate how using micro-batching for submitting writes to Cassandra can improve throughput and reduce client application CPU load. Micro-batching combines writes for the same partition key into a single network request and ensures they hit the “fast path” for writes on a Cassandra node.
Speed Up Your Existing Relational Databases with Hazelcast and SpeedmentHazelcast
In this webinar
How do you get data from your existing relational databases changed by third party applications into your Hazelcast maps? How do you accomplish this if you have several databases, located on different sites, that need to be aggregated into a global Hazelcast map? How is it possible to reflect data from a relational database that has ten thousand updates per second or more?
Speedment’s SQL Reflector makes it possible to integrate your existing relational data with continuous updates of Hazelcast data-maps in real-time. In this webinar, we will show a couple of real-world cases where database applications are speeded up using Hazelcast maps fed by Speedment. We will also demonstrate how easily your existing database can be “reverse engineered” by the Speedment software that automatically creates efficient Java POJOs that can be used directly by Hazelcast.
We’ll cover these topics:
-Joint solution case studies
-Demo
-Live Q&A
Presenter:
Per-Åke Minborg, CTO at Speedment
Per-Åke Minborg is founder and CTO at Speedment AB. He is a passionate Java developer, dedicated to OpenSource software and an expert in finding new ways of solving problems – the harder problem the better. As a result, he has 15+ US patent applications and invention disclosures. He has a deep understanding of in-memory databases, high-performance solutions, cloud technologies and concurrent programming. He has previously served as CTO and founder of Chilirec and the Phone Pages. Per-Åke has a M.Sc. in Electrical Engineering from Chalmers University of Technology and several years of studies in computer science and computer security at university and PhD level.
DZone Java 8 Block Buster: Query Databases Using StreamsSpeedment, Inc.
Speedment Open Source is a new library that provides many interesting Java 8 features. It is written entirely in Java 8 from start. Speedment uses standard Streams for querying the database and thanks to that, you do not have to learn any new query API. You do not have to think at all about JDBC, ResultSet and other database specific things either.
Hear the talk to the article live from DZone Leader Per Åke Minborg. Read the full article here:
https://dzone.com/articles/java-8-query-databases-using-streams
Optimizing Your Cluster with Coordinator Nodes (Eric Lubow, SimpleReach) | Ca...DataStax
With the addition of vnodes (Virtual Nodes), Cassandra users were able to gain a few benefits as a result of streaming when it came to bootstrapping and decommissioning nodes. On the flip side, having to route requests on larger clusters became a lot more intensive of a workload for all nodes that were then forced to act coordinator nodes. By setting up a tier of proxy nodes, we were able to have our cluster of 50 nodes perform with a 300% improvement on average in a mixed workload environment. This is an explanation of what we did, how we did it, and why it works.
About the Speaker
Eric Lubow CTO, SimpleReach
Eric Lubow is CTO of SimpleReach, where he builds highly-scalable distributed systems for processing analytics data. Eric is also a DataStax MVP for Cassandra, and co-author of Practical Cassandra. In his spare time, Eric is a skydiver, motorcycle rider, mixed martial artist, and dog dad.
Webinar: Getting Started with Apache CassandraDataStax
Would you like to learn how to use Cassandra but don’t know where to begin? Want to get your feet wet but you’re lost in the desert? Longing for a cluster when you don’t even know how to set up a node? Then look no further! Rebecca Mills, Junior Evangelist at Datastax, will guide you in the webinar “Getting Started with Apache Cassandra...”
You'll get an overview of Planet Cassandra’s resources to get you started quickly and easily. Rebecca will take you down the path that's right for you, whether you are a developer or administrator. Join if you are interested in getting Cassandra up and working in the way that suits you best.
NYJavaSIG - Big Data Microservices w/ SpeedmentSpeedment, Inc.
JAVA MICROSERVICES FOR BIG DATA WITH LOW LATENCY - Per-Ake Minborg, CTO Speedment
By leveraging on memory mapped files (eg. Hazelcast, ChronicleMaps etc.), Speedment supports large Java Maps that easily can exceed the size of your server’s RAM. Because the Java Maps are mapped onto files, these maps can be shared instantly between several microservice JVMs and new microservice instances can be added, removed or restarted very quickly. Data can be retrieved with predictable ultra-low latency for a wide range of operations. The solution can be synchronized with an underlying database so that your in-memory maps will be consistently “alive”. The mapped files can be terabytes which has been done in real world deployment cases and there can be a large number of microservices that shares these maps simultaneously.
Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...DataStax
Cassandra's support for multiple data centers can bring massive benefits to an organization, however it can also bring painful operational lessons. While there is no recipe for trouble free mutli DC clusters, the best approach is to understand why you are using one, what Cassandra supports, and how it does it. With this knowledge in your toolkit you will have a better chance of fixing the sort of gremlins that can trouble a globally distributed database.
In this talk Alexander Dejanovski, Consultant at The Last Pickle, will outline the motivations people typically have for running a multi DC cluster. He will also look at how multiple DC's are supported through all areas of the Cassandra, how it impacts your application and operations, and how you can always blame the network.
About the Speaker
Alexander DEJANOVSKI Consultant, The Last Pickle
Alexander has been working as a software developer for the last 18 years, mainly for the french leader of express shipments. He's been leading there the effort to build a Cassandra based architecture and migrate services to it from traditional RDBMS. He is involved in the Cassandra community through the development of a JDBC wrapper for the DataStax Java Driver. Recently, he joined The Last Pickle as a Cassandra consultant and now helps customers to get the best out of it.
This is Apache ZooKeeper session.
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
By the end of this presentation you should be fairly clear about Apache ZooKeeper.
To watch the video or know more about the course, please visit
http://www.knowbigdata.com/page/big-data-and-hadoop-online-instructor-led-training
Clock Skew and Other Annoying Realities in Distributed Systems (Donny Nadolny...DataStax
You write with QUORUM, you read with QUORUM. You're safe, right?
Although it may seem that way, you could read a different value than the one you wrote - even if nobody else wrote after you. One way this can happen is if the time on the machines in your cluster is not synchronized closely enough. This is called clock skew, and is just one of the ways you'll see that this anomaly can occur.
In this talk we'll dive in to how Cassandra handles conflicting data, walk through several weird and seemingly impossible situations that can happen (both with and without clock skew), and see what we can do to work around them.
About the Speaker
Donny Nadolny Senior Developer, PagerDuty
Donny Nadolny is a Scala developer at PagerDuty, working on improving the reliability of their backend systems. He spends a large amount of time investigating problems experienced with distributed systems like Cassandra and ZooKeeper.
These slides cover a talk on using distributed computation for database queries. Moore's Law, Amdahl's Law and distribution techniques are highlighted, and a simple performance comparison is provided.
A brief, but action-packed introduction to DataStax Enterprise Search. In this deck, we'll get an overview of DSE Search's value proposition, see some example CQL search queries, and dive into the details of the indexing and query paths.
Releasing Relational Data to the Semantic WebAlex Miller
Enterprises are drowning in data that they can't find, access, or use.
For many years, enterprises have wrestled with the best way to combine all that data into actionable information without building systems that break as schemas evolve. Approaches like warehousing and ETL can be brittle in the face of changing data sources or expensive to create. Data integration at the application level is common but this results in significant complexity in the code. Data-oriented web services attempt to provide reusable sources of integrated data, however these have just added another layer of data access that constrain query and access patterns.
This talk will look at how semantic web technologies can be used to make existing data visible and actionable using standards like RDF (data), R2RML (data translation), OWL (schema definition and integration), SPARQL (federated query), and RIF (rules). The semantic web approach takes the data you already have and makes that data available for query and use across your existing data sources. This base capability is an excellent platform for building federated analytics.
Speed Up Your Existing Relational Databases with Hazelcast and SpeedmentHazelcast
In this webinar
How do you get data from your existing relational databases changed by third party applications into your Hazelcast maps? How do you accomplish this if you have several databases, located on different sites, that need to be aggregated into a global Hazelcast map? How is it possible to reflect data from a relational database that has ten thousand updates per second or more?
Speedment’s SQL Reflector makes it possible to integrate your existing relational data with continuous updates of Hazelcast data-maps in real-time. In this webinar, we will show a couple of real-world cases where database applications are speeded up using Hazelcast maps fed by Speedment. We will also demonstrate how easily your existing database can be “reverse engineered” by the Speedment software that automatically creates efficient Java POJOs that can be used directly by Hazelcast.
We’ll cover these topics:
-Joint solution case studies
-Demo
-Live Q&A
Presenter:
Per-Åke Minborg, CTO at Speedment
Per-Åke Minborg is founder and CTO at Speedment AB. He is a passionate Java developer, dedicated to OpenSource software and an expert in finding new ways of solving problems – the harder problem the better. As a result, he has 15+ US patent applications and invention disclosures. He has a deep understanding of in-memory databases, high-performance solutions, cloud technologies and concurrent programming. He has previously served as CTO and founder of Chilirec and the Phone Pages. Per-Åke has a M.Sc. in Electrical Engineering from Chalmers University of Technology and several years of studies in computer science and computer security at university and PhD level.
DZone Java 8 Block Buster: Query Databases Using StreamsSpeedment, Inc.
Speedment Open Source is a new library that provides many interesting Java 8 features. It is written entirely in Java 8 from start. Speedment uses standard Streams for querying the database and thanks to that, you do not have to learn any new query API. You do not have to think at all about JDBC, ResultSet and other database specific things either.
Hear the talk to the article live from DZone Leader Per Åke Minborg. Read the full article here:
https://dzone.com/articles/java-8-query-databases-using-streams
Optimizing Your Cluster with Coordinator Nodes (Eric Lubow, SimpleReach) | Ca...DataStax
With the addition of vnodes (Virtual Nodes), Cassandra users were able to gain a few benefits as a result of streaming when it came to bootstrapping and decommissioning nodes. On the flip side, having to route requests on larger clusters became a lot more intensive of a workload for all nodes that were then forced to act coordinator nodes. By setting up a tier of proxy nodes, we were able to have our cluster of 50 nodes perform with a 300% improvement on average in a mixed workload environment. This is an explanation of what we did, how we did it, and why it works.
About the Speaker
Eric Lubow CTO, SimpleReach
Eric Lubow is CTO of SimpleReach, where he builds highly-scalable distributed systems for processing analytics data. Eric is also a DataStax MVP for Cassandra, and co-author of Practical Cassandra. In his spare time, Eric is a skydiver, motorcycle rider, mixed martial artist, and dog dad.
Webinar: Getting Started with Apache CassandraDataStax
Would you like to learn how to use Cassandra but don’t know where to begin? Want to get your feet wet but you’re lost in the desert? Longing for a cluster when you don’t even know how to set up a node? Then look no further! Rebecca Mills, Junior Evangelist at Datastax, will guide you in the webinar “Getting Started with Apache Cassandra...”
You'll get an overview of Planet Cassandra’s resources to get you started quickly and easily. Rebecca will take you down the path that's right for you, whether you are a developer or administrator. Join if you are interested in getting Cassandra up and working in the way that suits you best.
NYJavaSIG - Big Data Microservices w/ SpeedmentSpeedment, Inc.
JAVA MICROSERVICES FOR BIG DATA WITH LOW LATENCY - Per-Ake Minborg, CTO Speedment
By leveraging on memory mapped files (eg. Hazelcast, ChronicleMaps etc.), Speedment supports large Java Maps that easily can exceed the size of your server’s RAM. Because the Java Maps are mapped onto files, these maps can be shared instantly between several microservice JVMs and new microservice instances can be added, removed or restarted very quickly. Data can be retrieved with predictable ultra-low latency for a wide range of operations. The solution can be synchronized with an underlying database so that your in-memory maps will be consistently “alive”. The mapped files can be terabytes which has been done in real world deployment cases and there can be a large number of microservices that shares these maps simultaneously.
Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...DataStax
Cassandra's support for multiple data centers can bring massive benefits to an organization, however it can also bring painful operational lessons. While there is no recipe for trouble free mutli DC clusters, the best approach is to understand why you are using one, what Cassandra supports, and how it does it. With this knowledge in your toolkit you will have a better chance of fixing the sort of gremlins that can trouble a globally distributed database.
In this talk Alexander Dejanovski, Consultant at The Last Pickle, will outline the motivations people typically have for running a multi DC cluster. He will also look at how multiple DC's are supported through all areas of the Cassandra, how it impacts your application and operations, and how you can always blame the network.
About the Speaker
Alexander DEJANOVSKI Consultant, The Last Pickle
Alexander has been working as a software developer for the last 18 years, mainly for the french leader of express shipments. He's been leading there the effort to build a Cassandra based architecture and migrate services to it from traditional RDBMS. He is involved in the Cassandra community through the development of a JDBC wrapper for the DataStax Java Driver. Recently, he joined The Last Pickle as a Cassandra consultant and now helps customers to get the best out of it.
This is Apache ZooKeeper session.
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
By the end of this presentation you should be fairly clear about Apache ZooKeeper.
To watch the video or know more about the course, please visit
http://www.knowbigdata.com/page/big-data-and-hadoop-online-instructor-led-training
Clock Skew and Other Annoying Realities in Distributed Systems (Donny Nadolny...DataStax
You write with QUORUM, you read with QUORUM. You're safe, right?
Although it may seem that way, you could read a different value than the one you wrote - even if nobody else wrote after you. One way this can happen is if the time on the machines in your cluster is not synchronized closely enough. This is called clock skew, and is just one of the ways you'll see that this anomaly can occur.
In this talk we'll dive in to how Cassandra handles conflicting data, walk through several weird and seemingly impossible situations that can happen (both with and without clock skew), and see what we can do to work around them.
About the Speaker
Donny Nadolny Senior Developer, PagerDuty
Donny Nadolny is a Scala developer at PagerDuty, working on improving the reliability of their backend systems. He spends a large amount of time investigating problems experienced with distributed systems like Cassandra and ZooKeeper.
These slides cover a talk on using distributed computation for database queries. Moore's Law, Amdahl's Law and distribution techniques are highlighted, and a simple performance comparison is provided.
A brief, but action-packed introduction to DataStax Enterprise Search. In this deck, we'll get an overview of DSE Search's value proposition, see some example CQL search queries, and dive into the details of the indexing and query paths.
Releasing Relational Data to the Semantic WebAlex Miller
Enterprises are drowning in data that they can't find, access, or use.
For many years, enterprises have wrestled with the best way to combine all that data into actionable information without building systems that break as schemas evolve. Approaches like warehousing and ETL can be brittle in the face of changing data sources or expensive to create. Data integration at the application level is common but this results in significant complexity in the code. Data-oriented web services attempt to provide reusable sources of integrated data, however these have just added another layer of data access that constrain query and access patterns.
This talk will look at how semantic web technologies can be used to make existing data visible and actionable using standards like RDF (data), R2RML (data translation), OWL (schema definition and integration), SPARQL (federated query), and RIF (rules). The semantic web approach takes the data you already have and makes that data available for query and use across your existing data sources. This base capability is an excellent platform for building federated analytics.
A talk about doing innovative software development including embracing constraints, iterating towards product/market fit, and the qualities of a great innovative team. This presentation was given at the St. Louis Innovation Camp in Feb, 2010.
Stream Execution with Clojure and Fork/joinAlex Miller
One of the greatest benefits of Clojure is its ability to create simple, powerful abstractions that operate at the level of the problem while also operating at the level of the language.
This talk discusses a query processing engine built in Clojure that leverages this abstraction power to combine streams of data for efficient concurrent execution.
* Representing processing trees as s-expressions
* Streams as sequences of data
* Optimizing processing trees by manipulating s-expressions
* Direct execution of s-expression trees
* Compilation of s-expressions into nodes and pipes
* Concurrent processing nodes and pipes using a fork/join pool
Maybe you've heard of Clojure, one of those new-fangled JVM languages. How does anybody get any work done in a language like that? What's up with all those parentheses?
If you're coming from Java and OOP, Clojure can indeed feel disorienting. In this talk we'll demystify the basics of Clojure and dissect the source of its power. Functional programming is on the rise and Clojure is indeed a functional language, but we'll learn the real secret sauce that makes cooking with Clojure fun.
We'll look at how to translate concepts you know in Java (like domain objects, interfaces, collections, and concurrency) into their natural Clojure equivalents. And more importantly, we'll learn how these components interact to make Clojure a beautiful language for building abstractions.
No prior knowledge of Clojure or functional programming is assumed... Clojure novices welcome!
Clojure is a new language that combines the power of Lisp with an existing hosted VM ecosystem (the Java VM). Clojure is a dynamically typed, functional, compiled language with performance on par with Java.
At the heart of all programming lies the need for abstraction, be it abstraction over our data or abstraction over the processes that operate upon it. Clojure provides a core set of powerful abstractions and ways to compose them. These abstractions are based in a heritage of Lisp but also cover many aspects of object-oriented programming as well.
This talk will examine these abstractions and introduce you to both Clojure and functional programming. Attendees are not expected to be familiar with either Clojure or FP.
Women in Data Meetup, February 2014
The talk introduces a visualisation grammar called spruce-leaf. Spruce-leaf is a declarative format for creating interactive leaflet maps. The talk is aimed at developers and non-developers alike - no programming knowledge is necessary. With spruce-leaf you can describe the visualisation in JSON format: merge your shapefile (e.g. map of CCG areas in the UK) with CSV data, colour the features, add legend and hover-on information box.
GPars is a concurrency and parallelism toolkit for the JVM. Founded on java.util.concurrent, it extends it with additional models of concurrency and parallelism, e.g. dataflow, CSP, actors, agents. Although GPars requires Groovy to run it is a toolkit usable from Java as well as Groovy codes.
This was presented at DevoxxUK 2013 2013-03-27T15:50
Lecture on Java Concurrency Day 4 on Feb 18, 2009.Kyung Koo Yoon
Lecture on Java Concurrency Day 4 on Feb 18, 2009. (in Korean)
Lectures are 4 days in all.
See http://javadom.blogspot.com/2011/06/lecture-on-java-concurrency-day-4.html
One of the greatest benefits of Clojure is its ability to create simple, powerful abstractions that operate at the level of the problem while also operating at the level of the language.
This talk discusses a query processing engine built in Clojure that leverages this abstraction power to combine streams of data for efficient concurrent execution.
* Representing processing trees as s-expressions
* Streams as sequences of data
* Optimizing processing trees by manipulating s-expressions
* Direct execution of s-expression trees
* Compilation of s-expressions into nodes and pipes
* Concurrent processing nodes and pipes using a fork/join pool
GPars (Groovy Parallel Systems) is an open-source concurrency and parallelism library for Java and Groovy that gives you a number of high-level abstractions for writing concurrent and parallel code in Groovy (map/reduce, fork/join, asynchronous closures, actors, agents, dataflow concurrency and other concepts), which can make your Java and Groovy code concurrent and/or parallel with little effort.
Async and parallel patterns and application design - TechDays2013 NLArie Leeuwesteijn
TechDays2013 NL session on async and parallel programming. Gives an overview of todays relevant .net technologies, examples and tips and tricks. This session will help you to understand and select and use the right async/parallel technology to use in your .net application. (arie@macaw.nl)
Reactive Programming, Traits and Principles. What is Reactive, where does it come from, and what is it good for? How does it differ from event driven programming? It only functional?
Journey into Reactive Streams and Akka StreamsKevin Webber
Are streams just collections? What's the difference between Java 8 streams and Reactive Streams? How do I implement Reactive Streams with Akka? Pub/sub, dynamic push/pull, non-blocking, non-dropping; these are some of the other concepts covered. We'll also discuss how to leverage streams in a real-world application.
Writing concurrent programs that can run in multiple threads and on multiple cores is crucial but daunting. Futures provides a convenient abstraction for many problem domains. The online course "Intermediate Scala" includes an up-to-date discussion of futures and the parts of java.util.concurrent that underlie the Scala futures implementation. Unlike Java's futures, Scala futures supports composition, transformations and sophisticated callbacks.
The author is managing editor of http://scalacourses.com, which offers self-paced online courses that teach Introductory and Intermediate Scala and Play Framework.
Performance van Java 8 en verder - Jeroen BorgersNLJUG
We weten allemaal dat de grootste verbetering die Java 8 brengt de ondersteuning voor lambda-expressies is. Dit introduceert functioneel programmeren in Java. Door het toevoegen van de Stream API wordt deze verbetering nog groter: iteratie kan nu intern worden afgehandeld door een bibliotheek, je kunt daarmee nu het beginsel "Tell, don’t ask" toepassen op collecties. Je kunt gewoon vertellen dat er een ??functie uitgevoerd moet worden op je verzameling, of vertellen dat dat parallel, door meerdere cores moet gebeuren. Maar wat betekent dit voor de prestaties van onze Java-toepassingen? Kunnen we nu meteen volledig al onze CPU-cores benutten om betere responstijden te krijgen? Hoe werken filter / map / reduce en parallele streams precies intern? Hoe wordt het Fork-Join framework hierin gebruikt? Zijn lambda's sneller dan inner klassen? - Al deze vragen worden beantwoord in deze sessie. Daarnaast introduceert Java 8 meer performance verbeteringen: tiered compilatie, PermGen verwijdering, java.time, Accumulators, Adders en Map verbeteringen. Ten slotte zullen we ook een kijkje nemen in de keuken van de geplande performance verbeteringen voor Java 9: benutting van GPU's, Value Types en arrays 2.0.
Distributed Model Validation with EpsilonSina Madani
Scalable performance is a major challenge with current model management tools. As the size and complexity of models and model management programs increases and the cost of computing falls, one solution for improving performance of model management programs is to perform computations on multiple computers. The developed prototype demonstrates a low-overhead data-parallel approach for distributed model validation in the context of an OCL-like language. The approach minimises communication costs by exploiting the deterministic structure of programs and can take advantage of multiple cores on each (heterogenous) machine with highly configurable computational granularity. Performance evaluation shows linear improvements with more machines and processor cores, being up to 340x faster than the baseline sequential program with 88 computers.
Distributed system coordination by zookeeper and introduction to kazoo python...Jimmy Lai
Zookeeper is a coordination tool to let people build distributed systems easier. In this slides, the author summarizes the usage of zookeeper and provides Kazoo Python library as example.
Organizations continue to adopt Solr because of its ability to scale to meet even the most demanding workflows. Recently, LucidWorks has been leading the effort to identify, measure, and expand the limits of Solr. As part of this effort, we've learned a few things along the way that should prove useful for any organization wanting to scale Solr. Attendees will come away with a better understanding of how sharding and replication impact performance. Also, no benchmark is useful without being repeatable; Tim will also cover how to perform similar tests using the Solr-Scale-Toolkit in Amazon EC2.
Stream processing from single node to a clusterGal Marder
Building data pipelines shouldn't be so hard, you just need to choose the right tools for the task.
We will review Akka and Spark streaming, how they work and how to use them and when.
Continuous Application with FAIR Scheduler with Robert XueDatabricks
This talk presents a continuous application example that relies on Spark FAIR scheduler as the conductor to orchestrate the entire “lambda architecture” in a single spark context. As a typical time series event stream analysis might involved, there are four key components:
– an ETL step to store the raw data
– a series of real time aggregation on the joint of streaming input and historical data to power a model
– model execution
– ad-hoc query for human inspection.
The key benefits of this setup compared to a typical design that has a bunch of Spark application running individually are
1. Decouple streaming batches process from triggering model calculation, model calculations are triggered at a different pace from the stream processing.
2. Model is always processing the latest data, using pure rdd APIs.
3. Launch various operations in different threads on the driver node, ensuring them got submitted to the appropriate fair scheduler pool. Let FAIR scheduler to do the resource distribution.
4. Share code and time by sharing the actual data transformation (like the rdds in the intermediate steps).
5. Support adhoc queries on intermediate state without a dedicated serving layer or output protocol.
6. Only one app to monitor and tune.
A presentation for the Reactive Programming Enthusiasts Denver meet-up.
http://www.meetup.com/Reactive-Programming-Enthusiasts-Denver/
How Reactive Mongo helps utilize your hardware better and achieve a non-blocking application from the bottom up.
Functional Programming and Composing Actorslegendofklang
With the world being non-deterministic, with failure being abundant, and with communication latency being very real—how do we design systems that are capable of dealing with these conditions and how can we expose abstractions that are feasible to reason about?
Scaling Your Cache And Caching At ScaleAlex Miller
Caching has been an essential strategy for greater performance in computing since the beginning of the field. Nearly all applications have data access patterns that make caching an attractive technique, but caching also has hidden trade-offs related to concurrency, memory usage, and latency.
As we build larger distributed systems, caching continues to be a critical technique for building scalable, high-throughput, low-latency applications. Large systems tend to magnify the caching trade-offs and have created new approaches to distributed caching. There are unique challenges in testing systems like these as well.
Ehcache and Terracotta provide a unique way to start with simple caching for a small system and grow that system over time with a consistent API while maintaining low-latency, high-throughput caching.
A preview of likely features that will be included in Java 7 / JDK 7. Note that this presentation is from February 2009 and things are changing quickly.
How Terracotta enables scaled Spring/Hibernate applications. Presented at Chicago JUG in March 2009 by Alex Miller (http://tech.puredanger.com / @puredanger)
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
2. GPars
Jeepers!
• Groovy library - http://gpars.codehaus.org/
• Open-source
• Great team: Václav Pech, Dierk König, Alex
Tkachman, Russel Winder, Paul King
• Integration with Griffon, Grails, etc
10. Computational
Processing
Use multiple cores to reduce the elapsed
time of some cpu-bound work.
11. Executor Pools
Same pools used for asynch are also good
for transaction processing. Leverage more
cores to process the work.
T1
Task 1
T1 T2 T3
Task 2
Task 1 Task 2 Task 3
Task 3
Task 4 Task 5 Task 6
Task 4
Task 5
Task 6
13. Fork/Join Pools
• DSL for new JSR 166y construct (JDK 7)
• DSL for ParallelArray over fork/join
• DSL for map/reduce and functional calls
14. Fork/Join Pools
put take
queue
put take
queue
put take
queue
15. Fork/Join vs Executors
• Executors (GParsExecutorsPool) tuned for:
• small number of threads (4-16)
• medium-grained I/O-bound or CPU-bound
tasks
• Fork/join (GParsPool) tuned for:
• larger number of threads (16-64)
• fine-grained computational tasks, possibly
with dependencies
19. ParallelArray
• Most business problems don’t really look
like divide-and-conquer
• Common approach for working in parallel
on a big data set is to use an array where
each thread operates on independent
chunks of the array.
• ParallelArray implements this over fork/join.
20. ParallelArray
Starting orders Order Order Order Order Order Order
w/due dates 2/15 5/1 1/15 3/15 2/1 4/1
Order Order Order
Filter only overdue 2/15 1/15 2/1
Map to days
15 45 30
overdue
Avg 30 days
21. ParallelArray usage
def isLate = { order -> order.dueDate > new Date() }
def daysOverdue = { order -> order.daysOverdue() }
GParsPool.withPool {
def data = createOrders()
.filter(isLate)
.map(daysOverdue)
println("# overdue = " + data.size())
println("avg overdue by = " + (data.sum() / data.size()))
}
24. JCIP
• It’s the mutable state, stupid.
• Make fields final unless they need to be mutable.
• Immutable objects are automatically thread-safe.
• Encapsulation makes it practical to manage the complexity.
• Guard each mutable variable with a lock.
• Guard all variables in an invariant with the same lock.
• Hold locks for the duration of compound actions.
• A program that accesses a mutable variable from multiple threads
without synchronization is a broken program.
• Don’t rely on clever reasoning about why you don’t need to synchronize.
• Include thread safety in the design process - or explicitly document that
your class is not thread-safe.
• Document your synchronization policy.
25. JCIP
• It’s the mutable state, stupid.
• Make fields final unless they need to be mutable.
• Immutable objects are automatically thread-safe.
• Encapsulation makes it practical to manage the complexity.
• Guard each mutable variable with a lock.
• Guard all variables in an invariant with the same lock.
• Hold locks for the duration of compound actions.
• A program that accesses a mutable variable from multiple threads
without synchronization is a broken program.
• Don’t rely on clever reasoning about why you don’t need to synchronize.
• Include thread safety in the design process - or explicitly document that
your class is not thread-safe.
• Document your synchronization policy.
27. The problem with shared state concurrency...
is the shared state.
28. Actors
• No shared state
• Lightweight processes
• Asynchronous, non-blocking message-passing
• Buffer messages in a mailbox
• Receive messages with pattern matching
• Concurrency model popular in Erlang, Scala, etc
33. Actor Example
class Player extends AbstractPooledActor {
String name
def random = new Random()
void act() {
loop {
react {
// player replies with a random move
reply
Move.values()[random.nextInt(Move.values().length)]
}
}
}
}
34. Agent
• Like Clojure Agents
• Imagine wrapping an actor
around mutable state...
• Messages are:
• functions to apply to (internal) state OR
• just a new value
• Can always safely retrieve value of agent
35. Simple example...
• Use generics to specify data type of wrapped data
• Pass initial value on construction
• Send function to modify state
• Read value by accessing val
def stuff = new Agent<List>([])
// thread 1
stuff.send( {it.add(“pizza”)} )
// thread 2
stuff.send( {it.add(“nachos”)} )
println stuff.val
36. class BankAccount extends Agent<Long> {
def BankAccount() { super(0) }
private def deposit(long deposit)
{ data += deposit }
private def withdraw(long deposit)
{ data -= deposit }
}
final BankAccount acct = new BankAccount()
final Thread atm2 = Thread.start {
acct << { withdraw 200 }
}
final Thread atm1 = Thread.start {
acct << { deposit 500 }
}
[atm1,atm2]*.join()
println "Final balance: ${ acct.val }"
37. Dataflow Variables
• A variable that computes its value when its’
inputs are available
• Value can be set only once
• Data flows safely between variables
• Ordering falls out automatically
• Deadlocks are deterministic
38. Dataflow Tasks
• Logical tasks
• Scheduled over a thread pool, like actors
• Communicate with dataflow variables
39. Dataflow example
final def x = new DataFlowVariable()
final def y = new DataFlowVariable()
final def z = new DataFlowVariable()
task {
z << x.val + y.val
println “Result: ${z.val}”
}
task {
x << 10
}
task {
y << 5
}
40. Dataflow support
• Bind handlers - can be registered on a
dataflow variable to execute on bind
• Dataflow streams - thread-safe unbound
blocking queue to pass data between tasks
• Dataflow operators - additional
abstraction, formalizing inputs/outputs