This document discusses leveraging endpoint flexibility in data-intensive clusters. It proposes Sinbad, a system that improves the performance of distributed file systems (DFS) and jobs by steering data replication traffic away from network hotspots. Sinbad follows a master-slave architecture and uses a greedy algorithm to place data blocks based on current network load. Evaluation of Sinbad through simulation and experimentation showed that it improved DFS and job completion times by 26-79% while balancing network utilization better than the default approach. Sinbad also maintains balanced storage utilization over the long term.
Good afternoon.
I’m …
Today, I’m going to talk about network transfers that do not have fixed destinations.
This is a joint work with …
Written at Berkeley (published at SIGCOMM) and presented last August in Hong Kong.
How it started: internet companies ….
Main motivation in addition to regular stuff:
Lower cost
Less time
Greater flexibility
Linear scalability
But how? And what makes this possible?
Open source
Started in Google…
Gordon Moore 1965
Capacity has increased while price has decreased
They analyzed data from Facebook and Bing.
Found out 33%....
-----------------
Many data-intensive jobs depend on communication for faster end-to-end completion time.
For example, in one of our earlier works, we found that typical jobs at Facebook spend up to a third of their running time in shuffle or intermediate data transfers.
As in-memory systems proliferate and disks are removed from the I/O pipeline, the network is likely to be the primary bottleneck.
But what does the network usage of data-intensive clusters look like, and where does it come from?
To better understand the problem, we have analyzed traces from two data-intensive production clusters at Facebook and Microsoft.
1. Managing Data Transfers in Computer Clusters with Orchestra, SIGCOMM’2011
We have found something very interesting.
While a LOT of attention has gone into decreasing reads over the network and into managing intermediate communication, DFS replication creates almost half of all cross-rack traffic.
Note that this doesn’t mean everyone was wrong; communication of intermediate data or shuffle is still a major source of job-level communication.
But, the sources of these writes are ingestion of new data into the cluster and preprocessing of existing data for later use.
Both of which do not show up when someone looks only at the jobs.
Very small amount is actually created by typical jobs.
We’ve also found that during ingestion many writers spend up to 90% of their time in writing. Well, that is their job.
What is this DFS?
Distributed file systems are ubiquitous in data-intensive clusters and form the narrow waist.
Diverse computing frameworks read from and write to the same DFS.
Examples include GFS, HDFS, Cosmos etc.
Typically, distributed file systems store data as files.
Each file is divided into large blocks.
Typical size of a block would be 256MB.
Each block of a file is then replicated to three different machines for fault tolerance.
These three machines are located in two different fault domains, typically racks, for partition tolerance.
Finally, replicas are placed uniformly randomly throughout the cluster to avoid storage imbalance.
Writes to a DFS are typically synchronous.
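As a concrete illustration, the default policy described above (one replica, plus the remaining replicas in a second fault domain, all chosen uniformly at random) could be sketched as follows; the function name and the rack/machine layout are our own assumptions for illustration, not actual HDFS code:

```python
import random

def place_replicas_default(num_racks, machines_per_rack, num_replicas=3):
    """Default DFS placement (sketch): one replica on a random machine,
    and the remaining replicas together on a second, different rack
    (two fault domains), all chosen uniformly at random so that
    storage stays balanced across the cluster."""
    first_rack = random.randrange(num_racks)
    first = (first_rack, random.randrange(machines_per_rack))
    # Partition tolerance: the other replicas go to a different rack.
    second_rack = random.choice([r for r in range(num_racks) if r != first_rack])
    machines = random.sample(range(machines_per_rack), num_replicas - 1)
    return [first] + [(second_rack, m) for m in machines]
```

Note that nothing in this policy looks at the network; the placement is purely storage-driven, which is exactly the flexibility Sinbad exploits.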
Traditionally, the traffic of distributed file systems in modern clusters has been addressed like any other elephant flows in the network.
The endpoints are assumed to be fixed.
All the existing work balances the network after the locations of the sources and destinations have already been decided.
Because sources and destinations are fixed, they try to find different paths between them or to change rates along those paths.
But we can do more.
Let us revisit the requirements of replica placement.
Notice that, as long as these constraints are met, the DFS does not care where exactly the replicas are placed.
This means, we can effectively change the destinations of all replication traffic if we satisfy the constraints.
We refer to such transfers as constrained anycast in that replicas can go anywhere, but they are constrained.
In this work, we present Sinbad.
By steering replication traffic away from congested hotspots in the network Sinbad can improve the performance of the network.
However, this can be useful only if there is significant hotspot activity in the cluster.
We refer to this as the distributed writing problem.
Given blocks of different sizes and links of different capacities, Sinbad must place the replicas so as to minimize the average block write time as well as the average file write time.
Note that blocks can have different sizes because blocks are not padded in a DFS.
Now, if for each block we consider a job of that length, and for each link a machine of that capacity, we see that the distributed writing problem is similar to the job shop scheduling problem.
And it is NP-hard.
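To make the objective concrete, here is a minimal sketch (our own illustration, not the paper's formulation) of how the average block and file write times could be computed for a given placement, assuming each block receives the full free bandwidth of its assigned link:

```python
def write_times(assignments, free_bw):
    """assignments: list of (file_id, block_size, link);
    free_bw: link -> free bandwidth.
    Simplification: each block's write time is just its size divided
    by its link's free bandwidth (utilizations assumed stable)."""
    block_t = [(f, size / free_bw[link]) for f, size, link in assignments]
    avg_block = sum(t for _, t in block_t) / len(block_t)
    file_t = {}
    for f, t in block_t:
        file_t[f] = max(file_t.get(f, 0.0), t)  # a file finishes when its last block does
    avg_file = sum(file_t.values()) / len(file_t)
    return avg_block, avg_file
```

A placement algorithm's job is then to choose the `link` for each block so that both averages come out small.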
Let’s take an example.
We have the same network as before.
We are going to assume that the three core-to-rack links are the possible bottlenecks.
Replica placement requests from two different files come online.
The black file has two blocks and the orange one has three.
Now, let us assume that time is divided into discrete intervals.
We must decide on the three requests during time interval T.
We are also going to assume that intervals are independent; i.e., placement decisions during T will not affect the ones during T+1.
Finally, we are going to assume that link utilizations are stable for the duration of replication or during T, and all blocks have the same size.
It is clear that we should pick the least-loaded link because that will finish the fastest.
Because all blocks are of the same size, it doesn’t matter which block we choose for minimizing the average block write time.
If we also care about minimizing the file write times, we should always pick the smallest file (the one with the least remaining blocks) to go through the fattest link.
Under these assumptions, greedy placement is optimal.
We propose a simple two-step greedy placement policy.
At any point, we pick the least-loaded link and then send through it a block from the file with the fewest remaining blocks.
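A minimal sketch of this two-step greedy policy (our own illustration, assuming equal-sized blocks and link utilizations that are stable within the decision interval) might look like:

```python
import heapq

def greedy_place(files, link_bw, block_size=1.0):
    """files: {file_id: number of blocks to place};
    link_bw: {link: bandwidth}.
    Step 1: pick the link that would finish a new block earliest
    (the least-loaded link). Step 2: give it a block from the file
    with the fewest remaining blocks (the smallest file)."""
    # Heap of (estimated finish time of the next block on this link, link).
    heap = [(block_size / bw, link) for link, bw in link_bw.items()]
    heapq.heapify(heap)
    remaining = dict(files)
    placements = []
    while any(remaining.values()):
        finish, link = heapq.heappop(heap)
        smallest = min((f for f in remaining if remaining[f] > 0),
                       key=lambda f: remaining[f])
        placements.append((smallest, link))
        remaining[smallest] -= 1
        # The link is now busy until this block finishes.
        heapq.heappush(heap, (finish + block_size / link_bw[link], link))
    return placements
```

On the two-file example above, the first block of the smaller (black) file goes to the fastest free link, just as the argument requires.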
That brings us to Sinbad.
Sinbad performs network-aware replica placement for DFS.
<EXPLAIN>
// Mention master-slave architecture etc.
Given a replica placement request, the master greedily places it and returns the locations.
It also adds some hysteresis to avoid placing too many replicas in the same rack.
Further details on the process can be found in the paper.
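The hysteresis idea could be sketched roughly as follows; the discount factor and the scoring function are our own assumptions for illustration, not the paper's exact mechanism:

```python
def pick_rack_with_hysteresis(free_bw, recent_picks, discount=0.5):
    """Sketch: a rack's apparent free bandwidth is discounted by how
    many replicas were recently sent its way, so a single lightly
    loaded rack does not absorb every new replica in a burst."""
    def score(rack):
        return free_bw[rack] * (discount ** recent_picks.get(rack, 0))
    return max(free_bw, key=score)
```

For example, a rack with the most free bandwidth loses out once it has already received a couple of recent replicas.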
One thing to note is that the interface is incredibly simple, which makes it all the more deployable.
All in all, we needed to change only a couple hundred lines of code to implement the whole thing.
We have implemented Sinbad in HDFS, the de facto open-source DFS used by traditional frameworks like Hadoop as well as upcoming systems like Spark.
We have also performed flow-level simulations of the 3000-node Facebook cluster.
The three high-level questions one might ask are —
Does it improve performance?
Does it improve network balance?
Will the storage remain balanced?
The short answer to all three is YES.
We consider performance from the perspective of the user (i.e., job performance) and that of the system (DFS performance)
<EXPLAIN results>
We have found that if we applied similar technique to in-memory storage systems like Tachyon, the improvements can be even higher because disks are never the bottlenecks.
So, network-balance improved and performance improved as well.
Upper bound: 1.89X
Axis: the coefficient of variation of the utilization.
We’ve found that network is highly imbalanced in both clusters.
We are looking at a CDF of imbalance in core-to-rack downlinks in the Facebook cluster.
In the x-axis we have imbalance measured by the coefficient of variation of link utilizations.
Coefficient of variation is the ratio of standard deviation to the mean of some samples, which is zero when all samples are the same, i.e., there is NO imbalance.
In general, smaller CoV means smaller imbalance.
We’ve measured link utilization as the average of 10s bins.
We see that it is almost never zero and more than 50% of the time it is more than 1 (which is a typical threshold for high imbalance)
Same is true for the Bing cluster as well.
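The imbalance metric itself is straightforward to compute; a small sketch with hypothetical utilization samples (the numbers are our own, not from the traces):

```python
def coeff_of_variation(samples):
    """CoV = standard deviation / mean. Zero when all samples are equal
    (perfect balance); larger values mean larger imbalance."""
    n = len(samples)
    mean = sum(samples) / n
    std = (sum((x - mean) ** 2 for x in samples) / n) ** 0.5
    return std / mean

# Hypothetical 10s-averaged link utilizations:
print(coeff_of_variation([0.5, 0.5, 0.5, 0.5]))    # 0.0: perfectly balanced
print(coeff_of_variation([0.9, 0.1, 0.05, 0.05]))  # above 1: highly imbalanced
```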
Given that a large fraction of traffic allow flexibility in endpoint placement and the network indeed has hotspots, we can now formally define the problem Sinbad is trying to address.
------------------------------------------------------------------------------------------------------------------------------
The network became more balanced as well.
Notice that in both EC2 experiments and trace-based simulations, the orange line moved toward the left, which indicates decreased network imbalance.
Sinbad optimizes for the network: it checks link utilizations every 10s and decides where to put replicas based on the network, not only on storage.
Short term = 1 hour.
There has been a LOT of work on better optimizing the network.
And the solutions largely fall into three categories.
The first approach is to increase the capacity of the network.
This includes moving from 1GigE to 10GigE links and increasing bisection bandwidth of datacenter networks.
In fact, there have been a lot of proposals on designing full bisection bandwidth networks.
However, full bisection bandwidth does not mean infinite bandwidth, and the size of workload is always increasing.
In practice, many clusters still have some amount of oversubscription in their core-to-rack links.
The next approach is decreasing the amount of network traffic.
All the work on data locality, and there has been a lot of it, tries to decrease network communication by moving computation closer to its input.
Recently, many researchers have looked into static analysis of data-intensive applications to decrease communication.
These are all best effort mechanisms, and there is always some data that must traverse the network.
This brings us to the third approach, that is load balancing the network.
Typically it focuses on managing large flows and optimizing communication of intermediate data.
Our recent work on Orchestra and Coflow also fall in this category.
This work is about going one step further in this direction.