Alex Bordei is a developer turned Product Manager. He has been developing infrastructure products for over nine years. Before becoming Bigstep’s Product Manager, he was one of the core developers for Hostway Corporation’s provisioning platform. He then focused on defining and developing products for Hostway’s EMEA market and was one of the pioneers of virtualization in the company. After successfully launching two public clouds based on VMware software, he created the first prototype of Bigstep’s Full Metal Cloud in 2011. He now focuses on guaranteeing that the Full Metal Cloud is the highest-performance cloud in the world for big data applications.
Data Lake and the rise of the microservices (Bigstep)
By looking at structured and unstructured data together, Data Lakes enable companies to understand correlations between existing data and new external data, such as social media, in ways traditional Business Intelligence tools cannot.
To do this, you need to find the most efficient way to store and access structured and unstructured petabyte-scale data across your entire infrastructure.
In this meetup we’ll answer the following questions:
1. Why would someone use a Data Lake?
2. Is it hard to build a Data Lake?
3. What are the main features a Data Lake should provide?
4. What is the role of microservices in the big data world?
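The "store first, structure later" idea behind a Data Lake can be sketched in a few lines: raw records land untouched, and a schema is applied only when a question is asked. This is an illustrative schema-on-read toy, not any vendor's API; all record shapes and field names are invented.

```python
# A minimal schema-on-read sketch: the lake keeps raw events as-is and
# applies structure only at query time. All names here are illustrative.
import json

raw_events = [                              # heterogeneous records as landed
    '{"user": "ana", "likes": 12}',         # e.g. a social media event
    '{"order_id": 7, "total": 39.90}',      # e.g. a transactional record
]

def query(records, field):
    """Project one field across records, skipping those that lack it."""
    out = []
    for line in records:
        doc = json.loads(line)
        if field in doc:
            out.append(doc[field])
    return out

print(query(raw_events, "total"))   # structure is imposed at read time
```

The point of the sketch is that neither record had to fit a predefined table before being stored, which is the property the meetup contrasts with traditional BI pipelines.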
Manage Microservices & Fast Data Systems on One Platform w/ DC/OS (Mesosphere Inc.)
The application landscape inside our data centers is changing: along with the trend toward microservices and containers, new distributed data processing frameworks such as Kafka and Cassandra are seeing releases on a weekly basis. These changes have implications for how we think about infrastructure. With the growing need for computing power and the rise of distributed applications comes the need for a reliable, simple-to-use cluster manager and programming abstraction.
In this presentation, Mesosphere explains how to use DC/OS to manage microservices and fast data systems on a single platform. We will look at how container orchestration, including resource management and service management, can be streamlined to process fast data in a matter of seconds, allowing for predictive user interfaces, product recommendations, and billing chargeback, among other modern app components.
Microsoft: Building a Massively Scalable System with DataStax and Microsoft's... (DataStax Academy)
We face the challenge of reliably storing massive quantities of data so that they remain available even in the face of infrastructure failures. We have similar challenges on the application side. The most successful cloud architectures break applications down into microservices. How, then, do we deploy, upgrade, and manage those microservices at scale? This session will illustrate how to tackle these challenges by taking advantage of both Cassandra and Microsoft's next-generation PaaS infrastructure, Azure Service Fabric.
JCConf 2016 - Cloud Computing Applications - Hazelcast, Spark and Ignite (Joseph Kuo)
This session aims to show how to build applications that run on distributed, scalable systems, that is, cloud computing systems. We will introduce not only the basics of Hazelcast but also its deeper internals, and how it works with Spark, the best-known MapReduce library. Furthermore, we will introduce another in-memory cache, Apache Ignite, and compare it with Hazelcast to see how they differ. Finally, we will give a demonstration showing how Hazelcast and Spark work well together to form a cloud-based service that is distributed, flexible, reliable, available, scalable, and stable. You can find the demo code here: https://github.com/CyberJos/jcconf2016-hazelcast-spark
https://cyberjos.blog/java/seminar/jcconf-2016-cloud-computing-applications-hazelcast-spark-and-ignite/
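The core idea a data grid like Hazelcast relies on is hash-based key partitioning: each map entry hashes to a fixed partition, and each partition is owned by one cluster member. The toy below sketches that scheme in plain Python; the member names are invented and the hashing/assignment details do not mirror Hazelcast's actual implementation (though 271 is Hazelcast's documented default partition count).

```python
# Toy sketch of hash-based key partitioning, the idea behind how data grids
# such as Hazelcast spread map entries across cluster members.
import hashlib

MEMBERS = ["node-a", "node-b", "node-c"]   # illustrative cluster members
PARTITIONS = 271                           # Hazelcast's default partition count

def partition_id(key: str) -> int:
    """Deterministically map a key to one of the fixed partitions."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % PARTITIONS

def owner(key: str) -> str:
    """Each partition is owned by one member; round-robin assignment here."""
    return MEMBERS[partition_id(key) % len(MEMBERS)]
```

Because every client computes the same partition for a given key, any node can route a `get` or `put` straight to the owner without a central directory, which is what makes such grids horizontally scalable.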
DataStax C*ollege Credit: What and Why NoSQL? (DataStax)
In the first of our bi-weekly C*ollege Credit series, Aaron Morton, DataStax MVP for Apache Cassandra and Apache Cassandra committer, and Robin Schumacher, VP of Product Management at DataStax, will take a look back at the history of NoSQL databases and provide a foundation of knowledge for people looking to get started with NoSQL, or just wanting to learn more about this growing trend. You will learn how to tell whether NoSQL is right for your application, and how to pick a NoSQL database. This webinar is C* 101 level.
The Hadoop Distributed File System is the foundational storage layer in typical Hadoop deployments. Performance and stability of HDFS are crucial to the correct functioning of applications at higher layers in the Hadoop stack. This session is a technical deep dive into recent enhancements committed to HDFS by the Apache contributor community. We describe the real-world incidents that motivated these changes and how the enhancements prevent those problems from recurring. Attendees will leave this session with a deeper understanding of the implementation challenges in a distributed file system and will be able to identify helpful new metrics to monitor in their own clusters.
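For monitoring metrics like the ones this session covers, the NameNode exposes its beans over the standard HDFS JMX HTTP endpoint (`/jmx` on the NameNode web port). The sketch below parses a hand-written sample payload with the stdlib only; the `FSNamesystem` bean name is real, but the trimmed payload and the helper are illustrative, and a live response carries many more beans and fields.

```python
# Hedged sketch of extracting NameNode metrics from an HDFS JMX payload
# (normally fetched from http://<namenode>:9870/jmx). Sample data is
# hand-written and heavily trimmed for illustration.
import json

sample_jmx = json.dumps({
    "beans": [{
        "name": "Hadoop:service=NameNode,name=FSNamesystem",
        "CapacityUsed": 1024,
        "NumLiveDataNodes": 3,
    }]
})

def metric(payload: str, bean_suffix: str, field: str):
    """Return one field from the first bean whose name ends with bean_suffix."""
    for bean in json.loads(payload)["beans"]:
        if bean["name"].endswith(bean_suffix):
            return bean.get(field)
    return None

print(metric(sample_jmx, "FSNamesystem", "NumLiveDataNodes"))
```

Polling a handful of such fields on a schedule is a common lightweight alternative to a full metrics agent when you just want a few cluster-health numbers.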
Presentation on Large Scale Data Management (Chris Bunch)
These are the slides for a presentation I recently gave at a seminar on Large Scale Data Management. The first half talks about the current state of affairs in the debate between MapReduce and parallel databases, while the second half focuses on two recent papers on virtual machine migration.
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known (DataStax)
A brief intro to how Barracuda Networks uses Cassandra and how they are replacing their MySQL infrastructure with Cassandra. This presentation will include the lessons they've learned along the way during this migration.
Speaker: Michael Kjellman, Software Engineer at Barracuda Networks
Michael Kjellman is a Software Engineer, from San Francisco, working at Barracuda Networks. Michael works across multiple products, technologies, and languages. He primarily works on Barracuda's spam infrastructure and web filter classification data.
How to get real-time analytics for device, application, and business monitoring from trillions of events and petabytes of data, as companies like Netflix, Uber, Alibaba, PayPal, eBay, and Metamarkets do.
Extending Twitter's Data Platform to Google Cloud (DataWorks Summit)
Twitter's Data Platform is built from multiple complex open source and in-house projects to support data analytics on hundreds of petabytes of data. Our platform supports storage, compute, data ingestion, discovery, and management, along with various tools and libraries to help users with both batch and real-time analytics. Our Data Platform operates on multiple clusters across different data centers to help thousands of users discover valuable insights. As we scaled our Data Platform to multiple clusters, we also evaluated various cloud vendors to support use cases outside of our data centers. In this talk we share our architecture and how we extended our data platform to use the cloud as another data center. We walk through our evaluation process and the challenges we faced supporting data analytics at Twitter scale in the cloud, and we present our current solution. Extending Twitter's Data Platform to the cloud was a complex task, which we dive into in this presentation.
Azure + DataStax Enterprise Powers Office 365 Per User Store (DataStax Academy)
We will present our Office 365 use case scenarios, why we chose Cassandra + Spark, and walk through the architecture we chose for running DataStax Enterprise on Azure.
Data Orchestration Summit 2020 organized by Alluxio
https://www.alluxio.io/data-orchestration-summit-2020/
The Future of Computing is Distributed
Professor Ion Stoica, UC Berkeley RISELab
About Alluxio: alluxio.io
Engage with the open source community on slack: alluxio.io/slack
Supporting Hadoop in containers takes much more than the very primitive support Docker provides through its storage plugin. A production-scale Hadoop deployment inside containers needs to honor affinity/anti-affinity, fault-domain, and data-locality policies. Kubernetes alone, with primitives such as StatefulSets and PersistentVolumeClaims, is not sufficient to support a complex, data-heavy application such as Hadoop. One needs to think about this problem more holistically, across the container, networking, and storage stacks. Also, constructs around deployment, scaling, upgrades, etc. in traditional orchestration platforms are designed for applications that have adopted a microservices philosophy, which doesn't fit most big data applications across the ingest, store, process, serve, and visualization stages of the pipeline. Come to this technical session to learn how to run and manage the lifecycle of containerized Hadoop and other applications in the data analytics pipeline efficiently and effectively, far beyond simple container orchestration. Partha Seetala, CTO, Robin Systems.
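The placement constraints the session names, preferring nodes that already hold a data replica while never putting two instances in the same fault domain, can be sketched as a small selection function. This is a simplified illustration of the policy, not how Kubernetes or any scheduler actually implements it; the node and rack names are invented.

```python
# Simplified sketch of data-locality + anti-affinity placement:
# prefer a node holding a replica of the data, but never reuse a
# fault domain already occupied by another instance.

def choose_node(nodes, replicas, used_domains):
    """nodes: {name: fault_domain}; replicas: nodes that hold the data;
    used_domains: fault domains already occupied by sibling instances."""
    local = [n for n in nodes if n in replicas and nodes[n] not in used_domains]
    if local:
        return local[0]                    # data-local and domain-safe
    safe = [n for n in nodes if nodes[n] not in used_domains]
    return safe[0] if safe else None      # fall back to any safe node

nodes = {"n1": "rack1", "n2": "rack1", "n3": "rack2"}
print(choose_node(nodes, replicas={"n2"}, used_domains={"rack2"}))  # n2
```

Note the ordering of concerns: anti-affinity is a hard constraint (a `None` result means placement must wait), while data locality is only a preference, which matches how such policies are typically layered.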
Making Every Drop Count: How i2O Addresses the Water Crisis with the IoT and ... (DataStax)
Depleting water supplies coupled with increasing global demand is an environmental challenge with lasting impact on societies across the world. Join this webinar to learn how i2O Water, a pioneer in smart water management technologies, is leading the charge against a global crisis with an Internet of Things (IoT) solution built on Apache Cassandra™.
Big-Data-as-a-Service (BDaaS) in an enterprise environment requires meeting the often contradictory goals of (1) providing your data scientists, analysts, and data engineers with a self-service consumption model; (2) delivering agile and scalable on-demand infrastructure for the rapidly evolving ecosystem of big data frameworks and application software; and (3) ensuring enterprise-grade capabilities for isolation, security, monitoring, etc.
In this presentation at our BDaaS meetup in Santa Clara, Tom Phelan (chief architect and co-founder of BlueData) reviewed these goals and how to resolve the potential contradictions. He also discussed the infrastructure, application, user experience, security, and maintainability considerations required before selecting (or designing and building) a Big-Data-as-a-Service platform for an enterprise big data deployment.
More info on this BDaaS meetup can be found at: http://www.meetup.com/Big-Data-as-a-Service/events/233999817
Running Distributed TensorFlow with GPUs on Mesos with DC/OS (Mesosphere Inc.)
Running distributed TensorFlow is challenging, especially if you want to train large models on your own infrastructure. In this talk, Kevin Klues presents an open source TensorFlow framework for distributed training on DC/OS. This framework takes the pain out of deploying distributed TensorFlow, so you can spend less time worrying about your deployment strategy and more time building out your model.
Speaker Bio:
Kevin Klues is an Engineering Manager at Mesosphere where he leads the DC/OS Cluster Operations team. Prior to joining Mesosphere, Kevin worked at Google on an experimental operating system for data centers called Akaros. He and a few others founded the Akaros project while working on their Ph.Ds at UC Berkeley. In a past life, Kevin was a lead developer of the TinyOS project, working at Stanford University, the Technical University of Berlin, and the CSIRO in Australia. When not working, you can usually find Kevin on a snowboard or up in the mountains in some capacity or another.
Data Engineer's Lunch #55: Get Started in Data Engineering (Anant Corporation)
In Data Engineer's Lunch #55, Rahul Singh, CEO of Anant, will cover 10 resources every data engineer needs to get started or master their game.
Accompanying Blog: Coming Soon!
Accompanying YouTube: Coming Soon!
Sign Up For Our Newsletter: http://eepurl.com/grdMkn
Join Data Engineer’s Lunch Weekly at 12 PM EST Every Monday:
https://www.meetup.com/Data-Wranglers-DC/events/
Cassandra.Link:
https://cassandra.link/
Follow Us and Reach Us At:
Anant:
https://www.anant.us/
Awesome Cassandra:
https://github.com/Anant/awesome-cassandra
Email:
solutions@anant.us
LinkedIn:
https://www.linkedin.com/company/anant/
Twitter:
https://twitter.com/anantcorp
Eventbrite:
https://www.eventbrite.com/o/anant-1072927283
Facebook:
https://www.facebook.com/AnantCorp/
Join The Anant Team:
https://www.careers.anant.us
Xiaomi is a Chinese technology company that sold more than 100 million smartphones worldwide in 2018 and owns one of the world's largest IoT device platforms. Xiaomi builds dozens of mobile apps and Internet services based on intelligent devices, including ads, news feeds, financial services, games, music, video, personal cloud services, and so on. The rapid growth of the business results in exponential growth of the data analytics infrastructure. The amount of data has grown more than 20 times in the past 3 years, which poses big challenges for HDFS scalability.
In this talk, we introduce how we scale HDFS to support hundreds of petabytes of data on thousands of nodes:
1. How Xiaomi uses Hadoop and the characteristics of our usage
2. How we made an HDFS federation cluster usable like a single cluster; most applications don't need to change any code to migrate from a single cluster to a federation cluster. Our work includes a wrapper FileSystem compatible with DistributedFileSystem, support for rename across different namespaces, and a ZooKeeper-based mount table renewer
3. Our experience tuning the NameNode to improve scalability
4. How we maintain hundreds of HDFS clusters, and the client-side optimizations we made so that users and programs can access these clusters easily and with high performance
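The mount-table idea behind the wrapper FileSystem described above (and behind Hadoop's own ViewFs) can be sketched as longest-prefix routing: the client looks up which namespace serves a path and rewrites the path against that namespace. The mount points and namespace URIs below are invented for illustration and do not reflect Xiaomi's actual configuration.

```python
# Minimal sketch of federated-HDFS path resolution: the longest matching
# mount prefix decides which namespace serves a path. All names invented.

MOUNT_TABLE = {
    "/user":     "hdfs://ns-user",
    "/data/ads": "hdfs://ns-ads",
    "/data":     "hdfs://ns-data",
}

def resolve(path: str) -> str:
    """Rewrite a logical path to a namespace-qualified physical path."""
    best = max((m for m in MOUNT_TABLE if path.startswith(m)),
               key=len, default=None)
    if best is None:
        raise ValueError(f"no mount point for {path}")
    return MOUNT_TABLE[best] + path

print(resolve("/data/ads/2018/clicks"))  # routed to the ads namespace
```

A rename across two mount points maps to two different namespaces, which is exactly why cross-namespace rename needed the special support the talk mentions.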
Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store (DataStax Academy)
We will present our Office 365 use case scenarios, why we chose Cassandra + Spark, and walk through the architecture we chose for running DSE on Azure.
The presentation will feature demos on how you too can build similar applications.
Migration Best Practices: From RDBMS to Cassandra without a Hitch (DataStax Academy)
Presenter: Duy Hai Doan, Technical Advocate at DataStax
Libon is a messaging service designed to improve mobile communications through free calls, chat, and voicemail services regardless of operator or Internet access provider. As a mobile communications application, Libon processes billions of messages and calls while backing up billions of contact records. Join this webinar to learn best practices and pitfalls to avoid when tackling a migration project from a relational database (RDBMS) to Cassandra, and how Libon is now able to ingest massive volumes of high-velocity data with read and write latency below 10 milliseconds.
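The central modeling shift in such a migration is moving from normalized rows joined at read time to one denormalized partition per query. The toy below sketches that reshaping for an invented call-history table (the fields, keys, and data are illustrative, not Libon's schema): partition by the query's lookup key, cluster rows within the partition by timestamp.

```python
# Hedged sketch of the RDBMS-to-Cassandra modeling shift: group rows into
# one partition per query key, mirroring PRIMARY KEY ((user_id), ts).
# Table name, fields, and data are invented for illustration.

calls = [  # rows as they might come out of a relational "calls" table
    {"user_id": 1, "ts": "2014-01-01T10:00", "callee": "bob"},
    {"user_id": 1, "ts": "2014-01-02T11:00", "callee": "eve"},
    {"user_id": 2, "ts": "2014-01-01T09:00", "callee": "ana"},
]

def to_partitions(rows):
    """Group by partition key (user_id), clustered by timestamp."""
    parts = {}
    for r in rows:
        parts.setdefault(r["user_id"], []).append((r["ts"], r["callee"]))
    for p in parts.values():
        p.sort()                  # clustering order within a partition
    return parts

print(to_partitions(calls)[1])    # user 1's call history in one partition
```

Reading a user's history then touches a single partition on a single replica set, which is how low, predictable read latencies like the sub-10 ms figure become achievable.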
Estimating the Total Costs of Your Cloud Analytics Platform (DATAVERSITY)
Organizations today need a broad set of enterprise data cloud services with key data functionality to modernize applications and utilize machine learning. They need a platform designed to address multi-faceted needs by offering multi-function Data Management and analytics to solve the enterprise’s most pressing data and analytic challenges in a streamlined fashion. They need a worry-free experience with the architecture and its components.
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture (DATAVERSITY)
Whether to take data ingestion cycles off the ETL tool and the data warehouse or to facilitate competitive Data Science and algorithm building in the organization, the data lake, a place for unmodeled and vast data, will be provisioned widely in 2020.
Though it doesn't have to be complicated, the data lake has a few key design points that are critical, and it does need to follow some principles for success. Avoid building the data swamp, but not the data lake! The tool ecosystem is building up around the data lake, and soon many organizations will have a robust lake alongside a data warehouse. We will discuss policies to keep them straight, send data to its best platform, and keep users' confidence in their data platforms high.
Data lakes will be built in cloud object storage. We’ll discuss the options there as well.
Get this data point for your data lake journey.
ADV Slides: Comparing the Enterprise Analytic Solutions (DATAVERSITY)
Data is the foundation of any meaningful corporate initiative. Fully master the necessary data, and you’re more than halfway to success. That’s why leverageable (i.e., multiple use) artifacts of the enterprise data environment are so critical to enterprise success.
Build them once (keep them updated), and use again many, many times for many and diverse ends. The data warehouse remains focused strongly on this goal. And that may be why, nearly 40 years after the first database was labeled a “data warehouse,” analytic database products still target the data warehouse.
Data Engineer, Patterns & Architecture The Future: Deep-Dive into Microservic... (Igor De Souza)
With Industry 4.0, several technologies are used to analyze data in real time; maintaining, organizing, and building all this, on the other hand, is a complex and complicated job. Over the past 30 years, several ideas for centralizing the database in a single place as the united and true source of data have been implemented in companies, such as the data warehouse, NoSQL, the data lake, and the Lambda & Kappa architectures.
Software engineering, on the other hand, has been applying ideas to separate applications in order to facilitate and improve application performance, such as microservices.
The idea is to apply microservice patterns to the data and divide the model into several smaller ones. A good way to split it up is to model using DDD principles. And that's how I try to explain and define Data Mesh & Data Fabric.
SkyWatch is all about making Earth-observation data digestible and accessible. They believe that creating a single place to bring together the planet’s observational datasets will make new waves in geospatial analytics. In this session, we'll take a look at how companies can take advantage of cloud-native workflows to enable access and analysis across planetary-scale datasets. You’ll hear how SkyWatch leveraged AWS serverless technologies to build a company that transforms petabytes of sensor data from space into useful information. You'll also learn how Sinergise is merging a variety of data streams through products like Sentinel Hub, and creating actionable intelligence for its users.
The Shifting Landscape of Data IntegrationDATAVERSITY
Enterprises and organizations from every industry and scale are working to leverage data to achieve their strategic objectives — whether they are to be more profitable, effective, risk-tolerant, prepared, sustainable, and/or adaptable in an ever-changing world. Data has exploded in volume during the last decade as humans and machines alike produce data at an exponential pace. Also, exciting technologies have emerged around that data to improve our abilities and capabilities around what we can do with data.
Behind this data revolution, there are forces at work, causing enterprises to shift the way they leverage data and accelerate the demand for leverageable data. Organizations (and the climates in which they operate) are becoming more and more complex. They are also becoming increasingly digital and, thus, dependent on how data informs, transforms, and automates their operations and decisions. With increased digitization comes an increased need for both scale and agility at scale.
In this session, we have undertaken an ambitious goal of evaluating the current vendor landscape and assessing which platforms have made, or are in the process of making, the leap to this new generation of Data Management and integration capabilities.
Evolving From Monolithic to Distributed Architecture Patterns in the CloudDenodo
Watch full webinar here: https://goo.gl/rSfYKV
Gartner states in its Predicts 2018: Data Management Strategies Continue to Shift Toward Distributed,
“As data management activities are becoming more widespread in both distributed processing use cases, like IoT, and demands for new types of data, emerging roles such as data scientists or data engineers are expected to be driving the new data management requirements in the coming two years. These trends indicate that both the collection of data as well as the need to connect to data are rapidly becoming the new normal, and that the days of a single data store with all the data of interest — the enterprise data warehouse — are long gone.”
Data management solutions are becoming distributed, heterogeneous and extremely diverse.
Attend this session to learn:
• How to evolve architecture patterns in the cloud using data virtualization.
• How data virtualization accelerates cloud migration and modernization.
• Successful cloud implementation case studies.
Data Virtualization: An Essential Component of a Cloud Data LakeDenodo
Watch full webinar here: https://bit.ly/33GgqE9
Data Lake strategies seem to have found their perfect companion in cloud providers. After years of criticism and struggles in the on-prem Hadoop world, data lakes are flourishing thanks to the simplification in management and low storage prices provided by SaaS vendors. For some, this is the ultimate data strategy. For others, just a repetition of the same mistakes. Attend this session to learn:
- The benefits and shortcomings of cloud data lakes
- The role and value of data virtualization in this scenario
- New developments in data virtualization for the cloud
By 2020, 50% of all new software will process machine-generated data of some sort (Gartner). Historically, machine data use cases have required non-SQL data stores like Splunk, Elasticsearch, or InfluxDB.
Today, new SQL DB architectures rival the non-SQL solutions in ease of use, scalability, cost, and performance. Please join this webinar for a detailed comparison of machine data management approaches.
Modernizing Global Shared Data Analytics Platform and our Alluxio JourneyAlluxio, Inc.
Data Orchestration Summit 2020 organized by Alluxio
https://www.alluxio.io/data-orchestration-summit-2020/
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Sandipan Chakraborty, Director of Engineering (Rakuten)
About Alluxio: alluxio.io
Engage with the open source community on slack: alluxio.io/slack
Maximizing Oil and Gas (Data) Asset Utilization with a Logical Data Fabric (A...Denodo
Watch full webinar here: https://bit.ly/3g9PlQP
It is no news that Oil and Gas companies are under constant pressure to stay competitive, especially in the current climate, while striving to become data-driven at the heart of their processes in order to scale and gain greater operational efficiencies across the organization.
Hence the need for a logical data layer that helps Oil and Gas businesses move towards a unified, secure, and governed environment, optimize the potential of data assets across the enterprise, and deliver real-time insights.
Tune in to this on-demand webinar where you will:
- Discover the role of data fabrics and Industry 4.0 in enabling smart fields
- Understand how to connect data assets and the associated value chain to high impact domain areas
- See examples of organizations accelerating time-to-value and reducing NPT
- Learn best practices for handling real-time/streaming/IoT data for analytical and operational use cases
BDWW17 London - Steve Bradbury, GRSC - Big Data to the Rescue: A Fraud Case S...Big Data Week
In 2003, three criminals were jailed for nine years following the largest Card Fraud Case in Europe with a publicised loss to Card Companies of £2.21 million.
Find out how they were caught back then and how Big Data Technologies would have brought them to justice quicker.
Steve Bradbury was the Prime Investigator and Evidence Provider whose work led to the convictions, using data from floppy discs!
BDW17 London - Totte Harinen, Uber - Why Big Data Didn’t End Causal InferenceBig Data Week
Ten years ago there were rumours of the death of causal inference. Big data was supposed to enable us to rely on purely correlational data to predict and control the world. In this talk, I argue that the rumours were strongly exaggerated. Causal inference is becoming increasingly relevant thanks to improvements in inference methods and–ironically–the availability of data. Far from becoming marginalised, causal inference is today more relevant than it’s ever been.
BDW17 London - Rita Simoes, Boehringer Ingelheim - Big Data in Pharma: Sittin...Big Data Week
As far as data is concerned, Pharmaceutical Companies have always been clear-sighted and assertive on what insights to get from it, how, and what to do about it. And then the Big Data Era came in, with its frantic pace, transforming multiple industries all around but, for a number of reasons (privacy and data protection issues on top but not alone) keeping the Pharma Industry behind. How to run the extra mile to keep up with the powerful changes Big Data brings along has become a major concern. Strategic opportunities seem to be around the corner. Is the time to bridge gaps finally here?
BDW17 London - Mick Ridley, Exterion Media & Dale Campbell , TfL - Transformi...Big Data Week
Hello London, the ground-breaking media partnership between Transport for London (TfL) and Exterion Media, gives new opportunities for brands to talk to the London audience in innovative ways and generates vital revenue for London’s transport network.
TfL and Exterion have been working together in the Hello London partnership for a year. Part of the collaboration was around the utilisation of data collected by TfL to better inform advertising investment decisions.
This has led to ground-breaking work in the Out-of-Home advertising sector and the first example of this is Taps Segmentation. Developed by the TfL Data Science team, it allows Exterion to understand demographic patterns at stations based on aggregated contactless and Oyster card usage. This de-personalised data can be analysed for different times of the day and is a game changer – allowing Exterion to rethink how both their classic and digital inventory can be packaged and tailored specifically for clients.
The presentation will cover how TfL and Exterion have collaborated, the approach used by TfL and how Exterion are using it to generate revenue which is reinvested in the transport network.
BDW17 London - Abed Ajraou - First Utility - Putting Data Science in your Bus...Big Data Week
Data Science is now well established in our businesses, and everyone considers data as a key asset and critical for our competitiveness.
However, Data Science is not easy to manage; very often projects fail, and the investment made is not seen as profitable.
The aim of this talk is to share the knowledge in different areas:
* avoid classical mistakes in Data Science
* use the right Big Data technology
* apply the right methodology
* make the Data Science team more efficient
BDW17 London - Steve Bradbury - GRSC - Making Sense of the Chaos of DataBig Data Week
DISCOVER
UNDERSTAND
EVOLVE
Presenting a use case taking unstructured data into OCR, Entity Extraction, Case Management and simple to use Visualisations.
BDW17 London - Andy Boura - Thomson Reuters - Does Big Data Have to Mean Big ...Big Data Week
Considering the information security and privacy implications of ever-increasing data volume, and the consolidation of data repositories, in the context of the evolving threat landscape and regulations such as GDPR.
BDW17 London - Tom Woolrich, Financial Times - What Does Big Data Mean for th...Big Data Week
Content:
1. A brief history of the FT
2. What does Big Data mean to the FT?
3. The benefits of Big Data & how we use it
4. How we do it
5. What’s next for us?
BDW17 London - Andrew Fryer, Microsoft - Everybody Needs a Bit of Science in ...Big Data Week
In this session, we’ll look at the key challenges of doing data science at scale – not just the mechanics of processing big data but the things that matter in putting our data to work. Things like how to work on that data in a multi-disciplinary team, how to set up and evaluate experiments and how to move from dev to test and production so those insights can be put to work across the business as needed. I will be showing how we do this at Microsoft, which might sound proprietary but actually is mainly about how we weave open source technologies together – technologies like Spark and Python, and containers.
BDW16 London - Scott Krueger, skyscanner - Does More Data Mean Better Decisio...Big Data Week
We have seen vast improvements to data collection, storage, processing and transport in recent years. An increasing number of networked devices are emitting data and all of us are preparing to handle this wave of valuable data.
Have we, as data professionals, been too focused on the technical challenges and analytical results?
What about the data quality? Are we confident about it? How can we be sure we are making good decisions?
We need to revisit methods of assessing data quality on our modernized data platforms. The quality of our decision making depends on it.
BDW16 London - John Callan, Boxever - Data and Analytics - The Fuel Your Bran...Big Data Week
Unsuccessful marketing campaigns are leaving customers disgruntled, making them 40% less likely to return. Companies are casting aside useful data that can provide further insights into better products/better connections with customers. John Callan, VP of Marketing at Boxever will discuss how AI can change how businesses predict trends, reduce risks, and improve efficiency.
Audience will:
Gain expert-level understanding of data and machine learning that’s used in today’s market
Identify successful ways companies use machine learning to target customers with personalized content
Learn from major airlines use-cases to skillfully target customers and show them exactly what they want to see.
BDW16 London - John Belchamber, Telefonica - New Data, New Strategies, New Op...Big Data Week
Through the experiences of supporting a Multi-Country roll out using data to drive more effective Network capability, we will explain how we have:
Created new internal capability to support local countries, developed skill sets in each country, and provided technical infrastructure, algorithms and visualisations to drive the data culture and big data strategies across Telefonica business units.
Through this framework, we will explain how to blend technical and business needs to maximise the benefits and drive better business performance.
BDW16 London - Deenar Toraskar, Think Reactive - Fast Data Key to Efficient C...Big Data Week
The Basel Committee on Banking Supervision (BCBS) and local regulators have been focused on making banks safer and more resilient. A whole raft of new capital charges and constraints on liquidity and leverage has been introduced: Basel II.5, Basel III, Dodd-Frank, FRTB (“Basel IV”), etc. These have significantly increased the risk data management capabilities banks must have: capabilities that only big data tools can provide.
This talk will cover the challenges of building a position-aware risk management platform that properly aggregates all intra-day trading activity, monitors exposures and risk. The fast data stack can help banks create such a platform and provide a robust foundation to achieve compliance and, ultimately a significant competitive edge by making efficient use of capital.
BDW16 London - Jonny Voon, Innovate UK - Smart Cities and the Buzz Word BingoBig Data Week
With the United Nations predicting that 66 percent of the world population, including an extra 2.5 billion people, will be living in urban areas, our cities are getting extra attention. If we want to avoid the dystopian megacities of the future, then we must begin the technology transformation of our cities now.
BDW16 London - Josh Partridge, Shazam - How Labels, Radio Stations and Brand...Big Data Week
“At Shazam, we think data can be beautiful and stunningly inspiring. The pictures we paint with our data tell stories about changing culture, tastes, and shared discoveries. A truly great new song can sweep across the globe in a wave of Shazams that transcends politics, language, or religion”, Greg Glanday, Chief Revenue Officer at Shazam.
This presentation will offer the audience a few examples of how they can use the data from Shazam to get fantastic insight into the consumers` preferences, and how to take that insight and apply it to a brand.
By giving 3 or 4 great examples of what we do at Shazam, we will help anyone in the audience understand what this data means, really see it, and then be able to leverage it to make smart marketing decisions.
Main takeaway: a clear understanding of what Shazam data is and how brands can use it.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a PASSION for making things work, along with a knack for helping others understand how things work. He brings around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
DevOps and Testing slides at DASA ConnectKari Kakkonen
Slides by me and Rik Marselis from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what Testing in DevOps means. We finished with a lovely workshop in which the participants tried to find different ways to think about quality and testing in the different parts of the DevOps infinity loop.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
3. About Me
• Developer (forcibly) turned Product Manager
• Has been designing system architectures as well as products for the hosting market for the last 8 years
• Passionate about distributed computing, HPC environments and supercomputers
@alexandrubordei
@bigstepinc
alex@bigstep.com
4. Big Data technologies for the mainstream, and vice versa
• Due to the cap on CPU frequency, horizontal scaling is the only dimension left to grow into.
• The client-server architecture is outdated.
• All components of an application must be as independent and as scalable as possible.
• Big data technologies are increasingly used in general-purpose applications.
• In-memory technologies are orders of magnitude faster than the others.
• Docker promotes and simplifies large-scale application management by using low-overhead containers instead of VMs.
• Mesos, used with Docker and some additional services, creates a Distributed OS.
5. Source: Tori Randall, Ph.D. prepares a 550-year old Peruvian child mummy for a CT scan
Data as artefacts
• Just like archaeological artefacts, old data can yield new insights if correlated in a novel way or analysed with a new technology.
• Throwing away data because it is of no use today might cripple the business tomorrow.
6. The Data Lake
• Promote data exploration, innovation
• Enable automatic decision making, to Uberize the business
• Perform new, deeper analytics by focusing on correlations between diverse data sources: clickstream, social media, machine data, documents, audio/video, etc.
• Stores data in its original format
• Stores structured and unstructured data together
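The last two bullets are the crux of a lake: payloads are kept byte-for-byte in their original format, with structure applied only at read time. A minimal sketch of such a raw zone in Python (the class, paths, and metadata fields are illustrative, not Bigstep's implementation):

```python
import json
import pathlib
import tempfile
import time

class RawZone:
    """Toy raw zone: payloads are stored byte-for-byte in their original
    format, with a small JSON sidecar of metadata for later discovery."""

    def __init__(self, root):
        self.root = pathlib.Path(root)

    def ingest(self, source: str, name: str, payload: bytes) -> pathlib.Path:
        target = self.root / source
        target.mkdir(parents=True, exist_ok=True)
        path = target / name
        path.write_bytes(payload)  # original format, untouched
        meta = {"source": source, "name": name,
                "bytes": len(payload), "ingested_at": time.time()}
        (target / (name + ".meta.json")).write_text(json.dumps(meta))
        return path

# Structured and unstructured data land side by side, schema-on-read:
zone = RawZone(tempfile.mkdtemp())
clicks = zone.ingest("clickstream", "events.json", b'{"user": 1, "page": "/"}')
notes = zone.ingest("documents", "note.txt", b"unstructured free text")
```

The same pattern applies whether the root is a local directory, HDFS, or cloud object storage.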
8. A Data Lake ≠ A giant universal database
• A new data scientist needs:
– Domain knowledge
– Clean data
– An easy-to-work-with environment
– Up-to-date data documentation
– An ETL job in place to get new data
• Knowledge is not just in the form of data; it is also in the form of code
• HDFS = Just the storage medium
9. • A service provides an API to the rest of the services
• A service is developed by a team that has enough domain knowledge to import and clean the data
• A service could simply store the results of its work periodically back into the shared DB
• Some services simply offer processing and do not store data at all
A Data Lake = A collection of Data Services
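These roles can be sketched as plain Python (a toy model; every class and method name here is hypothetical): one service owns ingestion and cleaning and periodically persists its results back to a shared store, while another offers pure processing and stores nothing:

```python
class SharedStore:
    """Stands in for the shared DB / HDFS mentioned above."""
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

class DriverService:
    """Owned by a team with driver domain knowledge: imports and
    cleans the data, then stores results back into the shared DB."""
    def __init__(self, store):
        self.store = store

    def ingest(self, raw_rows):
        clean = [r.strip().lower() for r in raw_rows if r.strip()]
        self.store.put("drivers", clean)  # results persisted back
        return clean

    def drivers(self):
        """The API this service provides to the rest of the services."""
        return self.store.get("drivers")

class RouteService:
    """A pure processing service: computes routes, stores no data at all."""
    def plan(self, drivers):
        return {d: "route-%d" % i for i, d in enumerate(drivers)}

store = SharedStore()
driver_svc = DriverService(store)
driver_svc.ingest(["  Alice ", "BOB", ""])
routes = RouteService().plan(driver_svc.drivers())
```

The point is the ownership boundary: other teams call `drivers()`, never the underlying store.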
10. An example - A big logistics company
driver service pickup and drop-off service
11. An example - A big logistics company
[Chart: expected parcels for cluster 51 on 12.11.2015 (Birmingham), plotted against time of day]
cluster service route service
13. Data Services - It’s about the teams and not the
technology
• Conway's Law: “[…] organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations”
14. Data teams
• A data service has its own release cycle
• Ultra-specialisation is reduced
• Communication among members of the same team is better
• Each service has a clear owner
15. Polyglot persistence
• The data does not necessarily have to reside in the same place (e.g. the same HDFS cluster)
16. Data Service Isolation - Docker
• A Docker container is neither a VM nor a VPS
• Application level virtualisation
• Offers highest levels of consolidation
• Same kernel, apps are isolated processes
• No performance overhead
• Instant deployment
• Single app per container
• Git-like deployment method with branches and repositories
• Isolated network stack (network namespaces)
• Isolated zero-copy UnionFS (chroot)
• CPU and Memory isolation (cgroups)
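These isolation bullets map directly onto `docker run` flags; a hedged CLI sketch (the image names and resource limits below are placeholders, not part of the original deck):

```shell
# One app per container, with cgroup-based CPU/memory limits and an
# isolated network namespace (image names are hypothetical):
docker run -d --name parcel-api --cpus 2 --memory 512m --network bridge myorg/parcel-api:latest

# Git-like workflow: tag and push image layers to a registry
docker tag myorg/parcel-api:latest registry.example.com/parcel-api:v2
docker push registry.example.com/parcel-api:v2
```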
19. Mesos & Marathon
• Allows an app’s environment to be software defined
• Docker (currently) knows only about one host
• Orchestration layer for Docker containers
• Out-of-the-box load balancing
• Monitors and restarts containers if they fail
• API driven
• Useful for creating high-performance, distributed, fault-tolerant architectures
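The "software defined" point becomes concrete in Marathon's JSON app definition; a hedged sketch (the service id, image, and limits are invented for illustration):

```json
{
  "id": "/data-services/cluster-service",
  "instances": 3,
  "cpus": 0.5,
  "mem": 512,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "myorg/cluster-service:latest",
      "network": "BRIDGE"
    }
  },
  "healthChecks": [
    { "protocol": "HTTP", "path": "/health" }
  ]
}
```

Marathon keeps `instances` copies running and restarts a container whenever its health check fails, which is the monitoring behaviour described above.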
20. Embrace streaming
• In business, reaction time matters
• Resource usage patterns for streaming resemble those of web-centric systems, and need consolidation for efficiency as well as high availability
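The core streaming primitive behind fast reaction times is a windowed aggregate; a minimal stdlib sketch over a hypothetical (tick, count) event stream:

```python
from collections import deque

def windowed_counts(events, window=3):
    """Yield (tick, total) where total sums the event counts seen in
    the last `window` ticks, so results update as events arrive."""
    buf = deque(maxlen=window)  # old ticks fall out automatically
    for tick, n_events in events:
        buf.append(n_events)
        yield tick, sum(buf)

stream = [(0, 2), (1, 5), (2, 1), (3, 4)]  # hypothetical input
totals = list(windowed_counts(stream))
```

A framework like Spark Streaming provides the same sliding-window idea at cluster scale.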
21. Spark with Mesos
• Spark & Spark Streaming are great candidates for building data microservices, as they are very fast and easy to use
• Spark can use Mesos as a resource manager
• Spark needs YARN to access secure HDFS; for YARN on Mesos, see Myriad
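A hedged sketch of submitting such a job with Mesos as the resource manager (the ZooKeeper URL, class, and jar names are placeholders):

```shell
# Point Spark at the Mesos masters registered in ZooKeeper and run in
# coarse-grained mode (one long-lived Mesos task per Spark executor):
spark-submit \
  --master mesos://zk://zk1:2181,zk2:2181/mesos \
  --conf spark.mesos.coarse=true \
  --class com.example.ParcelStream \
  parcel-stream.jar
```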
22. Conclusions
• Data (micro-)services allow building a data ecosystem within your organisation. A team is a provider of data to other teams.
• An agile data environment enables an agile business. New tools must be inserted quickly into the mix. (E.g.: found out about Looker today? Why not try it on the data.)
• There are methods to improve consolidation ratios by 40% while preserving the performance of data services
23. About Bigstep - Data Lake as a service
• High-performance, bare-metal cloud purpose-built for big data
• Automatically deployed (managed and unmanaged) big data software stacks
• HDFS as a Service
• Managed Data Services platform based on Docker
• Purely on-demand: bare metal instances get deployed in 2 minutes and can be deleted anytime
• Locally attached drives (12 or 24 2TB drives, as well as locally attached NVMe)
• SDN-controlled Layer 2 networking (40Gbps per instance, cut-through)
• Distributed SSD-based storage fabric