The Case for Data Centric Hyperconvergence


Eran Rom

End-to-end deep learning on unstructured data, showcasing data-centric hyperconvergence.


Data-Centric Hyperconvergence with OpenStack Swift & Storlets
Mail: eran@itsonlyme.name
IRC: eranrom
Twitter: @EranRom

Takeaways
1. Storlets can be used to do REAL stuff (and quite easily)
2. Storlets enable cost-efficient data-centric services

End-to-End Deep Learning on Unstructured Data
Training Set: ~100 tagged pictures of Trump, Obama, Merkel and Bibi (example tags: obama, merkel)
Test Set: Videos of Trump, Obama, Merkel and Bibi

Step 1: Data Preparation
Training Set -> Extract face storlet -> Extracted Training Set
1. Identify face location
2. Crop
3. Resize
Example tag: merkel
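
The crop and resize steps listed above can be sketched in a few lines. This is a minimal, dependency-free illustration and not the demo's code: the image is assumed to be a plain list of pixel rows, the face box `(x, y, w, h)` is assumed to come from a face detector that is not shown, and the names `crop`, `resize_nearest` and `extract_face` are hypothetical. Nearest-neighbour resizing is used only because it is the simplest option to show.

```python
def crop(image, box):
    """Cut the rectangle box = (x, y, w, h) out of a row-major image."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def resize_nearest(image, new_w, new_h):
    """Resize by picking, for each target pixel, the nearest source pixel."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def extract_face(image, box, size):
    """Steps 2 and 3 of the slide: crop to the face box, then resize."""
    return resize_nearest(crop(image, box), size, size)


if __name__ == "__main__":
    # A 4x4 grayscale test image whose pixel value encodes its position.
    picture = [[r * 4 + c for c in range(4)] for r in range(4)]
    # Hypothetical detector result: a 2x2 face at column 1, row 1.
    face = extract_face(picture, (1, 1, 2, 2), 4)
    for row in face:
        print(row)
```

Resizing every face to the same size gives the training step uniformly shaped inputs, which is presumably why the slide lists it as part of data preparation.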

Step 2: Supervised Learning
Extracted Training Set -> Train model storlet -> model
Tags: trump, obama, merkel, bibi

Step 3: Model Testing
Test set (Video) + model -> Recognize face storlet
Example result: bibi

End-to-End Deep Learning on Unstructured Data
Training Set -> Extract face storlet -> Extracted Training Set -> Train model storlet -> model
Test set (Video) + model -> Recognize face storlet
X100

Demo Setup: S2AIO with Jupyter Notebook
Swift and Storlets all in one

Local Scripts & S3 Vs. S2AIO
S2AIO: Swift and Storlets all in one
S3: S3 Client With OpenCV and SKLearn
Dedicated M4.2XLarge (8 CPUs 32GB RAM)

S2AIO on EC2 Vs. EC2/S3
Dedicated M4.2XLarge (8 VCPUs, 32GB RAM, High Network Performance)
[Chart: time in seconds (axis 0 to 70) for the Extract, Train and Recognize steps; series: EC2 Swift & Storlets, EC2 & S3]

But the point is…

Storage Vs. Networking Growth
[Chart: growth factor (axis 0 to 60) by year, 2010 to 2018-2020, for SSD, HDD, Ethernet and Infiniband. Latest growth factors shown: SSD 50, HDD 8, Ethernet 20, Infiniband 3.57]
Sources:
Ethernet: http://www.ethernetalliance.org/roadmap/
Infiniband: http://www.infinibandta.org/content/pages.php?pg=technology_overview

16 Disks and 4 Network Ports Servers
[Chart: Storage Vs. Networking Growth, per-server growth factor (axis 0.00 to 900.00): SSD 800.00, HDD 128.00, Ethernet 80.00, Infiniband 14.29]

Thank You!
All Demo Code: https://github.com/eranr/e2emlstorlets
My Blog: http://itsonlyme.name/blog

Editor's Notes

Storlets are about co-locating storage and compute. That is, instead of bringing the data to the compute, bring the compute, which is much smaller, to the data. The Stork is the Storlets project mascot.
More specifically, storlets let you co-locate Dockerized computations inside OpenStack Swift in a serverless fashion.
Swift is a massively scalable storage system with a simple API for storing and retrieving data blobs. It takes care of data redundancy via, e.g., replication across failure domains.
We use Docker to run the compute near the data in a secure and isolated manner.
By serverless we mean that an end user can upload the program to run to Swift, just like any other data blob, and we take care of the rest.
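
A rough sketch of what this looks like from the client side, assuming the Storlets convention of naming the storlet to run in an `X-Run-Storlet` request header on an otherwise ordinary Swift object request. The helper name `storlet_get_request`, the example URL, token, container, object and storlet names are all hypothetical, and the code only builds the request rather than sending it.

```python
from urllib.parse import quote


def storlet_get_request(storage_url, token, container, obj, storlet_name):
    """Build (url, headers) for a Swift GET that asks for a storlet to run.

    Assumption: the storlet itself was uploaded earlier as a regular object,
    and Swift runs it when it sees the X-Run-Storlet header.
    """
    url = "/".join([storage_url.rstrip("/"), quote(container), quote(obj)])
    headers = {
        "X-Auth-Token": token,          # standard Swift authentication header
        "X-Run-Storlet": storlet_name,  # which uploaded program to invoke
    }
    return url, headers


if __name__ == "__main__":
    # Hypothetical endpoint and names, for illustration only.
    url, headers = storlet_get_request(
        "http://swift.example/v1/AUTH_test/",
        "token123",
        "train",
        "merkel 1.jpg",
        "extract_face.py",
    )
    print(url)
    print(headers)
```

The request is a plain object GET plus one extra header, which is the sense in which the user never manages a server: the response body is whatever the storlet writes, computed next to the stored object.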
This is what I refer to as data-centric hyperconvergence. As in traditional hyperconvergence, the idea is to have a storage, compute and networking solution that can scale horizontally. Traditional hyperconvergence, though, is focused on general-purpose virtual environments and often goes hand in hand with high-end flash arrays. It is marketed as a solution for big data analytics over semi-structured data. Here we are focusing on unstructured data, which is the majority of the data. Hyperconvergence and data-centric hyperconvergence are complementary technologies: one can think of the data-centric part as ‘transforming’ the unstructured data into semi-structured data that can be consumed with traditional big data machinery. As such, I think that data-centric hyperconvergence should also have a data management component in the mix, e.g. metadata search.
The graph shows the growth factor of a single SSD/HDD and of a single networking port. In Ethernet we see growth from 10Gb in 2010 to 100Gb in 2014, and today we are starting to see 200Gb. Infiniband started at 56Gb in 2011, was, like Ethernet, at 100Gb in 2014, and is now at 200Gb. HDDs have not been growing as fast, with the X8 factor due to helium-filled HDDs. In SSDs, however, we see really big growth, with Seagate announcing a 60TB drive last year and Toshiba a 100TB drive to come out this year. Now, consider that in a typical storage server there are many more disks than network ports…
Considering a 16-disk server with 4 network ports, we see a much bigger difference in the growth factor.
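
The per-server figures on the slide (800, 128, 80 and 14.29) appear to be the per-device growth factor multiplied by the number of devices in the server. The sketch below reproduces that arithmetic under that assumption. The Ethernet, Infiniband and HDD factors come from the notes above; the SSD factor of 50 is read from the slide chart (800 / 16); the function and variable names are hypothetical.

```python
# Per-device growth factors, as quoted in the notes and the slide chart.
DEVICE_GROWTH = {
    "SSD": 50.0,             # latest value on the slide chart (assumption)
    "HDD": 8.0,              # the "X8" factor from the notes
    "Ethernet": 200 / 10,    # 10Gb (2010) -> 200Gb
    "Infiniband": 200 / 56,  # 56Gb (2011) -> 200Gb
}

# The server from the slide: 16 disks and 4 network ports.
DEVICE_COUNT = {"SSD": 16, "HDD": 16, "Ethernet": 4, "Infiniband": 4}


def per_server_growth(device_growth, device_count):
    """Scale each per-device growth factor by the number of such devices."""
    return {
        name: round(device_growth[name] * device_count[name], 2)
        for name in device_growth
    }


if __name__ == "__main__":
    server = per_server_growth(DEVICE_GROWTH, DEVICE_COUNT)
    for name, factor in server.items():
        print(f"{name}: {factor}")
    # Gap between SSD and Ethernet, per device and per server.
    print(DEVICE_GROWTH["SSD"] / DEVICE_GROWTH["Ethernet"])
    print(server["SSD"] / server["Ethernet"])
```

Under these numbers the SSD-to-Ethernet gap widens from 2.5x per device to 10x per server, because there are four times as many disks as ports.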
