This document discusses object striping in Swift to optimize performance for large-object storage. It proposes segmenting objects into fixed-size stripes and storing them sparsely across multiple object servers. Two approaches are described: client-unaware striping, where the proxy performs striping and collation, and client-aware striping, where clients handle striping directly. The solution requires changes to the container and object server schemas, the addition of stripe metadata, and modifications to services such as the replicator to work at the stripe level instead of the object level. This approach aims to improve throughput, reduce storage usage, and make better use of multiple servers through parallel reads and writes of stripes.
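The stripe layout described above can be sketched as follows. The stripe size, the round-robin placement policy, and all function names are illustrative assumptions, not Swift's actual on-disk format or API:

```python
# Sketch of fixed-size object striping as described above.
# Stripe size and round-robin placement are illustrative assumptions.

def make_stripes(data: bytes, stripe_size: int):
    """Split an object into fixed-size stripes (the last may be short)."""
    return [data[i:i + stripe_size] for i in range(0, len(data), stripe_size)]

def place_stripes(stripes, servers):
    """Assign stripes round-robin across object servers (sparse placement)."""
    placement = {s: [] for s in servers}
    for idx, stripe in enumerate(stripes):
        placement[servers[idx % len(servers)]].append((idx, stripe))
    return placement

def collate(placement):
    """Proxy-side collation: reassemble the object from its stripes."""
    indexed = [pair for stripes in placement.values() for pair in stripes]
    return b"".join(stripe for _, stripe in sorted(indexed))
```

In the client-unaware approach the proxy would run `make_stripes` and `collate`; the client-aware approach moves the same logic into the client.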
Building Real-Time Web Applications with Vortex Web (Angelo Corsaro)
The Real-Time Web is growing rapidly, and as a consequence an increasing number of applications require soft real-time interactions with the server side as well as with peer web applications. In addition, real-time web technologies are seeing swift adoption in traditional systems as a means of providing portable and ubiquitously accessible thin-client applications.
In spite of this trend, few high-level communication frameworks exist that allow efficient and timely data exchange between web applications, as well as with the server side and back-end systems. Vortex Web is one of the first technologies to bring the powerful OMG Data Distribution Service (DDS) abstractions to the world of HTML5/JavaScript applications. With Vortex Web, HTML5/JavaScript applications can seamlessly and efficiently share data in a timely manner amongst themselves as well as with any other kind of device or system that supports the standard DDS Interoperability wire protocol (DDSI).
This presentation will (1) introduce the key abstractions provided by Vortex Web, (2) provide an overview of its architecture and explain how Vortex Web uses Web Sockets and Web Workers to provide low latency and high throughput, and (3) get you started developing real-time web applications.
Microservices Architecture with Vortex, Part II (Angelo Corsaro)
As we saw in Part I, microservice architectures are a modular approach to system-building in which we decompose complex applications into small, autonomous, loosely coupled processes that communicate through language- and platform-independent APIs. In this webcast we will briefly refresh the key ideas behind microservice architectures and then look at how they can be easily implemented with Vortex. Specifically, we will (1) look into the key idioms and patterns used when implementing Vortex microservices and (2) walk you through the design and implementation of a microservice application for a real-world use case.
Introduced in 2004, the Data Distribution Service (DDS) has been steadily growing in popularity and adoption. Today, DDS is at the heart of a large number of mission- and business-critical systems, such as air traffic control and management, train control systems, energy production systems, medical devices, autonomous vehicles, smart cities, and NASA's Kennedy Space Center launch system.
Considering the technological trends toward data-centricity and the current rate of adoption, tomorrow DDS will be at the heart of an enormous number of Industrial IoT systems.
To help you become an expert in DDS and apply your skills in the growing DDS market, we have designed the DDS in Action webcast series. This series is a learning journey through which you will (1) discover the essence of DDS, (2) understand how to effectively exploit DDS to architect and program distributed applications that perform and scale, (3) learn the key DDS programming idioms and architectural patterns, (4) understand how to characterise DDS performance and configure it for optimal latency and throughput, (5) grow your system to Internet scale, and (6) secure your DDS system.
Reactive architectures are emerging as the way to build systems that are responsive, scalable, resilient and event-driven. In other words, systems that deliver highly responsive user experiences with a real-time feel, and that are ready to be deployed on multicore and cloud computing architectures. The Reactive Manifesto (see http://www.reactivemanifesto.org/) captures the key traits that characterize reactive architectures.
The Data Distribution Service (DDS) embodies the principles enumerated by the Reactive Manifesto and provides a very good platform for building reactive systems. In this webcast I will (1) introduce the key principles of reactive architectures, (2) explain the DDS features that are essential for building reactive systems, and (3) introduce some programming techniques that remove inversion of control while keeping applications event-driven.
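One such technique for removing inversion of control while staying event-driven is to adapt a callback-based reader into an asynchronous stream that the application pulls from. A minimal plain-Python sketch; the `DataReader` class here is a hypothetical stand-in, not the DDS API:

```python
import asyncio

class DataReader:
    """Stand-in for a callback-based reader (not the actual DDS API)."""
    def __init__(self):
        self._listener = None
    def set_listener(self, fn):
        self._listener = fn
    def deliver(self, sample):          # simulates data arriving off the wire
        if self._listener:
            self._listener(sample)

async def samples(reader):
    """Bridge callback to async stream: the app iterates instead of
    handing its control flow to a listener (no inversion of control)."""
    queue = asyncio.Queue()
    reader.set_listener(queue.put_nowait)
    while True:
        yield await queue.get()

async def main():
    reader = DataReader()
    received = []

    async def consume():
        async for s in samples(reader):   # ordinary sequential code
            received.append(s)
            if len(received) == 3:
                break

    task = asyncio.create_task(consume())
    await asyncio.sleep(0)                # let the consumer install its listener
    for s in ("a", "b", "c"):
        reader.deliver(s)
        await asyncio.sleep(0)            # let the consumer drain the queue
    await task
    return received
```

The application keeps its own loop and sequencing; the callback only feeds a queue.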
The document discusses the Vortex platform and how it can be used to build a microblogging application called ChirpIt. It describes ChirpIt's requirements, including supporting chirping, re-chirping, likes/dislikes, trending topics/users, and scaling to millions of users. It then outlines ChirpIt's architecture using Vortex features like durability service, quality of service policies, and temporal properties to store and analyze chirp data.
CORBA, DCOM, and Globe all provide distributed object models but have different design goals and implementations. CORBA aims for interoperability and provides many standardized services, while DCOM focuses on functionality within Windows environments. Globe emphasizes scalability through replication-based fault tolerance and location transparency using a global naming service. All support synchronous communication but Globe does not provide asynchronous messaging or callbacks like CORBA and DCOM. Security approaches also differ, with Globe requiring more work to support standardized mechanisms.
Overview of Microsoft .Net Remoting technology (Peter R. Egli)
Overview of Microsoft .Net Remoting technology for inter-object communication.
.Net Remoting is a .Net-based distributed object technology for accessing .Net objects that reside in a different application domain (a different process on the same machine, or a process on another machine). .Net Remoting shares concepts with DCOM (Distributed Component Object Model), but simplifies communication with regard to transport ports and the programming model.
Microsoft's newer WCF (Windows Communication Foundation) provides a unified communication and programming model, and has replaced older technologies like .Net Remoting and DCOM in many applications.
This document provides an overview of WSO2 Complex Event Processor (CEP). It discusses key CEP concepts like event streams, queries, and execution plans. It also demonstrates various query patterns for filtering, transforming, enriching, joining, and detecting patterns in event streams. The document outlines the architecture of CEP and shows how to define streams, tables, queries, and adaptors to integrate CEP with external systems. It provides examples of windowing, aggregations, functions, and extensions that can be used in Siddhi queries to process event streams.
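The windowing and aggregation patterns mentioned above can be illustrated with a length-based sliding window. This is a plain-Python sketch of the concept only, not Siddhi syntax; the class and function names are invented for illustration:

```python
from collections import deque

class SlidingWindowAvg:
    """Length-based sliding window over an event stream, emitting a
    running average -- the idea behind CEP queries that combine a
    length window with an avg() aggregation."""
    def __init__(self, length):
        self.window = deque(maxlen=length)   # old events fall out automatically
    def on_event(self, value):
        self.window.append(value)
        return sum(self.window) / len(self.window)

def filter_stream(events, predicate):
    """Filter pattern: pass through only events matching a condition."""
    return [e for e in events if predicate(e)]
```

Each incoming event updates the window and emits a fresh aggregate, which is the essential mechanic behind windowed CEP queries.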
PyModESt: A Python Framework for Staging of Geo-referenced Data on the Coll... (Andreas Schreiber)
PyModESt is a Python framework that allows meteorological data providers to easily implement staging scripts for the Collaborative Climate Community Grid (C3-Grid) in a modular way. It handles common tasks like communication, metadata handling, and file management so data providers can focus on retrieving and packaging their data. Open issues include standardizing variable names across data sets and improving authorization methods.
PrismTech's Vortex is a platform that provides seamless, ubiquitous, efficient and timely data sharing across mobile, embedded, desktop, cloud and web applications. Today Vortex is the enabling technology at the core of the most innovative Internet of Things and Industrial Internet applications, such as Smart Cities, Smart Grids, and Smart Traffic.
This two-part tutorial presentation (1) introduces the key concepts of Vortex, (2) gets you started with using Vortex to efficiently exchange data across mobile, embedded, desktop, cloud and web applications, and (3) provides a series of best practices, patterns and idioms to get the best out of Vortex.
This document discusses several distributed file systems including NFS, Coda, Plan 9, xFS, and SFS. It provides details on each system's architecture, communication methods, naming, caching, replication, fault tolerance, security and access control approaches. The key aspects covered are remote access models, file operations, attributes, sharing semantics, client caching, server organization, naming schemes, and consistency models used by the different systems.
Apache Spark has emerged over the past year as the imminent successor to Hadoop MapReduce. Spark can process data in memory at very high speed, while still being able to spill to disk if required. Spark's powerful yet flexible API allows users to write complex applications very easily, without worrying about the internal workings and how the data gets processed on the cluster.
Spark comes with an extremely powerful Streaming API to process data as it is ingested. Spark Streaming integrates with popular data ingest systems like Apache Flume, Apache Kafka and Amazon Kinesis, allowing users to process data as it comes in.
In this talk, Hari will discuss the basics of Spark Streaming, its API, and its integration with Flume, Kafka and Kinesis. Hari will also discuss a real-world example of a Spark Streaming application, and how code can be shared between a Spark application and a Spark Streaming application. Each stage of the application's execution will be presented, which can help readers understand good practices for writing such an application. Hari will finally discuss how to write a custom application and a custom receiver to receive data from other systems.
A Locality Sensitive Hashing Filter for Encrypted Vector Databases (Junpei Kawamoto)
This document describes a locality sensitive hashing (LSH) filter for encrypted vector databases. The LSH filter allows cloud servers to filter database tuples without decrypting encrypted data, improving query efficiency. A whitening transformation is applied to encrypted vectors before inserting them into the LSH index to reduce skew in the vector space. When a query is received, the server computes the LSH of the encrypted query vector to identify candidate tuple groups for similarity checking based on the LSH hash values. Tuples with low estimated similarity are skipped, while those with high similarity have their actual encrypted similarity computed.
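The filtering step can be sketched with random-hyperplane LSH. The whitening transformation and the encrypted-domain similarity computation are omitted here, and all names are illustrative, not the paper's actual construction:

```python
import random

def make_hyperplanes(dim, bits, seed=0):
    """Random Gaussian hyperplanes, one per hash bit."""
    rnd = random.Random(seed)
    return [[rnd.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]

def lsh(vec, planes):
    """Random-hyperplane LSH: one bit per side of each hyperplane."""
    return tuple(int(sum(p * x for p, x in zip(plane, vec)) >= 0)
                 for plane in planes)

def build_index(vectors, planes):
    """Group tuples by hash so similar vectors land in the same bucket."""
    index = {}
    for i, v in enumerate(vectors):
        index.setdefault(lsh(v, planes), []).append(i)
    return index

def candidates(query, index, planes):
    """Server-side filter: only tuples in the query's bucket get the
    (expensive) exact similarity check; the rest are skipped."""
    return index.get(lsh(query, planes), [])
```

The point is that the server computes hashes and bucket lookups without ever needing the plaintext vectors.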
WSO2 Complex Event Processor (CEP) 4.0 provides real-time analytics capabilities. Some key features of CEP 4.0 include a re-architecture of the CEP server, improvements to the Siddhi query engine (version 3.0), scalable distributed processing using Apache Storm, and better high availability support. CEP 4.0 also offers a domain specific execution manager, real-time dashboards, improved usability, and pluggable event receivers and publishers. It can execute queries in a distributed manner across multiple nodes for scalability.
Cooperative Demonstrable Data Retention for Integrity Verification in Multi-C... (Editor IJCATR)
Demonstrable data retention (DDR) is a technique that certifies the integrity of data in storage outsourcing. In this paper we propose an efficient DDR protocol that prevents an attacker from gaining information from multiple cloud storage nodes. Our technique is designed for distributed cloud storage and supports the scalability of services and data migration; it cooperatively stores and maintains the client's data across multi-cloud storage. To ensure the security of our technique we use a zero-knowledge proof system, which satisfies the zero-knowledge, knowledge-soundness and completeness properties. We present a Cooperative DDR (CDDR) protocol based on a hash index hierarchy and homomorphic verification responses. To optimize performance, we use a novel technique for selecting optimal parameter values that reduces the storage overhead and computation costs for clients and service providers.
Spark and Spark Streaming internals allow for low latency, fault tolerance, and diverse workloads. Spark uses a Resilient Distributed Dataset (RDD) model where data is partitioned across a cluster. A directed acyclic graph (DAG) is used to schedule tasks across stages in an optimized way. Spark Streaming runs streaming computations as small deterministic batch jobs by chopping live streams into batches and processing them using RDD transformations and actions.
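The micro-batch model described above, chopping a live stream into small batches and running the same deterministic job on each, can be sketched in plain Python. This illustrates the model only, not the Spark API, and requires no Spark installation:

```python
def chop(stream, batch_size):
    """Chop a live stream into small deterministic batches (micro-batching)."""
    batch = []
    for record in stream:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:                      # flush the final partial batch
        yield batch

def process(batches, transform, action):
    """Run the same batch job (transformation + action) on every micro-batch,
    the way Spark Streaming applies RDD operations to each batch."""
    return [action([transform(x) for x in b]) for b in batches]
```

Because every batch goes through the same deterministic pipeline, a lost batch can simply be recomputed, which is the basis of the fault-tolerance story summarized above.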
In this work we examine the fundamental problem of interoperable and interconnected blockchains. In particular, we begin by introducing the Multi-Distributed Ledger Object (MDLO), the result of aggregating multiple Distributed Ledger Objects (DLOs; a DLO is a formalization of the blockchain), which supports append and get operations on records (e.g., transactions) from multiple clients concurrently. Next we define the AtomicAppends problem, which emerges when the exchange of digital assets between multiple clients may involve appending records to more than one DLO. We examine the solvability of this problem assuming rational and risk-averse clients that may fail by crashing, under different client utility and append models, timing models, and client failure scenarios. We show that in some cases the existence of an intermediary is necessary for a solution. We propose implementing such an intermediary over a specialized blockchain, which we term the Smart DLO (SDLO), and we show how it can be used to solve the AtomicAppends problem even in an asynchronous, client-competitive environment where all clients may crash.
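The ledger objects involved can be sketched minimally as follows. The record format and the intermediary's coordination rule are deliberate simplifications for illustration, not the paper's formal model:

```python
class DLO:
    """A Distributed Ledger Object: an append-only sequence of records."""
    def __init__(self):
        self._records = []
    def append(self, record):
        self._records.append(record)
    def get(self):
        return list(self._records)

class SmartDLO:
    """Intermediary for AtomicAppends: it holds each client's half of an
    asset exchange and appends to both DLOs only once both halves are in,
    so no one-sided append can ever occur."""
    def __init__(self, dlo_a, dlo_b):
        self.dlo_a, self.dlo_b = dlo_a, dlo_b
        self.pending = {}
    def submit(self, exchange_id, side, record):
        self.pending.setdefault(exchange_id, {})[side] = record
        halves = self.pending[exchange_id]
        if "a" in halves and "b" in halves:   # both clients committed
            self.dlo_a.append(halves["a"])
            self.dlo_b.append(halves["b"])
            del self.pending[exchange_id]
            return True
        return False                          # still waiting; nothing appended
```

If either client crashes before submitting, neither ledger is modified, which captures the all-or-nothing property the AtomicAppends problem asks for.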
Remote Log Analytics Using DDS, ELK, and RxJS (Sumant Tambe)
Autonomous Probing and Diagnostics for remote IT log data using RTI Connext Data Distribution Service (DDS), Elasticsearch-Logstash-Kibana (ELK), and Reactive Extensions for JavaScript (RxJS). Github: https://github.com/rticommunity/rticonnextdds-reactive/tree/master/javascript
Block-Level Message-Locked Encryption for Secure Large File De-duplication (IRJET Journal)
This document proposes a Block-Level Message-Locked Encryption (BL-MLE) scheme to achieve secure de-duplication of large files in cloud storage. BL-MLE divides files into blocks, encrypts each block using a convergent encryption method, and stores metadata to enable block-level and file-level de-duplication while maintaining data security and proof of ownership. The scheme uses AES, RSA, SHA-512 and other cryptographic algorithms. It aims to provide an efficient and secure method for de-duplicating encrypted data at both the file and block level in cloud storage.
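The core of message-locked (convergent) encryption is that the key is derived from the block itself, so identical blocks produce identical ciphertexts and can be de-duplicated. A stdlib-only sketch; the hash-based keystream below is a toy stand-in for the AES used in the scheme, and the function names are invented:

```python
import hashlib

def convergent_key(block: bytes) -> bytes:
    """Message-locked key: derived from the block, so identical blocks
    yield identical keys (and hence identical ciphertexts)."""
    return hashlib.sha256(block).digest()

def dedup_tag(key: bytes) -> bytes:
    """Block identifier the server can compare without seeing plaintext."""
    return hashlib.sha512(key).digest()

def encrypt(block: bytes, key: bytes) -> bytes:
    """Toy XOR keystream cipher standing in for AES (illustration only;
    decryption is the same operation)."""
    stream = b""
    counter = 0
    while len(stream) < len(block):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ s for b, s in zip(block, stream))
```

Because two uploaders of the same block independently derive the same key and tag, the server can detect the duplicate and store the ciphertext once.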
- Apache Spark is an open-source cluster computing framework that provides a fast, general engine for large-scale data processing. It introduces Resilient Distributed Datasets (RDDs), which allow in-memory processing for speed.
- The document discusses Spark's key concepts like transformations, actions, and directed acyclic graphs (DAGs) that represent Spark job execution. It also summarizes Spark SQL, MLlib, and Spark Streaming modules.
- The presenter is a solutions architect who provides an overview of Spark and how it addresses limitations of Hadoop by enabling faster, in-memory processing using RDDs and a more intuitive API compared to MapReduce.
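The transformation/action distinction summarized above can be sketched with a tiny lazy dataset class. This mimics the concept only, not Spark's API; the class name is invented:

```python
class LazyDataset:
    """Mimics an RDD: transformations only record a plan (a linear DAG
    here); nothing executes until an action such as collect() is called."""
    def __init__(self, data, plan=None):
        self._data = data
        self._plan = plan or []
    def map(self, fn):                 # transformation: returns a new plan
        return LazyDataset(self._data, self._plan + [("map", fn)])
    def filter(self, fn):              # transformation: returns a new plan
        return LazyDataset(self._data, self._plan + [("filter", fn)])
    def collect(self):                 # action: executes the recorded plan
        out = list(self._data)
        for op, fn in self._plan:
            if op == "map":
                out = [fn(x) for x in out]
            else:
                out = [x for x in out if fn(x)]
        return out
```

Deferring execution until an action lets the engine see the whole plan at once, which is what allows Spark's DAG scheduler to optimize stages before running anything.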
This document discusses standards for IoT interoperability, including IPSO Smart Objects, OMA LWM2M, and CoAP. IPSO Smart Objects define a simple data model and object model to enable semantic interoperability across IoT devices. OMA LWM2M builds on CoAP to provide a server profile for IoT middleware and defines reusable management objects. CoAP defines a RESTful protocol for constrained networks and devices that can be used for device discovery and interaction.
Reactive Stream Processing Using DDS and Rx (Sumant Tambe)
In this presentation you will see why Reactive Extensions (Rx) is a powerful technology for asynchronous stream processing. RTI Data Distribution Service (DDS) will be used as the source of data and as a communication channel for asynchronous data streams. On top of DDS, we'll use Rx to subscribe, observe, project, filter, aggregate, merge, zip, and correlate one or more data streams (Observables). The live demo will be very visual as bouncing shapes of different colors will be transformed in front of you using C# lambdas, Rx.NET, and Visual Studio. You will also learn about the new Rx4DDS.NET library that integrates RTI DDS with Rx.NET. Rx and DDS are a great match because both are reactive. Rx is based on the subject-observer pattern, which is quite analogous to the publish-subscribe pattern of DDS. When used together they support distributed dataflows seamlessly. If time permits, we will touch upon advanced Rx concepts such as stream of streams (IGroupedObservable) and how it captures DDS "keyed topics". The DDS applications using Rx4DDS.NET dramatically simplify concurrency to the extent that it can be simply configured.
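The operator chaining described above can be sketched with a minimal push-based observable. This mirrors the Rx idea (subject/observer plus composable map and filter) in plain Python; it is not the Rx.NET or Rx4DDS.NET API:

```python
class Observable:
    """Minimal push-based stream: each published item is pushed to every
    subscriber, the subject-observer pattern at the heart of Rx."""
    def __init__(self):
        self._subscribers = []
    def subscribe(self, on_next):
        self._subscribers.append(on_next)
    def on_next(self, item):
        for s in self._subscribers:
            s(item)
    def map(self, fn):
        """Project each item through fn into a derived stream."""
        out = Observable()
        self.subscribe(lambda x: out.on_next(fn(x)))
        return out
    def filter(self, pred):
        """Forward only items satisfying the predicate."""
        out = Observable()
        self.subscribe(lambda x: pred(x) and out.on_next(x))
        return out
```

The analogy to DDS is that a topic subscription plays the role of the source `Observable`, and the operator chain runs reactively as samples arrive.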
The document discusses several interoperability standards for IoT - CoAP, OMA LWM2M, and IPSO Smart Objects. It describes how they build upon each other with CoAP providing REST APIs for constrained devices, OMA LWM2M building on CoAP to define device management objects and models, and IPSO Smart Objects further defining application objects based on the LWM2M model. The standards provide a layered approach to connectivity, services, data models and applications to enable interoperability for IoT devices and services.
The document discusses several interoperable IoT standards including CoAP, OMA LWM2M, and IPSO Smart Objects. CoAP defines a RESTful protocol for constrained devices while OMA LWM2M builds on CoAP to provide device management and an object model. IPSO Smart Objects then build on LWM2M, defining reusable application objects and resources for common IoT domains. Together these standards provide interoperability across all layers of the IoT stack from the device to application software.
Reactive Stream Processing Using DDS and RxSumant Tambe
In this presentation you will see why Reactive Extensions (Rx) is a powerful technology for asynchronous stream processing. RTI Data Distribution Service (DDS) will be used as the source of data and as a communication channel for asynchronous data streams. On top of DDS, we'll use Rx to subscribe, observe, project, filter, aggregate, merge, zip, and correlate one or more data streams (Observables). The live demo will be very visual as bouncing shapes of different colors will be transformed in front of you using C# lambdas, Rx.NET, and Visual Studio. You will also learn about the new Rx4DDS.NET library that integrates RTI DDS with Rx.NET. Rx and DDS are a great match because both are reactive. Rx is based on the subject-observer pattern, which is quite analogous to the publish-subscribe pattern of DDS. When used together they support distributed dataflows seamlessly. If time permits, we will touch upon advanced Rx concepts such as stream of streams (IGroupedObservable) and how it captures DDS "keyed topics". The DDS applications using Rx4DDS.NET dramatically simplify concurrency to the extent that it can be simply configured.
The document discusses several interoperability standards for IoT - CoAP, OMA LWM2M, and IPSO Smart Objects. It describes how they build upon each other with CoAP providing REST APIs for constrained devices, OMA LWM2M building on CoAP to define device management objects and models, and IPSO Smart Objects further defining application objects based on the LWM2M model. The standards provide a layered approach to connectivity, services, data models and applications to enable interoperability for IoT devices and services.
The document discusses several interoperable IoT standards including CoAP, OMA LWM2M, and IPSO Smart Objects. CoAP defines a RESTful protocol for constrained devices while OMA LWM2M builds on CoAP to provide device management and an object model. IPSO Smart Objects then build on LWM2M, defining reusable application objects and resources for common IoT domains. Together these standards provide interoperability across all layers of the IoT stack from the device to application software.
The document discusses the design and implementation of Spark Streaming connectors for real-time data sources like Azure Event Hubs. It covers key aspects like connecting Event Hubs to Spark Streaming, designing the connector to minimize resource usage, ensuring fault tolerance through checkpointing and recovery, and managing message offsets and processing rates in a distributed manner. The connector design addresses challenges like long-running receivers, extra resource requirements, and data loss during failures. Lessons from the initial receiver-based approach informed the design of a more efficient solution.
Tapad's data pipeline is an elastic combination of technologies (Kafka, Hadoop, Avro, Scalding) that forms a reliable system for analytics, realtime and batch graph-building, and logging. In this talk, I will speak about the creation and evolution of the pipeline, and a concrete example – a day in the life of an event tracking pixel. We'll also talk about common challenges that we've overcome such as integrating different pieces of the system, schema evolution, queuing, and data retention policies.
What you need to know about .NET Core 3.0 and beyondJon Galloway
The document provides an overview of .NET Core 3.0 including its top features, upcoming release schedule, and what is coming next. It discusses the key features in .NET Core 3.0 such as Windows desktop apps, microservices, gRPC, and machine learning. It also outlines the future of .NET with .NET 5 which will unify the different .NET implementations into a single platform.
Swift distributed tracing method and tools v2zhang hua
The document proposes a distributed tracing method and tools for Swift object storage. It involves adding middleware to collect trace data with unique IDs as requests pass through Swift components. Trace data including timing would be sent to a repository and correlated to reconstruct processing paths. Analysis tools would allow querying trace data and visualizing span trees to diagnose performance and identify bottlenecks across the distributed Swift infrastructure.
This document provides an overview of IoT and the AWS IoT platform. It discusses key IoT concepts like MQTT and smart home devices. It then details various aspects of the AWS IoT architecture including AWS IoT Core services, security, and pricing. Device SDKs and protocols are covered, as well as how AWS IoT integrates with other AWS services like Lambda, S3, DynamoDB through the AWS IoT rules engine. Device shadowing and registry services are also summarized. Finally, AWS IoT is compared to the Azure IoT Suite in terms of features.
Instrumenting and Scaling Databases with EnvoyDaniel Hochman
Every request to a database at Lyft is proxied by Envoy, providing complete visibility into the L3/L4 aspects of database interactions. This allows engineers to easily visualize changes to a database's load profile and pinpoint the root cause if necessary. Lyft has also open-sourced codecs for MongoDB, DynamoDB, and Redis. Protocol codecs in combination with custom filters yield benefits ranging from operation-level observability to horizontal scalability via sharding. Using Envoy for this purpose means that enhancements are implemented once and usable across a polyglot stack. The talk demonstrates Envoy's utility beyond traditional RPC service interactions in the network.
Apache Pulsar Development 101 with PythonTimothy Spann
Apache Pulsar Development 101 with Python PS2022_Ecosystem_v0.0
There is always the fear a speaker cannot make it. So just in case, since I was the MC for the ecosystem track I put together a talk just in case.
Here it is. Never seen or presented.
WebRTC has been an exciting technology, and extremely fast moving for the past years. While its adoption and its disruptive power are not challenged anymore, the fast evolution pace, and the fast update cycles of the browsers made it difficult to build complex solutions on top of it which would leverage all that webrtc has to offer. Late 2015, the different standard committees and corresponding working groups that compose webRTC have finally reach a consensus, and from the convergence of all their efforts, stable specifications were born.
Through the use of GoToMeeting and other software, we will illustrate first the usual pains that most using webrtc have experienced, and then show how the webrtc APIs, which had started as a peer-to-peer API, were extended with an object model API to provide more options and more controls to this who need it, while keeping the simplicity of P2P for the others. The similitudes between the new Object Model API, and the ORTC API (implemented in edge) will also be illustrated.
Ambry is an open source object store that is responsible for storing all media content at Linkedin. This talk goes over development of Ambry at Linkedin and its architecture to some details.
UKOLN supports the SWORD project which aims to improve the efficiency and options for depositing content into repositories by developing a standard specification. SWORD defines a deposit interface that has been implemented and tested in several repository systems, including EPrints, DSpace, Fedora, and IntraLibrary. The specification is informed by requirements from JISC and other international work in this area.
JConf.dev 2022 - Apache Pulsar Development 101 with JavaTimothy Spann
JConf.dev 2022 - Apache Pulsar Development 101 with Java
https://2022.jconf.dev/
In this session I will get you started with real-time cloud native streaming programming with Java. We will start off with a gentle introduction to Apache Pulsar and setting up your first easy standalone cluster. We will then l show you how to produce and consume message to Pulsar using several different Java libraries including native Java client, AMQP/RabbitMQ, MQTT and even Kafka. After this session you will building real-time streaming and messaging applications with Java. We will also touch on Apache Spark and Apache Flink.
Timothy Spann
Tim Spann is a Developer Advocate @ StreamNative where he works with Apache Pulsar, Apache Flink, Apache NiFi, Apache MXNet, TensorFlow, Apache Spark, big data, the IoT, machine learning, and deep learning. Tim has over a decade of experience with the IoT, big data, distributed computing, streaming technologies, and Java programming. Previously, he was a Principal Field Engineer at Cloudera, a Senior Solutions Architect at AirisData and a senior field engineer at Pivotal. He blogs for DZone, where he is the Big Data Zone leader, and runs a popular meetup in Princeton on big data, the IoT, deep learning, streaming, NiFi, the blockchain, and Spark. Tim is a frequent speaker at conferences such as IoT Fusion, Strata, ApacheCon, Data Works Summit Berlin, DataWorks Summit Sydney, and Oracle Code NYC. He holds a BS and MS in computer science. https://www.datainmotion.dev/p/about-me.html https://dzone.com/users/297029/bunkertor.html https://conferences.oreilly.com/strata/strata-ny-2018/public/schedule/speaker/185963
Framework for Evaluating Distributed Smalltalk InterfaceESUG
The document describes a framework for evaluating distributed Smalltalk interfaces that allows controlling measurements on remote machines, collecting and visualizing results, and identifying performance limits. It examines interfaces like Distributed Smalltalk (DST), CORBA, Implicit Invocation (I3), OpenTalk (OT), and GemStone (GS), and provides examples of measurement results for tasks like naming service registration across different configurations. The framework is intended to automate complex testing of distributed systems and provide immediate analysis of performance characteristics.
The document summarizes a seminar presentation on the World Wide Web (WWW). It discusses the basic client-server architecture of the WWW, with servers hosting documents and clients providing interfaces for users. It also covers the evolution of the WWW to include distributed services beyond just documents. Traditional web systems are described as using simple client-server models with URLs to locate documents on servers. Key aspects like HTTP, document models, and scripting technologies are summarized. Security measures for web transactions like TLS and aspects of caching, replication, and content delivery are also outlined.
Modernising your Applications on AWS: AWS SDKs and Application Web Services –...Amazon Web Services
In this session, you learn about the easy-to-use abstractions included in the AWS SDK for .NET and the AWS SDK for Java. We demonstrate how the AWS Toolkits for Visual Studio and Eclipse help streamline the application development process. You also see how to make use of AWS services such as DynamoDB, SQS and S3 in your applications running on AWS. We deep-dive on Amazon DynamoDB to showcase the features of the SDKs and how you can build advanced applications on top of DynamoDB, using the AWS SDKs.
OpenStack Swift tiering proposal and prototype detailsShiva Chaitanya
This slide deck describes the proposal to add automated tiering* as a native feature in OpenStack Swift. It also provides design & implementation details of a tiering prototype built in OpenStack Swift.
*Automated tiering is a feature provided by storage systems that include multiple media types like SSD, HDD, and Tape. It enables seamless movement of active data to high-performance storage media and inactive data to low-cost, high capacity storage media. Effective utilization of these tiers results in an overall total cost of ownership (TCO) reduction for the customers.
Similar to Object Striping In Swift_OpenStack HongKong Summit 2013_v2.2 (20)
OpenStack Swift tiering proposal and prototype details
Object Striping In Swift_OpenStack HongKong Summit 2013_v2.2
1. Object Striping in Swift
OpenStack Summit 2013, Hong Kong
Contributors: Bipin Kunal, Kashish Bhatia
Author/Presenter: Shriram Pore, Sr. Architect
2. Calsoft Confidential 2
Agenda
Problem statement
Proposed solution
Solution prerequisites
Approach - I : client unaware striping - striping and collation @ proxy
Approach - II : client aware striping - striping and collation @ client
Use-cases
Enhancements and future scope
Appendix
See also
3. Problem Statement
Traditionally, object stores are viewed as stores for small objects
Swift supports large objects (dynamic and static)
By segmentation
Via manifest file support
Limitations
GET and PUT are costly in terms of time, network, and storage utilization
The manifest file limits the number of segments that can be supported
Varying-size objects are handled differently
4. Proposed Solution – Object Striping in Swift
Object striping + (Sparse Object + Parallel Read/Write + Vectored IO)
Object Striping
Object Segmentation / Chunking
Not confined to large objects
Sparse Object
Not all stripes of an object need to be present on disk
Object size and on-disk object size may differ vastly (e.g. a 1 GB object could have a 1 MB on-disk size)
Parallel Read/Write from/to Multiple Object servers
Stripes of an object span across multiple partitions and hence object servers
Optimally involve maximum possible object servers in the read/write operations
Vectored IO
Multiple stripes stored in the sub-object are read/written via vectored IO to optimize IO performance
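The sparse-object idea above can be sketched with ordinary POSIX file calls. This is only an illustration (the stripe size and layout are made up; Swift itself would do this inside the object server); vectored variants such as os.pwritev / os.preadv could batch several stripes per call on Linux:

```python
import os
import tempfile

# A sparse sub-object: only some stripes are written, each at its natural
# offset, leaving filesystem holes in between.
stripe_size = 1024
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sub_object")
    fd = os.open(path, os.O_CREAT | os.O_RDWR)
    try:
        for sid in (2, 4):  # only stripes S2 and S4 are present
            os.pwrite(fd, b"d" * stripe_size, sid * stripe_size)
        # The logical size extends to the end of S4 even though S0, S1
        # and S3 were never written.
        logical_size = os.fstat(fd).st_size
        stripe_back = os.pread(fd, stripe_size, 2 * stripe_size)
    finally:
        os.close(fd)
```

This is exactly why the container DB needs both an object size and an on-disk size: the two diverge for sparse objects.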
5. Solution Prerequisites
The following slides define the prerequisite changes needed to implement the proposed solution
Glossary
Term           | Description
Object ID      | Hash of the URL /account/container/object
Stripe ID      | A unique identifier for a stripe within an object
Stripe Size    | Size of the stripe
Sub-object     | Portion of an object, consisting of one or more stripes stored sparsely in a partition
Metadata cache | Cache of striping information at Swift clients
Fingerprint    | MD5 of the stripe data
6. Prerequisite - Container Database Schema
Container DB – stores all the information about the objects associated with the container
Schema changes
object ID = object identifier
stripe size = size of the stripe (for the specific object)
object size = size of the object (determines the number of stripes it has)
object on-disk size = total size of stripes in the sparse object
object version delta size = total size of only the changed stripes in the sparse object
Different objects may have different stripe sizes, but for a given object the stripe size remains the same
Stripe size may vary between 1 MB and 10 MB; the default stripe size is 1 MB

Schema row:
object ID | stripe size | object size | object on-disk size | object version delta size
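Since Swift's container DB is SQLite-backed, the proposed additions could be prototyped as below. The table and column names are illustrative, not the proposal's actual DDL; the sample row mirrors the 1 GB object / 1 MB on-disk example from the previous slide:

```python
import sqlite3

# Sketch of the proposed per-object striping columns in the container DB.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE object_striping (
        object_id          TEXT PRIMARY KEY,  -- hash of /account/container/object
        stripe_size        INTEGER NOT NULL,  -- fixed per object, 1 MB - 10 MB
        object_size        INTEGER NOT NULL,  -- logical size -> number of stripes
        on_disk_size       INTEGER NOT NULL,  -- total size of stored stripes
        version_delta_size INTEGER            -- total size of changed stripes
    )
""")
# A sparse 1 GB object with only 1 MB actually on disk.
conn.execute(
    "INSERT INTO object_striping VALUES (?, ?, ?, ?, ?)",
    ("h-obj1", 1 * 1024 * 1024, 1 << 30, 1 << 20, 0),
)
row = conn.execute(
    "SELECT object_size, on_disk_size FROM object_striping").fetchone()
```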
7. Prerequisite - Stripe ID and Stripe Offset Formulae
Stripe IDs are generated by the object striping service
Stripe ID is a function f of the stripe offset:
  stripe ID = f( stripe offset, stripe size, [object path], [partition size] )
Stripe offset is a function g of the stripe ID:
  stripe offset = g( stripe ID, stripe size, [object path], [partition size] )   ([] - optional)
where g = f⁻¹
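A minimal sketch of the inverse pair f and g, ignoring the optional object-path and partition-size parameters and assuming stripes are laid out contiguously (the deck leaves the concrete functions open, so this is just one simple choice):

```python
def stripe_id(object_offset: int, stripe_size: int) -> int:
    """f: map a byte offset within the object to the stripe holding it."""
    return object_offset // stripe_size

def stripe_offset(sid: int, stripe_size: int) -> int:
    """g = f^-1: map a stripe ID back to its starting byte offset."""
    return sid * stripe_size

# The deck's default stripe size is 1 MB.
STRIPE_SIZE = 1 * 1024 * 1024
```

Because g is the exact inverse of f, a client or proxy can translate between byte ranges and stripe IDs without any extra lookup.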
8. Prerequisite - Role of Stripe ID Hash
object-ID = hash of "/Account/Container/object"
stripe-ID-hash = hash of "/Account/Container/object/stripe-ID"
Salient points
object-ID does not play any role in determining the partition
stripe-ID-hash is used to decide the partition
Partition is now a set of stripe-id hashes
Advantages
Evenly distributes stripes across partitions
Optimizes the participation of multiple object servers in PUT/GET request handling
Smoother ring rebalancing when object server nodes are added or removed
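A sketch of how the stripe-ID hash could drive partition selection, mirroring the way the Swift ring maps the top bits of an MD5 hash to a partition (PART_POWER and the paths here are illustrative):

```python
import hashlib

PART_POWER = 8  # 2**8 partitions; real deployments use larger values

def partition_for(path: str, part_power: int = PART_POWER) -> int:
    """Take the top `part_power` bits of the path's MD5, as the Swift
    ring does when assigning a partition."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - part_power)

# Hashing per stripe rather than per object means the stripes of one
# object land on many partitions, so many object servers can serve a
# single GET/PUT in parallel.
obj_part = partition_for("/A1/C1/Obj1")
stripe_parts = {partition_for(f"/A1/C1/Obj1/{s}") for s in range(8)}
```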
9. Prerequisite - Extended Attributes
Metadata as extended attributes
Per-stripe extended attributes are used by the Replicator, Auditor and Updater

Now - Per-object extended attributes:
Content-Type | Content-Length | Etag (fingerprint) | Creation time | Path (e.g. /A/C/obj)

Proposed - Per-stripe extended attributes:
Stripe-ID | Content-Type | Content-Length | Etag per stripe (fingerprint) | Creation time | Path (e.g. /A/C/obj)
10. Prerequisite - HTTP PUT Request
PUT request from proxy to the object server
HTTP header represents the list of stripes to store in the sub-object
HTTP body contains the corresponding data of the stripes
Stripe content length <= stripe size

Collated headers:
Object ID      | Stripe ID | Stripe Size | http request offset | Stripe content length | Object offset
H(/A1/C1/Obj1) | S2        | 1024        | 3000                | 1024                  | 1024
H(/A1/C1/Obj1) | S4        | 1024        | 4024                | 1024                  | 3072
H(/A2/C2/Obj2) | S1        | 1024        | 5048                | 1024                  | 0
H(/A2/C2/Obj2) | S3        | 1024        | NULL                | TRIM                  | 2048

(Figure: the http body carries OBJ1-S2, OBJ1-S4 and OBJ2-S1 at request offsets 3000, 4024 and 5048; OBJ1 is laid out as stripes S1-S5 at object offsets 0-5120, OBJ2 as stripes S1-S3 at offsets 0-3072.)
11. Prerequisite - HTTP PUT Response
PUT response from object server to proxy
HTTP header represents the list of stripes added, updated or trimmed in the sub-object

Collated http response:
Object ID      | Stripe ID | Stripe delta size | Stripe Max offset | Status
H(/A1/C1/Obj1) | S2        | 1024              | 2047              | "New Write"
H(/A1/C1/Obj1) | S4        | 0                 | 4095              | "Updated"
H(/A2/C2/Obj2) | S1        | 1024              | 1023              | "New Write"
H(/A2/C2/Obj2) | S3        | -1024             | 2047              | "Trimmed"

Description of header fields and derived values
Status: "new write", "updated", "trimmed", "replicated"
Stripe Max Offset: the maximum offset of the stripe within the sub-object
Object size = MAX( existing object size, MAX(stripe max offset) )
Stripe delta size: effective data written to the disk
Object on-disk size = SUM( existing object on-disk size, SUM(stripe delta size) )
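The derived-value formulas above can be expressed directly. This sketch assumes offsets are 0-based (hence the +1 when turning a max offset into a size) and uses made-up response rows modeled on the illustration:

```python
def apply_put_response(object_size, on_disk_size, stripe_resps):
    """Fold per-stripe PUT-response fields into the object's sizes, per
    the slide's formulas: size from the max stripe offset, on-disk size
    from the sum of stripe deltas."""
    new_size = max([object_size] + [s["max_offset"] + 1 for s in stripe_resps])
    new_on_disk = on_disk_size + sum(s["delta_size"] for s in stripe_resps)
    return new_size, new_on_disk

# Illustrative responses for one object: a new 1 KB stripe ending at 2047,
# an update that wrote no new bytes, and a trimmed (de-allocated) stripe.
resps = [
    {"status": "New Write", "delta_size": 1024,  "max_offset": 2047},
    {"status": "Updated",   "delta_size": 0,     "max_offset": 4095},
    {"status": "Trimmed",   "delta_size": -1024, "max_offset": 2047},
]
size, on_disk = apply_put_response(0, 2048, resps)
```

Note how a trim carries a negative delta, shrinking the on-disk size while the logical object size is unaffected.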
12. Approach 1: Read/GET Request and Response Illustration

GET request from proxy to the object server; the HTTP header represents the list of stripes to fetch from the sub-object:
Object ID      | Stripe ID | Stripe size | Stripe content length | Object offset
H(/A1/C1/Obj1) | S2        | 1024        | 1024                  | 1024
H(/A1/C1/Obj1) | S4        | 1024        | 1024                  | 3072
H(/A2/C2/Obj2) | S1        | 1024        | 1024                  | 0
H(/A2/C2/Obj2) | S3        | 1024        | 1024                  | 2048

GET response from object server to proxy; the HTTP header represents the list of stripes fetched, and the HTTP body contains the data of the corresponding stripes:
Object ID      | Stripe ID | Stripe Size | http response offset | Stripe content length | Object offset
H(/A1/C1/Obj1) | S2        | 1024        | 3000                 | 1024                  | 1024
H(/A1/C1/Obj1) | S4        | 1024        | 4024                 | 1024                  | 3072
H(/A2/C2/Obj2) | S1        | 1024        | 5048                 | 1024                  | 0
H(/A2/C2/Obj2) | S3        | 1024        | 6072                 | 1024                  | 2048

(Figure: the http body carries OBJ1-S2, OBJ1-S4, OBJ2-S1 and OBJ2-S3 at response offsets 3000, 4024, 5048 and 6072.)
13. Miscellaneous Changes
Avoiding extra write IO by using the fingerprint (MD5 sum)
The object server calculates the fingerprint for every stripe and compares it with the fingerprint stored as an extended attribute of the sub-object
If the fingerprint matches, the stripe write is discarded
Services like the replicator, auditor and updater are also impacted: they must act upon stripes instead of objects
Along with objects, the partition also stores a hash table
The hash table stores fingerprints for each object in the partition
The replicator/auditor/updater work by comparing entries in the hash table
Introducing object striping requires per-stripe fingerprints to be stored in the hash table
Consequently, these services would make decisions based on changes at stripe granularity
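The fingerprint check could look like this sketch (xattr storage is elided; should_write is a hypothetical helper, not part of the proposal's code):

```python
import hashlib

def should_write(stripe_data, stored_fingerprint):
    """Object-server side: skip the disk write when the incoming stripe's
    MD5 fingerprint matches the one stored in the sub-object's extended
    attributes. `stored_fingerprint` is None for a stripe never written."""
    return hashlib.md5(stripe_data).hexdigest() != stored_fingerprint

data = b"x" * 1024
fp = hashlib.md5(data).hexdigest()
```

An unchanged stripe costs one hash computation instead of one disk write, which is the whole point of storing fingerprints per stripe.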
14. Approach – I: Client Unaware Striping - Striping and Collation @ Proxy
Object striping and its collation
Transparent to the client
Performed at the proxy server
The proxy collates multiple stripes of multiple objects destined for the same object server into a single GET/PUT request
The stripes being collated can be sourced from different clients
Collation criteria
Collation is based on a service level agreement (SLA), as follows:
Timeout based
Size based
The proxy decides the partition, and hence the object server, based on the stripe-ID-hash
Note: the stripe size for all objects is the same and is configured at installation
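A sketch of the SLA-driven collation buffer, flushing on either a size threshold or a timeout. The class, thresholds, and API are illustrative, not part of the proposal:

```python
import time

class CollationBuffer:
    """Per-object-server buffer at the proxy: accumulate stripes and
    flush when either the size SLA or the timeout SLA is reached."""

    def __init__(self, max_bytes=4 * 1024 * 1024, max_wait_s=0.05):
        self.max_bytes, self.max_wait_s = max_bytes, max_wait_s
        self.stripes, self.nbytes, self.first_at = [], 0, None

    def add(self, stripe: bytes):
        if self.first_at is None:
            self.first_at = time.monotonic()  # start the timeout clock
        self.stripes.append(stripe)
        self.nbytes += len(stripe)

    def ready(self) -> bool:
        if not self.stripes:
            return False
        return (self.nbytes >= self.max_bytes
                or time.monotonic() - self.first_at >= self.max_wait_s)

    def flush(self):
        batch = self.stripes
        self.stripes, self.nbytes, self.first_at = [], 0, None
        return batch
```

The timeout bounds the latency a single client can suffer from collation, while the size threshold bounds the request size sent to the object server.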
15. Approach – I: PUT Operation
(Figure: two Swift clients send Object 1 and Object 2 to the proxy; the proxy stripes them into S1.1-S1.3 and S2.1-S2.5 and spreads the stripes across the object server ring, alongside the account and container server rings.)
The proxy stripes the objects received from clients
The proxy decides the partition/object server for every stripe
The proxy collates the stripes destined for the same object server based on the SLA
HTTP requests to multiple object servers are sent in parallel
Collated stripes are written to their respective sub-objects using vectored IO
The response from each object server is sent to the proxy, which passes it on to the client
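The parallel fan-out in the PUT path might be sketched as follows; put_to_object_server is a hypothetical stand-in for the actual HTTP PUT of a collated stripe batch:

```python
from concurrent.futures import ThreadPoolExecutor

def put_to_object_server(server, collated_stripes):
    """Placeholder for the HTTP PUT of a collated stripe batch to one
    object server (hypothetical; a real proxy would issue HTTP here)."""
    return {"server": server, "stored": len(collated_stripes)}

def proxy_put(stripes_by_server):
    """Send the per-server collated batches out in parallel and gather
    the responses, as the PUT-operation slide describes."""
    with ThreadPoolExecutor() as pool:
        futures = {s: pool.submit(put_to_object_server, s, batch)
                   for s, batch in stripes_by_server.items()}
        return {s: f.result() for s, f in futures.items()}

resps = proxy_put({"os1": ["S1.1", "S2.4"], "os2": ["S1.2", "S2.1"]})
```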
16. Approach 1: Write/PUT Request and Response Illustration

Collated PUT request headers:
Object ID      | Stripe ID | Stripe Size | http request offset | Stripe content length | Object offset
H(/A1/C1/Obj1) | S2        | 1024        | 3000                | 1024                  | 1024
H(/A1/C1/Obj1) | S4        | 1024        | 4024                | 1024                  | 3072
H(/A2/C2/Obj2) | S1        | 1024        | 5048                | 1024                  | 0
H(/A2/C2/Obj2) | S3        | 1024        | NULL                | TRIM                  | 2048

PUT response headers:
Object ID      | Stripe ID | Stripe delta size | Stripe Max offset | Status
H(/A1/C1/Obj1) | S2        | 1024              | 2047              | "New Write"
H(/A1/C1/Obj1) | S4        | 0                 | 4095              | "Updated"
H(/A2/C2/Obj2) | S1        | 1024              | 1023              | "New Write"
H(/A2/C2/Obj2) | S3        | -1024             | 2047              | "Trimmed"

Scenario: client1 is writing obj1 and client2 is writing obj2
At the proxy, obj1 and obj2 are striped based on the configured stripe size
The stripes destined for the same object server are collated by the proxy into a single HTTP PUT request
In the request above, S2 and S4 of obj1 and S1 and S3 of obj2 are clubbed together
The HTTP PUT response returns the status "New Write", "Updated" or "Trimmed" for stripes that are added, modified or de-allocated, respectively
The stripe delta size for each stripe is also returned, from which the proxy calculates the on-disk size of the object

(Figure: the http body carries OBJ1-S2, OBJ1-S4 and OBJ2-S1 at request offsets 3000, 4024 and 5048.)
17. Approach – I: GET Operation
(Figure: two Swift clients request Object 1 and Object 2 from the proxy; the proxy gathers stripes S1.1-S1.3 and S2.1-S2.5 from across the object server ring and reassembles the objects.)
The proxy receives object GET requests from multiple clients
The proxy fetches striping information for the objects from the container and calculates the stripe IDs
Using the stripe IDs, it determines the partitions/object servers
The proxy also collates stripe GET requests destined for the same object server based on the SLA
HTTP requests to multiple object servers are sent in parallel
Collated stripes are read from their respective sub-objects using vectored IO
The response from each object server is sent to the proxy
The proxy assembles the stripes to form the object and sends it to the client
18. Approach 1: Read/GET Request and Response Illustration

Collated GET request headers:
Object ID      | Stripe ID | Stripe size | Stripe content length | Object offset
H(/A1/C1/Obj1) | S2        | 1024        | 1024                  | 1024
H(/A1/C1/Obj1) | S4        | 1024        | 1024                  | 3072
H(/A2/C2/Obj2) | S1        | 1024        | 1024                  | 0
H(/A2/C2/Obj2) | S3        | 1024        | 1024                  | 2048

GET response headers:
Object ID      | Stripe ID | Stripe Size | http response offset | Stripe content length | Object offset
H(/A1/C1/Obj1) | S2        | 1024        | 3000                 | 1024                  | 1024
H(/A1/C1/Obj1) | S4        | 1024        | 4024                 | 1024                  | 3072
H(/A2/C2/Obj2) | S1        | 1024        | 5048                 | 1024                  | 0
H(/A2/C2/Obj2) | S3        | 1024        | 6072                 | 1024                  | 2048

Scenario: client1 sends a read request for obj1, and client2 for obj2, to the proxy
The proxy fetches the object size from the container
The proxy generates stripe IDs and forms new URLs to locate the object servers and partitions
The proxy collates read requests for stripes that lie on the same object server
As shown above, stripe read requests for S2, S4 of obj1 and S1, S3 of obj2 are clubbed together
The object server sends the requested stripes
The proxy then collates the stripes of obj1 arriving from different object servers and sends obj1 to client1
Similarly, obj2 is sent to client2 after the proxy collates its stripes

(Figure: the http body carries OBJ1-S2, OBJ1-S4, OBJ2-S1 and OBJ2-S3 at response offsets 3000, 4024, 5048 and 6072.)
19. Approach – I: Pros, Cons and Miscellaneous Changes
Pros
Modifications are confined to the Swift server
Increased effectiveness of vectored IO at the object server, as the proxy collates requests from multiple clients
Cons
The proxy can become a bottleneck
Collation of stripes of objects from multiple clients leads to synchronization issues
Miscellaneous changes
This approach needs modifications in the swift-container-server, swift-object-server and swift-proxy-server services
The Replicator, Auditor and Updater services also require changes to operate over stripes instead of objects
20. Approach – II: Client Aware Striping - Striping and Collation @ Client
The client is aware of stripes
Striping and collation are done on the client side
The proxy acts more as a pass-through
A metadata cache on the client holds the same striping information as the container DB
A listing operation on the container yields striping information to the client
Striping information is piggy-backed in the GET response
The client's metadata cache contains the following attributes:
object name, stripe size, object size, object on-disk size, object version delta size
Stripe IDs are generated by the client for both GET and PUT requests
Stripe size of an object
May vary across objects but is the same within an object, so it is stored in the container DB per object
Decided by the client during the first PUT (based on parameters like network bandwidth, object size, etc.)
Immutable from then onwards
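A sketch of the client-side metadata cache and client-generated stripe IDs. The class and field names are illustrative, and stripe numbering is assumed 1-based as in the tables in this deck:

```python
class StripeMetaCache:
    """Client-side cache of the striping attributes that the container DB
    holds per object; refreshed from listings and piggy-backed GET headers."""

    def __init__(self):
        self._meta = {}  # object name -> striping attributes

    def update(self, name, stripe_size, object_size, on_disk_size, delta_size):
        self._meta[name] = dict(stripe_size=stripe_size,
                                object_size=object_size,
                                on_disk_size=on_disk_size,
                                delta_size=delta_size)

    def stripe_ids_for_get(self, name):
        """The client generates stripe IDs itself: enough stripes to
        cover object_size at the object's fixed stripe size."""
        m = self._meta[name]
        n = -(-m["object_size"] // m["stripe_size"])  # ceiling division
        return [f"S{i}" for i in range(1, n + 1)]

cache = StripeMetaCache()
cache.update("/A1/C2/Obj1", 1024, 4096, 3072, 0)
```

Because the cache mirrors the container DB, the client can address stripes directly and the proxy never has to consult the container on the GET path.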
21. Approach – II: PUT Operation
(Figure: the Swift client stripes Objects 1-3 itself into S1.1-S1.2, S2.1-S2.3 and S3.1-S3.2, and sends collated stripe requests through the proxy to the object server ring.)
The client builds stripes for one or more objects, collates them into a single PUT request, and sends it to the proxy server
The proxy identifies the partition/object server for every stripe using the stripe ID passed by the client
HTTP requests to multiple object servers are sent in parallel
Stripes are written to their respective sub-objects using vectored IO
The response from each object server is sent to the proxy, which passes it on to the client
The proxy updates the striping information in the container DB
Clients, on reception of the responses, update their metadata cache
22. Approach – II: Write/PUT Request Illustration

Fig 1. http PUT request from client to proxy (collated headers):
Object path | Stripe ID | Stripe Size | http request offset | Stripe content length | Object offset
/A1/C2/Obj1 | S2        | 1024        | 3000                | 1024                  | 1024
/A1/C2/Obj1 | S1        | 1024        | 4024                | 1024                  | 0
/A1/C2/Obj2 | S1        | 1536        | 5048                | 1536                  | 0
/A1/C2/Obj2 | S3        | 1536        | 6584                | 1536                  | 3072

Fig 2. http PUT request from proxy to one of the object servers:
Object ID      | Stripe ID | Stripe Size | http request offset | Stripe content length | Object offset
H(/A1/C2/Obj1) | S2        | 1024        | 3000                | 1024                  | 1024
H(/A1/C2/Obj2) | S1        | 1536        | 4024                | 1536                  | 0
H(/A1/C2/Obj2) | S3        | 1536        | 5560                | 1536                  | 3072

Scenario: the Swift client wants to write/modify some stripes of obj1 and obj2
The stripe size for obj1 is 1024 bytes and for obj2 is 1536 bytes
The Swift client forms a single request, clubs the modified stripe data into it, and sends the write/PUT request to the proxy
The proxy identifies the object servers from the stripe IDs and forms a new write/PUT request per server; in Fig 2, S2 of obj1 and S1, S3 of obj2 reside on the same object server
23. Approach – II: Write/PUT Response Illustration

Fig 1. http PUT response from one of the object servers to proxy (collated headers):
Object ID      | Stripe ID | Stripe delta size | Stripe Max offset | Status
H(/A1/C2/Obj1) | S2        | 1024              | 2047              | "New Write"
H(/A1/C2/Obj2) | S1        | 1536              | 1535              | "New Write"
H(/A1/C2/Obj2) | S3        | 1536              | 4607              | "New Write"

Fig 2. http PUT response from proxy to client:
Object path | Stripe ID | Stripe Size | Stripe delta size | Stripe Max Offset | Status
/A1/C2/Obj1 | S2        | 1024        | 1024              | 2047              | "New Write"
/A1/C2/Obj2 | S1        | 1536        | 1536              | 1535              | "New Write"
/A1/C2/Obj2 | S3        | 1536        | 1536              | 4607              | "New Write"
/A1/C2/Obj1 | S1        | 1024        | 1024              | 1023              | "New Write"

Scenario: the Swift client wants to write/modify some stripes of obj1 and obj2
The object server writes/modifies the stripes in the sparse object and sends the response to the proxy with status "New Write" (Fig 1)
The proxy in turn sends the response to the client with the same status (Fig 2)
24. Approach – II: GET Operation
(Figure: Swift clients request stripes S1.1-S1.2, S2.1, S3.1 and S5.2 directly by stripe ID; the proxy fans the requests out across the object server ring and returns the collated stripes.)
The client builds a GET request for multiple stripes of one or more objects and sends it to the proxy server
The proxy identifies the partition/object server for every stripe using the stripe ID passed by the client
HTTP requests to multiple object servers are sent in parallel
Stripes are read from their respective sub-objects using vectored IO
The response from each object server is sent to the proxy
The proxy assembles the stripes coming from multiple object servers
The proxy fetches the object striping information from the container DB for all the objects under consideration and piggy-backs it in the response to the client
The proxy sends the response to the client, and the client updates its objects and its metadata cache
25. Approach – II: READ/GET Request Illustration

Fig 1. http GET request from client to proxy (collated headers):
Object Path | Stripe ID | Stripe Size | Stripe content length | Object offset
/A1/C2/Obj1 | S2        | 1024        | 1024                  | 1024
/A1/C2/Obj1 | S1        | 1024        | 1024                  | 0
/A1/C2/Obj2 | S1        | 1536        | 1536                  | 0
/A1/C2/Obj2 | S3        | 1536        | 1536                  | 3072

Fig 2. http GET request from proxy to one of the object servers:
Object ID      | Stripe ID | Stripe Size | Stripe content length | Object offset
H(/A1/C2/Obj1) | S2        | 1024        | 1024                  | 1024
H(/A1/C2/Obj2) | S1        | 1536        | 1536                  | 0
H(/A1/C2/Obj2) | S3        | 1536        | 1536                  | 3072

Scenario: the Swift client wants to read stripes belonging to different objects
Fig 1 shows the http GET request to read S2, S1 of obj1 and S1, S3 of obj2
The client collates multiple stripe read requests and sends them to the proxy
The proxy identifies the object servers from the stripe IDs and forms a new GET request for the stripes that reside on the same object server; in Fig 2, S2 of obj1 and S1, S3 of obj2 reside on the same object server
26. Calsoft Confidential 26
Approach – II :
READ/GET Response Illustration

Scenario: the Swift client wants to read stripes belonging to different objects

Fig 1. HTTP GET response from one of the object servers to proxy (collated headers)

| Object ID      | Stripe ID | Stripe Size | http response offset | Stripe content length | Object offset |
|----------------|-----------|-------------|----------------------|-----------------------|---------------|
| H(/A1/C2/Obj1) | S2        | 1024        | 3000                 | 1024                  | 1024          |
| H(/A1/C2/Obj2) | S1        | 1536        | 4024                 | 1536                  | 0             |
| H(/A1/C2/Obj2) | S3        | 1536        | 5048                 | 1536                  | 3072          |

Fig 2. HTTP GET response from proxy to client (collated headers)

| Object Path | Stripe ID | Stripe Size | http response offset | Stripe content length | Object offset |
|-------------|-----------|-------------|----------------------|-----------------------|---------------|
| /A1/C2/Obj1 | S2        | 1024        | 3000                 | 1024                  | 1024          |
| /A1/C2/Obj2 | S1        | 1536        | 4024                 | 1536                  | 0             |
| /A1/C2/Obj2 | S3        | 1536        | 5048                 | 1536                  | 3072          |
| /A1/C2/Obj1 | S1        | 1024        | 6584                 | 1024                  | 0             |

[Diagram: the response data buffers lay the stripes out back to back; Obj1-S2 at offset 3000, Obj2-S1 at 4024, Obj2-S3 at 5048, with the object-server response ending at 6584; the proxy-to-client response additionally carries Obj1-S1 at 6584, ending at 7608]

The object server reads the stripes from the sparse sub-objects and sends the response with
the data read to the proxy. Refer to Fig 1.
The proxy in turn sends the data read back to the client. It also piggy-backs the object striping
information. Refer to Fig 2.

Piggy-back headers:

| Object Path | Stripe Size | Object size | Object on-disk size | Object version delta size |
|-------------|-------------|-------------|---------------------|---------------------------|
| /A1/C2/Obj1 | 1024        | 4096        | 3072                | NO_SIZE                   |
| /A1/C2/Obj2 | 1536        | 4608        | 3072                | NO_SIZE                   |
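On the receiving side, the collated response body can be cut back into individual stripes using the per-stripe "http response offset" and "Stripe content length" fields. The sketch below assumes dict-shaped headers with illustrative field names, not Swift's actual header encoding:

```python
def split_collated_body(body, stripe_headers, base_offset):
    """Slice a collated GET response body into individual stripes.

    `stripe_headers` mirrors the per-stripe header rows of the illustration
    (field names here are illustrative); `base_offset` is the offset at
    which the data section starts, e.g. 3000 above."""
    stripes = {}
    for h in stripe_headers:
        start = h["http_response_offset"] - base_offset
        end = start + h["content_length"]
        stripes[(h["object"], h["stripe_id"])] = body[start:end]
    return stripes
```

The same routine works for both hops: the proxy uses it on object-server responses before reassembly, and the client uses it on the proxy's response.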
27. Calsoft Confidential 27
Approach – II : Pros and Cons
and Miscellaneous Changes
Pros
The proxy is offloaded from the striping and collation task
Client resources are utilized more effectively
Network bandwidth usage from the client end to the object server end is reduced
Cons
The proxy has to assemble all the stripes coming from multiple object servers to build the
response for the client
Miscellaneous changes
This approach needs modification in services, namely swift-container-server, swift-object-server,
swift-proxy-server and swift-client
The Replicator, Auditor and Updater services also require changes to operate over stripes
instead of objects
28. Calsoft Confidential 28
Use Case – I :
Exporting Object Storage as Volume
Overview
Use Swift to export object storage to be utilized by applications as block storage
An in-memory representation of a set of large objects can be exported by the Swift client as
logical units/virtual block devices to applications
IO performance is not severely impacted, as the large objects are striped transparently by
the Swift client. Only the stripes of an object that need to be read or written are transmitted
over the network, which reduces network bandwidth consumption
By building a volume layer over objects, one can implement various other features and
functionalities like de-duplication, compression and snapshots
29. Calsoft Confidential 29
Use Case – I :
Volume Stack Over Object Storage

[Diagram: applications App1 and App2 perform IO on logical volumes; in the Swift client, a volume manager and device handler map the volumes onto in-memory objects Obj 1 and Obj 2, whose stripes flow through the Swift proxy to the account, container and object server rings]

The Swift client can represent an in-memory object (metadata and cached stripes) as a virtual
block device (using the implemented device handlers), and those virtual devices can be clubbed
together to form a volume group (using volume managers), which can carve out logical volumes
to be used by different applications
30. Calsoft Confidential 30
Use Case – I :
Exporting Object Storage over Fabrics

[Diagram: a Swift client exports objects Obj 1, Obj 2 and Obj 3 as LUN 1, LUN 2 and LUN 3 over fabrics to remote application nodes; the object stripes flow through the Swift proxy to the account, container and object server rings]

The Swift client can read/write stripes of objects exported as LUNs over a SCSI target with
fabrics like iSCSI, FC, FCoE, ATAoE, etc., to be used by remote applications
31. Calsoft Confidential 31
Use Case – II :
Redirect-On-Write Snapshot
Overview
An on-demand snapshot creation request can be initiated by the Swift client, or by
applications on the Swift client, by sending a request to the container server to take a snapshot
With each snapshot, a new version of the object is created
As the container DB stores the striping information of objects, it would also store the size of
the delta, i.e. the total size of the stripes which got updated, added and/or trimmed
Extending the view:
If Use Case I and Use Case II are seen together, snapshot providers (e.g. VSS providers) can be
implemented at the SCSI target end, resulting in a single call to Swift to take a snapshot
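The per-version delta bookkeeping described above can be sketched in a few lines. This is a toy in-memory model with illustrative names, not the container-DB schema: each version holds only the stripes written since the previous snapshot, and a read walks back through the versions to find the newest copy of a stripe.

```python
class StripedObject:
    """Minimal redirect-on-write sketch: each version stores only the
    stripes written since the previous snapshot."""

    def __init__(self):
        self.versions = [{}]              # versions[-1] is the active version

    def write(self, stripe_id, data):
        self.versions[-1][stripe_id] = data   # writes land in the active delta

    def snapshot(self):
        self.versions.append({})          # freeze the current delta, open a new one

    def read(self, stripe_id, version=None):
        # Walk back from the requested version to the newest copy of the stripe.
        v = len(self.versions) - 1 if version is None else version
        for delta in reversed(self.versions[:v + 1]):
            if stripe_id in delta:
                return delta[stripe_id]
        return None                       # never written (sparse hole)

    def delta_size(self, version):
        # What the container DB would record as "object version delta size".
        return sum(len(d) for d in self.versions[version].values())
```

A snapshot is thus O(1): nothing is copied, and only subsequently changed stripes consume new space.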
32. Calsoft Confidential 32
[Diagram: redirect-on-write snapshots; new writes to the current version of the object are redirected to newly allocated stripes, so Snapshot V1 preserves the object state before the 1st set of new writes and Snapshot V2 preserves the state before the 2nd set, each version holding only its changed stripes]
Use Case – II :
Redirect-On-Write Snapshot
33. Calsoft Confidential 33
Use Case – II :
Operations
The Swift client can request snapshot creation on demand
A snapshot is created as a new object version in the container DB, and the respective files get
created on the object servers as PUT requests for the various stripes to be added, updated or
trimmed are received following the snapshot creation
The current/active object version is piggy-backed in the client response
The client has to explicitly do a listing operation over containers to list out all the snapshots.
During a list request, the response contains all the metadata of the objects for all the
snapshots
The client can also request the stripes of a specific snapshot in the GET request
34. Calsoft Confidential 34
Use Case – III :
Continuous Data Protection (CDP)
Overview
Swift supports object versioning
When versioning is enabled for a container, each new write on an object residing under that
container creates a new version of the object. Since today an object is nothing but the entire
file in Swift, significant storage is consumed to store each version of the file (object)
With object striping in Swift, an object version would store only the changed stripes of the
object, resulting in less storage consumption. Thus we can have a better CDP solution even
for storing versions of large objects
As the container DB stores information about objects, it would also store the size of the
delta, i.e. the total size of the stripes which got changed
Extending the view:
If Use Case I and Use Case III are seen together, near-CDP can be achieved using object
versioning at stripe granularity
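The storage-saving claim above is straightforward arithmetic; the sketch below (illustrative numbers, all figures in bytes) compares whole-file copies per version, as versioning works today, against stripe-level deltas:

```python
def versioning_cost(object_size, stripe_size, changed_stripes, versions):
    """Storage consumed by `versions` object versions: whole-file copies
    (versioning today) vs. stripe-level deltas (striped versioning)."""
    full_copies = object_size * versions
    stripe_deltas = changed_stripes * stripe_size * versions
    return full_copies, stripe_deltas
```

For example, a 1 GiB object with 1 MiB stripes and 2 changed stripes per version costs 100 GiB for 100 full-copy versions, but only 200 MiB as stripe deltas.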
35. Calsoft Confidential 35
[Diagram: a timeline t0 .. t11 up to tcurrent, with stripe-level changes recorded at each instant (rows 0-5); the "View Of Time t2" reconstructs the object as it existed at t2 from the stripes written up to that point]
Use Case – III :
Continuous Data Protection (CDP)
36. Calsoft Confidential 36
Use Case – III :
Operations
As per the current implementation, before object versioning starts, the client sends a request
to create a new container to store object versions and links it with the main container
Now, on every new write, a newer version of the object is created and becomes part of the
linked container
In the GET request, the client has to mention the version of the object. Based on the version
number, the proxy, working collaboratively with the container, determines the location of the
set of version stripes and requests the stripes from the object servers in parallel
If no object version is mentioned, the latest state of the object is fetched
37. Calsoft Confidential 37
Use Case – IV :
TRIM/UNMAP
Overview
Due to sparse object support, the TRIM/UNMAP command can be used to erase stripes,
thereby reducing used storage space
The extended attributes for trimmed stripes are as follows: stripe ID, Content-Type =
"trimmed", Content-Length, Etag per stripe (md5sum), creation time, and path (e.g. /A/C/obj)
When the Swift client wants to read such a sparse object, the Swift server sends information
about which stripes are trimmed in the HTTP headers of the response. A similar status can be
sent even if a stripe does not exist
The Swift client can then allocate zeroed pages for those trimmed stripes (file holes) and send
them to the application, so no data is transferred over the network
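One way to picture the per-stripe metadata is as a small record; the field names below mirror the attributes listed above but are illustrative, not Swift's actual xattr keys:

```python
# Hypothetical extended-attribute record for one trimmed stripe of a
# sparse sub-object (field names are illustrative).
trimmed_stripe_xattrs = {
    "path": "/A/C/obj",
    "stripe-ID": "S2",
    "Content-Type": "trimmed",    # marks the stripe as a file hole
    "Content-Length": 0,          # no stripe data remains on disk
    "Etag": None,                 # per-stripe md5sum no longer applies
    "Creation-Time": 1385000000,  # creation timestamp (epoch seconds)
}

def is_trimmed(xattrs):
    # The object server checks this flag before serving a stripe read.
    return xattrs.get("Content-Type") == "trimmed"
```

The same check answers both cases mentioned above: a trimmed stripe has this record, and a never-written stripe has no record at all.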
38. Calsoft Confidential 38
Use Case – IV :
Write/PUT Operations
TRIM as a write operation introduces holes into the object
The trim operation is initiated by the Swift client to de-allocate stripes
The TRIM request is passed to the respective object servers. The object server de-allocates the
range of blocks representing the stripes, and the file/sub-object hole is created
The corresponding entries in the extended attributes are updated as "trimmed"
In the response, along with the ACK, the object servers send the number of bytes removed
(stripe de-allocation information) in "Stripe delta size", together with the "Stripe Max offset".
In this case the "Stripe delta size" is negative, to represent that the stripe has been trimmed
The on-disk size of the object is updated in the container by subtracting those bytes
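The file holes described above are ordinary sparse-file behavior, which the short demo below illustrates. Note the simplification: a real object server would punch holes into existing data with fallocate(FALLOC_FL_PUNCH_HOLE) to de-allocate trimmed stripes, whereas this sketch merely creates a hole by writing past EOF and shows that holes read back as zeros.

```python
import os
import tempfile

def sparse_sub_object_demo(stripe_size=1024):
    """Create a sub-object with only stripe 3 populated; stripes 0-2 are
    file holes. Returns (apparent_size, first_stripe_bytes)."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.seek(3 * stripe_size)       # seeking past EOF leaves a hole
            f.write(b"x" * stripe_size)   # stripe 3 holds real data
        with open(path, "rb") as f:
            first_stripe = f.read(stripe_size)
        return os.path.getsize(path), first_stripe
    finally:
        os.unlink(path)
```

The apparent size spans the hole (4 stripes), while the blocks backing stripes 0-2 are never allocated, which is exactly the on-disk saving the container records via the negative "Stripe delta size".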
39. Calsoft Confidential 39
Use Case – IV :
Write/PUT Request Illustration

Scenario: the client wants to trim stripes S2 and S1 of Obj1 and stripes S1 and S3 of Obj2

Fig 1. HTTP PUT request from client to proxy (collated headers)

| Object path | Stripe ID | Stripe Size | http request offset | Stripe content length | Object offset |
|-------------|-----------|-------------|---------------------|-----------------------|---------------|
| /A1/C1/Obj1 | S2        | 1024        | NULL                | TRIM                  | 1024          |
| /A1/C1/Obj1 | S1        | 1024        | NULL                | TRIM                  | 0             |
| /A1/C2/Obj2 | S1        | 1536        | NULL                | TRIM                  | 0             |
| /A1/C2/Obj2 | S3        | 1536        | NULL                | TRIM                  | 3072          |

Fig 2. HTTP PUT request from proxy to one of the object servers (collated headers)

| Object path    | Stripe ID | Stripe Size | http request offset | Stripe content length | Object offset |
|----------------|-----------|-------------|---------------------|-----------------------|---------------|
| H(/A1/C1/Obj1) | S2        | 1024        | NULL                | TRIM                  | 1024          |
| H(/A1/C1/Obj1) | S1        | 1024        | NULL                | TRIM                  | 0             |
| H(/A1/C2/Obj2) | S1        | 1536        | NULL                | TRIM                  | 0             |

The client forms a single HTTP request to club the multiple trim requests on stripes and sends it
to the proxy. In the request, it sets the stripe content length to "TRIM" to signal the trim
operation. Refer to Fig 1.
Based on the stripe ID, the proxy identifies the object servers and clubs the requests destined
for the same object server. Fig 2 shows that the trim for S2 and S1 of Obj1 and S1 of Obj2 is
sent in a single request
40. Calsoft Confidential 40
Use Case – IV :
Write/PUT Response Illustration

Scenario: the client wants to trim stripes S2 and S1 of Obj1 and stripes S1 and S3 of Obj2

Fig 1. HTTP PUT response from object servers to proxy (collated headers)

| Object ID      | Stripe ID | Stripe delta size | Stripe Max offset | Status    |
|----------------|-----------|-------------------|-------------------|-----------|
| H(/A1/C1/Obj1) | S2        | -1024             | 1023              | "TRIMMED" |
| H(/A1/C1/Obj1) | S1        | -1024             | 0                 | "TRIMMED" |
| H(/A1/C2/Obj2) | S1        | -1536             | 0                 | "TRIMMED" |

Fig 2. HTTP PUT response from proxy to client (collated headers)

| Object path | Stripe ID | Stripe Size | Stripe delta size | Stripe Max offset | Status    |
|-------------|-----------|-------------|-------------------|-------------------|-----------|
| /A1/C1/Obj1 | S2        | 1024        | -1024             | 1023              | "TRIMMED" |
| /A1/C1/Obj1 | S1        | 1024        | -1024             | 0                 | "TRIMMED" |
| /A1/C2/Obj2 | S1        | 1536        | -1536             | 0                 | "TRIMMED" |
| /A1/C2/Obj2 | S3        | 1536        | -1536             | 3071              | "TRIMMED" |

The object servers de-allocate the stripes, update the extended attributes and send the response
to the proxy with status = "TRIMMED" and Stripe delta size = <bytes de-allocated>. Refer to Fig 1.
The proxy sends the response back to the Swift client with the same status. Refer to Fig 2.
41. Calsoft Confidential 41
Use Case – IV :
Read Operations
Since a client is initially unaware of the striping information, or its cached striping
information is stale, it may request a non-existing/trimmed stripe
The request is forwarded to the object server by the proxy
The object server can identify whether the stripe was never written or got trimmed
(marked as trimmed) from the extended attributes pertaining to the requested stripe
It sends this information to the Swift client via the proxy. No zeroed stripes are passed over
the network; instead, the relevant attributes in the HTTP response identify that the stripe
got trimmed
The Swift client prepares zeroed pages for the trimmed stripes and sends them to the
applications sitting over the Swift client
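The client-side handling above can be sketched as a single decision point. The header field names below are illustrative stand-ins for the per-stripe response headers, not Swift's actual names:

```python
def materialize_stripe(stripe_header, body):
    """Client-side handling of a per-stripe GET response. For a trimmed
    or never-written stripe the server sends no data, so the client
    allocates zeroed pages locally instead of reading them off the wire."""
    if stripe_header.get("stripe_content_length") == "TRIM":
        return b"\0" * stripe_header["stripe_size"]   # no bytes crossed the network
    return body
```

The zero-fill happens entirely at the client, which is what keeps trimmed stripes off the network.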
42. Calsoft Confidential 42
Use Case – IV :
READ/GET Request Illustration

Scenario: the client wants to read stripes on behalf of applications

Fig 1. HTTP GET request from client to proxy (collated headers)

| Object Path | Stripe ID | Stripe Size | Stripe content length | Object offset |
|-------------|-----------|-------------|-----------------------|---------------|
| /A1/C2/Obj1 | S2        | 1024        | 1024                  | 1024          |
| /A1/C2/Obj1 | S1        | 1024        | 1024                  | 0             |
| /A1/C2/Obj2 | S1        | 1536        | 1536                  | 0             |
| /A1/C2/Obj2 | S3        | 1536        | 1536                  | 4608          |

Fig 2. HTTP GET request from proxy to one of the object servers (collated headers)

| Object path    | Stripe ID | Stripe Size | Stripe content length | Object offset |
|----------------|-----------|-------------|-----------------------|---------------|
| H(/A1/C2/Obj1) | S2        | 1024        | 1024                  | 1024          |
| H(/A1/C2/Obj2) | S1        | 1536        | 1536                  | 0             |
| H(/A1/C2/Obj2) | S3        | 1536        | 1536                  | 4608          |

The client collates multiple stripe read requests and sends them to the proxy. Refer to Fig 1.
Based on the stripe ID, the proxy identifies the location of the object servers and forms a new
GET request for the stripes which reside on the same object server. In Fig 2, S2 of Obj1 and
S3 and S1 of Obj2 reside on the same object server
For the GET response, refer to the next slide
43. Calsoft Confidential 43
Use Case – IV :
READ/GET Response Illustration

Scenario: the client wants to read stripes on behalf of applications

Fig 1. HTTP GET response from one of the object servers to proxy (collated headers)

| Object ID      | Stripe ID | Stripe Size | http response offset | Stripe content length | Object offset |
|----------------|-----------|-------------|----------------------|-----------------------|---------------|
| H(/A1/C2/Obj1) | S2        | 1024        | 0                    | TRIM                  | 1024          |
| H(/A1/C2/Obj2) | S1        | 1536        | 0                    | TRIM                  | 0             |
| H(/A1/C2/Obj2) | S3        | 1536        | 0                    | TRIM                  | 4608          |

Fig 2. HTTP GET response from proxy to client (collated headers)

| Object Path | Stripe ID | Stripe Size | http response offset | Stripe content length | Object offset |
|-------------|-----------|-------------|----------------------|-----------------------|---------------|
| /A1/C2/Obj1 | S2        | 1024        | 0                    | TRIM                  | 1024          |
| /A1/C2/Obj2 | S1        | 1536        | 0                    | TRIM                  | 0             |
| /A1/C2/Obj2 | S3        | 1536        | 0                    | TRIM                  | 4608          |
| /A1/C2/Obj1 | S1        | 1024        | 0                    | TRIM                  | 0             |

Piggy-back headers:

| Object Path | Stripe Size | Object size | Object on-disk size | Object version delta size |
|-------------|-------------|-------------|---------------------|---------------------------|
| /A1/C2/Obj1 | 1024        | 4096        | 2048                | NO_SIZE                   |
| /A1/C2/Obj2 | 1536        | 4608        | 1536                | NO_SIZE                   |

The object server learns that the requested stripes are trimmed by checking the extended
attributes of the sparse object. It forms a response with stripe content length = TRIM and
sends it to the proxy. Refer to Fig 1.
The proxy in turn sends the response to the client with the same status. It also piggy-backs
the object striping information. Refer to Fig 2.
By reading the status, the Swift client comes to know that the stripes were trimmed. It creates
zeroed pages and sends them to the application which initiated the read request
44. Calsoft Confidential 44
Use Case – V :
Objects based file-system
Overview
One can have a file-system layer over objects. End users perform IO on files which are
internally mapped, at the Swift client end, to one or more Swift objects depending upon the
size of the file
Operations
A stackable file-system can be built over the objects which maps files to objects
Known frameworks like FiST or FUSE (file-system in user-space) can be used to implement
stackable file-systems
The file-system could be a pass-through, just handling the mapping of files and objects, or
could provide extended capabilities like encryption, compression, etc.
When IOs are issued over files directly mapped to objects, only the stripes that have changed
are written back by clients, which reduces network bandwidth consumption
Similarly, only the stripes of the objects corresponding to the file blocks being read are
transmitted over the network
45. Calsoft Confidential 45
Use Case – V :
Objects based file-system

[Diagram: applications App1 and App2 access files FS1 and FS2 through a stackable file-system in the Swift client; the files map to objects Obj 1 and Obj 2, whose stripes flow through the Swift proxy to the account, container and object server rings]
46. Calsoft Confidential 46
Enhancements
Enhancements to Approach II:
The stripe size for a client can differ across sessions depending on the client and network
capabilities
The stripe size on the server can be constant, but for clients it can vary
The client, based on the network bandwidth available to access the server nodes, can decide
its stripe size and share it with the server nodes
Communication between server nodes (proxy and object servers) can happen with a different
stripe size
But requests and responses to/from clients will be addressed based on their stripe size
With this, stripes in the sub-objects could be partially overwritten, which can be handled by
using the "stripe content length" header field
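When client and server stripe sizes differ, each client stripe has to be mapped onto the server stripes it overlaps. The sketch below (illustrative, not Swift code) computes those ranges; a partially covered server stripe is the case where the "stripe content length" field would be smaller than the server stripe size:

```python
def server_ranges(client_stripe_index, client_stripe_size, server_stripe_size):
    """Map one client-side stripe onto (server_stripe, offset, length)
    ranges when client and server use different stripe sizes."""
    start = client_stripe_index * client_stripe_size
    end = start + client_stripe_size
    ranges, pos = [], start
    while pos < end:
        s, off = divmod(pos, server_stripe_size)
        n = min(server_stripe_size - off, end - pos)
        ranges.append((s, off, n))   # n < server_stripe_size means a partial overwrite
        pos += n
    return ranges
```

For example, client stripe 1 of size 1536 against a server stripe size of 1024 covers the second half of server stripe 1 and all of server stripe 2.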
47. Calsoft Confidential 47
Future Scope
Future Scope:
The proxy could be a bottleneck, as it handles all the control and data requests from the Swift
clients
To overcome this bottleneck, a completely new design, where Swift clients communicate with
object servers bypassing the proxy server, can be thought over
In such a design, Swift clients would go to the proxy only once, to get access to the object
servers and the location of the object within the object server
Once the client is aware, it can directly send HTTP requests to the object servers
With such an approach there could be security issues, which would need to be tackled by the
object servers and thought through well
54. Calsoft Confidential 54
References
Lustre file-system architecture
Swift Architecture
Swift Large Objects Dynamic and Static
Vectored IO or Scatter/Gather IO
Sparse file and Thin provisioning
SCSI target and Logical Volume Manager
Redirect-On-Write Snapshot
Continuous Data Protection
TRIM/UNMAP
FiST and FUSE
55. Calsoft Confidential 55
See also
Enterprise aware/embracing OpenStack
OpenStack enabler
Intelligent Load Balancer for Compute Resources
OpenStack Health Monitoring & Management Framework
Tempest extension for Horizon testing support
56. Calsoft Confidential 56
Enterprise Aware/Embracing OpenStack
OpenStack components/services do not leverage the high-end capabilities of appliances
(software & hardware) built by enterprise vendors in the compute, storage and networking space.
The appliances are efficient, greener, low-maintenance, and have smaller form factors suited to
datacenters. Embracing these well into the OpenStack framework, along with their enhanced
features, adds to OpenStack adoption across the industry.
For instance, enterprise storage vendors (NAS, SAN, SSD-based arrays) have features like
replication, snapshotting and de-duplication, and at times also provide object store support.
Enterprise network vendors have features like Deep Packet Inspection & Analysis, firewalls, etc.
Leveraging these features can help reduce processing and power consumption, and increase
efficiency in the respective areas of cloud infrastructure.
Going along with the enterprise vendors helps OpenStack become omnipresent, increasing the
relevance and longevity of OpenStack for the industry and community.
This is a seed of thought for making OpenStack more and more relevant in the emerging cloud
world from a strategic and business perspective.
Contributors: Tejas Sumant, Shriram Pore
57. Calsoft Confidential 57
OpenStack Enabler
The idea behind the OpenStack enabler is to have a tool or application which helps OpenStack
based cloud providers to deploy their cloud infrastructure. The enabler graphically assists a
provider to estimate the hardware requirements for meeting the cloud service needs. It also
provides a deployment view of various OpenStack services to orchestrate the infrastructure.
This can give possible blueprints of the cloud deployment which the cloud service provider or
private cloud administrator can choose from.
Consider the following use cases:
Case 1:
If the available hardware information is provided, then the OpenStack enabler will predict all
possible ways to deploy the cloud and also predict the virtualization capacity generated by the
given set of hardware.
Case 2:
If the virtualization needs are provided, which are tunable parameters like number of VMs,
storage capacity per VM, processing power per VM, RAM consumption per VM, object-serving
throughput, etc., then the OpenStack enabler predicts the hardware required to satisfy the
need. It can even estimate the CapEx requirements.
Contributors: Sachin Patil, Shrirang Phadke, Shriram Pore
58. Calsoft Confidential 58
Intelligent Load Balancer for Compute
Resources
Compute resource monitoring enables administrators to obtain the QoS, health and resource
utilization parameters. This helps achieve the migration of compute resources (VMs) as and
when required. With support for Heat auto-scaling in mind, the idea is to create a framework
that will enable effective usage of Nova compute resources, utilizing the monitoring
capabilities on the compute resources (hosts) and performing live migration between hosts as
required.
The load balancer will create an automated system with decision-making capability to act on
the results of monitoring. It will also enable greener resources by enabling and disabling
compute resources on demand.
It covers provisioning, live migration and auto-scaling (provided the resources support remote
management capabilities).
The framework will in turn utilize the supported APIs provided by the underlying virtualization
platforms. This framework can also be extended to include:
Automated and seamless cross-platform migration using OVF (Open Virtualization Format)
without user intervention, e.g. migration between OpenStack implementations hosted on
VMware, Citrix, KVM, Hyper-V and so on. HA (High Availability), FT (Fault Tolerance) and SRA
(Scattered Resource Administration) can also be achieved.
Contributors: Swapnil Kulkarni, Shriram Pore
59. Calsoft Confidential 59
OpenStack Health Monitoring &
Management Framework
With OpenStack deployments and usage in mind, the idea is to provide a standardized,
integrated yet independent health monitoring and management framework for OpenStack
services and infrastructure (hypervisor, storage and network). The framework can be based on the
Common Information Model (CIM). CIM is an open standard that defines how managed
elements in an IT environment are represented as a common set of objects and relationships
between them. This is intended to allow consistent management of the elements, independent
of their manufacturer or provider.
For OpenStack services, CIM based management framework can be provided with defined
available interfaces (preferably REST & CLI) for monitoring & standardized service responses.
The REST interfaces could be used by OpenStack dashboard or user client directly, whereas the
standard service responses can be used by standard monitoring tools. This will help standardize
OpenStack monitoring capabilities for existing as well as future services.
For OpenStack infrastructure, the same set of interfaces can be implemented for underlying
hardware to get the health statistics. It can be SNMP based CIM implementation or can vary
depending on the provider.
Above implementation of OpenStack health monitoring will help in:
1. Standardized service status responses
2. Standardized API for health monitoring for services and infrastructure
Contributors: Swapnil Kulkarni, Arun Sharma, Shriram Pore
60. Calsoft Confidential 60
Tempest Extension for Horizon Testing
User interface is one of the major drivers of cloud administrators' acceptance. The more user
friendly and bug-free it is, the higher the acceptance and adoption. Horizon, being the user
interface for OpenStack, needs to stand this test and gain wider acceptance. The interface
needs to be standardized, consistent and bug-free, and hence requires testing. But testing a
user interface is not similar to testing other services/components of OpenStack; hence we
suggest an extension to Tempest for Horizon testing.
As OpenStack is growing, here is an integration test which will ensure that Horizon runs
smoothly. It tests the basic functionality of the three central dashboards, "User Dashboard",
"System Dashboard" and "Settings", and further extends the scope to more complex and
complicated scenarios.
Currently, in the Tempest project, there is no test coverage for the Horizon dashboard. In this
paper we aim to introduce testing of OpenStack Horizon as part of the Tempest framework,
using the Selenium Python web driver and the unittest(2) Python standard library. This will
execute the Horizon tests under continuous integration. We suggest the use of Xvfb (virtual
frame-buffer X server for X Version 11).
This will enable the current directory structure of Tempest to be extended with 'horizon'
support, making no changes to the existing framework.
Contributors: Shrikant Chaudhari, Shriram Pore
62. Calsoft Confidential 62
Co-Presenters Biography
Parag Kulkarni – VP Engineering, Calsoft Inc.
A veteran of the storage industry
More than 20 years of experience in architecting and developing products
Key strength lies in quickly understanding product requirements and translating them into
architectural and engineering specs for implementation
Led the engineering team at Calsoft
Led the development of the Database Editions product at Veritas (Symantec)
A key contributing member at leading storage companies like Informix (IBM)
Master of Technology in Computer Science from IIT Roorkee
Degree in Industrial Management from the University of Indore, India
63. Calsoft Confidential 63
Authors Biography
Shriram Pore - Senior Architect, Calsoft Inc.
A veteran of the storage industry
More than 12 years of experience in architecting and developing products
Key strength lies in quickly understanding product requirements and translating them into
architectural and engineering specs for implementation
Working on virtualization and cloud platforms, with a focus on OpenStack
Building in-house expertise and executing projects using or extending OpenStack
Architected, designed and implemented solutions for NAS, virtualization-integrated
backup-recovery and cloning of VMs
Led the FS and replication teams on the file-server, and also led the CORE component team
from a system management perspective (configuration, provisioning and management of
various facilities of the file-server)
Executed ideation projects related to the Lustre file-system
Master of Computer Science from Pune University, India
Bachelor of Computer Science from Pune University, India
Publication - http://goo.gl/3eVwAm - Pragmatic about software product performance
Publication - http://goo.gl/BbABCO - How can Hypervisors leverage Advanced Storage features?
To know more about Shriram, connect on http://lnkd.in/iWFJz2
64. Calsoft Confidential 64
Thank You.
Calsoft Inc.
www.calsoftinc.com
smm@calsoftinc.com
USA
2953 Bunker Hill Lane, Suite 400,
Santa Clara, CA 95054.
Phone: +1 (408) 834 7086
INDIA
SR Iriz, 4th Floor, Plot A, S.No. 134/2/1
Pashan - Baner Link Road, Pune 411008
Phone: +91 (20) 4079 2900
Editor's Notes
15 June 2007
Not taking too much time, let us jump to the agenda.
It is as shown: first we quickly go through the problem statement that we are trying to address
Followed by a simple overview of the proposed solution
There are a couple of approaches that we have come up with, but before that we will touch base on the prerequisite changes needed to build the solution, irrespective of the approach that is chosen.
After discussing the approaches, we will go through five use cases as to how this can be used in solving certain current-world problems, and its fitment into the real-world picture
Finally, some enhancements and future scope, which we are still working on.
We will not discuss it further, but the Appendix section shows the JSON representation of the HTTP headers, more for the developers who believe in this idea and would want to implement it
There are some following slides in the See Also section; these are upcoming papers which will be published on Calsoft's website
Point 1
Having said that, Swift does support large objects in the form of dynamic and static large objects, which we all must be aware of
Swift does this by dividing the large objects into segments and uses a manifest file approach; the manifest file is the root of the large object, which stores information about all the segments that make up the object
There are some limitations to this
Obviously, being large objects: 1st limitation
Secondly, 2nd limitation: as the number of segments grows, the manifest file handling gets difficult
Last but the most important limitation that we would like to highlight is that large objects and regular small & medium size objects are today handled differently
The equation of the proposed solution looks somewhat as shown here
Object striping is a technology, and when it is blended with sparse objects plus parallel reads/writes to multiple object servers plus vectored IO support at the object server end, it makes up object striping in Swift.
Object striping is similar to segmentation or chunking, but we do not want to confine it to large objects
Object striping leads to spanning a single object across multiple object servers; what that means we will re-visit in 2 minutes
Sparse objects: this is something interesting that struck us when we started thinking about striping.
The idea is that not all the stripes of the object need to be consumed
For instance, as the example given shows, the object size could be 1 GB but the consumed stripes could be just 1 MB
Essentially, we can be more storage efficient by making the objects sparse
Rather, as we go ahead, we will all realize that object size is something that we are not limited by at all; the object size could be as large as desired
With object striping we intend to distribute the stripes of a single object to multiple different partitions.
Since certain sets of partitions reside on certain object servers, we have distributed a single object across multiple object servers. In other words, a single object spans multiple object servers, further multiplied by the desired redundancy
This means that reads and writes of a single object can now happen in parallel from multiple object servers.
This adds to the complexity at the proxy level, which should not hamper performance, so it really needs to be done optimally
So, just to give an overall picture of how the object would look when stored in Swift:
A single object is broken down into multiple sub-objects; these sub-objects lie in unique partitions
These sub-objects are sparse files which store only those stripes of the object which are intended to be stored in that particular partition.
The sub-object could hold one or more unique stripes, each at its specific offset w.r.t. the object
This means that the sub-object itself is sparse; that is where vectored IO support is desired, to read and write multiple stripes of a single sub-object at multiple offsets within the sub-object