In layman's terms, this is a file system whose core activities are identifying hot and cold data, moving cold data to secondary storage (Dropbox here), and retrieving cold data from secondary storage.
In this project we implemented this file system and handled the general and special cases needed for seamless transfer of data from hot to cold and from cold to hot.
Hot and cold data storage
1. HOT COLD
Unified Virtual File System for Hot & Cold Data Storage
Aditya Ambre, Madhura S. Raghavan, Rohit Arora
ENTERPRISE STORAGE ARCHITECTURE
GROUP 2
2. AGENDA
CSC 568 Enterprise Storage Architecture (NC State University)
➔ Problem Statement
➔ Project Goals and Features
➔ Architecture and Workflow
➔ Verification Cases
➔ Summary
3. PROBLEM STATEMENT
➔ Lifecycle of data.
◆ Access frequency.
◆ Storage capacity and hardware characteristics.
➔ User intervention - Running jobs/scripts.
➔ Acknowledging data temperature
➔ Tight coupling needed between storage components
[Figure labels: Least Frequently Accessed Data; Frequently Accessed Data]
4. WHAT IS A HOT FILE?
A data file that
➔ Is very frequently accessed.
➔ Mostly contains business critical information.
➔ Needs to be accessed quickly.
5. WHAT IS A COLD FILE?
A data file that
➔ Is infrequently accessed.
➔ Contains less important information.
➔ Need not be quickly accessed.
6. GOAL: WHAT IS OUR PROJECT?
➔ From decoupled storage components to a tightly coupled two-tiered storage system.
➔ Manage hot & cold data between primary and secondary storage.
➔ Manage primary storage space utilization.
➔ File transfers do not interrupt FS operations.
➔ Users are agnostic of file transfer and storage.
➔ Optimal storage of cold data.
7. WHAT OUR PROJECT IS?
8. FEATURES
➔ Infinite storage illusion
➔ Automatic cold data identification and transfer
➔ Consistent CRUD operations for both hot and cold files
➔ Block-level storage
➔ On-the-fly deduplication
➔ Uninterrupted file access
➔ File-level consistency
➔ Optimal storage space utilization
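The block-level storage and on-the-fly deduplication features can be illustrated with a minimal sketch. It is not the project's code: the fixed 4096-byte block size, the choice of SHA-256, the in-memory dict standing in for cold storage, and the names split_blocks, store_file and load_file are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size; the slides do not state one


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Yield fixed-size blocks of data (the last one may be shorter)."""
    for offset in range(0, len(data), block_size):
        yield data[offset:offset + block_size]


def store_file(data: bytes, cold_store: dict) -> list:
    """Store data block by block, skipping blocks already in cold_store.

    Returns the ordered list of block digests needed to rebuild the file.
    """
    recipe = []
    for block in split_blocks(data):
        digest = hashlib.sha256(block).hexdigest()  # hash choice is an assumption
        if digest not in cold_store:  # a duplicate block is not transferred again
            cold_store[digest] = block
        recipe.append(digest)
    return recipe


def load_file(recipe: list, cold_store: dict) -> bytes:
    """Rebuild a file from its ordered block digests."""
    return b"".join(cold_store[digest] for digest in recipe)
```

Because blocks are keyed by their content hash, two identical blocks (within one file or across files) occupy a single entry in the store, which is one way to get both deduplication and compact cold storage.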
9. OUR ARCHITECTURE
[Figure: architecture diagram. Components: APPLICATION (Write, Read); FUSE OPERATIONS (Read, Write, Delete, Rename, etc.); File Tracking Layer (Hot File Tracking, Cold File Tracking); Data Block Processing Layer (Write block to cold, Get block from cold, De-duplication); COLD STORAGE (entries 2f0f3ff2c7439635e7faa85…, 3f35ec5fe4ae0b963779c8…, 4a8f9ec938243beac4b2d…). Legend: Hot File, Cold File.]
10. HOT-TO-COLD WORKFLOW
[Figure: workflow diagram. Components: APPLICATION (Write); FUSE {WRITE} OPERATIONS; File Tracking Layer; Data Block Processing Layer; COLD STORAGE.]
13. HOT-TO-COLD WORKFLOW
File Tracking Layer (Cold File Tracking)
1. List all the files.
2. Sort files by access time, oldest to newest.
3. Select files to be transferred (till <= 50%).
4. Sort the selected files by size, large to small.
5. Send the largest and least accessed files to the Data Block Processing Layer.
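The five selection steps above can be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation: the FileInfo record, the capacity and target_fraction parameters, and the reading of "till <= 50%" as "stop once hot storage utilization would be at or below 50%" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    size: int      # bytes
    atime: float   # last access time, seconds since the epoch


def select_cold_candidates(files, capacity, target_fraction=0.5):
    """Pick files to move to cold storage, following the five listed steps.

    Selection stops once the data left on hot storage would be at or below
    target_fraction * capacity (an assumed reading of "till <= 50%").
    """
    used = sum(f.size for f in files)
    target = target_fraction * capacity
    selected = []
    # Steps 1-2: list the files and sort by access time, oldest first.
    for f in sorted(files, key=lambda f: f.atime):
        if used <= target:  # Step 3: stop at the target utilization
            break
        selected.append(f)
        used -= f.size
    # Step 4: order the selected files by size, largest first.
    selected.sort(key=lambda f: f.size, reverse=True)
    # Step 5: this list is what would be handed to the block layer.
    return selected
```

Sorting by access time first and by size second means only the least recently used files are considered at all, and among those the largest go first, so space is reclaimed with as few transfers as possible.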
22. COLD-TO-HOT WORKFLOW
[Figure: workflow diagram. Components: APPLICATION (Read Request); FUSE {READ} OPERATIONS; File Tracking Layer; Data Block Processing Layer; COLD STORAGE (entries 2f0f3ff2c7439635e7faa85…, 3f35ec5fe4ae0b963779c8…, 4a8f9ec938243beac4b2d…). Label: Check: Is File on Hot Storage?]
23. COLD-TO-HOT WORKFLOW
[Figure: workflow diagram. Components: APPLICATION (Read Request); FUSE {READ} OPERATIONS; File Tracking Layer; Data Block Processing Layer (Get block from cold); COLD STORAGE (entries 2f0f3ff2c7439635e7faa85…, 3f35ec5fe4ae0b963779c8…, 4a8f9ec938243beac4b2d…). Labels: Check: Is File on Hot Storage?; No.]
24. COLD-TO-HOT WORKFLOW
Data Block Processing Layer: Get Block from Cold
1. Request copy of Hashtable
2. Get Hashtable
[Figure: steps 1 (Request Hashtable) and 2 (Gets Hashtable) drawn against COLD STORAGE.]
25. COLD-TO-HOT WORKFLOW
Data Block Processing Layer: Get Block from Cold
1. Request copy of Hashtable
2. Get Hashtable
3. Read block presence on cold
[Figure: step 3 (Is block present?) drawn against COLD STORAGE. Hashtable entries: 2f0f3ff2…, 7439635…, e7faa85…, 3f35ec5f…, e4ae0b9...]
26. COLD-TO-HOT WORKFLOW
Data Block Processing Layer: Get Block from Cold
1. Request copy of Hashtable
2. Get Hashtable
3. Read block presence on cold
4. Request/Get block from cold
[Figure: step 4 (Request Block, Gets Block) drawn against COLD STORAGE. Hashtable entries: 2f0f3ff2…, 7439635…, e7faa85…, 3f35ec5f…, e4ae0b9... Blocks: Block 1, Block 2, Block 3.]
27. COLD-TO-HOT WORKFLOW
Data Block Processing Layer: Get Block from Cold
1. Request copy of Hashtable
2. Get Hashtable
3. Read block presence on cold
4. Request/Get block from cold
5. Write transferred block content to memory block
6. Construct complete file
[Figure: step 6 drawn over Block 1, Block 2, Block 3, next to COLD STORAGE. Hashtable entries: 2f0f3ff2…, 7439635…, e7faa85…, 3f35ec5f…, e4ae0b9...]
28. COLD-TO-HOT WORKFLOW
Data Block Processing Layer: Get Block from Cold
1. Request copy of Hashtable
2. Get Hashtable
3. Read block presence on cold
4. Request/Get block from cold
5. Write transferred block content to memory block
6. Construct complete file
7. Delete copy of Hashtable
[Figure: step 7 (Delete Hashtable) drawn next to Block 1, Block 2, Block 3 and COLD STORAGE. Hashtable entries: 2f0f3ff2…, 7439635…, e7faa85…, 3f35ec5f…, e4ae0b9...]
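The seven cold-to-hot steps can be sketched as below. This is a hedged illustration only: InMemoryColdStore is a hypothetical stand-in for cold storage (not a Dropbox client), the hashtable is assumed to map a file name to its ordered block digests, and SHA-256 and the 4-byte block size are chosen only to keep the example small.

```python
import hashlib


class InMemoryColdStore:
    """Hypothetical stand-in for cold storage, used only for this sketch."""

    def __init__(self):
        self.blocks = {}    # digest -> block bytes
        self.recipes = {}   # file name -> ordered list of digests

    def put_file(self, name, data, block_size=4):
        digests = []
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks[digest] = block
            digests.append(digest)
        self.recipes[name] = digests

    def get_hashtable(self):
        # Steps 1-2: hand out a copy that the caller consults locally.
        return dict(self.recipes)

    def has_block(self, digest):
        return digest in self.blocks

    def get_block(self, digest):
        return self.blocks[digest]


def restore_file(name, cold):
    """Rebuild one file from cold storage, mirroring steps 1-7."""
    hashtable = cold.get_hashtable()        # steps 1-2: request and get the copy
    buffer = bytearray()
    for digest in hashtable[name]:
        if not cold.has_block(digest):      # step 3: check block presence
            raise KeyError(f"block {digest} missing on cold storage")
        buffer += cold.get_block(digest)    # steps 4-5: fetch, write to memory
    data = bytes(buffer)                    # step 6: construct the complete file
    del hashtable                           # step 7: delete the local copy
    return data
```

Working from a local copy of the hashtable keeps the number of round trips to cold storage proportional to the number of blocks, not to the number of lookups.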
29. COLD-TO-HOT WORKFLOW
[Figure: workflow diagram. Components: APPLICATION (Read Request); FUSE {READ} OPERATIONS; File Tracking Layer; Data Block Processing Layer (Block Read Request, Get block from cold); COLD STORAGE (entries 2f0f3ff2c7439635e7faa85…, 3f35ec5fe4ae0b963779c8…, 4a8f9ec938243beac4b2d…). Labels: Read; No.]
30. MINIMAL THRESHOLD WORKFLOW
[Figure: workflow diagram. Components: APPLICATION (Some Operation); FUSE {READ} OPERATIONS; File Tracking Layer (Hot File Tracking); Data Block Processing Layer (Block Read Request, Get block from cold); COLD STORAGE (entries 2f0f3ff2c7439635e7faa85…, 3f35ec5fe4ae0b963779c8…, 4a8f9ec938243beac4b2d…). Labels: Check: Storage <= 30%; Yes; Get Cold File.]
31. READ OPERATION WORKFLOW
[Figure: workflow diagram. Components: APPLICATION (Some Operation); FUSE {READ} OPERATIONS; File Tracking Layer (Hot File Tracking); Data Block Processing Layer (Block Read Request, Get block from cold); COLD STORAGE (entries 2f0f3ff2c7439635e7faa85…, 3f35ec5fe4ae0b963779c8…, 4a8f9ec938243beac4b2d…). Labels: Check: Storage > 30% & < 70%; Yes; Get Cold File.]
32. QUICK DEMO
33. SCENARIOS / VERIFICATION CASES
I. GENERAL
➔ File system 70% full -> Transfer to cold storage.
➔ File system usage drops below 30% -> Transfer from cold storage.
➔ File transfers -> Do not interrupt general FS operations.
➔ Redundant/duplicate blocks -> Not transferred.
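The two utilization thresholds in the general cases can be expressed as a small decision function. This is a sketch, not the project's code: the names plan_transfer, HIGH_WATERMARK and LOW_WATERMARK are hypothetical, and the boundary handling (at or above 70%, at or below 30%) follows the "Storage <= 30%" and "Storage > 30% & < 70%" checks shown in the workflow slides.

```python
HIGH_WATERMARK = 0.70  # from the slides: at 70% full, move data to cold storage
LOW_WATERMARK = 0.30   # from the slides: at 30% or less, bring cold data back


def plan_transfer(used_bytes: int, capacity_bytes: int) -> str:
    """Return which background transfer, if any, the utilization calls for."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    utilization = used_bytes / capacity_bytes
    if utilization >= HIGH_WATERMARK:
        return "hot_to_cold"
    if utilization <= LOW_WATERMARK:
        return "cold_to_hot"
    return "none"  # between the watermarks: serve operations from hot storage
```

Keeping a gap between the two watermarks avoids files bouncing back and forth when utilization hovers around a single threshold.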
34. SCENARIOS / VERIFICATION CASES
II. SPECIFIC
➔ Files transferred -> Based on access and size.
➔ File removed on hot storage -> After last block is transferred.
➔ File in transition accessed -> Abort transfer, access granted!
➔ File space reclamation and file access -> Synchronized.
➔ Only one background process runs at any given time.
➔ Delayed delete (rm) -> Transparent to user.
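Two of the specific cases, aborting a transfer when the file is accessed and allowing only one background transfer at a time, could be coordinated along the following lines. This is a sketch under assumptions: TransferManager, transfer, on_access and the send_block callback are hypothetical names, and the slides do not say how the project actually synchronizes these paths.

```python
import threading


class TransferManager:
    """Sketch: one background transfer at a time, abortable on file access."""

    def __init__(self):
        self._busy = threading.Lock()    # held while a transfer is running
        self._state = threading.Lock()   # guards _current and the abort flag reset
        self._abort = threading.Event()
        self._current = None             # path being transferred, if any

    def transfer(self, path, blocks, send_block):
        """Send blocks in order; return True only if the whole file was sent."""
        if not self._busy.acquire(blocking=False):
            return False                 # another transfer is already running
        try:
            with self._state:
                self._current = path
                self._abort.clear()
            for block in blocks:
                if self._abort.is_set():  # the file was accessed mid-transfer
                    return False          # hot copy stays in place
                send_block(block)
            return True                   # caller may now remove the hot copy
        finally:
            with self._state:
                self._current = None
            self._busy.release()

    def on_access(self, path):
        """Called on file access; aborts the transfer if it targets this file."""
        with self._state:
            if self._current == path:
                self._abort.set()
```

Returning True only after the last block has been sent matches the rule that the hot copy is removed after the last block is transferred; an aborted transfer leaves the hot file untouched so the access can be served immediately.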
35. ASSUMPTIONS
➔ Network is always available.
➔ Hot-cold classification is at file level.
➔ Cold storage is infinite.
➔ Files are not very small or very large.
➔ Delay is acceptable for rarely accessed files.
➔ File access granularity is in seconds.
36. SUMMARY
➔ Acknowledged data temperatures - hot and cold
➔ Project features
◆ Auto file identification
◆ File transfer
◆ Deduplication
➔ Architecture and workflows in action
➔ Design and implementation of the File Tracking Layer
➔ Design and implementation of the Data Block Processing Layer
➔ Design decisions for specific verification scenarios
37. FUTURE SCOPE
➔ Variable block size and block size specifications.
➔ Garbage collection on secondary/cold storage.
➔ Cold file identification parameters and profiles.
➔ Distributed cold storage.
38. REFERENCES
1. S. Quinlan and S. Dorward, “Venti: A new approach to archival storage,” in Proceedings of the First USENIX Conference on File and Storage Technologies (FAST), 2002. http://plan9.bell-labs.com/sys/doc/venti/venti.pdf
2. Chuanyi Liu, Dapeng Ju, et al., “Semantic data de-duplication for archival storage systems,” in Proceedings of the 13th IEEE Asia-Pacific Computer Systems Architecture Conference (ACSAC 2008), Hsinchu, Taiwan, August 2008.
3. Sean Quinlan, Jim McKie, Russ Cox, “Fossil, an Archival File Server,” Lucent Technologies Bell Labs, unpublished memorandum (September 2003).
4. http://www.storiant.com/resources/Cold-Storage-Is-Hot-Again.pdf
5. “What is Unified Storage system,” http://searchstorage.techtarget.com/definition/unified-storage
6. File System in User Space - http://fuse.sourceforge.net/
39. QUESTIONS?