1. G-Cube OpenStack Solution
G-Cube Inc.
This information is confidential and was prepared by G-Cube solely for the use of our client and investor; it is not to be relied on by any 3rd party without G-Cube's prior written consent
2. G-Cube OpenStack Solution
• Data Defined Storage (DDS)
-Integrated data-centric management architecture
-Storing, retaining, and accessing data based on content, meaning, and value.
-Core technology
Media Independent Data Storage
Data Security & Identity Management
Distributed Metadata Repository
http://en.wikipedia.org/wiki/Data_Defined_Storage
SEO 141028-IR-Business proposal v05
3. Software Defined Storage vs Data Defined Storage
Software Defined Storage
• Storage-centric management
• User manages storage.
• SDS describes storage.
• Human (should) know
-Storage features
Data Defined Storage
• Data-centric management
• User describes data
• DDS manages storage
• DDS (should) know
-Data description
4. In Storage Centric Management Architecture
Deliver package 1 by tomorrow, package 2 by this weekend, package 3 by the end of this month…
what they do
Who delivers? A human (admin)
High cost, low efficiency.
5. In Storage Centric Management Architecture
Deliver package 1 by tomorrow, package 2 by this weekend, package 3 by the end of this month…
what they do
Who delivers?
• Automated by existing solutions, which leave users unsatisfied
6. In Data Centric Management Architecture
Deliver package 1 by tomorrow, package 2 by this weekend, package 3 by the end of this month…
what they do
Data description
Package 1 via airline, package 2 by truck…
Who delivers?
Automated & unified management
Low cost & high efficiency
Satisfies users!
what we (will) do
7. Why Data Defined Storage Is Needed
• Neither services nor storage are flat any longer.
-Multi-Tenant Service
New service frameworks: Cloud, VDI, Big-data.
Traditional services: DB, WAS, Multimedia...
-Multi-Aspect Storage
Before: system memory + hard disk (Enterprise or Personal)
Now: memory + SSD + HDD + cloud storage
- Memory devices, volatile and non-volatile (DRAM, NAND Flash, MRAM, FeRAM), interfaces/protocols (SATA/SAS, PCI-E NVMe), hardware (SLC/MLC/TLC, rpm, redundancy level), locality (local DAS, SAN, WAN)
• A Media Independent Data Storage solution is needed.
-Administrators and users cannot be expected to understand complex storage characteristics well enough to map data efficiently.
• Data I/O does not always have to be high-performance; the goal is to provide the data I/O service that best matches the level the user wants.
-Guarantee the best overall service quality at minimum cost.
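8. Market Outlook for the G-Cube OpenStack Solution
• Markets where adoption is feasible or relatively easy
-Private cloud provider
Game publishers, SME-scale virtual machine/VDI environments, portal companies, and data centers of universities, institutions, and companies.
Markets where the service user and provider are the same or tightly coupled, so that reducing TCO through cost savings or service quality improvement matters.
Environments where data with diverse characteristics (game data, blogs/homepages, media, images, user/customer accounts, etc.) is managed together.
-Big data infra, SNS service provider
Vast amounts of data exist, and the cost of maintaining and managing that data is high.
As data volume and user counts grow explosively, equipment is frequently added and expanded because of overall service quality degradation or capacity shortage.
-Markets where TCO or performance requirements, and the purchase and license costs of the equipment and solutions to be adopted for scale-out, are considerably higher than the development cost of adopting a new OpenStack solution.
Market entry is easier where the customer has relevant in-house development staff.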
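9. Market Outlook for the G-Cube OpenStack Solution
• Markets where adoption is difficult (for now)
-Public cloud infra provider
Amazon Web Service, Google Cloud Service, Microsoft Cloud Service…
The ultimate market competitors.
Through economies of scale, they cost-efficiently provide scale-out data storage service environments in which service users and providers are loosely coupled or separated (e.g., a bulk delivery service).
In the enterprise cloud segment, they achieve node/storage configuration according to QoS (e.g., an express delivery service).
-Enterprise storage market
Oracle, SAP.
High data reliability, such as ACID, is important.
Entry is easy only when stability references for services and products proven over a long period are available.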
10. G-Cube OpenStack Layers
• cluster/cloud frameworks, applications
-key-value system
-big data analysis
-global file system
-etc.
• osd virtualization - cluster : Portal
-Provides multiple osd devices in an integrated way over the network.
-Provides query and set of attributes, status, and so on through an interface consistent with that of the local osd device (adding non-storage information such as network).
-Depending on how broadly the local osd device is defined, this may shrink to a layer that performs only simple mapping.
• local resource mgmt : repository
-Integrates and manages node resources such as memory and cpu, not only storage, and provides services that use those resources (caching, etc.).
• osd device : container
-Bundles multiple devices and presents them as one.
-Uses devices according to the specified data attributes.
-Provides query and set of the underlying devices' attributes and status, and of behavior according to data attributes.
• block device driver, nvme driver
• storage hardware (memory, SSD, HDD)
• Boundaries: cluster boundary, single node boundary, virtual storage boundary, physical storage boundary
11. Layers & their key roles
• portal (dock): coordination, cluster-level
-namespace (object – node)
-network
-distributed functions
-scheduler
• repository: local resource management, node-level
-local functions
-namespace (object – container)
-scheduler
• container: local storage management, device-level
-namespace (object - LBNs)
• device
-represents a block device file, which can stand for hdd/ssd/ramdisk, array, networked device, logical volume, partition, and so on.
-talks to the container with a feature set
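As a rough illustration of the three namespaces listed for the portal, repository, and container layers, an object lookup could be chained as below. This is only a sketch under stated assumptions: the table names, the object name, the node and container identifiers, and the LBN range are all hypothetical, and plain dictionaries stand in for whatever structure each layer actually uses.

```python
from typing import Dict, Tuple

# Hypothetical per-layer namespace tables (all names and values are made up).
PORTAL_NS: Dict[str, str] = {"movie.avi": "node-1"}                     # object -> node
REPOSITORY_NS: Dict[str, Dict[str, str]] = {
    "node-1": {"movie.avi": "container-hdd"},                           # object -> container
}
CONTAINER_NS: Dict[str, Dict[str, range]] = {
    "container-hdd": {"movie.avi": range(2048, 4096)},                  # object -> LBNs
}


def resolve(obj: str) -> Tuple[str, str, range]:
    """Walk the namespaces top-down: cluster level, node level, device level."""
    node = PORTAL_NS[obj]
    container = REPOSITORY_NS[node][obj]
    lbns = CONTAINER_NS[container][obj]
    return node, container, lbns


node, container, lbns = resolve("movie.avi")
print(node, container, lbns.start, lbns.stop)
```

Each layer only needs to know its own mapping, which is why the lookup is written as three independent steps.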
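12. Service data description & parameter flow
Example data: streaming avi
• portal: "The cache has to be fast, and with multi-user access in mind, so does the network.."
-passes down to repository: sequential, multi read, large!
• repository: "My cache is big, so sequential multi-read does not need much I/O; I can just put it wherever the capacity is large."
-passes down to container: large!!
• container: "I only need to care about size.. so a cheap HDD it is.."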
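The parameter flow on this slide can be sketched in a few lines. This is an illustration only, not G-Cube's interface: the `DataDescription` class, the three `*_plan` functions, and the returned hint values are hypothetical names chosen for the example. The idea it tries to show is that each layer consumes the part of the description it cares about and passes a narrower hint downward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataDescription:
    """Hypothetical service-level description of an object."""
    access: str        # "sequential" or "random"
    multi_read: bool   # many concurrent readers expected
    large: bool        # large object


def portal_plan(desc: DataDescription) -> dict:
    # Portal (cluster level): wants a fast cache and, for multi-user
    # access, takes the network into account as well.
    return {"fast_cache": True, "network_aware": desc.multi_read}


def repository_plan(desc: DataDescription) -> dict:
    # Repository (node level): a big node cache absorbs sequential
    # multi-read traffic, so only the size hint is passed further down.
    cache_absorbs_io = desc.access == "sequential" and desc.multi_read
    return {"cache_absorbs_io": cache_absorbs_io, "pass_down": {"large": desc.large}}


def container_plan(passed_down: dict) -> str:
    # Container (device level): sees only the size hint.
    return "cheap HDD" if passed_down["large"] else "any device"


streaming_avi = DataDescription(access="sequential", multi_read=True, large=True)
repo = repository_plan(streaming_avi)
print(portal_plan(streaming_avi))
print(repo)
print(container_plan(repo["pass_down"]))
```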
13. Operating Scenario - Terms
(Diagram labels: node boundary, container boundary, status)
Status is considered from the point of view of not only performance, such as responsiveness and throughput, but also functionality, such as reliability and continuity.
Container is a virtual storage device that can be represented as a one-dimensional array, for example:
- a single device (SSD, HDD, Ramdisk)
- a bunch of disks (e.g. disk array)
- hybrid storage (DRAM, SSD, HDD)
- and their networked storage
Tenant is an agent (module) that has one or more data I/O streams, for example:
- Application (mobile, PC)
- Server (WAS, DB) and its I/O agents (WAS cgi-bin module, DB storage engine, logging module)
- VM Hypervisor I/O module
- server-side I/O agent in NAS and SAN
- Filesystem server I/O agent
- …
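14. DATA IN/OUT Requirement (conventional approach)
(Diagram: a tenant (user, server) performs data in/out (rw) on four data containers, each with its own status.)
Criteria for mapping data to storage: fault-resilience & load-balance & size
Topology of the node containers as seen by the mapper := flat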
15. DATA IN/OUT Requirement (conventional approach)
(Diagram: a tenant (user, server) performs data in/out (rw) on four data containers, each with its own status; containers are e.g. HDD, Ramdisk, SSD.)
Topology of the cluster containers (e.g., among multiple aspects, read responsiveness)
mapping considerations: fault-resilience & load-balance & ?
(mapping takes no advantage of status per container even if it is known)
16. DATA IN/OUT Requirement (conventional approach)
(Diagram: a tenant (user, server) performs data in/out (rw) on four data containers, each with its own status.)
mapping considerations: fault-resilience & load-balance & ?
(mapping takes no advantage of status per container even if it is known)
Topology of the read responsiveness that tenants require
Outcomes shown: over-satisfied, unsatisfied, unsatisfied, over-satisfied, satisfied
17. DATA IN/OUT Requirement (G-Cube approach)
(Diagram: a tenant (user, server) performs data in/out (rw) on four data containers, each with its own status.)
mapping considerations: fault-resilience & load-balance & in/out requirement from tenants (schedule)
Topology of the read responsiveness that tenants require
Outcome shown: all satisfied
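The contrast between the conventional mapping and the requirement-aware mapping can be illustrated with a toy example. Everything below is an assumption made for illustration: the latency figures, tenant names, the 10x threshold for "over-satisfied", and the selection rule (pick the slowest container that still meets the requirement) are not taken from the slides, and fault-resilience and load-balance are left out to keep the sketch short.

```python
from itertools import cycle

# Hypothetical read latency per container, in milliseconds.
CONTAINERS = {"ramdisk": 0.05, "ssd": 0.5, "hdd": 8.0}

# Hypothetical tenants and the read latency each one asks for (ms).
REQUIREMENTS = {"db": 0.1, "web": 1.0, "archive": 10.0, "log": 10.0}


def classify(required_ms: float, actual_ms: float) -> str:
    if actual_ms > required_ms:
        return "unsatisfied"
    # Assumption: more than 10x faster than needed counts as over-satisfied.
    if actual_ms * 10 < required_ms:
        return "over-satisfied"
    return "satisfied"


def flat_mapping(requirements: dict) -> dict:
    # Conventional approach: ignore container status, spread round-robin.
    names = cycle(sorted(CONTAINERS))
    return {tenant: next(names) for tenant in requirements}


def requirement_aware_mapping(requirements: dict) -> dict:
    # Pick the slowest (presumably cheapest) container that still meets
    # the tenant's requirement. Raises ValueError if none qualifies.
    mapping = {}
    for tenant, required_ms in requirements.items():
        candidates = [c for c, ms in CONTAINERS.items() if ms <= required_ms]
        mapping[tenant] = max(candidates, key=CONTAINERS.get)
    return mapping


def report(mapping: dict) -> dict:
    return {t: classify(REQUIREMENTS[t], CONTAINERS[c]) for t, c in mapping.items()}


print(report(flat_mapping(REQUIREMENTS)))
print(report(requirement_aware_mapping(REQUIREMENTS)))
```

With these made-up numbers the flat mapping produces a mix of unsatisfied and over-satisfied tenants, while the requirement-aware mapping satisfies every tenant without placing cold data on the fastest media.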
18. G-Cube OpenStack Interfaces
• portal
-Local, block: block interfaces
-Local, others: POSIX file interfaces, RESTful API, key-value, Swift…
-Local, openstack: dynamic field interfaces (like stub), ioctl (network)
-Remote, block: block interfaces
-Remote, others: POSIX file interfaces, RESTful API, key-value, Swift…
-Remote, openstack: dynamic field interfaces (like stub), ioctl (network)
• repository
-Local, block: block interfaces
-Local, openstack: dynamic field interfaces (like stub), ioctl
-Remote, block: block interfaces (SAN)
-Remote, openstack: dynamic field interfaces (like stub), ioctl
• container
-Local, block: block device file (bio interface)
-Local, openstack: dynamic field interfaces (like stub), ioctl
-Remote, block: block device file (SAN)
-Remote, openstack: dynamic field interfaces (like stub), ioctl
• device
-Local, block: block device file (bio interface)
-Local, osd: ioctl
-Remote, block: N/A
-Remote, osd: N/A
19. G-Cube OpenStack Data Description
• Data description at the service level
-Sequentiality vs Randomness in IO system
For service-level data, sequential and random access describe not only I/O but also processing.
Sequential: streaming, image (single), contents body; kinds of data that are served by reading the whole item from the beginning.
Random: VOD (cases where the user moves the tracker in the player's menu by clicking with a mouse or similar, as on YouTube), image set
-Concurrency: in addition to sequential access to a single data item, the access pattern for the same key or type also needs to be defined.
e.g., Prefetching.
-BLOB: when the data access pattern has no distinctive character or is hard to identify (e.g., a virtual machine guest OS image), treat the data as a large binary object and handle it according to the requirements.
-No archive data: at the service level there is almost no archive data.
Archiving is handled by a separate process, such as backup.
20. G-Cube OpenStack Data Description
• G-Cube OpenStack interface requirement: when an object type is defined, it must also be possible to describe the data of that type.
-e.g. contents title, writer, image thumbnail, etc.
• Classification: for common data types, detailed parameters are defined in advance so that they can be used through a simplified interface.
• At the service operation level, this broadly comes down to definitions of sequential / random I/O & processing for on-line data, plus definitions for off-line data.
-Finer details have to be handled as characteristics of each individual service and cannot be generalized.
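One way to picture "predefined parameters behind a simplified interface" is a small preset table that a service overrides only where needed. This is a sketch under assumptions: `TypeDescription`, `PRESETS`, `describe`, the preset names, and the field names are all hypothetical and are not part of any G-Cube or OpenStack API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TypeDescription:
    """Hypothetical description attached to an object type."""
    access: str = "blob"            # "sequential", "random", or "blob"
    online: bool = True             # on-line vs off-line data
    metadata_fields: tuple = ()     # e.g. contents title, writer, thumbnail


# Predefined parameter sets for common data types (illustrative values only).
PRESETS = {
    "streaming": TypeDescription(access="sequential"),
    "vod": TypeDescription(access="random"),
    "vm_image": TypeDescription(access="blob"),
    "backup": TypeDescription(access="sequential", online=False),
}


def describe(preset: str, **overrides) -> TypeDescription:
    # Start from a preset and override only the service-specific details.
    return replace(PRESETS[preset], **overrides)


article_body = describe("streaming", metadata_fields=("title", "writer", "thumbnail"))
print(article_body)
```

Service-specific details that cannot be generalized would go into the overrides rather than into the presets.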
21. G-Cube OpenStack Data Description
• Considerations
-Descriptions are needed both for data (object) and for data relation (object type).
• Application (web?) service data attributes dimensions
-process isolation in single data (or related data) : concerns the portal layer.
-number of concurrent access : portal, repository, (container)
-I/O randomness : portal, repository, container
-size of data (or related data) : portal, repository, container
• For application service data, the read/write pattern that is often discussed for storage is unnecessary.
-Nearly all data served on-line is write once, read many.
-Write-intensive data can be classified as specific workload types (logging, transactional DB).
22. G-Cube OpenStack Device Features
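• Basic requirements
-In the initial stage, storage device and node information is shared explicitly.
• Online parameter extraction
-Measuring the performance of storage devices online while they are in service is generally difficult (it raises risk and endurance problems).
-Offline parameter extraction approach: extract the modeling parameters of a storage device offline and store the extracted values by model name. This is feasible because each vendor has only a limited number of storage device models.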
• Storage device features
-Disk model name
-Type (SSD, Enterprise HDD, …)
-Interfaces & protocol (SATA, SAS, PCI-E NVMe)
-Performance factors
Sequential/Random Read/Write, Mixed (70:30).
Latency/Throughput
-Reliability factors
N-device fault tolerant. abstraction of redundancy level.
SMART features.
-Capability
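The offline extraction approach amounts to a lookup table keyed by model name. The sketch below assumes a hypothetical `DeviceFeatures` record and `FEATURE_TABLE`; the model names and every number in it are placeholders, not measurements of real devices.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DeviceFeatures:
    model_name: str
    device_type: str          # e.g. "SSD", "Enterprise HDD"
    interface: str            # e.g. "SATA", "SAS", "PCI-E NVMe"
    seq_read_mbps: float      # performance factor (placeholder number)
    rand_read_iops: float     # performance factor (placeholder number)
    fault_tolerance: int      # N-device fault tolerance


# Filled in offline, one entry per device model; all values are made up.
FEATURE_TABLE: Dict[str, DeviceFeatures] = {
    "EXAMPLE-SSD-1": DeviceFeatures("EXAMPLE-SSD-1", "SSD", "SATA", 500.0, 90000.0, 0),
    "EXAMPLE-HDD-1": DeviceFeatures("EXAMPLE-HDD-1", "Enterprise HDD", "SAS", 200.0, 300.0, 0),
}


def lookup(model_name: str) -> Optional[DeviceFeatures]:
    # No online benchmarking: an unknown model simply yields None,
    # signalling that its parameters still have to be extracted offline.
    return FEATURE_TABLE.get(model_name)


print(lookup("EXAMPLE-SSD-1"))
print(lookup("UNKNOWN-MODEL"))
```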
23. G-Cube OpenStack Functionality
• Data Functionality (not listed in priority order)
-Replication.
-High availability.
-Cache & tiering.
-Compression.
-Security.
-…
• Data functionality is built as plug-ins so that it can be applied according to the data requirements.
-e.g., after data set A or a volume is configured, compression/replication is enabled for that set or volume.
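A plug-in arrangement of this kind could look roughly like the following. It is a minimal sketch, assuming that a plug-in can be modeled as a bytes-to-bytes function on the write path; `PLUGINS`, `VOLUME_CONFIG`, `write_path`, and the volume names are hypothetical. Only compression is shown (using Python's standard `zlib`), since replication would need more than a byte transform.

```python
import zlib
from typing import Callable, Dict, List

# Assumption: a plug-in is a function from bytes to bytes on the write path.
Plugin = Callable[[bytes], bytes]

PLUGINS: Dict[str, Plugin] = {
    "compression": zlib.compress,
}

# Which plug-ins are enabled for which data set or volume (hypothetical names).
VOLUME_CONFIG: Dict[str, List[str]] = {
    "data_set_A": ["compression"],
    "volume_B": [],
}


def write_path(volume: str, payload: bytes) -> bytes:
    # Apply the plug-ins configured for this volume, in order.
    for name in VOLUME_CONFIG[volume]:
        payload = PLUGINS[name](payload)
    return payload


raw = b"abc" * 1000
stored = write_path("data_set_A", raw)
print(len(raw), len(stored))
```

Data written to `volume_B` passes through unchanged, because no plug-in is configured for it.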
24. G-Cube OpenStack Scalability
• Online Migration.
-Design and development that account for devices being added or failing and nodes being added or failing.
-Scale-up and scale-out are both considered.
• Scale-out
-Given network bandwidth limits, the most important property for scaling out OpenStack solutions is having no central metadata node: addressing can be done deterministically from the key (hashing) at the client level.
-Taking the multi-dimensional feature set into account, design and implement data mapping algorithms that build scale-out storage without a metadata node.
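Deterministic, client-side addressing without a metadata node can be illustrated with rendezvous (highest-random-weight) hashing. This technique is chosen here only as a well-known example; the slides do not say which hashing scheme G-Cube uses, and the sketch ignores the multi-dimensional feature set. The node and object names are hypothetical.

```python
import hashlib
from typing import List


def _score(key: str, node: str) -> int:
    # Deterministic pseudo-random weight for a (node, key) pair.
    digest = hashlib.sha256(f"{node}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def locate(key: str, nodes: List[str]) -> str:
    # Every client computes the same answer from the key and the node
    # list alone, so no central metadata lookup is needed.
    return max(nodes, key=lambda node: _score(key, node))


nodes = ["node-a", "node-b", "node-c"]
keys = [f"object-{i}" for i in range(1000)]

before = {k: locate(k, nodes) for k in keys}
after = {k: locate(k, nodes + ["node-d"]) for k in keys}
moved = [k for k in keys if before[k] != after[k]]

# Keys that change location after a node is added all move to the new node.
print(len(moved), all(after[k] == "node-d" for k in moved))
```

A useful property for online migration is that adding a node relocates only the keys that now belong to that node; everything else stays where it was.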