FAQ - Data Access for IRIS DB
• Data Interfaces
  - Standard SQL interface and JDBC interface
  - Custom loading/exporting via the CLI (command-line interface)
• Data Format
  - For low traffic, a simple JDBC connection suffices.
  - For high-volume traffic, the bulk loader can process data stored in files that each cover 1 or 5 minutes of traffic.
  - There are no specific restrictions or configuration for loading frequency; the size of the data chunk for bulk loading is decided by the developer of the loader.
• Data Access Mechanism
  - A simple SQL interface is provided for short-running queries.
  - A CLI with a suite of loaders is provided for higher volumes of data.
  - Data export is done with CLI commands, which produce result files.
  - Standard FTP is used in both directions, for importing and exporting, with the help of the CLI loaders.
  - A simple Oracle/IRIS export tool is provided.
  - Export/import tools can be customized for other DBMSs on request.
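As an illustration of the chunking decision that is left to the loader developer, the following minimal sketch groups timestamped records into 1-minute or 5-minute windows, where each window would become one file handed to a bulk loader. The function names (`bucket_start`, `group_into_chunks`) and the record layout are hypothetical and are not part of the IRIS tooling; the sketch assumes the window length divides 60.

```python
from collections import defaultdict
from datetime import datetime


def bucket_start(ts: datetime, minutes: int) -> datetime:
    """Floor a timestamp to the start of its N-minute window (N must divide 60)."""
    floored_minute = ts.minute - ts.minute % minutes
    return ts.replace(minute=floored_minute, second=0, microsecond=0)


def group_into_chunks(records, minutes=5):
    """Group (timestamp, payload) pairs by window; one group = one load file."""
    chunks = defaultdict(list)
    for ts, payload in records:
        chunks[bucket_start(ts, minutes)].append(payload)
    return dict(chunks)


# Hypothetical sample records: (event time, payload)
records = [
    (datetime(2015, 1, 6, 10, 1, 30), "a"),
    (datetime(2015, 1, 6, 10, 4, 59), "b"),
    (datetime(2015, 1, 6, 10, 5, 0), "c"),
]
chunks = group_into_chunks(records, minutes=5)
for window, payloads in sorted(chunks.items()):
    print(window.isoformat(), payloads)
```

Smaller windows give fresher data at the cost of more, smaller files; the FAQ leaves that trade-off to the loader developer.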
FAQ - Performance Considerations for IRIS DB
• Performance Limitations
  - Maximum performance achieved to date (SK Telecom deployment):
    - Number of nodes: 35 (including active/standby masters)
    - Each node: 256 GB RAM, 36 TB of disk, 12 CPU cores, 2 x 1 Gbps network
    - 1 x 10G switch, 3 racks
    - Daily traffic: 80 billion records per day
  - Performance depends on the number of nodes.
    - Per node: inserting 8.6 billion records (or 2.5 TB) per day (at 100% CPU utilization)
• Sizing Factors
  - Duplication level (usually 2)
  - Incoming traffic volume (affects CPU and network capacity)
  - Which kinds of real-time computation are required
    - How many summary operations per minute / per 5 minutes
  - How long the data should be retained in the system (90 days, 6 months, etc.)
    - By default, all data is compressed (ratio 50%~70%, doubling the available space).
  - Types of incoming queries
    - Time-ranged selections, complex joins, and filtering queries
• Typical Setup for PoC
  - Depends on the PoC requirements and the hardware set aside for the PoC
  - A typical configuration would be:
    - 5 nodes (1 master, 4 data)
    - 4 data nodes, each with 12 CPU cores, 64 GB RAM, 2 x 2 TB HDD, 1 Gbps network
    - All packaged in a 4U chassis
    - 1 external Gbps switch with a 10G uplink
    - Accepts 2 billion records per day (roughly 640 GB per day)
  - An on-premise cluster packaged in a 4U chassis is provided for PoC purposes.
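The sizing factors above can be turned into a rough back-of-the-envelope estimate. The sketch below is only that: the per-node ingest figures are taken from this FAQ, while the CPU headroom (`target_cpu`), the reading of compression as "stored size is about half of raw size" (`compressed_fraction`), and the function names are assumptions made for illustration. It also ignores any extra write load caused by duplication.

```python
import math

# Figures quoted in this FAQ (per node, at 100% CPU utilization)
RECORDS_PER_NODE_PER_DAY = 8.6e9
BYTES_PER_NODE_PER_DAY = 2.5e12  # 2.5 TB


def nodes_for_ingest(records_per_day, target_cpu=0.5):
    """Data nodes needed to ingest the daily volume at an assumed CPU headroom."""
    return math.ceil(records_per_day / (RECORDS_PER_NODE_PER_DAY * target_cpu))


def storage_tb(raw_tb_per_day, retention_days, duplication=2, compressed_fraction=0.5):
    """Cluster-wide disk needed: raw volume x retention x copies x compression."""
    return raw_tb_per_day * retention_days * duplication * compressed_fraction


# Average record size implied by the per-node figure (roughly 290 bytes)
avg_record_bytes = BYTES_PER_NODE_PER_DAY / RECORDS_PER_NODE_PER_DAY

# Example: the PoC load of 2 billion records (about 0.64 TB) per day, kept 90 days
print(nodes_for_ingest(2e9))
print(storage_tb(0.64, retention_days=90))
print(round(avg_record_bytes))
```

Real sizing also depends on the query mix and the summary operations listed above, which this arithmetic does not model.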
FAQ - Data Management in IRIS DB
• Data Duplication & Recovery
  - Users may choose the duplication level, normally 2 (the Hadoop default is 3).
    - The copies of each piece of data are stored on different nodes.
  - A disk-level failure (including a server-level failure) does not cause service downtime, as long as the failure stays within the duplication tolerance.
  - A disk/server failure is recovered by replacing the hardware and then running additional data recovery commands.
    - Data recovery is a manual operation, not automatic.
    - The necessary command-line toolset is provided.
  - Recovery time depends on the amount of data affected.
    - Network bandwidth is the deciding factor for recovery time: internal bandwidth within the cluster is 1 Gbps.
  - Missing data (lost through disk/hardware failure) is identified using the location management table.
    - Location data is stored on the master node.
    - Location data can be rebuilt from scratch from the data nodes, even if the master node has failed completely, although this is time-consuming.
    - The master node is protected by active/standby duplication.
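Because recovery time is bounded by network bandwidth, a lower bound can be estimated from the amount of data to re-copy and the link speed. The sketch below assumes a single link used at full rate (the `efficiency` parameter is a hypothetical knob for protocol overhead); it is not a description of how the IRIS recovery commands actually schedule copies.

```python
def recovery_hours(data_tb, link_gbps=1.0, efficiency=1.0):
    """Lower-bound time to copy `data_tb` terabytes over one link, in hours."""
    bytes_total = data_tb * 1e12
    bytes_per_second = link_gbps * 1e9 / 8 * efficiency
    return bytes_total / bytes_per_second / 3600


# 1 TB over the 1 Gbps internal network: 8000 seconds, a little over 2 hours
print(recovery_hours(1.0))
# The same data over two bonded 1 Gbps lines
print(recovery_hours(1.0, link_gbps=2.0))
```

Under these assumptions, every terabyte lost costs at least about 2.2 hours of copy time at 1 Gbps, which is why the amount of affected data dominates recovery time.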
FAQ - Data Model of IRIS DB
• Data Model
  - IRIS DB is a distributed, sharded/partitioned, shared-nothing database.
  - By default, every table is treated as a potentially big table.
    - A big table is partitioned across the distributed nodes.
    - JOINs between big tables are not allowed.
    - For JOIN-like operations over big tables, a special interface is provided with open-source adaptors for Hadoop and Spark, which enables all kinds of Map/Reduce jobs.
  - A special-purpose ‘global table’ must be designated as such by the user at creation time.
    - Global tables are small tables duplicated on all data nodes within the cluster.
    - A JOIN is allowed between a big table and a global table.
    - Global tables usually contain lookup data, such as configuration.
  - Users are required to specify partition keys.
  - Apart from partitioning, IRIS DB is mostly the same as a traditional database.
    - Users create tables with conventional CREATE SQL commands,
    - with a ‘HINT’ that specifies the partitioning key.
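The reason a big table can be joined with a global table but not with another big table can be shown with a small simulation. The sketch below is conceptual only: it assumes hash-based placement by partition key (the FAQ does not say how IRIS DB assigns partitions to nodes), and the table contents, field names, and helper functions are all hypothetical.

```python
import zlib

NUM_DATA_NODES = 4  # hypothetical cluster size


def node_for(partition_key, num_nodes=NUM_DATA_NODES):
    """Map a partition key to a data node (assumed hash placement)."""
    return zlib.crc32(partition_key.encode("utf-8")) % num_nodes


def distribute(rows, key_field, num_nodes=NUM_DATA_NODES):
    """Split a big table's rows across nodes by partition key."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row[key_field], num_nodes)].append(row)
    return nodes


def local_join(node_rows, global_table, fk):
    """Join one node's rows with its local copy of the global table."""
    return [{**row, **global_table[row[fk]]} for row in node_rows]


# Big table: partitioned by subscriber
calls = [
    {"subscriber": "s1", "cell_id": "c1", "bytes": 100},
    {"subscriber": "s2", "cell_id": "c2", "bytes": 250},
    {"subscriber": "s3", "cell_id": "c1", "bytes": 75},
]
# Global table: small lookup data, copied to every data node
cells = {"c1": {"region": "Seoul"}, "c2": {"region": "Busan"}}

nodes = distribute(calls, "subscriber")
joined = [r for node_rows in nodes for r in local_join(node_rows, cells, "cell_id")]
print(joined)
```

Each node completes its part of the join without contacting any other node, which fits a shared-nothing design. Joining two big tables would instead require moving rows between nodes, which is the kind of work the FAQ delegates to the Hadoop and Spark adaptors.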
FAQ - Support for IRIS DB
• Documentation & Support
  - Documentation and support are provided by support engineers at Mobigen, Seoul.
  - Support and documentation are currently mostly in Korean.
    - English and other languages require additional work.
    - A specific business case must be identified before support is extended beyond Korean.
  - Documents are mostly for developers and system administrators.
    - For developers, a user guide on SQL and tool commands is provided.
    - For administrators, a user guide on system administration and data management is provided.
  - The support issues most recently raised by customers are:
    - Questions about SQL syntax for specific cases
    - Data management issues (data expiration and storage management)
    - Scale-out planning issues (how many new nodes are required for the expected traffic)
    - Problems related to H/W failures (disk failure)
  - H/W failures are handled by H/W support engineers from the OEM manufacturer.
  - S/W failures are handled by Mobigen engineers.
FAQ - Network Issues of IRIS DB
• Network Restrictions
  - Normally, a 10 Gbps uplink and a 1 Gbps intra-cluster network are used.
    - If intra-cluster connections need more than 1 Gbps, two 1 Gbps lines are bonded to provide 2 Gbps of bandwidth.
    - L2 switches are added as the number of data nodes grows.
  - For higher bandwidth, a ‘direct access’ mode is supported.
    - In direct access mode, an external client may access not only the master nodes but also the data nodes directly.
    - Direct access is controlled by the CLI command or the client JDBC library.
  - For direct access mode, the data nodes must be visible outside the cluster.
    - This requires their IP addresses to be reachable from outside.
  - From the external view, the whole cluster is regarded as a single machine, under the control of the JDBC library or CLI commands.
    - The JDBC client library and the CLI commands access the (active) master node first, and access the data nodes if necessary (in direct mode).
    - If the active master is shut down, the standby node is activated; from the external viewpoint, the downtime is observed as a session close. A reconnection then connects to the secondary master. This retry is performed by the library.
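The failover behaviour described above (session close, then reconnection to the standby master) can be sketched as a generic retry loop. This is not the IRIS JDBC library's implementation; `run_with_reconnect`, `SessionClosed`, the host names, and the simulated `fake_connect` are hypothetical and only illustrate the pattern a client library might follow.

```python
import time


class SessionClosed(Exception):
    """Raised when the server closes the session, e.g. during master failover."""


def run_with_reconnect(connect, operation, masters, max_attempts=3, delay_s=0.0):
    """Try each master in turn until the operation succeeds or attempts run out."""
    last_error = None
    for attempt in range(max_attempts):
        host = masters[attempt % len(masters)]
        try:
            session = connect(host)
            return operation(session)
        except (SessionClosed, ConnectionError) as exc:
            last_error = exc
            time.sleep(delay_s)  # brief pause before trying the next master
    raise last_error


# Simulated cluster: the active master is down, the standby has taken over.
def fake_connect(host):
    if host == "master-active":
        raise ConnectionError("active master is down")
    return {"host": host}


result = run_with_reconnect(
    fake_connect,
    lambda session: session["host"],
    masters=["master-active", "master-standby"],
)
print(result)
```

From the caller's point of view the cluster still looks like a single machine: the retry is hidden inside the helper, much as the FAQ describes the library hiding it from the application.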
