Engineering that goes into making Percona Server for MySQL 5.6 & 5.7 different (and a hint of MongoDB) - Colin Charles
Engineering that goes into making Percona Server for MySQL 5.6 & 5.7 different (and a hint of MongoDB), for dbtechshowcase 2017 - the slides also have some Japanese in them, which should help a Japanese audience read them. If there are questions due to poor translation, do not hesitate to drop me an email (byte@bytebot.net) or tweet: @bytebot
Wakame-vnet / Open Source Project for Virtual Network & SDN - axsh co., LTD.
Wakame-vnet is a toolkit for virtual networking based on the Edge Networking Architecture. Users can freely design their own L2/L3 networks on top of the physical network using Wakame-vnet.
1) The document explores a new concept called error-permissive computing, which improves computing capability and reduces power consumption by allowing and managing hardware errors in system software rather than eliminating them with general-purpose hardware error correction.
2) It describes several approaches for implementing error permissive computing including a software framework called BITFLEX that enables approximate computing, an FPGA-based memory emulator for evaluating new system software mechanisms, and techniques for sparse and topology-aware communication that can accelerate large-scale deep learning and reduce communication costs.
3) The goal is to take a holistic approach across hardware and software layers to perform lightweight error correction at the software level while eliminating general purpose error correction in hardware for improved efficiency.
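The core idea of point 3 can be illustrated with a small sketch (this is not BITFLEX's actual API, which the document does not detail): inject random bit flips into the low-order mantissa bits of floating-point data, emulating unreliable memory, and observe that an error-tolerant kernel such as averaging still produces a usable result.

```python
import random
import struct

def flip_bit(x: float, bit: int) -> float:
    """Flip one bit of a double's IEEE-754 representation."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    (y,) = struct.unpack("<d", struct.pack("<Q", bits ^ (1 << bit)))
    return y

def noisy_mean(xs, flip_prob=0.01, seed=42):
    """Average xs while randomly flipping low mantissa bits,
    emulating error-permissive (unreliable) memory."""
    rng = random.Random(seed)
    total = 0.0
    for x in xs:
        if rng.random() < flip_prob:
            # Restrict flips to the 32 low-order mantissa bits.
            x = flip_bit(x, rng.randrange(0, 32))
        total += x
    return total / len(xs)
```

Flips confined to low mantissa bits perturb the mean only slightly; the same flips in exponent or sign bits would be catastrophic, which is why the system software, not blanket hardware ECC, must decide which data may live in unreliable memory.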
Opportunities of ML-based data analytics in ABCI - Ryousei Takano
This document discusses opportunities for using machine learning-based data analytics on the ABCI supercomputer system. It summarizes:
1) An introduction to the ABCI system and how it is being used for AI research.
2) How sensor data from the ABCI system and job logs could be analyzed using machine learning to optimize data center operation and improve resource utilization and scheduling.
3) Two potential use cases - using workload prediction to enable more efficient cooling system control, and applying machine learning to better predict job execution times to improve scheduling.
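The first use case above can be sketched in a few lines (the function names, policy, and constants are hypothetical illustrations, not from the document): predict near-future load from recent sensor history, then derive a cooling setpoint from the prediction.

```python
def forecast_load(history, window=6):
    """Naive moving-average forecast of near-future rack power (kW).
    A real system would train an ML model on sensor data and job logs."""
    recent = list(history)[-window:]
    return sum(recent) / len(recent)

def cooling_setpoint(predicted_kw, base_c=27.0, kw_per_degree=50.0):
    """Hypothetical policy: lower the supply-air temperature as the
    predicted load rises. Constants are made up for illustration."""
    return base_c - predicted_kw / kw_per_degree
```

Even this crude forecaster shows the control loop's shape: prediction lets the cooling system react before the load arrives instead of after temperatures rise.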
ABCI: An Open Innovation Platform for Advancing AI Research and Deployment - Ryousei Takano
AI Infrastructure for Everyone (democratizing AI) aims to build an AI infrastructure platform accessible to everyone from beginners to experts. The platform provides up to 512-node computing resources, ready-to-use software, datasets, and pre-trained models. It also offers services such as an easy-to-use web-based IDE for beginners and an AI cloud with on-demand, reserved, and batch processing options. The goal is to accelerate AI research and promote the social implementation of AI technologies.
The document discusses the performance of three SPEC CPU2006 benchmarks - 483.xalancbmk, 462.libquantum, and 471.omnetpp - under different last-level cache (LLC) configurations and under LLC interference from a background workload. Key findings: the benchmarks lose performance when run with a smaller LLC or alongside an LLC jammer workload, but maintain performance when QoS techniques are applied to isolate them in the LLC.
The document summarizes four presentations from the USENIX NSDI 2016 conference session on resource sharing:
1. "Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics" proposes a framework that uses results from small training jobs to efficiently predict performance of data analytics workloads in cloud environments and reduce the number of required training jobs.
2. "Cliffhanger: Scaling Performance Cliffs in Web Memory Caches" presents algorithms to dynamically allocate memory across queues in Memcached to smooth out performance cliffs and potentially save memory usage.
3. "FairRide: Near-Optimal, Fair Cache Sharing" introduces a caching policy that provides isolation guarantees, prevents strategic behavior, and
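Ernest's core idea (the first paper) can be sketched as fitting a small linear model over interpretable scaling features from cheap small-scale runs, then extrapolating to the full job. The feature terms below follow the paper's serial/log/linear cost structure, but the numbers are synthetic, and plain least squares stands in for the NNLS solver Ernest actually uses.

```python
import numpy as np

def features(scale, machines):
    # Ernest-style feature vector: fixed overhead, serial work per machine,
    # log-cost aggregation (e.g., tree reduction), per-machine coordination.
    return np.array([1.0, scale / machines, np.log2(machines), float(machines)])

# Pretend these came from cheap "small training jobs": a few data scales
# run on a few machine counts (values are synthetic, not from the paper).
true_theta = np.array([2.0, 5.0, 0.5, 0.01])   # hidden ground truth
runs = [(s, m) for s in (0.05, 0.1, 0.2) for m in (2, 4, 8)]
X = np.array([features(s, m) for s, m in runs])
y = X @ true_theta                              # observed run times

# Fit the model; Ernest uses non-negative least squares, but ordinary
# least squares suffices for this noiseless sketch.
theta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Extrapolate: full data scale on 64 machines.
predicted = features(1.0, 64) @ theta
```

The point is that a handful of cheap runs pins down the few coefficients, so the expensive full-scale configuration never has to be benchmarked directly.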
This document discusses optimizations for TCP/IP networking performance on multicore systems. It describes several inefficiencies in the Linux kernel TCP/IP stack related to shared resources between cores, broken data locality, and per-packet processing overhead. It then introduces mTCP, a user-level TCP/IP stack that addresses these issues through a thread model with pairwise threading, batch packet processing from I/O to applications, and a BSD-like socket API. mTCP achieves a 2.35x performance improvement over the kernel TCP/IP stack on a web server workload.
Flow-centric Computing - A Datacenter Architecture in the Post-Moore Era - Ryousei Takano
1) The document proposes a new "flow-centric computing" data center architecture for the post-Moore era that focuses on data flows.
2) It involves disaggregating server components and reassembling them as "slices" consisting of task-specific processors and storage connected by an optical network to efficiently process data.
3) The authors expect optical networks to enable high-speed communication between processors, replacing general CPUs, and to potentially revolutionize how data is processed in future data centers.
A Look Inside Google's Data Center Networks - Ryousei Takano
1) Google has been developing their own data center network architectures using merchant silicon switches and centralized network control since 2005 to keep up with increasing bandwidth demands.
2) Their network designs have evolved from Firehose and Watchtower to the current Saturn and Jupiter networks, increasing port speeds from 1/10Gbps to 40/100Gbps and aggregate bandwidth from terabits to petabits per second.
3) Their network architectures employ Clos topologies with merchant silicon switches at the top-of-rack, aggregation, and spine layers and centralized control of traffic routing.
- Hardware such as DRAM and NAND flash are facing scaling challenges as density increases, which could impact performance and cost. New non-volatile memory (NVM) technologies may provide opportunities to address these challenges but require software and system architecture changes to realize their full potential. Key considerations include persistence, performance, and programming models.
AIST Super Green Cloud: lessons learned from the operation and the performanc... - Ryousei Takano
This document discusses lessons learned from operating the AIST Super Green Cloud (ASGC), a fully virtualized high-performance computing (HPC) cloud system. It summarizes key findings from the first six months of operation, including performance evaluations of SR-IOV virtualization and HPC applications. It also outlines conclusions and future work, such as improving data movement efficiency across hybrid cloud environments.
The document summarizes the author's participation report at the IEEE CloudCom 2014 conference. Some key points include:
- The author attended sessions on virtualization and HPC on cloud.
- Presentations had a strong academic focus and many presenters were Asian.
- Eight papers on HPC on cloud covered topics like reliability, energy efficiency, performance metrics, and applications like Monte Carlo simulations.
Exploring the Performance Impact of Virtualization on an HPC Cloud - Ryousei Takano
The document evaluates the performance impact of virtualization on high-performance computing (HPC) clouds. Experiments were conducted on the AIST Super Green Cloud, a 155-node HPC cluster. Benchmark results show that while PCI passthrough mitigates I/O overhead, virtualization still incurs performance penalties for MPI collectives as node counts increase. Application benchmarks demonstrate overhead is limited to around 5%. The study concludes HPC clouds are promising due to utilization improvements from virtualization, but further optimization of virtual machine placement and pass-through technologies could help reduce overhead.
From Rack-scale Computers to Warehouse-scale Computers - Ryousei Takano
This document discusses the transition from rack-scale computers to warehouse-scale computers through the disaggregation of technologies. It provides examples of rack-scale architectures like Open Compute Project and Intel Rack Scale Architecture. For warehouse-scale computers, it examines HP's The Machine project using application-specific cores, universal memory, and photonics fabric. It also outlines UC Berkeley's FireBox project utilizing 1 terabit/sec optical fibers, many-core systems-on-chip, and non-volatile memory modules connected via high-radix photonic switches.
A High-Performance and Scale-Out HPC Cloud: AIST Super Green Cloud - Ryousei Takano
The document contains configuration instructions for creating a cluster, called myCluster, in a cloud computing environment. It specifies creating a frontend node and 16 compute nodes using given templates and compute and disk offerings, and defines the cluster name, zone, network, and SSH key to use. The cluster can then be started, and later destroyed together with its configuration file.
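A minimal sketch of what such a cluster definition might look like; the field names, templates, and offerings below are illustrative guesses, not the tool's actual schema.

```python
# Hypothetical cluster definition mirroring the summary above.
cluster = {
    "name": "myCluster",
    "zone": "zone1",
    "network": "cluster-net",
    "ssh_key": "my-key",
    "frontend": {
        "template": "frontend-template",
        "compute_offering": "frontend.small",
        "disk_offering": "disk.100g",
    },
    "compute": {
        "count": 16,                      # 16 compute nodes
        "template": "compute-template",
        "compute_offering": "compute.large",
        "disk_offering": "disk.40g",
    },
}

def total_nodes(c):
    """One frontend plus the requested compute nodes."""
    return 1 + c["compute"]["count"]
```

Keeping the whole cluster in one declarative document is what makes the start/destroy lifecycle described above a single operation.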
Iris: Inter-cloud Resource Integration System for Elastic Cloud Data Center - Ryousei Takano
The document describes Iris, an inter-cloud resource integration system that enables elastic cloud data centers. Iris uses nested virtualization technologies including nested KVM to construct a virtual infrastructure spanning multiple distributed data centers. It provides a new Hardware as a Service (HaaS) model for inter-cloud federation at the infrastructure provider level. The authors demonstrate Apache CloudStack can seamlessly manage resources across emulated inter-cloud environments using Iris.
9. Elastic Data Center
[Figure: two data centers connected by an inter-DC network. (1) Service degradation occurs at Data Center 2 (the cloud provider) due to excessive load; (2) its resource coordinator requests HaaS resources; (3) the resource manager at Data Center 1 (the physical resource provider) allocates compute resources; (4) a resource manager allocates network resources. IaaS users and the IaaS admin interact through the IaaS middleware; gateways (GW) join the DC networks (control plane and data plane), which carry VMs and a bare-metal (BM) node.]
12. Inter-cloud Models
• vIaaS (Virtual Infrastructure as a Service)
• HaaS (Hardware as a Service)
[Figure: (a) the vIaaS overlay model and (b) the HaaS extension model, showing requester and provider data centers, the IaaS federation layer, virtual infrastructure (VI = IaaS), and IaaS tenants.]
35. AIST's Proposed Use Case: Disaster Recovery via IaaS Migration
• When a disaster strikes, migrate the entire IaaS to a remote data center
  – IaaS: the set of VMs, including the management nodes
    • Managed by CloudStack, OpenStack, etc.
• Realized through a HaaS (Hardware as a Service) layer that abstracts the physical resources
[Figure: IaaS A (e.g., CloudStack) and IaaS B running on HaaS layers over physical machines + OFS at sites A and B, serving IaaS users A through D; migrating 1 GB x 10,000 VMs = 10 TB takes about 13.3 minutes over a 100G link.]
36. AIST's Efforts Toward Realizing FELIX
① Standardization of the OGF Network Services Interface (NSI)
② Inter-cloud resource management technology: development of the GridARS resource management system
③ Realization of the HaaS model: development of the Iris virtual infrastructure construction system
[Figure: ② inter-cloud resource management coordinating per-DC resource management and cloud resource management; ① NSI handling network resource management over dedicated lines; ③ the HaaS model applied at the data-center level.]