Building a Simple Hyper-Converged Infrastructure with Windows Server 2016 (Microsoft TechSummit 2016) by Takamasa Maejima
Session slides from Microsoft TechSummit 2016, held in November 2016, on hyper-converged infrastructure (HCI) built with the storage features (SDS) of Windows Server 2016.
[Event] Microsoft TechSummit 2016
[Date] November 1, 2016
[Session ID] CDP-002
[Session title] Building a Simple Hyper-Converged Infrastructure with Windows Server 2016
Distributed Tracing, from internal SAAS insights by Huy Do
This document discusses challenges with open source distributed tracing solutions and potential improvements. It outlines problems with storage costs and scalability, instrumentation libraries, and user interfaces when using tracing for multi-tenant systems. Alternative approaches are proposed, including using logs instead of traces to analyze performance, tail-based sampling methods, and log-first architectures that construct traces from buffered spans. The document suggests moving open source tracers to a firehose mode and developing a new trace client that samples traces based on features and errors to prioritize useful data while limiting storage costs.
Write on memory TSDB database (gocon tokyo autumn 2018) by Huy Do
The document discusses building a time series database (TSDB) for storing metrics data. It describes the need for fast writes and reads with efficient storage usage. The team evaluated existing solutions but found that they did not meet its performance or cost needs, so it built its own in-memory TSDB from several open source Go packages: Prometheus/tsdb for storage, valyala/gozstd for compression, hashicorp/raft for replication, and LINE/centraldogma for configuration management. The resulting storage can write over 1 million samples per second and store billions of samples on a single machine.
This document discusses garbage collection (GC) algorithms and their implementation. It provides high-level information about GC including that it collects garbage to prevent memory leaks, identifies live objects, and reclaims space from dead objects. It then discusses specific GC algorithms like mark-sweep GC in more detail and how hard their implementation can be, touching on marking phases, sweeping phases, and challenges like conservative vs precise GC. Example code is provided for a simple mark-sweep GC. Overall the document aims to explain why understanding GC implementation is important for performance and avoiding issues like application hangs.
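The mark and sweep phases mentioned above can be illustrated with a toy heap. This is a minimal sketch and not the code from the slides: the `Obj` and `Heap` classes, the explicit root list, and the object names are all assumptions made for illustration, and a real collector works on raw memory rather than Python objects.

```python
class Obj:
    """A heap object with outgoing references (hypothetical toy model)."""

    def __init__(self, name):
        self.name = name
        self.refs = []       # references to other Obj instances
        self.marked = False  # set during the mark phase


class Heap:
    def __init__(self):
        self.objects = []    # every allocated object, live or dead
        self.roots = []      # directly reachable objects (stack, globals)

    def alloc(self, name):
        obj = Obj(name)
        self.objects.append(obj)
        return obj

    def mark(self):
        # Mark phase: traverse everything reachable from the roots.
        stack = list(self.roots)
        while stack:
            obj = stack.pop()
            if obj.marked:
                continue
            obj.marked = True
            stack.extend(obj.refs)

    def sweep(self):
        # Sweep phase: keep marked objects, drop the rest, reset the marks.
        survivors = [o for o in self.objects if o.marked]
        freed = len(self.objects) - len(survivors)
        for o in survivors:
            o.marked = False
        self.objects = survivors
        return freed

    def collect(self):
        self.mark()
        return self.sweep()


heap = Heap()
a, b, c, d = (heap.alloc(n) for n in "abcd")
a.refs.append(b)      # a -> b, reachable from the root
c.refs.append(d)      # c <-> d form a cycle that nothing points to
d.refs.append(c)
heap.roots.append(a)
freed = heap.collect()
```

Because marking starts from the roots, the unreachable cycle between `c` and `d` is reclaimed, which a plain reference-counting scheme would not do.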
The document describes Lineflow, an internal tool developed by LINE corp to improve engineering efficiency. It aims to automate LINE's software development process through a command line interface. Key points:
- Lineflow was built with Golang to define commands as a domain-specific language that reflects LINE's development workflow. It tracks command history to suggest next steps.
- This helped save engineers time by automating repetitive manual tasks like copying code between branches. It is estimated to save 41 engineer-hours per day.
- Developing internal tools requires dedicated resources and evangelism to gain adoption, but it lays the foundation for further improvements and integration with other tools.
GOCON Autumn (Story of our own Monitoring Agent in golang) by Huy Do
This document discusses building a monitoring agent in Go and describes key components and design considerations. It proposes:
1. A modular agent architecture with pluggable inputs, codecs, and outputs to collect host metrics.
2. A push-based collection model where inputs directly retrieve metrics to reduce middleware impact.
3. A buffer to prevent data loss during output failures by maintaining write offsets.
4. Exposing agent state via HTTP for monitoring health, performance, and preventing issues like resend storms.
The agent is intended to be scalable, reliable, and manageable for deploying monitoring at scale. Go is chosen for its portability and quick prototyping abilities.
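One way to read point 3 is a buffer whose delivery offset advances only after the output confirms a batch, so a failed send is retried instead of dropped. The sketch below is written under that assumption; `OffsetBuffer`, `flush`, and the `send` callback are hypothetical names, it is in Python rather than Go for brevity, and it does not reflect the agent's actual design.

```python
class OffsetBuffer:
    """Append-only buffer that tracks how far delivery has progressed."""

    def __init__(self):
        self.records = []
        self.committed = 0  # index of the first record not yet delivered

    def append(self, record):
        self.records.append(record)

    def flush(self, send, batch_size=2):
        # Send pending records in batches; move the offset only on success.
        while self.committed < len(self.records):
            batch = self.records[self.committed:self.committed + batch_size]
            try:
                send(batch)
            except OSError:
                return False  # offset unchanged, so the batch is retried later
            self.committed += len(batch)
        return True


delivered = []
calls = {"n": 0}


def flaky_send(batch):
    # Hypothetical output that fails on its second call only.
    calls["n"] += 1
    if calls["n"] == 2:
        raise OSError("output unavailable")
    delivered.extend(batch)


buf = OffsetBuffer()
for i in range(5):
    buf.append(i)
first = buf.flush(flaky_send)   # stops at the failing batch
second = buf.flush(flaky_send)  # resumes from the saved offset
```

The second flush resumes from the saved offset, so every record is delivered exactly once in this scenario. A production buffer would also need persistence and a size bound, which this sketch omits.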
This document discusses writing a custom byte serializer in Go. It begins by outlining some common serialization libraries and their drawbacks. The document then describes designing a custom serializer with a fixed struct layout that directly encodes fields without reflection, for better performance. An initial implementation performs well but is improved further by rewriting it to reuse a single global buffer and eliminate allocations. Benchmarking shows the optimized custom serializer can achieve speeds comparable to the fastest libraries. Overall lessons learned include avoiding hidden allocations, handling serialization failures robustly, and adding features like versioning and checksums for reliability.
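The fixed-layout idea, together with the versioning and checksum features mentioned above, can be sketched as follows. The record layout (a version byte, a 64-bit timestamp, a 64-bit float, and a trailing CRC32) and the `encode`/`decode` names are assumptions for illustration, and Python's `struct` module stands in for the hand-written Go encoder the talk describes.

```python
import struct
import zlib

VERSION = 1
_BODY = struct.Struct("<BQd")  # version (uint8), timestamp (uint64), value (float64)
_CRC = struct.Struct("<I")     # CRC32 of the body (uint32)


def encode(timestamp, value):
    # Fields are written at fixed positions; no reflection or schema lookup.
    body = _BODY.pack(VERSION, timestamp, value)
    return body + _CRC.pack(zlib.crc32(body))


def decode(data):
    # Reject malformed input explicitly rather than returning garbage.
    if len(data) != _BODY.size + _CRC.size:
        raise ValueError("unexpected length")
    body, tail = data[:_BODY.size], data[_BODY.size:]
    (crc,) = _CRC.unpack(tail)
    if zlib.crc32(body) != crc:
        raise ValueError("checksum mismatch")
    version, timestamp, value = _BODY.unpack(body)
    if version != VERSION:
        raise ValueError("unsupported version")
    return timestamp, value


payload = encode(1700000000, 0.5)
roundtrip = decode(payload)
```

The leading version byte lets a later layout change be detected at decode time, and the checksum catches corrupted payloads before any field is trusted.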
This document discusses dependency injection in Ruby. A dependency exists when one part of the code uses or relies on another part, and dependency injection supplies that dependency from outside. In Ruby, common ways to wire in dependencies include passing them as parameters to the constructor (the initialize method), including modules, and inheritance. While these approaches work, they can also introduce disadvantages such as tight coupling, unexpected behavior from included modules, and rigid inheritance hierarchies. Dependency injection frameworks provide an alternative approach but may not be needed for many Ruby applications. The key benefits of dependency injection are improved testability, loose coupling between classes, and more extensible code.
NoSQL databases should not be chosen just because a system is slow or to replace an RDBMS. The appropriate choice depends on factors like the nature of the data, how the data scales, and whether ACID properties are needed. NoSQL databases are categorized by data model (document, column family, graph, key-value store), which affects querying. Other considerations include scalability based on the CAP theorem and operational factors like the distribution model and whether there is a single point of failure. The best choice depends on the specific requirements, and choosing incorrectly risks data loss.
This document discusses how to build command line interface (CLI) applications in Ruby. It introduces the Thor gem for building CLI apps with single commands and then discusses building interactive shell apps using a read-eval-print loop (REPL) and the GNU Readline library. The document provides examples of CLI apps, why they are useful, and tips for making better CLI apps, and concludes by showcasing a Facebook command line application built with these techniques.