Weibo Architecture and Platform Security


Tim Y

Sina Weibo platform and security architecture

Editor's Notes

1 week
MPSS: a general principle for large systems. Run many small services on one server rather than one large service per server. Taken to the highest level, this is virtualization.
Table locking: simultaneous comments and follow requests are increasing.
Solutions: 1. Merge all tables (not optimal). 2. Secondary index.
Why asynchronous: publishing a post requires the database write, indexing, statistics, back-office processing, counters, and so on, so the full cycle is long.
todo: improve the approach, add pseudocode
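
The note on asynchronous publishing can be illustrated with a small queue-based sketch. This is only an assumption about the general shape, not the platform's actual code: an in-process `queue.Queue` stands in for whatever message queue is really used, and `publish`, `worker`, `run_demo`, and the step names in `STEPS` are hypothetical names taken from the steps listed in the note.

```python
import queue
import threading

# Hypothetical step names taken from the note: database write, indexing,
# statistics, back-office processing, counters.
STEPS = ["store", "index", "stats", "backoffice", "counters"]


def publish(post, jobs):
    """Enqueue the post and acknowledge right away; no slow work happens here."""
    jobs.put(post)
    return {"id": post["id"], "status": "accepted"}


def worker(jobs, log):
    """Drain the queue in the background, running every step for each post."""
    while True:
        post = jobs.get()
        if post is None:  # sentinel: shut the worker down
            jobs.task_done()
            return
        for step in STEPS:
            log.append((step, post["id"]))  # stand-in for the real work
        jobs.task_done()


def run_demo():
    """Publish one post and return the acknowledgement plus the work log."""
    jobs = queue.Queue()
    log = []
    t = threading.Thread(target=worker, args=(jobs, log))
    t.start()
    ack = publish({"id": 1, "text": "hello"}, jobs)
    jobs.put(None)  # ask the worker to stop once the queue is drained
    t.join()
    return ack, log
```

The point of the split is that the caller of `publish` waits only for the enqueue, while the long cycle of follow-up steps runs in the worker.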
