How to plan a hadoop cluster for testing and production environmentAnna Yen
Athemaster wants to share our experience to plan Hardware Spec, server initial and role deployment with new Hadoop Users. There are 2 testing environments and 3 production environments for case study.
Big Data Taiwan 2014 Track1-3: Big Data, Big Challenge — Splunk 幫你解決 Big Data...Etu Solution
講者:SYSTEX 數據加值應用發展部產品經理 | 陶靖霖
議題簡介:認清現實吧! Big Data 是個熱門詞彙、熱門議題,但是問題的核心仍然圍繞在資料處理的流程、架構與技術,要踏入 Big Data 的領域,使用者會遭遇哪些挑戰? Splunk 被譽為「全球最佳的 Big Data Company」,究竟在資料處理的流程中擁有什麼獨特的技術優勢,能夠幫助使用者克服這些挑戰?又有哪些成功幫助使用者從資料中萃取出價值的應用案例?歡迎來認識 Splunk 以及全球 Big Data 成功案例。
source: http://www.sfbayacm.org/?p=1394
The specifics of a cloud’s computing architecture may have an impact on application design. This is particularly important in Infrastructure as a Service (IaaS) cloud environments.
This presentation analyzes aspects of the Amazon EC2 IaaS cloud environment that differ from a traditional datacenter and introduces general best practices for ensuring data privacy, storage persistence, and reliable DBMS backup. Best practices for application robustness and scalability on demand are reviewed and are especially significant in leveraging the full potential of an IaaS cloud. The need for a cloud application management and configuration system is briefly reviewed and two alternate approaches to cloud application management are described (RightScale and Kaavo).
How to plan a hadoop cluster for testing and production environmentAnna Yen
Athemaster wants to share our experience to plan Hardware Spec, server initial and role deployment with new Hadoop Users. There are 2 testing environments and 3 production environments for case study.
Big Data Taiwan 2014 Track1-3: Big Data, Big Challenge — Splunk 幫你解決 Big Data...Etu Solution
講者:SYSTEX 數據加值應用發展部產品經理 | 陶靖霖
議題簡介:認清現實吧! Big Data 是個熱門詞彙、熱門議題,但是問題的核心仍然圍繞在資料處理的流程、架構與技術,要踏入 Big Data 的領域,使用者會遭遇哪些挑戰? Splunk 被譽為「全球最佳的 Big Data Company」,究竟在資料處理的流程中擁有什麼獨特的技術優勢,能夠幫助使用者克服這些挑戰?又有哪些成功幫助使用者從資料中萃取出價值的應用案例?歡迎來認識 Splunk 以及全球 Big Data 成功案例。
source: http://www.sfbayacm.org/?p=1394
The specifics of a cloud’s computing architecture may have an impact on application design. This is particularly important in Infrastructure as a Service (IaaS) cloud environments.
This presentation analyzes aspects of the Amazon EC2 IaaS cloud environment that differ from a traditional datacenter and introduces general best practices for ensuring data privacy, storage persistence, and reliable DBMS backup. Best practices for application robustness and scalability on demand are reviewed and are especially significant in leveraging the full potential of an IaaS cloud. The need for a cloud application management and configuration system is briefly reviewed and two alternate approaches to cloud application management are described (RightScale and Kaavo).
Brief research on Amazon S3 for my company.
Feel free to comment/feedback. Thanks!
Connect with me on LinkedIn : sg.linkedin.com/in/yulunteo/
Seems like there are still plenty of people viewing this presentation after so long.
Maybe i should consider doing a update for Cloudfront/Glacier as well..
View these slides if you're you new to cloud computing and would like to learn more about Amazon Web Services (AWS), if you intend to implement a project and would like to discover the basics of the AWS cloud or if you are a business looking to evaluate cloud computing.
In the webinar based on these slides, we answered the following questions:
• What is Cloud Computing with AWS and what benefits can it deliver?
• Who is using AWS and what are they using it for?
• How can I use AWS Services to run my workloads?
View the webinar recording on YouTube here: http://youtu.be/QROD20r6-sQ
Pegasus: Designing a Distributed Key Value System (Arch summit beijing-2016)涛 吴
This slide delivered by Zuoyan Qin, Chief engineer from XiaoMi Cloud Storage Team, was for talk at Arch summit Beijing-2016 regarding how Pegasus was designed.
特别对于一些 IO 密集型领域: DB / Streaming / 监控 / some Internet service
什么都很贵, SAN / NAS / SAS 硬盘
磁盘损坏率,不要相信 MTBF ,算出来经常是 AFR 1% 不到 实际是 3%-15%
SSD - 嵌入式系统 , 写入优化 Record – 是记录方式的,区别于大部分文件系统的 Shard-disk SAN- Redhat GFS 等 DFS - SMB is also known as Common Internet File System (CIFS) DFS-FT MS DFS / DFS-Parallel 没啥好说的,还有下面 DFS-Parallel-FT GoogleFilesystem / CloudStore / Lustre [Laster]
Namespace 的查找是一个消耗很大的操作 , NFS 文件句柄缓存 NFS 文件句柄缓存 Facebook have extended the Linux kernel to allow NFS file opens via inode number rather than filename to avoid the NetApp scaling issue. Namespace 可以扁平化