Baidu Cloud Computing


  • Capable of storing thousands of TB of data; high and sustainable aggregate IO bandwidth; hundreds of GB/s read performance; tens of GB/s write performance; uninterruptible service; built-in fault tolerance and high availability; automatic machine management; plug and play
  • Deque query 2 times faster than STL; list insert 10 times faster than STL; hash query 2.5 times faster than STL
  • Offline data statistics: Map/Reduce; offline data computing (high CPU): MPI; online/offline data condition query (seconds to minutes): Big Table; online/offline data aggregate query: OLAP; online relational data query (milliseconds to seconds): Distributed DB; online key-value query (milliseconds): Key-Value DB (MOLA)
  • Hot deploy; anti-attack, load balancing, dynamic app deployment, app config, resource constraints; system protection, resource accounting and limits; upload and download of big files
  • Under the concept, the company's search engine will not only provide query results, but can also carry out commands like launching an application or linking a user directly with an online service.

    1. Baidu Cloud Computing Practice, Fei Dong, 2010-11-12
    2. Where is Baidu
    3. Outline
       • Introduction
       • Cloud Products
       • Baidu App Engine
       • Box Computing
       • Challenges
    4. History of Baidu
       • Baidu was established in 2000 by Robin Li and Eric Xu.
       • Listed on NASDAQ in 2005; market cap > 30B; 4th largest internet company in the world
       • Vast majority of its revenue comes from online advertising: pay for performance (P4P)
       • Success from multimedia search: "MP3 Search"
    5. Baidu Status
       • #1 site in China, #6 in the world
       • Listed on NASDAQ; market cap > 30B; 4th largest internet company
       • Leading Chinese online search engine: holds 80% of the market in China
       • Mission: provide the best way for people to find information
       • Technology-driven company: 8000 employees, 3000 engineers
    6. Product
    7. Introduction
       • Technology Dept.: PS, NS, SYS, IBASE, ECOM, EB, CLIENT/MOB
       • Roles: PM, RD, QA, OP, FE
       • Groups: NLP, Spider, Distributed system, Security, etc.
       • Topics: IBase (storage system, web server, message queue, programming framework, etc.)
    8. Cloud Computing Meaning
       • Serve teams inside the company, reorganize resources, unify interfaces
       • New requirements: large-scale storage and computing, high performance, high availability, dynamic user needs
       • Transfer technology from back end to front end; the business relies on cloud computing
    9. Cloud Stack (diagram; layers as recoverable from the slide)
       • Programming: UB Framework, BSL, PHP, Bingo …
       • Web server: Lighttpd, Apache, BWS, Transmit
       • App: App Engine, Passport …
       • Storage system: MySQL, Mola, BDDB …
       • Batch storage: Hadoop …
       • Horizontal cloud services: Monitoring/Metering/Security
    10. Cloud Computing Products
       • Pyramid (DFS/DTS/DCS)
       • Online storage system (Mola, MySQL DBProxy)
       • Offline computing: Hadoop (HCE)
       • Platform as a Service: Baidu App Engine (BAE)
       • Cloud cache management: ZCache
    11. Pyramid
       • DFS: Distributed File System
       • DTS: Distributed Table System
       • DCS: Distributed Computing System
       • Choose machine, cost, save energy
       Assumptions:
       • File mutation is appending (often concurrently) rather than overwriting
       • Once written, files are only read (often sequentially), many times
       • Component failures are the norm
       • High sustained bandwidth is more important than low latency
    12. Pyramid: DFS
       • Single DFS master + many chunk servers
       • Files: fixed-size chunks (256 MB)
       • Chunks: replicated on multiple chunk servers (mostly 3 copies)
       (Diagram: numbered chunk replicas spread across chunk servers.)
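The chunking and replication described on this slide can be illustrated with a minimal sketch. The 256 MB chunk size and the 3 copies come from the slide; the round-robin placement policy, the server names, and the function names are assumptions made only for illustration, since the slide does not say how Pyramid actually places replicas.

```python
CHUNK_SIZE = 256 * 1024 * 1024  # 256 MB fixed-size chunks (from the slide)
REPLICAS = 3                    # "mostly 3 copies" (from the slide)


def chunk_count(file_size: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Number of fixed-size chunks for a file; the last chunk may be partial."""
    return (file_size + chunk_size - 1) // chunk_size


def place_replicas(chunk_id: int, servers: list, replicas: int = REPLICAS) -> list:
    """Pick `replicas` distinct servers for one chunk.

    Round-robin starting at chunk_id is a hypothetical policy, not Pyramid's.
    """
    if replicas > len(servers):
        raise ValueError("not enough chunk servers for the replication factor")
    start = chunk_id % len(servers)
    return [servers[(start + i) % len(servers)] for i in range(replicas)]


def plan_file(file_size: int, servers: list) -> dict:
    """Map each chunk id of a file to the servers that would hold its copies."""
    return {cid: place_replicas(cid, servers) for cid in range(chunk_count(file_size))}


if __name__ == "__main__":
    servers = ["cs0", "cs1", "cs2", "cs3", "cs4"]  # hypothetical chunk servers
    for cid, holders in plan_file(1024 * 1024 * 1024, servers).items():
        print(cid, holders)
```

A real master would also weigh disk usage, rack layout, and server health when choosing replicas; the sketch only shows why losing one chunk server still leaves two copies of every chunk it held.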
    13. Pyramid: DTS
       • Single DTS master + many workers
       • Sorted and partitioned by row key
         • Each partition is about 256 MB
         • Partitions can be split or merged due to insertion or deletion
       • B+ tree hierarchy
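Partitioning by sorted row key can be sketched as follows. This is an illustration under stated assumptions, not the DTS implementation: the class and method names are hypothetical, partitions are plain in-memory dicts, and only splitting is shown (merging after deletions is omitted). The 256 MB threshold is the partition size given on the slide.

```python
import bisect

SPLIT_THRESHOLD = 256 * 1024 * 1024  # "about 256MB" per partition (from the slide)


class PartitionMap:
    """Hypothetical row-key range map: partition i covers [start_keys[i], start_keys[i+1])."""

    def __init__(self, threshold: int = SPLIT_THRESHOLD):
        self.threshold = threshold
        self.start_keys = [""]  # the first partition starts at the smallest possible key
        self.rows = [{}]        # per partition: row key -> stored size in bytes

    def locate(self, row_key: str) -> int:
        """Binary search for the partition whose key range contains row_key."""
        return bisect.bisect_right(self.start_keys, row_key) - 1

    def insert(self, row_key: str, size: int) -> None:
        idx = self.locate(row_key)
        self.rows[idx][row_key] = size
        # Split once the partition grows past the threshold (needs at least 2 rows).
        if sum(self.rows[idx].values()) > self.threshold and len(self.rows[idx]) > 1:
            self._split(idx)

    def _split(self, idx: int) -> None:
        """Cut a partition at its median row key into two adjacent partitions."""
        keys = sorted(self.rows[idx])
        mid = keys[len(keys) // 2]
        left = {k: v for k, v in self.rows[idx].items() if k < mid}
        right = {k: v for k, v in self.rows[idx].items() if k >= mid}
        self.rows[idx] = left
        self.start_keys.insert(idx + 1, mid)
        self.rows.insert(idx + 1, right)


if __name__ == "__main__":
    pm = PartitionMap(threshold=250)  # tiny threshold so the demo splits quickly
    for key in ["a", "m", "z"]:
        pm.insert(key, 100)
    print(pm.start_keys, pm.locate("b"), pm.locate("q"))
```

Because the start keys stay sorted, a lookup is a binary search over partition boundaries, which is the same idea a B+ tree hierarchy applies at several levels.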
    14. Pyramid: DCS
       • Uses DFS or DTS as input/output
    15. Pyramid: Tradeoff
       • Strong consistency vs. single point of failure
       • Separate DFS and DTS layers: if there is a bug in DTS, the data still exists on DFS, vs. the two-layer structure leads to complex engineering
       • Tablet auto-load or auto-unload, B+ tree (snapshot, checkpoint, lock-free) vs. maintenance
       • Strong fault tolerance and high performance in both sequential and random reads vs. write latency accumulating to hundreds of ms
    16. KV storage system: Mola
    17. DB Proxy
    18. Hadoop in Baidu (1)
       • Statistics
         • Storage: 20 PB+
         • New data per day: 10 TB+
         • Processed data per day: 1 PB+
         • Jobs per day: 10K+
         • File system: Hadoop
         • Storage system: HBase
         • Nodes: 2K+
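As a rough sense of scale, the slide's figures can be turned into per-node and per-job averages. This is back-of-the-envelope arithmetic only: it assumes decimal units, treats each "+" figure as its lower bound, and assumes data and work are spread evenly over all nodes, none of which the slide states.

```python
PB = 10**15
TB = 10**12
SECONDS_PER_DAY = 86_400

# Lower-bound figures from the slide.
storage_bytes = 20 * PB
processed_per_day_bytes = 1 * PB
jobs_per_day = 10_000
nodes = 2_000

# Averages under the even-spread assumption (illustrative, not measured).
storage_per_node_tb = storage_bytes / nodes / TB
processed_per_node_tb = processed_per_day_bytes / nodes / TB
throughput_per_node_mb_s = processed_per_day_bytes / nodes / SECONDS_PER_DAY / 10**6
data_per_job_gb = processed_per_day_bytes / jobs_per_day / 10**9

if __name__ == "__main__":
    print(f"storage per node:   {storage_per_node_tb:.1f} TB")
    print(f"processed per node: {processed_per_node_tb:.1f} TB/day")
    print(f"sustained per node: {throughput_per_node_mb_s:.2f} MB/s")
    print(f"data per job:       {data_per_job_gb:.0f} GB")
```

Under these assumptions each node holds about 10 TB and processes about 0.5 TB per day, roughly 5.8 MB/s sustained, and an average job touches about 100 GB.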
    19. Hadoop in Baidu (2)
    20. Hadoop Optimization
       • HDFS NameNode: distributed
       • HDFS DataNode: asynchronous read/write
       • MapReduce JobTracker: distributed
       • MapReduce: Hadoop C++ extension
       • DISQL: statistical analysis of logs
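Statistical analysis of logs on MapReduce follows a map, shuffle, reduce pattern, sketched here in a single process. This is a generic illustration and not DISQL or Hadoop code: the tab-separated "timestamp, query" log format and all function names are assumptions.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line: str) -> list:
    """Emit (query, 1) for one log line; malformed lines are skipped.

    The "timestamp<TAB>query" format is a hypothetical example.
    """
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 2:
        return []
    _timestamp, query = parts
    return [(query, 1)]


def shuffle(pairs) -> dict:
    """Group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list) -> tuple:
    """Sum the counts collected for one query."""
    return key, sum(values)


def run_job(lines) -> dict:
    """Run the three phases over an iterable of log lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, vs) for k, vs in shuffle(mapped).items())


if __name__ == "__main__":
    log = ["t1\tmp3", "t2\tnews", "t3\tmp3", "bad line"]
    print(run_job(log))
```

On a cluster, the map calls run on the nodes that hold the input chunks and the shuffle moves data between machines; the per-key logic stays the same.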
    21. DISQL
       • Distributed framework for statistics requirements
       (Diagram, layers from top to bottom:)
       • Human-computer interaction: LSP, distributed shell
       • Programming API: simple mode, DQuery mode
       • High-layer framework: DISQL
       • Lower-layer framework: Hadoop, MOLA, DDB, MPI, Big Table, ?
       • Linux
    22. Baidu App Engine (diagram)
       • Public cloud; entire web solution
       • Components: PHP, key-value DB, cache, TaskQueue, mail, cron, FetchURL
    23. BAE Concerns
       • Static scalability & dynamic scalability
       • Isolation & security
       • High availability (computing & data)
    24. BAE Tech Features
       • Multiple apps
       • Web server cluster
       • Elastic code execution platform
       • Data center
       • Resource statistics
       • Auto monitor
       • Secure runtime environment
       • SDK
    25. BAE Dependencies
       • Software
         • FS (Linux EXT3)
         • DB (MySQL, DBProxy)
         • Data center (Mola)
         • Web server cluster (Lighttpd, mod_*)
         • App language (PHP, BINGO, Smarty)
         • Network (RPC framework, MCPack protocols)
    26. BAE Arch. (diagram)
       • Dashboard
       • Web server cluster
       • Code execution cluster
       • Cloud services: data center, cache, MySQL, fetchurl, crontab…
       • Alongside all layers: Statistics, Auto Manage
    27. BAE Sandbox (diagram)
       • POSIX environment
       • HTTP server sandbox
       • App config
       • PHP sandbox
       • Your app
    28. BAE Future
       • LAMP runtime environment => support for diverse languages
       • Support 10k+ applications => 80% of Baidu's products will be migrated
       • Billion-level traffic per day => more than 2k machines in 2 years, CPU idle 90% -> 60%
    29. Box Computing
    30. Box Computing
    31. Box Computing vs. Cloud Computing
       • Cloud computing focuses more on the back end, i.e. the infrastructure of the services and scalable computing
       • Box computing is more concerned with the front end, i.e. the users' requirements and how to meet them
    32. Aladdin Plan
       • On the Aladdin open platform, a third party is allowed to submit its own service together with its structured data:
         • sign up your app
         • choose the keywords for which you want it displayed
         • choose a template
         • submit your data in XML form
       • Addresses the problem that existing search engines cannot crawl and retrieve "hidden web" information
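The four submission steps can be pictured as building one XML document. The sketch below is hypothetical throughout: the slide does not give the Aladdin schema, so the element names (`submission`, `keywords`, `keyword`, `data`, `item`), the attributes, and the function name are assumptions chosen only to show the shape of such a payload.

```python
import xml.etree.ElementTree as ET


def build_submission(app_name: str, keywords: list, template: str, records: list) -> str:
    """Serialize an app's keywords, template choice, and structured records as XML.

    Every element and attribute name here is illustrative, not the real schema.
    Record field names must be valid XML tag names.
    """
    root = ET.Element("submission", {"app": app_name, "template": template})
    keywords_el = ET.SubElement(root, "keywords")
    for keyword in keywords:
        ET.SubElement(keywords_el, "keyword").text = keyword
    data_el = ET.SubElement(root, "data")
    for record in records:
        item = ET.SubElement(data_el, "item")
        for field, value in record.items():
            ET.SubElement(item, field).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    xml_text = build_submission(
        app_name="weather-demo",           # hypothetical app
        keywords=["weather"],
        template="card",                   # hypothetical template name
        records=[{"city": "Beijing", "high": 12}],
    )
    print(xml_text)
```

Structured records like these let a third-party service supply content directly, which is the slide's answer to data that a crawler cannot reach.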
    33. Challenges
       • Massive data sets: TB scale, billions of PV per day
       • Return results in a very short time: ms
       • Large-scale real-time business computing
    34. Reference
    35. Thank you!