SlideShare a Scribd company logo
1 of 9
Download to read offline
Big Data on EC2: Mashing Technology in the Cloud

ShareThis


Paco Nathan, Data Insights Team




ScaleCamp 2009-06-09
Scenario…


✤   Given: >10^6 publishers, >10^9 users, >10^10 urls

✤   Early-stage start-up, < 25 people, seeking to minimize spend on Ops
    and capex for data centers

✤   Serving widgets on page views for popular online publications: ESPN,
    HuffPost, FOX, CS Monitor, CBS Marketwatch, Wired, TechCrunch, etc.

✤   Spikes in popularity of stories leads to elastic demands throughout
    the system architecture: serving API, logging, DW, BI, etc.

✤   Business needs to improve user experience by analyzing how people
    share online content


ScaleCamp 2009-06-09
System Architecture


✤   100% infrastructure based on Amazon AWS

✤   Each component designed for cost-effective, horizontal scale-out

✤   AsterData: infrastructure based on a “hub-and-spoke” pattern of batch
    jobs and data consolidation

✤   Cascading: abstraction layer for tying together system components

✤   Batch jobs run on Amazon Elastic MapReduce and AsterData SQL/MR

✤   Vertical search based on Bixo and Katta


ScaleCamp 2009-06-09
ScaleCamp 2009-06-09
Cascading

✤   Syntax is for humans, APIs are for software

✤   Cascading defines apps in terms of functions applied to data flows,
    incorporates end-points; whereas MR is coarse, key/values are brittle

✤   Direct benefits to team’s responsibilities, process, deliverables

✤   Also consider impact on team size, composition, staffing, training

✤   Ideal for provisioning/monitoring elastic resources, such as EMR

✤   Great potential to allow for migrating apps

✤   (see also: Cascading talk @ Hadoop Summit tomorrow, 6pm)

ScaleCamp 2009-06-09
Amazon Elastic MapReduce

✤   Can scale-out wide and select different instance types at launch

✤   Reads input from S3, writes output to S3, optionally copies logs to S3

✤   Excellent command line tools make dev/test/debug cycle more efficient

✤   Risks for DIY Hadoop clusters: time+cost to acquire leases, data locality,
    etc., — EMR resolves these to make large apps more cost-effective

✤   Simplifies needs for Ops — which can be quite difficult and expensive
    to staff for Big Data apps, and pose a risk to scale-out strategy

✤   See the Cascading examples: LogAnalyzer for CloudFront and Multitool

ScaleCamp 2009-06-09
AsterData nCluster Cloud Edition

✤   Scalable, fault-tolerant, relational database — directly in the cloud

✤   Takes only minutes to go from idea to a prototype which scales well;
    accessible by data scientists who have less coding background

✤   Use of SQL unions on map phase, plus other SQL primitives for shuffle
    and reduce phases, allows for complex MR workflows

✤   Leveraging SQL primitives during shuffle and reduce phases allows for
    automated optimizations

✤   Contributed code among developer community, implementing libraries
    for popular algorithms based on In-Database MapReduce


ScaleCamp 2009-06-09
ShareThis as Case Study

See also: ShareThis case study, "Cascading"
by Chris K Wensel, in…




ScaleCamp 2009-06-09
Contacts:
http://sharethis.com
@pacoid on Twitter
http://cascading.org
http://asterdata.com/product/ncluster_cloud.php
http://aws.amazon.com/elasticmapreduce
http://github.com/emi/bixo
http://katta.sourceforge.net
http://www.hadoopbook.com




ScaleCamp 2009-06-09

More Related Content

What's hot

PowerStream: Propelling Energy Innovation with Predictive Analytics
PowerStream: Propelling Energy Innovation with Predictive Analytics PowerStream: Propelling Energy Innovation with Predictive Analytics
PowerStream: Propelling Energy Innovation with Predictive Analytics SingleStore
 
High availability, real-time and scalable architectures
High availability, real-time and scalable architecturesHigh availability, real-time and scalable architectures
High availability, real-time and scalable architecturesJampp
 
Modeling the Smart and Connected City of the Future with Kafka and Spark
Modeling the Smart and Connected City of the Future with Kafka and SparkModeling the Smart and Connected City of the Future with Kafka and Spark
Modeling the Smart and Connected City of the Future with Kafka and SparkSingleStore
 
Processing 19 billion messages in real time and NOT dying in the process
Processing 19 billion messages in real time and NOT dying in the processProcessing 19 billion messages in real time and NOT dying in the process
Processing 19 billion messages in real time and NOT dying in the processJampp
 
Razorfish - Amazon EMR usecase
Razorfish - Amazon EMR usecaseRazorfish - Amazon EMR usecase
Razorfish - Amazon EMR usecaseguestda111d9
 
Big problems Big Data, simple solutions
Big problems Big Data, simple solutionsBig problems Big Data, simple solutions
Big problems Big Data, simple solutionsClaudio Pontili
 
Multi objective tasks scheduling algorithm for cloud computing throughput opt...
Multi objective tasks scheduling algorithm for cloud computing throughput opt...Multi objective tasks scheduling algorithm for cloud computing throughput opt...
Multi objective tasks scheduling algorithm for cloud computing throughput opt...Shakas Technologies
 
Change Data Capture - Scale by the Bay 2019
Change Data Capture - Scale by the Bay 2019Change Data Capture - Scale by the Bay 2019
Change Data Capture - Scale by the Bay 2019Petr Zapletal
 
Santa Cloud: How Netflix Does Holiday Capacity Planning - South Bay SRE Meetu...
Santa Cloud: How Netflix Does Holiday Capacity Planning - South Bay SRE Meetu...Santa Cloud: How Netflix Does Holiday Capacity Planning - South Bay SRE Meetu...
Santa Cloud: How Netflix Does Holiday Capacity Planning - South Bay SRE Meetu...Coburn Watson
 
Howtomakeyourown gi sdashboard
Howtomakeyourown gi sdashboardHowtomakeyourown gi sdashboard
Howtomakeyourown gi sdashboardGeoMedeelel
 
0 supermapproductsintroduction
0 supermapproductsintroduction0 supermapproductsintroduction
0 supermapproductsintroductionGeoMedeelel
 
02 supermapiclientforjavascriptintroduction
02 supermapiclientforjavascriptintroduction02 supermapiclientforjavascriptintroduction
02 supermapiclientforjavascriptintroductionGeoMedeelel
 
01 supermapiportaloverview
01 supermapiportaloverview01 supermapiportaloverview
01 supermapiportaloverviewGeoMedeelel
 
Autonomous analytics on streaming data
Autonomous analytics on streaming dataAutonomous analytics on streaming data
Autonomous analytics on streaming dataClaudiu Barbura
 
01 supermapiserverintroduction
01 supermapiserverintroduction01 supermapiserverintroduction
01 supermapiserverintroductionGeoMedeelel
 
Scalable crawling with Kafka, scrapy and spark - November 2021
Scalable crawling with Kafka, scrapy and spark - November 2021Scalable crawling with Kafka, scrapy and spark - November 2021
Scalable crawling with Kafka, scrapy and spark - November 2021Max Lapan
 
AWS Customer Presentation - Cloud Made
AWS Customer Presentation - Cloud MadeAWS Customer Presentation - Cloud Made
AWS Customer Presentation - Cloud MadeAmazon Web Services
 
Cloud computing's truly open silver lining: OpenStack
Cloud computing's truly open silver lining: OpenStackCloud computing's truly open silver lining: OpenStack
Cloud computing's truly open silver lining: OpenStackAsociatia ProLinux
 
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and SparkSpark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and SparkSingleStore
 

What's hot (20)

PowerStream: Propelling Energy Innovation with Predictive Analytics
PowerStream: Propelling Energy Innovation with Predictive Analytics PowerStream: Propelling Energy Innovation with Predictive Analytics
PowerStream: Propelling Energy Innovation with Predictive Analytics
 
High availability, real-time and scalable architectures
High availability, real-time and scalable architecturesHigh availability, real-time and scalable architectures
High availability, real-time and scalable architectures
 
Modeling the Smart and Connected City of the Future with Kafka and Spark
Modeling the Smart and Connected City of the Future with Kafka and SparkModeling the Smart and Connected City of the Future with Kafka and Spark
Modeling the Smart and Connected City of the Future with Kafka and Spark
 
Processing 19 billion messages in real time and NOT dying in the process
Processing 19 billion messages in real time and NOT dying in the processProcessing 19 billion messages in real time and NOT dying in the process
Processing 19 billion messages in real time and NOT dying in the process
 
Razorfish - Amazon EMR usecase
Razorfish - Amazon EMR usecaseRazorfish - Amazon EMR usecase
Razorfish - Amazon EMR usecase
 
Big problems Big Data, simple solutions
Big problems Big Data, simple solutionsBig problems Big Data, simple solutions
Big problems Big Data, simple solutions
 
Multi objective tasks scheduling algorithm for cloud computing throughput opt...
Multi objective tasks scheduling algorithm for cloud computing throughput opt...Multi objective tasks scheduling algorithm for cloud computing throughput opt...
Multi objective tasks scheduling algorithm for cloud computing throughput opt...
 
Change Data Capture - Scale by the Bay 2019
Change Data Capture - Scale by the Bay 2019Change Data Capture - Scale by the Bay 2019
Change Data Capture - Scale by the Bay 2019
 
Santa Cloud: How Netflix Does Holiday Capacity Planning - South Bay SRE Meetu...
Santa Cloud: How Netflix Does Holiday Capacity Planning - South Bay SRE Meetu...Santa Cloud: How Netflix Does Holiday Capacity Planning - South Bay SRE Meetu...
Santa Cloud: How Netflix Does Holiday Capacity Planning - South Bay SRE Meetu...
 
Howtomakeyourown gi sdashboard
Howtomakeyourown gi sdashboardHowtomakeyourown gi sdashboard
Howtomakeyourown gi sdashboard
 
0 supermapproductsintroduction
0 supermapproductsintroduction0 supermapproductsintroduction
0 supermapproductsintroduction
 
Practical Experiences With ArcGIS Server
Practical Experiences With ArcGIS ServerPractical Experiences With ArcGIS Server
Practical Experiences With ArcGIS Server
 
02 supermapiclientforjavascriptintroduction
02 supermapiclientforjavascriptintroduction02 supermapiclientforjavascriptintroduction
02 supermapiclientforjavascriptintroduction
 
01 supermapiportaloverview
01 supermapiportaloverview01 supermapiportaloverview
01 supermapiportaloverview
 
Autonomous analytics on streaming data
Autonomous analytics on streaming dataAutonomous analytics on streaming data
Autonomous analytics on streaming data
 
01 supermapiserverintroduction
01 supermapiserverintroduction01 supermapiserverintroduction
01 supermapiserverintroduction
 
Scalable crawling with Kafka, scrapy and spark - November 2021
Scalable crawling with Kafka, scrapy and spark - November 2021Scalable crawling with Kafka, scrapy and spark - November 2021
Scalable crawling with Kafka, scrapy and spark - November 2021
 
AWS Customer Presentation - Cloud Made
AWS Customer Presentation - Cloud MadeAWS Customer Presentation - Cloud Made
AWS Customer Presentation - Cloud Made
 
Cloud computing's truly open silver lining: OpenStack
Cloud computing's truly open silver lining: OpenStackCloud computing's truly open silver lining: OpenStack
Cloud computing's truly open silver lining: OpenStack
 
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and SparkSpark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
 

Viewers also liked

Ahead of the curve with Ecommerce
Ahead of the curve with EcommerceAhead of the curve with Ecommerce
Ahead of the curve with EcommerceRahul Singh
 
Social Media for Atlantic Canada Marketing
Social Media for Atlantic Canada MarketingSocial Media for Atlantic Canada Marketing
Social Media for Atlantic Canada MarketingMediaBadger
 
Women and Social Media in Canada
Women and Social Media in CanadaWomen and Social Media in Canada
Women and Social Media in CanadaAyelet Baron
 
How Canadians Shop
How Canadians ShopHow Canadians Shop
How Canadians Shopguest970d5
 
Kristof Coussement - The Debate: the Future of (Big) Data Analytics Software
Kristof Coussement - The Debate: the Future of (Big) Data Analytics SoftwareKristof Coussement - The Debate: the Future of (Big) Data Analytics Software
Kristof Coussement - The Debate: the Future of (Big) Data Analytics SoftwareBAQMaR
 
Histmojurassicpark 090807051516-phpapp01
Histmojurassicpark 090807051516-phpapp01Histmojurassicpark 090807051516-phpapp01
Histmojurassicpark 090807051516-phpapp01ajeesh n
 
eMarketer Webinar: Demographics in Canada—Age-based Digital Behaviors
eMarketer Webinar: Demographics in Canada—Age-based Digital BehaviorseMarketer Webinar: Demographics in Canada—Age-based Digital Behaviors
eMarketer Webinar: Demographics in Canada—Age-based Digital BehaviorseMarketer
 
Animatronics
AnimatronicsAnimatronics
AnimatronicsPLANB1
 
The Impact of Big Data On Marketing Analytics (UpStream Software)
The Impact of Big Data On Marketing Analytics (UpStream Software)The Impact of Big Data On Marketing Analytics (UpStream Software)
The Impact of Big Data On Marketing Analytics (UpStream Software)Revolution Analytics
 
Ultra wideband technology (UWB)
Ultra wideband technology (UWB)Ultra wideband technology (UWB)
Ultra wideband technology (UWB)Mustafa Khaleel
 
Canada country presentation_presentation
Canada country presentation_presentationCanada country presentation_presentation
Canada country presentation_presentationTeresa Maroto Ramos
 
Wireless Application Protocol ppt
Wireless Application Protocol pptWireless Application Protocol ppt
Wireless Application Protocol pptgo2project
 

Viewers also liked (20)

Ahead of the curve with Ecommerce
Ahead of the curve with EcommerceAhead of the curve with Ecommerce
Ahead of the curve with Ecommerce
 
Social Media for Atlantic Canada Marketing
Social Media for Atlantic Canada MarketingSocial Media for Atlantic Canada Marketing
Social Media for Atlantic Canada Marketing
 
Social media for brands.pdf
Social media for brands.pdfSocial media for brands.pdf
Social media for brands.pdf
 
Canada
CanadaCanada
Canada
 
Women and Social Media in Canada
Women and Social Media in CanadaWomen and Social Media in Canada
Women and Social Media in Canada
 
How Canadians Shop
How Canadians ShopHow Canadians Shop
How Canadians Shop
 
Canadaeconomics
CanadaeconomicsCanadaeconomics
Canadaeconomics
 
Maglev trains
Maglev trainsMaglev trains
Maglev trains
 
Wind power forecasting accuracy and uncertainty in Finland
Wind power forecasting accuracy and uncertainty in FinlandWind power forecasting accuracy and uncertainty in Finland
Wind power forecasting accuracy and uncertainty in Finland
 
Kristof Coussement - The Debate: the Future of (Big) Data Analytics Software
Kristof Coussement - The Debate: the Future of (Big) Data Analytics SoftwareKristof Coussement - The Debate: the Future of (Big) Data Analytics Software
Kristof Coussement - The Debate: the Future of (Big) Data Analytics Software
 
Animatronics Portfolio
Animatronics PortfolioAnimatronics Portfolio
Animatronics Portfolio
 
Histmojurassicpark 090807051516-phpapp01
Histmojurassicpark 090807051516-phpapp01Histmojurassicpark 090807051516-phpapp01
Histmojurassicpark 090807051516-phpapp01
 
eMarketer Webinar: Demographics in Canada—Age-based Digital Behaviors
eMarketer Webinar: Demographics in Canada—Age-based Digital BehaviorseMarketer Webinar: Demographics in Canada—Age-based Digital Behaviors
eMarketer Webinar: Demographics in Canada—Age-based Digital Behaviors
 
Stan winston & animatronics
Stan winston & animatronicsStan winston & animatronics
Stan winston & animatronics
 
Animatronics
AnimatronicsAnimatronics
Animatronics
 
Maglev trains ppt
Maglev trains pptMaglev trains ppt
Maglev trains ppt
 
The Impact of Big Data On Marketing Analytics (UpStream Software)
The Impact of Big Data On Marketing Analytics (UpStream Software)The Impact of Big Data On Marketing Analytics (UpStream Software)
The Impact of Big Data On Marketing Analytics (UpStream Software)
 
Ultra wideband technology (UWB)
Ultra wideband technology (UWB)Ultra wideband technology (UWB)
Ultra wideband technology (UWB)
 
Canada country presentation_presentation
Canada country presentation_presentationCanada country presentation_presentation
Canada country presentation_presentation
 
Wireless Application Protocol ppt
Wireless Application Protocol pptWireless Application Protocol ppt
Wireless Application Protocol ppt
 

Similar to Big Data on EC2: Mashing Tech in the Cloud

Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...
Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...
Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...DATAVERSITY
 
AWS Start-Up Tour 2009 / ShareThis
AWS Start-Up Tour 2009 / ShareThisAWS Start-Up Tour 2009 / ShareThis
AWS Start-Up Tour 2009 / ShareThisPaco Nathan
 
High Performance Computing on AWS: Accelerating Innovation with virtually unl...
High Performance Computing on AWS: Accelerating Innovation with virtually unl...High Performance Computing on AWS: Accelerating Innovation with virtually unl...
High Performance Computing on AWS: Accelerating Innovation with virtually unl...Amazon Web Services
 
Building a Big Data & Analytics Platform using AWS
Building a Big Data & Analytics Platform using AWS Building a Big Data & Analytics Platform using AWS
Building a Big Data & Analytics Platform using AWS Amazon Web Services
 
Cloud Computing Realities - Getting past the hype and setting your cloud stra...
Cloud Computing Realities - Getting past the hype and setting your cloud stra...Cloud Computing Realities - Getting past the hype and setting your cloud stra...
Cloud Computing Realities - Getting past the hype and setting your cloud stra...Compuware APM
 
Apache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterpriseApache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterprisejbellis
 
ESA and the Cloud
ESA and the CloudESA and the Cloud
ESA and the CloudNetcetera
 
Cloud Computing ...changes everything
Cloud Computing ...changes everythingCloud Computing ...changes everything
Cloud Computing ...changes everythingLew Tucker
 
Leveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern AnalyticsLeveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern Analyticsconfluent
 
Simplify Your Migration to AWS and Cut Costs by 30% with TSO Logic
 Simplify Your Migration to AWS and Cut Costs by 30% with TSO Logic Simplify Your Migration to AWS and Cut Costs by 30% with TSO Logic
Simplify Your Migration to AWS and Cut Costs by 30% with TSO LogicAmazon Web Services
 
Optimizing Cost and Capacity for Compute - CMP302 - Santa Clara AWS Summit
Optimizing Cost and Capacity for Compute - CMP302 - Santa Clara AWS SummitOptimizing Cost and Capacity for Compute - CMP302 - Santa Clara AWS Summit
Optimizing Cost and Capacity for Compute - CMP302 - Santa Clara AWS SummitAmazon Web Services
 
Cloud Has Become the New Normal: TCS
Cloud Has Become the New Normal: TCS Cloud Has Become the New Normal: TCS
Cloud Has Become the New Normal: TCS Amazon Web Services
 
Zeller Edm Summit Agile Deployment Of Predictive Analytics
Zeller Edm Summit   Agile Deployment Of Predictive AnalyticsZeller Edm Summit   Agile Deployment Of Predictive Analytics
Zeller Edm Summit Agile Deployment Of Predictive AnalyticsRonald.Ramos
 
High Performance Computing on AWS
High Performance Computing on AWSHigh Performance Computing on AWS
High Performance Computing on AWSAmazon Web Services
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computingwebscale
 
AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)
AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)
AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)Amazon Web Services
 
Giga Spaces Alternative To GAE_JavaOne 09
Giga Spaces Alternative To GAE_JavaOne 09Giga Spaces Alternative To GAE_JavaOne 09
Giga Spaces Alternative To GAE_JavaOne 09Amnon Raviv
 
AWS Summit 2018 Summary
AWS Summit 2018 SummaryAWS Summit 2018 Summary
AWS Summit 2018 SummaryAshish Mrig
 
Cloud Architecture - Multi Cloud, Edge, On-Premise
Cloud Architecture - Multi Cloud, Edge, On-PremiseCloud Architecture - Multi Cloud, Edge, On-Premise
Cloud Architecture - Multi Cloud, Edge, On-PremiseAraf Karsh Hamid
 
Apache Hadoop India Summit 2011 talk "Making Hadoop Enterprise Ready with Am...
Apache Hadoop India Summit 2011 talk  "Making Hadoop Enterprise Ready with Am...Apache Hadoop India Summit 2011 talk  "Making Hadoop Enterprise Ready with Am...
Apache Hadoop India Summit 2011 talk "Making Hadoop Enterprise Ready with Am...Yahoo Developer Network
 

Similar to Big Data on EC2: Mashing Tech in the Cloud (20)

Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...
Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...
Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...
 
AWS Start-Up Tour 2009 / ShareThis
AWS Start-Up Tour 2009 / ShareThisAWS Start-Up Tour 2009 / ShareThis
AWS Start-Up Tour 2009 / ShareThis
 
High Performance Computing on AWS: Accelerating Innovation with virtually unl...
High Performance Computing on AWS: Accelerating Innovation with virtually unl...High Performance Computing on AWS: Accelerating Innovation with virtually unl...
High Performance Computing on AWS: Accelerating Innovation with virtually unl...
 
Building a Big Data & Analytics Platform using AWS
Building a Big Data & Analytics Platform using AWS Building a Big Data & Analytics Platform using AWS
Building a Big Data & Analytics Platform using AWS
 
Cloud Computing Realities - Getting past the hype and setting your cloud stra...
Cloud Computing Realities - Getting past the hype and setting your cloud stra...Cloud Computing Realities - Getting past the hype and setting your cloud stra...
Cloud Computing Realities - Getting past the hype and setting your cloud stra...
 
Apache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterpriseApache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterprise
 
ESA and the Cloud
ESA and the CloudESA and the Cloud
ESA and the Cloud
 
Cloud Computing ...changes everything
Cloud Computing ...changes everythingCloud Computing ...changes everything
Cloud Computing ...changes everything
 
Leveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern AnalyticsLeveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern Analytics
 
Simplify Your Migration to AWS and Cut Costs by 30% with TSO Logic
 Simplify Your Migration to AWS and Cut Costs by 30% with TSO Logic Simplify Your Migration to AWS and Cut Costs by 30% with TSO Logic
Simplify Your Migration to AWS and Cut Costs by 30% with TSO Logic
 
Optimizing Cost and Capacity for Compute - CMP302 - Santa Clara AWS Summit
Optimizing Cost and Capacity for Compute - CMP302 - Santa Clara AWS SummitOptimizing Cost and Capacity for Compute - CMP302 - Santa Clara AWS Summit
Optimizing Cost and Capacity for Compute - CMP302 - Santa Clara AWS Summit
 
Cloud Has Become the New Normal: TCS
Cloud Has Become the New Normal: TCS Cloud Has Become the New Normal: TCS
Cloud Has Become the New Normal: TCS
 
Zeller Edm Summit Agile Deployment Of Predictive Analytics
Zeller Edm Summit   Agile Deployment Of Predictive AnalyticsZeller Edm Summit   Agile Deployment Of Predictive Analytics
Zeller Edm Summit Agile Deployment Of Predictive Analytics
 
High Performance Computing on AWS
High Performance Computing on AWSHigh Performance Computing on AWS
High Performance Computing on AWS
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
 
AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)
AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)
AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)
 
Giga Spaces Alternative To GAE_JavaOne 09
Giga Spaces Alternative To GAE_JavaOne 09Giga Spaces Alternative To GAE_JavaOne 09
Giga Spaces Alternative To GAE_JavaOne 09
 
AWS Summit 2018 Summary
AWS Summit 2018 SummaryAWS Summit 2018 Summary
AWS Summit 2018 Summary
 
Cloud Architecture - Multi Cloud, Edge, On-Premise
Cloud Architecture - Multi Cloud, Edge, On-PremiseCloud Architecture - Multi Cloud, Edge, On-Premise
Cloud Architecture - Multi Cloud, Edge, On-Premise
 
Apache Hadoop India Summit 2011 talk "Making Hadoop Enterprise Ready with Am...
Apache Hadoop India Summit 2011 talk  "Making Hadoop Enterprise Ready with Am...Apache Hadoop India Summit 2011 talk  "Making Hadoop Enterprise Ready with Am...
Apache Hadoop India Summit 2011 talk "Making Hadoop Enterprise Ready with Am...
 

More from George Ang

Wrapper induction construct wrappers automatically to extract information f...
Wrapper induction   construct wrappers automatically to extract information f...Wrapper induction   construct wrappers automatically to extract information f...
Wrapper induction construct wrappers automatically to extract information f...George Ang
 
Opinion mining and summarization
Opinion mining and summarizationOpinion mining and summarization
Opinion mining and summarizationGeorge Ang
 
Huffman coding
Huffman codingHuffman coding
Huffman codingGeorge Ang
 
Do not crawl in the dust 
different ur ls similar text
Do not crawl in the dust 
different ur ls similar textDo not crawl in the dust 
different ur ls similar text
Do not crawl in the dust 
different ur ls similar textGeorge Ang
 
大规模数据处理的那些事儿
大规模数据处理的那些事儿大规模数据处理的那些事儿
大规模数据处理的那些事儿George Ang
 
腾讯大讲堂02 休闲游戏发展的文化趋势
腾讯大讲堂02 休闲游戏发展的文化趋势腾讯大讲堂02 休闲游戏发展的文化趋势
腾讯大讲堂02 休闲游戏发展的文化趋势George Ang
 
腾讯大讲堂03 qq邮箱成长历程
腾讯大讲堂03 qq邮箱成长历程腾讯大讲堂03 qq邮箱成长历程
腾讯大讲堂03 qq邮箱成长历程George Ang
 
腾讯大讲堂04 im qq
腾讯大讲堂04 im qq腾讯大讲堂04 im qq
腾讯大讲堂04 im qqGeorge Ang
 
腾讯大讲堂05 面向对象应对之道
腾讯大讲堂05 面向对象应对之道腾讯大讲堂05 面向对象应对之道
腾讯大讲堂05 面向对象应对之道George Ang
 
腾讯大讲堂06 qq邮箱性能优化
腾讯大讲堂06 qq邮箱性能优化腾讯大讲堂06 qq邮箱性能优化
腾讯大讲堂06 qq邮箱性能优化George Ang
 
腾讯大讲堂07 qq空间
腾讯大讲堂07 qq空间腾讯大讲堂07 qq空间
腾讯大讲堂07 qq空间George Ang
 
腾讯大讲堂08 可扩展web架构探讨
腾讯大讲堂08 可扩展web架构探讨腾讯大讲堂08 可扩展web架构探讨
腾讯大讲堂08 可扩展web架构探讨George Ang
 
腾讯大讲堂09 如何建设高性能网站
腾讯大讲堂09 如何建设高性能网站腾讯大讲堂09 如何建设高性能网站
腾讯大讲堂09 如何建设高性能网站George Ang
 
腾讯大讲堂01 移动qq产品发展历程
腾讯大讲堂01 移动qq产品发展历程腾讯大讲堂01 移动qq产品发展历程
腾讯大讲堂01 移动qq产品发展历程George Ang
 
腾讯大讲堂10 customer engagement
腾讯大讲堂10 customer engagement腾讯大讲堂10 customer engagement
腾讯大讲堂10 customer engagementGeorge Ang
 
腾讯大讲堂11 拍拍ce工作经验分享
腾讯大讲堂11 拍拍ce工作经验分享腾讯大讲堂11 拍拍ce工作经验分享
腾讯大讲堂11 拍拍ce工作经验分享George Ang
 
腾讯大讲堂14 qq直播(qq live) 介绍
腾讯大讲堂14 qq直播(qq live) 介绍腾讯大讲堂14 qq直播(qq live) 介绍
腾讯大讲堂14 qq直播(qq live) 介绍George Ang
 
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍George Ang
 
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍George Ang
 
腾讯大讲堂16 产品经理工作心得分享
腾讯大讲堂16 产品经理工作心得分享腾讯大讲堂16 产品经理工作心得分享
腾讯大讲堂16 产品经理工作心得分享George Ang
 

More from George Ang (20)

Wrapper induction construct wrappers automatically to extract information f...
Wrapper induction   construct wrappers automatically to extract information f...Wrapper induction   construct wrappers automatically to extract information f...
Wrapper induction construct wrappers automatically to extract information f...
 
Opinion mining and summarization
Opinion mining and summarizationOpinion mining and summarization
Opinion mining and summarization
 
Huffman coding
Huffman codingHuffman coding
Huffman coding
 
Do not crawl in the dust 
different ur ls similar text
Do not crawl in the dust 
different ur ls similar textDo not crawl in the dust 
different ur ls similar text
Do not crawl in the dust 
different ur ls similar text
 
大规模数据处理的那些事儿
大规模数据处理的那些事儿大规模数据处理的那些事儿
大规模数据处理的那些事儿
 
腾讯大讲堂02 休闲游戏发展的文化趋势
腾讯大讲堂02 休闲游戏发展的文化趋势腾讯大讲堂02 休闲游戏发展的文化趋势
腾讯大讲堂02 休闲游戏发展的文化趋势
 
腾讯大讲堂03 qq邮箱成长历程
腾讯大讲堂03 qq邮箱成长历程腾讯大讲堂03 qq邮箱成长历程
腾讯大讲堂03 qq邮箱成长历程
 
腾讯大讲堂04 im qq
腾讯大讲堂04 im qq腾讯大讲堂04 im qq
腾讯大讲堂04 im qq
 
腾讯大讲堂05 面向对象应对之道
腾讯大讲堂05 面向对象应对之道腾讯大讲堂05 面向对象应对之道
腾讯大讲堂05 面向对象应对之道
 
腾讯大讲堂06 qq邮箱性能优化
腾讯大讲堂06 qq邮箱性能优化腾讯大讲堂06 qq邮箱性能优化
腾讯大讲堂06 qq邮箱性能优化
 
腾讯大讲堂07 qq空间
腾讯大讲堂07 qq空间腾讯大讲堂07 qq空间
腾讯大讲堂07 qq空间
 
腾讯大讲堂08 可扩展web架构探讨
腾讯大讲堂08 可扩展web架构探讨腾讯大讲堂08 可扩展web架构探讨
腾讯大讲堂08 可扩展web架构探讨
 
腾讯大讲堂09 如何建设高性能网站
腾讯大讲堂09 如何建设高性能网站腾讯大讲堂09 如何建设高性能网站
腾讯大讲堂09 如何建设高性能网站
 
腾讯大讲堂01 移动qq产品发展历程
腾讯大讲堂01 移动qq产品发展历程腾讯大讲堂01 移动qq产品发展历程
腾讯大讲堂01 移动qq产品发展历程
 
腾讯大讲堂10 customer engagement
腾讯大讲堂10 customer engagement腾讯大讲堂10 customer engagement
腾讯大讲堂10 customer engagement
 
腾讯大讲堂11 拍拍ce工作经验分享
腾讯大讲堂11 拍拍ce工作经验分享腾讯大讲堂11 拍拍ce工作经验分享
腾讯大讲堂11 拍拍ce工作经验分享
 
腾讯大讲堂14 qq直播(qq live) 介绍
腾讯大讲堂14 qq直播(qq live) 介绍腾讯大讲堂14 qq直播(qq live) 介绍
腾讯大讲堂14 qq直播(qq live) 介绍
 
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
 
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
腾讯大讲堂15 市场研究及数据分析理念及方法概要介绍
 
腾讯大讲堂16 产品经理工作心得分享
腾讯大讲堂16 产品经理工作心得分享腾讯大讲堂16 产品经理工作心得分享
腾讯大讲堂16 产品经理工作心得分享
 

Recently uploaded

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 

Recently uploaded (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 

Big Data on EC2: Mashing Tech in the Cloud

  • 1. Big Data on EC2: Mashing Technology in the Cloud ShareThis Paco Nathan, Data Insights Team ScaleCamp 2009-06-09
  • 2. Scenario… ✤ Given: >10^6 publishers, >10^9 users, >10^10 urls ✤ Early-stage start-up, < 25 people, seeking to minimize spend on Ops and capex for data centers ✤ Serving widgets on page views for popular online publications: ESPN, HuffPost, FOX, CS Monitor, CBS Marketwatch, Wired, TechCrunch, etc. ✤ Spikes in popularity of stories leads to elastic demands throughout the system architecture: serving API, logging, DW, BI, etc. ✤ Business needs to improve user experience by analyzing how people share online content ScaleCamp 2009-06-09
  • 3. System Architecture ✤ 100% infrastructure based on Amazon AWS ✤ Each component designed for cost-effective, horizontal scale-out ✤ AsterData: infrastructure based on a “hub-and-spoke” pattern of batch jobs and data consolidation ✤ Cascading: abstraction layer for tying together system components ✤ Batch jobs run on Amazon Elastic MapReduce and AsterData SQL/MR ✤ Vertical search based on Bixo and Katta ScaleCamp 2009-06-09
  • 5. Cascading ✤ Syntax is for humans, APIs are for software ✤ Cascading defines apps in terms of functions applied to data flows, incorporates end-points; whereas MR is coarse, key/values are brittle ✤ Direct benefits to team’s responsibilities, process, deliverables ✤ Also consider impact on team size, composition, staffing, training ✤ Ideal for provisioning/monitoring elastic resources, such as EMR ✤ Great potential to allow for migrating apps ✤ (see also: Cascading talk @ Hadoop Summit tomorrow, 6pm) ScaleCamp 2009-06-09
  • 6. Amazon Elastic MapReduce ✤ Can scale-out wide and select different instance types at launch ✤ Reads input from S3, writes output to S3, optionally copies logs to S3 ✤ Excellent command line tools make dev/test/debug cycle more efficient ✤ Risks for DIY Hadoop clusters: time+cost to acquire leases, data locality, etc., — EMR resolves these to make large apps more cost-effective ✤ Simplifies needs for Ops — which can be quite difficult and expensive to staff for Big Data apps, and pose a risk to scale-out strategy ✤ See the Cascading examples: LogAnalyzer for CloudFront and Multitool ScaleCamp 2009-06-09
  • 7. AsterData nCluster Cloud Edition ✤ Scalable, fault-tolerant, relational database — directly in the cloud ✤ Takes only minutes to go from idea to a prototype which scales well; accessible by data scientists who have less coding background ✤ Use of SQL unions on map phase, plus other SQL primitives for shuffle and reduce phases, allows for complex MR workflows ✤ Leveraging SQL primitives during shuffle and reduce phases allows for automated optimizations ✤ Contributed code among developer community, implementing libraries for popular algorithms based on In-Database MapReduce ScaleCamp 2009-06-09
  • 8. ShareThis as Case Study See also: ShareThis case study, "Cascading" by Chris K Wensel, in… ScaleCamp 2009-06-09