This is a talk that I gave at Stanford's EE203 (Entrepreneurial Engineer) on Tuesday Feb 9th, 2010. It covers my experience at Stanford, VivaSmart, Yahoo, Accel Partners, and Cloudera.
2. Outline: About Me; The VivaSmart Story; The EIR Experience; The Cloudera Story … so far; What is Hadoop?; Open Source Business Models; Lessons Learned & Advice
3. About Me I got my BS and MS from Cairo University in Egypt. I came to the US in 1995 to get my PhD from Stanford, with the goal of going back to Egypt to teach. I got infected by the entrepreneurship bug; it is rampant at Stanford, and hopefully you’ll get infected too. In 1999 I took a leave of absence from my PhD to start VivaSmart, which I sold to Yahoo in 2000. I stayed with Yahoo until mid-2008 and finished my PhD in mid-2007 with Mendel Rosenblum. I started Cloudera in the fall of 2008.
4. The VivaSmart Story It started as Booksmart in the spring of 1999. The initial prototype was built by Thai Tran. We got funded by a great angel (Frank Marshall). We couldn’t raise VC money, but we were able to raise more angel money and got lighthouse customers. We noticed that it was hard to drive traffic, so we decided to focus on catalog management technology (Aptivia). Got an initial acquisition term sheet from Excite@Home for $12M, but they reneged at the last minute (4/2000). Yahoo Shopping acquired us for $9M in June 2000.
5. The EIR Experience EIR = Entrepreneur in Residence. Joined Accel Partners in June 2008 as an EIR. Spent most of the summer researching possible ideas for my next venture, and also helped with due diligence for a number of companies. Experienced the fundraising process from the VC side, which was very useful for seeing how they think. Met my Cloudera co-founders through Accel. Andrew Braccia (agb) and Ping Li (pli) from Accel Partners joined the Cloudera Board of Directors.
6. The Cloudera Story … so far Oct 2008: Got $5M round A funding from Accel Partners and a number of strategic angel investors. Four founders (too many?): Mike Olson (Oracle), Jeff Hammerbacher (Facebook), Christophe Bisciglia (Google), Amr Awadallah (Yahoo). Announced the company in March of 2009. May 2009: Got $6M in funding from Greylock Ventures (opportunistic B round). Aneel Bhusri joined our board from Greylock.
7. Cloudera’s Elevator Pitch A single, consolidated repository to enable insights across complex and structured data. Complex Data: Documents, Web feeds, System logs, Online forums, SharePoint, Sensor data, EMB archives, Photo/Video. Structured Data (“relational”): CRM, Financials, Logistics, Inventory, Sales records, HR records.
8. What is Hadoop? The foundation of our system is built on top of Apache Hadoop, which is a scalable distributed data processing system. The scalability of Hadoop comes from the marriage of: HDFS: Self-Healing High-Bandwidth Clustered Storage. MapReduce: Fault-Tolerant Distributed Processing. The software manages and heals itself. It leverages the economies of scale of commodity hardware (multi-core chips, many disks per system). Compute moves to the data (not the other way around).
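To make the MapReduce idea on this slide concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps applied to a word count. It is not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration, and everything runs in memory on one machine, whereas Hadoop distributes the same three steps across a cluster and reads its input from HDFS.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence (illustrative, in-memory only)."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["compute moves to data", "data stays on disks"]))
```

Because each call to `map_phase` looks at only one line and each call to `reduce_phase` looks at only one key, the work can be split across many machines and rerun on another machine if one fails, which is roughly where the fault tolerance mentioned on the slide comes from.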
9. Hadoop History 2002-2004: Doug Cutting and Mike Cafarella started working on Nutch; 2003-2004: Google publishes GFS and MapReduce papers; 2004: Cutting adds DFS & MapReduce support to Nutch; 2006: Yahoo! hires Cutting, Hadoop spins out of Nutch; 2007: NY Times converts 4 TB of archives over 100 EC2s; 2008: Web-scale deployments at Y!, Facebook, Last.fm; April 2008: Yahoo does fastest sort of a TB, 3.5 mins over 910 nodes; May 2009: Yahoo does fastest sort of a TB, 62 secs over 1460 nodes, and sorts a PB in 16.25 hours over 3658 nodes; June 2009, Oct 2009: Hadoop Summit (750), Hadoop World (500); September 2009: Doug Cutting joins Cloudera
10. Open Source Software Business Models Open Source is attractive since it gets you: Free Distribution: people can download and try it out. Darwinian Effect: lots of developers try to solve the problem, best solution wins. Faster Innovation: customers build the product with you! OSS Business Models: Support/Maintenance/Service agreements. Open Core: core is free, but there is value-add proprietary technology around it (“Community” vs “Enterprise” Edition). Monetization through enablement of other services (e.g. Firefox makes money from Google Search).
11. Lessons Learned & Advice Make sure your idea can actually make money! Hire great people (corollary: fire swiftly). Make sure you are passionate about your idea. Listen to customers, but look for the problems; it is your job to come up with solutions. Be agile, iterate quickly, don’t spend a year planning, don’t be afraid to make mistakes. Don’t be afraid to fail, but don’t persist in your failing ways; learn from failure quickly and evolve (Moore). Have faith, but don’t let it blind you to reality.
12. Books I Recommend “Blue Ocean Strategy”, W. Chan Kim, Renée Mauborgne. “The Innovator’s Dilemma”, Clayton Christensen. “The Innovator’s Solution”, Clayton Christensen and Michael Raynor. “Good to Great”, Jim Collins. “The Seven Habits of Highly Effective People”, Stephen Covey. “Crossing the Chasm”, “Tornado”, Geoffrey Moore. “The Black Swan”, Nassim Taleb.
13. Contact Information We Are Hiring: jobs+ee203@cloudera.com Amr Awadallah, CTO, Cloudera Inc. http://twitter.com/awadallah Online Training Videos and Info: http://cloudera.com/hadoop-training http://cloudera.com/blog http://twitter.com/cloudera
Editor's Notes
http://developer.yahoo.net/blogs/hadoop/2009/05/hadoop_sorts_a_petabyte_in_162.html 100s of deployments worldwide (http://wiki.apache.org/hadoop/PoweredBy)