Infochimps #1 Big Data Platform for the Cloud



The Infochimps Platform is the simplest, fastest, and most flexible way to implement proven big data infrastructure in the cloud. Scalably and affordably ingest data from wherever you need — your in-house systems, external data feeds, data from the web, or our Data Marketplace. Make it useful with in-stream data decoration and augmentation. Store and analyze it in the best place for your application. Hadoop, NoSQL, real-time analytics — how do you tie it all together? The Infochimps Platform takes the mystery and difficulty out of big data and seamlessly integrates it with your existing environment, so you can focus on gaining business insights from your data fast.



  1. The Infochimps Big Data Cloud: Faster and Smarter Decision-Making. 30 days from critical business problems to impactful insight. Our managed Big Data Platform-as-a-Service cloud, with proven application developer tools and infrastructure, removes risk, accelerates deployment, and streamlines your Big Data projects, enabling you to quickly start gaining insights, then scale to more data and use cases as you go.
  2. Key Benefits
     • Fast: It only takes a few hours to deploy a complete solution to a public cloud or your private enterprise cloud. This means you can achieve immediate insights without sacrificing custom development ability.
     • Simple: It shouldn't take a rocket scientist to tap into the insights Big Data can provide. We've created analytic services and application developer frameworks that make interacting with Big Data systems much easier by letting you use languages already familiar to you.
     • Flexible: Our comprehensive architecture means you can combine real-time, ad-hoc, and batch analytics depending on your application needs. You can also start your system at the size that's right for you, and grow it over time to additional data and use cases as your business evolves.
     • Enterprise Ready: We reduce risk with the stability of our managed platform, our firm stance on data security, and our compatibility with many public, private, and hybrid cloud environments.
     Graphic: Critical Business Problems + big data cloud = Impactful Business Insights
  3. Big Data Drivers
     • The proliferation of data capture and creation technologies
     • Increased "interconnectedness" drives consumption (creating more data)
     • Inexpensive storage makes it possible to keep more, longer
     • Innovative software and analysis tools turn data into information
     • Every gigabyte of stored content can generate a petabyte or more of transient data (Source: IDC 2011)
     • The information about you is much greater than the information you create
     Big Data encompasses not only the content itself, but how it's analyzed and consumed.
     Graphic labels: More Content, More Devices, More Consumption, New & Better Information
  4. Our Customers & Use Cases
     • Customer Segmentation: Cisco is processing hundreds of terabytes of weblog data to segment customers downloading software from their support portal by product, geography, and industry.
     • Social Media Listening: Infomart built a brand new social media listening platform that consumes hundreds of millions of messages from a variety of social networks in real time, adds custom influence and authority scores, and provides a simple front-end on top of Elasticsearch's powerful API.
     • Mission Critical Data Pipeline: Spongecell's ad network produces more than 10,000 events per second, and lost data means lost revenue. They built a robust, loss-free, high-volume data pipeline that processes all their events, so they never have to worry about lost data.
     • Retail Analytics: Koupon helps their large retail customers run marketing campaigns around mobile coupons. They collect data from mobile devices and add context around demographics and geolocation to give those retailers in-depth insight about their own customers.
  5. Big Data Cloud Services: Overview
     • Data Integration and Real-Time Analytics
     • Ad-Hoc Query and Near-Real-Time Analytics
     • Batch Analytics
  6. Big Data Cloud Services: Data Flow (diagram)
  7. Social Media Listening Platform
     Analytics:
     • Sentiment Analysis
     • Authority Scoring
     • Influencer Ranking
     • Gender Classifier
     Diagram label: Application
  8. Ironfan™: Foundation for Your Big Data Services
     Ironfan is a systems provisioning, deployment, and updating tool. Ironfan automates not only machine configuration, but entire systems configuration to enable the complete Big Data stack, including data integration, routing, storage, computation, monitoring, and more.
     1. Cycle time goes from weeks to minutes
     2. Service discovery means your machines auto-wire themselves together
     3. Infrastructure-as-Code provides a simple, iterative, testable contract for how your system will function
     4. Leverages a combination of proprietary and open source code, including Chef and Fog
  9. Data Delivery Service™: Data Integration & Real-Time Analytics
     Data Delivery Service™ (DDS) integrates seamlessly with your existing environment, provides highly scalable ETL (extract-transform-load) capabilities, and enables real-time, streaming data analytics.
     DDS™ gives you scalability & flexibility:
     • Tap into virtually any data source (internal, external)
     • Real-Time Stream Processing (ingestion, analytics)
     • Make Well-Informed Business Decisions (on-the-fly queries)
  10. Database Management: Ad-Hoc Query & Analytics
     Whether it's HBase, Cassandra, Elasticsearch, MongoDB, MySQL, or others, we ensure the right data storage for the job is always right at your fingertips.
     Database management gives you peace of mind:
     • Databases and data storage, as a service. We are your outsourced Big Data database administrator (DBA), providing database maintenance, updates, and support.
     • Database agnostic: Amazon S3, HBase, Cassandra, Elasticsearch, MongoDB, MySQL, and many more
     • Deploy to your internal cloud or to a public cloud
  11. Cloud Hadoop: Batch Analytics
     Perform large-scale batch analysis as you need it, whether on ad-hoc Hadoop clusters or always-on production workflows. Access all the tools you need, with on-demand scaling and tuning.
     Cloud Hadoop gives you cloud elasticity & efficiency:
     • Turn clusters on at a moment's notice
     • Scale and customize on the fly
     • Leverage tools that make Hadoop easier: Wukong™, Pig, Hive
     • Leverage tools that extend Hadoop: Azkaban, Sqoop, and more
     Video: Hadoop Cluster in 20 Minutes
  12. Wukong™: Simplified Scripts for Analytics
     Wukong™ provides a simplified analytics scripting experience. Write your analytics in developer-friendly Ruby, run code locally for faster development cycles, and leverage existing analytics scripts.
     Wukong™ gives you superpowers:
     • Ruby for Big Data Analytics: You can use a familiar, fun programming language to do both Hadoop jobs and DDS™ algorithms.
     • Quickly Iterate: Rather than developing and testing everything on your production Hadoop and DDS™ clusters, you can develop scripts locally on your laptop.
     • Leverage Familiar Standard-In/Standard-Out Language: Wukong™ can leverage your existing standard-in/standard-out code with Big Data.
  13. Dashpot™: Reporting & Systems Management
     Dashpot™ is a lightweight analytics and operations dashboard for administrators & developers.
     Dashpot™ gives you visibility and control:
     • Real-time visualizations from streaming data
     • Deep visibility into individual machines and overall systems
     • Quickly start & stop functional units in your data clusters
  14. Platform API: Custom Applications and Dashboards
     With a unified API, control of the platform and visibility of the data within it are just a few web requests away.
     The Platform API gives you fine-grained control:
     • HTTP-based API with simple JSON commands
     • Access data through a simple, unified endpoint
     • Manage platform configuration settings
  15. Bringing Big Data Analytics To Your Enterprise (graphic labels: big data cloud, Data Analytics)
  16. Traditional vs. DIY vs. Infochimps
     • Traditional Data Infrastructure: 24-month project; $1M for 10TB; analyzing 15% of enterprise data
     • Big Data Infrastructure (DIY): 12-month project; $300K for 10TB; analyzing up to 100% of enterprise data
     • Big Data Cloud (Infochimps): 1-month project; $10K / month for 10TB; analyzing up to 100% of enterprise data + 15,000+ external sources
  17. Cloud Delivery: Data Center Infrastructure
     ‣ Lights Out Data Center
     ‣ Global Footprint
     ‣ Co-located with Data
     ‣ 99.95% to 99.995% SLA
  18. Cloud Delivery: Business Intelligence
     ‣ Visualize your data
     ‣ Business Reporting
     ‣ Application Integration
     ‣ Integrated with the Cloud
  19. Cloud Delivery: Professional Services
     ‣ Big Data Planning
     ‣ Data Modeling
     ‣ Analytics
     ‣ Architecture/Design
     ‣ Implementation
  20. Infochimps Engagement Model
     1. Identify first use case, create proposal, design workflows, and iterate on architecture locally
     2. Deploy initial design to development & staging cloud, iteratively add functionality
     3. Deploy to production public or private cloud and scale out
  21. Contact Information: Brian Krpec, Director of Sales, 512-709-4704 (cell), @bkrpec
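
A note on the Wukong slide: it says existing standard-in/standard-out code can be reused and that scripts can be developed locally before they run on a cluster. The sketch below illustrates that general stream-in, stream-out contract with a word count. It is written in plain Python and does not use Wukong's own Ruby API; the function names (`map_words`, `reduce_counts`, `run`) are made up for this illustration.

```python
import io
from collections import Counter


def map_words(lines):
    """Mapper step: emit one (word, 1) pair per word in the input lines."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reduce_counts(pairs):
    """Reducer step: sum the counts for each word."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return totals


def run(stream_in, stream_out):
    """Read text lines from stream_in, write tab-separated counts to stream_out."""
    totals = reduce_counts(map_words(stream_in))
    for word in sorted(totals):
        stream_out.write(f"{word}\t{totals[word]}\n")


# Local test with in-memory streams instead of a cluster.
demo_out = io.StringIO()
run(io.StringIO("big data\nbig cloud\n"), demo_out)
print(demo_out.getvalue(), end="")
```

Because `run` only depends on a readable and a writable stream, the same function could be called with `sys.stdin` and `sys.stdout` when the script is fed by a shell pipe or a streaming job.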
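
A note on the Platform API slide: it describes an HTTP-based API driven by simple JSON commands, but the deck does not document any URLs, paths, or command fields. The sketch below only shows what "a JSON command sent over HTTP" might look like in general. The base URL, resource path, command fields, and bearer-token header are all hypothetical placeholders, and the request is built but never sent.

```python
import json
import urllib.request

# Hypothetical base URL: the slides do not give a real endpoint.
BASE_URL = "https://platform.example.com/api"


def build_command_request(path, command, token):
    """Build (but do not send) a POST request carrying a JSON command."""
    body = json.dumps(command).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{path.lstrip('/')}",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )


request = build_command_request(
    "clusters/demo/settings",              # hypothetical resource path
    {"setting": "workers", "value": 4},    # hypothetical JSON command
    token="YOUR_TOKEN",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out here because the endpoint is invented.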
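
A note on the "Traditional vs. DIY vs. Infochimps" slide: it quotes $1M and $300K against $10K per month, all for 10TB. One rough way to read those figures is to ask how many months of the monthly fee would add up to each of the other two numbers. The sketch assumes the $1M and $300K figures are one-time project costs, which the slide does not state, and ignores any ongoing costs for those options.

```python
# Figures quoted on the comparison slide, all for 10 TB.
TRADITIONAL_COST = 1_000_000  # USD; assumed to be a one-time project cost
DIY_COST = 300_000            # USD; assumed to be a one-time project cost
CLOUD_MONTHLY = 10_000        # USD per month


def months_to_match(one_time_cost, monthly_fee):
    """Months of subscription fees needed to equal a one-time cost."""
    return one_time_cost / monthly_fee


def fees_over(months, monthly_fee):
    """Total subscription fees paid over a given number of months."""
    return months * monthly_fee


print(months_to_match(DIY_COST, CLOUD_MONTHLY))          # 30.0
print(months_to_match(TRADITIONAL_COST, CLOUD_MONTHLY))  # 100.0

# Fees accrued during the quoted 12- and 24-month project durations.
print(fees_over(12, CLOUD_MONTHLY))  # 120000
print(fees_over(24, CLOUD_MONTHLY))  # 240000
```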
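
A note on the data center slide: the quoted SLA range of 99.95% to 99.995% can be translated into permitted downtime per year. The sketch assumes a 365-day year and that the SLA is measured over a full year; the deck does not give the measurement window.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def allowed_downtime_hours(availability_percent):
    """Hours per year a service may be down at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


low = allowed_downtime_hours(99.95)
high = allowed_downtime_hours(99.995)

print(f"99.95%  -> {low:.2f} hours per year")          # 4.38 hours
print(f"99.995% -> {high * 60:.2f} minutes per year")  # 26.28 minutes
```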