The Infochimps Platform is the simplest, fastest, and most flexible way to implement proven big data infrastructure in the cloud. Scalably and affordably ingest data from wherever you need — your in-house systems, external data feeds, data from the web, or our Data Marketplace. Make it useful with in-stream data decoration and augmentation. Store and analyze it in the best place for your application. Hadoop, NoSQL, real-time analytics — how do you tie it all together? The Infochimps Platform takes the mystery and difficulty out of big data and seamlessly integrates it with your existing environment, so you can focus on gaining business insights from your data fast.
Infochimps #1 Big Data Platform for the Cloud
1. The Infochimps Big Data Cloud
Faster and Smarter Decision-Making
30 days from critical business problems to impactful insight. Our managed Big Data Platform-as-a-Service Cloud, with
proven application developer tools and infrastructure, removes risk, accelerates deployment, and streamlines your Big Data
projects, enabling you to quickly start gaining insights, then scale to more data and use cases as you go.
2. Key Benefits
Fast
It only takes a few hours to deploy a complete solution to a public cloud
or your private enterprise cloud. This means you can achieve immediate
insights without sacrificing custom development ability.
Simple
It shouldn't take a rocket scientist to tap into the insights Big Data can
provide. We've created analytic services and application developer
frameworks that make interacting with Big Data systems much easier by
letting you use languages already familiar to you.
Flexible
Our comprehensive architecture means you can combine real-time, ad-hoc,
and batch analytics depending on your application needs. You can
also start your system at the size that's right for you, and grow it over time
to additional data and use cases as your business evolves.
Enterprise Ready
We reduce risk with the stability of our managed platform, our firm stance
on data security, and our compatibility with many public, private, and
hybrid cloud environments.
[Diagram: Critical Business Problems + big data cloud = Impactful Business Insights]
3. Big Data Drivers
§ The proliferation of data capture and creation technologies
§ Increased "interconnectedness" drives consumption (creating more data)
§ Inexpensive storage makes it possible to keep more, longer
§ Innovative software and analysis tools turn data into information
Big Data encompasses not only the content itself, but how it's
analyzed and consumed.
§ Every gigabyte of stored content can generate a petabyte or more of transient data*
§ The information about you is much greater than the information you create
*Source: IDC 2011
[Diagram: More Content, More Devices, More Consumption, New & Better Information]
4. Our Customers & Use Cases
Customer Segmentation
Cisco is processing 100s of terabytes of weblog data to segment customers
downloading software from their support portal by product, geography, and
industry.
Social Media Listening
Infomart built a brand new social media listening platform consuming 100s of millions of
messages from a variety of social networks in real-time, adding custom influence and
authority scores, and building a simple front-end on top of Elasticsearch’s powerful
API.
Mission Critical Data Pipeline
Spongecell's ad network produces more than 10,000 events per second, and lost data
means lost revenue. They built a robust, loss-free, high-volume data pipeline that
processes all their events, so they never have to worry about losing data again.
Retail Analytics
Koupon helps its large retail customers run marketing campaigns around mobile coupons.
It collects data from mobile devices and adds demographic and
geolocation context to give those retailers in-depth insight into their own customers.
5. Big Data Cloud Services: Overview
Data Integration and Real-Time Analytics
Ad-Hoc Query and Near-Real-Time Analytics
Batch Analytics
7. Social Media Listening Platform
Analytics
• Sentiment Analysis
• Authority Scoring
• Influencer Ranking
• Gender Classifier
Application
8. Ironfan™
Foundation for Your Big Data Services
Ironfan is a systems provisioning, deployment, and
updating tool. Ironfan automates not only machine
configuration, but entire systems configuration to
enable the complete Big Data stack, including data
integration, routing, storage, computation, monitoring,
and more.
1. Cycle time goes from weeks to minutes
2. Service discovery means your machines auto-wire
themselves together
3. Infrastructure-as-Code provides a simple,
iterative, testable contract for how your system
will function
4. Leverages a combination of proprietary and open
source code, including Chef and Fog
9. Data Delivery Service™
Data Integration & Real-Time Analytics
Data Delivery Service™ (DDS) integrates seamlessly with your
existing environment, provides highly scalable ETL
(extract-transform-load) capabilities, and enables real-time, streaming
data analytics.
DDS™ gives you scalability & flexibility
§ Tap into virtually any data source
  § Internal
  § External
§ Real-Time Stream Processing
  § Ingestion
  § Analytics
§ Make Well-Informed Business Decisions
  § On-the-fly queries
10. Database Management
Ad-Hoc Query & Analytics
Whether it's HBase, Cassandra, Elasticsearch, MongoDB,
MySQL, or others, we ensure the right data storage for the job
is always right at your fingertips.
Database management gives you peace of mind
§ Databases and data storage, as a service. We are your
outsourced Big Data database administrator (DBA), providing:
  § Database maintenance
  § Updates
  § Support
§ Database Agnostic
  § Amazon S3
  § HBase
  § Cassandra
  § Elasticsearch
  § MongoDB
  § MySQL
  § + Many More
§ Deploy to your internal cloud or to a public cloud
11. Cloud Hadoop
Batch Analytics
Perform large-scale batch analysis as you need it, whether
ad-hoc Hadoop clusters or always-on production
workflows. Access all the tools you need, with on-demand
scaling and tuning.
Cloud Hadoop gives you cloud elasticity &
efficiency
§ Turn clusters on at a moment's notice
§ Scale and customize on the fly
§ Leverage tools that make Hadoop easier
  § Wukong™
  § Pig
  § Hive
§ Leverage tools that extend Hadoop
  § Azkaban
  § Sqoop
  § + more
Video: Hadoop Cluster in 20 Minutes
12. Wukong™
Simplified Scripts for Analytics
Wukong™ provides a simplified analytics scripting experience.
Write your analytics in developer-friendly Ruby, run code locally
for faster development cycles, and leverage existing analytics
scripts.
Wukong™ gives you superpowers
§ Ruby for Big Data Analytics - That means you can use a familiar,
fun programming language to do both Hadoop jobs and DDS™
algorithms.
§ Quickly Iterate - Rather than developing and testing everything on
your production Hadoop and DDS™ clusters, you can develop scripts
locally on your laptop.
§ Leverage Familiar Standard-In/Standard-Out Language -
Wukong™ can leverage your existing standard-in/standard-out code
with Big Data.
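The standard-in/standard-out idea above can be illustrated with a small sketch. This is not Wukong code and does not use Wukong's Ruby API; it is a generic Python word count in the streaming style, where a map step and a reduce step exchange tab-separated text records. The function names (`map_lines`, `reduce_records`) and the sample input are hypothetical, chosen only for illustration.

```python
from itertools import groupby


def map_lines(lines):
    """Map step: emit one tab-separated 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_records(records):
    """Reduce step: sum the counts per word. Records must be sorted by word."""
    parsed = (record.rstrip("\n").split("\t") for record in records)
    for word, group in groupby(parsed, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# Local run on an in-memory sample (hypothetical data). On a cluster, the
# same two steps would read from standard input and write to standard output,
# with a sort between them.
sample = ["Big data big insights", "data pipeline"]
mapped = sorted(map_lines(sample))
counts = list(reduce_records(mapped))
print(counts)
```

Because both steps only consume and produce lines of text, they can be tested on a laptop with a small list, as here, and later be fed `sys.stdin` without changes to the logic.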
13. Dashpot™
Reporting & Systems Management
Dashpot™ is a lightweight analytics and operations
dashboard for administrators & developers
Dashpot™ gives you visibility and control
§ Real-Time visualizations from streaming data
§ Deep Visibility
  § Individual Machines
  § Overall Systems
§ Quickly Start & Stop functional units in your data
clusters
14. Platform API
Custom Applications and Dashboards
With a unified API, control of the platform and visibility of the data
within it are just a few web requests away.
The Platform API gives you fine-grained control
§ HTTP-based API
§ Simple JSON commands
§ Access data through a simple, unified endpoint
§ Manage Platform Configuration Settings
15. big data cloud
Bringing Big Data Analytics
To Your Enterprise Data
Analytics
16. Traditional vs. DIY vs. Infochimps
Traditional Data Infrastructure
• 24 Month Project
• $1M for 10TB
• Analyzing 15% of Enterprise Data
Big Data Infrastructure
• 12 Month Project
• $300K for 10TB
• Analyzing up to 100% of Enterprise Data
Big Data Cloud
• 1 Month Project
• $10K / month for 10TB
• Analyzing up to 100% of Enterprise Data + 15,000+
external sources
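To put the three price points on a common footing, the short sketch below computes how many months of the $10K / month cloud price add up to each of the other two figures. It assumes the $1M and $300K figures are one-time totals with no recurring costs, which the slide does not state, so treat the result as rough arithmetic and not as a full cost model.

```python
# Figures taken from the comparison above (all for 10TB).
MONTHLY_CLOUD_COST = 10_000
# Assumption: these are one-time totals with no ongoing costs.
ONE_TIME_COSTS = {
    "Traditional Data Infrastructure": 1_000_000,
    "Big Data Infrastructure": 300_000,
}


def months_to_match(one_time_cost, monthly_cost=MONTHLY_CLOUD_COST):
    """Months of monthly payments needed to equal a one-time cost."""
    return one_time_cost / monthly_cost


for name, cost in ONE_TIME_COSTS.items():
    print(f"{name}: {months_to_match(cost):.0f} months")
```

Under that assumption, the monthly price matches the $300K figure after 30 months and the $1M figure after 100 months.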
17. Cloud Delivery
Data Center Infrastructure
‣ Lights Out Data Center
‣ Global Footprint
‣ Co-located with Data
‣ 99.95% to 99.995% SLA
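As a rough guide to what the SLA range means in practice, the sketch below converts an availability percentage into the maximum downtime it allows. It assumes availability is measured over a 365-day year and that every minute of downtime counts; the slide does not define the measurement window, so the real terms may differ.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: 365-day measurement window


def max_downtime_minutes(sla_percent, period_minutes=MINUTES_PER_YEAR):
    """Maximum downtime, in minutes, that an availability SLA allows."""
    return (100 - sla_percent) / 100 * period_minutes


for sla in (99.95, 99.995):
    print(f"{sla}% availability: up to {max_downtime_minutes(sla):.1f} minutes per year")
```

Under these assumptions, 99.95% allows roughly 263 minutes (about 4.4 hours) of downtime per year, while 99.995% allows roughly 26 minutes.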
19. Cloud Delivery
Professional Services
‣ Big Data Planning
‣ Data Modeling
‣ Analytics
‣ Architecture/Design
‣ Implementation
20. Infochimps Engagement Model
1. Identify first use case, create proposal, design workflows, and
iterate on architecture locally
2. Deploy initial design to development & staging cloud,
iteratively add functionality
3. Deploy to production public or private cloud
and scale out
21. Contact Information
Brian Krpec
Director of Sales
512-709-4704 cell
brian.krpec@infochimps.com
@bkrpec