We do Hadoop.

Using Tableau Software with
Hortonworks Data Platform
September 2013

© 2013 Hortonworks Inc.
http://www.hortonworks.com

About Hortonworks
Hortonworks is a leading commercial vendor of Apache Hadoop, the preeminent open source platform for storing, managing and analyzing big data. Our distribution,
Hortonworks Data Platform powered by Apache Hadoop, provides an open and stable foundation for enterprises and a growing ecosystem to build and deploy big
data solutions. Hortonworks is the trusted source for information on Hadoop, and together with the Apache community, Hortonworks is making Hadoop more robust
and easier to install, manage and use. Hortonworks provides unmatched technical support, training and certification programs for enterprises, systems integrators
and technology vendors.
3460 West Bayshore Rd.
Palo Alto, CA 94303 USA
US: 1.855.846.7866
International: 1.408.916.4121
www.hortonworks.com
Twitter: twitter.com/hortonworks
Facebook: facebook.com/hortonworks
LinkedIn: linkedin.com/company/hortonworks

Modern businesses need to manage vast amounts of data, and in many cases they
have accumulated this data for years. Many enterprises have built large-scale
environments for transactional data with analytic databases, but are now inundated
with new types of data, such as social media data, server logs, clickstream data, web
logs, machine/sensor data, and geolocation data. These new data sources all share
the common Big Data characteristics of volume (size), velocity (speed), and variety
(type), and have sometimes been thought of as low value, or even as “exhaust data”:
too expensive to store and analyze.

It is these types of data that are turning the conversation from “data analytics” to
“big data analytics.” With Hortonworks, businesses are learning to see these types of
data as inexpensive, accessible sources of insight and competitive advantage.

The Hortonworks Data Platform allows you to store, process, and manage data at
scale. It is designed to integrate with and extend existing data applications. With
Hortonworks, enterprises can retain and process more data, join new and existing
data sets, and lower the cost of data analysis.

Tableau can be used with Hortonworks to explore this expanded data set. Tableau
can access the data in the Hortonworks Data Platform, visualize that data, and
provide valuable insights for your business. Tableau can also combine the data in
the Hortonworks Data Platform with data in traditional analytics databases to
create a blended view of multiple data sources.

The combined capabilities of Hortonworks and Tableau make Big Data less
expensive, more accessible, and easier to understand and use for business
advantage.

In the following sections, we will show you:

• The main features of the Hortonworks Data Platform and Tableau.
• Where Tableau fits in with the Hortonworks Data Platform as part of a
  modern data architecture.
• How you can use Tableau with Hortonworks for data exploration and
  visualization.

The Hortonworks Data Platform

The Hortonworks Data Platform (HDP) is an enterprise-grade, hardened Apache
Hadoop distribution that enables you to store, process, and manage large data sets.

Apache Hadoop is an open-source software framework that allows for the
distributed processing of large data sets across clusters of computers using simple
programming models. It is designed for high-availability and fault-tolerance, and
can scale from a single server up to thousands of machines.

The Hortonworks Data Platform combines the most useful and stable versions of
Apache Hadoop and its related projects into a single tested and certified package.
Hortonworks offers the latest innovations from the open source community, along
with the testing and quality you expect from enterprise-quality software.

The Hortonworks Data Platform is designed to integrate with and extend the
capabilities of your existing investments in data applications, tools, and processes.
With Hortonworks, you can refine, analyze, and gain business insights from both
structured and unstructured data – quickly, easily, and economically.

Hortonworks Data Platform: Key Features and Benefits

With the Hortonworks Data Platform, enterprises can retain and process more data,
join new and existing data sets, and lower the cost of data analysis. Hortonworks
enables enterprises to implement the following data management principles:

• Retain as much data as possible. Traditional data warehouses age, and
  over time will eventually store only summary data. Analyzing detailed
  records is often critical to uncovering useful business insights.

• Join new and existing data sets. Enterprises can build large-scale
  environments for transactional data with analytic databases, but these
  solutions are not always well suited to processing nontraditional data sets
  such as text, images, machine data, and online data. Hortonworks enables
  enterprises to incorporate both structured and unstructured data in one
  comprehensive data management system.

• Archive data at low cost. It is not always clear what portion of stored data
  will be of value for future analysis. Therefore, it can be difficult to justify
  expensive processes to capture, cleanse, and store that data. Hadoop scales
  easily, so you can store years of data without much incremental cost, and find
  deeper patterns that your competitors may miss.

• Access all data efficiently. Data needs to be readily accessible. Apache
  Hadoop clusters can provide a low-cost solution for storing massive data sets
  while still making the information readily available. Hadoop is designed to
  efficiently scan all of the data, which is complementary to databases that are
  efficient at finding subsets of data.

• Apply basic data cleansing and data cataloging. Categorize and label all
  data in Hadoop with enough descriptive information (metadata) to make
  sense of it later, and to enable integration with transactional databases and
  analytic tools. This greatly reduces the time and effort required to integrate
  with other data sets, and avoids a scenario in which valuable data is
  eventually rendered useless.

• Integrate with existing platforms and applications. Hortonworks
  connects seamlessly with many leading analytic, data integration, and
  database management tools.
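
The contrast drawn in the list above, between scanning all of the data and finding subsets of it, can be illustrated with a small Python sketch. It is only an analogy and not a description of how Hadoop or any database is implemented: a plain list stands in for data that is read in full, and a dictionary keyed by customer stands in for a database index. The record layout and the names `full_scan_total` and `build_index` are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical detail records: (customer_id, amount).
RECORDS = [
    ("c1", 20.0),
    ("c2", 5.5),
    ("c1", 12.5),
    ("c3", 40.0),
]


def full_scan_total(records):
    """Touch every record once, as a scan-oriented system would."""
    return sum(amount for _customer, amount in records)


def build_index(records):
    """Group amounts by customer so one subset can be found directly."""
    index = defaultdict(list)
    for customer, amount in records:
        index[customer].append(amount)
    return index


if __name__ == "__main__":
    print(full_scan_total(RECORDS))    # total over all records
    print(build_index(RECORDS)["c1"])  # only the records for one customer
```

The two functions answer different kinds of questions, which is the sense in which the two approaches complement each other.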
  	
  
	
  
Tableau

Tableau is a data analysis tool that can be used for data exploration and
visualization. Tableau is designed to support people’s natural tendency to think
visually. Rather than requiring you to type data into forms or click through wizards,
Tableau features an intuitive drag-and-drop interface. You can connect to data in a
few clicks, then visualize and create interactive dashboards with a few more.

Traditional business intelligence (BI) platforms have required users to build
elaborate “universes,” “cubes,” or “temporary tables” before any real work can be
done. Tableau eliminates those steps completely. There’s no requirement to pull
data into a silo – you work directly from your database.

Tableau includes a “Show Me” feature – a visualization best practices engine – that
enables you to easily view your data using different visualizations, such as graphs,
bar and pie charts, and map-based data representations. Tableau also enables you to
share your visualizations on a secure server with colleagues, customers, and
partners.

With Tableau you can connect directly to databases, cubes, data warehouses, files,
spreadsheets, and Hadoop. Your connection is live, so you see up-to-the-minute
data. It takes only a few clicks, and no programming is required. In minutes you’ll be
accessing data, consolidating numbers, and visualizing results without advance
set-up. Tableau is true ad-hoc business analytics.

Reference Architecture
Traditional Enterprise Data Architecture

Today, nearly every enterprise has some sort of database management system
already in place. Generally, these environments are structured as follows:

• Data comes from a set of data sources – most typically from enterprise
  applications such as Enterprise Resource Planning (ERP), Customer
  Relationship Management (CRM), and any custom applications used to
  gather data.

• That data is extracted, transformed, and loaded into a data system such as a
  Relational Database Management System (RDBMS), an Enterprise Data
  Warehouse (EDW), or even a Massively Parallel Processing (MPP) system.

• A set of analytical applications – either packaged (e.g. Tableau) or custom –
  then accesses the data in those systems to enable users to garner insights from
  the data.

Figure 1: Traditional Database Architecture
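
The extract, transform, and load flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular product: the CSV export, the `orders` table, and the column names are hypothetical, and an in-memory SQLite database stands in for the RDBMS, EDW, or MPP system.

```python
import csv
import io
import sqlite3

# Hypothetical export from an enterprise application (for example, a CRM).
SOURCE_CSV = """order_id,customer,amount
1, alice ,19.99
2,BOB,5.00
3,alice,12.50
"""


def extract(text):
    """Read raw rows from the source export."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Normalize customer names and convert amounts to numbers."""
    return [
        (int(r["order_id"]), r["customer"].strip().lower(), float(r["amount"]))
        for r in rows
    ]


def load(conn, rows):
    """Write the cleaned rows into the target data system."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def total_by_customer(conn):
    """The kind of query an analytical application might issue."""
    cur = conn.execute(
        "SELECT customer, ROUND(SUM(amount), 2) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(SOURCE_CSV)))
    print(total_by_customer(conn))
```

The three functions map onto the three bullets: sources feed `extract`, the data system receives the output of `transform` through `load`, and `total_by_customer` plays the role of the analytical application.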
Modern Data Architecture
In addition to traditional transactional data in analytic databases, enterprises now
also need to gather, process, and analyze new unstructured data sets that are
growing exponentially.

This new information can include text, images, machine-generated data, and online
data from social media. It also includes data, such as log files, that was once thought
of as having relatively little value: too expensive to store and analyze. These new
types of data are turning the focus from “data analytics” to “big data analytics”
because so much insight can be gleaned from these new data sources for business
advantage.

The Hortonworks Data Platform is increasingly being introduced into enterprise
environments to manage the massive amounts of these new types of data – as well
as existing data – in an efficient and cost-effective manner.

Figure 2: Modern Database Architecture
The Hortonworks Data Platform does not replace traditional data systems used for
building analytic applications – the RDBMS, EDW and MPP systems – but is instead
designed to integrate with and extend these systems.

By providing a framework to capture, store, and process vast quantities of both
structured and unstructured data in a cost-efficient and highly scalable manner, the
Hortonworks platform is driving the creation of a new generation of enterprise
database systems.

Tableau and the Hortonworks Data Platform
Tableau can be used with Hortonworks to explore your expanded data set. Tableau
can directly access the data in the Hortonworks Data Platform, as well as the data in
traditional analytic databases, and can combine them in a single view using a
capability known as “data blending.” Tableau can then explore and visualize the
blended data, providing valuable business insights.

Figure 3: Tableau and Hortonworks
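
The idea of combining two sources in a single view can be sketched in Python. This is a rough analogy under stated assumptions, not Tableau's implementation: each source is aggregated separately on a shared field and the results are then combined, keeping every key from the primary source. The sample rows, the choice of `state` as the shared field, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical clickstream rows as they might come from Hadoop: (state, visits).
HADOOP_VISITS = [("FL", 120), ("FL", 80), ("CA", 300), ("NY", 150)]

# Hypothetical sales rows from a traditional analytic database: (state, revenue).
WAREHOUSE_SALES = [("FL", 1000.0), ("CA", 2500.0), ("CA", 500.0)]


def aggregate(rows):
    """Sum the measure for each value of the shared field (here, state)."""
    totals = defaultdict(float)
    for key, value in rows:
        totals[key] += value
    return dict(totals)


def blend(primary_rows, secondary_rows):
    """Aggregate each source separately, then combine on the shared field.

    Every key from the primary source is kept; a key missing from the
    secondary source gets None for the secondary measure.
    """
    primary = aggregate(primary_rows)
    secondary = aggregate(secondary_rows)
    return {
        key: (primary[key], secondary.get(key))
        for key in sorted(primary)
    }


if __name__ == "__main__":
    blended = blend(HADOOP_VISITS, WAREHOUSE_SALES)
    for state, (visits, revenue) in blended.items():
        print(state, visits, revenue)
```

Aggregating before combining keeps the two sources independent, so neither has to be copied into the other system first.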
	
  
	
   	
  
Use Cases
Enterprises can combine Tableau with the Hortonworks Data Platform for the
following use cases:

• Data Exploration
• Data Visualization

Data Exploration

In the Data Exploration use case, organizations are capturing and storing a large
quantity of new data (sometimes referred to as a data lake) in Hadoop, and then
exploring that data directly.

Data Exploration can be used to explore information that was previously ignored
(social media data, server logs, clickstream data, web logs, machine/sensor data,
and geolocation data), generate reports and visualizations from that data, and use
new or existing analytic applications to leverage these new types of data.

Figure 4: Data Exploration with Tableau and HDP
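
As a small illustration of what exploring previously ignored data can mean in practice, the following Python sketch parses a few raw web server log lines and counts requests per path. The log lines, their Common Log Format style layout, and the function names are assumptions made for this example; real logs vary and would normally be processed at scale inside Hadoop rather than in a single script.

```python
import re
from collections import Counter

# Hypothetical web server log lines in a Common Log Format style.
LOG_LINES = [
    '10.0.0.1 - - [12/Sep/2013:10:00:01 +0000] "GET /shoes HTTP/1.1" 200 512',
    '10.0.0.2 - - [12/Sep/2013:10:00:02 +0000] "GET /clothing HTTP/1.1" 200 734',
    '10.0.0.1 - - [12/Sep/2013:10:00:05 +0000] "GET /clothing HTTP/1.1" 404 0',
    'not a log line',
]

LINE_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+)$'
)


def parse(lines):
    """Turn raw lines into dictionaries, skipping lines that do not match."""
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match:
            yield match.groupdict()


def hits_by_path(lines):
    """Count requests per path: a first, simple question to ask of raw logs."""
    return Counter(record["path"] for record in parse(lines))


if __name__ == "__main__":
    for path, hits in hits_by_path(LOG_LINES).most_common():
        print(path, hits)
```

Malformed lines are skipped rather than rejected, which suits exploratory work where the data has not yet been cleansed.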
	
  
	
   	
  
Data Visualization

Traditionally, gaining insights from a set of data has meant writing SQL queries to
extract information from a database – often requiring the assistance of a
programmer – and then working with spreadsheets to derive insights from tables of
data.

Data visualization leverages people’s natural tendency to think visually. It’s much
easier for people to understand data when they see it visually represented. It’s much
more difficult for people to try to extract meaning by looking at a table of data.

To illustrate this, let’s use Tableau to visualize some sample clickstream data from
an online retail store. Let’s take a look at website visits by product category in the
state of Florida.

With Tableau and Hortonworks, you can connect to the data directly and visualize
the latest data. With just a few clicks in Tableau, you end up with the following
visualization of the retail store data:

Figure 5: Sample Retail Store Data in Tableau

Here we can instantly see the product category details for each state by moving the
pointer over the pie charts. At a glance we can see that clothing is the largest
category in Florida, followed by shoes and handbags. With a few more clicks, we
could visualize that same data by age or gender, or change the view to a bar chart or
tree map.
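
For readers who want to see the aggregation behind such a pie chart, here is a minimal Python sketch of counting visits by product category for one state. It is an illustration only: the records are invented (chosen so that the ordering matches the one described above), and the function names are hypothetical. Tableau performs this kind of grouping for you without any code.

```python
from collections import Counter

# Hypothetical clickstream records: (state, product category).
VISITS = [
    ("FL", "clothing"), ("FL", "clothing"), ("FL", "clothing"),
    ("FL", "shoes"), ("FL", "shoes"),
    ("FL", "handbags"),
    ("CA", "shoes"), ("CA", "clothing"),
]


def visits_by_category(records, state):
    """Count visits per product category for one state."""
    return Counter(category for s, category in records if s == state)


def category_shares(counts):
    """Convert counts to fractions of the total, as a pie chart would show."""
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


if __name__ == "__main__":
    florida = visits_by_category(VISITS, "FL")
    shares = category_shares(florida)
    for category, n in florida.most_common():
        print(category, n, round(shares[category], 2))
```

Changing the grouping key from category to another field, such as age or gender, would be the code equivalent of the extra clicks mentioned above.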
  	
  
This combination of ease-of-use and broader access means that a business or
financial analyst no longer needs to wait for a database specialist in order to access
data. Tableau also enables you to share your interactive visualizations on a secure
server with colleagues, customers, and partners, providing them with the tools they
need to answer their own questions. It’s true democratization of data.

Getting Started with Hortonworks and Tableau

Here are a few links to help you get started with Hortonworks and Tableau:

• The Hortonworks Sandbox – This free download contains a stand-alone,
  single-node Hadoop environment, along with a set of hands-on, step-by-step
  tutorials.

• Tableau trial version – This page contains links to fully functional trial
  versions of Tableau Desktop, Tableau Server, and Tableau Online.

• Hortonworks Hive ODBC driver – The Hortonworks Add-Ons page contains
  links to the Hortonworks Hive ODBC driver. On Windows, Tableau requires
  the 32-bit version of the Hortonworks ODBC driver, even when running on
  64-bit versions of Windows.

• Best Practices for Hadoop Data Analysis with Tableau

More Related Content

What's hot

Hadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop SummitHadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop Summit
DataWorks Summit
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks
 
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
Hortonworks
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Hortonworks
 
Internet of things Crash Course Workshop
Internet of things Crash Course WorkshopInternet of things Crash Course Workshop
Internet of things Crash Course Workshop
DataWorks Summit
 
Design a Dataflow in 7 minutes with Apache NiFi/HDF
Design a Dataflow in 7 minutes with Apache NiFi/HDFDesign a Dataflow in 7 minutes with Apache NiFi/HDF
Design a Dataflow in 7 minutes with Apache NiFi/HDF
Hortonworks
 
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsPredicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Hortonworks
 
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3

Using Tableau with Hortonworks Data Platform

  • 1. We do Hadoop. Using Tableau Software with Hortonworks Data Platform. September 2013. © 2013 Hortonworks Inc. http://www.hortonworks.com
  • 2. We do Hadoop.

    About Hortonworks

    Hortonworks is a leading commercial vendor of Apache Hadoop, the preeminent open source platform for storing, managing and analyzing big data. Our distribution, Hortonworks Data Platform powered by Apache Hadoop, provides an open and stable foundation for enterprises and a growing ecosystem to build and deploy big data solutions. Hortonworks is the trusted source for information on Hadoop, and together with the Apache community, Hortonworks is making Hadoop more robust and easier to install, manage and use. Hortonworks provides unmatched technical support, training and certification programs for enterprises, systems integrators and technology vendors.

    3460 West Bayshore Rd. Palo Alto, CA 94303 USA
    US: 1.855.846.7866
    International: 1.408.916.4121
    www.hortonworks.com
    Twitter: twitter.com/hortonworks
    Facebook: facebook.com/hortonworks
    LinkedIn: linkedin.com/company/hortonworks

    Modern businesses need to manage vast amounts of data, and in many cases they have accumulated this data for years. Many enterprises have built large-scale environments for transactional data with analytic databases, but are now inundated with new types of data, such as social media data, server logs, clickstream data, web logs, machine/sensor data, and geolocation data. These new data sources all share the common Big Data characteristics of volume (size), velocity (speed), and variety (type), and have sometimes been thought of as low value, or even as “exhaust data”: too expensive to store and analyze.

    It is these types of data that are turning the conversation from “data analytics” to “big data analytics.” With Hortonworks, businesses are learning to see these types of data as inexpensive, accessible sources of insight and competitive advantage.

    The Hortonworks Data Platform allows you to store, process, and manage data at scale. It is designed to integrate with and extend existing data applications. With Hortonworks, enterprises can retain and process more data, join new and existing data sets, and lower the cost of data analysis.

    Tableau can be used with Hortonworks to explore this expanded data set. Tableau can access the data in the Hortonworks Data Platform, visualize that data, and provide valuable insights for your business. Tableau can also combine the data in the Hortonworks Data Platform with data in traditional analytics databases to create a blended view of multiple data sources.

    The combined capabilities of Hortonworks and Tableau make Big Data less expensive, more accessible, and easier to understand and use for business advantage.

    In the following sections, we will show you:
    • The main features of the Hortonworks Data Platform and Tableau.
    • Where Tableau fits in with the Hortonworks Data Platform as part of a modern data architecture.
    • How you can use Tableau with Hortonworks for data exploration and visualization.
  • 3. The Hortonworks Data Platform

    The Hortonworks Data Platform (HDP) is an enterprise-grade, hardened Apache Hadoop distribution that enables you to store, process, and manage large data sets.

    Apache Hadoop is an open-source software framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed for high availability and fault tolerance, and can scale from a single server up to thousands of machines.

    The Hortonworks Data Platform combines the most useful and stable versions of Apache Hadoop and its related projects into a single tested and certified package. Hortonworks offers the latest innovations from the open source community, along with the testing and quality you expect from enterprise-quality software.

    The Hortonworks Data Platform is designed to integrate with and extend the capabilities of your existing investments in data applications, tools, and processes. With Hortonworks, you can refine, analyze, and gain business insights from both structured and unstructured data quickly, easily, and economically.

    Hortonworks Data Platform: Key Features and Benefits

    With the Hortonworks Data Platform, enterprises can retain and process more data, join new and existing data sets, and lower the cost of data analysis. Hortonworks enables enterprises to implement the following data management principles:
    • Retain as much data as possible. Traditional data warehouses age, and over time will eventually store only summary data. Analyzing detailed records is often critical to uncovering useful business insights.
    • Join new and existing data sets. Enterprises can build large-scale environments for transactional data with analytic databases, but these solutions are not always well suited to processing nontraditional data sets such as text, images, machine data, and online data. Hortonworks enables enterprises to incorporate both structured and unstructured data in one comprehensive data management system.
  • 4. (Key Features and Benefits, continued)
    • Archive data at low cost. It is not always clear what portion of stored data will be of value for future analysis. Therefore, it can be difficult to justify expensive processes to capture, cleanse, and store that data. Hadoop scales easily, so you can store years of data without much incremental cost, and find deeper patterns that your competitors may miss.
    • Access all data efficiently. Data needs to be readily accessible. Apache Hadoop clusters can provide a low-cost solution for storing massive data sets while still making the information readily available. Hadoop is designed to efficiently scan all of the data, which is complementary to databases that are efficient at finding subsets of data.
    • Apply basic data cleansing and data cataloging. Categorize and label all data in Hadoop with enough descriptive information (metadata) to make sense of it later, and to enable integration with transactional databases and analytic tools. This greatly reduces the time and effort required to integrate with other data sets, and avoids a scenario in which valuable data is eventually rendered useless.
    • Integrate with existing platforms and applications. Hortonworks connects seamlessly with many leading analytic, data integration, and database management tools.

    Tableau

    Tableau is a data analysis tool that can be used for data exploration and visualization. Tableau is designed to support people’s natural tendency to think visually. Rather than requiring you to type data into forms or click through wizards, Tableau features an intuitive drag-and-drop interface. You can connect to data in a few clicks, then visualize and create interactive dashboards with a few more.

    Traditional business intelligence (BI) platforms have required users to build elaborate “universes,” “cubes,” or “temporary tables” before any real work can be done. Tableau eliminates those steps completely. There’s no requirement to pull data into a silo; you work directly from your database.
  • 5. Tableau includes a “Show Me” feature (a visualization best-practices engine) that enables you to easily view your data using different visualizations, such as graphs, bar and pie charts, and map-based data representations. Tableau also enables you to share your visualizations on a secure server with colleagues, customers, and partners.

    With Tableau you can connect directly to databases, cubes, data warehouses, files, spreadsheets, and Hadoop. Your connection is live, so you see up-to-the-minute data. It takes only a few clicks, and no programming is required. In minutes you’ll be accessing data, consolidating numbers, and visualizing results without advance setup. Tableau is true ad-hoc business analytics.

    Reference Architecture

    Traditional Enterprise Data Architecture

    Today, nearly every enterprise has some sort of database management system in place. Generally, these environments are structured as follows:
    • Data comes from a set of data sources, most typically from enterprise applications such as Enterprise Resource Planning (ERP), Customer Relationship Management (CRM), and any custom applications used to gather data.
    • That data is extracted, transformed, and loaded into a data system such as a Relational Database Management System (RDBMS), an Enterprise Data Warehouse (EDW), or even a Massively Parallel Processing (MPP) system.
    • A set of analytical applications, either packaged (e.g. Tableau) or custom, then accesses the data in those systems to enable users to garner insights from the data.
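    The three stages listed above (sources, extract/transform/load, analytical query) can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the in-memory SQLite database stands in for an RDBMS or EDW, and the `crm_rows` records, the `sales` table, and every function name are hypothetical rather than taken from any Hortonworks or Tableau product.

```python
import sqlite3

# Hypothetical rows "extracted" from a CRM export; all names and values are made up.
crm_rows = [
    {"customer": " Acme Corp ", "region": "fl", "revenue": "1200.50"},
    {"customer": "Globex", "region": "CA", "revenue": "830.00"},
    {"customer": "Initech", "region": "fl", "revenue": "410.25"},
]


def transform(row):
    """Normalize text fields and convert the revenue string to a number."""
    return (row["customer"].strip(), row["region"].upper(), float(row["revenue"]))


def load(rows):
    """Load transformed rows into an in-memory SQLite table (stand-in for the RDBMS/EDW)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer TEXT, region TEXT, revenue REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def revenue_by_region(conn):
    """The kind of aggregate query an analytical application might issue."""
    cursor = conn.execute(
        "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region"
    )
    return cursor.fetchall()


warehouse = load(transform(r) for r in crm_rows)
totals = revenue_by_region(warehouse)
print(totals)
```

    In this sketch the analytical layer only ever sees the cleaned `sales` table, which mirrors the separation between data sources, data systems, and analytical applications described above.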
  • 6. Figure 1: Traditional Database Architecture

    Modern Data Architecture

    In addition to traditional transactional data in analytic databases, enterprises now also need to gather, process, and analyze new unstructured data sets that are growing exponentially.

    This new information can include text, images, machine-generated data, and online data from social media. It also includes data, such as log files, that was once thought to have relatively little value and to be too expensive to store and analyze. These new types of data are turning the focus from “data analytics” to “big data analytics” because so much insight can be gleaned from these new data sources for business advantage.
  • 7. The Hortonworks Data Platform is increasingly being introduced into enterprise environments to manage the massive amounts of these new types of data (as well as existing data) in an efficient and cost-effective manner.

    Figure 2: Modern Database Architecture

    The Hortonworks Data Platform does not replace the traditional data systems used for building analytic applications (the RDBMS, EDW, and MPP systems); instead, it is designed to integrate with and extend them. By providing a framework to capture, store, and process vast quantities of both structured and unstructured data in a cost-efficient and highly scalable manner, the Hortonworks platform is driving the creation of a new generation of enterprise database systems.
  • 8. Tableau and the Hortonworks Data Platform

    Tableau can be used with Hortonworks to explore your expanded data set. Tableau can directly access the data in the Hortonworks Data Platform, as well as the data in traditional analytic databases, and can combine them in a single view using a capability known as “data blending.” Tableau can then explore and visualize the blended data, providing valuable business insights.

    Figure 3: Tableau and Hortonworks
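    As a rough illustration of the idea behind combining two sources in one view (not Tableau's actual implementation), the sketch below aggregates records from two separate sources to a shared field and then attaches the second source's totals to the first. It assumes that every key in the primary source is kept even when the secondary source has no match. The source names, fields, and values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical primary source: page views, as might be stored in Hadoop.
clickstream = [
    {"category": "clothing", "views": 120},
    {"category": "shoes", "views": 75},
    {"category": "clothing", "views": 60},
    {"category": "handbags", "views": 40},
]

# Hypothetical secondary source: sales rows, as might live in a traditional analytic database.
sales = [
    {"category": "clothing", "revenue": 900.0},
    {"category": "shoes", "revenue": 350.0},
    {"category": "shoes", "revenue": 150.0},
]


def aggregate(rows, key, measure):
    """Sum one measure per value of the shared field."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[measure]
    return dict(totals)


def blend(primary, secondary):
    """Keep every key of the primary aggregate; attach the secondary value, or None if absent."""
    return {key: (value, secondary.get(key)) for key, value in primary.items()}


views = aggregate(clickstream, "category", "views")
revenue = aggregate(sales, "category", "revenue")
blended = blend(views, revenue)
print(blended)
```

    Aggregating each source separately before combining them avoids duplicating rows when the two sources are stored at different levels of detail.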
  • 9. Use Cases

    Enterprises can combine Tableau with the Hortonworks Data Platform for the following use cases:
    • Data Exploration
    • Data Visualization

    Data Exploration

    In the Data Exploration use case, organizations are capturing and storing a large quantity of new data (sometimes referred to as a data lake) in Hadoop, and then exploring that data directly.

    Data Exploration can be used to explore information that was previously ignored (social media data, server logs, clickstream data, web logs, machine/sensor data, and geolocation data), generate reports and visualizations from that data, and use new or existing analytic applications to leverage these new types of data.

    Figure 4: Data Exploration with Tableau and HDP
  • 10. Data Visualization

    Traditionally, gaining insights from a set of data has meant writing SQL queries to extract information from a database (often with the assistance of a programmer) and then working with spreadsheets to derive insights from tables of data.

    Data visualization leverages people’s natural tendency to think visually. People understand data much more easily when they see it represented visually than when they try to extract meaning from a table of data.

    To illustrate this, let’s use Tableau to visualize some sample clickstream data from an online retail store. Let’s take a look at website visits by product category in the state of Florida.

    With Tableau and Hortonworks, you can connect to the data directly and visualize the latest data. With just a few clicks in Tableau, you end up with the following visualization of the retail store data:

    Figure 5: Sample Retail Store Data in Tableau
  • 11. Here we can instantly see the product category details for each state by moving the pointer over the pie charts. At a glance we can see that clothing is the largest category in Florida, followed by shoes and handbags. With a few more clicks, we could visualize that same data by age or gender, or change the view to a bar chart or tree map.

    This combination of ease-of-use and broader access means that a business or financial analyst no longer needs to wait for a database specialist in order to access data. Tableau also enables you to share your interactive visualizations on a secure server with colleagues, customers, and partners, providing them with the tools they need to answer their own questions. It’s true democratization of data.
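    The per-category breakdown behind a chart like Figure 5 amounts to filtering visits by state and counting them per product category. The sketch below shows that aggregation in plain Python. The `visits` records are hypothetical sample data, arranged only to mirror the ordering described above (clothing, then shoes, then handbags in Florida); they are not the whitepaper's actual data set.

```python
from collections import Counter

# Hypothetical clickstream records: one entry per website visit.
visits = [
    {"state": "FL", "category": "clothing"},
    {"state": "FL", "category": "shoes"},
    {"state": "FL", "category": "clothing"},
    {"state": "CA", "category": "handbags"},
    {"state": "FL", "category": "handbags"},
    {"state": "FL", "category": "shoes"},
    {"state": "FL", "category": "clothing"},
    {"state": "CA", "category": "clothing"},
]


def visits_by_category(records, state):
    """Count visits per product category for one state, largest category first."""
    counts = Counter(r["category"] for r in records if r["state"] == state)
    return counts.most_common()


florida = visits_by_category(visits, "FL")
print(florida)
```

    Changing the grouping field (for example to an age or gender attribute, if the records carried one) would give the alternative views mentioned above.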
    Getting Started with Hortonworks and Tableau

    Here are a few links to help you get started with Hortonworks and Tableau:
    • The Hortonworks Sandbox: This free download contains a stand-alone, single-node Hadoop environment, along with a set of hands-on, step-by-step tutorials.
    • Tableau trial version: This page contains links to fully functional trial versions of Tableau Desktop, Tableau Server, and Tableau Online.
    • Hortonworks Hive ODBC driver: The Hortonworks Add-Ons page contains links to the Hortonworks Hive ODBC driver. On Windows, Tableau requires the 32-bit version of the Hortonworks ODBC driver, even when running on 64-bit versions of Windows.
    • Best Practices for Hadoop Data Analysis with Tableau