Interactively search and visualize your data 
Romain Rigaux
Goals 
Hue: make Solr / Hadoop easier to use 
+ 
Build a Web app 
Quickly explore data 
… with Solr
Architecture 
“Just a view” on top of the standard Solr API 
REST
History: v1 User
History: v1 Admin
Architecture: Next! 
Lots of learning, UX boost needed 
Keep it simple: users should not need to know it is Solr
History: v2 User
History: v2 Admin
Architecture 
REST AJAX 
/select 
/admin/collections 
/get 
/luke... 
/add_widget 
/zoom_in 
/select_facet 
/select_range... 
www…. 
Templates 
+ 
JS Model
Architecture: UI for Facets 
Layout 
Collection 
Query 
All the 2D positioning (cell ids), visual, drag&drop 
Dashboard, fields, template, widgets (ids) 
Search terms, selected facets (q, fqs)
Adding a widget life cycle 
Load the initial page 
Edit mode and Drag&Drop 
/solr/zookeeper/clusterstate.json 
/solr/admin/luke… 
/get_collection
Adding a widget life cycle 
Select the field 
Guess ranges (number or dates) 
Rounding (number or dates) 
/solr/select?stats=true /new_facet
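The "guess ranges" and "rounding" steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not Hue's actual code: the function name `guess_range`, the default of 10 buckets, and the rounding rule (gap rounded up to one significant digit) are all hypothetical. The only part taken from the slide is the idea of starting from the minimum and maximum of the field, which a Solr `stats=true` request can return. Only numbers are handled here; dates would need their own rounding.

```python
import math


def guess_range(stats_min, stats_max, buckets=10):
    """Return (start, end, gap) rounded to 'nice' values for a numeric field.

    stats_min / stats_max are assumed to come from a Solr stats response.
    """
    if stats_max <= stats_min:
        # Degenerate case: a single value, use one bucket of width 1.
        return stats_min, stats_min + 1, 1

    raw_gap = (stats_max - stats_min) / buckets
    # Round the gap up to one significant digit (e.g. 9.5 -> 10).
    magnitude = 10 ** math.floor(math.log10(raw_gap))
    gap = math.ceil(raw_gap / magnitude) * magnitude

    # Align both ends of the range on multiples of the gap.
    start = math.floor(stats_min / gap) * gap
    end = math.ceil(stats_max / gap) * gap
    return start, end, gap


start, end, gap = guess_range(0, 9000000)
```

With a minimum of 0 and a maximum of 9000000, this sketch yields a start of 0, an end of 9000000 and a gap of 900000, the same values used in the facet query on the next slide.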
Adding a widget life cycle 
Query part 1 
facet.range={!ex=bytes}bytes&f.bytes.facet.range.start=0&f.bytes.facet.range.end=9000000& 
f.bytes.facet.range.gap=900000&f.bytes.facet.mincount=0&f.bytes.facet.limit=10 
Query Part 2 
q=Chrome&fq={!tag=bytes}bytes:[900000+TO+1800000] 
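The two query parts above could be assembled along these lines. This is a sketch, not the code Hue runs: the helper name `range_facet_params` and its arguments are hypothetical, and `facet=true` is added because Solr needs it to enable faceting even though the slide does not show it. The `{!ex=...}` and `{!tag=...}` local params come from the slide: the filter is tagged so that the facet can exclude it and still show counts for all buckets.

```python
from urllib.parse import urlencode


def range_facet_params(field, start, end, gap, query="*:*", selected=None):
    """Build Solr parameters for one range facet, optionally with a selected bucket."""
    params = [
        ("q", query),
        ("facet", "true"),
        # Part 1: the range facet, excluding the filter tagged with the field name.
        ("facet.range", "{!ex=%s}%s" % (field, field)),
        ("f.%s.facet.range.start" % field, start),
        ("f.%s.facet.range.end" % field, end),
        ("f.%s.facet.range.gap" % field, gap),
        ("f.%s.facet.mincount" % field, 0),
        ("f.%s.facet.limit" % field, 10),
    ]
    if selected is not None:
        # Part 2: the filter query for the bucket the user clicked on.
        low, high = selected
        params.append(("fq", "{!tag=%s}%s:[%s TO %s]" % (field, field, low, high)))
    return params


params = range_facet_params(
    "bytes", 0, 9000000, 900000, query="Chrome", selected=(900000, 1800000)
)
query_string = urlencode(params)  # percent-encodes the values for use in a URL
```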
Augment Solr response 
{
  'facet_counts': {
    'facet_ranges': {
      'bytes': {
        'start': 10000,
        'counts': [
          '900000',
          3423,
          '1800000',
          339,
          ...
        ]
      }
    }
  }
{
  ...,
  'normalized_facets': [
    {
      'extraSeries': [
      ],
      'label': 'bytes',
      'field': 'bytes',
      'counts': [
        {
          'from': '900000',
          'to': '1800000',
          'selected': True,
          'value': 3423,
          'field': 'bytes',
          'exclude': False
        }
      ], ...
    }
  }
}
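The augmentation step can be illustrated with a short sketch. Solr returns range counts as a flat list that alternates bucket start and count; the normalized form turns each pair into one object with explicit bounds. The function name `normalize_range_facet` and its arguments are hypothetical, the `exclude` key is left out because the slide does not say how it is computed, and only numeric buckets are handled.

```python
def normalize_range_facet(field, solr_counts, gap, selected_from=None):
    """Turn Solr's flat [start, count, start, count, ...] list into bucket dicts."""
    buckets = []
    # Pair every even-indexed item (bucket start) with the count that follows it.
    for start, count in zip(solr_counts[::2], solr_counts[1::2]):
        buckets.append({
            "from": start,
            "to": str(int(start) + gap),  # upper bound = start + gap
            "selected": start == selected_from,
            "value": count,
            "field": field,
        })
    return {"label": field, "field": field, "extraSeries": [], "counts": buckets}


normalized = normalize_range_facet(
    "bytes", ["900000", 3423, "1800000", 339], gap=900000, selected_from="900000"
)
```

The UI can then iterate over `normalized["counts"]` without knowing about Solr's alternating list format.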
JSON to Widget 
{
  "field": "rate_code",
  "counts": [
    {
      "count": 97797,
      "exclude": true,
      "selected": false,
      "value": "1",
      "cat": "rate_code"
    } ...
{
  "field": "medallion",
  "counts": [
    {
      "count": 159,
      "exclude": true,
      "selected": false,
      "value": "6CA28FC49A4C49A9A96",
      "cat": "medallion"
    } ...
{
  "extraSeries": [
  ],
  "label": "trip_time_in_secs",
  "field": "trip_time_in_secs",
  "counts": [
    {
      "from": "0",
      "to": "10",
      "selected": false,
      "value": 527,
      "field": "trip_time_in_secs",
      "exclude": true
    } ...
{
  "field": "passenger_count",
  "counts": [
    {
      "count": 74766,
      "exclude": true,
      "selected": false,
      "value": "1",
      "cat": "passenger_count"
    } ...
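One way a widget could consume these objects is to flatten each facet into (label, count) pairs for a chart. The sketch below is an assumption about the approach, not Hue's widget code (which runs in the browser): the name `facet_to_series` is hypothetical. It relies only on what the JSON above shows: term buckets carry `value` and `count`, while range buckets carry `from`, `to` and `value`.

```python
def facet_to_series(facet):
    """Flatten one facet object into a list of (label, count) pairs."""
    series = []
    for bucket in facet["counts"]:
        if "from" in bucket:
            # Range facet bucket: label with its bounds, count is in 'value'.
            label = "%s - %s" % (bucket["from"], bucket["to"])
            count = bucket["value"]
        else:
            # Term facet bucket: the term is in 'value', count is in 'count'.
            label = bucket["value"]
            count = bucket["count"]
        series.append((label, count))
    return series


term_facet = {
    "field": "rate_code",
    "counts": [{"count": 97797, "exclude": True, "selected": False,
                "value": "1", "cat": "rate_code"}],
}
range_facet = {
    "label": "trip_time_in_secs",
    "field": "trip_time_in_secs",
    "extraSeries": [],
    "counts": [{"from": "0", "to": "10", "selected": False, "value": 527,
                "field": "trip_time_in_secs", "exclude": True}],
}
term_series = facet_to_series(term_facet)
range_series = facet_to_series(range_facet)
```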
Repeat…
Enterprise features 
- Access to the Search App is configurable, with LDAP/SAML authentication 
- Share by link 
- SolrCloud (or non-Cloud) 
- Proxy user 
/solr/jobs_demo/select?user.name=hue&doAs=romain&q= 
- Security 
Kerberos 
- Sentry 
Collection level, Solr calls like /admin, /query, Solr UI, ZooKeeper
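The proxy user URL in the list above follows a simple pattern: the service account authenticates as itself (`user.name`) and names the end user it is acting for (`doAs`). A minimal sketch of building such a URL, assuming a hypothetical helper named `proxied_select_url`:

```python
from urllib.parse import urlencode


def proxied_select_url(collection, service_user, end_user, query=""):
    """Build a /select URL where service_user impersonates end_user."""
    params = urlencode({
        "user.name": service_user,  # the account Hue itself runs as
        "doAs": end_user,           # the logged-in user being impersonated
        "q": query,
    })
    return "/solr/%s/select?%s" % (collection, params)


url = proxied_select_url("jobs_demo", "hue", "romain")
```

Called with the collection and users from the slide, this reproduces the URL shown there.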
Demo 
Index and Visualize Taxi data 
http://chriswhong.com/open-data/foil_nyc_taxi/ 
https://archive.org/details/nycTaxiTripData2013 [torrent better]
Missed it? 
http://demo.gethue.com/search
What’s next? 
- Map Pivot Facets 
- Autocomplete 
- Analytics range facets 
- Easier Indexing 
- … ?
Thank you! 
http://gethue.com/blog/search 
https://github.com/cloudera/hue
