
Bayesian Counters

Jun 19, 2012

Processing large data sets requires new approaches to data mining: low, close-to-linear complexity and stream processing. In traditional data mining the practitioner is usually presented with a static dataset, perhaps with just a timestamp attached, and asked to infer a model that predicts future or held-out observations. In stream processing the problem is more often posed as extracting as much information as possible from the current data and converting it into an actionable model within a limited time window. In this talk I present an approach based on HBase counters for mining streams of data, which allows for massively distributed processing and data mining. I will consider overall design goals as well as the HBase schema design dilemmas that affect how quickly knowledge can be extracted. I will also demo efficient implementations of Naive Bayes, Nearest Neighbor and Bayesian Learning on top of Bayesian Counters.
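
To make the counter idea concrete, here is a minimal sketch of how a Naive Bayes classifier can be driven entirely by increment-only counters. It is an illustration under stated assumptions, not the implementation from the talk: an in-memory `Counter` stands in for the HBase counter table, and the class name `CounterNaiveBayes`, the key layout, and the Laplace smoothing constant `alpha` are all hypothetical choices made for this example.

```python
import math
from collections import Counter


class CounterNaiveBayes:
    """Naive Bayes built only from increment-style counters.

    Assumption: a plain in-memory Counter stands in for a distributed
    counter store; the key layout below is illustrative only.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha       # Laplace smoothing constant (assumed)
        self.counts = Counter()  # key -> count, updated by increments only
        self.labels = set()      # labels seen so far
        self.values = {}         # feature name -> set of observed values

    def observe(self, features, label):
        """Fold one streamed record into the counters."""
        self.labels.add(label)
        self.counts[("label", label)] += 1
        for name, value in features.items():
            self.counts[("pair", label, name, value)] += 1
            self.values.setdefault(name, set()).add(value)

    def log_scores(self, features):
        """Return the log of prior times smoothed likelihoods per label."""
        total = sum(self.counts[("label", c)] for c in self.labels)
        scores = {}
        for c in self.labels:
            n_c = self.counts[("label", c)]
            score = math.log(n_c / total)
            for name, value in features.items():
                k = len(self.values.get(name, ())) or 1
                n_cv = self.counts[("pair", c, name, value)]
                score += math.log((n_cv + self.alpha) / (n_c + self.alpha * k))
            scores[c] = score
        return scores

    def predict(self, features):
        """Pick the label with the highest log score."""
        scores = self.log_scores(features)
        return max(scores, key=scores.get)


# Toy stream of (features, label) records; the data is made up.
model = CounterNaiveBayes()
stream = [
    ({"weather": "sunny", "wind": "weak"}, "play"),
    ({"weather": "sunny", "wind": "strong"}, "play"),
    ({"weather": "rainy", "wind": "strong"}, "stay"),
    ({"weather": "rainy", "wind": "weak"}, "stay"),
    ({"weather": "sunny", "wind": "weak"}, "play"),
]
for features, label in stream:
    model.observe(features, label)

prediction = model.predict({"weather": "sunny", "wind": "weak"})
```

Because every update is an independent increment, records can be folded in from many workers in any order, and the model at any moment is just whatever the counters currently hold. That property is what makes a counter store a plausible fit for the limited-time-window setting described above.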

Upload Details

Uploaded via SlideShare as Adobe PDF

Usage Rights

© All Rights Reserved

Bayesian Counters Presentation Transcript