klassify
Text classification with Redis
Fatih Erikli
Software Developer
@Adphorus
fatiherikli@gmail.com
http://fatiherikli.com
Text Classification
The task is to assign a document to one or
more classes or categories.
• Spam filters
• Web page classifi...
• Neural Networks
• K-nearest neighbour algorithms
• Decision Trees
• Naive Bayes Classification
Techniques
Training
Just count the words under a label.
Politics
democrat 4
socialism 20
democrat 30
communism 20
politician 10
holac...
Classification
Stem, singularize and eliminate given text
Marijuana should be legalized
nationally in the United States
jus...
Classification
Calculate scores for each labels
Marijuana should be legalized
nationally in the United States
just as it is...
Redis Implementation
SADD labels Politics
SADD labels Drugs
SADD labels Game
SADD labels Programming
Initial labels
initial labels on a Set whi...
HINCRBY Programming Brainfuck 2
`awesome` and `is` should not be
trained. they are fuzzy words for
training logic.
Queryin...
HVALS Programming
# 3 3 2 3 5


HGET Programming Brainfuck
# 5
Queries for classification
A protoype:
http://github.com/fatiherikli/klassify


HTTP POST :8888/train


{
‘text’: ‘Brainfuck is awesome’,

‘label’: ‘Programming’
}
Training




HTTP POST :8888/classify


{
‘text’: ‘Brainfuck is awesome’
}
Classification


{
"label": "Programming",
"scores": {
"A...


{
"label": "Programming",
"scores": {
"Aliens": 1.4285714285714287e-05,
"Animals": 2.380952380952381e-06,
"Society": 6.4...
DEMO
Upcoming SlideShare
Loading in …5
×

Klassify: Text Classification with Redis

322 views

Published on

Python Istanbul - 2016

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
322
On SlideShare
0
From Embeds
0
Number of Embeds
27
Actions
Shares
0
Downloads
2
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Klassify: Text Classification with Redis

  1. 1. klassify Text classification with Redis
  2. 2. Fatih Erikli Software Developer @Adphorus fatiherikli@gmail.com http://fatiherikli.com
  3. 3. Text Classification The task is to assign a document to one or more classes or categories. • Spam filters • Web page classification • News and and topic categorization • Sentiment Analysis Use cases
  4. 4. • Neural Networks • K-nearest neighbour algorithms • Decision Trees • Naive Bayes Classification Techniques
  5. 5. Training Just count the words under a label. Politics democrat 4 socialism 20 democrat 30 communism 20 politician 10 holacaust 20 Drugs smoke 30 weed 22 lsd 34 heroin 54 cannabis 23 marijuana 52
  6. 6. Classification Stem, singularize and eliminate given text Marijuana should be legalized nationally in the United States just as it is already in Colorado.
  7. 7. Classification Calculate scores for each labels Marijuana should be legalized nationally in the United States just as it is already in Colorado. drugs: 4 drugs: 2 politics: 3 Drugs = 4 + 2 = 6 / TotalWordCountOfDrugs Politics = 3 / TotalWordCountOfPolitics
  8. 8. Redis Implementation
  9. 9. SADD labels Politics SADD labels Drugs SADD labels Game SADD labels Programming Initial labels initial labels on a Set which is called `labels`
  10. 10. HINCRBY Programming Brainfuck 2 `awesome` and `is` should not be trained. they are fuzzy words for training logic. Querying for training “Brainfuck! Brainfuck is awesome” as Programming
  11. 11. HVALS Programming # 3 3 2 3 5 
 HGET Programming Brainfuck # 5 Queries for classification
  12. 12. A protoype: http://github.com/fatiherikli/klassify
  13. 13. 
 HTTP POST :8888/train 
 { ‘text’: ‘Brainfuck is awesome’,
 ‘label’: ‘Programming’ } Training
  14. 14. 
 
 HTTP POST :8888/classify 
 { ‘text’: ‘Brainfuck is awesome’ } Classification 
 { "label": "Programming", "scores": { "Aliens": 1.428, "Animals": 2.380, "Society": 6.4935, "Technology": 9.523 } } responserequest
  15. 15. 
 { "label": "Programming", "scores": { "Aliens": 1.4285714285714287e-05, "Animals": 2.380952380952381e-06, "Society": 6.493506493506494e-07, "Technology": 9.523809523809525e-07 } } Response
  16. 16. DEMO

×