9 Data Mining Challenges from
Data Scientists Like You
1. Poor quality data
• Dirty data
• Missing values
• Inadequate data size
• Poor representation in data sampling
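A quick audit can show some of these data quality problems before any modeling starts. The sketch below is only illustrative: the function name audit_records, the field names, and the sample rows are hypothetical and are not taken from the original slides. It counts missing values per field and exact duplicate rows, which are two common symptoms of dirty data.

```python
from collections import Counter

def audit_records(records, required_fields):
    """Count missing values per field and exact duplicate rows.

    Hypothetical helper for illustration; a value counts as missing
    when it is None or an empty/whitespace-only string.
    """
    missing = Counter()
    seen = set()
    duplicates = 0
    for row in records:
        for field in required_fields:
            value = row.get(field)
            if value is None or (isinstance(value, str) and value.strip() == ""):
                missing[field] += 1
        # A row is a duplicate if all required fields match an earlier row.
        key = tuple(row.get(field) for field in required_fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {"rows": len(records), "missing": dict(missing), "duplicates": duplicates}

# Made-up sample data: one missing age, one empty city, one repeated row.
sample = [
    {"id": 1, "age": 34, "city": "Austin"},
    {"id": 2, "age": None, "city": "Boston"},
    {"id": 3, "age": 51, "city": ""},
    {"id": 1, "age": 34, "city": "Austin"},
]
report = audit_records(sample, ["id", "age", "city"])
print(report)
```

A small row count in the report is also a first hint of inadequate data size. Checking whether a sample is representative needs domain knowledge that a generic script like this cannot supply.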
2. Lack of understanding
Lack of understanding/lack of
diffusion of data mining techniques
in academic arenas
3. Lack of good literature
Lack of good literature on important
data mining topics and techniques
4. (Academic) access to
commercial-grade software
(Academic institutions) have
trouble accessing commercial-grade
software at reasonable costs.
5. Data variety
Data variety - trying to
accommodate data that comes
from different sources and in a
variety of different forms
(images, geo
6. Data velocity
Data velocity - online
machine learning
requires models to be
constantly updated with
new, incoming data.
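As a rough illustration of updating a model with each new observation, here is a minimal sketch of online learning using stochastic gradient descent on a one-feature linear model. The class name OnlineLinearModel, the learning rate, and the synthetic stream y = 2x + 1 are assumptions made for this example and do not come from the original slides.

```python
class OnlineLinearModel:
    """One-feature linear model updated one observation at a time."""

    def __init__(self, learning_rate=0.1):
        self.weight = 0.0
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.weight * x + self.bias

    def update(self, x, y):
        # Single gradient step on the squared error for this observation.
        error = self.predict(x) - y
        self.weight -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error
        return error

model = OnlineLinearModel()
xs = [0.0, 0.25, 0.5, 0.75, 1.0]

# Simulated stream: each incoming point follows y = 2x + 1 (no noise).
for step in range(5000):
    x = xs[step % len(xs)]
    model.update(x, 2.0 * x + 1.0)

print(round(model.weight, 3), round(model.bias, 3))
```

The model never stores past data. Each call to update adjusts the parameters and discards the observation, which is what makes this style of learning workable when data arrives continuously.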
7. Dealing with huge datasets
Dealing with huge datasets, or 'Big
Data,' that require distributed
approaches.
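One common pattern behind distributed approaches is to compute a small summary for each partition of the data and then merge the summaries. The sketch below shows the idea on a single machine with made-up data. The function names summarize_chunk, merge, and chunked are hypothetical, and in a real cluster each chunk would be handled by a separate worker.

```python
from functools import reduce

def chunked(values, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(values), size):
        yield values[start:start + size]

def summarize_chunk(chunk):
    """Per-partition summary: (count, sum). This is the 'map' step."""
    return (len(chunk), sum(chunk))

def merge(a, b):
    """Combine two partial summaries. This is the 'reduce' step."""
    return (a[0] + b[0], a[1] + b[1])

# Stand-in for a dataset too large to process in one pass.
data = list(range(1, 101))

partials = [summarize_chunk(chunk) for chunk in chunked(data, 30)]
count, total = reduce(merge, partials)
mean = total / count
print(count, total, mean)
```

Because merge is associative, the partial summaries can be combined in any order, so the work can be spread across machines without changing the result.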
8. Coming up with the right
question
"More data beats the better algorithm, but smarter questions beat more
data," - Gregory Piatetsky, www.kdnuggets.com
9. Remaining objective and
allowing the data to lead you,
not the opposite.
Remaining objective and allowing the data to lead
you, not the opposite. Preconceived notions can be
dangerous, but luckily it is in our power to resist
them...
Interested in other articles on data mining topics
and techniques?
Check out the Salford Systems blog:
http://1.salford-systems.com/blog
