These datasets are:
Relatively small
Clean
Machine-learning heavy
These are the datasets commonly used in the “sandbox”.
Picture Source:
Titanic: http://davidabramsbooks.blogspot.com/2012/04/soup-and-salad-titanic-books-reading.html
Iris Classification: https://www.mygardenlife.com/plant-library/2244/iris/species
MNIST Dataset: https://github.com/Orrimp/mnist_neural_net, http://yann.lecun.com/exdb/mnist/
Zillow: https://www.kaggle.com/zillow/zecon (banner)
What is wrong with this workflow?
Because it oversimplifies how the industry actually works.
Now we can see why people love the sandbox mindset.
Picture Source:
Beer: https://i.imgflip.com/zghl1.jpg
Cat: https://kittentoob.com/wp-content/uploads/2013/04/cat.jpg