1/The kind of dataset Ground Truth helps create can be used to train extremely sophisticated models using a method called ‘supervised’ learning; this is common in computer vision, speech, and language.
2/It’s how we train our own services: Rekognition, our computer vision service, is trained on tens of millions of labeled images; Polly’s lifelike voices come from hundreds of hours of scripted voice recordings; and so forth.
3/The sheer volume of the data, combined with deep learning neural networks, allows us to train models with human-like capabilities based on that data.
4/At the other end of this spectrum is ‘unsupervised’ learning, where algorithms don’t need large volumes of labeled data.
5/These approaches are commonly used for anomaly detection, where the algorithm only looks for statistical outliers in, say, a stream of data from an IoT temperature sensor. When it detects that the temperature is changing in a meaningful way, the model can send a signal and take action (open a window, for example).
6/These models are no less useful - in fact they are complementary to supervised methods - but they don’t attempt to mimic human-level intelligence in the same way.
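To make the anomaly detection example concrete, here is a minimal sketch of spotting a statistical outlier in a stream of temperature readings without any labeled data. It uses a rolling z-score, which is only one of many possible techniques and is not presented as how any AWS service works; the function name, window size, threshold, and sample readings are all assumptions made for illustration.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling mean.

    Hypothetical helper: `window` and `threshold` are illustrative defaults.
    """
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            # Flag the reading if it sits more than `threshold` standard
            # deviations from the mean of the previous `window` readings.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
        recent.append(value)


# Made-up sensor data with one obvious spike.
temperatures = [21.0, 21.2, 20.9, 21.1, 21.0, 21.1, 27.5, 21.0]
anomalies = list(detect_anomalies(temperatures))
```

In a real deployment the flagged reading would trigger the action (the "open a window" signal) rather than being collected in a list.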
1/ In the bottom right, we have a no man’s land: for the obvious reason that nobody wants to invest a lot for little gain, there’s no meaningful research happening there.
2/ But there’s fertile ground in the upper left.
1/ There are a lot of demands placed on organizations when dealing with documents. What they typically want to be able to do sounds straightforward…
2/ They want to be able to identify documents in any format;
3/ and then extract text from those documents, accurately.
4/ But there are a whole host of challenges that make this difficult, such as the variety of forms and formats, and the quality of the documents.
5/ The way customers try to overcome this complexity today is either by manual review (which is accurate, but time-consuming and expensive), or
6/ with simple OCR and/or…
7/ template-based data extraction (which is fast, but tends not to be accurate enough, so they end up sending the documents to manual review or verification anyway).
TRANSITION: we think there is a better way: we can replace the heavy lifting of manual reviews, simplistic OCR, and templates with smart, cheap, powerful machine learning…
1/ DeepRacer is a physical device, about the size of a shoe box, which is packed full of everything you need to learn about reinforcement learning through autonomous driving.
2/ It has an HD video camera mounted high up, so it can get a good view of the road ahead;
3/ To make it work, you access a fully configured 3D physics simulator available in the cloud, with a track and a virtual car ready to start training.
4/ All you need to do is provide a scoring function (the reward function), as simple or as complex as you like, in Python. With a single click, we’ll train the model in the simulator using reinforcement learning in SageMaker, and you can watch in real time to see how the learning is going.
5/ Then just take your model, load it onto DeepRacer, and watch it go…
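As an illustration of what a simple scoring function might look like, here is a hedged sketch that rewards the car for staying close to the center of the track. The function name, the `params` dictionary, and its keys (`all_wheels_on_track`, `track_width`, `distance_from_center`) are assumptions made for this example and are not confirmed by these notes; the band widths and reward values are arbitrary.

```python
def reward_function(params):
    """Return a higher score the closer the car stays to the track center.

    Hypothetical interface: `params` and its keys are assumed for illustration.
    """
    # Near-zero reward when the car leaves the track.
    if not params["all_wheels_on_track"]:
        return 1e-3

    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]

    # Three bands around the center line, each earning less than the last.
    if distance_from_center <= 0.1 * track_width:
        return 1.0
    if distance_from_center <= 0.25 * track_width:
        return 0.5
    if distance_from_center <= 0.5 * track_width:
        return 0.1
    return 1e-3


# Example call with made-up values: the car is slightly off center.
score = reward_function(
    {"all_wheels_on_track": True, "track_width": 1.0, "distance_from_center": 0.2}
)
```

A more complex function could also weigh speed or steering, which is the kind of tweaking that makes racing competitive.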
We think this is a really interesting and fun way to get started with reinforcement learning, and as we started to experiment with this internally, a funny thing happened…
The teams started racing against each other, continually tweaking and adjusting their reward functions for speed around a virtual track. Factions sprang up, it got pretty competitive, and developers’ knowledge and experience with RL grew almost exponentially…
In fact, we had so much fun that we wanted to bring this to our customers, and so today, I’m also announcing…
Here’s how the league will work…
1/ Anyone can build an RL model in SageMaker (or develop one on their own and bring it to SageMaker).
2/ At our 20 or so AWS Summits in 2019 we’ll hold a DeepRacer League Race; you can compete in as many of these as you like.
3/ The winner of each DRL Race and the top 10 points getters qualify for the DRL Championship Cup, held at re:Invent 2019 here in Vegas.
4/ We’ll also have virtual events and tournaments throughout the year, likely about 20, where we will take the winners and top 10 points getters to the Championship Cup at re:Invent.
5/ While there will be individual prizes for each race, the big prize is the Championship Cup at re:Invent.
6/ This year, for 2018, because we don’t have as much lead time, we’re doing an accelerated version for our first Championship Cup.