Learn how ST Unitas, a Korean education technology company with over 60 brands that recently merged with The Princeton Review, uses AWS machine learning (ML) services to enhance student learning.
5. The Princeton Review’s Earlier Work
History
• Tutoring Service since 1998
• Homework Help Mobile in 2015
• Live tutoring, from PC to mobile
Opportunity
• Generation Z expects a shorter feedback cycle
12. Why a Problem Search Engine is Valuable
Business side
• Lower the tutoring cost
• More questions and more data
Student side
• Find answers quickly
• More affordable tutoring
14. Define (1) Build In-house Dataset
Needs
• Evaluate performance
• Access similar images; however, no public dataset matched our case
What we did: Took 1,000 photos of problems from our book
Example of The Princeton Review SAT
15. Define (2) Augmentation and Pairing
• True label: original image (1,000 images)
• Input data: augmented images (5 per original image)
• Task: find the original image for each augmented image
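The pairing described on this slide can be sketched in a few lines. This is a minimal illustration, not the team's pipeline: the slides do not say which augmentations were used, so the brightness shift and pixel noise below are assumptions, and `augment`, `build_pairs`, and the tiny 2x2 "images" are hypothetical stand-ins.

```python
import random


def augment(image, rng):
    """Return a perturbed copy of a grayscale image (rows of 0-255 ints).

    Assumed augmentations: a global brightness shift plus small pixel noise.
    """
    shift = rng.randint(-30, 30)
    return [
        [min(255, max(0, px + shift + rng.randint(-5, 5))) for px in row]
        for row in image
    ]


def build_pairs(originals, per_image=5, seed=0):
    """Return (augmented_image, original_id) pairs: a query and its true label."""
    rng = random.Random(seed)
    pairs = []
    for original_id, image in originals.items():
        for _ in range(per_image):
            pairs.append((augment(image, rng), original_id))
    return pairs


# Two hypothetical originals; the deck used 1,000 photos.
originals = {
    "problem_001": [[10, 200], [30, 120]],
    "problem_002": [[250, 0], [90, 60]],
}
pairs = build_pairs(originals)
print(len(pairs))  # 2 originals x 5 augmentations each
```

Each pair keeps the original's identifier as the label, so any search method can later be scored by checking whether that identifier appears among its top results.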
16. Define (3) Set Out Baseline Model
How
• Baseline doesn’t need to be state-of-the-art
• Calculated similarity distance of Perceptual Hash (pHash)
Result: Top@5 Accuracy ≈ 30%
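A rough sketch of a pHash-style baseline, written with the standard library only. It assumes the image has already been converted to a 32x32 grayscale grid, keeps the 8x8 low-frequency block of an unnormalised 2-D DCT, and thresholds at the median; production pHash libraries differ in details, and the random test images here are purely illustrative.

```python
import math
import random


def low_freq_dct(gray, size):
    """Top-left size x size block of the unnormalised 2-D DCT-II of a square image."""
    n = len(gray)
    cos = [[math.cos(math.pi * (2 * i + 1) * f / (2 * n)) for i in range(n)]
           for f in range(size)]
    return [
        sum(gray[y][x] * cos[u][y] * cos[v][x] for y in range(n) for x in range(n))
        for u in range(size) for v in range(size)
    ]


def phash(gray, size=8):
    """Pack one bit per low-frequency coefficient: 1 if above the median."""
    coeffs = low_freq_dct(gray, size)
    median = sorted(coeffs)[len(coeffs) // 2]
    bits = 0
    for c in coeffs:
        bits = (bits << 1) | int(c > median)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes (smaller means more similar)."""
    return bin(a ^ b).count("1")


rng = random.Random(0)
original = [[rng.randint(0, 200) for _ in range(32)] for _ in range(32)]
# A "photo" of the same page: brighter, with slight sensor noise.
photo = [[min(255, px + 20 + rng.randint(-3, 3)) for px in row] for row in original]
unrelated = [[rng.randint(0, 200) for _ in range(32)] for _ in range(32)]

near = hamming(phash(original), phash(photo))
far = hamming(phash(original), phash(unrelated))
print(near, far)
```

Ranking every original by Hamming distance to the query hash and checking the first five results gives the kind of Top@5 figure reported above.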
17. Solving (1) Search Similar Images
How
• Use the distance between two images’ representations
• Our baseline, pHash, is also an image representation
• We used ImageNet models to represent images as vectors
Result: Top@5 Accuracy ≈ 50%
Figure: example of image representation architecture (RGB image to vector representation)
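Once each image is a vector, search reduces to ranking by a similarity measure. The sketch below assumes cosine similarity (the slides only say "distance") and uses hypothetical 3-dimensional vectors in place of real ImageNet-model embeddings; it also shows how a Top@k accuracy can be computed from labeled queries.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, index, k=5):
    """Ids of the k indexed vectors most similar to the query."""
    ranked = sorted(index, key=lambda i: cosine_similarity(query, index[i]), reverse=True)
    return ranked[:k]


def top_k_accuracy(queries, index, k=5):
    """Fraction of (vector, true_id) queries whose true id is in the top k."""
    hits = sum(1 for vector, true_id in queries if true_id in top_k(vector, index, k))
    return hits / len(queries)


# Hypothetical embeddings of three original problems.
index = {"p1": [1.0, 0.0, 0.0], "p2": [0.0, 1.0, 0.0], "p3": [0.0, 0.0, 1.0]}
# Hypothetical embeddings of augmented photos, with their true labels.
queries = [
    ([0.9, 0.1, 0.0], "p1"),
    ([0.1, 0.8, 0.1], "p2"),
    ([0.6, 0.5, 0.0], "p2"),  # closest to p1, so a miss at k=1
]
print(top_k_accuracy(queries, index, k=1))
print(top_k_accuracy(queries, index, k=2))
```

With 1,000 originals a linear scan like this is still cheap; a larger index would normally call for an approximate nearest-neighbor structure.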
18. Solving (2) Search Similar Texts
Example of text-only image
Problem: Text-only math problems
19. Solving (2) Use Amazon Rekognition
• Amazon Rekognition was the fastest way to a proof of concept
Result: Top@5 Accuracy ≈ 72%
Figure: Amazon Rekognition example showing the extracted text
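The slides do not say how the extracted texts were compared. One plausible approach, shown here only as an assumption, is Jaccard similarity over character trigrams, which tolerates the occasional OCR mistake better than exact word matching. The strings below are made-up stand-ins for text returned by Amazon Rekognition.

```python
def char_ngrams(text, n=3):
    """Set of character n-grams after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def text_similarity(a, b, n=3):
    """Jaccard similarity of character n-gram sets, in the range [0, 1]."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


indexed_text = "If 3x + 5 = 20, what is the value of x?"
ocr_of_same_problem = "If 3x + 5 = 2O, what is the valve of x?"  # simulated OCR errors
other_problem = "The graph of a function f is shown below."

same = text_similarity(indexed_text, ocr_of_same_problem)
different = text_similarity(indexed_text, other_problem)
print(round(same, 2), round(different, 2))
```

The resulting score can be ranked exactly like an image similarity score, which is what makes the two easy to combine later.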
20. Solving (3) Search Similar Images with Texts
Figure: image similarity (RGB image to vector representation) + text similarity
• Combine the two similarity scores
• Use a simple grid search to find the optimal combination factor
Result: Top@5 Accuracy ≈ 81%
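A minimal sketch of the grid search, assuming the two scores are combined as a weighted sum `alpha * image + (1 - alpha) * text`; the slides do not give the actual formula. The scores and ids below are invented so that neither score alone ranks both queries correctly.

```python
def combined_score(image_score, text_score, alpha):
    """Assumed combination rule: a weighted sum controlled by alpha in [0, 1]."""
    return alpha * image_score + (1.0 - alpha) * text_score


def top_k_accuracy(queries, alpha, k=5):
    """queries: list of ({candidate_id: (image_score, text_score)}, true_id)."""
    hits = 0
    for scores, true_id in queries:
        ranked = sorted(
            scores,
            key=lambda cid: combined_score(scores[cid][0], scores[cid][1], alpha),
            reverse=True,
        )
        hits += true_id in ranked[:k]
    return hits / len(queries)


def grid_search(queries, steps=10, k=5):
    """Try alpha = 0, 1/steps, ..., 1 and return the first best value."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda alpha: top_k_accuracy(queries, alpha, k))


# Hypothetical validation queries: image score favors one candidate, text the other.
queries = [
    ({"a": (0.9, 0.2), "b": (0.5, 0.7)}, "a"),
    ({"c": (0.3, 0.9), "d": (0.6, 0.1)}, "c"),
]
best_alpha = grid_search(queries, steps=10, k=1)
print(best_alpha, top_k_accuracy(queries, best_alpha, k=1))
```

With a single parameter, an exhaustive sweep is cheap and easy to reason about, which fits the "simple grid search" described on the slide. The factor should be chosen on data held out from the final evaluation.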
21. Uncovered Blind Spots to Keep Iterating
• Didn’t recognize mathematical symbols or different fonts
• Text extracted from graphs was unhelpful
What we did: We built a new dataset that addressed those problems and hand-labeled it ourselves
Extracted text (example 1): “8. The graph f.x) is given below. Evaluate Sr(*) adx.3H107146E”
Extracted text (example 2): “47 and 48 The graphs of a function f and its derivative f! are shown. Which is f' bigger, (-1) or (1)? f" 47. 48.”
22. Improving our Engine
• Detect important layouts from the image
• Replace Text Extraction (Amazon Rekognition) with our own model in Amazon SageMaker
How we did it: With Amazon SageMaker, we could easily deploy and scale our model
Figure: SageMaker architecture
23. Achievement: Detecting Layouts
Figure: layout detection examples (Ours, Ground Truth, Google Vision API)
Comparing Layout Analysis Performance (F-Score)
• Our model: 0.54
• Google Vision API: 0.38
• Amazon Rekognition API: 0.05
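For reference, an F-score combines precision and recall into one number. The sketch below assumes detections have already been matched against ground-truth layout regions and counted; how the team matched regions is not stated in the slides, and the counts used here are hypothetical.

```python
def f_score(true_positives, false_positives, false_negatives, beta=1.0):
    """F-beta score from detection counts; beta=1 gives the usual F1."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    weight = beta * beta
    return (1 + weight) * precision * recall / (weight * precision + recall)


# Hypothetical counts: 6 correct regions, 2 spurious, 4 missed.
print(round(f_score(6, 2, 4), 3))
```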