How AI Helps Students Solve Math Problems

Learn how ST Unitas, a Korean education technology company with more than 60 brands that recently merged with The Princeton Review, uses AWS machine learning (ML) services to enhance student learning.

July 10-11, 2019, The Conference Center

How AI Helps Students
Solve Math Problems
Hwechul Derrick Cho
Principal AI Research Engineer
ST Unitas & The Princeton Review

The Princeton Review’s Earlier Work
History
• Tutoring Service since 1998
• Homework Help Mobile in 2015
• Live tutoring, from PC to mobile
Opportunity
• Generation Z expects a shorter feedback cycle

How can we provide students
an answer as fast as possible?

Example Math Problems
[Figure: a student’s query shown next to search results Top 1, Top 2, and Top 3]

Why a Problem Search Engine is Valuable
Business side
• Lower the tutoring cost
• More questions and more data
Student side
• Find answers quickly
• More affordable tutoring

Define (1) Build In-house Dataset
Needs
• Evaluate performance
• Access similar images; however, no public dataset matched our case
What we did: Took 1,000 photos of problems from our book
[Figure: example problem from a Princeton Review SAT book]

Define (2) Augmentation and Pairing
True label: original image (1,000 images)
Input data: augmented images (5 per original image)
Task: find the original image for each augmented image
[Figure: an original image and its augmented images]
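The pairing described above can be sketched as follows. This is a minimal illustration, not the speaker's pipeline: the slides do not say which augmentations were used, so the brightness change, pixel shift, and noise below are assumptions, and `augment` and `build_pairs` are hypothetical helper names. Images are plain nested lists of grayscale values to keep the sketch dependency-free.

```python
import random

def augment(image, rng):
    """Return a perturbed copy of a grayscale image (list of rows, values 0-255)."""
    gain = rng.uniform(0.8, 1.2)   # assumed brightness/contrast change
    shift = rng.randint(-2, 2)     # assumed small horizontal shift, in pixels
    width = len(image[0])
    out = []
    for row in image:
        # Shift by clamping indices at the border (edge pixels are repeated).
        shifted = [row[min(max(x - shift, 0), width - 1)] for x in range(width)]
        # Apply gain plus Gaussian noise, clamped to the valid pixel range.
        out.append([min(255.0, max(0.0, gain * v + rng.gauss(0, 5))) for v in shifted])
    return out

def build_pairs(originals, per_image=5, seed=0):
    """Return (augmented_image, original_index) pairs: input data with true labels."""
    rng = random.Random(seed)
    return [(augment(img, rng), idx)
            for idx, img in enumerate(originals)
            for _ in range(per_image)]

# Three tiny synthetic "photos" stand in for the 1,000 originals.
originals = [[[float((x * y + k) % 256) for x in range(8)] for y in range(8)]
             for k in range(3)]
pairs = build_pairs(originals)
```

Each pair keeps the index of its source image, so a search engine can be scored on whether it retrieves that original from the augmented query.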

Define (3) Set Out Baseline Model
How
• Baseline doesn’t need to be state-of-the-art
• Computed the distance between Perceptual Hash (pHash) values as the similarity measure
Result: Top@5 Accuracy ≈ 30%
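A baseline of this kind might look like the sketch below. It assumes the image has already been converted to a 32x32 grayscale grid, follows the common DCT-based pHash recipe (keep the 8x8 low-frequency block and threshold it at its median), and ranks candidates by Hamming distance. The function names are illustrative, and the slides do not state which pHash implementation or settings were actually used.

```python
import math

def dct_1d(values):
    """Unnormalized DCT-II of a sequence."""
    n = len(values)
    return [sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
            for k in range(n)]

def dct_2d(image):
    rows = [dct_1d(row) for row in image]        # transform each row
    cols = [dct_1d(col) for col in zip(*rows)]   # then each column
    return [list(r) for r in zip(*cols)]         # transpose back to row-major

def phash(image, hash_size=8):
    """64-bit perceptual hash of a square grayscale image, as a list of bits."""
    freq = dct_2d(image)
    low = [freq[y][x] for y in range(hash_size) for x in range(hash_size)]
    median = sorted(low)[len(low) // 2]          # upper median of the low frequencies
    return [1 if v > median else 0 for v in low]

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return sum(x != y for x, y in zip(a, b))

def top_k_accuracy(query_hashes, labels, index_hashes, k=5):
    """Fraction of queries whose true original is among the k nearest hashes."""
    hits = 0
    for query, label in zip(query_hashes, labels):
        ranked = sorted(range(len(index_hashes)),
                        key=lambda i: hamming(query, index_hashes[i]))
        hits += label in ranked[:k]
    return hits / len(labels)

image = [[float((3 * x + 5 * y) % 256) for x in range(32)] for y in range(32)]
image_hash = phash(image)
```

With augmented images as queries and the original images as the index, `top_k_accuracy(..., k=5)` corresponds to the Top@5 metric reported on the slide.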

Solving (1) Search Similar Images
How
• Use the distance between two images’ representations
• Our baseline, pHash, is also an image representation
• We used ImageNet models to represent images as vectors
Result: Top@5 Accuracy ≈ 50%
[Figure: example of an image representation architecture mapping an RGB image to a vector representation]
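Searching by vector distance can be sketched as below. The slides do not say which ImageNet model or which distance was used, so cosine similarity is an assumption here, and the short vectors stand in for feature vectors that a pretrained network would produce.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def search(query_vec, index_vecs, k=5):
    """Return the indices of the k most similar index vectors, best first."""
    ranked = sorted(range(len(index_vecs)),
                    key=lambda i: cosine_similarity(query_vec, index_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Hypothetical 3-dimensional embeddings; real ones have hundreds of dimensions.
index_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
top = search([0.9, 0.1, 0.0], index_vecs, k=2)
```

At the scale of 1,000 indexed problems a full sort like this is cheap; a larger index would usually call for an approximate nearest-neighbor structure instead.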

Solving (2) Search Similar Texts
Example of text-only image
Problem: Text-only math problems

Solving (2) Use Amazon Rekognition
• Amazon Rekognition was the fastest way to a proof of concept
Result: Top@5 Accuracy ≈ 72%
[Figure: Amazon Rekognition example showing the extracted text]
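A text-extraction call of this kind could look like the sketch below, which uses the Rekognition `DetectText` operation through boto3. How the team post-processed the result is not stated in the slides; joining the LINE-level detections is an assumption, `sample_response` is a made-up response used only for illustration, and the real call needs AWS credentials.

```python
def lines_from_response(response):
    """Join LINE-level detections from a DetectText response into one string."""
    lines = [d["DetectedText"] for d in response.get("TextDetections", [])
             if d.get("Type") == "LINE"]
    return " ".join(lines)

def extract_text(image_bytes, client=None):
    """Call Amazon Rekognition DetectText on raw image bytes."""
    if client is None:
        import boto3  # imported lazily so the parsing helper works without boto3
        client = boto3.client("rekognition")
    response = client.detect_text(Image={"Bytes": image_bytes})
    return lines_from_response(response)

# Illustrative response shape: Rekognition reports both LINE and WORD detections.
sample_response = {"TextDetections": [
    {"DetectedText": "Solve for x:", "Type": "LINE"},
    {"DetectedText": "Solve", "Type": "WORD"},
    {"DetectedText": "2x + 3 = 7", "Type": "LINE"},
]}
text = lines_from_response(sample_response)
```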

Solving (3) Search Similar Images with Texts
[Figure: vector representation of the RGB image, combined (+) with the extracted text]
• Combine the two similarity scores (image and text)
• Use a simple grid search to find the optimal combining factor
Result: Top@5 Accuracy ≈ 81%
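One way to combine the two scores is a weighted sum whose weight is tuned by grid search, as in the sketch below. The linear form `alpha * image + (1 - alpha) * text`, the 0.1 grid step, and the toy similarity values are all assumptions; the slides only say that two scores were combined and that the factor was found by grid search.

```python
def combined_scores(image_sims, text_sims, alpha):
    """Weighted sum of per-candidate image and text similarities."""
    return [alpha * i + (1 - alpha) * t for i, t in zip(image_sims, text_sims)]

def top_k_hit(scores, label, k):
    """True if the labeled candidate is among the k highest-scoring ones."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return label in ranked[:k]

def grid_search_alpha(image_sims, text_sims, labels, k=5, steps=11):
    """Try evenly spaced alpha values in [0, 1]; return (best_alpha, best_accuracy)."""
    best_alpha, best_acc = 0.0, -1.0
    for s in range(steps):
        alpha = s / (steps - 1)
        hits = sum(top_k_hit(combined_scores(img, txt, alpha), label, k)
                   for img, txt, label in zip(image_sims, text_sims, labels))
        acc = hits / len(labels)
        if acc > best_acc:           # keep the first alpha that reaches the best score
            best_alpha, best_acc = alpha, acc
    return best_alpha, best_acc

# Toy data: two queries, three candidates each. Image similarity alone gets
# only the first query right; text similarity alone gets only the second.
image_sims = [[0.9, 0.5, 0.1], [0.6, 0.1, 0.5]]
text_sims = [[0.4, 0.6, 0.1], [0.1, 0.2, 0.9]]
labels = [0, 2]
best_alpha, best_acc = grid_search_alpha(image_sims, text_sims, labels, k=1)
```

In this toy case neither signal alone ranks both queries correctly, but a mixed weight does, which is the effect the accuracy gain on the slide suggests.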

Uncovered Blind Spots to Keep Iterating
• Didn’t recognize mathematical symbols or different fonts
• Text extracted from graphs was unhelpful
What we did: We built a new dataset that addressed those problems and hand-labeled it ourselves
Extracted text: “8. The graph f.x) is given below. Evaluate Sr(*) adx.3H107146E”
Extracted text: “47 and 48 The graphs of a function f and its derivative f! are shown. Which is f' bigger, (-1) or (1)? f" 47. 48.”

Improving our Engine
• Detect important layouts from the image
• Replace Text Extraction (Amazon Rekognition) with our own model in Amazon SageMaker
How we did it: With Amazon SageMaker, we could easily deploy and scale our model
[Figure: SageMaker architecture]
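Once a model is deployed to a SageMaker endpoint, a client can call it through the SageMaker Runtime `InvokeEndpoint` operation. The sketch below is only an assumption about how such a call might look: the endpoint name, the `application/x-image` content type, and the JSON response body are hypothetical and depend on the model container, which the slides do not describe.

```python
import json

def extract_text_via_endpoint(image_bytes, runtime_client, endpoint_name):
    """Send image bytes to a SageMaker endpoint and decode a JSON response.

    runtime_client is expected to be boto3.client("sagemaker-runtime").
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,          # hypothetical endpoint name
        ContentType="application/x-image",   # assumed input format of the model
        Body=image_bytes,
    )
    # The response body is a stream; the JSON payload format is an assumption.
    return json.loads(response["Body"].read())

# Usage with real AWS credentials (not executed here):
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   result = extract_text_via_endpoint(open("problem.jpg", "rb").read(),
#                                      runtime, "my-text-extractor")
```

Passing the client in as an argument keeps the function easy to exercise with a stub in place of the AWS service.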

Achievement: Detecting Layouts
[Figure: layout detection examples for our model, the ground truth, and the Google Vision API]
Comparing Layout Analysis Performance (F-Score)
• Our model: 0.54
• Google Vision API: 0.38
• AWS Rekognition API: 0.05
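For reference, an F-score for detected layout boxes can be computed as in the sketch below. The slides do not explain how predictions were matched to the ground truth, so the greedy one-to-one matching and the intersection-over-union (IoU) threshold of 0.5 are assumptions, and the boxes are invented for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def f_score(predicted, truth, threshold=0.5):
    """F1 score with greedy one-to-one matching of predicted to true boxes."""
    unmatched = list(truth)
    true_positives = 0
    for box in predicted:
        best = max(unmatched, key=lambda t: iou(box, t), default=None)
        if best is not None and iou(box, best) >= threshold:
            unmatched.remove(best)   # each true box can be matched only once
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
predicted = [(1, 1, 10, 10), (50, 50, 60, 60), (20, 20, 30, 31)]
score = f_score(predicted, truth)
```

Here two of the three predictions match a true box, giving a precision of 2/3 and a recall of 1.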

Achievement: Extracting Text
Extracted text

Future Plan (1): Using User-Labeled Data
Ask students whether the search result was helpful

Future Plan (2): Normalizing the Problem
Determine how to split variables from a problem and normalize
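As a rough illustration of what normalizing might mean, the sketch below replaces the numbers in a problem with placeholders, so that two problems differing only in their constants map to the same template. This is purely an assumption about the idea; the slide describes a future plan and gives no method, and `normalize_problem` is a hypothetical helper.

```python
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?")

def normalize_problem(text):
    """Replace each distinct number with a placeholder; return (template, values)."""
    placeholders = {}

    def replace(match):
        token = match.group(0)
        if token not in placeholders:
            placeholders[token] = f"<N{len(placeholders) + 1}>"
        return placeholders[token]

    template = NUMBER.sub(replace, text)
    values = {name: float(token) for token, name in placeholders.items()}
    return template, values

template, values = normalize_problem("If 3x + 5 = 20, what is x?")
```

A real system would also need to handle symbols, units, and variable names, which this sketch leaves untouched.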

Future Plan (3): Auto-solving
Solve simple questions automatically

“If You Define the Problem Correctly,
You Almost Have the Solution”
Steve Jobs (1955 - 2011)

How AI Helps Students Solve Math Problems

1. July 10-11, 2019, The Conference Center

2. How AI Helps Students Solve Math Problems

3. About the company

4. An Online Tutoring Service

5. The Princeton Review’s Earlier Work

6. We Launched Conects Q&A

7. DEMO VIDEO

8. How can we provide students an answer as fast as possible?

9. Our Solution: A Problem Search Engine

10. Example Math Problems

11. How the Problem Search Engine Works

12. Why a Problem Search Engine is Valuable

13. Steps Taken & Learnings

14. Define (1) Build In-house Dataset

15. Define (2) Augmentation and Pairing

16. Define (3) Set Out Baseline Model

17. Solving (1) Search Similar Images

18. Solving (2) Search Similar Texts

19. Solving (2) Use Amazon Rekognition

20. Solving (3) Search Similar Images with Texts

21. Uncovered Blind Spots to Keep Iterating

22. Improving our Engine

23. Achievement: Detecting Layouts

24. Achievement: Extracting Text

25. Future Plans

26. Future Plan (1): Using User-Labeled Data

27. Future Plan (2): Normalizing the Problem

28. Future Plan (3): Auto-solving

29. “If You Define the Problem Correctly, You Almost Have the Solution”

30. Fireside Chat

31. Thank you!