10. Amazon reviews are loaded into PostgreSQL
Select reviews with >5 votes
~1.2 million reviews
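The vote filter on this slide can be sketched as a single SQL query. This is a minimal illustration, not the author's actual schema: the table name `reviews`, its columns, and the helper `select_voted_reviews` are all hypothetical, and an in-memory SQLite database stands in for PostgreSQL so the example runs without a server.

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions for illustration.
SCHEMA = """
CREATE TABLE reviews (
    review_id     INTEGER PRIMARY KEY,
    review_text   TEXT,
    helpful_votes INTEGER,
    total_votes   INTEGER
)
"""


def select_voted_reviews(conn, min_votes=5):
    """Return reviews with strictly more than `min_votes` total votes."""
    cur = conn.execute(
        "SELECT review_id, review_text, helpful_votes, total_votes "
        "FROM reviews WHERE total_votes > ? ORDER BY review_id",
        (min_votes,),
    )
    return cur.fetchall()


# In-memory SQLite stands in for PostgreSQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?, ?)",
    [
        (1, "Battery lasts two days.", 8, 10),
        (2, "Bad.", 1, 5),
        (3, "Fits a 15 inch laptop.", 2, 6),
        (4, "Meh.", 0, 0),
    ],
)

selected = select_voted_reviews(conn)
print([row[0] for row in selected])  # ids of reviews with more than 5 votes
```

The same SELECT statement should work in PostgreSQL; with a driver such as psycopg2 the parameter placeholder would be `%s` rather than `?`.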
11. Pre-processing is performed in pandas
~100,000 reviews
Pre-Processing
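The slide does not say which pre-processing steps were used, so the following is only a sketch of what such a step might look like in pandas. The column names, the text-cleaning rules, and the 0.5 helpfulness threshold are all assumptions made for illustration.

```python
import pandas as pd


def preprocess(df, threshold=0.5):
    """Clean review text and derive a binary helpfulness label.

    Assumed steps (not taken from the slides): lowercase the text, replace
    anything that is not a letter, digit, or whitespace with a space,
    collapse runs of whitespace, and label a review helpful when its share
    of helpful votes exceeds `threshold`.
    """
    out = df.copy()
    out["clean_text"] = (
        out["review_text"]
        .str.lower()
        .str.replace(r"[^a-z0-9\s]", " ", regex=True)
        .str.replace(r"\s+", " ", regex=True)
        .str.strip()
    )
    ratio = out["helpful_votes"] / out["total_votes"]
    out["helpful"] = (ratio > threshold).astype(int)
    return out


# Hypothetical input rows; every review already has more than 5 votes.
raw = pd.DataFrame(
    {
        "review_text": ["Great product!!! Works well.", "Broke after 2 days..."],
        "helpful_votes": [8, 2],
        "total_votes": [10, 6],
    }
)

processed = preprocess(raw)
print(processed[["clean_text", "helpful"]])
```

Because only reviews with more than 5 votes reach this step, the division by `total_votes` never hits zero in this sketch.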
12. A helpfulness predictor is trained on the reviews
TFIDF tokenizer
SMOTE
Logistic Regression (F1: 84%)
13. The pre-trained models are applied to AliExpress
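Applying a pre-trained model to a new review source might look like the sketch below. It assumes the fitted model was saved with joblib; the file name, the toy training data, and the sample AliExpress reviews are hypothetical. SMOTE is left out because resampling only affects training, not scoring.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Stand-in for the model trained on Amazon reviews (invented toy data).
train_texts = [
    "battery lasts two days and the screen is sharp",
    "after three months of daily use the zipper still works",
    "good",
    "bad",
    "ok",
    "meh",
]
train_labels = [1, 1, 0, 0, 0, 0]

model = Pipeline(
    [
        ("tfidf", TfidfVectorizer()),
        ("clf", LogisticRegression(max_iter=1000)),
    ]
)
model.fit(train_texts, train_labels)

# Persist the fitted pipeline, then reload it as a separate scoring job would.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "helpfulness_model.joblib")  # hypothetical file name
    joblib.dump(model, path)
    loaded = joblib.load(path)

# Hypothetical AliExpress reviews to score.
aliexpress_reviews = [
    "screen is sharp and the battery lasts all day",
    "nice",
]

# Column 1 of predict_proba is the probability of the "helpful" class (label 1).
scores = loaded.predict_proba(aliexpress_reviews)[:, 1]
for review, score in zip(aliexpress_reviews, scores):
    print(f"{score:.2f}  {review}")
```

Saving the whole pipeline keeps the TF-IDF vocabulary and the classifier together, so the new reviews are vectorized exactly as the training data was; words the vectorizer never saw during training are ignored.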