Portfolio of Muhammad Afrizal Septiansyah, 2024
1. PROJECT PORTFOLIO
Sarcasm Detection
Client : Dr. Afiyati, S.Si., MT.
Email : afiyati.reno@mercubuana.ac.id
Description :
In this project, Dr. Afiyati, S.Si., MT. and I collaborated with Datains (PT Global Data Inspirasi) on research to detect sarcasm in datasets from Twitter. We first created baselines with Logistic Regression and Naive Bayes models. The advanced models then used several architectures: IndoBERTweet, IndoBERTweet + CNN, IndoBERTweet + LSTM, IndoBERTweet + GRU, and IndoBERTweet + Cosine Similarity. Across these experiments, IndoBERTweet + LSTM achieved the highest accuracy of all the architectures, including IndoBERTweet alone, by a margin of more than roughly 0.5%.
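The baseline step above can be sketched with scikit-learn; the toy tweets and labels below are illustrative placeholders, not the actual research data:

```python
# Minimal sketch of the Logistic Regression baseline for sarcasm detection.
# The texts and labels are illustrative placeholders only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["wah keren banget macet 3 jam", "hari ini cuaca cerah",
         "bagus sekali kerjaanmu hilang semua", "terima kasih atas bantuannya"]
labels = [1, 0, 1, 0]  # 1 = sarcastic, 0 = not sarcastic (toy labels)

# Vectorize with word unigrams/bigrams, then fit the linear baseline
baseline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
baseline.fit(texts, labels)
preds = baseline.predict(texts)
```

A Naive Bayes baseline follows the same pipeline shape with MultinomialNB swapped in for LogisticRegression.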
Improve Bag of Words with Slang Word Dictionary
Client : Dr. Afiyati, S.Si., MT.
Email : afiyati.reno@mercubuana.ac.id
Description :
In this project, Dr. Afiyati, S.Si., MT. and I worked together again on research that tried to improve the performance of a bag-of-words representation by combining it with a slang-word dictionary (generated using the Soundex algorithm). The experiments showed that bag-of-words combined with the slang dictionary, when trained with algorithms such as Logistic Regression and Naive Bayes, increased accuracy by more than roughly 2% on average.
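The exact dictionary-generation procedure is not detailed here, but the standard American Soundex algorithm it relies on, which assigns the same 4-character code to similar-sounding words, can be sketched as follows:

```python
# Standard American Soundex: keep the first letter, encode the rest by
# consonant group, drop vowels/H/W/Y, collapse runs of equal codes,
# and pad the result to four characters.
def soundex(word: str) -> str:
    groups = {"BFPV": "1", "CGJKQSXZ": "2", "DT": "3",
              "L": "4", "MN": "5", "R": "6"}

    def code(ch):
        return next((v for k, v in groups.items() if ch in k), "")

    word = word.upper()
    first, prev, tail = word[0], code(word[0]), ""
    for ch in word[1:]:
        if ch in "HW":                 # H and W do not break a run of equal codes
            continue
        cur = code(ch)
        if cur and cur != prev:        # vowels reset prev, so repeats re-count
            tail += cur
        prev = cur
    return (first + tail + "000")[:4]
```

A slang dictionary can then map an informal spelling to a formal word whenever the two share a Soundex code, e.g. soundex("Robert") == soundex("Rupert") == "R163".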
Open Source Project : POTML and POTDL Libraries
Description :
POT was our team name when Dr. Afiyati, S.Si., MT. and I conducted joint research. The initial idea came about because we needed a workflow for creating machine learning and deep learning models with more efficient code, plus several techniques we could develop together. The POTML library implements several helper functions, such as: statistical analysis, preprocessing pipelines, data-scaling pipelines, encoding and transformation, feature selection, hyperparameter tuning, and model evaluation. POTDL (PyTorch based) implements several classes and helper functions, such as: linear and convolutional block layers, configuration, optimizers, training arguments for training and saving models, and model evaluation. It is as easy to use as tf.keras while keeping PyTorch's flexibility in architecture and training.
Open Source Project : AutoBoosting (Automatic Machine Learning)
Description :
AutoBoosting complements POTML and POTDL. With it, we can create a baseline model quickly with just two lines of code, after which it automates the whole process: feature engineering, the boosting algorithm (XGBoost and LightGBM), and hyperparameter tuning (random, Bayesian, or Optuna). After training completes, it outputs training, validation, and testing scores and recommends the best algorithm. It can handle several tasks: classification, regression, multi-output classification and regression, text classification, and mixed data types (numeric, categorical, and text) for training.
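The workflow AutoBoosting automates (fit a boosting model, search hyperparameters, report scores) can be approximated with scikit-learn; this is a stand-in sketch, not AutoBoosting's actual API, and uses sklearn's GradientBoostingClassifier in place of XGBoost/LightGBM:

```python
# Sketch of what an automated boosting baseline does internally:
# random hyperparameter search over a boosting model, then held-out scoring.
# (scikit-learn stand-in; AutoBoosting itself wraps XGBoost/LightGBM.)
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions={"n_estimators": [50, 100],
                         "max_depth": [2, 3],
                         "learning_rate": [0.05, 0.1]},
    n_iter=4, cv=3, random_state=0)
search.fit(X_train, y_train)                  # tuning on the training split
test_score = search.score(X_test, y_test)     # final held-out score
```

An AutoML wrapper essentially reduces this whole block to a fit call plus a report call.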
2. Pretraining the BERT Model with a Modified Attention Formula
Client : Dr. Meilany Nonsi Tentua, S.Si, MT.
Email : meilany@upy.ac.id
Description :
In this project, Dr. Meilany Nonsi Tentua, S.Si, MT. and I conducted research together, starting with collecting data from Indonesian and English Wikipedia and from Twitter to pretrain a BERT model. A job-vacancy dataset from indeed.com (obtained by web scraping) was then used to fine-tune the BERT model on a named entity recognition (NER) task. Pretraining was carried out on two architectures: the original BERT and BERT with a modified attention formula. From the fine-tuning results of the two models, modifying the attention formula in BERT increased the F1 score by more than roughly 1% compared with the unmodified BERT model.
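The modified formula itself is not given here; for reference, the original scaled dot-product attention that was modified is Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, sketched in NumPy:

```python
# Reference implementation of original scaled dot-product attention,
# the formula the project modified: softmax(Q K^T / sqrt(d_k)) V.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # row-wise softmax
    return weights @ V, weights                          # context vectors, attention map

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))   # 3 queries, head dim 4
K = rng.normal(size=(5, 4))   # 5 keys
V = rng.normal(size=(5, 4))   # 5 values
out, attn = scaled_dot_product_attention(Q, K, V)
```

Each row of the attention map sums to 1, so the output is a convex combination of the value vectors.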
Pretraining a Multilingual BERT Model (Indonesian and English)
Client : Dr. Suprapto, M.I.Kom. (Associate Professor)
Email : sprapto@ugm.ac.id
Description :
In this project, Dr. Suprapto, M.I.Kom. and I conducted research together, starting with collecting data from Indonesian and English Wikipedia and from Twitter to pretrain a BERT model. A dataset of reviews of various applications from the Play Store was then used to fine-tune the BERT model on a sentiment analysis task. The Play Store reviews mix Indonesian and English, and this BERT model was created with the aim of recognizing the mixture of the two languages. In the fine-tuning evaluation, it obtained good sentiment analysis results with an accuracy of 88%. In future research, we plan to create a pretrained GPT-2 model that can recognize more than one language.
Using Narrative Context to Predict the Trajectory of Sentence Sentiment
Client : Maya Rini Handayani, M.Kom
Description :
In this project, Maya Rini Handayani, M.Kom. and I conducted research to predict the sentiment of narrative text on a scale from -1 to 1. The dataset came from chapters 1 to 9 of a Harry Potter novel, and we used the English BERT architecture (bert-base-uncased). The experiments produced a poor correlation of about 0.03. The research still suffers from a lack of data, so improving performance would require collecting considerably more data.
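The correlation metric reported above can be computed with NumPy; the arrays below are illustrative values, not the actual experiment outputs:

```python
# Evaluating sentiment-trajectory predictions on a [-1, 1] scale with
# Pearson correlation. The arrays are illustrative placeholders.
import numpy as np

y_true = np.array([-0.8, -0.2, 0.1, 0.5, 0.9])   # gold sentiment per sentence
y_pred = np.array([-0.5, 0.3, -0.1, 0.2, 0.6])   # model predictions

r = np.corrcoef(y_true, y_pred)[0, 1]            # Pearson correlation coefficient
```

A value near 0, as in the experiment above, means the predictions carry almost no linear relationship to the gold trajectory.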
Gas Detection Based on Images Using CNN
Client : Umi Salamah, S.Si., M.Si.
Email : umi.salamah@fisika.uad.ac.id
Description :
In this project, Umi Salamah, S.Si., M.Si. and I conducted research to detect whether an image contains SO2, CO2, or a mixture of both. We ran several experiments with CNN models such as MobileNetV2 and DenseNet169. The best result came from a DenseNet169 model trained from scratch, which reached a testing accuracy of 80%.
3. Document Summarization Using BERT-KMeans
Client : Agus Mulyanto, S.Si., M.Kom., ASEAN Eng.
Description :
In this project, I created a text summarization model using a pretrained Indonesian BERT model combined with K-means clustering. BERT is used to encode paragraphs (collections of words) into vectors. A clustering step is then performed with K-means, and the paragraph closest to the center of each cluster is used as a representative of the document, making it more concise.
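A minimal sketch of this embed-cluster-select pipeline, with TF-IDF vectors standing in for the BERT embeddings used in the actual project:

```python
# Extractive summarization sketch: embed text units, cluster with K-means,
# keep the unit closest to each cluster center. TF-IDF stands in here for
# the Indonesian BERT embeddings used in the real project.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

paragraphs = [
    "The model is trained on Indonesian text.",
    "Training data comes from Indonesian articles.",
    "K-means groups similar paragraphs together.",
    "Clustering finds groups of related paragraphs.",
]
X = TfidfVectorizer().fit_transform(paragraphs).toarray()
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

summary = []
for c in range(km.n_clusters):
    idx = np.where(km.labels_ == c)[0]              # members of cluster c
    center = km.cluster_centers_[c]
    best = idx[np.argmin(np.linalg.norm(X[idx] - center, axis=1))]
    summary.append(paragraphs[best])                # most central paragraph
```

Swapping the vectorizer for sentence-level BERT embeddings recovers the BERT-KMeans setup described above.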
Panorama Detection
Github : github.com/AfrizalSeptiansyah/Panorama-Detection
In this project, I used the pretrained MobileNetV2 model to detect objects in panoramic images across 5 categories: desert, mountains, sea, sunset, and tree (a multilabel case).
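In the multilabel setting, one image may carry several of the five labels at once, so targets are encoded as binary indicator vectors; a small sketch with scikit-learn (the sample label sets are illustrative):

```python
# Multilabel target encoding for panorama detection: each image may have
# several of the five labels, so targets become 0/1 indicator vectors.
from sklearn.preprocessing import MultiLabelBinarizer

classes = ["desert", "mountains", "sea", "sunset", "tree"]
samples = [{"sea", "sunset"}, {"mountains", "tree"}, {"desert"}]  # toy labels

mlb = MultiLabelBinarizer(classes=classes)
Y = mlb.fit_transform(samples)
# Y[0] is [0, 0, 1, 1, 0]: the first image contains both sea and sunset
```

The model's final layer then uses one sigmoid output per class instead of a single softmax.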
Customer Churn Prediction
Github : github.com/AfrizalSeptiansyah/Customers_Churn_Prediction
In this project, I used the XGBoost algorithm to predict whether customers will churn: yes or no (an imbalanced dataset). I identified important features that increased prediction accuracy.
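For the imbalance, XGBoost's usual remedy is scale_pos_weight = n_negative / n_positive; the same idea is sketched below with a scikit-learn stand-in on synthetic data (not the project's dataset or exact setup):

```python
# Handling class imbalance in churn prediction. XGBoost would use
# scale_pos_weight = n_negative / n_positive; scikit-learn's
# class_weight="balanced" applies the equivalent reweighting.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# synthetic stand-in: ~90% non-churn (0), ~10% churn (1)
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)
scale_pos_weight = (y == 0).sum() / (y == 1).sum()   # value XGBoost would take

clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)
recall_churn = ((clf.predict(X) == 1) & (y == 1)).sum() / (y == 1).sum()
```

Without the reweighting, a model on such data tends to predict the majority "no churn" class almost everywhere.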
Customer Segmentation
Github : github.com/AfrizalSeptiansyah/Costumers-Segmentation
In this project, I used the k-means algorithm to cluster customers into several segments based on shared characteristics.
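The segmentation step can be sketched as follows; the two features and their values are illustrative stand-ins for the project's customer attributes:

```python
# Customer segmentation sketch: scale the features, then cluster with
# k-means. The two-feature toy data is illustrative only.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# toy features: [annual spend, visits per month] for two loose groups
customers = np.vstack([rng.normal([200, 2], [30, 1], size=(20, 2)),
                       rng.normal([1200, 12], [100, 2], size=(20, 2))])

X = StandardScaler().fit_transform(customers)        # k-means is scale-sensitive
segments = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
```

Scaling first matters because k-means uses Euclidean distance, so an unscaled high-magnitude feature like annual spend would dominate the clustering.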
Forecasting Consumer Price Index
Github : github.com/AfrizalSeptiansyah/Consumer-Price-Index
In this project, I used fbprophet to forecast the consumer price index (a cyclical problem) through the end of 2021.