Christine Straub
MACHINE LEARNING ENGINEER
323-356-6300 Los Ángeles christinestraub@gmail.com | https://github.com/christinestraub
SKILLS
Python
Data Engineering
Natural Language Processing
Deep Learning
Computer Vision
OCR | Document Parsing
Bot Development
Statistical Analysis
Data Analysis
Amazon Web Services
Google Cloud Platform
Google Cloud
Cloud Computing
Cloud Architecture
Data Warehousing
BigQuery
Big Data
MEAN/MERN/MEVN
Django / Flask
MySQL
Elasticsearch
Algorithms
Algo Trading
Database Design
Agile Methodology
Lean Management
LANGUAGES
English
Tagalog
PROFILE
Experienced Software Engineer with expertise in the design, development, testing, and
maintenance of software systems. Proficient in a range of platforms, languages, and
embedded systems, and experienced with current development tools and procedures.
Able to self-manage effectively on independent projects and to collaborate as part of
a productive team. Experience across Natural Language Processing, Computer Vision,
Deep Learning, and bot development projects.
EDUCATION
Computer Science at University of California, Berkeley, Berkeley
August 2014 — May 2017
IEEE Instructor for Micromouse EE class
Machine Learning @ Berkeley
UC Berkeley - Automation Sciences Lab
Cognitive Science at University of California, Berkeley, Berkeley
August 2014 — May 2017
Machine Learning at Stanford University
January 2018 — April 2018
Deep Learning Specialization at Stanford University
April 2018 — September 2018
Certified Scrum Product Owner at Scrum Alliance, San Francisco
August 2018 — October 2018
EMPLOYMENT HISTORY
Machine Learning Engineer | Data Scientist at Upwork, Remote
March 2017 — Present
Developed products and managed 60+ projects through Upwork.
Deployed Call Segmentation and Sentiment Analysis models using BERT.
Worked on several NLP and AI chatbot development projects that analyze and process
text with NLP, NLU, and deep learning techniques.
Developed a Python component based on scikit-learn that computes predictions for numeric
data stored in an Elasticsearch index, processing historical values and returning future
estimates, for Energy Log Server.
Developed a strategy engine for trading bots, built on LSTM and other machine learning
and statistical algorithms.
Used word embedding vectors with deep learning (e.g., BERT or GloVe) to train models
that determine the semantic similarity of sentences and paragraphs in the data sets.
Implemented prediction models that take the number of predictions to return as an input
parameter (one sample per timeframe): RNN (LSTM), ANN (multilayer perceptron), and
ARIMA (autoregressive integrated moving average).
Part of the team that built a scalable serverless Data Warehouse on top of Google BigQuery
that provided in-depth data exploration and visualization of customer data using integrated
BI tools. The system presented holistic data governance support where the infrastructure
was capable of handling terabytes of data.
Designed the ETL architecture and deployed it on Talend and Google Cloud Platform (GCP).
Implemented the data warehouse in a layered approach: Ingestion, Staging, Transformation,
and Aggregation layers.
Provided support in enhancing and improving Data Models (DIMs and FACTs) based on
user requirements.
Implemented an ML pipeline in TensorFlow, with feature engineering done using
tf.transform() and deployment on the fully managed AI Platform.
Developed an ETL pipeline using Apache Beam and Cloud Dataflow, orchestrated with
Cloud Composer (Apache Airflow).
Worked on biometric recognition using machine learning, computer vision, and
signal (time series) analysis.
Worked on geo-image (satellite, drone, aerial) processing, analysis, and remote
sensing with computer vision and AI technologies, using techniques such as geometric
calibration, radiometric calibration / sensor sensitivity calibration, multi-channel
arithmetic (including across different resolutions, e.g. IR channel + RGB channel), 3D
reconstruction, hyperspectral / SAR data analysis, object detection, segmentation with a
neural network, and crop monitoring/management/prediction.
Built large-scale data warehouse solutions (on-premises and cloud), with exposure to
building distributed data pipelines using Spark SQL and Apache Beam.
Hands-on experience with data integration tools such as Talend, Informatica, Apache NiFi,
Azure Data Factory, and Cloud Data Fusion.
Google Data Engineer at Collegis Education, Remote
May 2021 — Present
Created near-real-time, event-based ETL pipelines using Google Cloud Functions to fetch
data from external APIs such as PhoneBurner and Canvas LMS.
Enabled CI/CD using GitHub and GitHub Actions.
Created ETL pipelines in Fivetran to integrate data from multiple sources, e.g. SQL Server,
Google Sheets, and Salesforce. Enabled automatic scaling and handled schema evolution.
Used dbt models, seeds, and exposures to transform and document data in BigQuery.
Created data lineage using dbt.
Made report-ready data marts available in ThoughtSpot for building dashboards.
Extracted audio calls from PhoneBurner, converted them to text using Cloud Speech-to-Text,
and performed sentiment analysis using the Cloud Natural Language API.
Senior Machine Learning Engineer at Inxeption, Remote
March 2021 — December 2021
Created visualizations of ML model output and descriptive data using the D3.js library
(e.g., interactive maps showing orders and visitors, where changing the map updates
the rest of the data).
Built a data pipeline to ingest data feeding utility and internal websites, using Python
scripts, AWS Kinesis, and Apache Spark.
Developed utility and internal websites for the company. The tools estimate shipping
prices for FTL, LTL, and domestic shipments, for both internal use (marketing and sales
teams) and external use (providing service to users).
Built a data pipeline for data analysis using Python scripts, AWS Elasticsearch/Kibana,
and S3.
Fixed 80+ bugs to ensure smooth delivery and functioning of the applications.
Conducted testing and maintained code quality.
Collaborated with a team of 15 engineers to define, design, and ship new features and
fix bugs.
Natural Language Processing Engineer at PlusOne, Remote
January 2019 — June 2021
Deployed Call Segmentation and Sentiment Analysis models using BERT.
Built an NLP worker engine that extracts key phrases from thousands of voice calls daily
using unsupervised machine learning techniques.
Implemented unsupervised keyphrase extraction algorithms such as TopicRank, TextRank,
YAKE, and AutoPhrase. Extracted keyphrases from call text, then passed them through
ensembles and various pre- and post-processing steps to aggregate the top key phrases.
Improved the Cloud Memorystore setup used to speed up the NLP workers.
Utilized Google Cloud AutoML and AutoML Tables for building keyphrase and call
segmentation models on human-annotated data.
Worked on automating model training, evaluation, serving, and retraining.
Performed automated sentiment analysis using the Google Natural Language API and
VADER. The sentiments generated by these tools were fed to BERT to train a custom
sentiment analysis model.
Researched, prototyped (from research papers), built features for, and optimized
(via hyperparameter tuning) state-of-the-art machine learning and deep learning
techniques such as SVM, Logistic Regression, Random Forest regression, LSTM, and CNN,
using scikit-learn, Keras, and TensorFlow on CPU/GPU environments for student
text classification.
Built an ETL pipeline to ingest data from Cloud SQL into BigQuery.
Built dashboards for analyzing keyphrase data in Google Data Studio, connected via
BigQuery.
Performed pre-processing and data preparation for ML models in BigQuery using complex
SQL transformations and joins.
Stack / techniques / libraries: Python, NLP, Dialogflow, spaCy, scikit-learn, BERT.
Google Cloud Platform: BigQuery, Cloud SQL, App Engine, Cloud Storage, Compute
Engine, Cloud Functions, Cloud Memorystore, Cloud Scheduler, Cloud Run, Kubernetes
Engine, Artifact Registry, Source Repositories, Pub/Sub, AI Platform, AI Notebooks, Cloud
Natural Language, AutoML Tables, Cloud AutoML.
Software Development Engineer III at Risk Management Solutions - Moody’s, Newark,
Silicon Valley
February 2018 — April 2021
Worked productively with the Product Team to understand requirements and business
specifications around Portfolio Management, Analytics, and Risk Management.
Effectively coded software changes and alterations based on specific design specifications.
Modified HTML, JavaScript, and CSS web pages to optimize the page's performance for
faster loading and browsing.
Designed and developed an automation framework for functional and regression testing using
JavaScript, CoffeeScript, Java, Selenium, REST Assured, Maven, TestNG, JUnit, and Postman.
Developed and loaded test data into test environments.
Designed Location Intelligence products and other insurance products. Extensive experience
in preparing test strategies, test plans, test scenarios, test cases, and test scripts based on
user requirements and system requirements.
Worked extensively on creating data-driven, modular-driven, and Page Object Model
frameworks.
INTERNSHIPS
Undergraduate Researcher, UC Berkeley, Berkeley
January 2016 — June 2017
Participated in implementing automation of surgical subtasks in Robot-Assisted Minimally
Invasive Surgery.
Designed ROS objects for visualizing trajectories before executing them on the robot and
implemented a utility for the visualization.
Software Engineering Intern, SAP - Concur, Bellevue
July 2015 — September 2015
Designed and implemented software that detects and displays anomalous credit card spending
by performing stream analysis of the data flowing through the data pipeline.
Implemented a scalable low-latency architecture leveraging a broad range of frameworks,
such as Kafka, ZooKeeper, Spark, HBase, HDFS, Flume, Spark Streaming, MapReduce, and
Impala.
SCHOLARSHIPS
University of California - Berkeley, Berkeley
August 2015 — May 2017
Recipient of Achievement Award Program TAAP @ CAL
Received University of California - Berkeley Undergraduate Scholarship
President's Distinguished Honor Award
Alpha Gamma Sigma 4.0 Honor Roll Award
Maseke Inc Scholarship Recipient