I am an accomplished data science and consulting professional with 5+ years of experience in data analysis and visualization. As a recent graduate of Georgetown's Data Science and Analytics master's program, I am currently looking for a full-time position as a data scientist or data engineer. I love to learn and be challenged. I am fascinated by advanced analytics techniques and am passionate about helping organizations find strategic, actionable meaning in their data.
In my free time, I like to travel, run, play video games, and pamper my cat Zack. Ask me about my recent backpacking trip in Europe!
This research uses all Reddit activity data, including posts and comments, from January 2021 to August 2022 to extract business insights for Glossier, a cosmetics brand that operates primarily in the e-commerce space. Ten granular business questions, grouped by consumer, demand, product, and competition, are explored and answered with NLP and ML techniques using Spark in Azure Databricks. Sentiment analysis and topic modeling with LDA are conducted to inform future product strategy. External data (Google search and Covid data) is also incorporated to determine whether sentiment can be predicted. This project is important because it serves as an example of analysis for companies that want to gain a greater market share without relying on easily accessible customer tracking data purchased from third parties. Most importantly, Glossier will gain insights to determine optimal launch and marketing strategies amid its high growth and physical presence expansion.
Gender discrimination is a prominent human rights issue that manifests itself via job segregation, employment inequity, pay gaps, and lack of freedom in career choice for women around the world. This research offers a more general toolset to examine gender inequities in education, entrance into the labor force, and within the labor force by location and over time. Developed by women in tech, this analysis specifically focuses on gender disparities in STEM and assesses how test scores, entrance into professional fields, labor participation, parental leave, and salaries differ for men and women.
Online hate speech has unfortunately become increasingly prevalent, particularly on social media platforms. Twitter can be used as a tool to spread misinformation and hate under the protection of anonymity behind a screen. This analysis explores research conducted to build a model that detects tweets containing hate speech.
This research is an in-depth assessment of whether it is possible to predict the winner of a horse race. It is a direct response to the Big Data Derby 2022 Kaggle competition, which aims to help owners, trainers, and veterinarians improve equine welfare and assess competition strategy. An ANN model that processes numerical, categorical, and text data was built with the Keras functional API to predict the target variables. Its output is then compared to traditional machine learning methods such as random forest, logistic regression, and gradient boosting.