Hello! I'm Jasmine (Anh) Pham
I'm a versatile Data Scientist with 4 years of experience in big data analytics, visualization, ML/AI, and data engineering.
Data science is a perfect blend of what I'm good at and what I'm passionate about. While I have a knack for quantitative work, what ultimately drives me is the desire to make positive and meaningful impacts.
My specialty lies in deriving actionable insights from unstructured data to drive strategic decisions and tackle real-world problems. This can manifest in various forms, from an automated dashboard that detects anomalies in a recruiting pipeline, to an in-depth analysis that exposes racial disparities, or a machine learning model that predicts green energy production to optimize resource planning.
I've led several end-to-end data science projects with tangible impact, such as reducing hospital admissions by 24% and saving patients $30M in healthcare costs, or designing and analyzing marketing metrics that led to successful market expansion into three new cities. Right now, I’m building a chatbot that empowers professors with real-time, data-driven insights to improve student learning outcomes. These outcomes not only highlight my technical capabilities and adaptability but also attest to my commitment to driving data-driven growth and making a positive impact.
If you’re interested, feel free to scroll down to see some demos of my work or visit the Professional Experience tab to learn more about my past projects and impacts.
PROJECTS
Analysis of Racial Disparities in Housing
When I was at the Delaware Data Innovation Lab, I worked on a project that analyzed the magnitude and impact of evictions during the pandemic in the state of Delaware. Specifically, I gathered eviction filing data and mapped every address in Delaware to its corresponding census tract to understand how distance and accessibility to courts might have influenced a tenant's vulnerability to eviction.
Tools: R, API
Domain: Data Mining, Data Analysis, Visualization
Breast Cancer Detection
This project aims to predict breast cancer from digitized image readings of patients' fine-needle aspirates using several machine learning models in R. The best-performing model correctly classifies patients with and without breast cancer more than 95% of the time. It also has a lift of 2.2 and an AUC of 99.1%, indicating a strong ability to distinguish between a benign lump and a malignant tumor.
Tools: R
Domain: ML/AI (Logistic Regression, Random Forest, Gradient Boosting Machines), Data Analysis, Visualization
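For readers curious how the three metrics above are computed, here is a minimal Python sketch. It is illustrative only: the project itself was built in R, the labels and scores below are made-up toy values, and the 20% cutoff used for lift is an assumption, since lift can be reported at different cutoffs.

```python
# Toy data (hypothetical): 1 = malignant, 0 = benign, with model scores in [0, 1].
y_true = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
y_score = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]


def accuracy(labels, scores, threshold=0.5):
    """Share of cases classified correctly at the given score threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == t for p, t in zip(preds, labels)) / len(labels)


def auc(labels, scores):
    """Probability that a random positive case outscores a random negative one."""
    pos = [s for t, s in zip(labels, scores) if t == 1]
    neg = [s for t, s in zip(labels, scores) if t == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def lift(labels, scores, top_fraction=0.2):
    """Positive rate among the top-scored cases divided by the overall positive rate."""
    n_top = max(1, round(len(labels) * top_fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(t for _, t in ranked[:n_top]) / n_top
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


acc_value = accuracy(y_true, y_score)
auc_value = auc(y_true, y_score)
lift_value = lift(y_true, y_score)
print(acc_value, auc_value, lift_value)
```

A lift above 1 means the highest-scored cases contain a larger share of malignant tumors than the dataset as a whole, which is why it is often reported alongside accuracy and AUC.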
Energy Use Dashboard
A comprehensive and interactive view of Boston University's energy consumption and energy use intensity. This dashboard helps university administrators track energy usage patterns, identify inefficiencies, and measure the impact of sustainability initiatives. Its insights empower decision-makers to reduce operational costs, improve resource management, and meet the university's environmental goals more effectively.
Tools: Power BI
Domain: Visualization
LLM-based Topic Modeling
As an Applied AI Scientist Intern at Boston University, I developed an automated, modular Python pipeline to extract policy insights from 300+ unstructured documents, using NLP techniques including tokenization, lemmatization, and topic modeling (BERTopic). This project aims to give policymakers, researchers, and institutions comprehensive insight into where organizations agree and differ in managing and policing the use of Generative AI.
Tools: Python, LangChain, API, Chainlit
Domain: LLM, NLP, Data Engineering, Analysis, Visualization
AI-Powered Energy Prediction App
An AI-powered app that forecasts solar energy production based on up-to-date weather conditions. Accurate green energy forecasting can help energy grid operators efficiently manage the integration of renewable energy sources, reduce reliance on fossil fuels, and minimize energy waste.
Tools: Python (scikit-learn, PyTorch), API
Domain: ML/AI (AdaBoost, Linear Regression), Data Engineering, Visualization
As much as I love working professionally with Power BI, I know that Tableau is also a powerful business intelligence tool, one I had never had the chance to explore in depth. So I decided to use this quarantine time to teach myself more about Tableau. This Sales Dashboard, built on data from Red30 Tech, is my early attempt to master Tableau.
Tools: Tableau
Domain: Visualization
AI Assistant for Professors
As an Applied AI Scientist Intern at Boston University, I'm building a RAG chatbot to empower professors with real-time, data-driven insights so they can address student concerns and improve course quality in a timely manner. The goal is to make the chatbot scalable so that it can be easily tailored to and deployed in any course.
Tools: Python, LangChain, Hugging Face, API, Chainlit, LiteralAI, TruLens
Domain: LLM, RAG, NLP
Weather Forecast & Automated Trading
Recurrent neural networks (LSTM) trained on weather data to predict daily temperatures. These predictions are then fed into Kalshi’s API to automatically execute event-based trading strategies.
Tools: Python (scikit-learn, TensorFlow, matplotlib), API
Domain: ML/AI (Recurrent Neural Network), Time Series
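Before a recurrent network can learn from a temperature series, the series is typically reshaped into fixed-length input windows, each paired with the next day's value as the target. The sketch below shows that windowing step in plain Python. The function name `make_windows`, the window length, and the temperature values are hypothetical and are not taken from the project's code.

```python
def make_windows(series, window):
    """Split a series into (past `window` values, next value) training pairs."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets


# Made-up daily temperatures, in degrees Fahrenheit.
daily_temps = [61.0, 63.5, 62.0, 65.0, 66.5, 64.0]
window_inputs, window_targets = make_windows(daily_temps, window=3)

for past, nxt in zip(window_inputs, window_targets):
    print(past, "->", nxt)
```

Each input window would then be fed to the network as one sequence, with the paired target as the value to predict.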
We were given a week to create a plan for Boeing to expand production and ensure on-time delivery of the Chinook orders from both the United Kingdom and the United States Army. We devised a strategy that applies LEAN techniques to supply chain management, human resources, and factory design. With our solutions, Boeing could save $310M in manufacturing, supplies, and labor costs while increasing product quality by 36% and avoiding layoffs.
GALLERY
I know how much people like free samples, so scroll down for a taste of what I do besides work.












