Hello! If you’ve clicked on this, we probably share a similar curiosity about the world — and I’m glad you’re here. I’m a sophomore at Wellesley College studying Computer Science and Mathematics, with a strong interest in coding, modeling, and developing quantitative trading strategies. I’m always excited to explore new ideas and collaborate on interesting projects, so feel free to reach out if you’d like to connect or work together.
*Click here to see project details.
My team and I researched how hog futures could reduce price volatility in China’s pig market, which has long faced cyclical fluctuations. Using the herding effect and the cobweb model, we showed how irrational decisions and supply lags drive instability. We argued that hog futures provide essential functions (price discovery, hedging, vertical integration, and support for large-scale production) that can stabilize the market. Through theoretical analysis and a case study of Muyuan Food Co., we found that major producers had begun using futures to manage risk and inform production. However, high capital barriers and technical complexity limited participation by small-scale farmers. Although China’s hog futures market was still nascent, we concluded that it held strong potential to enhance market stability, reduce volatility, and improve long-term risk management across the supply chain.
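For readers unfamiliar with the cobweb model, here is a minimal sketch of how a supply lag alone can produce price cycles. It assumes linear demand and supply curves, and the function names and parameter values are purely illustrative; none of them come from our research or from real hog-market data.

```python
def cobweb_prices(a, b, c, d, p0, periods):
    """Simulate prices in a linear cobweb model (illustrative assumptions).

    Demand:  Q_d(t) = a - b * P(t)
    Supply:  Q_s(t) = c + d * P(t-1)   # producers react to last period's price
    Market clearing each period gives P(t) = (a - c - d * P(t-1)) / b.
    """
    prices = [p0]
    for _ in range(periods):
        prices.append((a - c - d * prices[-1]) / b)
    return prices


def equilibrium_price(a, b, c, d):
    """Price at which demand equals supply with no lag effect."""
    return (a - c) / (b + d)


# Hypothetical parameters: supply is less price-sensitive than demand (d < b),
# so the oscillation around the equilibrium of 30 shrinks over time.
print(cobweb_prices(100, 2, 10, 1, 40, 3))  # [40, 25.0, 32.5, 28.75]

# With d > b, the same mechanism makes each swing larger than the last.
print(cobweb_prices(100, 2, 10, 3, 20, 3))  # [20, 15.0, 22.5, 11.25]
```

In this simplified setting, prices overshoot in alternating directions because each period's supply is decided from the previous period's price. Whether the swings shrink or grow depends on the ratio d / b.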
*Click here to see project details.
During a 2-day hackathon, my friends and I built EcoBear, a web widget that bridges the gap between consumers’ sustainable intentions and their actual shopping behavior. Most shoppers wanted to buy green but lacked clear, trustworthy information, were overwhelmed by greenwashing, and often defaulted to convenience. Our tool integrated directly into shopping websites, automatically scanning products and displaying instant sustainability ratings using color-coded labels: green for sustainable, red for unsustainable. Users could click to learn more about certifications and evaluation criteria.
To drive engagement, we gamified the experience with rewards and a friendly polar bear mascot. Eco-conscious users appreciated the badges and rewards, while casual shoppers responded well to clear alternatives and incentives. From our testing, we learned to make sustainability cues more prominent and reduce guilt-driven nudging.
*Click here to see project details.
As a participant in the WiDS Datathon 2025, I worked with my teammates to analyze fMRI brain imaging data from over 1,000 pediatric subjects, predicting biological sex and ADHD diagnosis to uncover how biological sex influences the presentation of ADHD. We engineered statistical features and applied principal component analysis (PCA) for dimensionality reduction. Using Python and R, we built and evaluated predictive models, including convolutional neural networks (CNNs), random forests, and logistic regression, contributing to research into sex-specific neurobiological patterns associated with ADHD.
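As a rough illustration of what PCA does, the sketch below finds the first principal component of a tiny made-up dataset using power iteration on the covariance matrix. This is a dependency-free teaching example, not our datathon pipeline; the helper names and the data are hypothetical, and in practice a library routine such as scikit-learn's `PCA` would be the usual choice.

```python
import math


def center(rows):
    """Subtract each column's mean so every feature has mean zero."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [[r[j] - means[j] for j in range(dims)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, dims = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]


def first_component(cov, iterations=200):
    """Approximate the top eigenvector of cov by power iteration.

    Assumes cov is non-zero and the starting vector is not orthogonal
    to the top eigenvector.
    """
    dims = len(cov)
    v = [1.0 / math.sqrt(dims)] * dims
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered, component):
    """Reduce each row to a single score along the given component."""
    return [sum(x * c for x, c in zip(row, component)) for row in centered]


# Made-up data lying on the line y = 2x: one direction explains all variance.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered = center(rows)
component = first_component(covariance(centered))
scores = project(centered, component)  # 2 features reduced to 1 per row
```

Here the two correlated features collapse into one score per subject, which is the same idea applied at much larger scale when reducing high-dimensional imaging features.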
*Click here to see project details.
I led a team of four to create a full hackathon platform from scratch — including the main site, past event archive, merch designs, and all visual assets. It simplified everything from registration to judging and ran smoothly with 100+ active users during the event.
*Click here to see project details.
Our team proposed a machine learning model that predicts the reading difficulty of educational texts more accurately than traditional formulas like Flesch-Kincaid or Lexile. These formulas rely on shallow features and often miss deeper semantic and syntactic complexity. Using datasets such as the CommonLit Readability Prize and CLEAR Corpus, we planned to compare baseline models (Ridge, Lasso) with more advanced approaches such as TF-IDF with LightGBM and BERT embeddings. Our model aimed to incorporate cohesion and syntactic features, and we planned to evaluate it using RMSE, to improve both accuracy and transparency in readability assessment for grades 3–12.
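To show what "shallow features" means here, the sketch below computes the Flesch-Kincaid grade level, which uses only words per sentence and syllables per word, alongside the RMSE metric. The syllable counter is a crude vowel-group heuristic chosen for illustration, and none of this code is from our proposal.

```python
import math
import re


def count_syllables(word):
    """Rough heuristic (an assumption): count runs of vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level from sentence length and syllable counts."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


def rmse(predicted, actual):
    """Root mean squared error between predicted and true difficulty scores."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(actual))


easy = flesch_kincaid_grade("The cat sat.")
hard = flesch_kincaid_grade(
    "Photosynthesis converts electromagnetic radiation into biochemical energy."
)
print(easy < hard)  # True: longer words push the grade level up
```

Because the formula sees only word and sentence lengths, two texts with similar counts receive similar scores regardless of vocabulary rarity, cohesion, or syntax, which is the gap a learned model is meant to close.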
*Click here to see project details.
My colleague and I at MIT CSAIL developed a scalable filtered ANN system that integrates graph-based indices and IVF structures to efficiently handle up to 100 million vectors and over 200,000 labels. Our system achieves 5× faster query speeds at 90% recall compared to ParlayIVF2 on SIFT100M and YFCC10M benchmarks, outperforming Filtered DiskANN, NHQ, and UNG. We optimized the C++ backend using SIMD filtering, cache-aware prefetching, flattened memory layouts, and bitset label masks, achieving near-linear scalability across 32 threads with dynamic scheduling and query reordering. I am currently leading independent research on memory-efficient vector quantization, integrating advanced algorithms like SymphonyQG and HNSW-Flash to further improve graph-based ANN performance.
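For context, filtered nearest-neighbor search returns the closest vectors among only those that carry the labels a query asks for. The sketch below shows the idea with an exhaustive scan and integer bitmasks standing in for bitset label masks. It is a toy illustration with hypothetical names and data, not our C++ system, and it uses no graph or IVF index.

```python
def label_mask(label_ids):
    """Pack a collection of small integer label ids into one bitmask."""
    mask = 0
    for label in label_ids:
        mask |= 1 << label
    return mask


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filtered_search(query, required_mask, vectors, masks, k):
    """Return indices of the k nearest vectors that have every required label.

    Exhaustive scan: a single bitwise AND per vector decides whether it
    passes the label filter before any distance is computed.
    """
    candidates = [
        (squared_distance(query, vec), idx)
        for idx, (vec, mask) in enumerate(zip(vectors, masks))
        if (mask & required_mask) == required_mask
    ]
    candidates.sort()
    return [idx for _, idx in candidates[:k]]


# Made-up data: four 2-D vectors, each tagged with a set of label ids.
vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.5, 0.5]]
masks = [label_mask([0]), label_mask([0, 1]), label_mask([1]), label_mask([2])]

# The nearest point overall is index 0, but only indices 1 and 2 carry label 1.
print(filtered_search([0.0, 0.0], label_mask([1]), vectors, masks, k=2))  # [1, 2]
```

An exhaustive scan like this is exact but scales linearly with the dataset, which is why approximate indices are needed at the 100-million-vector scale.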