Sheron Yang
I'm Sheron Yang.
I reply to messages with memes and emojis.
I make coffee.
I pull all-nighters gaming.
I also sometimes maybe code.

I find beauty in making the complicated feel effortless. My path blends research, engineering, and a spark for creative problem-solving — and I am always seeking new frontiers to learn, collaborate, and create technologies that leave a thoughtful mark.

LinkedIn
GitHub
Mail


See the source code of this site!


Research on ANN Filtered Vector Search
ML / DS - Affiliated with MIT CSAIL - Ongoing

Text Readability Prediction via Advanced NLP
ML / DS - Affiliated with Break Through Tech AI - Ongoing

Wellesley College Hackathon Official Site
Frontend - Ongoing

ADHD Diagnosis for Women
ML / DS - Kaggle "Women in Data Science" Hackathon Top 10% Students - 2025

EcoBear, a Web Widget
Frontend - Wellesley College Designathon Winner - 2024

China's Hog Futures Research
Paper - ICEMGD 7th - 2023

Subsidy Impact on China's EV Uptake
Paper - S.-T. Yau Science Award Nominated - 2022

The Impact of Subsidy on EV Adoption: Evidence from China


My team and I investigated how subsidy policies affected EV adoption in China, using city-month data from 15 cities between 2016 and 2019. We chose this period for its stable, consumer-focused policies. We applied panel regression and, to address endogeneity, instrumental variables, with the local-to-national subsidy ratio as the instrument. Results show that a 10,000 yuan increase in subsidies boosts EV sales by 6–11%, controlling for GDP, income, and charging infrastructure. While subsidies significantly increased adoption, we also found that local industrial dynamics and consumers’ expectations of future policies influenced results. Because data granularity and transparency were limited, we recommend that future research explore interactions among multiple policy tools and variations in consumer responses.
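To illustrate why an instrument helps with endogeneity, here is a minimal sketch on synthetic data. It is not the study's data or specification: the variable names, the simulated confounder, and the true effect of 0.5 are all assumptions made for illustration, and the panel structure and control variables are left out. With one regressor and one instrument, the instrumental-variable estimate reduces to a ratio of covariances.

```python
import random


def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def ols_slope(x, y):
    """Ordinary least squares slope of y on x (biased if x is endogenous)."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Instrumental-variable slope for one regressor x and one instrument z."""
    return covariance(z, y) / covariance(z, x)


def simulate(n=5000, true_effect=0.5, seed=0):
    """Synthetic data (hypothetical): a hidden confounder drives both x and y."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        instrument = rng.gauss(0, 1)   # affects y only through x
        confounder = rng.gauss(0, 1)   # unobserved, affects both x and y
        subsidy = instrument + confounder + rng.gauss(0, 1)
        sales = true_effect * subsidy + 2.0 * confounder + rng.gauss(0, 1)
        z.append(instrument)
        x.append(subsidy)
        y.append(sales)
    return z, x, y


z, x, y = simulate()
print("OLS estimate:", round(ols_slope(x, y), 3))   # pulled upward by the confounder
print("IV estimate: ", round(iv_slope(z, x, y), 3))  # close to the simulated effect
```

In this simulation the OLS slope overstates the effect because the confounder moves subsidies and sales together, while the IV slope uses only the variation in subsidies that comes from the instrument.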


How Will the Hog Futures Smooth Price Fluctuations in China’s Pig Market


My team and I researched how hog futures could reduce price volatility in China’s pig market, which has long faced cyclical fluctuations. Drawing on the herding effect and the cobweb model, we showed how irrational decisions and supply lags drive instability. We argued that hog futures provide essential functions (price discovery, hedging, vertical integration, and support for large-scale production) that can stabilize the market. Through theoretical analysis and a case study of Muyuan Food Co., we found that major producers had begun using futures to manage risk and inform production. However, high capital barriers and technical complexity limited participation by small-scale farmers. Although China’s hog futures market was still nascent, we concluded that it held strong potential to enhance market stability, reduce volatility, and improve long-term risk management across the supply chain.
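The supply-lag mechanism in the cobweb model can be sketched in a few lines. This is a textbook linear version, not the paper's calibration: the demand and supply coefficients below are hypothetical. Producers set this period's supply from last period's price, so the price overshoots the equilibrium in alternating directions, and whether the swings shrink or grow depends on the ratio of the supply slope to the demand slope.

```python
def cobweb_prices(a, b, c, d, p0, steps):
    """Simulate prices under a linear cobweb model.

    Demand:  D_t = a - b * p_t
    Supply:  S_t = c + d * p_{t-1}   (producers react to last period's price)
    Market clearing D_t = S_t gives p_t = (a - c - d * p_{t-1}) / b.
    """
    prices = [p0]
    for _ in range(steps):
        prices.append((a - c - d * prices[-1]) / b)
    return prices


def equilibrium_price(a, b, c, d):
    """Price at which demand equals supply with no lag effects."""
    return (a - c) / (b + d)


# Hypothetical coefficients: supply less price-sensitive than demand (d < b),
# so the oscillations die out.
stable = cobweb_prices(a=100, b=2, c=10, d=1, p0=20, steps=20)

# Hypothetical coefficients: supply more price-sensitive than demand (d > b),
# so each swing is larger than the last.
unstable = cobweb_prices(a=100, b=2, c=10, d=3, p0=17, steps=10)

print("stable path ends near:", round(stable[-1], 4))
print("unstable path, last deviation:",
      round(unstable[-1] - equilibrium_price(100, 2, 10, 3), 2))
```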


EcoBear, a Web Widget for Sustainability and Green Development


My friends and I built EcoBear in a 2-day hackathon. It is a web widget that bridges the gap between consumers’ sustainable intentions and their actual shopping behavior. Most shoppers wanted to buy green but lacked clear, trustworthy information, and were put off by greenwashing or swayed by convenience. Our tool integrated directly into shopping websites, automatically scanning products and displaying instant sustainability ratings using color-coded labels: green for sustainable, red for unsustainable. Users could click to learn more about certifications and evaluation criteria.
To drive engagement, we gamified the experience with rewards and a friendly polar bear mascot. Eco-conscious users appreciated the badges and rewards, while casual shoppers responded well to clear alternatives and incentives. From our testing, we learned to make sustainability cues more prominent and reduce guilt-driven nudging.


Unraveling the Mysteries of the Female Brain: Sex Patterns in ADHD


As a participant in the WiDS Datathon 2025, I worked with friends to analyze fMRI brain imaging data from over 1,000 pediatric subjects. Our goals were to uncover how biological sex influences the presentation of ADHD and to predict both biological sex and ADHD diagnosis. We engineered statistical features and applied principal component analysis (PCA) for dimensionality reduction. Using Python and R, we built and evaluated predictive models including convolutional neural networks (CNNs), random forests, and logistic regression, contributing to research into sex-specific neurobiological patterns associated with ADHD.
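As a rough illustration of the PCA step, the sketch below finds the first principal component of a small synthetic dataset by power iteration and projects the data onto it. It is not the datathon pipeline: the data, the function names, and the choice of power iteration (rather than a library routine) are assumptions made to keep the example dependency-free.

```python
import math
import random


def center(rows):
    """Subtract each column's mean so the data has zero mean per feature."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [[r[j] - means[j] for j in range(dims)] for r in rows]


def covariance_matrix(centered):
    """Sample covariance matrix of already-centered rows."""
    n, dims = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
            for i in range(dims)]


def first_component(cov, iterations=200):
    """Leading eigenvector of the covariance matrix via power iteration."""
    dims = len(cov)
    v = [1.0 / math.sqrt(dims)] * dims
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered, component):
    """Reduce each row to a single score along the given component."""
    return [sum(a * b for a, b in zip(row, component)) for row in centered]


# Synthetic data (hypothetical): feature 2 is roughly twice feature 1,
# feature 3 is small independent noise.
rng = random.Random(0)
rows = []
for _ in range(500):
    t = rng.gauss(0, 1)
    rows.append([t, 2 * t + rng.gauss(0, 0.1), rng.gauss(0, 0.1)])

centered = center(rows)
component = first_component(covariance_matrix(centered))
scores = project(centered, component)
print("first component:", [round(c, 3) for c in component])
```

The three features collapse to one score per subject that keeps most of the variance; those scores, rather than the raw features, would then feed a classifier such as logistic regression.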


WHACK 2025: Wellesley College Annual Hackathon's Official Website


Running a hackathon, leading a small team, and building this site ;) Coming soon!


Readability Assessment for Educational Texts Using Advanced NLP


We proposed a project to develop a machine learning model that more accurately predicts the reading difficulty of educational texts than traditional formulas like Flesch-Kincaid or Lexile. These formulas rely on shallow features and often miss deeper semantic and syntactic complexity. Using datasets such as the CommonLit Readability Prize and CLEAR Corpus, we planned to combine baseline models (Ridge, Lasso) with advanced approaches like TF-IDF with LightGBM and BERT embeddings. Our model aimed to incorporate cohesion and syntactic features, evaluated using RMSE, to improve both accuracy and transparency in readability assessment for grades 3–12.
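To show what "shallow features" means in practice, here is a small sketch of the Flesch-Kincaid grade-level formula, which depends only on words per sentence and syllables per word, together with the RMSE metric. The syllable counter is a crude vowel-group heuristic and the sample sentences are made up for illustration; neither comes from the project.

```python
import math
import re


def count_syllables(word):
    """Crude heuristic: count runs of consecutive vowels, at least one per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level from sentence length and syllable density only."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual scores."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


simple = "The cat sat. The dog ran."
dense = "Photosynthesis converts atmospheric carbon dioxide into carbohydrates."

print("simple text grade:", round(flesch_kincaid_grade(simple), 2))
print("dense text grade: ", round(flesch_kincaid_grade(dense), 2))
print("example RMSE:", round(rmse([0.0, 0.0], [3.0, 4.0]), 4))
```

Because the formula never looks at word meaning, cohesion, or syntax, two texts with the same word and sentence lengths get the same grade no matter how hard they actually are to read.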


Research on ANN Filtered Vector Search

*Project details coming soon.

I collaborated with a colleague at MIT CSAIL to develop a scalable filtered ANN system that integrates graph-based indices and IVF structures to efficiently handle up to 100 million vectors and over 200,000 labels. Our system answers queries 5× faster than ParlayIVF2 at 90% recall on the SIFT100M and YFCC10M benchmarks, and also outperforms Filtered DiskANN, NHQ, and UNG. We optimized the C++ backend using SIMD filtering, cache-aware prefetching, flattened memory layouts, and bitset label masks, achieving near-linear scalability across 32 threads with dynamic scheduling and query reordering. I am currently leading independent research on memory-efficient vector quantization, integrating advanced algorithms like SymphonyQG and HNSW-Flash to further improve graph-based ANN performance.
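The idea of a bitset label mask can be sketched as follows. This is not the system described above: a brute-force scan stands in for the graph-based and IVF indices, Python integers stand in for the C++ bitsets, and all names and data are hypothetical. Each vector stores its labels as bits in one integer, so checking whether it satisfies a query's label filter is a single bitwise AND.

```python
import heapq


def labels_to_mask(labels):
    """Pack a collection of small integer label ids into one bitset (an int)."""
    mask = 0
    for label in labels:
        mask |= 1 << label
    return mask


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filtered_search(vectors, masks, query, required_labels, k):
    """Return ids of the k nearest vectors that carry every required label.

    Brute-force stand-in for an index: scan all vectors, keep those whose
    label mask contains the required bits, then take the k closest.
    """
    required = labels_to_mask(required_labels)
    candidates = (
        (squared_l2(vec, query), idx)
        for idx, (vec, mask) in enumerate(zip(vectors, masks))
        if (mask & required) == required
    )
    return [idx for _, idx in heapq.nsmallest(k, candidates)]


# Hypothetical toy data: four 2-D vectors with label sets.
vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.1, 0.1]]
masks = [labels_to_mask(ls) for ls in ([0], [0, 1], [1], [1, 2])]

# Nearest two vectors to the origin that carry label 1.
print(filtered_search(vectors, masks, [0.0, 0.0], required_labels=[1], k=2))
```

Vector 0 is closest to the query but lacks label 1, so the filter excludes it before any distances are ranked.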