Hello!

My name is Marlene Lin. I am currently a neuroimaging data analyst at the Rabinovici Lab at the UCSF Memory & Aging Center, supporting research on neurodegenerative diseases. I work with Python, R, and MATLAB for image processing, data wrangling, statistical modeling, and machine learning, with a focus on understanding disease heterogeneity and its implications for clinical care.

 
🤖 GitHub
🖇️ LinkedIn
 

Recent

Tau-PET subtypes in Early-onset Alzheimer's Disease
AI-Guided Surgical Blood Transfusion Need Prediction
Data Sciences Help Desk Dashboard
Breast Lesion Classification with MRS

About Me

I have broad experience analyzing data across domains and thrive in diverse analytical environments. As an operations intern at Kaisa Health Group, I evaluated complex population and health data to support a key study on telemedicine resource distribution. At UCLA and UCSF, I contributed to projects that derived diagnostic and clinical decision-making insights from imaging or EHR data using rigorous statistical analysis and machine learning models.
Beyond technical proficiency, I further built my problem-solving and collaboration skills while working in these cross-functional teams and as a teaching assistant with the UCSF Library’s data science team. I strive to foster effective working environments through development and documentation of pipelines and tools that optimize workflows. I believe my proactive, detail-oriented approach would allow me to contribute meaningfully to the team.


Projects

Research

Tau-PET subtypes in Early-onset Alzheimer’s Disease
Capstone project with Rabinovici Lab, UCSF Memory and Aging Center
Identify Alzheimer’s disease subtypes based on the topographic distribution of tau by applying robust data-driven clustering methods to baseline tau-PET scans of sporadic early-onset patients from the Longitudinal Early-Onset Alzheimer’s Disease Study.
 
AI-Guided Surgical Blood Transfusion Need Prediction
June 2024 with Dr. Priya Ramaswamy, UCSF Department of Anesthesia
Enhanced a gradient boosting model to predict surgical blood transfusion needs and benchmarked the model against the Maximum Surgical Blood Order Schedule and clinician order performance. Supported silent prospective validation and model deployment.
Paper | Git
 
Prevalence of Diabetes Screening, Pre-diabetes Testing, and Nutrition Counseling
March 2024 with Dr. Yoshimi Fukuoka, UCSF School of Nursing
Characterize the prevalence of encounters for diabetes screening, pre-diabetes testing, and nutrition counseling among different demographic groups using electronic health record data queried from the UCSF Info Commons.
Restricted SQL query and data | Analysis Plan (Feb)
 
Breast Lesion Classification with DWI and 5D MRSI data
June 2023 with Thomas Lab, UCLA Magnetic Resonance Research Labs
Ensemble Learning Breast Lesion Classification using Magnetic Resonance Spectroscopic Imaging and Diffusion-weighted Imaging
Paper | Poster & Presentation (not updated) | Git
 
MATLAB Magnetic Resonance Spectroscopy Processing Application
Nov 2022 with Thomas Lab, UCLA Magnetic Resonance Research Labs
Developed MATLAB applications for in-depth processing and annotations of multidimensional Magnetic Resonance Spectra data.
Doc | Git (not updated for COSY)
 
Bone Age Assessment with Added Features
Dec 2021
Re-implemented a deep-learning model for bone age regression with an attention-guided localization network and label distribution learning-based regression network. Incorporated ethnicity information from the Digital Hand Atlas data to improve the accuracy of bone age prediction.
📄 Report | Git
 
Literature Review: Application of Gene Drive
March 2020
Future gene drive applications: concerns, regulations, and communication
📄 Report

Data Analysis

Library Data Sciences Help Desk Dashboard
May 2025 Git
 
Adverse outcome prediction of neurosurgical cases using the ACS-NSQIP Dataset
Aug 2023 📄 Report
 
Associations between non-physical adult mistreatment and adolescent eating disorders
Dec 2022 📄 Report
 
Healthcare Market Analysis Report for China's First- and Second-Tier Cities (5p)
Aug 2021 📄 Report
 
Summary of the Hangzhou Digital Healthcare Summit
July 2021 📄 Report
 
Positive Impacts of Opening Up Cross-Border Healthcare Policy
June 2021 📄 Report

Python Blogposts

Pseudo-alignment Implementation
April 2023
Finding the vector of equivalence class counts given FASTA-format RNA data.
 
NIH Funding Dashboard
March 2023
An interactive web-based data dashboard that monitors and showcases trends in research projects supported by the National Institutes of Health (NIH)
 
Message Bank
March 2023
Creating a Flask-based online message bank, deployed as a Python web app using Google Cloud and a MySQL database.
 
Cast-based Show Recommendations
Feb 2023
Scraping through the TMDB movie database to come up with show recommendations based on a similar cast.
 
Basic Classification 1: Cat vs. Dog pics
Jan 2023
Improving cat vs. dog classification with data augmentation and preprocessing using TensorFlow.
 
Basic Classification 2: False vs. Real news
Jan 2023
Building a machine learning & N-Gram-based fake news classification model.
 
Basic Classification 3: Penguin Classification
Dec 2022
Penguin classification using multiple scikit-learn models, with visualization using Seaborn.
 

American Statistical Association DataFest

ABA Pro Bono Service Dataset
May 2023, Finalist
Strategize Attorney Training Based on Cycle Time & Conversation Emotion
📄 Pitch
 
PlayForward Game Stats Dataset
May 2022
Characterize Student Players Based on Avatar Choice & Gaming Experience
📄 Pitch

Education

University of California, San Francisco (2023 - 2025)

Master of Science in Health Data Science
Relevant coursework:
  • R for Health Data Science
  • Biostatistics
  • Machine Learning
  • Epidemiological Methods
  • Electronic Health Record Research
  • Responsible Conduct of Research

University of California, Los Angeles (2019 - 2023)

Bachelor of Science in Computational and Systems Biology
Minor: Statistics
Thesis: Ensemble Learning for Breast Cancer Lesion Classification: A Pilot Validation Using Correlated Spectroscopic Imaging and Diffusion-Weighted Imaging. Metabolites. 2023; 13(7):835. https://doi.org/10.3390/metabo13070835
Relevant Coursework:
  • Biotechnology and Society
  • Digital Image Processing
  • Systems and Signals
  • Statistical Programming
  • Mathematical Statistics
  • Linear Models
  • Data Analysis & Regressions
  • Experimental Design

Previous Experience

San Francisco

Library Teaching Assistant

(October 2023 - May 2025)
  • Co-taught and assisted with R and Python programming workshops on topics including data visualization, statistical analysis, and machine learning.
  • Offered 1:1 programming and data analysis help to UCSF community members during the weekly data science help desk.
  • Assisted with other projects, including updating online course materials, creating subject guides and help articles, and presenting library resources to UCSF members.

Machine Learning Intern

(June 2024 - October 2024)
  • Enhanced a gradient boosting model for blood transfusion predictions through feature engineering, model optimization, and threshold tuning, achieving a marked improvement in precision.
  • Benchmarked model performance against the Maximum Surgical Blood Order Schedule and clinician request patterns using both retrospective data and silent prospective validation, providing actionable insights for model deployment to support pre-operative clinical decision-making.
  • Managed data visualization and report preparation for the Learning Health System meetings, collaborating with data scientists, clinicians, and health IT stakeholders to align on project goals.

Los Angeles

Research Assistant

(March 2022 - July 2023)
  • Developed MATLAB applications for in-depth processing and annotations of multidimensional Magnetic Resonance Spectra data.
  • Tuned parameters and assessed the performance of various MR spectroscopy reconstruction methods, reducing MR spectroscopy scan time by at least two-fold.
  • Characterized brain MR Spectroscopy data of obstructive sleep apnea and pediatric AIDS patients.
  • Constructed ensemble learning models to classify breast tumors based on 5D MR Spectroscopy quantitation; results published in Ajin et al., Metabolites. 2023; 13(7):835.

Learning Assistant

(Sep 2022 - March 2023)
  • Encouraged growth-oriented mindsets among students and created an equitable STEM learning environment by fostering student collaboration as a peer educator.
  • Effectively explained course concepts and expanded peers’ understanding of the relevant materials in discussions, course workshops, and review sessions.
  • Formulated ways to improve course structure and helped maintain clarity within the course by consistently communicating with instructors in weekly meetings.

Shenzhen

Telemedicine Operation Analyst Intern

Kaisa Health Group, Shenzhen, China
(June 2021 - Sep 2021)
  • Directed detailed regional analysis of the healthcare sector in major cities of China to provide insights on the department’s Direct-to-Patient service deployment.
  • Collected and analyzed data on medical resources distribution, international and domestic health product sales, and health expenditures of different communities.
  • Conducted on-site investigations of various pharmacy chains and delivered relevant reports to support the department’s business acquisition plan.

Trivia

Thank you for scrolling all the way through :)
Header: Sayram Hu, Xinjiang, 2008-08
Fun fact: my friend got me an “already thinking about my next coffee” cup from Typo, but I reserve it exclusively for tea drinking! (Recipe: line the cup with brown sugar syrup, then add hojicha, a splash of sikhye, and a sprinkle of salt. Let it sit for 10 minutes and enjoy ☕🧘)