About me

I am a Machine Learning Research Scientist at Meta in Menlo Park, CA, currently working on feed ranking. Although I am passionate about ML and generative AI, I was trained as a population geneticist. Before joining Meta, my research sat at the intersection of genomics and machine learning. During my PhD, I developed novel deep learning methods for deriving evolutionary insights from large-scale population genomic data. I also worked on biomedical ML projects as a Research Intern at Health Futures, part of Microsoft Research. All the way back in undergrad, my first foray into computational research was building high-throughput bioinformatic pipelines for NGS data analysis.

When I am not sitting in front of a screen, I enjoy (but do not necessarily excel at) lifting, snowboarding, climbing, and board games. I also really want to get into surfing now that I've moved to California. When I have a bit more free time, I spend most of it traveling, where I get to experience different cultures, learn (just the basics of) a new language, and hear and share fun stories.

What I do

  • Machine Learning

    Turning scientific questions or business needs into well-defined machine learning problems and developing innovative and tailored solutions with state-of-the-art tools and models.

  • Genomics

    Developing scalable and robust bioinformatic pipelines to derive insights from biobank-scale genomic datasets.

Resume

Experience

  1. Machine Learning Research Scientist

    2024 — Present

    Meta Platforms, Inc.

    Menlo Park, CA

  2. Graduate Researcher

    2019 — 2024

    Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory

    Cold Spring Harbor, NY

  3. Research Intern

    2022

    Health Futures, Microsoft Research

    Redmond, WA

  4. Undergraduate Research Assistant

    2016 — 2018

    Center for Genomics and Systems Biology, New York University

    New York, NY & Abu Dhabi, UAE

Education

  1. Cold Spring Harbor Laboratory (CSHL) School of Biological Sciences

    August 2018 — February 2024

    Ph.D. Quantitative Biology

  2. New York University (NYU) Abu Dhabi

    August 2014 — May 2018

    B.S. Biology, minor in Computer Science, summa cum laude

Projects

  1. Retrieval-augmented generation (RAG) for CSHL scientific archive

    Towards the end of my tenure at CSHL, I kickstarted an experimental project to design and implement an end-to-end, LLM-powered system that serves the most relevant content from the vast multi-modal scientific archive managed by the CSHL library. The project has continued under the leadership of Mila Pollock since my departure from CSHL.

  2. Transfer learning framework for simulated genomic data

    For my primary PhD work, I created domain-adaptive models to combat simulation misspecification, a fundamental challenge in population genetic inference tasks. My framework has been adopted by at least two other research groups to improve their ML models for tackling key questions in population genetics.

  3. Deep learning models for biobank-scale genomic data

    At the start of my PhD, I co-led my first substantial machine learning project with a postdoctoral researcher, Hussein Hejase. We developed novel deep learning models and engineered features suited to complex evolutionary data structures to detect signals of natural selection in population genomic data. Our models achieved up to a 50% reduction in error compared to previous models.

  4. Probabilistic epidemiological modeling

    During the COVID-19 pandemic, I led an epidemiological data science project with a small team of researchers across two groups at CSHL. Our team built probabilistic models and designed statistical tests for the project. Our findings support the hypothesis that circadian fluctuation of innate immunity influences the transmission of respiratory diseases.

Cool Places