Da Kuang

Computer Science Doctoral Candidate

University of Pennsylvania

About Me

Hi! I am a final-year Ph.D. candidate in the Department of Computer and Information Science at the University of Pennsylvania, advised by Professor Junhyong Kim. I am funded by the Human BioMolecular Atlas Program (HuBMAP) to develop computational frameworks for spatial biology.

My research centers on representation learning, particularly the design of robust and biologically meaningful embeddings that enable automated scientific discovery. I work at the intersection of machine learning, single-cell genomics, and computational biology, with ongoing efforts in:

  • Learning hierarchical representations: CellTreeQM
  • LLM-based automated scientific discovery: Aegle (ongoing)
  • Robust representations: BooleanSpurious, RFNorm
  • Designing a protein embedding space with an empirical kernel: PESS
  • Constructing an integrative embedding space for cellular developmental perturbation responses: Argos (ongoing)

I previously interned as a Senior Data Scientist at IBM’s Chief Analytics Office, where I worked on model deployment and decision optimization.

I am actively seeking full-time opportunities as a Research Scientist in AI4Science in the tech, biotech, or pharmaceutical industries. Feel free to reach out!

Interests
  • Representation & Metric Learning
  • Embedding Design
  • LLMs for Scientific Discovery
  • Single-cell & Spatial Omics
Education
  • MA in Statistics

    University of Pennsylvania, Wharton

  • MSE in Computer and Information Science

    University of Pennsylvania, Engineering

  • MSE in Nanotechnology

    University of Pennsylvania, Engineering

  • BSc in Electronic Science

    Jilin University

📚 Selected Publications

Reconstructing Cell Lineage Trees from Phenotypic Features with Metric Learning

Da Kuang, Guanwen Qiu, Junhyong Kim
ICML, Jul 2025

Learning Proteome Domain Folding Using LSTMs in an Empirical Kernel Space

Da Kuang, Dina Issakova, Junhyong Kim
Journal of Molecular Biology, Aug 2022
🏛️ Academic Services
  • Reviewer: ICML 2025, ICLR 2025, AISTATS 2025, NeurIPS 2024
  • Workshop Reviewer: ICLR-25 FM-Wild, MLGenX; NeurIPS-24 FM4Science, AIDrugX; NeurIPS-23 AI4Science
  • Head TA for Operating Systems (CIT 595), University of Pennsylvania; recipient of MCIT Online TA Awards
  • Mentored undergraduate researchers on projects in biomedical imaging, multi-omics analysis, and protein function prediction
  • Course designer for CIT 595, STAT 471, and STAT 961