Computational Scientist, Computational Biology and Machine Learning - Hematology & Medical Oncology

Mount Sinai Health System
1 day ago
Full-time
On-site
New York, New York, United States
AI and Machine Learning
Description

We are looking for a Computational Scientist in Computational Biology and Machine Learning to join our growing translational research program at the Tisch Cancer Institute. Our team studies myeloproliferative neoplasms (MPNs), acute myeloid leukemia (AML), and related myeloid malignancies, combining single-cell multi-omics, clinical data, and artificial intelligence–based approaches to understand disease mechanisms, identify biomarkers, support drug development, and improve patient care.

The scientist will work closely with longitudinal patient datasets, integrating genomics, immune and cytokine profiling, treatment responses, and clinical trial outcomes.

The scientist will report directly to Dr. Md Babu Mia, lead of the Computational Biology and Machine Learning program within the MPN team.



Responsibilities

Computational Biology & Single-Cell Analytics

  • Lead analysis of single-cell genomics datasets and build reproducible pipelines for data integration, clustering, differential expression, and clonal architecture reconstruction
  • Apply rigorous statistical methods that appropriately account for sample-level replication, longitudinal structure, and multi-modal data
     

Machine Learning & Predictive Modeling

  • Build machine learning models linking genomic drivers to clinical phenotypes, cytokine profiles, and treatment outcomes using ensemble methods and deep learning
  • Develop interpretable risk stratification models for disease progression and treatment response, with a focus on clinical relevance
     

AI & Large Language Model Development

  • Develop retrieval-augmented generation (RAG) systems and AI-assisted workflows that enable natural-language querying of clinical and genomic datasets
  • Build LLM-powered pipelines for extracting structured information from clinical notes and pathology reports, with an emphasis on transparency and clinical usability
     

Data Integration & Infrastructure
 

  • Build unified data models connecting treatments, laboratory results, cytokines, multi-omic biomarkers, and clinical trial endpoints
  • Maintain HIPAA-compliant databases and ETL pipelines, and develop dashboards and APIs to support cross-institutional collaboration
     

Scientific Communication

  • Prepare publication-ready figures and analyses, and contribute to manuscripts, grant applications, and research proposals
  • Present findings at conferences and collaborate closely with lab scientists and clinicians
     


Qualifications

  • Master's degree or equivalent in a domain science; Ph.D. in Computational Biology, Bioinformatics, Computer Science, Data Science, or a related scientific domain preferred
  • 3 years of experience, preferably in a scientific or academic computing environment, or equivalent experience
  • Experience in a batch HPC cluster environment with a parallel file system
  • Experience installing and supporting biology and chemistry codes (NAMD, AMBER, MATLAB, GROMACS, Desmond) and laboratory equipment such as sequencers
  • Experience with MPI, OpenMP, and numerical libraries
  • Experience with scientific workflows
  • Experience with instrumenting and optimizing application codes
  • Experience in an academic or research community environment
  • Programming experience in any applicable language

Preferred:

  • Strong experience with next-generation sequencing data analysis, and proficiency in Python and R
  • Demonstrated track record of building machine learning models for biomedical applications; familiarity with LLM frameworks or RAG systems
  • Experience with cloud or HPC environments, containerization (Docker), and database design
  • Background in building dashboards, fine-tuning LLMs, and working across laboratory, clinical, and computational teams