Negin Sobhani

Negin Sobhani, Ph.D.

Machine Learning for Weather, Air Quality & Climate Modeling 🌎 | Geospatial AI/ML & Data Science | Bridging HPC, AI/ML, Weather & Climate Modeling at NSF-NCAR

About me

I'm an HPC consultant and scientific software developer working at the intersection of AI/ML, HPC, and Earth system science 🌎. My work focuses on making GPU clusters useful for weather and climate researchers: building the distributed training systems, data pipelines, and optimization workflows that turn HPC hardware into scientific results.

Currently serving as the Technical Lead of the NSF-NCAR Community AI Ecosystem initiative, I coordinate eight labs and 50+ stakeholders to build unified geospatial AI/ML infrastructure for Earth system science.

My background spans numerical weather prediction, distributed GPU training, performance optimization on HPC/Cloud architectures, and building scalable data workflows for large geospatial datasets. I'm passionate about open science and fostering community-driven computational geoscience.

I'm an active open-source contributor and technical leader in the Pangeo ecosystem, serving as a core contributor to Xarray, CuPy-Xarray, and Pythia. I enjoy teaching and have delivered tutorials at SciPy, ESDS, and NCAR on topics ranging from scalable geospatial data analysis using Dask to distributed AI/ML workflows.

I hold a Ph.D. in Chemical Engineering from the University of Iowa, where my research focused on atmospheric chemistry modeling, performance analysis, and optimization of weather and air quality models.

Portfolio

Scaling AI/ML Workflows on HPC

Scaling and optimizing AI/ML training workflows for geoscientific applications on HPC systems.

SciPy 2025 Presentation

GPU-Native Data Loading with Zarr

End-to-end benchmark of a GPU data pipeline built with Xarray, KvikIO, and CuPy for accelerated geoscientific AI/ML workflows.

Xarray Blog Post
GitHub Benchmark Repo

Dask on NCAR HPC Workshop

Comprehensive workshop on scalable data analysis with Dask and Xarray on NCAR HPC systems.

Workshop Website
Workshop Video - Part 1
Workshop Video - Part 2

Community AI Ecosystem for ESS

Leading NSF-NCAR's initiative to build unified AI/ML infrastructure for Earth system science, coordinating eight labs and 50+ stakeholders.

NCAR AI Website

CREDIT

Community Research Earth Digital Intelligence Twin — an AI-powered framework for weather and climate prediction.

GitHub
Paper

CuPy-Xarray

GPU-accelerated array computing with CuPy-Xarray for geoscientific data analysis.

Tutorial Repo
GitHub

Distributed Training for ESS on NCAR HPC

Comprehensive guidelines for multi-node, multi-GPU distributed deep learning training for Earth system science applications on NCAR HPC.

GuideBook
GitHub

Interactive Dashboards for Climate Data Visualizations

Harnessing Kubernetes to build scalable, interactive visualization platforms for climate data.

AGU 2023 Slides
NEON Dashboard
LENS2 Dashboard
GitHub

Early Predictions of Extreme Heat Events using AI/ML

Machine learning approaches for seasonal and sub-seasonal forecasting of extreme heat events in the Eastern United States.

AMS Talk

Deep Learning for Cloud Microphysics

Deep learning emulation of bin microphysics autoconversion processes as an alternative to empirical parameterizations in climate models.

AGU Talk

Interests

HPC | AI/ML | Distributed Training | Scientific Machine Learning | Weather & Climate Modeling | Scalable Data Pipelines | Open-Source Scientific Software | Performance Optimization