Negin Sobhani, Ph.D.

Machine Learning for Weather, Air Quality & Climate Modeling 🌎 | Geospatial AI/ML & Data Science | Bridging HPC, AI/ML, Weather & Climate Modeling at NSF-NCAR

About me

I'm an HPC consultant and scientific software developer working at the intersection of AI/ML, HPC, and Earth system science 🌎. My work focuses on making GPU clusters useful for weather and climate researchers: building the distributed training systems, data pipelines, and optimization workflows that turn HPC hardware into scientific results.

As Technical Lead of the NSF-NCAR Community AI Ecosystem initiative, I coordinate eight labs and 50+ stakeholders to build unified geospatial AI/ML infrastructure for Earth system science.

My background spans numerical weather prediction, distributed GPU training, performance optimization on HPC/Cloud architectures, and building scalable data workflows for large geospatial datasets. I'm passionate about open science and fostering community-driven computational geoscience.

I'm an active open-source contributor and technical leader in the Pangeo ecosystem, serving as a core contributor to Xarray, CuPy-Xarray, and Pythia. I enjoy teaching and have delivered tutorials at SciPy, ESDS, and NCAR on topics ranging from scalable geospatial data analysis using Dask to distributed AI/ML workflows.

I hold a Ph.D. in Chemical Engineering from the University of Iowa, where my research focused on atmospheric chemistry modeling, performance analysis, and optimization of weather and air quality models.

Portfolio

Scaling AI/ML Workflows on HPC

Scaling and optimizing AI/ML training workflows for geoscientific applications on HPC systems.

SciPy 2025 Talk

GPU-Native Data Loading with Zarr

End-to-end GPU data pipeline using Xarray, kvikIO, and CuPy for accelerated AI/ML geoscientific workflows.

Xarray Blog Post
GitHub

Dask on NCAR HPC Workshop

Comprehensive workshop on scalable data analysis with Dask and Xarray on NCAR HPC systems.

Workshop Website
Workshop Video - Part 1
Workshop Video - Part 2

CREDIT

Community Research Earth Digital Intelligence Twin — an AI-powered framework for weather and climate prediction.

GitHub
Paper

CuPy-Xarray

Tutorial on GPU-accelerated array computing with CuPy-Xarray for geoscientific data analysis.

Tutorial Website
GitHub

Distributed Training for ESS on NCAR HPC

Comprehensive guidelines for multi-node, multi-GPU distributed deep learning training for Earth system science applications on NCAR HPC.

GuideBook
GitHub

Kubernetes for Climate Data Visualizations

Harnessing Kubernetes to build scalable, interactive visualization platforms for climate data.

AGU 2023 Slides
NEON Dashboard
LENS2 Dashboard
GitHub

Deep Learning for Cloud Microphysics

Deep learning emulation of bin microphysics autoconversion processes as an alternative to empirical parameterizations in climate models.

AGU 2018 Poster

Interests

HPC
AI/ML
Scientific Machine Learning
Distributed Training
Weather & Climate Modeling
Scalable Data Pipelines
Open-Source Scientific Software
Performance Optimization