Western US water management is underpinned by forecasts of spring-summer river flow volumes made using operational hydrologic models. The USDA Natural Resources Conservation Service (NRCS) National Water and Climate Center operates the largest such forecast system regionally, carrying on a nearly century-old tradition. The NWCC recently developed a next-generation prototype for generating such operational water supply forecasts (WSFs), the multi-model machine-learning metasystem (M4), which integrates a variety of AI and other data-science technologies carefully chosen or developed to satisfy specific user needs. Required inputs are data around snow and precipitation from the NRCS Snow Survey and Water Supply Forecast program SNOTEL environmental monitoring network but are flexible. In hindcasting test-cases spanning diverse environments across the western US and Alaska, out-of-sample accuracy improved markedly over current benchmarks. Various technical design elements, including multi-model ensemble modeling, autonomous machine learning (AutoML), hyperparameter pre-calibration, and theory-guided data science, collectively permitted automated training and operation. Live operational testing at a subset of sites additionally demonstrated logistical feasibility of workflows, as well as geophysical explainability of results in terms of known hydroclimatic processes, belying the black-box reputation of machine learning and enabling relatable forecast storylines for NRCS customers.