Ray v2.55.1 Release Notes

Release Date: 2026-04-22
    • 🛠 Fix an SSH connectivity issue in the ray-llm image (#62625 / #62718).
    • ⬆️ Upgrade apt packages in slim base (#62666 / #62717).

Previous changes from v2.55.0

  • Ray Data

    🎉 New Features

    • ➕ Add DataSourceV2 API with scanner/reader framework, file listing, and file partitioning (#61220, #61615, #61997)
    • 👌 Support GPU shuffle with rapidsmpf 26.2 (#61371, #62062)
    • ➕ Add Kafka datasink, migrate to confluent-kafka, support datetime offsets (#60307, #61284, #60909)
    • ➕ Add Turbopuffer datasink (#58910)
    • ➕ Add 2-phase commit checkpointing with trie recovery and load method (#61821, #60951)
    • Queue-based autoscaling policy integrated with task consumers (#59548, #60851)
    • Enable autoscaling for GPU stages (#61130)
    • 👍 Expressions: add random(), uuid(), cast, and map namespace support (#59656, #60695, #59879)
    • ➕ Add support for Arrow native fixed-shape tensor type (#56284)
    • 👌 Support writing tensors to tfrecords (#60859)
    • ➕ Add pathlib.Path support to read_* functions (#61126)
    • ➕ Add cudf as a batch_format (#61329)
    • 👍 Allow ActorPoolStrategy for read_datasource() via compute parameter (#59633)
    • Introduce ExecutionCache for streamlined caching (#60996)
    • 👌 Support strict=False mode for StreamingRepartition (#60295)
    • Port changes from lance-ray into Ray Data (#60497)
    • Enable PyArrow compute-to-expression conversion for predicate pushdown (#61617)
    • ➕ Add vLLM metrics export and Data LLM Grafana dashboard (#60385)
    • ⏱ Include logical memory in resource manager scheduling decisions (#60774)
    • ➕ Add monotonically increasing ID support (#59290)
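
The monotonically increasing ID feature above (#59290) can be illustrated with a small sketch. The release note does not describe how Ray Data builds these IDs, so the layout below is an assumption borrowed from Spark's monotonically_increasing_id(): the partition index goes in the high bits and the row offset in the low 33 bits. The names monotonic_ids and assign_ids are hypothetical and are not part of Ray's API.

```python
from typing import Iterator, List

# Assumption: 33 low bits hold the row offset, as in Spark's
# monotonically_increasing_id(). Ray Data's actual layout may differ.
ROW_BITS = 33


def monotonic_ids(partition_index: int, num_rows: int) -> Iterator[int]:
    """Yield unique, increasing IDs without any cross-partition coordination."""
    if num_rows > (1 << ROW_BITS):
        raise ValueError("partition has too many rows for the row-offset field")
    base = partition_index << ROW_BITS
    for offset in range(num_rows):
        yield base + offset


def assign_ids(partition_sizes: List[int]) -> List[List[int]]:
    """Assign IDs to every row of every partition (hypothetical helper)."""
    return [list(monotonic_ids(i, n)) for i, n in enumerate(partition_sizes)]
```

IDs produced this way are increasing and unique but not consecutive: each partition starts at a multiple of 2**33, so gaps appear between partitions.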

    💫 Enhancements

    • Performance: cache _map_task args, heap-based actor ranking, actor pool map improvements (#61996, #62114, #61591)
    • ⚡️ Optimize concat tables and PyArrow schema hashing (#61315, #62108)
    • ⬇️ Reduce default DownstreamCapacityBackpressurePolicy threshold to 50% (#61890)
    • 👌 Improve reproducibility for random APIs (#59662)
    • Clamp batch size to fall within C++ 32-bit int range (#62242)
    • Account for external consumer object store usage in resource manager budget (#62117)
    • Make get_parquet_dataset configurable in number of fragments to scan (#61670)
    • Consolidate schema inference and make all preprocessors implement SerializablePreprocessorBase (#61213, #61341)
    • 0️⃣ Disable hanging issue detection by default (#62405)
    • 👉 Make execution callback dataflow explicit to prevent state leakage (#61405)
    • 🌲 Log DataContext in JSON format at execution start for traceability (#61150, #61428)
    • 🔧 Autoscaler: configurable traceback, Prometheus gauges, relaxed constraints (#62210, #62209, #61917, #61385)
    • ➕ Add metrics for task scheduling time, output backpressure, and logical memory (#61192, #61007, #61436)
    • Prevent operators from dominating entire shared object store budget (#61605)
    • 📌 Eliminate generators to avoid intermediate state pinning (#60598)
    • 🏁 Default log encoding to UTF-8 on Windows (#61143)
    • Remove legacy BlockList, locality_with_output, old callback API, PyArrow 9.0 checks (#60575, #61044, #62055, #61483)
    • ⬆️ Upgrade to pyiceberg 0.11.0; cap pandas to <3 (#61062, #60406)
    • 🔨 Refactor logical operators to frozen dataclasses (#61059, #61308, #61348, #61349, #61351, #61364, #61481)
    • ⏱ Prevent aggregator head node scheduling (#61288)
    • ➕ Add error for local:// paths with a zero-resource head node (#60709)
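
The batch-size clamp listed above (#62242) amounts to bounding a Python integer by the largest value a signed 32-bit C++ int can hold. The sketch below only illustrates that idea. clamp_batch_size is a hypothetical name, and where Ray Data applies the clamp is not stated in the release note.

```python
# Largest value representable by a signed 32-bit integer.
INT32_MAX = 2**31 - 1


def clamp_batch_size(batch_size: int) -> int:
    """Return batch_size limited to the signed 32-bit range (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    # Python ints are unbounded, so an oversized value would overflow
    # once it crosses into code that stores it in a 32-bit int.
    return min(batch_size, INT32_MAX)
```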

    🛠 🔨 Fixes

    • 🛠 Fix RCE in Arrow extension type deserialization from Parquet (#62056)
    • 🛠 Fix StreamingSplitDataIterator.schema() (#62057)
    • 🛠 Fix ParquetDatasource handling of FileSystemFactory.inspect (#62065)
    • 🛠 Fix read_parquet file-extension filtering for versioned object-store URIs (#61376)
    • Fix wide_schema_pipeline_tensors cloudpickle deserialization (#62149)
    • 🛠 Fix OpBufferQueue race condition (#60828)
    • 🛠 Fix scheduling metrics computation (#62031)
    • 🛠 Fix OneHotEncoder max_categories to use global top-k instead of per-partition (#60790)
    • 🛠 Fix ReservationOpResourceAllocator resource borrowing for ActorPoolMapOperator (#60882)
    • 🛠 Fix DatabricksUCDatasource schema() shadowing by schema string attribute (#61282)
    • 🛠 Fix AliasExpr structural equality to respect rename flag (#60711)
    • Fix _align_struct_fields failure with unaligned scalar fields (#58364)
    • ⏱ Fix min_scheduling_resources fallback to incremental_resource_usage (#60997)
    • 🛠 Fix output backpressure unblocking sequence for terminal ops (#60798)
    • 🛠 Fix multi-input operator object store memory attribution (#61208)
    • 🛠 Fix reference cycle by moving to module scope (#61934)
    • 🛠 Fix autoscaler logging: reduce verbose output and move traceback to debug (#61989, #62126)
    • Fix double counting ref_bundle + input_files (#61774)
    • Replace on_exit hook with __ray_shutdown__ to fix UDF cleanup race (#61700)
    • Prevent Limit from getting pushed past map_groups (#60881)
    • Propagate schema in empty _shuffle_block to fix ColumnNotFound in chained left joins (#61507)
    • 🛠 Fix unclear metadata warning and incorrect operator name logging (#61380)
    • Clamp rolling utilization averages to zero (#61543)
    • 🛠 Fix floating point errors in TimeWindowAverageCalculator (#61580)
    • ✂ Remove default task-level timeout and clamp end_offset in Kafka datasource (#61476)
    • ✅ Avoid redundant reads in train_test_split (#60274)
    • Return None when no outputs have been produced (#62029)
    • Replace bare raise with TypeError in string concatenation (#60795)
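
The OneHotEncoder fix above (#60790) switches max_categories from a per-partition top-k to a global top-k. The sketch below shows why the two can disagree. It uses plain Python lists as stand-in partitions, and both function names are hypothetical. It is not Ray's implementation, and the exact form of the original bug is an assumption.

```python
from collections import Counter
from typing import List, Sequence


def per_partition_top_k(partitions: Sequence[Sequence[str]], k: int) -> List[str]:
    """Take the top-k values inside each partition, then union the results."""
    selected = set()
    for part in partitions:
        selected.update(value for value, _ in Counter(part).most_common(k))
    return sorted(selected)


def global_top_k(partitions: Sequence[Sequence[str]], k: int) -> List[str]:
    """Merge counts across all partitions first, then take the top-k once."""
    totals: Counter = Counter()
    for part in partitions:
        totals.update(part)
    return sorted(value for value, _ in totals.most_common(k))
```

With partitions ["a", "a", "a", "b", "b"] and ["c", "c", "c", "b", "b"] and k=1, the per-partition variant selects "a" and "c", which is more than k categories and misses the most frequent value overall. The global variant selects only "b".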

    📚 📖 Documentation

    • ➕ Add job-level checkpointing documentation (#60921)
    • ⚡️ Update exclude_resources docs for Train autoscaling changes (#61990)
    • Add locality_with_output migration instructions (#61151)
    • Document max_tasks_in_flight_per_actor vs max_concurrent_batches (#60477)
    • ➕ Add missing MOD operation docs; improve ray.data.Datasource docs (#60803, #59654)
    • ➕ Add polars usage instructions (#60029)

  • Ray Serve

    🎉 New Features:

    • ➕ Added end-to-end gRPC client and bidirectional streaming support, including public APIs, proxy handling, proto updates, and developer docs, so Serve apps can handle streaming workloads natively instead of building custom transport layers. (#60767, #60768, #60769, #60770, #60771)
    • 👍 Introduced HAProxy-based serving with fallback proxy support and load-balancer tunables, giving operators a higher-throughput ingress path and more control over traffic behavior in production. (#60586, #61180, #61271, #61468, #61988)
    • ➕ Added queue-based autoscaling for async inference and Taskiq-backed workloads, so scaling decisions can account for both HTTP in-flight load and queued tasks. (#59548, #60851, #60977, #61008)
    • ⚡️ Rolled out gang scheduling support across validation, core scheduling, fault tolerance, downscaling, autoscaling, rolling updates, and migration, enabling coordinated multi-replica placement for tightly coupled workloads. (#60944, #61205, #61206, #61207, #61215, #61467, #61216, #61659)
    • 🚀 Introduced deployment-scoped actors with config/schema, lifecycle management, public API, and controller health checks, making it easier to run durable per-deployment sidecar-like logic inside Serve. (#61639, #61648, #61664, #61833, #62161)
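
The queue-based autoscaling entry above says scaling decisions can account for both HTTP in-flight load and queued tasks. A minimal sketch of that idea follows. The function name, parameters, and the simple ceiling formula are all assumptions for illustration. Serve's actual policy and its configuration options are not described in these notes.

```python
import math


def desired_replicas(
    in_flight_requests: int,
    queued_tasks: int,
    target_per_replica: int,
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Estimate a replica count from combined load (hypothetical helper)."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    # Treat queued work as load, not just the requests currently executing.
    total_load = in_flight_requests + queued_tasks
    wanted = math.ceil(total_load / target_per_replica)
    # Keep the result inside the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))
```

A policy that looked only at in-flight requests would stay at the minimum replica count while a backlog builds in the queue. Adding the queue depth to the load term is what lets the replica count react to that backlog.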

    💫 Enhancements:

    • ➕ Added first-class tracing support for Serve, including inter-deployment gRPC propagation and richer streaming-path attributes, improving end-to-end observability across distributed request flows. (#61230, #61089, #61451)
    • 🔊 Expanded operational metrics with replica utilization, richer error labeling, and client IP logging in access logs, helping teams diagnose bottlenecks and user-impacting issues faster. (#60758, #61092, #60967)
    • 👌 Improved autoscaling extensibility with class-based policies and policy_kwargs, so advanced users can package reusable autoscaling logic without custom forks. (#60964)
    • 🚀 Reduced controller overhead with broad algorithmic improvements (indexing, cache reuse, and avoiding repeated per-tick work), which improves scalability as deployment and replica counts grow. (#60810, #60829, #60830, #60838, #60842, #60843, #60844, #60832, #60806)
    • 👌 Improved throughput-oriented operation controls by adding environment-based tuning and explicit throughput optimization logging, making performance behavior easier to configure and audit. (#60757, #62146)
    • ⬆️ Upgraded Serve internals to Pydantic v2 and refined time-series aggregation behavior for more predictable metric accuracy under high load. (#61061, #61403)

    🛠 🔨 Fixes:

    • 🛠 Fixed a direct-ingress shutdown bug where replicas could hang indefinitely while draining stuck requests, ensuring bounded shutdown behavior in failure scenarios. (#60754)
    • 🛠 Fixed HAProxy reliability issues, including config race conditions, draining guards, and platform compatibility edge cases, improving stability in production rollouts. (#61120, #60955)
    • 🛠 Fixed autoscaling correctness issues that could cause runaway scaling or delayed reactions, including feedback-loop regressions, streaming scale-down behavior, and wall-clock delay handling. (#61731, #61920, #62331, #61844, #60613)
    • 🛠 Fixed a high-percentile latency regression in request routing and queue-length accounting, reducing tail-latency spikes under load. (#61755)
    • 🛠 Fixed replica-state and health-state edge cases during migration and ingress transitions, preventing false errors and unhealthy/healthy misreporting. (#60365, #61818, #62213)
    • 🛠 Fixed chained upstream actor-failure handling so request failures are attributed correctly and no longer hang when upstream deployments die mid-chain. (#61758, #62147)
    • 🛠 Fixed HTTP status classification for client disconnects after successful responses, improving accuracy of error-rate monitoring and alerting. (#61396)
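
The direct-ingress shutdown fix above (#60754) is described as ensuring bounded shutdown when requests are stuck. One general way to bound a drain is to wait for in-flight work up to a deadline and cancel whatever remains. The asyncio sketch below illustrates that pattern only. drain_with_deadline is a hypothetical name, and this is not Serve's code.

```python
import asyncio
from typing import Iterable


async def drain_with_deadline(tasks: Iterable[asyncio.Task], timeout_s: float) -> int:
    """Wait up to timeout_s for tasks to finish, cancel the rest.

    Returns the number of tasks that had to be cancelled.
    """
    tasks = list(tasks)
    if not tasks:
        return 0
    # asyncio.wait returns at the deadline instead of blocking forever.
    _done, pending = await asyncio.wait(tasks, timeout=timeout_s)
    for task in pending:
        task.cancel()
    # Let the cancellations settle so no task is left dangling.
    await asyncio.gather(*pending, return_exceptions=True)
    return len(pending)
```

Here the caller passes asyncio.Task objects (for example, created with asyncio.create_task), and the return value indicates how many requests did not finish within the deadline.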

    📚 📖 Documentation:

    • ➕ Added AsyncInferenceAutoscalingPolicy documentation and clarified Serve performance guidance for HAProxy and inter-deployment gRPC use cases. (#61086, #61386)
    • 🚀 Updated scheduling and configuration docs, including replica scheduling guidance and a catalog of Serve environment variables, so operators can tune deployments with less guesswork. (#60922, #60807)
    • 📄 Clarified multiplexing and async behavior docs (including model pre-warming con...