Top 23 Stream Processing Open-Source Projects
-
-
Project mention: Building a Jedi-Style Hand Gesture Interface with TensorFlow.js: Control Your Browser Without Touching Anything | dev.to | 2026-02-09
In this tutorial, I'll show you how to build a production-ready hand gesture control system using TensorFlow.js and MediaPipe Hands that transforms any webcam into a precision input device.
-
Project mention: We Cut Log Costs by 35% Using Vector 0.30 and Loki 3.0: Lessons from a 3-Month Tuning | dev.to | 2026-05-04
We evaluated three alternatives: ClickHouse for log storage, Fluent Bit for log collection, and the Vector (https://github.com/vectordotdev/vector) + Loki (https://github.com/grafana/loki) stack. ClickHouse had great query performance but required manual index management, which would add operational overhead. Fluent Bit was lightweight but lacked the transform capabilities we needed to mask PII and drop low-value logs. Vector and Loki stood out: Vector is a Rust-based agent with 1/10th the memory footprint of Filebeat, and Loki is designed for cost-efficient log storage with a query model that aligns with how our team actually debugs (using labels, not full-text search).
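The two transform needs named above (masking PII and dropping low-value logs) can be illustrated with a small plain-Python sketch. This is not Vector configuration or VRL. The `transform` function, the `LOW_VALUE_LEVELS` set, and the email-only notion of PII are all assumptions made for illustration.

```python
import re

# Hypothetical rule set: which levels count as "low-value" is an assumption.
LOW_VALUE_LEVELS = {"debug", "trace"}
# Deliberately simple email pattern; real PII detection needs more than this.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def transform(event):
    """Return a cleaned copy of the event, or None if it should be dropped."""
    if event.get("level", "").lower() in LOW_VALUE_LEVELS:
        return None
    cleaned = dict(event)
    cleaned["message"] = EMAIL.sub("[REDACTED]", event.get("message", ""))
    return cleaned


events = [
    {"level": "debug", "message": "cache warm"},
    {"level": "error", "message": "login failed for bob@example.com"},
]
# Keep only the events that survive the drop rule.
kept = [out for out in map(transform, events) if out is not None]
```

Dropping before masking avoids spending regex work on events that will be discarded anyway.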
-
-
redpanda
Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
Project mention: Top Open-Source Data Engineering Tools- Unravelling the Best in 2026 | dev.to | 2025-12-10
Redpanda
-
awesome-system-design
A curated list of awesome System Design (A.K.A. Distributed Systems) resources.
Project mention: You might not be a junior anymore, but are you thinking like a senior dev? These 10 mental models reveal the difference. | dev.to | 2025-06-26
Awesome System Design
-
I'm using Watermill for the event bus with Redis Streams as the backend. Redis Streams has this concept of consumer groups; consumers in the same group split messages between them, while different groups each receive all messages.
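The consumer-group behaviour described above can be modelled with a toy in-memory class. This is only a sketch of the delivery semantics, not the Redis or Watermill API: `ToyStream`, its method names, and the group names are hypothetical, and the round-robin split is a simplification (in Redis, a message goes to whichever consumer in the group reads it).

```python
from collections import defaultdict


class ToyStream:
    """In-memory model of consumer-group delivery (not a Redis client)."""

    def __init__(self):
        self.groups = defaultdict(list)  # group name -> consumer names
        self.sent = defaultdict(int)     # group name -> messages delivered so far
        self.inbox = defaultdict(list)   # (group, consumer) -> received messages

    def join(self, group, consumer):
        self.groups[group].append(consumer)

    def publish(self, message):
        # Every group sees the message; exactly one consumer per group gets it.
        for group, consumers in self.groups.items():
            consumer = consumers[self.sent[group] % len(consumers)]
            self.inbox[(group, consumer)].append(message)
            self.sent[group] += 1


stream = ToyStream()
stream.join("billing", "worker-a")
stream.join("billing", "worker-b")
stream.join("audit", "worker-c")
for n in range(4):
    stream.publish(f"event-{n}")
```

Here the two `billing` workers each receive half of the events, while the single `audit` worker receives all four.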
-
risingwave
Event streaming platform for agentic AI. Continuously ingest, transform, and serve event streams in real time, at scale.
In crypto markets, these price differences, or spreads, appear and vanish in milliseconds. If your data pipeline takes five seconds to process a batch of prices, the opportunity is already gone. This post demonstrates how to use RisingWave—an open-source real-time event streaming platform—to detect arbitrage opportunities with sub-second latency using standard SQL.
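To make the idea of a spread concrete, here is a minimal plain-Python sketch. It is not the SQL from the post and does not use RisingWave. The `best_spread` function, the exchange names, and the 0.1% threshold are assumptions for illustration, and fees and order-book depth are ignored.

```python
def best_spread(prices, min_spread_pct=0.1):
    """Return (buy_exchange, sell_exchange, spread_pct), or None if too small.

    `prices` maps an exchange name to its latest price for one asset.
    """
    if len(prices) < 2:
        return None
    buy = min(prices, key=prices.get)   # cheapest venue to buy on
    sell = max(prices, key=prices.get)  # most expensive venue to sell on
    spread_pct = (prices[sell] - prices[buy]) / prices[buy] * 100
    if spread_pct < min_spread_pct:
        return None
    return buy, sell, spread_pct


latest = {"exchange_a": 100.0, "exchange_b": 100.5, "exchange_c": 100.2}
opportunity = best_spread(latest)
```

A streaming system would re-evaluate this on every price update rather than on a periodic batch, which is what keeps detection latency low.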
-
-
Project mention: Benchmark: Vector 0.40 vs. Fluent Bit 3.0 Log Processing Throughput for 100k Logs/Second | dev.to | 2026-04-28
-
-
Hazelcast
Hazelcast is a unified real-time data platform combining stream processing with a fast data store, allowing customers to act instantly on data-in-motion for real-time insights.
-
materialize
The live data layer for apps and AI agents. Create up-to-the-second views into your business, just using SQL (by MaterializeInc)
Project mention: ANN v3: 200ms p99 query latency over 100B vectors | news.ycombinator.com | 2026-01-25
I agree our sample may not be representative but we try to stay focused on the current and next crop of tpuf customers. So far "CI prohibits network access during tests" just hasn't come up as a pain point for any of them, but as I mentioned in another comment [0], we're definitely keeping an open mind about introducing an offline dev experience.
At my last company an engineer spent a year implementing Bazel [0][1] only to have it ripped out after they left [2] due to the maintenance burden. You might say it was a little bit of a hassle. :)
[0]: https://news.ycombinator.com/item?id=46758156
[1]: https://github.com/MaterializeInc/materialize/pull/24243
[2]: https://github.com/MaterializeInc/materialize/pull/31006
[3]: https://github.com/MaterializeInc/materialize/pull/33895
-
Project mention: Top Open-Source Data Engineering Tools- Unravelling the Best in 2026 | dev.to | 2025-12-10
Apache Hudi
-
-
faststream
FastStream is an asynchronous Python framework for building event-driven applications. It brings together message broker integration, dependency injection, validation, testing utilities, and AsyncAPI documentation generation in a single toolkit
Project mention: FastStream 0.7: MQTT support – in-memory tests, AsyncAPI generation and more | news.ycombinator.com | 2026-06-01
-
fluvio
🦀 event stream processing for developers to collect and transform data in motion to power responsive data intensive applications.
-
danfojs
Danfo.js is an open source, JavaScript library providing high performance, intuitive, and easy to use data structures for manipulating and processing structured data.
-
-
Memgraph
High-performance open-source in-memory graph database for GraphRAG, AI memory, agentic AI, and real-time graph analytics. Cypher-compatible, built in C++.
Project mention: CI/CD Auto-Remediation: The Complete Guide for SRE and Platform Teams (2026) | dev.to | 2026-05-11
Auto-remediating into a worse state. The classic failure is auto-scaling a service to handle elevated error rates that are themselves caused by a downstream dependency. The service scales, hammers the dependency harder, and the dependency collapses. Fix: never auto-remediate without dependency-graph awareness. Aurora uses Memgraph for this; HolmesGPT uses its toolset structure; pure-L1 stacks should require manual escalation when the failure crosses service boundaries.
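The dependency-graph check described above might look roughly like the following plain-Python sketch. It is not how Aurora, Memgraph, or HolmesGPT implement it. The function names, the service names, and the dict-based graph are hypothetical.

```python
def unhealthy_dependencies(graph, service, unhealthy):
    """Return the sorted transitive dependencies of `service` that are unhealthy."""
    seen, found = set(), []
    stack = list(graph.get(service, []))
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue  # already visited; also guards against cycles
        seen.add(dep)
        if dep in unhealthy:
            found.append(dep)
        stack.extend(graph.get(dep, []))
    return sorted(found)


def decide(graph, service, unhealthy):
    """Scale only when no downstream dependency is the likely cause."""
    culprits = unhealthy_dependencies(graph, service, unhealthy)
    if culprits:
        return "escalate", culprits
    return "scale", []


# Hypothetical service graph: service name -> direct dependencies.
graph = {"checkout": ["payments"], "payments": ["ledger-db"], "ledger-db": []}
action = decide(graph, "checkout", {"ledger-db"})
```

With `ledger-db` marked unhealthy, the guard returns an escalation instead of scaling `checkout`, which avoids adding load to a dependency that is already failing.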
-
peerdb
Fast, Simple and a cost effective tool to replicate data from Postgres to Data Warehouses, Queues and Storage
Project mention: Postgres and ClickHouse forming the default data stack for AI | news.ycombinator.com | 2025-12-27
You should try PeerDB, it was acquired by ClickHouse for exactly this use-case - Fast, simple Postgres replication to ClickHouse. https://github.com/PeerDB-io/peerdb
In ClickHouse Cloud, you have ClickPipes which is a simpler/managed manifestation of PeerDB https://clickhouse.com/cloud/clickpipes/postgres-cdc-connect...
-
-
Project mention: Rewriting Numaflow (for AI), an open-source stream processing platform, in Rust | news.ycombinator.com | 2025-08-18
Stream Processing discussion
Stream Processing related posts
-
Bytewax: Stream processing library built using Python and Rust
-
Benchmark: Vector 0.40 vs. Fluent Bit 3.0 Log Processing Throughput for 100k Logs/Second
-
Pushing and Pulling: Three Reactivity Algorithms
-
Building a Real-Time Crypto Arbitrage Monitoring System
-
Composeable stream processing: reactive dataflow graphs in Python
-
Build a Self-Hosted Apache Iceberg Lakehouse in Minutes with RisingWave
-
Streaming
-
Index
What are some of the best open-source Stream Processing projects? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | pathway | 62,902 |
| 2 | mediapipe | 35,690 |
| 3 | vector | 22,062 |
| 4 | awesome-bigdata | 14,452 |
| 5 | redpanda | 12,227 |
| 6 | awesome-system-design | 12,213 |
| 7 | watermill | 9,762 |
| 8 | risingwave | 9,088 |
| 9 | connect | 8,684 |
| 10 | fluent-bit | 7,936 |
| 11 | Faust | 6,823 |
| 12 | Hazelcast | 6,572 |
| 13 | materialize | 6,314 |
| 14 | hudi | 6,175 |
| 15 | river | 5,846 |
| 16 | faststream | 5,239 |
| 17 | fluvio | 5,234 |
| 18 | danfojs | 5,049 |
| 19 | arroyo | 4,938 |
| 20 | Memgraph | 4,174 |
| 21 | peerdb | 3,154 |
| 22 | awesome-streaming | 2,987 |
| 23 | numaflow | 2,714 |