Top 23 Python Data Validation Projects
-
cleanlab
Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
-
Project mention: Framework de Tests Automatisés API avec Pytest: Tutoriel Pratique [Automated API Testing Framework with Pytest: A Practical Tutorial] | dev.to | 2026-05-22
-
deepchecks
Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.
-
Project mention: Show HN: Data contracts engine for the modern data stack | news.ycombinator.com | 2026-01-28
-
Project mention: Programmers and software developers lost the plot on naming their tools | news.ycombinator.com | 2025-12-11
When I told a co-worker about https://pypi.org/project/voluptuous/ he immediately searched for the name alone, then told us not to do the same.
-
Project mention: Show HN: Dingo 1.9.0 released: With enhanced hallucination detection | news.ycombinator.com | 2025-07-31
-
GitHub: github.com/posit-dev/pointblank (star the repo, file issues, contribute!)
-
opendataeditor
The Open Data Editor (ODE) is a no-code application to explore and validate tabular data in a simple way. Forever free and open source project powered by the Frictionless Framework.
-
python-codicefiscale
Italian fiscal code (Codice Fiscale) encoding, decoding and validation.
-
snowflake-provisioning
Snowflake database, schema, and warehouse provisioning with access roles; generation and provisioning of functional roles; and a Snowflake source export, cloning, and data tie-out tool.
-
OpenDQV
Open-source, contract-driven data quality validation. Shift-left enforcement at the point of write — before data enters your pipeline.
Project mention: OpenDQV – open-source data quality validation at the point of write | news.ycombinator.com | 2026-03-20
-
validatelite
ValidateLite: A lightweight CLI for database schema validation and data quality checks. Ideal for CI/CD, ETL, and data pipelines.
Project mention: DevLog #1 - ValidateLite: Building a Zero-Config Data Validation Tool | dev.to | 2025-08-09
This data validation tool is built on a simple principle: "Cross-cloud ready, code-first, operational in 30 seconds." And it is open source: ValidateLite on GitHub.
-
SmartExcelGuardian
SmartExcelGuardian is a professional Python desktop application for Excel data cleanup, validation, and auditing. It automatically detects missing values, duplicates, type issues, and invalid formulas, applies heuristic scoring, conditional formatting, and auto-calculated Excel formulas, and export
Project mention: SmartExcelGuardian: Open-source Excel data cleaning with heuristics and formulas | news.ycombinator.com | 2026-01-19
Python Data Validation discussion
Python Data Validation related posts
- Framework de Tests Automatisés API avec Pytest: Tutoriel Pratique [Automated API Testing Framework with Pytest: A Practical Tutorial]
- Detect, Defend, Prevail: Payments Fraud Detection using ML & Deepchecks
- Deepchecks: Open-source ML testing and validation library
- Deepchecks' New Open Source is on Product Hunt, and Needs Your Help
- Do you think we need an open-source web scraping monitoring tool?
- [D] Is accurately estimating image quality even possible?
- Python: Data validation
Index
What are some of the best open-source Data Validation projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | cleanlab | 11,515 |
| 2 | jsonschema | 4,952 |
| 3 | pandera | 4,381 |
| 4 | deepchecks | 4,025 |
| 5 | Cerberus | 3,285 |
| 6 | schema | 2,945 |
| 7 | Schematics | 2,589 |
| 8 | soda-core | 2,374 |
| 9 | voluptuous | 1,845 |
| 10 | cleanvision | 1,188 |
| 11 | dingo | 718 |
| 12 | colander | 463 |
| 13 | pointblank | 445 |
| 14 | opendataeditor | 306 |
| 15 | valideer | 261 |
| 16 | Validoopsie | 88 |
| 17 | python-codicefiscale | 87 |
| 18 | snowflake-provisioning | 50 |
| 19 | laravel-validation | 15 |
| 20 | OpenDQV | 10 |
| 21 | data_check | 5 |
| 22 | validatelite | 3 |
| 23 | SmartExcelGuardian | 3 |