51 Text Processing packages and projects

  • MarkItDown

    10.0 7.5 Python
    Python tool for converting files and office documents to Markdown.
  • mem0

    9.8 9.6 Python
    Universal memory layer for AI Agents
  • Docling

    9.8 9.7 Python
    Get your documents ready for gen AI
  • pydantic

    9.5 9.7 Python
    Data validation using Python type hints
  • RenderCV

    9.2 9.4 Python
    Resume builder for academics and engineers
  • fuzzywuzzy

    8.7 0.0 L4 Python
    DISCONTINUED. Fuzzy String Matching in Python
  • Lark

    8.0 6.0 Python
    Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
  • 汉字拼音转换工具(Python 版)

    7.9 5.7 Python
    Convert Chinese characters to pinyin (pypinyin)
  • sqlparse

    7.6 8.0 L4 Python
    A non-validating SQL parser module for Python
  • Pygments

    7.3 -
    A generic syntax highlighter.
  • phonenumbers

    7.2 8.4 L4 Python
    Python port of Google's libphonenumber
  • ftfy

    7.1 8.5 L4 Python
    Fixes mojibake and other glitches in Unicode text, after the fact.
  • TextDistance

    7.0 4.1 Python
    📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
  • msgspec

    6.9 8.8 Python
    DISCONTINUED. A fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML [Moved to: https://github.com/msgspec/msgspec]
  • PLY

    6.9 0.0 L2 Python
    DISCONTINUED. Python Lex-Yacc
  • chardet

    6.5 9.8 L4 Python
    Python character encoding detector
  • pyparsing

    6.3 7.3 Python
    DISCONTINUED. Python library for creating PEG parsers
  • jellyfish

    6.0 4.7 Jupyter Notebook
    🪼 a python library for doing approximate and phonetic matching of strings.
  • shortuuid

    5.9 3.7 L5 Python
    A generator library for concise, unambiguous and URL-safe UUIDs.
  • typeguard

    5.5 6.3 Python
    Run-time type checker for Python
  • Data Profiler

    5.4 5.8 Python
    What's in your data? Extract schema, statistics and entities from datasets
  • python-slugify

    5.4 6.1 L4 Python
    Returns unicode slugs
  • python-user-agents

    5.4 0.0 L4 Python
    A Python library that provides an easy way to identify devices like mobile phones, tablets and their capabilities by parsing (browser) user agent strings.
  • pyfiglet

    5.3 5.5 L3 Python
    An implementation of figlet written in Python
  • Mirascope

    5.1 9.9 Python
    The LLM Anti-Framework
  • Levenshtein

    5.0 0.0 L1 C
    The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity
  • Construct

    4.7 2.7 Python
    Construct: Declarative data structures for python that allow symmetric parsing and building
  • xpinyin

    4.5 5.9 L4 Python
    Translate Chinese hanzi to pinyin (拼音) in Python
  • python-nameparser

    4.2 3.3 L2 Python
    A simple Python module for parsing human names into their individual components
  • Charset Normalizer

    4.1 9.2 Python
    Truly universal encoding detector in pure Python.
  • ijson

    4.0 0.3 Python
    DISCONTINUED. Iterative JSON parser with Pythonic interface
  • awesome-slugify

    3.5 0.0 L5 Python
    Python flexible slugify function
  • AnyAscii

    3.2 3.7 Kotlin
    Unicode to ASCII transliteration - C Elixir Go Java JS Julia PHP Python Ruby Rust Shell .NET
  • unicode-slugify

    3.1 0.0 L4 Python
    A slugifier that works in unicode
  • pangu.py

    2.9 1.9 L5 Python
    Paranoid text spacing in Python
  • json-streamer

    2.7 5.9 Python
    A fast streaming JSON parser for Python that generates SAX-like events using yajl
  • uniout

    2.4 1.8 L5 Python
    Never see escaped bytes in output.
  • simplematch

    2.3 5.1 Python
    Minimal, super readable string pattern matching for python.
  • json2xml

    2.2 8.3 Python
    JSON-to-XML converter for Python, accelerated with a native Rust extension.
  • nider

    2.2 0.0 Python
    Python package to add text to images, textures and different backgrounds
  • Efficient keyword mining with regular expressions

    2.1 4.9 Python
    Efficient string matching with regular expressions
  • HaikunatorPY

    2.1 0.0 L5 Python
    Generate Heroku-like random names to use in your python applications
  • Python Left-Right Parser

    2.1 5.9 L4 Python
    Python Parser
  • Atoma

    2.0 0.0 Python
    Atom, RSS and JSON feed parser for Python 3
  • LLMWorkbook

    0.7 8.1 Python
    Effortlessly harness the power of LLMs on Excel and DataFrames—seamless, smart, and efficient!
  • acorn

    0.6 9.1 Python
    LLM framework for long running agents
  • GoBeautifulSoup

    0.3 3.3 Python
    GoBeautifulSoup is a high-performance HTML/XML parsing library that provides a 100% compatible API with BeautifulSoup4, but powered by Go for dramatically improved performance. It's designed as a drop-in replacement for BeautifulSoup4 with significant speed improvements.
  • Prompt Optimizer

    0.3 8.1 Python
    Automated prompt optimization using mentor-agent architecture. Generate and refine prompts from labeled data.
  • iban-tools

    0.3 7.9 Python
    Comprehensive IBAN & BIC toolkit for Python — validate, parse, format, generate, and extract IBANs from text/PDF.
  • unidecode

    -
    ASCII transliterations of Unicode text.
  • difflib

    -
    (Python standard library) Helpers for computing deltas.
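
As a rough illustration of the kind of task the packages above address, here is a minimal sketch using only `difflib` from the standard library, the last entry in the list. The helper names (`closest_matches`, `similarity`, `line_delta`) are hypothetical and chosen for this example; only the `difflib` calls themselves come from the standard library.

```python
import difflib


def closest_matches(word, candidates, n=3, cutoff=0.6):
    """Return up to n candidates that resemble word, best match first."""
    return difflib.get_close_matches(word, candidates, n=n, cutoff=cutoff)


def similarity(a, b):
    """Return a similarity ratio between 0.0 and 1.0 for two strings."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def line_delta(old, new):
    """Return a unified diff of two multi-line strings as a list of lines."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="old",
            tofile="new",
            lineterm="",
        )
    )


if __name__ == "__main__":
    print(closest_matches("appel", ["ape", "apple", "peach", "puppy"]))
    print(similarity("text processing", "text parsing"))
    print("\n".join(line_delta("a\nb", "a\nc")))
```

For large inputs or heavy fuzzy-matching workloads, one of the dedicated string-distance packages listed above may be a better fit than `difflib`.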