🐼 tsb – `readHtml()`

readHtml(html, opts?) mirrors pandas.read_html(). It scans an HTML string for <table> elements and returns one DataFrame per table found.
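
To make the "one result per table" idea concrete, here is a minimal, self-contained sketch of scanning an HTML string for tables. The function name extractTables is hypothetical and is not part of tsb, and the regex approach is an assumption chosen for brevity: it ignores nested tables, colspan, and rowspan, and it returns plain string grids rather than DataFrames.

```typescript
// Hypothetical sketch, not tsb's parser: collect the cell text of every
// <table> in an HTML string as a grid of strings (tables -> rows -> cells).
function extractTables(html: string): string[][][] {
  const tables: string[][][] = [];
  const tableRe = /<table\b[^>]*>([\s\S]*?)<\/table>/gi;
  let t: RegExpExecArray | null;
  while ((t = tableRe.exec(html)) !== null) {
    const rows: string[][] = [];
    const rowRe = /<tr\b[^>]*>([\s\S]*?)<\/tr>/gi;
    let r: RegExpExecArray | null;
    while ((r = rowRe.exec(t[1])) !== null) {
      const cells: string[] = [];
      // \b keeps <thead> from being mistaken for a <th> cell.
      const cellRe = /<t[hd]\b[^>]*>([\s\S]*?)<\/t[hd]>/gi;
      let c: RegExpExecArray | null;
      while ((c = cellRe.exec(r[1])) !== null) {
        // Drop inline markup and surrounding whitespace from the cell.
        cells.push(c[1].replace(/<[^>]*>/g, "").trim());
      }
      rows.push(cells);
    }
    tables.push(rows);
  }
  return tables;
}

const sample =
  "<table><tr><th>Name</th><th>Age</th></tr><tr><td>Alice</td><td>30</td></tr></table>" +
  "<table><tr><td>x</td></tr></table>";
console.log(extractTables(sample).length); // 2
```

Each inner grid corresponds to one table in document order, which is the same shape of result readHtml() is described as returning, with DataFrames in place of string grids.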

Code Example

import { readHtml } from "tsb";

const html = `<table>
  <thead><tr><th>Name</th><th>Age</th></tr></thead>
  <tbody>
    <tr><td>Alice</td><td>30</td></tr>
    <tr><td>Bob</td><td>25</td></tr>
  </tbody>
</table>`;

const [df] = readHtml(html);
console.log(df.columns);   // ["Name", "Age"]
console.log(df.shape);     // [2, 2]
console.log(df.toRecords());
// [{ Name: "Alice", Age: 30 }, { Name: "Bob", Age: 25 }]

Supported Options

header — which row to use as column names (default 0). Use null for no header.
indexCol — column name or index to use as the row index.
match — array of table indices to return (e.g. [0, 2]).
naValues — extra strings to treat as NaN (default includes "", "NA", "NaN", "None").
converters — try to convert cells to numbers (default true).
thousands — thousands-separator character, e.g. ",".
decimal — decimal separator, default ".".
skipRows — 0-based row indices to skip in the body.
nrows — maximum rows to return.
skipBlankLines — skip rows where all cells are whitespace (default true).
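
The cell-level options (naValues, converters, thousands, decimal) interact, so a small sketch may help. The helper below, convertCell, and its CellOptions type are hypothetical and not part of tsb. It assumes NA matching happens before numeric conversion and that the default NA set is exactly the four strings named above; the library's actual order and defaults may differ.

```typescript
// Hypothetical helper, not tsb's implementation: shows one plausible way the
// cell-level options could combine when interpreting a single cell's text.
interface CellOptions {
  naValues?: string[];  // extra strings to treat as NaN
  converters?: boolean; // try numeric conversion (default true)
  thousands?: string;   // thousands separator, e.g. ","
  decimal?: string;     // decimal separator (default ".")
}

// Assumption: only the four defaults named in the option list.
const DEFAULT_NA: string[] = ["", "NA", "NaN", "None"];

function convertCell(raw: string, opts: CellOptions = {}): string | number {
  const text = raw.trim();

  // 1. Missing-value check against defaults plus user-supplied strings.
  const na = DEFAULT_NA.concat(opts.naValues ?? []);
  if (na.indexOf(text) !== -1) return NaN;

  // 2. With conversion disabled, the cell stays a string.
  if (opts.converters === false) return text;

  // 3. Normalize separators: strip thousands marks, then map the decimal
  //    separator to "." so Number() can read the result.
  let normalized = text;
  if (opts.thousands) normalized = normalized.split(opts.thousands).join("");
  const decimal = opts.decimal ?? ".";
  if (decimal !== ".") normalized = normalized.split(decimal).join(".");

  // 4. Convert only strings in plain decimal or exponent notation;
  //    anything else is returned unchanged.
  if (!/^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?$/.test(normalized)) return text;
  return Number(normalized);
}

console.log(convertCell("1,234.5", { thousands: "," }));               // 1234.5
console.log(convertCell("1.234,5", { thousands: ".", decimal: "," })); // 1234.5
console.log(convertCell("Alice"));                                     // "Alice"
```

Stripping the thousands separator before remapping the decimal separator matters for European-style input such as "1.234,5": doing it in the other order would turn both separators into "." and make the value unparseable.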
