Benchmarks¶
Every number on this page is position-verified against every competitor. Any mismatch aborts the run. 57,072 positions cross-checked on the 10 MB Wikipedia headline run alone — 15 exact matches + 1 documented soft-parity (ripgrep's non-overlapping multi-pattern semantics).
Every number is reproduced from the markdown tables under benchmark/output/m2-pro/md/*.md and the raw JSON under benchmark/output/m2-pro/*.json. Reproduction commands are at the bottom of this page — all public benchmarks are in the repo under benchmark/.
The headline: parot flatlines while everyone else scales linearly¶
Query the same 10 MB of Wikipedia with N phrases. Everyone else scales linearly. Parot flatlines.
| Phrases → | 10 | 100 | 1,000 | 10,000 | 100,000 |
|---|---|---|---|---|---|
| parot | 0.14 ms | 0.18 ms | 0.99 ms | 6.4 ms | 60 ms |
| pyahocorasick | 62.7 ms | 126 ms | 161 ms | 213 ms | 685 ms |
| str.count (Python) | 36.6 ms | 422 ms | 4.2 s | 41 s | DNF |
| pandas str.count | 69 ms | 510 ms | 5.1 s | 57 s | DNF |
| polars count_matches | 281 ms | 639 ms | 7.6 s | 70 s | DNF |
| pyarrow count_substring | 182 ms | 1.8 s | 17 s | 170 s | DNF |
| duckdb LENGTH/REPLACE | 227 ms | 2.1 s | 22 s | 222 s | DNF |
Wikipedia 10 MB corpus (9,999,999 bytes), Apple M2 Pro. Index build is linear in corpus size and happens once (~700 ms). After that, query cost is independent of corpus size — 100,000 distinct phrases finish in 60 ms while every tokenizer-free substring engine times out. DNF = exceeded 30 s per-call ceiling.
- Source: `bench_query_scaling.py` (Python) + `bench_query_scaling_js.mjs` (JS)
- Raw data: `benchmark/output/m2-pro/bench_query_scaling.json`
- Reproduce: `just bench-query-scaling` (full) or `just bench-query-scaling-js` (JS-only, fast)
Performance highlights¶
| Benchmark | Result |
|---|---|
| 1 GB search | 0.97 ms to batch_count 100 patterns across 1 GB of Wikipedia. Build once (~148 s), then every query is free. |
| Genomics | 21,000x faster than BioPython on CRISPR guide search against GRCh38 chr22 (50 MB). |
| DataFrames | 36,000x faster than polars str.count on 500K rows, 50 patterns. Same results, one-line change. |
| Scale-invariant query | 0.34 ms whether your text is 10 KB or 10 GB. |
| 100K phrases, 10 MB corpus | 60 ms total. Every other substring engine times out past 10K. |
Every number below is sourced from a script in benchmark/, checked into the repo, with raw JSON output committed alongside.
Methodology¶
Hardware & software¶
- Python benchmarks: Apple M2 Pro, macOS 15.5, Python 3.12.8, Rust 1.94.1
- JS/WASM benchmarks: Apple M4, macOS, Node.js 25.8.0, Rust 1.94.1
Two machines — JS/WASM only
The Python, Rust, and CLI benchmarks all ran on the same M2 Pro. The JS/WASM numbers (the four JS competitors in the 16-competitor table, the WASM vs-mnemonist table, the parot-in-the-browser table) were produced on an M4. Do not compare absolute times across those two machines. All speedups in the 16-competitor table are computed against parot native (M2 Pro, 0.34 ms) — a single baseline — so the Speedup column is coherent across languages.
Package versions (canonical run, 2026-04-15)¶
| Package | Version |
|---|---|
| parot | 0.2.4 |
| pyahocorasick | 2.3.0 |
| ahocorasick-rs | 1.0.3 |
| stringzilla | 4.6.0 |
| duckdb | 1.5.2 |
| google-re2 | latest via pip |
| pandas / polars / pyarrow | pinned in benchmark/pyproject.toml |
| comparator A (Rust-backed text index on PyPI) | 2.3.4 |
| comparator B (Rust-backed SA on PyPI) | 0.0.20 |
| pylcs | 0.1.1 |
Every benchmark script writes the exact competitor versions it imported into benchmark/output/m2-pro/*.json under environment.packages.
Real corpora — no synthetic data¶
Every table on this page uses a real public corpus:
| Corpus | Size | Source | Used by |
|---|---|---|---|
| Wikipedia-EN 10 MB slice | 9,999,999 B | dumps.wikimedia.org | headline scaling + 16-competitor + multipattern |
| Wikipedia-EN 1 GB slice | 1,000,000,000 B | dumps.wikimedia.org | 1 GB hero |
| GRCh38 chr22 | 50,818,468 B | Ensembl release 110 | genomics / CRISPR |
| Pride and Prejudice | 752 KB | Project Gutenberg | duplicates |
| Complete Shakespeare | 5.6 MB | Project Gutenberg | duplicates |
| King James Bible | 4.6 MB | Project Gutenberg | duplicates |
| Dickens Collected (16 novels) | ~22 MB / 3.9M words | Project Gutenberg | duplicates |
| Proust (ISOLT) | ~3.4 MB / 691K words | Project Gutenberg | duplicates |
Patterns are not hardcoded — they are sampled from the corpus itself with a fixed seed via benchmark/common.py::sample_count_queries(text, n=1000, seed=99). These are mid-frequency phrases a user would realistically search for.
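A minimal sketch of what such a sampler can look like. This is an illustrative stand-in, not the repo's implementation: the two-word phrase length and the 5–500 frequency band are assumptions here.

```python
import random
import re

def sample_count_queries(text: str, n: int = 1000, seed: int = 99,
                         lo: int = 5, hi: int = 500) -> list[str]:
    """Sample mid-frequency two-word phrases from the corpus itself.

    Illustrative stand-in for benchmark/common.py::sample_count_queries;
    the phrase length and lo/hi frequency band are assumptions.
    """
    rng = random.Random(seed)              # fixed seed -> reproducible patterns
    words = re.findall(r"\w+", text)
    queries: list[str] = []
    seen: set[str] = set()
    for _ in range(100 * n):               # bounded attempts: guarantees termination
        if len(queries) >= n:
            break
        i = rng.randrange(max(1, len(words) - 1))
        phrase = " ".join(words[i:i + 2])  # phrase drawn from the text itself
        if phrase in seen:
            continue
        seen.add(phrase)
        if lo <= text.count(phrase) <= hi:  # keep only mid-frequency phrases
            queries.append(phrase)
    return queries
```

Because the seed is fixed, every run (and every competitor) sees the identical pattern list.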
Measurement protocol¶
- Python: 2 warmup iterations (discarded), 10 measurement iterations (typical), median reported with IQR.
- JavaScript: 2 warmup, 5 measurement, median + IQR.
- Indexes are built once before the timed region; only the query is timed. Build time is reported separately in the caption, never folded into a per-cell number.
- GC is not pinned. Median + IQR mitigate the occasional spike.
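The protocol above amounts to a few lines of stdlib Python. A minimal sketch — the repo's real harness under benchmark/ additionally writes JSON, runs parity checks, and enforces DNF ceilings:

```python
import statistics
import time

def bench(fn, *, warmup: int = 2, iters: int = 10) -> dict:
    """Median + IQR over `iters` timed calls, after discarded warmups."""
    for _ in range(warmup):                  # warmup iterations, discarded
        fn()
    samples_ms = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - t0) * 1e3)
    q1, median, q3 = statistics.quantiles(samples_ms, n=4)  # quartiles
    return {"median_ms": median, "iqr_ms": q3 - q1}
```

Median + IQR (rather than mean + stddev) is what makes the unpinned-GC spikes tolerable.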
Correctness verification¶
Every competitor's position list is compared bit-for-bit against parot's batch_find_all before timing starts. The 10 MB headline run cross-verifies 57,072 positions across 11 strict competitors + 1 soft; the 1 GB run cross-verifies 100 patterns across 14 competitors. Any mismatch aborts the run with a non-zero exit. Reaching the end of a run is the correctness receipt.
The "soft" bucket contains competitors with documented semantic differences (currently: ripgrep's non-overlapping multi-pattern mode) — their mismatches are logged but not fatal.
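Sketched out, the strict/soft split looks like the hypothetical helper below (the real check lives inside each benchmark script; the function name and shape are illustrative):

```python
def verify_positions(truth: dict, competitor: dict, *, soft: bool = False) -> int:
    """Compare per-pattern position lists against the ground truth.

    `truth` and `competitor` map pattern -> sorted list of match offsets.
    Strict competitors abort the whole run on any mismatch; soft ones
    (documented semantic differences) only have their mismatches counted.
    """
    mismatched = [p for p in truth if competitor.get(p) != truth[p]]
    if mismatched and not soft:
        raise SystemExit(f"position mismatch on {len(mismatched)} pattern(s)")
    return len(mismatched)
```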
Raw data¶
- Python scripts: `benchmark/{headlines,memory,docs,internal}/bench_*.py`
- JS scripts: `benchmark/*.mjs`
- Output (JSON + MD + CSV): `benchmark/output/m2-pro/` and `benchmark/output/m4/`
- Manifest of every script: `benchmark/BENCHMARKS.md`
16 competitors on 10 MB Wikipedia (find_all)¶
Source: bench_hero_locate.py · raw JSON
Median wall-clock time for find_all() on 100 mid-frequency phrases against a single 10 MB Wikipedia corpus. One parot index built once; every competitor compared on the same corpus with the same 100 patterns. Every row's position list is verified against parot's batch_find_all before the competitor's median is allowed to count. Returns every position for every pattern — not just the first, not just a count.
| Competitor | Platform | Regime | Median (ms) | Speedup vs parot |
|---|---|---|---|---|
| parot (Us!) | Python + Rust | index | 0.34 | 1.0x (baseline) |
| parot WASM (Us!) | JavaScript | index | 1.09 | 0.3x (~3x slower than native) |
| stringzilla | Python | substring | 42.6 | 125x |
| ahocorasick-rs | Python (Rust bind) | multi-pattern | 53.7 | 158x |
| pyahocorasick | Python | multi-pattern | 115.5 | 339x |
| ripgrep (subprocess) | subprocess | substring | 123.0 | 361x ⚠ soft |
| modern-ahocorasick | JavaScript | multi-pattern | 340.5 | 999x |
| String.matchAll | JavaScript | regex | 393.9 | 1,156x |
| RegExp.exec /g | JavaScript | regex | 395.8 | 1,161x |
| String.indexOf | JavaScript | substring | 402.8 | 1,182x |
| bytes.find loop | Python | substring | 383.2 | 1,124x |
| str.find loop | Python | substring | 427.3 | 1,253x |
| regex (mrab) | Python | regex | 449.0 | 1,317x |
| re.finditer | Python | regex | 676.7 | 1,985x |
| pyarrow.compute (chunked) | Python | substring | 1,336 | 3,919x |
| polars.str.contains (chunked) | Python | substring | 1,457 | 4,274x |
| pandas.str.contains (chunked) | Python | substring | 1,709 | 5,013x |
| google-re2 | Python | regex | 3,027 | 8,880x |
parot build: 187 ms (one-time, 38.6 MB heap). Speedups are computed against parot native (0.34 ms) so a user can read "125x over stringzilla" and "1,182x over String.indexOf" on the same baseline even though the JS rows were measured on an M4.
⚠ Soft parity: ripgrep uses non-overlapping multi-pattern semantics (19 positions differ from parot's overlapping ground truth — 4,737 vs 4,756 hits). Every other row is position-identical: 4,756 hits, exact.
Parity receipt (from bench_hero_locate.json):
- 11 strict competitors, all position-identical — `str.find loop`, `bytes.find loop`, `re.finditer`, `regex (mrab)`, `google-re2`, `stringzilla`, `pyahocorasick`, `ahocorasick-rs`, `pandas (chunked)`, `polars (chunked)`, `pyarrow (chunked)`.
- 1 soft competitor — `ripgrep (subprocess)`.
- 100 patterns across the 11 strict competitors — 57,072 positions validated before any timing ran.
1 GB Wikipedia — batch_count¶
Source: bench_hero_plot.py · raw JSON
Batch-counting 100 patterns over 1 GB of English Wikipedia. Build cost is amortized over the query stream — the question is: once built, how fast can you run queries?
| Competitor | Regime | Median (ms) | Build (ms) |
|---|---|---|---|
| parot (Us!) — batch_count | index | 0.97 | 147,975 |
| comparator A (Rust-backed text index) — loop | index | 0.58 | 4,447,514 |
| comparator B (Rust-backed SA) | index | 0.89 | 64,548 |
| stringzilla | substring | 4,623 | 0 |
| ahocorasick-rs | multi-pattern | 10,689 | 0.20 |
| ripgrep (subprocess) | substring | 11,275 | 0 |
| pyahocorasick | multi-pattern | 12,643 | 0.04 |
| bytes.count | substring | 63,527 | 0 |
| re.findall | regex | 68,085 | 0.34 |
| pandas.str.count | substring | 68,582 | 0 |
| str.count | substring | 69,258 | 0 |
| polars.count_matches | substring | 108,425 | 0 |
| pyarrow.count_substring | substring | 181,895 | 0 |
| duckdb contains | substring | 273,283 | 801 |
| google-re2 | regex | 385,105 | 0.43 |
parot build: 148 s once, then 0.97 ms for 100 patterns across 1 GB. Comparator A is 1.7x faster per query but takes 30x longer to build (4,448 s vs 148 s) — only worth it if you run millions of queries per index. parot heap: 5.05 GB for 1 GB corpus (5.1x linearity, consistent with the memory table below).
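The "millions of queries per index" claim follows directly from the two build columns: comparator A's extra build time divided by its per-query saving, using the numbers in the table above.

```python
import math

extra_build_ms = 4_447_514 - 147_975   # comparator A build minus parot build
per_query_saving_ms = 0.97 - 0.58      # comparator A's per-query advantage
queries_to_pay_off = math.ceil(extra_build_ms / per_query_saving_ms)
# on the order of 11 million 100-pattern batches before the slower build pays off
```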
Genomics — CRISPR off-target search (GRCh38 chr22, 50 MB)¶
Source: bench_genomics.py · raw JSON
Real data: GRCh38 chromosome 22 (50,818,468 bases). Real workload: 1,000 sgRNA guides sampled from the Brunello lentiviral knockout library, batch_find_all'd against the chromosome.
| Competitor | Regime | Median (ms) | Speedup vs parot |
|---|---|---|---|
| parot (Us!) — batch_find_all | index | 2.31 | 1.0x |
| python bytes.find loop | substring | 49,333 | 21,313x |
| BioPython Seq.count_overlap | substring (count-only) | 49,576 | 21,418x |
| ripgrep (subprocess) | substring | 189.3 | 82x |
parot build: 1,162 ms for the full chromosome (cached as a .parot index thereafter). BioPython has no native find-all; we compared count-only for parity. 1,000-guide CRISPR run cross-verifies 108,402 positions.
DNA motif sweep — 50 MB¶
Source: bench_dna.py · raw JSON
Motif sweep across genome sizes — parot holds flat at ~0.03 ms while everyone else scales linearly.
| Genome size | bytes.count | re.findall | BioPython | parot | vs BioPython |
|---|---|---|---|---|---|
| 1 MB | 44.5 ms | 37.2 ms | 37.2 ms | 0.024 ms | 1,541x |
| 10 MB | 474.3 ms | 379.5 ms | 379.5 ms | 0.031 ms | 12,128x |
| 50 MB | 2,275 ms | 1,856 ms | 1,856 ms | 0.036 ms | 51,964x |
At 50 MB the gap opens to 51,964x vs BioPython on a pure motif-count workload — consistent with the 21,000x CRISPR result above (the CRISPR run does the full find_all with positions, which costs more).
Parot vs pandas / polars (DataFrames)¶
Source: bench_dataframes.py (lives in the private parot-core benchmark suite) · raw JSON
Build the parot index once from a column of text, then run many str.contains / str.count-style queries. pandas and polars rescan the full column every time.
Different data models
pandas returns per-row results (a Series); parot operates on the concatenated text and returns a flat total or a contains-mask. The speedups reflect both the algorithmic win and the per-row Python object overhead pandas carries. See Index.from_series and Index.contains_mask for the symmetric API.
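To make the data-model difference concrete, here is a pure-Python sketch of deriving a per-row mask from a single scan of the concatenated text. Illustrative only — Index.contains_mask does this against the index instead of `str.find`, and the NUL separator is an assumption of this sketch:

```python
import bisect

def contains_mask(rows: list[str], pattern: str) -> list[bool]:
    """Per-row `contains` mask from one scan of the concatenated text."""
    sep = "\x00"                      # assumes rows never contain NUL,
    blob = sep.join(rows)             # so no match can span two rows
    starts = [0]                      # starts[i] = offset of row i in blob
    for r in rows[:-1]:
        starts.append(starts[-1] + len(r) + 1)
    mask = [False] * len(rows)
    pos = blob.find(pattern)
    while pos != -1:
        mask[bisect.bisect_right(starts, pos) - 1] = True
        pos = blob.find(pattern, pos + 1)
    return mask
```

A flat total (the count model) is the same scan with a counter in place of the mask — there is simply no per-row Python object in the hot loop.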
contains — vary patterns (100K rows, query time only)¶
| Patterns | pandas (ms) | polars (ms) | parot (ms) | vs pandas | vs polars |
|---|---|---|---|---|---|
| 5 | 157.9 | 11.7 | 0.57 | 276x | 20x |
| 10 | 346.0 | 22.3 | 1.52 | 228x | 15x |
| 50 | 1,664 | 108.1 | 8.41 | 198x | 13x |
| 100 | 3,509 | 222.1 | 14.4 | 244x | 16x |
count — vary rows (50 patterns, query time only)¶
| Rows | pandas (ms) | polars (ms) | parot (ms) | vs pandas | vs polars |
|---|---|---|---|---|---|
| 10,000 | 131.4 | 147.4 | 0.11 | 1,148x | 1,288x |
| 100,000 | 642.0 | 920.0 | 0.15 | 4,214x | 6,038x |
| 500,000 | 4,736 | 6,190 | 0.17 | 27,694x | 36,201x |
This is the 36,000x headline — 500K rows, 50 patterns, parot at 0.17 ms vs polars at 6.19 s.
Break-even — contains (100K rows)¶
| Patterns | pandas | polars | parot | Winner |
|---|---|---|---|---|
| 1 | 34 ms | 3 ms | 1.56 s | polars |
| 50 | 1,649 ms | 110 ms | 1.53 s | polars |
| 500 | 18 s | 1.1 s | 1.59 s | polars |
| 1,000 | 37 s | 2.3 s | 1.72 s | parot |
| 2,500 | 90 s | 5.5 s | 1.90 s | parot |
Parot eats a ~1.5 s index-build cost. It pays off around the thousand-pattern mark on 100K rows of text. For one-off contains lookups, use polars.
Break-even — count (100K rows)¶
| Patterns | pandas | polars | parot | Winner |
|---|---|---|---|---|
| 1 | 16 ms | 23 ms | 1.64 s | pandas |
| 100 | 1,298 ms | 1,860 ms | 1.56 s | pandas |
| 250 | 3,272 ms | 4,704 ms | 1.59 s | parot |
| 500 | 6,456 ms | 9,379 ms | 1.53 s | parot |
Parot vs Aho-Corasick vs Regex (multi-pattern)¶
Source: bench_multipattern.py · raw JSON
Vary pattern count on 1 MB corpus (query only, all pre-built)¶
| Patterns | str.count | re.findall | pyahocorasick | ahocorasick-rs | parot | parot batch |
|---|---|---|---|---|---|---|
| 10 | 4.78 ms | 6.33 ms | 7.12 ms | 5.40 ms | 0.020 ms | 0.051 ms |
| 50 | 22.8 ms | 36.7 ms | 12.0 ms | 6.05 ms | 0.14 ms | 0.082 ms |
| 200 | 94.8 ms | 148.6 ms | 14.6 ms | 12.1 ms | 0.61 ms | 0.18 ms |
| 1,000 | 453.9 ms | 719.2 ms | 18.5 ms | 16.1 ms | 3.06 ms | 0.69 ms |
Vary corpus size with 200 patterns¶
| Corpus | str.count | re.findall | pyahocorasick | ahocorasick-rs | parot | parot batch |
|---|---|---|---|---|---|---|
| 100 KB | 9.20 ms | 12.8 ms | 1.41 ms | 1.00 ms | 0.52 ms | 0.16 ms |
| 1 MB | 93.0 ms | 146.1 ms | 14.2 ms | 11.8 ms | 0.62 ms | 0.20 ms |
| 10 MB | 834.3 ms | 1,289 ms | 149.5 ms | 112.7 ms | 0.69 ms | 0.75 ms |
At 10 MB × 200 patterns: parot batch_count is 1,112x faster than str.count, 1,719x faster than re.findall, and 150x faster than ahocorasick-rs.
Build time (1 MB corpus)¶
| Method | Build time |
|---|---|
| re.findall | 0 (no build) |
| pyahocorasick | 0.10 ms |
| ahocorasick-rs | 0.16 ms |
| parot | 61.3 ms |
Aho-Corasick indexes the patterns in microseconds; parot indexes the text in tens of milliseconds. Parot wins any time you re-query the same text with new patterns.
Duplicate phrase detection¶
Source: bench_duplicates.py · raw JSON · CLI corpus data
No other library provides word-boundary-aware duplicate phrase detection. These numbers are from the find_duplicates Python API; the parot scan CLI hits the same figures.
Absolute performance (real Gutenberg text)¶
| Text | Size | Time | Phrases found |
|---|---|---|---|
| Pride and Prejudice | 752 KB | 75 ms | 2,025 |
| Proust: In Search of Lost Time | ~3.4 MB | 564 ms | 20,192 |
| King James Bible | 4.6 MB | 2,385 ms | 57,009 |
| Complete Shakespeare | 5.6 MB | 796 ms | 10,308 |
| Dickens Collected (16 novels) | ~22 MB / 3.9M words | 6,770 ms | 171,060 |
The Bible (4.6 MB) is slower than Shakespeare (5.6 MB) because it contains 5.5x more repeated phrases (57K vs 10K). Output size, not input size, dominates. Dickens Collected is an end-to-end stress test: 16 novels, 3.9M words, 22 MB, 171K duplicate phrases in under 7 seconds.
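As a reference point for what the parameters mean, a brute-force version of the task. Quadratic and illustrative only — parot's find_duplicates reaches the same kind of answer through its index, and this naive sketch also counts every sub-phrase of a longer repeat:

```python
import re
from collections import Counter

def find_duplicates_naive(text: str, *, min_words: int = 4,
                          min_chars: int = 9, max_words: int = 50) -> dict:
    """Every word-boundary phrase of min_words..max_words words that occurs
    at least twice. Brute force: fine for a paragraph, not for a novel."""
    words = re.findall(r"\S+", text)
    counts: Counter = Counter()
    for k in range(min_words, min(max_words, len(words)) + 1):
        for i in range(len(words) - k + 1):
            phrase = " ".join(words[i:i + k])
            if len(phrase) >= min_chars:      # min_chars filters trivial repeats
                counts[phrase] += 1
    return {p: c for p, c in counts.items() if c >= 2}
```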
Parameter sensitivity (Pride & Prejudice)¶
| min_words | min_chars | max_words | Time | Phrases |
|---|---|---|---|---|
| 3 | 6 | 50 | 127.5 ms | 8,703 |
| 4 | 9 | 50 | 76.9 ms | 2,025 |
| 5 | 12 | 50 | 64.4 ms | 470 |
| 8 | 20 | 50 | 58.4 ms | 10 |
| 4 | 9 | 20 | 76.4 ms | 2,025 |
Looser params (min_words=3) find 4x more phrases at 1.7x the cost. The default (4/9) is the precision/recall sweet spot.
Normalization modes (Pride & Prejudice)¶
| Mode | Time | Phrases | Delta vs default |
|---|---|---|---|
| default | 76.3 ms | 2,025 | — |
| case_insensitive | 87.3 ms | 1,614 | +14% time, −20% phrases |
| collapse_whitespace | 78.6 ms | 2,025 | +3% time, 0 phrases |
| both | 243.1 ms | 2,263 | +219% time, +12% phrases |
case_insensitive collapses phrases that differ only in case (e.g. "he said"/"He said" become one entry), which is why it reports fewer distinct phrases despite covering more of the text. Enable it when you care about deduplication, not when you care about maximum phrase count.
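The phrase-count drop is just case variants merging into one key:

```python
phrases = ["He said", "he said", "she said"]

distinct_default = len(set(phrases))                # case-distinct keys
distinct_ci = len({p.casefold() for p in phrases})  # "He said"/"he said" merge
```

Fewer distinct phrases, but each one covers more occurrences in the text.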
Block detection¶
| Mode | Time | Phrases |
|---|---|---|
| False | 74.2 ms | 2,025 |
| True | 76.6 ms | 2,025 |
Zero-cost toggle. Block detection matters for texts with repeated paragraphs/sections (documentation, legal boilerplate).
Parot vs comparator A (Rust-backed text index on PyPI)¶
Source: bench_vs_index.py · raw JSON — the closest Rust-backed competitor on PyPI.
Build time¶
| Text size | parot (ms) | comparator A (ms) | Winner |
|---|---|---|---|
| 100 KB | 3.4 | 12.8 | parot (3.7x) |
| 1 MB | 35.4 | 76.9 | parot (2.2x) |
| 10 MB | 370.1 | 1,425 | parot (3.8x) |
count — 100 patterns, query only¶
| Text size | parot (ms) | comparator A (ms) | Winner |
|---|---|---|---|
| 100 KB | 0.28 | 0.29 | tied |
| 1 MB | 0.94 | 0.72 | comparator A (1.3x) |
| 10 MB | 0.35 | 0.41 | parot (1.2x) |
find_all — 100 patterns, query only¶
parot uses batch_find_all; comparator A has no batch API, so we loop its per-pattern position lookup (as a user would).
| Text size | Patterns | Total hits | parot batch_find_all (ms) | comparator A loop (ms) | Winner |
|---|---|---|---|---|---|
| 100 KB | 100 | 201 | 0.11 | 1.87 | parot (17x) |
| 1 MB | 100 | 3,523 | 0.12 | 3.18 | parot (26x) |
| 10 MB | 100 | 17,876 | 0.35 | 43.3 | parot (125x) |
Memory — apples-to-apples¶
Both libraries measured via serialized-artifact size (Index.to_bytes() vs pickle.dumps(...)). The parot heap column is the live Python/Rust runtime footprint — comparator A has no equivalent accessor, so it's parot-only.
| Text size | Raw | parot serialized | parot ser. ratio | comparator A pickle | comparator A ratio | parot heap | parot heap ratio |
|---|---|---|---|---|---|---|---|
| 100 KB | 99,999 | 420,452 | 4.2x | 126,949 | 1.3x | 507,392 | 5.1x |
| 1 MB | 999,988 | 4,553,401 | 4.6x | 1,174,839 | 1.2x | 5,052,836 | 5.1x |
| 10 MB | 9,999,988 | 50,506,913 | 5.1x | 13,152,252 | 1.3x | 50,506,356 | 5.1x |
Comparator A is ~4x smaller on disk because it keeps a more compact compressed form and drops the positional detail parot preserves. Parot keeps that detail — that's what powers batch_find_all's 125x lead over comparator A's per-pattern position lookup.
Summary
Comparator A is ~4x smaller on disk and ~1.3x faster on single-pattern count. Parot is 2–3x faster on build, 17–125x faster on batch find_all, and brings LCS + LPF + duplicate detection + serialization + sentence splitting + WASM + CLI with it.
Break-even — when does parot pay off?¶
Source: bench_breakeven.py · raw JSON
Per-query cost¶
| Text size | str.count per query | parot build | parot per query | Break-even |
|---|---|---|---|---|
| 100 KB | 0.045 ms | 5.22 ms | 0.003 ms | 125 queries |
| 1 MB | 0.46 ms | 59.8 ms | 0.003 ms | 131 queries |
| 10 MB | 4.29 ms | 750.5 ms | 0.003 ms | 176 queries |
Total time (build + N queries) on 10 MB text¶
| Queries | str.count | parot | Winner |
|---|---|---|---|
| 1 | 4.3 ms | 751 ms | str.count() |
| 50 | 214 ms | 751 ms | str.count() |
| 100 | 429 ms | 751 ms | str.count() |
| 250 | 1,072 ms | 751 ms | parot |
| 500 | 2,144 ms | 752 ms | parot |
| 1,000 | 4,288 ms | 754 ms | parot (5.7x) |
Break-even clusters around 125–180 queries regardless of text size. After that, every additional query costs parot 3 μs and str.count milliseconds-to-seconds.
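The break-even column is just build cost divided by the per-query delta; a one-liner reproduces the table from the numbers above:

```python
import math

def break_even_queries(build_ms: float, fast_ms: float, slow_ms: float) -> int:
    """Smallest N for which build_ms + N * fast_ms beats N * slow_ms."""
    return math.ceil(build_ms / (slow_ms - fast_ms))

# 100 KB / 1 MB / 10 MB rows reproduce as 125 / 131 / 176 queries.
```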
Parot memory — serialized vs heap¶
Source: bench_memory.py · raw JSON
Parot keeps full positional detail alongside its compressed index — unlike comparator B, which drops positional detail in favor of a smaller artifact. The trade: parot is ~4x larger on disk but supports batch_find_all, segment APIs, and shared-prefix analysis without a rebuild.
| Corpus | Raw | parot serialized | parot ser. ratio | parot heap | parot heap ratio | comparator A (index+text) | comparator A ratio | comparator B pickle | comparator B ratio |
|---|---|---|---|---|---|---|---|---|---|
| 1 MB | 977 KB | 4.3 MB | 4.6x | 4.8 MB | 5.1x | 4.8 MB | 5.0x | 1.1 MB | 1.2x |
| 4 MB | 3.8 MB | 18.3 MB | 4.8x | 19.3 MB | 5.1x | 19.1 MB | 5.0x | 4.8 MB | 1.2x |
| 16 MB | 15.3 MB | 77.1 MB | 5.0x | 77.1 MB | 5.1x | 76.3 MB | 5.0x | 20.2 MB | 1.3x |
| 64 MB | 61.0 MB | 323.5 MB | 5.3x | 308.3 MB | 5.1x | 305.2 MB | 5.0x | 81.4 MB | 1.3x |
| 256 MB | 244.1 MB | 1.32 GB | 5.6x | 1.20 GB | 5.1x | 1.19 GB | 5.0x | 314 MB | 1.3x |
Parot serialized ≈ comparator A (index+text) ≈ 5x raw. Parot heap is essentially identical (5.1x) — the Rust Vecs pack tightly. Comparator B's pickle sits at 1.3x because it drops positional detail.
Memory/speed tradeoff¶
Source: bench_locate_levels.py · raw JSON
The memory_compactness parameter (0–4) controls the memory/speed tradeoff for find_all(). count() is unaffected — its speed stays identical across levels.
| memory_compactness | Build (ms) | Memory (bytes) | count (ms) | find_all (ms) |
|---|---|---|---|---|
| 0 (default) | 60.3 | 9,052,792 | 0.28 | 0.31 |
| 1 | 61.8 | 2,302,872 | 0.28 | 1.04 |
| 2 | 60.1 | 1,677,880 | 0.28 | 2.60 |
| 3 | 60.1 | 1,365,384 | 0.28 | 6.03 |
| 4 | 60.5 | 1,209,136 | 0.28 | 13.2 |
count() is flat at ~0.28 ms regardless of level. find_all() spans 42x (0.31 ms → 13.2 ms). Memory spans 7.5x (1.2 MB → 9.1 MB on a 1 MB corpus).
Normalization overhead (case-insensitive, collapse-whitespace)¶
| Mode | Build (ms) | Memory (bytes) | count (ms) |
|---|---|---|---|
| default | 60.0 | 1,677,880 | 0.28 |
| case_insensitive | 64.5 | 1,677,912 | 0.28 |
| normalize_whitespace | 63.1 | 1,676,056 | 0.30 |
| both | 65.2 | 1,676,072 | 0.28 |
Normalization is effectively free at build and query time — the cost is paid up front when the text is copied and folded.
Parot build time¶
Parot's index construction runs in linear time. The slope is verified against two other Rust-backed text indexes on PyPI.
Build time at reference sizes (from bench_scaling.py):
| Text size | parot build |
|---|---|
| 1 KB | 0.10 ms |
| 10 KB | 0.50 ms |
| 100 KB | 5.00 ms |
| 1 MB | 58.8 ms |
| 10 MB | 655.2 ms |
Memory grid — peak RSS + on-disk artifact¶
The two dimensions that matter in production: peak build RAM (how much headroom you need to construct the index) and on-disk artifact (how much space it takes to persist the index once built). Both swept across corpus size × memory_compactness.
Peak build RAM¶
On-disk artifact (serialized)¶
Peak build RAM is always ~5x corpus regardless of memory_compactness — the full index is materialized during construction and discarded after. The on-disk artifact shrinks by up to ~4x as memory_compactness rises.
JavaScript / WASM¶
Index construction: WASM vs mnemonist (M4)¶
Source: bench_vs_mnemonist.mjs (lives in the parot-core JS benchmark suite). mnemonist is the standard JS package in this space (8.7M downloads/week, pure JS).
| Input size | parot (ours) | mnemonist | Speedup |
|---|---|---|---|
| 10 K chars | 0.44 ms | 4.7 ms | 11x |
| 100 K chars | 3.9 ms | 58 ms | 15x |
| 500 K chars | 20 ms | 330 ms | 17x |
| 1 M chars | 42 ms | 787 ms | 19x |
| 10 M chars | 730 ms | DNF | — |
| 100 M chars | 9.8 s | DNF | — |
mnemonist DNFs at 10M+. Parot's linear-time construction scales to 100M chars. Parot is the only WASM-based positional index on npm.
Parot in the browser — vs indexOf + Aho-Corasick (M4)¶
Source: bench_parot_index.mjs (lives in the parot-core JS benchmark suite). Parot is the only positional search index of its kind on npm.
100 patterns, varying text size:
| Text size | indexOf loop | modern-ahocorasick | parot | vs indexOf |
|---|---|---|---|---|
| 10 KB | 0.32 ms | 0.34 ms | 0.22 ms | 1.5x |
| 100 KB | 3.8 ms | 2.4 ms | 0.22 ms | 17x |
| 1 MB | 39 ms | 24 ms | 0.26 ms | 143x |
| 10 MB | 383 ms | 224 ms | 0.29 ms | 1,287x |
Parot query time stays flat at ~0.3 ms regardless of text size. indexOf and Aho-Corasick scale linearly. Build cost: 54 ms on 1 MB, 1.7 MB WASM heap.
Longest common substring vs pylcs (M4)¶
Benchmark pending rewrite
These numbers are from a legacy bench_sa_construction.py retired when the suite was tiered. The LCS story will be restored as a dedicated script under benchmark/docs/. The table below reflects the last captured run.
vs pylcs (C++ DP backend):
| Input size | parot (ours) | pylcs | Speedup |
|---|---|---|---|
| 1 KB | 0.24 ms | 3.4 ms | 14x |
| 5 KB | 0.41 ms | 85 ms | 206x |
| 10 KB | 0.62 ms | 331 ms | 532x |
Linear-time index vs quadratic DP — the speedup keeps widening as the input grows.
Index construction (Rust native, M4)¶
Benchmark pending rewrite
Same retirement situation as the LCS table. Run the Rust-native benches in parot-core (cargo bench) to reproduce today.
Parot automatically selects among internal construction backends based on input size. Typical construction times on an M2 Pro (single-thread):
| Input size | Build |
|---|---|
| 1–5 KB | 0.02–0.10 ms |
| 50 KB | 0.72 ms |
| 500 KB | 6.9 ms |
| 1 MB | 13.9 ms |
| 5 MB | 66 ms |
The substring gap (token-based search libraries miss substrings)¶
Token-based search libraries (fuse.js, lunr, minisearch) split text into words. They can't find arbitrary substrings:
| Query | fuse.js | lunr | minisearch | Parot |
|---|---|---|---|---|
| "rown fox" | -- | Y | Y | Y |
| "ATTA" | -- | -- | -- | Y |
| "Script dev" | -- | -- | -- | Y |
| "ows-Whe" | -- | -- | -- | Y |
| "ix arr" | -- | -- | -- | Y |
| "uick brown f" | -- | Y | Y | Y |
| Score | 0/6 | 2/6 | 2/6 | 6/6 |
Reproduce with benchmark/utils/demo_substring_gap.mjs. Parot's WASM build runs in any browser — no server round-trip, no backend needed.
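The gap is reproducible without any library — a word tokenizer simply has no entry for a span that starts or ends mid-token:

```python
text = "The gene reads GATTACA in this strand"
tokens = set(text.lower().split())   # what a token-based engine indexes

mid_token_hit = "atta" in tokens     # no whole token equals "atta"
substring_hit = "atta" in text.lower()  # a positional index sees the raw span
```

`mid_token_hit` is False while `substring_hit` is True — the "ATTA" row above, in two lines.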
Running the benchmarks yourself¶
All commands run from the project root.
```shell
# Aggregate: runs the full benchmark suite
just bench

# Smoke test: every script, BENCH_QUICK=1, ~30s total
just bench-quick

# The scaling headline: parot vs pyahocorasick / str.count / pandas / polars /
# pyarrow / duckdb on a 10 MB Wikipedia corpus, sweeping N in {10, 100, 1K,
# 10K, 100K} and writing a log-log SVG to benchmark/charts/.
just bench-query-scaling        # Python + JS, plus the SVG
just bench-query-scaling-run    # Python + JS only (no plot)
just bench-query-scaling-py     # Python sweep only (slow)
just bench-query-scaling-js     # JS sweep only (fast)
just bench-query-scaling-plot   # re-render SVG from saved JSON
```
Running individual benchmark scripts directly¶
Because each benchmark script is a standalone Python or Node program that writes its own JSON to benchmark/output/{chip}/, you can invoke them one at a time:
```shell
# Headlines (Tier 1 — the pitch deck)
uv run benchmark/headlines/bench_hero_locate.py      # 16 competitors vs parot, 10 MB Wikipedia
uv run benchmark/headlines/bench_hero_plot.py        # 1 GB Wikipedia, 100 patterns
uv run benchmark/headlines/bench_query_scaling.py    # Python side of the scaling sweep
uv run benchmark/headlines/bench_scaling.py          # parot query time vs corpus size
uv run benchmark/headlines/bench_breakeven.py        # build-once-query-many crossover
uv run benchmark/headlines/bench_vs_index.py         # vs a Rust-backed text index on PyPI
node benchmark/headlines/bench_query_scaling_js.mjs  # JS side of the scaling sweep

# Memory (Tier 2 — footprint + tradeoffs)
uv run benchmark/memory/bench_memory.py              # 5x heap linearity + disk artifact
uv run benchmark/memory/bench_locate_levels.py       # memory_compactness memory/speed tradeoff
uv run benchmark/memory/bench_build_time.py          # O(n) build-time slope
uv run benchmark/memory/bench_memory_grid.py         # peak RSS x corpus size x memory_compactness

# Docs (Tier 3 — application evidence)
uv run benchmark/docs/bench_multipattern.py          # parot vs Aho-Corasick + regex
uv run benchmark/docs/bench_duplicates.py            # duplicate phrase detection
uv run benchmark/docs/bench_genomics.py              # CRISPR guide search vs BioPython
uv run benchmark/docs/bench_dna.py                   # DNA motif sweep
# bench_dataframes.py ships inside the parot-core benchmark suite rather than
# this public repo; its JSON output is committed under benchmark/output/.

# JS / WASM
node benchmark/bench_hero_js.mjs                     # WASM parot vs indexOf vs modern-ahocorasick
node benchmark/bench_scale.mjs                       # WASM parot scale sweep (1K -> 5M words)
```
Every script writes to benchmark/output/{chip}/bench_*.json and (for most) benchmark/output/{chip}/md/bench_*.md. The chip slug is auto-detected (m2-pro, m4, ...). Delete the matching JSON to force a re-run.
Regenerating charts¶
Most benchmark scripts now emit their own {name}_{dark,light}.svg pair under benchmark/output/{chip}/{hero,memory,docs}/ as part of the run. To refresh the images on this page, re-run the corresponding benchmark and copy the SVG pair into docs/assets/ with the naming convention <topic>-{dark,light}.svg.
Large-corpus runs (workstation only)¶
```shell
FSA_BENCH_XL=1 uv run benchmark/headlines/bench_scaling.py    # 10 GB tier
FSA_BENCH_XXL=1 uv run benchmark/headlines/bench_scaling.py   # 100 GB tier
```
The default (ungated) tier is laptop-runnable. Each gated tier needs the corresponding corpus downloaded via benchmark/utils/download.sh.