Labs

Matching lab

This space separates experimental work from the main directory. We will test the zero-cost pipeline here for cross-matching public missing-person reports with public located-person lists.

Visual review of possible matches

This page lists possible matches for human review. The percentage is a heuristic, not a probability: an exact ID match carries much more weight than name similarity.

How the score is calculated

The current formula does not use machine learning. It is a review heuristic: it combines strong and weak signals to suggest human priority.

1. ID comes first

If the ID matches exactly, the case moves close to the maximum. If the IDs conflict, the score drops even when the name looks similar.

2. Then the name

Without an ID, most of the weight comes from the similarity between the normalized names, graded as exact, very close, strong, or moderate.
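As an illustration of the name step, here is a minimal sketch that normalizes two names and grades their similarity. It uses Python's standard difflib as a stand-in for whatever comparison the lab actually runs. The function names and the tier thresholds (0.92, 0.85, 0.75) are assumptions made for this example, not values taken from the project.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Strip accents, lowercase, and collapse runs of whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(without_accents.lower().split())


def name_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between the two normalized names."""
    return SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()


def name_tier(similarity: float) -> str:
    """Map a similarity ratio to a tier. Thresholds are hypothetical."""
    if similarity >= 1.0:
        return "exact"
    if similarity >= 0.92:
        return "very close"
    if similarity >= 0.85:
        return "strong"
    if similarity >= 0.75:
        return "moderate"
    return "weak"
```

With these assumptions, "José  PÉREZ" and "jose perez" normalize to the same string and land in the "exact" tier.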

3. Age and location adjust

Age and location only adjust the result. An exact or nearby age adds a little; a large difference subtracts. Location adds a small boost and never replaces identity.
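The three steps above could be combined roughly as follows. This is a sketch under stated assumptions, not the lab's actual formula: every weight (95 for an ID match, 85 as the name ceiling, the 40-point conflict penalty, the age and location adjustments) and the function name match_score are invented here to show the shape of the heuristic.

```python
from typing import Optional


def match_score(
    id_a: Optional[str],
    id_b: Optional[str],
    name_sim: float,
    age_a: Optional[int],
    age_b: Optional[int],
    same_location: bool,
) -> int:
    """Return a 0-99 review score. All weights are hypothetical."""
    ids_known = id_a is not None and id_b is not None

    if ids_known and id_a == id_b:
        # 1. An exact ID match moves the case close to the maximum.
        score = 95.0
    else:
        # 2. Otherwise the name similarity (0.0 to 1.0) carries the weight.
        score = 85.0 * name_sim
        if ids_known:
            # Conflicting IDs pull the score down even for similar names.
            score -= 40.0

    # 3. Age only adjusts: a close age adds a little, a large gap subtracts.
    if age_a is not None and age_b is not None:
        gap = abs(age_a - age_b)
        if gap <= 2:
            score += 3.0
        elif gap > 10:
            score -= 10.0

    # Location adds a small boost and never replaces identity.
    if same_location:
        score += 2.0

    return round(max(0.0, min(99.0, score)))
```

Under these example weights, an identical name with a matching age and location but no ID tops out at 90, so only an exact ID match can reach the highest band.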

How to read the percentages

93–99%: high review priority
80–92%: possible match
0–79%: weak or ambiguous signal

The percentage does not confirm identity on its own. Review the evidence shown on the card and, when available, verify the ID, age, hospital, and original source.
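The bands can be expressed as a small lookup. The function name review_band is an assumption for this sketch; the cut-offs are the ones listed above. The page does not define a band for 100, so the sketch simply treats any value of 93 or more as high priority.

```python
def review_band(score: int) -> str:
    """Translate a percentage into the review band described on this page."""
    if score >= 93:
        return "high review priority"
    if score >= 80:
        return "possible match"
    return "weak or ambiguous signal"
```

The label only orders the review queue; a reviewer still has to confirm identity from the evidence on the card.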

Initial version

We start without a paid backend: GitHub Actions for source syncs, static JSON files for publishing results, and manual review before confirming any match.

Expected outputs

Normalized missing-people dataset.
Normalized located-people dataset.
Candidate-match dataset with score and explanation.
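To make the third output concrete, here is one possible shape for a candidate-match record and its serialization to static JSON. The field names (missing_id, located_id, score, explanation) and the function candidates_to_json are hypothetical; the published schema may differ.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Candidate:
    """One suggested match between a missing report and a located record."""
    missing_id: str   # hypothetical key of the missing-person report
    located_id: str   # hypothetical key of the located-person record
    score: int        # heuristic percentage, 0-99
    explanation: list[str] = field(default_factory=list)  # signals behind the score


def candidates_to_json(candidates: list[Candidate]) -> str:
    """Serialize candidates with the highest score first, keeping accents readable."""
    ordered = sorted(candidates, key=lambda c: c.score, reverse=True)
    return json.dumps([asdict(c) for c in ordered], ensure_ascii=False, indent=2)
```

Keeping the explanation next to the score lets a reviewer see which signals produced the percentage without rerunning the pipeline.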

Summary JSON · Missing JSON · Located JSON · Candidates JSON

Zero-dollar infrastructure

Scheduled GitHub Actions to fetch Kobo and Localizados.
Repository scripts for parsing, normalization, and scoring.
Generated datasets under docs/ to serve them through GitHub Pages.
Manual review of suggested matches before anything is published.

Pipeline we will build

Ingestion: fetch Kobo and Localizados on each run.
Parsing: extract name, ID, age, contact, and location from free text.
Normalization: clean accents, casing, whitespace, and name variants.
Matching: generate candidates by ID, name, age, and location.
Review: publish suggested matches separately for human verification.
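The parsing step is the least structured part of the pipeline, so a small sketch may help. It extracts only an ID and an age from free text with regular expressions. The patterns are assumptions about how reports might be written (an ID of 6 to 9 digits prefixed by V or E, and an age followed by "años", "anos", or "years"); real reports will need broader rules, and name, contact, and location are left out here.

```python
import re
from typing import Optional

# Hypothetical ID pattern: "V-12.345.678", "V12345678", "E-1234567", and similar.
ID_PATTERN = re.compile(r"\b[VE]-?\s?(\d{1,3}(?:\.?\d{3}){1,2})\b", re.IGNORECASE)

# Hypothetical age pattern: a number followed by a word meaning "years".
AGE_PATTERN = re.compile(r"\b(\d{1,3})\s*(?:años|anos|years?)\b", re.IGNORECASE)


def parse_report(text: str) -> dict:
    """Extract an ID (digits only) and an age from free text, or None for each."""
    id_match = ID_PATTERN.search(text)
    age_match = AGE_PATTERN.search(text)

    person_id: Optional[str] = None
    if id_match:
        # Drop thousands separators so IDs compare as plain digit strings.
        person_id = id_match.group(1).replace(".", "")

    age: Optional[int] = int(age_match.group(1)) if age_match else None
    return {"id": person_id, "age": age}
```

Returning None for a missing field, rather than guessing, keeps the later matching step from treating an absent ID as a conflict.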

Confirmed public sources

KoboToolbox · TerremotoVE

Public JSON feed with reports of missing people, missing families, rescued people, and related events.

Localizados Venezuela API

Public read-only API with people located in hospitals and other facilities.

Later this can move to a subdomain like labs.directorioterremotovenezuela.org, but starting with /labs/ is the simplest and cheapest option.