Sign up, collect key
Create a free account. Your API key is generated instantly — no credit card, no sales call.
Audit analytics assembled from the data you’re auditing is a jigsaw puzzle solved without the picture. Recovering ground truth from observed enterprise data is combinatorially infeasible.
VynFi generates the reference forward, from a fully specified model where every node’s provenance is known.
One curl command. Reference data in your terminal before you finish reading this sentence.
No credit card. No sales call. Your first reference knowledge graph generates in under three minutes.
Call the API with your sector, tables, and row count. Receive a fully provenanced reference knowledge graph.
Use reference data for testing, ML training, and compliance workflows — with known audit trail for every node.
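For readers who prefer Python to curl, the API call described above can be sketched with the standard library alone. The endpoint, headers, and payload fields are the ones this page shows in its curl example; the helper name `build_quick_request` and the placeholder key are illustrative assumptions, not part of any VynFi SDK. The response shape is not documented here, so the sketch only builds the request.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://api.vynfi.com/v1/generate/quick"


def build_quick_request(api_key: str, preset: str, table: str, rows: int) -> urllib.request.Request:
    """Build (but do not send) a POST request for the quick-generate endpoint.

    Hypothetical helper for illustration; not part of any official SDK.
    """
    payload = {
        "preset": preset,
        "tables": [table],
        "rows": {table: rows},
        "format": "json",
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "YOUR_API_KEY" is a placeholder; substitute the key from your account.
request = build_quick_request("YOUR_API_KEY", "retail_small", "journal_entries", 1000)
```

Passing `request` to `urllib.request.urlopen` would send it; what comes back depends on the API and is not assumed here.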
First-class SDKs for Python, TypeScript, Rust, and .NET. Or just use curl.
curl https://api.vynfi.com/v1/generate/quick \
  -H "Authorization: Bearer vf_live_7mN4kP2x..." \
  -H "Content-Type: application/json" \
  -d '{
    "preset": "retail_small",
    "tables": ["journal_entries"],
    "rows": { "journal_entries": 1000 },
    "format": "json"
  }'

Audit firms, fintech engineers, academic researchers, compliance teams, and ESG reporters: all running reference data against the same model.
Journal entries with known anomalies for audit analytics testing. Calibrated to real-world distributions.
Build and test financial applications with production-quality synthetic data. Zero real customer exposure.
Large-scale labelled datasets for fraud detection, process mining, and financial ML research.
Test SOX, Basel III, and IFRS workflows with COSO control mappings and evaluation reports.
CSRD/TCFD reporting pipelines with Scope 1/2 emissions, workforce diversity, and pay equity analysis.
Four integrated audit methodologies, group-audit simulation per ISA 600, and complete end-to-end audit data generation.
4 integrated blueprints with 728–757 steps each — KPMG Clara, PwC Aura, Deloitte Omnia, EY GAM.
Component-auditor simulation with Significant, Non-Significant, Not-in-Scope classification and consolidated reporting.
From journal entries to board minutes — IT reports, management packs, regulatory filings, and more.
Disco, Celonis IBC, XES 2.0, and OCEL 2.0 format support for process-mining research and tooling.
14 fully-implemented typologies, multi-party criminal networks, cross-layer fraud propagation from payments to bank transactions, and 10 evaluators to prove it.
Structuring, smurfing, mule chains, synthetic identity, trade-based money laundering, crypto integration, sanctions evasion, romance scam, casino and real-estate integration, all with ground-truth labels.
Barabási-Albert preferential-attachment topology — one coordinator + 5–25 smurfs, mule chains with recruiter / middleman / cash-out roles, shell-company pyramids.
Rolling-window counts (1h/24h/7d/30d), unique counterparties, amount z-scores, and realistic power-law device fingerprint distributions — pre-computed.
A fraudulent vendor payment surfaces in document flow, journal entries, AND on both sides of a mirrored bank-transaction pair — ≥95% fraud-label propagation.
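To make the pre-computed features above concrete, here is a minimal sketch of how rolling-window counts, unique counterparties, and an amount z-score could be derived for one account. The function names, the toy transactions, and the choice of a population standard deviation are assumptions for illustration; the engine's actual feature definitions are not specified on this page.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def rolling_count(timestamps, at, window):
    """Count events in the half-open window (at - window, at]."""
    start = at - window
    return sum(1 for t in timestamps if start < t <= at)


def amount_zscore(amount, history):
    """Z-score of `amount` against past amounts; 0.0 if history has no spread."""
    if len(history) < 2:
        return 0.0
    sigma = pstdev(history)
    return 0.0 if sigma == 0 else (amount - mean(history)) / sigma


# Toy transactions for one account: (timestamp, counterparty, amount).
# Illustrative data only, not output from the API.
now = datetime(2026, 1, 10, 12, 0)
txns = [
    (now - timedelta(days=3), "acme", 100.0),
    (now - timedelta(hours=20), "acme", 110.0),
    (now - timedelta(minutes=30), "globex", 90.0),
    (now, "initech", 400.0),
]
times = [t for t, _, _ in txns]

features = {
    "count_1h": rolling_count(times, now, timedelta(hours=1)),
    "count_24h": rolling_count(times, now, timedelta(hours=24)),
    "count_7d": rolling_count(times, now, timedelta(days=7)),
    "unique_counterparties": len({c for _, c, _ in txns}),
    # Compare the latest amount against the earlier ones only.
    "amount_z": amount_zscore(400.0, [a for _, _, a in txns[:-1]]),
}
```

The last transaction is far above the account's earlier amounts, so its z-score is large; that is the kind of signal an amount-based anomaly detector would key on.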
Data streams direct to Azure Blob with short-lived SAS downloads — or bring your own storage and keep zero bytes on VynFi. For live pipelines, NDJSON streaming at up to 10,000/sec.
Lifecycle retention (7d Free → 365d Scale). Per-file SAS URLs — direct blob access, no API proxy, no 2 GB cap, no OOM kills.
Supply a container SAS URL and the worker uploads directly to your data lake. Zero bytes transit our storage. Pair with Private Link for airgapped flows.
GET /v1/jobs/{id}/stream/ndjson emits self-describing envelopes with token-bucket rate-limiting. Point Kafka, Spark, ClickHouse at it and ingest live.
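On the consuming side, NDJSON is simply one JSON object per line, so a reader can be sketched in a few lines of Python. The envelope keys used below (`type`, `table`, `data`) are hypothetical, since this page does not specify the envelope schema; only the line-by-line decoding is standard.

```python
import json
from typing import Iterable, Iterator


def iter_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per non-blank line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical envelopes: the keys "type", "table", and "data" are assumptions.
sample = [
    '{"type": "record", "table": "journal_entries", "data": {"id": 1}}',
    "",
    '{"type": "record", "table": "journal_entries", "data": {"id": 2}}',
]
records = [envelope["data"] for envelope in iter_ndjson(sample)]
```

Against a live stream, `lines` would be any iterable of decoded text lines from the HTTP response; because the generator is lazy, records can be forwarded downstream as they arrive.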
Two new enterprise-grade output formats. Reference data that drops directly into S/4HANA IMPORT or OECD tax-software validators — no manual CSV pre-processing, no hand-written XML.
BKPF / BSEG / ACDOCA plus five master-data tables (LFA1 / KNA1 / MARA / CSKS / CEPC). HANA dialect for S/4HANA IMPORT, classic for legacy ECC. Configurable client / ledger / source-system tags. FK integrity across the full P2P cycle.
Structurally-valid SAF-T XML for Portugal (1.04_01), Poland (JPK_KR 1.0), Romania (D406 3.0), Norway (Fin 1.10), Luxembourg (FAIA 2.01). Audit-PoC ready for tax-software validation and compliance simulation.
From raw journal entries to audited financial statements, VynFi generates data that passes your reconciliation, audit, and regulatory tests — all derived from a single declarative model.
Complete balance sheet, income statement, cash flow, and equity rollforward — generated from actual journal entry data, not templates.
Multi-stage WIP → Finished Goods → COGS pipeline with standard cost variance accounting and IAS 37 warranty provisions.
Debt interest accrual, cash-flow and fair-value hedge mark-to-market, cash-pool sweeps, and covenant compliance evaluation.
Tax provision computed from actual pre-tax income. VAT posting from source documents. Deferred tax with temporary-difference tracking.
Instance documents mapped to US GAAP and IFRS taxonomies. Test your regulatory filing pipeline with reference data.
FG rollforward, WIP rollforward, trial-balance proof, cash-flow reconciliation, equity rollforward, segment-to-consolidated, IC elimination.
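As an illustration of the simplest of these checks, a trial-balance proof nets debits against credits per account and confirms the grand total is zero. The sketch below assumes hypothetical field names (`account`, `debit`, `credit`) and uses `Decimal` to avoid floating-point drift; it is not the engine's own reconciliation code.

```python
from collections import defaultdict
from decimal import Decimal


def trial_balance(lines):
    """Net each account (debits minus credits) and return (balances, total)."""
    balances = defaultdict(Decimal)
    for line in lines:
        balances[line["account"]] += Decimal(line["debit"]) - Decimal(line["credit"])
    return dict(balances), sum(balances.values(), Decimal(0))


# Hypothetical journal line items; field names are assumptions.
lines = [
    {"account": "1000", "debit": "250.00", "credit": "0"},
    {"account": "4000", "debit": "0", "credit": "250.00"},
    {"account": "1000", "debit": "0", "credit": "40.10"},
    {"account": "6000", "debit": "40.10", "credit": "0"},
]
balances, total = trial_balance(lines)
is_balanced = total == 0
```

A dataset whose journal entries each balance will always pass this proof; a non-zero total points to a one-sided or truncated entry.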
GDPR-ready, EU AI Act aligned, AES-256 at rest and in transit, zero real-client-data ingestion, methodologies of all four Big 4 firms covered.
Validation is not a feature — it is the output. Every reference carries provable bounds on its own distributional fidelity.
Mean absolute deviation from Benford's Law for first-digit compliance. Rated 'excellent conformity' by Nigrini's criteria.
GNN fraud detectors trained on reference data score within 3% F1 of real-data baselines.
Gaussian, Clayton, Gumbel, Frank, and Student-t copulas model complex inter-variable dependencies.
Across 5 categories — timing, amount, relationship, pattern, and structural — with ground-truth labels.
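The first-digit metric mentioned above can be checked independently. The sketch below computes the mean absolute deviation (MAD) between observed first-digit frequencies and the Benford's Law expectation log10(1 + 1/d); the function names are illustrative, and the rating thresholds are left to Nigrini's published criteria rather than restated here.

```python
import math
from collections import Counter

# Expected first-digit proportions under Benford's Law.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x) -> int:
    """Leading significant digit of a non-zero number."""
    # Scientific notation always starts with the leading digit, e.g. '4.2e+01'.
    return int(f"{abs(x):.15e}"[0])


def benford_mad(values) -> float:
    """Mean absolute deviation of first-digit frequencies from Benford's Law."""
    digits = [first_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one non-zero value")
    counts = Counter(digits)
    n = len(digits)
    return sum(abs(counts[d] / n - BENFORD[d]) for d in range(1, 10)) / 9


# Powers of two track Benford's Law closely; a uniform run of integers does not.
mad_powers = benford_mad([2 ** k for k in range(1, 201)])
mad_uniform = benford_mad(range(1, 1000))
```

Running the same function over the amount column of a generated journal-entries table gives a quick, independent sanity check of the first-digit claim.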
The DataSynth engine was calibrated against 155 ISO 21378:2019–compliant general-ledger datasets, encompassing 364 million journal entries and 2.4 billion line items across industries and geographies.
155 general-ledger datasets analyzed for distribution calibration and statistical benchmarking across 10 industry sectors.
364 million journal entries in the calibration corpus used to derive realistic financial patterns and temporal dynamics.
2.4 billion line items processed to build inter-table correlation models and cross-entity relationship graphs.
Localized tax, banking, naming, holidays, and accounting standards — so regional realism is a config flag, not a backlog item.
Start free. Scale when you need it. Credits reset every billing cycle — no rollover games.
“Recovering ground truth from observed enterprise data is combinatorially infeasible. DataSynth circumvents this by generating data forward — producing reference datasets where the complete audit trail is known by construction.”
Ivertowski · 2026 · arXiv:cs.CE
Powered by the DataSynth engine — a purpose-built Rust engine with 16 crates and counting.
10,000 credits, every month, free. No credit card required. Your first reference knowledge graph generates in under three minutes.
You scrolled all the way down. We respect that.