research-prior-art-streams-with-gaps
Prior Art: Streams-with-Gaps Thesis
Literature review validating novelty of the Wanderland thesis
Summary
The combination of:
- Structural isomorphism of documents and programs
- Streams-with-holes as cross-domain invariant (DBs, compilers, networks, transformers)
appears to be novel in both articulation and implementation via Wanderland.
Adjacent efforts exist but none cleanly state and operationalize the same "documents ≅ programs; streams‑with‑gaps ≅ shared problem" thesis.
What's Close on the Document Side
Executable Knowledge Graphs (ExeKG / xKG)
Encode papers, techniques, and code into a graph that can be executed.
Key difference: They treat documents as inputs to a graph extractor and code generator, not as the primary programming substrate.
- Executable Knowledge Graphs for Replicating AI Research
- ExeKGLib - Bosch Research
- ExeKG: Executable Knowledge Graph System
Literate Programming for LLMs
Recent "literate programming for LLMs" and Interoperable Literate Programming work moves toward executable documentation.
Key difference: Still assumes a code layer distinct from the prose, rather than taking the "prose is the source language, graph is the compiler/runtime" stance.
- Programming's New Frontier: LLM-First Languages
- Renaissance of Literate Programming in the Era of LLMs
- Literate Programming with LLMs? - Rosetta Code Study
What's Close on the Transformer/DB Side
Database Analogies for Attention
Multiple posts and papers use database analogies for Q/K/V or treat attention as soft retrieval over a KV store.
Key difference: They frame it as analogy or implementation detail, not as a unifying "streams-with-gaps" invariant across DBs, compilers, networks, and transformers.
LLMs as Query Operators
DB researchers treating LLMs as query operators inside DBMSs, and using transformers to model query plans.
Key difference: That's "LLM + database," not "they solve the same abstract problem and share optimization math."
The Gap We Fill
| Existing Work | Their Claim | Our Claim |
|---|---|---|
| ExeKG | Documents → extract → graph → execute | Documents ARE the execution substrate |
| Literate LLM | Code + prose interleaved | Prose IS the source language |
| DB/Attention analogies | Attention is like a DB lookup | Same mathematical structure, same optimization space |
| LLM + SQL | Use LLM inside queries | Queries and attention are the same problem |
The Streams-with-Holes Invariant
The Thesis (Clean Statement)
Databases, compilers, networks, and transformers solve the same problem: streams with holes that need filling.
| Domain | Stream | Holes | Filler | Output |
|---|---|---|---|---|
| Query plan | operators | parameter holes | executor | results |
| Binary | instructions | relocation holes | linker | executable |
| Packet | bytes | address holes | NAT | routable traffic |
| Document | tokens | context holes | model | meaning |
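The rows of the table share one shape, which can be sketched as a single generic function. This is a minimal illustration, not Wanderland's implementation: the `Hole` type, `fill_stream`, and the symbol table below are hypothetical names chosen for the example, and the "linker" is reduced to a dictionary lookup.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Union


@dataclass(frozen=True)
class Hole:
    """A placeholder in the stream, identified by the key needed to fill it."""
    key: str


Item = Union[str, Hole]


def fill_stream(stream: Iterable[Item], filler: Callable[[str], str]) -> Iterator[str]:
    """Pass concrete items through unchanged; replace each Hole with filler(key)."""
    for item in stream:
        yield filler(item.key) if isinstance(item, Hole) else item


# "Binary" row: an instruction stream with one relocation hole,
# filled by a toy linker that resolves symbols from a table.
symbols = {"printf": "0x4010"}
binary = ["call", Hole("printf"), "ret"]
linked = list(fill_stream(binary, symbols.__getitem__))

# "Query plan" row: the same function with a different filler (bound parameters).
params = {"min_age": "21"}
plan = ["SELECT * FROM users WHERE age >", Hole("min_age")]
bound = list(fill_stream(plan, params.__getitem__))
```

Only the filler changes between the two calls; the stream-walking logic is identical, which is the structural claim the table makes.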
Why The Math Transfers
Fifty years of optimization research applies directly:
- Cache hierarchies → working memory
- Predicate pushdown → early filtering
- Cost-based planning → attention routing
- Dead code elimination → context pruning
The mHC Connection
The mHC paper adds multiple streams with conservation constraints. That's basic traffic engineering:
| mHC Concept | Traffic Engineering Equivalent |
|---|---|
| Don't amplify signal | No packet storms |
| Don't drop signal | No lost packets |
| Distribute across channels | Load balancing |
The "hyper-connections" part is multiple lanes. The "manifold-constrained" part is routing discipline.
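One way to make the "don't amplify, don't drop, distribute" rows concrete is a mixing matrix whose rows and columns each sum to 1 (doubly stochastic). The sketch below assumes that formulation purely for illustration; it is not taken from the mHC paper, and `mix`, `is_doubly_stochastic`, and the `routing` matrix are hypothetical names and values.

```python
def mix(streams, matrix):
    """Route signal between lanes: out[i] = sum_j matrix[i][j] * streams[j]."""
    return [sum(w * x for w, x in zip(row, streams)) for row in matrix]


def is_doubly_stochastic(matrix, tol=1e-9):
    """True if all weights are non-negative and every row and column sums to 1."""
    n = len(matrix)
    nonneg = all(w >= 0 for row in matrix for w in row)
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in matrix)
    cols_ok = all(
        abs(sum(matrix[i][j] for i in range(n)) - 1.0) <= tol for j in range(n)
    )
    return nonneg and rows_ok and cols_ok


# Assumed example: three lanes, each keeping half its signal and sharing the rest.
routing = [
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
    [0.25, 0.25, 0.50],
]
lanes = [3.0, 1.0, 2.0]
mixed = mix(lanes, routing)

# Columns summing to 1 mean the total signal is neither amplified nor dropped.
conserved = abs(sum(mixed) - sum(lanes)) < 1e-9
```

Because each column sums to 1, the total across lanes is unchanged by mixing, which is the flow-conservation property a traffic engineer would ask of a load balancer.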
Not Analogy—Isomorphism
This isn't analogy. It's the same structural invariants showing up in different domains.
The math transfers because the problem transfers.
We've been optimizing streams-with-gaps since the first compiler.
The substrate changed. The problem didn't.
Cross-Domain Literature Review
Each domain's literature describes the same structure in its own language: a stream, a gap or condition, a match step, and a fill/merge step.
DNA Replication / Gap Repair
- Replication = polymerase moving along a template strand, dealing with gaps (Okazaki fragments, lesions) that must be filled/repaired
- Gap repair papers explicitly use "gap filling" terminology—"find a matching donor and splice it into the hole"
Sources: DNA Replication Mechanisms - NCBI, Gap-Filling Translesion Synthesis, Template Switching Analysis
TCP Stream Reassembly / NAT
- TCP reassembly defines sequence gaps as holes in the byte stream; implementations buffer out-of-order segments until a later packet fills the gap
- NAT maintains a mapping table keyed by flow identifiers and rewrites packets by lookup: dynamic "symbol resolution" over packet streams
Sources: TCP Reassembly - ICIR, NAT-PMP RFC 6886
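The buffering behavior described for TCP reassembly can be sketched in a few lines. This is a simplified illustration under stated assumptions: segments are `(seq, payload)` pairs, partial overlaps and sequence-number wraparound are ignored, and `reassemble` is a hypothetical name rather than any real stack's API.

```python
def reassemble(segments, start=0):
    """Yield payloads in byte order from (seq, payload) segments.

    Out-of-order segments wait in `pending` until the hole before them is filled.
    """
    next_seq = start   # first byte offset not yet delivered
    pending = {}       # seq -> payload for segments that arrived early
    for seq, payload in segments:
        if seq < next_seq:
            continue   # already delivered (simplification: drop retransmissions)
        pending[seq] = payload
        # Drain every buffered segment that is now contiguous with the stream.
        while next_seq in pending:
            chunk = pending.pop(next_seq)
            yield chunk
            next_seq += len(chunk)


# The segment at offset 5 arrives before the one at offset 2, leaving a hole.
arrivals = [(0, b"he"), (5, b"wor"), (2, b"llo"), (8, b"ld")]
data = b"".join(reassemble(arrivals))
```

The segment at offset 5 sits in the buffer until offset 2 arrives, at which point both are released in order.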
Database Joins / Query Execution
- Join algorithms: two streams of tuples, find matches via join conditions, combine into result rows
- Query optimization = ordering constraints (predicate pushdown, join order) to minimize search space—"resolve then fetch"
Sources: CMU 15-445 Join Algorithms
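The "two streams, match, combine" description maps onto a textbook in-memory hash join. The sketch below assumes rows are plain dictionaries and an equality join on a single shared column; `hash_join` and the sample tables are hypothetical.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Equi-join two row streams on `key` using a build phase and a probe phase."""
    # Build: index the left stream by join key.
    index = defaultdict(list)
    for row in left:
        index[row[key]].append(row)
    # Probe: each right row looks up its matches and is merged with them.
    for row in right:
        for match in index.get(row[key], []):
            yield {**match, **row}


users = [{"uid": 1, "name": "ada"}, {"uid": 2, "name": "bob"}]
orders = [
    {"uid": 1, "item": "book"},
    {"uid": 3, "item": "pen"},
    {"uid": 1, "item": "lamp"},
]
joined = list(hash_join(users, orders, "uid"))
```

Orders with no matching user (here `uid` 3) produce no output row, which is the inner-join behavior.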
Immune System / Pattern Recognition
- PRRs as molecular link between innate and adaptive immunity
- Receptors detect conserved patterns (PAMPs), matches trigger downstream programs
- Pattern recognition → capability = CFR's recognition → capability
Sources: Control of adaptive immunity by PRRs
Economic Price Discovery
- Markets integrate information into prices: mispricing (gap) → trades/arbitrage → equilibrium
- Explicitly framed as errors corrected over time—markets filling informational gaps
Sources: Dynamics of Price Discovery
Attention / RAG over Graphs
- Attention: each token broadcasts value, receives custom blend based on query-key compatibility—content-addressable memory
- Graph RAG: parse query → retrieve relevant subgraphs → splice into context window
Sources: Query-Key-Value Attention, RAG with Knowledge Graphs
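The "custom blend based on query-key compatibility" can be shown with standard scaled dot-product attention for a single query. This is a dependency-free sketch with toy two-dimensional vectors; `attend` and the sample keys and values are assumptions for illustration, not code from the cited sources.

```python
import math


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Returns the blended value vector and the softmax weights over the keys.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    # Numerically stable softmax over the compatibility scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Blend the values by weight, one output dimension at a time.
    dim = len(values[0])
    blended = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return blended, weights


keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
blended, weights = attend([5.0, 0.0], keys, values)
```

Unlike the hash join's exact match, every key contributes here; the query aligned with the first key simply pulls most of the blend from the first value.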
The Universal Pattern
| Domain | Stream | Gap | Match | Fill/Splice |
|---|---|---|---|---|
| DNA | template strand | lesions/fragments | complementary base | polymerase fill |
| TCP | byte sequence | lost packets | sequence number | reassembly buffer |
| NAT | packet flow | address holes | mapping table | rewrite |
| Database | tuple stream | join condition | hash/sort-merge | combine rows |
| Immune | antigen stream | unknown pattern | PRR match | response cascade |
| Markets | price stream | mispricing | arbitrage | equilibrium |
| Attention | token sequence | context gap | Q/K similarity | value blend |
The literature's own language keeps circling the same structure: stream → gap → match → splice → continue.
Provenance
- Source: Perplexity research query, 2026-01-04
- Context: Validating novelty for Wanderland paper
- Status: 🟡 Partially verified (citations checked, claims validated against sources)
North
slots:
- context:
- Prior art research supporting the paper's novelty claims
slug: wanderland-paper
- context:
- Linking invariant to supporting research
slug: streams-with-gaps-invariant
East
slots:
- context:
- Prior art research alongside original thesis articulation
slug: streams-all-the-way-down
- context:
- Prior art research alongside theoretical foundations
slug: theoretical-foundations-streams-with-gaps
- context: []
slug: streams-with-gaps-invariant
South
slots:
- context:
- Prior art research flows down to foundational claim
slug: bedrock
West
slots:
- context:
- Formalization alongside prior art research
slug: simulation-without-a-basement