Architecture
Empyr is a decentralized data coordination network that connects browser data producers with AI agents, curators, and verifiers through a transparent, tokenized system. Its architecture combines on-chain governance, off-chain data processing, and cryptographic verification to create a trusted foundation for agent-grade web data.
Core Concepts
Empyr separates responsibilities across four logical layers:
1. Capture Layer
Data originates from browser contexts, plugins, or SDKs integrated with Empyr. Producers sign payloads locally before submission, ensuring data authenticity without exposing personal identifiers. The capture layer handles the following (a signing sketch appears after the list):
Local signing and hashing of event payloads
Edge filtering, redaction, and anonymization
Secure channel transmission to Empyr’s data nodes
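A minimal sketch of the local signing and hashing step, assuming an Ed25519 keypair held by the producer's extension or SDK; the SignedEvent shape, field names, and algorithm choices are illustrative, not the published Empyr SDK interface.

```typescript
// Illustrative only: hash and sign an anonymized event payload before submission.
import { createHash, generateKeyPairSync, sign } from "crypto";

interface SignedEvent {
  payload: string;      // redacted, anonymized event data (serialized)
  contentHash: string;  // SHA-256 digest of the payload
  signature: string;    // producer signature over the digest
  publicKey: string;    // lets curators and verifiers check authenticity
}

// Producer keypair; in practice this would be held by the browser extension or SDK.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

function signEvent(event: object): SignedEvent {
  const payload = JSON.stringify(event);
  const contentHash = createHash("sha256").update(payload).digest("hex");
  // Ed25519 signs the message directly, so the digest algorithm argument is null.
  const signature = sign(null, Buffer.from(contentHash), privateKey).toString("base64");
  return {
    payload,
    contentHash,
    signature,
    publicKey: publicKey.export({ type: "spki", format: "pem" }).toString(),
  };
}

// Example: a redacted page-interaction event, ready for secure transmission.
const submission = signEvent({ type: "page_view", domain: "example.com", ts: Date.now() });
```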
2. Curation Layer
Once captured, data passes through curator nodes. Curators maintain quality and consistency using defined schemas. They perform the following (a curation sketch appears after this subsection):
Deduplication and normalization
Validation against collection schemas
Content fingerprinting and clustering
Field tagging for semantic search
Curators stake Empyr tokens to participate. Misbehavior or poor-quality output results in slashing, aligning incentives toward honest operation.
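A minimal curation sketch, assuming a collection schema that lists required fields and deduplication via a content fingerprint over a submission's top-level fields; the Submission and CollectionSchema shapes are illustrative, not canonical Empyr schemas.

```typescript
// Illustrative curation pass: validate required fields, then reject duplicates.
import { createHash } from "crypto";

interface Submission { collectionId: string; fields: Record<string, unknown>; }
interface CollectionSchema { requiredFields: string[]; }

const seenFingerprints = new Set<string>();

function fingerprint(sub: Submission): string {
  // Sort top-level keys so semantically identical payloads hash identically.
  const canonical = JSON.stringify(sub.fields, Object.keys(sub.fields).sort());
  return createHash("sha256").update(canonical).digest("hex");
}

function curate(sub: Submission, schema: CollectionSchema): { accepted: boolean; reason?: string } {
  // Schema validation: every required field must be present.
  const missing = schema.requiredFields.filter((f) => !(f in sub.fields));
  if (missing.length > 0) return { accepted: false, reason: `missing fields: ${missing.join(", ")}` };

  // Deduplication by content fingerprint.
  const fp = fingerprint(sub);
  if (seenFingerprints.has(fp)) return { accepted: false, reason: "duplicate" };

  seenFingerprints.add(fp);
  return { accepted: true };
}
```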
3. Verification Layer
Independent verifiers audit data quality and challenge suspicious submissions. Using zero-knowledge proofs and random sampling, they confirm that data originates from compliant sources without seeing private details. Functions include the following (a sampling sketch appears after the list):
Randomized proof-of-origin checks
Schema compliance audits
Anomaly detection and reputation scoring
Dispute resolution and stake slashing
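A hedged sketch of the random-sampling portion of an audit (the zero-knowledge machinery is omitted): the verifier recomputes hashes and checks producer signatures for a sampled fraction of a curated batch. SAMPLE_RATE and the record shape are assumptions for illustration.

```typescript
// Illustrative randomized proof-of-origin check over a curated batch.
import { createHash, verify } from "crypto";

interface CuratedRecord {
  payload: string;
  contentHash: string;
  signature: string;   // base64 Ed25519 signature over the content hash
  publicKey: string;   // PEM-encoded producer key
}

const SAMPLE_RATE = 0.05; // assumed: audit roughly 5% of each batch

function verifyRecord(rec: CuratedRecord): boolean {
  // Re-derive the hash, then check the producer's signature over it.
  const recomputed = createHash("sha256").update(rec.payload).digest("hex");
  if (recomputed !== rec.contentHash) return false;
  return verify(null, Buffer.from(rec.contentHash), rec.publicKey, Buffer.from(rec.signature, "base64"));
}

function auditBatch(batch: CuratedRecord[]): { sampled: number; failures: number } {
  const sampled = batch.filter(() => Math.random() < SAMPLE_RATE);
  const failures = sampled.filter((rec) => !verifyRecord(rec)).length;
  // A non-zero failure count would open a dispute against the curator's stake.
  return { sampled: sampled.length, failures };
}
```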
4. Access Layer
This is the interface for buyers and developers. Indexed, curated data is stored in a content-addressable store and made accessible via APIs, snapshots, and streams. The access layer handles the following (a metering sketch appears after the list):
Token-to-credit conversion for usage
Role-based access control
Query metering and usage logging
Streaming, snapshot, and pay-per-query endpoints
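A simplified sketch of token-to-credit conversion and per-query metering; the conversion rate, query cost, and role names are assumed values for illustration, not protocol parameters.

```typescript
// Illustrative credit accounting for the access layer.
interface Account { role: "buyer" | "producer" | "curator"; credits: number; }

const CREDITS_PER_TOKEN = 100; // assumed conversion rate
const COST_PER_QUERY = 2;      // assumed per-query credit cost

function buyCredits(account: Account, tokens: number): void {
  account.credits += tokens * CREDITS_PER_TOKEN;
}

function meteredQuery(account: Account, query: string): string {
  if (account.role !== "buyer") throw new Error("role not permitted to query"); // role-based access control
  if (account.credits < COST_PER_QUERY) throw new Error("insufficient credits");
  account.credits -= COST_PER_QUERY; // deducted credits are logged for usage accounting
  return `results for: ${query}`;    // placeholder for the real data response
}

const buyer: Account = { role: "buyer", credits: 0 };
buyCredits(buyer, 5);              // 5 tokens -> 500 credits
meteredQuery(buyer, "collection:product_pages schema:v2");
```

In practice the credit ledger and role checks sit behind the API gateway rather than in client code; the sketch only shows the accounting logic.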
Network Roles
Empyr’s ecosystem depends on specialized participants with aligned incentives:

| Role | Responsibility | Reward / Benefit | Stake Requirement |
| --- | --- | --- | --- |
| Producers | Submit signed, anonymized event data | Earn tokens from the reward pool | Optional |
| Curators | Clean, format, and enrich data | Share of curation rewards | Required |
| Verifiers | Audit and challenge data | Audit fees and slashing rewards | Required |
| Buyers | Access data using credits | Access to curated datasets | None |
| Governance Members | Vote on protocol parameters | Influence network direction | Token holder |
Data Lifecycle
Step 1: Ingestion
Browser agents or partners submit structured data fragments through the producer SDK. Each submission is hashed, signed, and stored in a staging pool with provenance metadata.
Step 2: Curation
Curator nodes clean and normalize submissions. They apply schema validation, tag semantic relationships, and create deduplicated objects for the canonical data store.
Step 3: Verification
Verifier nodes randomly audit curated batches. Discrepancies trigger dispute resolution through a lightweight arbitration process. Successful audits reinforce the curator’s reputation and release staking rewards.
Step 4: Indexing and Storage
Validated data is written to a content-addressable store (CAS) and referenced on-chain through hash pointers. Each reference links to the collection ID, schema version, license type, and provenance record.
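As a rough illustration, such a reference might look like the TypeScript shape below; the field names and placeholder values are assumptions, not the on-chain record format.

```typescript
// Illustrative on-chain reference: a hash pointer into the CAS plus the
// metadata listed above. Field names and values are placeholders.
interface DataReference {
  contentHash: string;      // hash pointer to the object in the content-addressable store
  collectionId: string;     // collection the object belongs to
  schemaVersion: string;    // schema version it was validated against
  licenseType: string;      // license attached to the data
  provenanceRecord: string; // hash of the signed provenance metadata
}

const ref: DataReference = {
  contentHash: "bafy...",       // placeholder CID-style pointer
  collectionId: "example-collection",
  schemaVersion: "1.0.0",
  licenseType: "commercial",
  provenanceRecord: "0xabc...", // placeholder hash
};
```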
Step 5: Access and Monetization
Buyers convert tokens into credits to access data through Empyr’s APIs or through periodic snapshots. Burning spent credits permanently removes the equivalent amount of tokens from circulation, tightening supply and supporting value.
Technical Stack
Empyr’s architecture combines open and modular technologies:
On-chain Layer: smart contracts for token, staking, and governance logic
Data Layer: IPFS-based content-addressable storage with proofs for integrity
Indexing Layer: GraphQL and vector indices for semantic queries and embeddings
Verification Layer: Zero-knowledge proof system for privacy-preserving audits
API Layer: REST and WebSocket endpoints with metered access and credit authentication (a request sketch appears after this section)
This modularity allows Empyr to operate in both public and enterprise environments while keeping the trust model consistent.
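To illustrate what metered, credit-authenticated access at the API layer could look like, a hypothetical request sketch follows; the host, path, header, and status-code behavior are invented for this example and are not documented Empyr endpoints.

```typescript
// Hypothetical usage sketch of a metered, credit-authenticated REST endpoint.
const EMPYR_API = "https://api.example-empyr-gateway.io"; // placeholder host, not a real endpoint

async function queryCollection(apiKey: string, collectionId: string): Promise<unknown> {
  const res = await fetch(`${EMPYR_API}/v1/collections/${collectionId}/query`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`, // key tied to the buyer's credit balance
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ filter: { schemaVersion: "1.0.0" }, limit: 100 }),
  });
  if (res.status === 402) throw new Error("insufficient credits"); // metered access refusal (assumed behavior)
  return res.json();
}
```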
Token Flow in Architecture
Buyers acquire tokens to purchase credits.
Credits are used to access data, then burned.
A portion of each burn funds the reward pool.
Producers, curators, and verifiers claim their share from the pool based on contribution and reputation.
Treasury collects a protocol fee to fund audits, grants, and new schema development.
This creates a continuous circulation that links usage demand with token scarcity and contributor incentives; the sketch below walks through one such cycle.
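A toy walkthrough of one cycle, assuming illustrative percentages for the reward-pool share and treasury fee; the actual splits would be protocol parameters, not the numbers shown here.

```typescript
// Toy token-flow cycle with assumed split percentages (illustrative only).
const REWARD_POOL_SHARE = 0.70; // portion of each burn routed to the reward pool (assumed)
const TREASURY_FEE = 0.05;      // protocol fee funding audits, grants, schema work (assumed)

interface Contributor { id: string; contribution: number; reputation: number; }

function settleCycle(tokensBurned: number, contributors: Contributor[]) {
  const rewardPool = tokensBurned * REWARD_POOL_SHARE;
  const treasury = tokensBurned * TREASURY_FEE;
  const removedFromSupply = tokensBurned - rewardPool - treasury; // remainder stays burned

  // Weight each contributor's claim by contribution scaled by reputation.
  const weights = contributors.map((c) => c.contribution * c.reputation);
  const totalWeight = weights.reduce((a, b) => a + b, 0);
  const payouts = contributors.map((c, i) => ({
    id: c.id,
    payout: totalWeight > 0 ? (rewardPool * weights[i]) / totalWeight : 0,
  }));

  return { rewardPool, treasury, removedFromSupply, payouts };
}

// Example: 1,000 tokens' worth of credits burned in a cycle.
settleCycle(1000, [
  { id: "producer-1", contribution: 40, reputation: 1.0 },
  { id: "curator-1", contribution: 35, reputation: 1.2 },
  { id: "verifier-1", contribution: 25, reputation: 0.9 },
]);
```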
Security and Reliability
All data objects are hashed and signed before network entry.
Transport channels use end-to-end encryption.
Smart contracts undergo external audits.
Node performance and uptime affect staking rewards.
Failover and replication prevent single-point data loss.