Architecture
Empyr is a decentralized data coordination network that connects browser data producers with AI agents, curators, and verifiers through a transparent, tokenized system. Its architecture combines on-chain governance, off-chain data processing, and cryptographic verification to create a trusted foundation for agent-grade web data.
Core Concepts
Empyr separates responsibilities across four logical layers:
1. Capture Layer
Data originates from browser contexts, plugins, or SDKs integrated with Empyr. Producers sign payloads locally before submission, ensuring data authenticity without exposing personal identifiers. The capture layer handles the following (a signing sketch appears after the list):
Local signing and hashing of event payloads
Edge filtering, redaction, and anonymization
Secure channel transmission to Empyr’s data nodes
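A minimal sketch of the local signing and hashing step, assuming an Ed25519 keypair held by the producer's extension or SDK; the SignedEvent shape, field names, and algorithm choices are illustrative, not the published Empyr SDK interface.

```typescript
// Illustrative only: hash and sign an anonymized event payload before submission.
import { createHash, generateKeyPairSync, sign } from "crypto";

interface SignedEvent {
  payload: string;      // redacted, anonymized event data (serialized)
  contentHash: string;  // SHA-256 digest of the payload
  signature: string;    // producer signature over the digest
  publicKey: string;    // lets curators and verifiers check authenticity
}

// Producer keypair; in practice this would be held by the browser extension or SDK.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

function signEvent(event: object): SignedEvent {
  const payload = JSON.stringify(event);
  const contentHash = createHash("sha256").update(payload).digest("hex");
  // Ed25519 signs the message directly, so the digest algorithm argument is null.
  const signature = sign(null, Buffer.from(contentHash), privateKey).toString("base64");
  return {
    payload,
    contentHash,
    signature,
    publicKey: publicKey.export({ type: "spki", format: "pem" }).toString(),
  };
}

// Example: a redacted page-interaction event, ready for secure transmission.
const submission = signEvent({ type: "page_view", domain: "example.com", ts: Date.now() });
```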
2. Curation Layer
Once captured, data passes through curator nodes. Curators maintain quality and consistency using defined schemas. They perform the following (a curation sketch appears after this subsection):
Deduplication and normalization
Validation against collection schemas
Content fingerprinting and clustering
Field tagging for semantic search
Curators stake Empyr tokens to participate. Misbehavior or poor-quality output results in slashing, aligning incentives toward honest operation.
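A minimal curation sketch, assuming a collection schema that lists required fields and deduplication via a content fingerprint over a submission's top-level fields; the Submission and CollectionSchema shapes are illustrative, not canonical Empyr schemas.

```typescript
// Illustrative curation pass: validate required fields, then reject duplicates.
import { createHash } from "crypto";

interface Submission { collectionId: string; fields: Record<string, unknown>; }
interface CollectionSchema { requiredFields: string[]; }

const seenFingerprints = new Set<string>();

function fingerprint(sub: Submission): string {
  // Sort top-level keys so semantically identical payloads hash identically.
  const canonical = JSON.stringify(sub.fields, Object.keys(sub.fields).sort());
  return createHash("sha256").update(canonical).digest("hex");
}

function curate(sub: Submission, schema: CollectionSchema): { accepted: boolean; reason?: string } {
  // Schema validation: every required field must be present.
  const missing = schema.requiredFields.filter((f) => !(f in sub.fields));
  if (missing.length > 0) return { accepted: false, reason: `missing fields: ${missing.join(", ")}` };

  // Deduplication by content fingerprint.
  const fp = fingerprint(sub);
  if (seenFingerprints.has(fp)) return { accepted: false, reason: "duplicate" };

  seenFingerprints.add(fp);
  return { accepted: true };
}
```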
3. Verification Layer
Independent verifiers audit data quality and challenge suspicious submissions. Using zero-knowledge proofs and random sampling, they confirm that data originates from compliant sources without seeing private details. Functions include the following (a sampling sketch appears after the list):
Randomized proof-of-origin checks
Schema compliance audits
Anomaly detection and reputation scoring
Dispute resolution and stake slashing
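A hedged sketch of the random-sampling portion of an audit (the zero-knowledge machinery is omitted): the verifier recomputes hashes and checks producer signatures for a sampled fraction of a curated batch. SAMPLE_RATE and the record shape are assumptions for illustration.

```typescript
// Illustrative randomized proof-of-origin check over a curated batch.
import { createHash, verify } from "crypto";

interface CuratedRecord {
  payload: string;
  contentHash: string;
  signature: string;   // base64 Ed25519 signature over the content hash
  publicKey: string;   // PEM-encoded producer key
}

const SAMPLE_RATE = 0.05; // assumed: audit roughly 5% of each batch

function verifyRecord(rec: CuratedRecord): boolean {
  // Re-derive the hash, then check the producer's signature over it.
  const recomputed = createHash("sha256").update(rec.payload).digest("hex");
  if (recomputed !== rec.contentHash) return false;
  return verify(null, Buffer.from(rec.contentHash), rec.publicKey, Buffer.from(rec.signature, "base64"));
}

function auditBatch(batch: CuratedRecord[]): { sampled: number; failures: number } {
  const sampled = batch.filter(() => Math.random() < SAMPLE_RATE);
  const failures = sampled.filter((rec) => !verifyRecord(rec)).length;
  // A non-zero failure count would open a dispute against the curator's stake.
  return { sampled: sampled.length, failures };
}
```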
4. Access Layer
This is the interface for buyers and developers. Indexed, curated data is stored in a content-addressable store and made accessible via APIs, snapshots, and streams. The access layer handles the following (a metering sketch appears after the list):
Token-to-credit conversion for usage
Role-based access control
Query metering and usage logging
Streaming, snapshot, and pay-per-query endpoints
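A simplified sketch of token-to-credit conversion and per-query metering; the conversion rate, query cost, and role names are assumed values for illustration, not protocol parameters.

```typescript
// Illustrative credit accounting for the access layer.
interface Account { role: "buyer" | "producer" | "curator"; credits: number; }

const CREDITS_PER_TOKEN = 100; // assumed conversion rate
const COST_PER_QUERY = 2;      // assumed per-query credit cost

function buyCredits(account: Account, tokens: number): void {
  account.credits += tokens * CREDITS_PER_TOKEN;
}

function meteredQuery(account: Account, query: string): string {
  if (account.role !== "buyer") throw new Error("role not permitted to query"); // role-based access control
  if (account.credits < COST_PER_QUERY) throw new Error("insufficient credits");
  account.credits -= COST_PER_QUERY; // deducted credits are logged for usage accounting
  return `results for: ${query}`;    // placeholder for the real data response
}

const buyer: Account = { role: "buyer", credits: 0 };
buyCredits(buyer, 5);              // 5 tokens -> 500 credits
meteredQuery(buyer, "collection:product_pages schema:v2");
```

In practice the credit ledger and role checks sit behind the API gateway rather than in client code; the sketch only shows the accounting logic.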
Network Roles
Empyr’s ecosystem depends on specialized participants with aligned incentives:

| Role | Responsibility | Reward / Benefit | Stake Requirement |
| --- | --- | --- | --- |
| Producers | Submit signed, anonymized event data | Earn tokens from the reward pool | Optional |
| Curators | Clean, format, and enrich data | Share of curation rewards | Required |
| Verifiers | Audit and challenge data | Audit fees and slashing rewards | Required |
| Buyers | Access data using credits | Access to curated datasets | None |
| Governance Members | Vote on protocol parameters | Influence network direction | Token holder |
Data Lifecycle
Step 1: Ingestion
Browser agents or partners submit structured data fragments through the producer SDK. Each submission is hashed, signed, and stored in a staging pool with provenance metadata.
Step 2: Curation
Curator nodes clean and normalize submissions. They apply schema validation, tag semantic relationships, and create deduplicated objects for the canonical data store.
Step 3: Verification
Verifier nodes randomly audit curated batches. Discrepancies trigger dispute resolution through a lightweight arbitration process. Successful audits reinforce the curator’s reputation and release staking rewards.
Step 4: Indexing and Storage
Validated data is written to a content-addressable store (CAS) and referenced on-chain through hash pointers. Each reference links to the collection ID, schema version, license type, and provenance record.
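As a rough illustration, such a reference might look like the TypeScript shape below; the field names and placeholder values are assumptions, not the on-chain record format.

```typescript
// Illustrative on-chain reference: a hash pointer into the CAS plus the
// metadata listed above. Field names and values are placeholders.
interface DataReference {
  contentHash: string;      // hash pointer to the object in the content-addressable store
  collectionId: string;     // collection the object belongs to
  schemaVersion: string;    // schema version it was validated against
  licenseType: string;      // license attached to the data
  provenanceRecord: string; // hash of the signed provenance metadata
}

const ref: DataReference = {
  contentHash: "bafy...",       // placeholder CID-style pointer
  collectionId: "example-collection",
  schemaVersion: "1.0.0",
  licenseType: "commercial",
  provenanceRecord: "0xabc...", // placeholder hash
};
```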
Step 5: Access and Monetization
Buyers convert tokens into credits to access data through Empyr’s APIs or through periodic snapshots. Burning spent credits permanently removes the equivalent amount of tokens from circulation, tightening supply and supporting value.
Technical Stack
Empyr’s architecture combines open and modular technologies:
On-chain Layer: smart contracts for token, staking, and governance logic
Data Layer: IPFS-based content-addressable storage with proofs for integrity
Indexing Layer: GraphQL and vector indices for semantic queries and embeddings
Verification Layer: Zero-knowledge proof system for privacy-preserving audits
API Layer: REST and WebSocket endpoints with metered access and credit authentication (a request sketch appears after this section)
This modularity allows Empyr to operate in both public and enterprise environments while keeping the trust model consistent.
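To illustrate what metered, credit-authenticated access at the API layer could look like, a hypothetical request sketch follows; the host, path, header, and status-code behavior are invented for this example and are not documented Empyr endpoints.

```typescript
// Hypothetical usage sketch of a metered, credit-authenticated REST endpoint.
const EMPYR_API = "https://api.example-empyr-gateway.io"; // placeholder host, not a real endpoint

async function queryCollection(apiKey: string, collectionId: string): Promise<unknown> {
  const res = await fetch(`${EMPYR_API}/v1/collections/${collectionId}/query`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`, // key tied to the buyer's credit balance
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ filter: { schemaVersion: "1.0.0" }, limit: 100 }),
  });
  if (res.status === 402) throw new Error("insufficient credits"); // metered access refusal (assumed behavior)
  return res.json();
}
```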
Token Flow in Architecture
Buyers acquire tokens to purchase credits.
Credits are used to access data, then burned.
A portion of each burn funds the reward pool.
Producers, curators, and verifiers claim their share from the pool based on contribution and reputation.
Treasury collects a protocol fee to fund audits, grants, and new schema development.
This creates a continuous circulation that links usage demand with token scarcity and contributor incentives; the sketch below walks through one such cycle.
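A toy walkthrough of one cycle, assuming illustrative percentages for the reward-pool share and treasury fee; the actual splits would be protocol parameters, not the numbers shown here.

```typescript
// Toy token-flow cycle with assumed split percentages (illustrative only).
const REWARD_POOL_SHARE = 0.70; // portion of each burn routed to the reward pool (assumed)
const TREASURY_FEE = 0.05;      // protocol fee funding audits, grants, schema work (assumed)

interface Contributor { id: string; contribution: number; reputation: number; }

function settleCycle(tokensBurned: number, contributors: Contributor[]) {
  const rewardPool = tokensBurned * REWARD_POOL_SHARE;
  const treasury = tokensBurned * TREASURY_FEE;
  const removedFromSupply = tokensBurned - rewardPool - treasury; // remainder stays burned

  // Weight each contributor's claim by contribution scaled by reputation.
  const weights = contributors.map((c) => c.contribution * c.reputation);
  const totalWeight = weights.reduce((a, b) => a + b, 0);
  const payouts = contributors.map((c, i) => ({
    id: c.id,
    payout: totalWeight > 0 ? (rewardPool * weights[i]) / totalWeight : 0,
  }));

  return { rewardPool, treasury, removedFromSupply, payouts };
}

// Example: 1,000 tokens' worth of credits burned in a cycle.
settleCycle(1000, [
  { id: "producer-1", contribution: 40, reputation: 1.0 },
  { id: "curator-1", contribution: 35, reputation: 1.2 },
  { id: "verifier-1", contribution: 25, reputation: 0.9 },
]);
```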
Security and Reliability
All data objects are hashed and signed before network entry.
Transport channels use end-to-end encryption.
Smart contracts undergo external audits.
Node performance and uptime affect staking rewards.
Failover and replication prevent single-point data loss.