Dataset Provider Verification Whitepaper
Establishing trust and transparency in the data marketplace through robust identity verification
Abstract
This whitepaper outlines the Dataset Provider Verification process for the OutDated Platform. The process ensures that data providers (individuals, organizations, or research institutes) are authenticated and assigned accurate metadata tags.
By leveraging Decentralized Identifiers (DIDs), email and domain verification, and public credential checks, we establish trust and transparency in the data marketplace.
1. Introduction
Trustworthy data provisioning demands rigorous identity verification. The verification process helps prevent fraudulent or low-quality contributions and lets consumers filter datasets by provider attributes (e.g., research affiliation, domain expertise). It does four things:
Validates
Provider identity via DIDs and verifiable credentials.
Verifies
Control of organizational emails and public URLs.
Assigns
Standardized metadata tags based on role, sector, or specialization.
Logs
Every verification step for auditability.
2. Objectives
Authentication
Confirm provider ownership of claimed identity or organization.
Verification
Programmatically validate contact emails and public web presence.
Metadata Enrichment
Tag providers with roles (e.g., Researcher) and domains (e.g., Medical).
Transparency
Record verification results in immutable logs.
3. Verification Workflow
3.1 Decentralized Identifier (DID) Check
DID Registration
Providers register a DID (e.g., did:sol:…) that links their on-chain address to a DID document.
Credential Presentation
Providers present verifiable credentials (VCs) signed by recognized issuers (universities, employers).
Cryptographic Proof
Validate VC signatures and check revocation status via a DID resolver.
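A minimal Python sketch of how the three DID checks above might be ordered. The Credential fields, the trusted-issuer set, the revocation set, and the verify_signature callback are all hypothetical stand-ins, not part of the platform's API. A real deployment would resolve DID documents and verify signatures with a DID/VC library rather than the injected callback used here, and the regular expression is only a simplified approximation of DID syntax.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Set, Tuple

# Simplified DID shape: did:<method>:<method-specific-id>.
# Assumption: this is looser than the full W3C grammar (no percent-encoding).
DID_PATTERN = re.compile(r"^did:([a-z0-9]+):([A-Za-z0-9._:-]+)$")


def parse_did(did: str) -> Optional[Tuple[str, str]]:
    """Return (method, method_specific_id), or None if the DID is malformed."""
    match = DID_PATTERN.match(did)
    return (match.group(1), match.group(2)) if match else None


@dataclass(frozen=True)
class Credential:
    """Hypothetical, stripped-down view of a verifiable credential."""
    id: str
    issuer_did: str
    subject_did: str
    vc_type: str


def check_credential(
    cred: Credential,
    provider_did: str,
    trusted_issuers: Set[str],
    revoked_ids: Set[str],
    verify_signature: Callable[[Credential], bool],
) -> Tuple[bool, str]:
    """Run the checks in order and report the first failure."""
    if parse_did(provider_did) is None:
        return False, "malformed provider DID"
    if cred.subject_did != provider_did:
        return False, "credential subject does not match provider DID"
    if cred.issuer_did not in trusted_issuers:
        return False, "issuer not recognized"
    if not verify_signature(cred):  # placeholder for real cryptographic proof
        return False, "invalid signature"
    if cred.id in revoked_ids:  # placeholder for a resolver-backed status check
        return False, "credential revoked"
    return True, "ok"
```

Returning a reason string alongside the pass/fail flag keeps each outcome easy to record in the audit log.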
3.2 Email & Domain Verification
Email Challenge
Send a one-time token to the provider's email address, which must be on a corporate or organizational domain.
Domain Record Check
Verify DMARC/DKIM/SPF records to prevent spoofing.
Public URL Proof
Require the provider to publish their DID and the token, either in a JSON snippet hosted at a well-known URL or in a DNS TXT record.
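The email challenge and the public proof check could look roughly like the Python sketch below. It is illustrative only: the ChallengeStore class, the 15-minute lifetime, and the "did" and "token" field names in the proof document are assumptions, and the fetching of the proof document and the DMARC/DKIM/SPF lookups are left out.

```python
import hmac
import json
import secrets
import time
from typing import Dict, Optional, Tuple

TOKEN_TTL_SECONDS = 15 * 60  # assumed lifetime; the whitepaper does not specify one


class ChallengeStore:
    """In-memory store of pending one-time tokens, keyed by email address."""

    def __init__(self) -> None:
        self._pending: Dict[str, Tuple[str, float]] = {}

    def issue(self, email: str, now: Optional[float] = None) -> str:
        """Create a fresh random token for this email and remember its expiry."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._pending[email] = (token, now + TOKEN_TTL_SECONDS)
        return token

    def redeem(self, email: str, token: str, now: Optional[float] = None) -> bool:
        """Consume the pending token; any attempt, right or wrong, uses it up."""
        now = time.time() if now is None else now
        entry = self._pending.pop(email, None)
        if entry is None:
            return False
        expected, expires_at = entry
        # Constant-time comparison avoids leaking how many characters matched.
        return now <= expires_at and hmac.compare_digest(
            expected.encode("utf-8"), token.encode("utf-8")
        )


def proof_matches(document_text: str, did: str, token: str) -> bool:
    """Check that a fetched JSON proof document carries the expected DID and token."""
    try:
        doc = json.loads(document_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(doc, dict):
        return False
    doc_token = doc.get("token")
    return (
        doc.get("did") == did
        and isinstance(doc_token, str)
        and hmac.compare_digest(doc_token.encode("utf-8"), token.encode("utf-8"))
    )
```

Invalidating the token on the first redemption attempt is one possible design choice; it trades a little convenience for resistance to guessing.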
3.3 Metadata Tag Assignment
Preliminary Tags
- Researcher
- Medical Employee
- University Affiliated
- Government Data Source
- Private Sector
Specialty Tags
- Visually Impaired Research Institute
- Climate Science Lab
- Financial Analytics Firm
Criteria for Tags
- Researcher: holds a VC from an academic institution.
- Medical Employee: email domain is in a verified list of healthcare organizations, and holds a VC from a medical board.
- University Affiliated: VC or URL proof from a .edu domain.
- Visually Impaired Research Institute: VC issued by a relevant accreditation body, and the website demonstrates a focus on assistive technology.
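The four criteria above can be read as simple rules over the evidence collected in earlier steps. The Python sketch below is one possible encoding, under stated assumptions: the ProviderEvidence fields and the issuer-kind labels ("academic", "medical_board", "accreditation_body") are hypothetical, and the website-focus criterion is modeled as a flag set by some separate review step that the whitepaper does not describe.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ProviderEvidence:
    """Hypothetical summary of what earlier verification stages established."""
    vc_issuer_kinds: Set[str] = field(default_factory=set)    # e.g. {"academic"}
    vc_issuer_domains: Set[str] = field(default_factory=set)  # domains of VC issuers
    email_domain: str = ""                                    # verified email domain
    proof_url_domain: str = ""                                # domain hosting the URL proof
    assistive_tech_focus_confirmed: bool = False              # assumed review outcome


def assign_tags(ev: ProviderEvidence, healthcare_domains: Set[str]) -> List[str]:
    """Apply the tag criteria in the order they are listed."""
    tags: List[str] = []

    if "academic" in ev.vc_issuer_kinds:
        tags.append("Researcher")

    if (ev.email_domain.lower() in healthcare_domains
            and "medical_board" in ev.vc_issuer_kinds):
        tags.append("Medical Employee")

    # Either a VC issuer or the URL proof on a .edu domain is enough.
    candidate_domains = set(ev.vc_issuer_domains) | {ev.proof_url_domain}
    if any(d.lower().endswith(".edu") for d in candidate_domains):
        tags.append("University Affiliated")

    if ("accreditation_body" in ev.vc_issuer_kinds
            and ev.assistive_tech_focus_confirmed):
        tags.append("Visually Impaired Research Institute")

    return tags
```

For example, a provider holding an academic VC issued from a .edu domain would receive both Researcher and University Affiliated under these rules.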
3.4 Audit & Logging
All verification events are recorded:
provider_id: string
did: string
email_verified: boolean
domain_verified: boolean
vc_types: [string]
tags_assigned: [string]
timestamp: datetime
logs:
  - stage: DID | Email | Domain | Tagging
    result: pass | fail
    details: string
Logs are appended to a tamper-evident ledger, accessible via the Admin API.
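One common way to make an append-only log tamper-evident is a hash chain, in which each entry's hash covers the previous entry's hash. The Python sketch below illustrates that idea only; it is not a description of the platform's actual ledger, and the AuditLedger class and its method names are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS_HASH = "0" * 64  # assumed starting value for the chain


def _entry_hash(prev_hash: str, record: Dict[str, Any]) -> str:
    """Hash the previous hash together with a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


class AuditLedger:
    """Append-only list of records, each chained to its predecessor by hash."""

    def __init__(self) -> None:
        self._entries: List[Dict[str, Any]] = []

    def append(self, record: Dict[str, Any]) -> str:
        """Add a record and return the hash that now heads the chain."""
        prev = self._entries[-1]["hash"] if self._entries else GENESIS_HASH
        entry_hash = _entry_hash(prev, record)
        self._entries.append({"prev_hash": prev, "record": record, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev = GENESIS_HASH
        for entry in self._entries:
            if entry["prev_hash"] != prev:
                return False
            if entry["hash"] != _entry_hash(prev, entry["record"]):
                return False
            prev = entry["hash"]
        return True
```

A chain like this detects modification but does not prevent it, so the latest hash would still need to be anchored somewhere an attacker cannot rewrite.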
4. Security & Privacy Considerations
Data Minimization
Only store hashed identifiers and verification flags.
Encryption
Encrypt logs at rest using KMS keys.
Consent
Providers explicitly consent to verification and publishing of selected metadata.
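As an illustration of the data-minimization point, identifiers can be stored as keyed hashes rather than in the clear. The Python sketch below uses HMAC-SHA256 with a secret key; the function names and the choice of a keyed hash are assumptions, since the whitepaper does not say which hashing scheme is used, and in practice the key would come from the KMS rather than being passed around directly.

```python
import hashlib
import hmac


def hash_identifier(value: str, key: bytes) -> str:
    """Return a keyed SHA-256 digest of an identifier (hex encoded)."""
    return hmac.new(key, value.strip().encode("utf-8"), hashlib.sha256).hexdigest()


def hash_email(email: str, key: bytes) -> str:
    """Lowercase the address first so the same mailbox always hashes the same way."""
    return hash_identifier(email.lower(), key)
```

A keyed hash is used here instead of a plain digest because emails and DIDs are guessable, and an unkeyed hash of a guessable value can be reversed by enumeration.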
5. Conclusion
By combining DIDs, verifiable credentials, and programmatic email/domain proof, the Dataset Provider Verification process builds a robust trust foundation. Metadata tags enable consumers to filter and choose data sources with confidence, improving overall marketplace quality and reliability.
Version 1.0