Dataset Provider Verification Whitepaper

Establishing trust and transparency in the data marketplace through robust identity verification

Abstract

This whitepaper outlines the Dataset Provider Verification process for the OutDated Platform, which ensures that data providers (individuals, organizations, or research institutes) are authenticated and assigned accurate metadata tags.

By leveraging Decentralized Identifiers (DIDs), email and domain verification, and public credential checks, we establish trust and transparency in the data marketplace.

1. Introduction

Trustworthy data provisioning demands rigorous identity verification. This process prevents fraudulent or low-quality contributions and enables consumers to filter datasets by provider attributes (e.g., research affiliation, domain expertise).

The process:

  • Validates provider identity via DIDs and verifiable credentials.
  • Verifies control of organizational emails and public URLs.
  • Assigns standardized metadata tags based on role, sector, or specialization.
  • Logs every verification step for auditability.

2. Objectives

  • Authentication: Confirm provider ownership of claimed identity or organization.
  • Verification: Programmatically validate contact emails and public web presence.
  • Metadata Enrichment: Tag providers with roles (e.g., Researcher) and domains (e.g., Medical).
  • Transparency: Record verification results in immutable logs.

3. Verification Workflow

3.1 Decentralized Identifier (DID) Check

  1. DID Registration: Providers register a DID (e.g., did:sol:…) that links their on-chain address to a DID document.
  2. Credential Presentation: Providers present verifiable credentials (VCs) signed by recognized issuers (e.g., universities, employers).
  3. Cryptographic Proof: The platform validates VC signatures and checks revocation status via a DID resolver.
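The three steps above can be sketched as a single check. This is a minimal illustration, not the platform's implementation: the `Credential` class, the `check_credential` function, and the injected `verify_signature` and `is_revoked` callables are all hypothetical names, and the example DIDs are made up. Signature verification and revocation lookup are passed in as callables because they depend on the DID method and resolver actually in use.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Generic DID shape from the W3C DID syntax: did:<method>:<method-specific-id>.
# Simplified: percent-encoded sequences are not validated strictly.
DID_PATTERN = re.compile(r"^did:[a-z0-9]+:[A-Za-z0-9._%:-]+$")


@dataclass(frozen=True)
class Credential:
    """Hypothetical, simplified view of a verifiable credential."""
    id: str
    issuer_did: str
    subject_did: str
    vc_type: str


def is_valid_did(did: str) -> bool:
    """Return True if the string has the generic DID shape."""
    return bool(DID_PATTERN.match(did)) and not did.endswith(":")


def check_credential(
    provider_did: str,
    credential: Credential,
    trusted_issuers: set[str],
    verify_signature: Callable[[Credential], bool],
    is_revoked: Callable[[str], bool],
) -> tuple[bool, str]:
    """Run the DID stage checks in order and report the first failure."""
    if not is_valid_did(provider_did):
        return False, "malformed DID"
    if credential.subject_did != provider_did:
        return False, "credential subject does not match provider DID"
    if credential.issuer_did not in trusted_issuers:
        return False, "issuer not recognized"
    if not verify_signature(credential):
        return False, "invalid signature"
    if is_revoked(credential.id):
        return False, "credential revoked"
    return True, "ok"
```

Returning a reason string alongside the boolean keeps the result usable as the `details` field of an audit log entry.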

3.2 Email & Domain Verification

  1. Email Challenge: Send a one-time token to the provider's email address, which must be on a corporate or organizational domain.
  2. Domain Record Check: Verify the domain's SPF, DKIM, and DMARC records to guard against spoofing.
  3. Public URL Proof: Require the provider to publish their DID and the token, either as a JSON snippet hosted at a well-known URL or as a DNS TXT record on the domain.
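The email challenge and the JSON form of the public proof could look roughly like the sketch below. Several details are assumptions rather than part of the specification above: the 15-minute token lifetime, storing only a SHA-256 hash of the token, and the JSON field names `did` and `token`. Fetching the proof document and the DNS record checks are out of scope here; `proof_document_matches` works on a body that has already been retrieved.

```python
import hashlib
import hmac
import json
import secrets
import time
from typing import Optional

TOKEN_TTL_SECONDS = 15 * 60  # assumed lifetime, not specified in the source


def issue_token() -> tuple[str, str, float]:
    """Return (token, token_hash, expires_at); only the hash needs storing."""
    token = secrets.token_urlsafe(32)
    token_hash = hashlib.sha256(token.encode()).hexdigest()
    return token, token_hash, time.time() + TOKEN_TTL_SECONDS


def token_matches(submitted: str, stored_hash: str, expires_at: float,
                  now: Optional[float] = None) -> bool:
    """Check a submitted token against the stored hash and expiry time."""
    now = time.time() if now is None else now
    if now > expires_at:
        return False
    submitted_hash = hashlib.sha256(submitted.encode()).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(submitted_hash, stored_hash)


def proof_document_matches(body: str, expected_did: str,
                           expected_token: str) -> bool:
    """Check an already-fetched JSON proof body for the DID and token."""
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(doc, dict):
        return False
    did, token = doc.get("did"), doc.get("token")
    if not isinstance(did, str) or not isinstance(token, str):
        return False
    return did == expected_did and hmac.compare_digest(
        token.encode(), expected_token.encode()
    )
```

Binding the same token to both the email challenge and the published proof ties control of the mailbox and control of the domain to a single verification attempt.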

3.3 Metadata Tag Assignment

Preliminary Tags

  • Researcher
  • Medical Employee
  • University Affiliated
  • Government Data Source
  • Private Sector

Specialty Tags

  • Visually Impaired Research Institute
  • Climate Science Lab
  • Financial Analytics Firm

Criteria for Tags

  • Researcher: holds a VC from an academic institution.
  • Medical Employee: email domain appears on a verified list of healthcare organizations, and the provider holds a VC from a medical board.
  • University Affiliated: VC or URL proof from a .edu domain.
  • Visually Impaired Research Institute: VC issued by a relevant accreditation body, and the website demonstrates a focus on assistive technology.
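The first three criteria above are mechanical enough to express as rules. The sketch below is one possible encoding: the `ProviderEvidence` fields, the issuer category labels (`academic_institution`, `medical_board`), and the placeholder domain list are hypothetical. The Visually Impaired Research Institute tag is left out because it also depends on a review of the provider's website.

```python
from dataclasses import dataclass, field

# Placeholder list; a real deployment would maintain a vetted registry.
VERIFIED_HEALTHCARE_DOMAINS = {"hospital.example"}


@dataclass
class ProviderEvidence:
    """Hypothetical summary of what earlier verification stages established."""
    email_domain: str = ""
    proof_url_domain: str = ""
    vc_issuer_categories: set[str] = field(default_factory=set)
    vc_issuer_domains: set[str] = field(default_factory=set)


def assign_tags(evidence: ProviderEvidence) -> list[str]:
    """Apply the tag criteria and return the matching tags in a fixed order."""
    tags: list[str] = []

    # Researcher: holds a VC from an academic institution.
    if "academic_institution" in evidence.vc_issuer_categories:
        tags.append("Researcher")

    # Medical Employee: listed healthcare email domain AND medical board VC.
    if (evidence.email_domain in VERIFIED_HEALTHCARE_DOMAINS
            and "medical_board" in evidence.vc_issuer_categories):
        tags.append("Medical Employee")

    # University Affiliated: a VC issuer or the URL proof sits on a .edu domain.
    domains = set(evidence.vc_issuer_domains) | {evidence.proof_url_domain}
    if any(d.lower().endswith(".edu") for d in domains):
        tags.append("University Affiliated")

    return tags
```

Keeping each rule as a separate, commented condition makes it straightforward to log which criterion produced which tag during the Tagging stage.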

3.4 Audit & Logging

All verification events are recorded:

provider_id: string
did: string
email_verified: boolean
domain_verified: boolean
vc_types: [string]
tags_assigned: [string]
timestamp: datetime
logs:
  - stage: DID | Email | Domain | Tagging
    result: pass | fail
    details: string

Logs are appended to a tamper-evident ledger, accessible via the Admin API.
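One common way to make an append-only log tamper-evident is a hash chain, where each entry stores the hash of the entry before it. The sketch below illustrates that idea for the `logs` entries in the schema above; it is an assumption about how such a ledger might work, not a description of the platform's ledger or Admin API. The function names and the `prev_hash` and `entry_hash` fields are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional

GENESIS_HASH = "0" * 64  # stand-in "previous hash" for the first entry


def _digest(body: dict) -> str:
    """Hash a canonical JSON encoding of the entry body."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_entry(ledger: list, stage: str, result: str, details: str,
                 timestamp: Optional[str] = None) -> dict:
    """Append a log entry that commits to the hash of the previous entry."""
    prev_hash = ledger[-1]["entry_hash"] if ledger else GENESIS_HASH
    body = {
        "stage": stage,
        "result": result,
        "details": details,
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
        "prev_hash": prev_hash,
    }
    entry = {**body, "entry_hash": _digest(body)}
    ledger.append(entry)
    return entry


def verify_chain(ledger: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in ledger:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if body.get("prev_hash") != prev_hash:
            return False
        if _digest(body) != entry.get("entry_hash"):
            return False
        prev_hash = entry["entry_hash"]
    return True
```

A chain like this detects modification but does not prevent it; the latest hash still has to be anchored somewhere the log writer cannot rewrite.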

4. Security & Privacy Considerations

  • Data Minimization: Only store hashed identifiers and verification flags.
  • Encryption: Encrypt logs at rest using keys held in a key management service (KMS).
  • Consent: Providers explicitly consent to verification and to the publishing of selected metadata.
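As an illustration of the data minimization point, identifiers such as email addresses can be stored as keyed hashes rather than in the clear. The sketch below uses HMAC-SHA-256 instead of a plain hash; that choice is an assumption, made because email addresses are easy to guess and an unkeyed hash could be reversed by brute force. The function name and the normalization step (trim and lowercase) are also hypothetical.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, key: bytes) -> str:
    """Return a keyed hash of an identifier for storage in place of the raw value."""
    # Assumed normalization so the same address always maps to the same hash.
    normalized = identifier.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
```

The key would need to be managed with the same care as the log encryption keys, since anyone holding it can test candidate identifiers against stored hashes.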

5. Conclusion

By combining DIDs, verifiable credentials, and programmatic email/domain proof, the Dataset Provider Verification process builds a robust trust foundation. Metadata tags enable consumers to filter and choose data sources with confidence, improving overall marketplace quality and reliability.

v1.0 · Last updated May 11, 2025.