The VP-WPI Test Collection is a novel dataset that implements the Virtual Patent (VP) concept. A Virtual Patent is a synthesized document that represents a single patent, created by merging the most up-to-date information from its various publication stages (e.g., kind codes A1, A2, B1, B2).
Specifically, VP-WPI is a specialized vertical of the WPI+ resource, which offers a unified, non-redundant view of patents by aggregating all relevant kind-code-level documents from the WPI test collection into unified VP documents.
This collection serves as an abstraction layer over WPI, designed to:
Simplify analysis by reducing document redundancy.
Enhance data consistency by providing a single source of truth.
Preserve traceability with links back to all original source documents.
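As a rough illustration of the Virtual Patent idea described above, the following minimal Python sketch merges several kind-code publications of one patent into a single record. It is not the WPI+ implementation: the record layout (patent_id, kind_code, pub_date and the text fields), the function name build_virtual_patents, the "later publication overrides earlier" merge rule, and the sample values are all assumptions made for this example. The actual creation process is documented in the WPI+ resource.

```python
from collections import defaultdict

# Hypothetical record layout (an assumption, not the WPI+ schema): each source
# document is a dict with "patent_id", "kind_code", "pub_date" (ISO date string)
# and optional text fields.
TEXT_FIELDS = ("title", "abstract", "claims", "description")


def build_virtual_patents(documents):
    """Group source documents by patent and merge each group into one record."""
    groups = defaultdict(list)
    for doc in documents:
        groups[doc["patent_id"]].append(doc)

    virtual_patents = {}
    for patent_id, docs in groups.items():
        # Oldest first, so that later publications overwrite earlier ones.
        ordered = sorted(docs, key=lambda d: (d["pub_date"], d["kind_code"]))
        vp = {"patent_id": patent_id, "sources": []}
        for doc in ordered:
            for field in TEXT_FIELDS:
                if doc.get(field):  # keep the earlier value if the newer one is empty
                    vp[field] = doc[field]
            # Traceability: record every source document that contributed.
            vp["sources"].append(f'{patent_id}-{doc["kind_code"]}')
        virtual_patents[patent_id] = vp
    return virtual_patents


# Made-up sample data: two publication stages of the same patent.
docs = [
    {"patent_id": "EP-0000001", "kind_code": "A1", "pub_date": "2000-01-05",
     "title": "Widget", "abstract": "An early abstract.", "claims": "1. A widget."},
    {"patent_id": "EP-0000001", "kind_code": "B1", "pub_date": "2003-06-11",
     "title": "Widget", "claims": "1. An improved widget."},
]
vps = build_virtual_patents(docs)
```

In this sketch the merged record takes its claims from the later B1 publication, carries the abstract over from A1 because B1 does not provide one, and lists both source documents under sources, which mirrors the three design goals above: less redundancy, a single consistent record, and links back to the originals.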
Further Information
For full technical details, including collection statistics, data specifications, and the creation process, please refer to:
WPI+ Resource - Documentation & Source Code: WPI+ GitHub Repository
Resources:
VP-WPI Test Collection on TU-Wien (this page): VP-WPI Collection.
WPI Test Collection on Zenodo: WPI Test Collection.
Comprehensive Thesis (in Greek): Papadopoulos, C., MSc Thesis, International Hellenic University, https://repository.ihu.gr/handle/11544/47881