ICDAR 2011 Signature Verification Competition (SigComp2011)
Contact Author
Dr. phil. nat. Marcus Liwicki
DFKI - German Research Center for Artificial Intelligence
Trippstadter Str. 122
D-67663 Kaiserslautern, Germany
E-mail: liwicki@dfki.uni-kl.de
Phone: +49 (0) 631 20575 1200
Fax: +49 (0) 631 20575 1020
Current Version
1.0
Keywords
Online handwriting, offline handwriting, signature, verification
Description
The collection contains online and offline signature samples that were acquired simultaneously. The offline dataset consists of PNG images scanned at 400 dpi in RGB color. The online dataset consists of ASCII files with one sample per line in the format: X, Y, Z.
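A minimal sketch of reading one online file, assuming each line holds three numeric fields separated by commas and/or whitespace (the exact delimiter is not specified here). The function name parse_online_signature is hypothetical and the sample values are made up for illustration.

```python
import re
from typing import List, Tuple

Sample = Tuple[float, float, float]


def parse_online_signature(text: str) -> List[Sample]:
    """Parse 'X, Y, Z' lines into (x, y, z) tuples.

    Assumption: fields are separated by commas and/or whitespace.
    Blank, short, or non-numeric lines are skipped.
    """
    samples: List[Sample] = []
    for line in text.splitlines():
        fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
        if len(fields) < 3:
            continue
        try:
            x, y, z = (float(f) for f in fields[:3])
        except ValueError:
            continue  # e.g. a header line
        samples.append((x, y, z))
    return samples


# Made-up content in the described layout, not real dataset values.
example = "100, 200, 0\n101 203 512\n\n102,205,600\n"
points = parse_online_signature(example)
```

For a file on disk, the text could be obtained with pathlib.Path(path).read_text() before calling the function.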
Dutch dataset
- Training set
- For both online and offline modes, signatures of 10 reference writers and skilled forgeries of these signatures.
- Total online: 449 signatures, total offline: 362 signatures.
- Additionally, the public data of the 2009 competition may be used for training.
- Test set
- For both online and offline modes, signatures of 54 reference writers and skilled forgeries of these signatures.
- Total online: 1907 signatures, total offline: 1932 signatures.
Chinese dataset
- Training set
- For both online and offline modes, signatures of 10 reference writers and skilled forgeries of these signatures.
- Total online: 659 signatures, total offline: 575 signatures.
- Test set
- For both online and offline modes, signatures of 10 reference writers and skilled forgeries of these signatures.
- Total online: 680 signatures, total offline: 602 signatures.
Technical Details
Data Acquisition Details
A preprinted sheet with 12 numbered boxes (width 59 mm, height 23 mm) was placed underneath the blank writing paper. Four extra blank pages were added underneath these two pages to ensure a soft writing surface.
- Sampling rate 200 Hz, resolution 2000 lines/cm, precision of 0.25 mm.
- Collection device: WACOM Intuos3 A3 Wide USB Pen Tablet.
- Collection software: MovAlyzer.
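The acquisition parameters above allow a rough conversion to physical units. The sketch below assumes that the X and Y values in the online files are raw tablet units at 2000 lines/cm and that samples are evenly spaced at 200 Hz; neither assumption is confirmed on this page, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

SAMPLING_RATE_HZ = 200.0  # from the acquisition details
LINES_PER_CM = 2000.0     # from the acquisition details


def to_millimetres(value_in_lines: float) -> float:
    """Convert tablet units to millimetres (assumes raw tablet units)."""
    return value_in_lines / LINES_PER_CM * 10.0


def speeds_mm_per_s(points: Sequence[Tuple[float, float]]) -> List[float]:
    """Pen speed between consecutive samples, assuming uniform sampling."""
    speeds = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dist_mm = math.hypot(to_millimetres(x1 - x0), to_millimetres(y1 - y0))
        # One sample interval lasts 1 / SAMPLING_RATE_HZ seconds.
        speeds.append(dist_mm * SAMPLING_RATE_HZ)
    return speeds
```

Under these assumptions, a displacement of 200 units between two consecutive samples corresponds to 1 mm in 5 ms.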
Folder Structure and File Naming
The signatures of the training set are arranged according to the following folder structure:
- OfflineSignatures
  - Chinese
    - TrainingSet
      - Offline Genuine
      - Offline Forgeries
  - Dutch
    - TrainingSet
      - Offline Genuine
      - Offline Forgeries
- OnlineSignatures
  - … (similar structure)
Genuine signatures are named according to the following convention (the same for all data sets): III_NN.*, where III is the ID of the reference writer and NN is an index of the signature, i.e., it is the NNth authentic signature contributed by writer III.
Simulated signatures (forgeries) are named according to the following convention: FFFFIII_NN.*, where FFFF is the ID of the forger, III is the ID of the reference writer and NN is an index, i.e., it is the NNth attempt of forger FFFF to simulate the signature of writer III.
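The two training-set conventions can be told apart with one pattern. This is a sketch that assumes the IDs are purely numeric with the widths suggested by the placeholders (four digits for FFFF, three for III); the class and function names, and the example file names, are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Optional 4-digit forger ID, 3-digit writer ID, underscore, index, extension.
_TRAIN_NAME = re.compile(
    r"^(?P<forger>\d{4})?(?P<writer>\d{3})_(?P<index>\d+)\.(?P<ext>[^.]+)$"
)


@dataclass(frozen=True)
class TrainingName:
    writer: str            # III
    index: int             # NN
    forger: Optional[str]  # FFFF, or None for a genuine signature

    @property
    def is_forgery(self) -> bool:
        return self.forger is not None


def parse_training_name(filename: str) -> TrainingName:
    """Parse III_NN.* or FFFFIII_NN.* (digit widths are an assumption)."""
    m = _TRAIN_NAME.match(filename)
    if m is None:
        raise ValueError(f"unexpected training file name: {filename!r}")
    return TrainingName(m["writer"], int(m["index"]), m["forger"])
```

For example, parse_training_name("001_07.png") would be read as genuine signature 7 of writer 001, while a name such as "0119001_02.png" would be read as attempt 2 of forger 0119 on writer 001.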
It is advised to optimize your systems using 12 authentic signatures per writer for training and the remaining authentic signatures for validation; cross-validation is also possible. Note that in the first version of the data set some online authentic signatures are missing (only 12 reference signatures are included). A second version of the data set will provide the missing data.
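The suggested split can be sketched as follows, assuming genuine file names follow the III_NN.* convention. Taking the 12 lowest indices for training is an arbitrary choice made here for reproducibility; a random selection or cross-validation would work equally well. The function name split_genuine is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def split_genuine(
    filenames: Iterable[str], n_train: int = 12
) -> Dict[str, Tuple[List[str], List[str]]]:
    """Map each writer ID to (training files, validation files)."""
    by_writer = defaultdict(list)
    for name in filenames:
        stem = name.rsplit(".", 1)[0]      # drop the extension
        writer, index = stem.split("_")    # III_NN
        by_writer[writer].append((int(index), name))

    split = {}
    for writer, items in by_writer.items():
        ordered = [name for _, name in sorted(items)]
        split[writer] = (ordered[:n_train], ordered[n_train:])
    return split
```

A writer with 12 or fewer genuine signatures ends up with an empty validation list, which is the situation described for some online signatures in the first version of the data set.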
The test set uses a different folder structure and different filename conventions.
The signatures of the test set are arranged according to the following folder structure:
- SigComp11-Offlinetestset
  - Chinese
    - Questioned(487) (Containing all the questioned signatures, both genuine and forged)
    - Ref(115) (Containing the reference signatures, only genuine)
  - Dutch
    - Questioned(1287) (Containing all the questioned signatures, both genuine and forged)
    - Ref(646) (Containing the reference signatures, only genuine)
- SigComp11-Onlinetestset
  - … (similar structure)
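After unpacking, the numbers embedded in the test-set folder names can serve as a sanity check. The sketch below assumes that the number in parentheses is the count of signature files stored anywhere below that folder, which this page does not state explicitly; both function names are hypothetical.

```python
import re
from pathlib import Path
from typing import Tuple

_COUNTED = re.compile(r"^(?P<label>.+?)\((?P<count>\d+)\)$")


def parse_counted_folder(name: str) -> Tuple[str, int]:
    """Split a folder name such as 'Questioned(487)' into label and count."""
    m = _COUNTED.match(name)
    if m is None:
        raise ValueError(f"no count in folder name: {name!r}")
    return m["label"], int(m["count"])


def count_matches(folder: Path) -> bool:
    """Check the number of files below `folder` against its name.

    Assumption: every regular file below the folder is a signature.
    """
    _, expected = parse_counted_folder(folder.name)
    actual = sum(1 for p in folder.rglob("*") if p.is_file())
    return actual == expected
```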
Note that for both the training and test sets the online and offline folders do not necessarily contain exactly the same signatures, because not all samples could be acquired in both modes. Furthermore, note that the online signatures may contain artifacts from pen movements (e.g., strokes that no longer belong to the actual signature). Systems could recover from those artifacts by applying preprocessing heuristics.
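One very simple preprocessing heuristic is to trim pen-up samples at the start and end of a recording. The sketch below assumes that the third column (Z) is pen pressure and that values below a threshold mean the pen is lifted; this page does not define Z, so treat both the interpretation and the threshold as assumptions. It does not remove stray strokes inside the recording.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float, float]


def trim_pen_up(samples: Sequence[Sample], min_pressure: float = 1.0) -> List[Sample]:
    """Drop leading and trailing samples whose Z is below `min_pressure`.

    Assumption: Z is pressure, and Z < min_pressure means pen-up.
    Pen-up samples between strokes are kept.
    """
    down = [i for i, (_, _, z) in enumerate(samples) if z >= min_pressure]
    if not down:
        return []
    return list(samples[down[0]: down[-1] + 1])
```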
Genuine signatures are named according to the following convention (the same for all data sets): NN_III.*, where NN is an index of the signature and III is the ID of the reference writer, i.e., it is the NNth authentic signature contributed by writer III.
Simulated signatures (forgeries) are named according to the following convention: NN_FFFFIII.*, where NN is an index, FFFF is the ID of the forger, and III is the ID of the reference writer, i.e., it is the NNth attempt of forger FFFF to simulate the signature of writer III.
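For the test set the index comes first, so a separate pattern is needed. As before, this sketch assumes numeric IDs with four digits for FFFF and three for III; as a precaution it also tolerates whitespace around the underscore. The names TestName and parse_test_name, and the example file names, are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Index, underscore, optional 4-digit forger ID, 3-digit writer ID, extension.
_TEST_NAME = re.compile(
    r"^(?P<index>\d+)\s*_\s*(?P<forger>\d{4})?(?P<writer>\d{3})\.(?P<ext>[^.]+)$"
)


class TestName(NamedTuple):
    index: int             # NN
    writer: str            # III
    forger: Optional[str]  # FFFF, or None for a genuine signature


def parse_test_name(filename: str) -> TestName:
    """Parse NN_III.* or NN_FFFFIII.* (digit widths are an assumption)."""
    m = _TEST_NAME.match(filename)
    if m is None:
        raise ValueError(f"unexpected test file name: {filename!r}")
    return TestName(int(m["index"]), m["writer"], m["forger"])
```

In the Questioned folders, where genuine and forged signatures are mixed, the forger field of the parsed name is what distinguishes the two classes when computing ground truth.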
References
- Marcus Liwicki, Michael Blumenstein, Elisa van den Heuvel, Charles E.H. Berger, Reinoud D. Stoel, Bryan Found, Xiaohong Chen, Muhammad Imran Malik. "SigComp11: Signature Verification Competition for On- and Offline Skilled Forgeries", Proc. 11th Int. Conference on Document Analysis and Recognition, 2011
Submitted Files
Version 1.0
Files
Note: the password for opening the zip files is "I hereby accept the SigComp 2011 disclaimer." (without the double quotes).
- Disclaimer (30 KB)
- Training set (239 MB)
- Test set (553 MB)
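The archives can also be unpacked from a script. The sketch below uses Python's standard zipfile module, which can only decrypt the legacy ZIP encryption scheme; whether these archives use that scheme is an assumption, and a different tool would be needed otherwise. The function name and any archive path passed to it are hypothetical. Using the password implies acceptance of the disclaimer.

```python
import zipfile
from pathlib import Path
from typing import List

# Password as quoted on this page, without the surrounding double quotes.
PASSWORD = "I hereby accept the SigComp 2011 disclaimer."


def extract_archive(archive: Path, destination: Path) -> List[str]:
    """Extract `archive` into `destination` and return the member names.

    Assumption: the archive uses legacy ZIP encryption, which is the
    only encryption scheme the standard zipfile module can decrypt.
    """
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(path=destination, pwd=PASSWORD.encode("utf-8"))
        return zf.namelist()
```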