ICDAR 2011 Signature Verification Competition (SigComp2011)


Latest revision as of 12:13, 9 February 2012


Created: 2012-01-23
Last updated: 2012-02-09

Contact Author

Dr. phil. nat. Marcus Liwicki
DFKI - German Research Center for Artificial Intelligence
Trippstadter Str. 122
D-67663 Kaiserslautern, Germany
E-mail: liwicki@dfki.uni-kl.de
Phone +49 (0) 631 20575 1200
Fax +49 (0) 631 20575 1020

Current Version

1.0

Keywords

Online handwriting, offline handwriting, signature, verification

Description

Six sample signatures from the SigComp2011 dataset. Top row: signatures from the Chinese subset; bottom row: signatures from the Dutch subset.

The collection contains simultaneously acquired online and offline signature samples. The offline dataset comprises PNG images, scanned at 400 dpi in RGB color. The online dataset comprises ASCII files with one sample point per line, in the format: X, Y, Z.
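As an illustration of the online format described above, the following is a minimal sketch of a reader for one online signature. It assumes that the three values on each line are numeric and separated by commas and/or whitespace; the exact delimiter and the meaning of Z are not specified on this page, and the function name parse_online_signature is hypothetical.

```python
import re
from typing import List, Tuple


def parse_online_signature(text: str) -> List[Tuple[float, float, float]]:
    """Parse the contents of one online signature file into (X, Y, Z) points.

    Assumption: values are separated by commas and/or whitespace.
    Lines that do not start with three numeric fields are skipped.
    """
    points = []
    for line in text.splitlines():
        fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
        if len(fields) < 3:
            continue  # blank line or unexpected content
        try:
            x, y, z = (float(v) for v in fields[:3])
        except ValueError:
            continue  # non-numeric line, e.g. a header
        points.append((x, y, z))
    return points


if __name__ == "__main__":
    sample = "100, 200, 512\n101 202 530\n"
    for point in parse_online_signature(sample):
        print(point)
```

Skipping malformed lines rather than raising an error is a design choice for exploratory use; a stricter reader may be preferable once the actual file layout has been confirmed.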

Dutch dataset

  • Training set
    • For both online and offline modes, signatures of 10 reference writers and skilled forgeries of these signatures.
    • Total online: 449 signatures, total offline: 362 signatures.
    • Additionally, the public data of the 2009 competition may be used for training.
  • Test set
    • For both online and offline modes, signatures of 54 reference writers and skilled forgeries of these signatures.
    • Total online: 1907 signatures, total offline: 1932 signatures.

Chinese dataset

  • Training set
    • For both online and offline modes, signatures of 10 reference writers and skilled forgeries of these signatures.
    • Total online: 659 signatures, total offline: 575 signatures.
  • Test set
    • For both online and offline modes, signatures of 10 reference writers and skilled forgeries of these signatures.
    • Total online: 680 signatures, total offline: 602 signatures.

Technical Details

Data Acquisition Details

A preprinted paper with 12 numbered boxes (width 59 mm, height 23 mm) was used. The preprinted paper was placed underneath the blank writing paper. Four extra blank pages were added underneath the first two pages to ensure a soft writing surface.

  • Sampling rate 200 Hz, resolution 2000 lines/cm, precision of 0.25 mm.
  • Collection device: WACOM Intuos3 A3 Wide USB Pen Tablet.
  • Collection software: MovAlyzer.

Folder Structure and File Naming

The signatures of the training set are arranged according to the following folder structure:

  • OfflineSignatures
    • Chinese
      • TrainingSet
        • Offline Genuine
        • Offline Forgeries
    • Dutch
      • TrainingSet
        • Offline Genuine
        • Offline Forgeries
  • OnlineSignatures
    • … (similar structure)

Genuine signatures are named according to the following convention (the same for all data sets): III_NN.*, where III is the ID of the reference writer and NN is an index of the signature, i.e., it is the NNth authentic signature contributed by writer III.

Simulated signatures (forgeries) are named according to the following convention: FFFFIII_NN.*, where FFFF is the ID of the forger, III is the ID of the reference writer and NN is an index, i.e., it is the NNth attempt of forger FFFF to simulate the signature of writer III.
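The two training-set naming conventions above could be decoded as in the following sketch. It assumes that the IDs are digit strings whose widths match the placeholders (three digits for III, four for FFFF) and that NN is a decimal index; the names SignatureName and parse_training_name, as well as the example filenames, are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class SignatureName(NamedTuple):
    writer_id: str            # III: ID of the reference writer
    index: int                # NN: index of the signature or attempt
    forger_id: Optional[str]  # FFFF: ID of the forger, None if genuine


# Assumption: III is exactly 3 digits and FFFF exactly 4 digits.
_GENUINE = re.compile(r"^(\d{3})_(\d+)\.[^.]+$")
_FORGERY = re.compile(r"^(\d{4})(\d{3})_(\d+)\.[^.]+$")


def parse_training_name(filename: str) -> SignatureName:
    """Decode a training-set filename (III_NN.* or FFFFIII_NN.*)."""
    match = _GENUINE.match(filename)
    if match:
        return SignatureName(match.group(1), int(match.group(2)), None)
    match = _FORGERY.match(filename)
    if match:
        return SignatureName(match.group(2), int(match.group(3)), match.group(1))
    raise ValueError(f"unrecognized training filename: {filename!r}")


if __name__ == "__main__":
    print(parse_training_name("001_05.png"))      # genuine example
    print(parse_training_name("0012001_03.png"))  # forgery example
```

If the real files use different ID widths, only the two regular expressions need to change.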

It is advised to optimize your systems by using 12 authentic signatures per writer for training and the remaining authentic signatures for validation. You could also perform cross-validation. Note that in the first version of the data set some online authentic signatures are missing (only 12 reference signatures are available). In a second version of this data set we will provide you with this missing data.
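The suggested split can be sketched as follows. This is only one possible reading: it assumes genuine filenames of the form III_NN.* and takes the 12 signatures with the lowest index NN for training, which is an arbitrary choice not prescribed by this page. The function name split_genuine is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def split_genuine(
    filenames: Iterable[str], n_train: int = 12
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Split genuine signatures per writer into training and validation lists.

    Assumption: each filename follows III_NN.* and the first n_train
    signatures by index NN are used for training.
    """
    by_writer = defaultdict(list)
    for name in filenames:
        stem = name.rsplit(".", 1)[0]
        writer, _, index = stem.partition("_")
        by_writer[writer].append((int(index), name))

    train, validation = {}, {}
    for writer, items in by_writer.items():
        ordered = [name for _, name in sorted(items)]
        train[writer] = ordered[:n_train]
        validation[writer] = ordered[n_train:]
    return train, validation


if __name__ == "__main__":
    names = [f"001_{i:02d}.png" for i in range(1, 15)]
    train, validation = split_genuine(names)
    print(len(train["001"]), validation["001"])
```

For writers with only 12 genuine signatures the validation list is empty, in which case cross-validation over the available samples may be the more practical option.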

The folder structure and the filename conventions are not the same for testing the systems.

The signatures of the test set are arranged according to the following folder structure:

  • SigComp11-Offlinetestset
    • Chinese
      • Questioned(487) (Containing all the questioned signatures, both genuine and forged)
      • Ref(115) (Containing the reference signatures, only genuine)
    • Dutch
      • Questioned(1287) (Containing all the questioned signatures, both genuine and forged)
      • Ref(646) (Containing the reference signatures, only genuine)
  • SigComp11-Onlinetestset
    • … (similar structure)

Note that for both the training and test sets the online and offline folders do not necessarily contain exactly the same signatures, because during acquisition not all samples could be acquired in both modes. Furthermore, note that the online signatures may contain artifacts from the pen movements (e.g., strokes that no longer belong to the actual signature). Systems could recover from those artifacts by applying preprocessing heuristics.

In the test set, genuine signatures are named according to the following convention (the same for all data sets): NN _III.*, where NN is an index of the signature and III is the ID of the reference writer, i.e., it is the NNth authentic signature contributed by writer III.

Simulated signatures (forgeries) are named according to the following convention: NN _FFFFIII.*, where NN is an index, FFFF is the ID of the forger, and III is the ID of the reference writer, i.e., it is the NNth attempt of forger FFFF to simulate the signature of writer III.

References

  1. Marcus Liwicki, Michael Blumenstein, Elisa van den Heuvel, Charles E.H. Berger, Reinoud D. Stoel, Bryan Found, Xiaohong Chen, Muhammad Imran Malik. "SigComp11: Signature Verification Competition for On- and Offline Skilled Forgeries", Proc. 11th Int. Conference on Document Analysis and Recognition, 2011

Submitted Files

Version 1.0

Files

Note: the password for opening the zip files is "I hereby accept the SigComp 2011 disclaimer." (without the double quotes).

