Binarization of PHIBD 2012 dataset

From TC11
 
(Version 1.0)
 
* [http://www.iapr-tc11.org/dataset/PHIBD2012/Training.txt Train Set Meta Data] (0 Mb)
* [http://www.iapr-tc11.org/dataset/PHIBD2012/Test.txt Test Set Meta Data] (0 Mb)
* [http://www.iapr-tc11.org/dataset/PHIBD2012/Otsu_PHIBD_2012.zip Otsu_PHIBD_2012 baseline method] (0.39 Mb)
* [http://www.iapr-tc11.org/dataset/PHIBD2012/PC_PHIBD_2012.zip PC_PHIBD_2012 baseline method] (0.33 Mb)
* [http://www.iapr-tc11.org/dataset/PHIBD2012/SGLBGL_PHIBD_2012.zip SGLBGL_PHIBD_2012 baseline method] (0.36 Mb)
* [http://www.iapr-tc11.org/dataset/PHIBD2012/SGLBGL_PC_PHIBD_2012.zip SGLBGL_PC_PHIBD_2012 baseline method] (0.36 Mb)
* [http://www.iapr-tc11.org/dataset/PHIBD2012/SGLBGL_PC_TrainTest_PHIBD_2012.zip SGLBGL_PC_TrainTest_PHIBD_2012 baseline method] (0.36 Mb)
* [http://www.iapr-tc11.org/dataset/PHIBD2012/SGLBGL_metacode.m SGLBGL_metacode.m Sample Program] (0 Mb)

Latest revision as of 15:50, 3 July 2013


Created: 2013-05-30
Last updated: 2013-07-03

Description

Binarization of handwritten document images.

There are two tasks, depending on the nature of the binarization method used.

  1. For regular binarization methods, the task is to binarize all 15 document images.
  2. For learning-based binarization methods, the task is to use images number 1 to 5 for training, and then binarize images number 6 to 15.

Baseline methods are provided: the PC (phase congruency) binarization method [Ziaei2012] and the SGL/BGL binarization method [Farrahi2009, Farrahi2010]. The SGL/BGL method is initialized with a rough binarization.



Evaluation Protocol

  1. For regular methods, the average F-measure of all binarized images against the provided ground truth is reported as the performance of the method.
  2. For learning-based methods, the average F-measure of binarized images number 6 to 15 against the provided ground truth is reported as the performance of the method.
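The protocol above can be sketched in a few lines of Python. This is only an illustration, not the evaluation code used for the dataset: the function names, the 0/1 pixel encoding (1 marks text), and the percent scale are assumptions made for the example.

```python
def f_measure(pred, truth):
    """F-measure (in percent) of one binarized image against its ground truth.

    Assumption: pred and truth are equal-length flat sequences of 0/1 pixels,
    where 1 marks text (foreground) and 0 marks background.
    """
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    if tp == 0:
        return 0.0  # no correctly detected text pixel
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100.0 * 2 * precision * recall / (precision + recall)


def average_f_measure(results, image_numbers):
    """Mean F-measure over the selected images.

    results maps an image number to a (pred, truth) pair (hypothetical layout).
    """
    scores = [f_measure(*results[n]) for n in image_numbers]
    return sum(scores) / len(scores)


# Image ranges taken from the protocol: all 15 images for regular methods,
# images 6 to 15 for learning-based methods.
REGULAR_IMAGES = range(1, 16)
LEARNING_TEST_IMAGES = range(6, 16)
```

A regular method would be scored with `average_f_measure(results, REGULAR_IMAGES)` and a learning-based method with `average_f_measure(results, LEARNING_TEST_IMAGES)`.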
Baseline results (average F-measure, in percent):

Method                                  Whole set   Training set   Test set
Otsu (regular)                          82.09       90.76          77.75
PC (regular)                            90.91       92.33          90.20
SGL/BGL (upper bound; rough bin: GT)    94.55       95.37          94.14
SGL/BGL (upper bound; rough bin: PC)    91.79       93.29          91.04
SGL/BGL (learning) (rough bin: PC)      N/A         N/A            89.94
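As a quick consistency check on the reported numbers, the whole-set average can be recomputed from the other two columns. The sketch below assumes that the "Training set" column covers images 1 to 5 and the "Test set" column covers images 6 to 15, as in the task description; the helper name is made up for the example.

```python
N_TRAIN, N_TEST = 5, 10  # assumed split: images 1-5 and images 6-15


def whole_set_average(train_avg, test_avg):
    """Per-image average over all 15 images, from the two subset averages."""
    return (N_TRAIN * train_avg + N_TEST * test_avg) / (N_TRAIN + N_TEST)


# (whole set, training set, test set) as reported for the baselines.
reported = {
    "Otsu (regular)": (82.09, 90.76, 77.75),
    "PC (regular)": (90.91, 92.33, 90.20),
    "SGL/BGL (upper bound; rough bin: GT)": (94.55, 95.37, 94.14),
    "SGL/BGL (upper bound; rough bin: PC)": (91.79, 93.29, 91.04),
}

for method, (whole, train, test) in reported.items():
    recomputed = whole_set_average(train, test)
    print(f"{method}: reported {whole:.2f}, recomputed {recomputed:.2f}")
```

Under this assumption, the recomputed values agree with the reported whole-set column to two decimals for all four methods.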

Metacode (a sample program) for a learning-based binarization method based on the stroke gray level (SGL) and the background gray level (BGL) is provided. An executable of the method will be provided in the near future.

The proposed learning-based binarization method uses the SGL and the BGL to determine a locally-adaptive threshold value based on a parameter (alpha). The optimal selection of this parameter is the learning part of this method.
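The idea of learning alpha can be illustrated with a small sketch. It is not the provided SGLBGL_metacode.m and not the published method: the threshold form `SGL + alpha * (BGL - SGL)`, the grid search, and all function names are assumptions chosen to make the example concrete.

```python
def binarize(gray, sgl, bgl, alpha):
    """Label a pixel as text (1) when it is darker than a local threshold.

    Assumed threshold form: T = SGL + alpha * (BGL - SGL), evaluated per
    pixel from the local stroke and background gray levels.
    """
    return [1 if g < s + alpha * (b - s) else 0
            for g, s, b in zip(gray, sgl, bgl)]


def f_measure(pred, truth):
    """F-measure in [0, 1] for flat 0/1 pixel sequences (1 marks text)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def learn_alpha(training_images, candidates):
    """Pick the candidate alpha with the best mean F-measure on training data.

    training_images is a list of (gray, sgl, bgl, truth) tuples
    (hypothetical layout; images 1 to 5 in the learning task).
    """
    def mean_f(alpha):
        scores = [f_measure(binarize(g, s, b, alpha), t)
                  for g, s, b, t in training_images]
        return sum(scores) / len(scores)

    return max(candidates, key=mean_f)


# Toy four-pixel "image" with constant SGL and BGL maps.
toy_image = ([50, 120, 200, 90], [40] * 4, [220] * 4, [1, 0, 0, 1])
best_alpha = learn_alpha([toy_image], [0.1, 0.4, 0.9])
print(best_alpha)
```

The selected alpha would then be fixed and applied unchanged to the test images (6 to 15), so that no test ground truth is used during learning.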

Related Dataset

Related Ground Truth Data

References

  • [Ziaei2013] Hossein Ziaei Nafchi, Reza Farrahi Moghaddam and Mohamed Cheriet, Persian historical document dataset with introduction to PhaseGT: A ground truthing application, to be submitted to ICDAR’13.
  • [Ziaei2012] Hossein Ziaei Nafchi, Reza Farrahi Moghaddam and Mohamed Cheriet, Historical Document Binarization Based on Phase Information of Images, in ACCV’12 Workshop on e-Heritage, Daejeon, South Korea, Nov 5-10, 2012.
  • [Farrahi2009] Reza Farrahi Moghaddam and Mohamed Cheriet, RSLDI: Restoration of single-sided low-quality document images, Pattern Recognition, Volume 42, Issue 12, p.3355–3364 (2009) DOI: 10.1016/j.patcog.2008.10.021
  • [Farrahi2010] Reza Farrahi Moghaddam and Mohamed Cheriet, A multi-scale framework for adaptive binarization of degraded document images, Pattern Recognition, Volume 43, Issue 6, p.2186–2198 (2010) DOI: 10.1016/j.patcog.2009.12.024
  • [Cheriet2012] Mohamed Cheriet, Reza Farrahi Moghaddam and Rachid Hedjam, A learning framework for the optimization and automation of document binarization methods, Computer Vision and Image Understanding, accepted (2012) DOI: 10.1016/j.cviu.2012.11.003

Submitted Files

Version 1.0


This page is editable only by TC11 Officers.