Binarization of PHIBD 2012 dataset
Description
Binarization of handwritten document images.
There are two tasks, depending on the type of binarization method used.
- For regular binarization methods, the task is to binarize all 15 document images.
- For learning-based binarization methods, the task is to use images number 1 to 5 for training, and then binarize images number 6 to 15.
A few baseline methods are provided: the PC (phase congruency) binarization method [Ziaei2012], and the SGL/BGL (stroke gray level / background gray level) binarization method [Farrahi2009, Farrahi2010]. The SGL/BGL method uses a rough binarization as its initialization.
Evaluation Protocol
- For regular methods, the performance of a method is the average F-measure of its binarized images (all 15) against the provided ground truth.
- For learning-based methods, the performance of a method is the average F-measure of its binarized images number 6 to 15 against the provided ground truth.
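The protocol above can be sketched in Python as follows. This is a minimal illustration, not the official evaluation tool: it assumes text pixels are the positive class, that each binarized image and ground truth is available as a boolean NumPy array, and that images are keyed by their number. The names `f_measure`, `average_f_measure`, `ALL_IMAGES`, `TRAIN_IMAGES`, and `TEST_IMAGES` are hypothetical.

```python
import numpy as np

# Image numbers as described in the task definition (assumed 1-based).
ALL_IMAGES = range(1, 16)    # regular methods: images 1 to 15
TRAIN_IMAGES = range(1, 6)   # learning-based methods: training images 1 to 5
TEST_IMAGES = range(6, 16)   # learning-based methods: test images 6 to 15


def f_measure(pred, gt):
    """F-measure in percent; True marks a text pixel in both arrays (assumption)."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    tp = np.count_nonzero(pred & gt)    # text pixels found
    fp = np.count_nonzero(pred & ~gt)   # background marked as text
    fn = np.count_nonzero(~pred & gt)   # text pixels missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100.0 * 2.0 * precision * recall / (precision + recall)


def average_f_measure(preds, gts, image_numbers):
    """Average F-measure over the given image numbers.

    preds and gts are dicts mapping image number -> boolean array.
    """
    scores = [f_measure(preds[n], gts[n]) for n in image_numbers]
    return float(np.mean(scores))
```

A regular method would be scored with `average_f_measure(preds, gts, ALL_IMAGES)`, and a learning-based method with `average_f_measure(preds, gts, TEST_IMAGES)`.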
Average F-measure (%) of the baseline methods:
Method | Whole set (images 1 to 15) | Training set (images 1 to 5) | Test set (images 6 to 15) |
---|---|---|---|
Otsu (regular) | 82.09 | 90.76 | 77.75 |
PC (regular) | 90.91 | 92.33 | 90.20 |
SGL/BGL (upper bound; rough bin: GT) | 94.55 | 95.37 | 94.14 |
SGL/BGL (upper bound; rough bin: PC) | 91.79 | 93.29 | 91.04 |
SGL/BGL (learning) (rough bin: PC) | N/A | N/A | 89.94 |
Pseudocode for a learning-based binarization method based on the stroke gray level (SGL) and the background gray level (BGL) is provided. An executable of the method will be provided in the near future.
The learning-based method uses the SGL and the BGL to compute a locally adaptive threshold that is controlled by a single parameter (alpha). Selecting the optimal value of alpha is the learning part of the method.
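The role of alpha can be illustrated with a short Python sketch. The page does not give the threshold formula, so the linear interpolation between SGL and BGL used here is an assumption made purely for illustration, as is the grid search over candidate alpha values; neither is claimed to be the authors' exact procedure. All function names are hypothetical, and the SGL and BGL maps are assumed to be precomputed per pixel (for example, from the rough binarization).

```python
import numpy as np


def binarize(image, sgl, bgl, alpha):
    """Mark a pixel as text (True) if it is darker than the local threshold.

    Assumed threshold: SGL + alpha * (BGL - SGL). This interpolation is an
    illustrative choice, not a formula stated in the dataset description.
    """
    threshold = sgl + alpha * (bgl - sgl)
    return image < threshold


def _f_measure(pred, gt):
    """F-measure in percent, with text pixels as the positive class."""
    tp = np.count_nonzero(pred & gt)
    fp = np.count_nonzero(pred & ~gt)
    fn = np.count_nonzero(~pred & gt)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 100.0 * 2.0 * precision * recall / (precision + recall)


def learn_alpha(train_samples, candidates=None):
    """Pick the alpha with the best average F-measure on the training images.

    train_samples is a list of (image, sgl, bgl, ground_truth) tuples.
    Returns (best_alpha, best_average_f_measure).
    """
    if candidates is None:
        candidates = np.linspace(0.1, 0.9, 17)  # assumed search grid
    best_alpha, best_score = None, -np.inf
    for alpha in candidates:
        score = np.mean([
            _f_measure(binarize(img, sgl, bgl, alpha), gt)
            for img, sgl, bgl, gt in train_samples
        ])
        if score > best_score:  # keep the first alpha that reaches the best score
            best_alpha, best_score = float(alpha), float(score)
    return best_alpha, best_score
```

In the learning-based task, `learn_alpha` would be run on images 1 to 5 only, and the selected alpha then applied unchanged to images 6 to 15.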
Related Dataset
Related Ground Truth Data
References
- [Ziaei2013] Hossein Ziaei Nafchi, Reza Farrahi Moghaddam, and Mohamed Cheriet, Persian historical document dataset with introduction to PhaseGT: A ground truthing application, to be submitted to ICDAR’13.
- [Ziaei2012] Hossein Ziaei Nafchi, Reza Farrahi Moghaddam, and Mohamed Cheriet, Historical Document Binarization Based on Phase Information of Images, in ACCV’12 Workshop on e-Heritage, Daejeon, South Korea, Nov 5-10, 2012.
- [Farrahi2009] Reza Farrahi Moghaddam and Mohamed Cheriet, RSLDI: Restoration of single-sided low-quality document images, Pattern Recognition, Volume 42, Issue 12, p.3355–3364 (2009) DOI: 10.1016/j.patcog.2008.10.021
- [Farrahi2010] Reza Farrahi Moghaddam and Mohamed Cheriet, A multi-scale framework for adaptive binarization of degraded document images, Pattern Recognition, Volume 43, Issue 6, p.2186–2198 (2010) DOI: 10.1016/j.patcog.2009.12.024
- [Cheriet2012] Mohamed Cheriet, Reza Farrahi Moghaddam, and Rachid Hedjam, A learning framework for the optimization and automation of document binarization methods, Computer Vision and Image Understanding, accepted for publication (2012) DOI: 10.1016/j.cviu.2012.11.003
Submitted Files
Version 1.0
- Train Set Meta Data (0 Mb)
- Test Set Meta Data (0 Mb)
- baseline method (0.39 Mb)
- baseline method (0.33 Mb)
- baseline method (0.36 Mb)
- baseline method (0.36 Mb)
- baseline method (0.36 Mb)
- Sample Program (0 Mb)