Binarization of PHIBD 2012 dataset
Binarization of handwritten document images.
There are two tasks, depending on the type of binarization method used:
- For regular binarization methods, the task is to binarize all 15 document images.
- For learning-based binarization methods, the task is to use images number 1 to 5 for training, and then binarize images number 6 to 15.
A few baseline methods are provided: the phase congruency (PC) binarization method [Ziaei2012] and the SGL/BGL binarization method [Farrahi2009, Farrahi2010]. The SGL/BGL method is initialized with a rough binarization.
- For regular methods, performance is measured as the average F-measure of the binarized images against the provided ground truth.
- For learning-based methods, performance is measured as the average F-measure of binarized images number 6 to 15 against the provided ground truth.
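The evaluation described above can be sketched in a few lines of Python. This is a minimal illustration, not the official evaluation tool: the function names `f_measure` and `average_f_measure` are hypothetical, images are assumed to be nested lists in which 1 marks a text pixel and 0 a background pixel, and the score is reported in percent.

```python
def f_measure(binarized, ground_truth):
    """F-measure (in percent) of one binarized image against its ground truth.

    Assumption: both images are nested lists of equal size in which a
    truthy value marks a text (foreground) pixel.
    """
    tp = fp = fn = 0
    for pred_row, gt_row in zip(binarized, ground_truth):
        for pred, gt in zip(pred_row, gt_row):
            if pred and gt:
                tp += 1  # text pixel correctly detected
            elif pred:
                fp += 1  # background pixel labeled as text
            elif gt:
                fn += 1  # text pixel that was missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100.0 * 2.0 * precision * recall / (precision + recall)


def average_f_measure(pairs):
    """Mean F-measure over an iterable of (binarized, ground_truth) pairs."""
    scores = [f_measure(binarized, gt) for binarized, gt in pairs]
    return sum(scores) / len(scores)
```

For a regular method, `pairs` would hold all 15 images; for a learning-based method, it would hold only images number 6 to 15.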
| Method | Whole set | Training set | Test set |
|---|---|---|---|
| SGL/BGL (upper bound; rough bin: GT) | 94.55 | 95.37 | 94.14 |
| SGL/BGL (upper bound; rough bin: PC) | 91.79 | 93.29 | 91.04 |
| SGL/BGL (learning) (rough bin: PC) | N/A | N/A | 89.94 |
Metacode for a learning-based binarization method based on the stroke gray level (SGL) and the background gray level (BGL) is provided. An executable of the method will be provided in the near future.
The proposed learning-based binarization method uses the SGL and the BGL to determine a locally-adaptive threshold value based on a parameter (alpha). The optimal selection of this parameter is the learning part of this method.
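The learning step can be illustrated with a small Python sketch. It is not the provided metacode, and the threshold formula is an assumption made for illustration only: the text above does not state how alpha combines the SGL and the BGL, so the sketch places the threshold at `SGL + alpha * (BGL - SGL)` and picks alpha by a plain grid search on the training images. The names `binarize`, `f_measure`, and `learn_alpha` are hypothetical, and the per-pixel SGL and BGL maps are assumed to be already estimated.

```python
def binarize(gray, sgl, bgl, alpha):
    """Label a pixel as text (1) when it is darker than its local threshold.

    ASSUMPTION: threshold = SGL + alpha * (BGL - SGL). This formula is
    illustrative and is not taken from the dataset page or the metacode.
    """
    return [
        [1 if g < s + alpha * (b - s) else 0
         for g, s, b in zip(gray_row, sgl_row, bgl_row)]
        for gray_row, sgl_row, bgl_row in zip(gray, sgl, bgl)
    ]


def f_measure(binarized, ground_truth):
    """F-measure (in percent); truthy values mark text pixels."""
    tp = fp = fn = 0
    for pred_row, gt_row in zip(binarized, ground_truth):
        for pred, gt in zip(pred_row, gt_row):
            if pred and gt:
                tp += 1
            elif pred:
                fp += 1
            elif gt:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100.0 * 2.0 * precision * recall / (precision + recall)


def learn_alpha(training_set, candidates):
    """Return the candidate alpha with the highest mean training F-measure.

    training_set: list of (gray, sgl, bgl, ground_truth) tuples, one per
    training image (images number 1 to 5 in the learning-based task).
    """
    def mean_f(alpha):
        scores = [
            f_measure(binarize(gray, sgl, bgl, alpha), gt)
            for gray, sgl, bgl, gt in training_set
        ]
        return sum(scores) / len(scores)

    return max(candidates, key=mean_f)
```

Under these assumptions, the learned alpha would then be applied unchanged to images number 6 to 15, and only those images would count toward the reported score.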
Related Ground Truth Data
- [Ziaei2013] Hossein Ziaei Nafchi, Reza Farrahi Moghaddam, and Mohamed Cheriet. Persian historical document dataset with introduction to PhaseGT: A ground truthing application, to be submitted to ICDAR’13.
- [Ziaei2012] Hossein Ziaei Nafchi, Reza Farrahi Moghaddam and Mohamed Cheriet, Historical Document Binarization Based on Phase Information of Images, in ACCV’12 Workshop on e-Heritage, Daejeon, South Korea, Nov 5-10, 2012.
- [Farrahi2009] Reza Farrahi Moghaddam, and Mohamed Cheriet, RSLDI: Restoration of single-sided low-quality document images, Pattern Recognition, Volume 42, Issue 12, p.3355–3364 (2009) DOI: 10.1016/j.patcog.2008.10.021
- [Farrahi2010] Reza Farrahi Moghaddam, and Mohamed Cheriet, A multi-scale framework for adaptive binarization of degraded document images, Pattern Recognition, Volume 43, Issue 6, p.2186–2198 (2010) DOI: 10.1016/j.patcog.2009.12.024
- [Cheriet2012] Mohamed Cheriet, Reza Farrahi Moghaddam, and Rachid Hedjam, A learning framework for the optimization and automation of document binarization methods, Computer Vision and Image Understanding, accepted (2012) DOI: 10.1016/j.cviu.2012.11.003
- Train Set Meta Data (0 Mb)
- Test Set Meta Data (0 Mb)
- Otsu_PHIBD_2012 baseline method (0.39 Mb)
- PC_PHIBD_2012 baseline method (0.33 Mb)
- SGLBGL_PHIBD_2012 baseline method (0.36 Mb)
- SGLBGL_PC_PHIBD_2012 baseline method (0.36 Mb)
- SGLBGL_PC_TrainTest_PHIBD_2012 baseline method (0.36 Mb)
- SGLBGL_metacode.m Sample Program (0 Mb)