ICDAR 2005 Robust Reading Competitions
Contact Author
Prof Simon Lucas, School of Computer Science and Electronic Engineering, University of Essex. Email: sml@essex.ac.uk
Current Version
1.0
Keywords
Scene text, character recognition, word recognition, text localization, robust reading
Description
The datasets below were created for the ICDAR 2005 Robust Reading competitions organised by Prof Simon Lucas. You can find more details about these competitions at the ICDAR 2005 competition page.
Four independent competitions were organised: Character Recognition, Word Recognition, Text Locating and Reading. These tasks were run in closed mode, meaning that participants had to submit an operational version of their system for independent testing. Of these tasks, training datasets are available only for the character recognition competition. The datasets used for final performance evaluation are not available for any of the competitions.
Character Recognition
The character recognition datasets are in the simple MNIST format, with images the same size as those in the original MNIST dataset (28x28). Each pixel is represented as a grey level in the range 0 (black) to 255 (white). A random selection of 10 digits from each class is shown in the image. All segmentation and labelling was performed while observing the full-size colour images (i.e. including the surrounding context).
Three datasets are provided, covering digits, lower case characters and upper case characters. For each dataset there is an images.bin file that contains the images in the MNIST format and a labels.bin file that contains the class labels in the MNIST format.
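The file pair described above can be read with a few lines of code. The following is a minimal sketch, assuming the files follow the standard MNIST IDX layout (big-endian 32-bit header fields, magic number 2051 for images and 2049 for labels, then one unsigned byte per pixel or label). The function names are illustrative only, and how the label bytes map to letters in the lower case and upper case sets is not stated here, so the labels are returned as raw integers.

```python
import struct

IMAGE_MAGIC = 2051  # assumed: standard MNIST IDX image header value
LABEL_MAGIC = 2049  # assumed: standard MNIST IDX label header value


def parse_idx_images(data):
    """Return a list of images, each a list of rows of 0-255 grey levels."""
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError(f"unexpected image magic number: {magic}")
    size = rows * cols
    if len(data) < 16 + count * size:
        raise ValueError("image data is shorter than its header claims")
    images = []
    for i in range(count):
        start = 16 + i * size
        pixels = data[start:start + size]
        # Split the flat pixel run into rows of `cols` values each.
        images.append([list(pixels[r * cols:(r + 1) * cols]) for r in range(rows)])
    return images


def parse_idx_labels(data):
    """Return the class labels as a list of raw integers."""
    magic, count = struct.unpack(">II", data[:8])
    if magic != LABEL_MAGIC:
        raise ValueError(f"unexpected label magic number: {magic}")
    if len(data) < 8 + count:
        raise ValueError("label data is shorter than its header claims")
    return list(data[8:8 + count])


def load_dataset(images_path, labels_path):
    """Read an images.bin / labels.bin pair and check that the counts agree."""
    with open(images_path, "rb") as f:
        images = parse_idx_images(f.read())
    with open(labels_path, "rb") as f:
        labels = parse_idx_labels(f.read())
    if len(images) != len(labels):
        raise ValueError("image and label counts differ")
    return images, labels
```

A call such as load_dataset("images.bin", "labels.bin") would then return the images and labels for one of the three datasets. If the magic number check fails, the files may use a different header from the one assumed here.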
In the case of the digits, the data is available as a directory tree of GIF images in addition to the MNIST format. This enables easy viewing without the need for a special-purpose application. For each image in the directory tree, the file c*.gif is the rectangular grey-level image of the character, normalised so that the maximum dimension is 56 pixels. The file n*.gif is the same image centred on a 28 x 28 square, with the margins filled with Gaussian noise (with mean and standard deviation derived from the statistics of that image).
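The noise-filled centring described for the n*.gif files can be illustrated as follows. This is only a sketch of the idea, not the procedure actually used to build the dataset: it assumes the character image has already been scaled to fit inside the square (the scaling step is omitted), uses the population standard deviation, and clips the noise to the 0 to 255 range. The function name centre_on_noise and its parameters are hypothetical.

```python
import random
import statistics


def centre_on_noise(image, size=28, rng=None):
    """Centre a grey-level image (list of rows) on a size x size noise canvas."""
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    if height > size or width > size:
        raise ValueError("image must already fit inside the target square")

    # Noise statistics come from the character image itself.
    pixels = [p for row in image for p in row]
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)

    def noise():
        # Draw one Gaussian sample and clip it to a valid grey level.
        return min(255, max(0, round(rng.gauss(mean, std))))

    canvas = [[noise() for _ in range(size)] for _ in range(size)]

    # Paste the original pixels over the centre of the canvas.
    top = (size - height) // 2
    left = (size - width) // 2
    for r, row in enumerate(image):
        canvas[top + r][left:left + width] = row
    return canvas
```

Filling the margins with noise matched to the image statistics, rather than a constant value, avoids a sharp artificial border between the character patch and its padding.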
References
- S.M. Lucas, "ICDAR 2005 Text Locating Competition Results", Proc. of the 8th Int. Conf. on Document Analysis and Recognition (ICDAR 2005), pp. 80-84, Vol. 1, 2005
Submitted Files
Version 1.0
Character Recognition
- Digits - Images (Directory Tree) (5 MB)
- Digits - Images (MNIST format) (750 KB)
- Digits - Labels (MNIST format) (1 KB)
- Lower Case Characters - Images (MNIST format) (4 MB)
- Lower Case Characters - Labels (MNIST format) (6 KB)
- Upper Case Characters - Images (MNIST format) (4 MB)
- Upper Case Characters - Labels (MNIST format) (6 KB)