ICDAR 2003 Robust Reading Competitions
Created: 2011-06-28
Latest revision as of 13:06, 16 October 2012
Contact Author
Prof Simon Lucas, School of Computer Science and Electronic Engineering, University of Essex. Email: sml@essex.ac.uk
Current Version
1.0
Keywords
Scene text, character recognition, word recognition, text localization, robust reading
Description
The datasets below were created for the ICDAR 2003 Robust Reading competitions organised by Prof Simon Lucas and his team. You can find more details about these competitions at the ICDAR 2003 competition page (http://algoval.essex.ac.uk/icdar/Competitions.html).
Four independent competitions were organised: Robust Reading, Robust Word Recognition, Robust Character Recognition and Text Locating. These tasks were organised in a closed mode, meaning that the participants had to submit an operational version of their system for independent testing. The datasets used for the final performance evaluation are not available for any of the competitions.
The datasets provided are organised into Sample and Trial datasets.
Sample datasets are provided to give you a quick impression of the data, and also to allow function testing of your software. That is, you can run tests on the sample data to check that your software works with the data, but the results won't mean much.
Trial datasets serve two purposes. Use them to get results for your ICDAR 2003 papers. For this purpose, they are partitioned into two subsets: TrialTrain and TrialTest. Use TrialTrain to train or tune your algorithms, then quote results on TrialTest.
Robust Reading and Text Locating
The aim of the Robust Reading Competition is to find the best system able to read complete words in camera-captured scenes. This entails both locating the text in the image (in terms of bounding boxes of individual words) and recognising the text they contain.
The aim of the Text Locating Competition is only to locate the text regions in scenes.
The datasets for both Robust Reading and Text Locating Competitions are provided in a bundle. Each dataset is provided as a zip file, and contains a set of JPEG scene images, and three XML tag files: locations.xml, words.xml and segmentation.xml.
- locations.xml is for the Text Locating problem, and contains the path to each image and the set of rectangles for each image.
- words.xml is for the Robust Reading competition, and tags each image with the bounding rectangles of each word in the image together with the text in each rectangle.
- segmentation.xml is like words.xml, except that each word is also given its segmentation points, in case this information is useful to your algorithm (e.g. it may be used to speed up EM).
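The tag files described above can be read with any XML parser. The following is a minimal Python sketch, not a definitive loader: this page does not specify the XML schema, so the element and attribute names used here (tagset, image, imageName, taggedRectangle, x, y, width, height, tag) and the sample content are assumptions that should be checked against the actual files in the zip.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the style of words.xml. The element and attribute
# names are assumptions, not taken from this page; verify them against the
# real tag file before relying on this code.
SAMPLE_WORDS_XML = """<tagset>
  <image>
    <imageName>scene/example.jpg</imageName>
    <taggedRectangles>
      <taggedRectangle x="10.0" y="20.0" width="100.0" height="30.0">
        <tag>HELLO</tag>
      </taggedRectangle>
      <taggedRectangle x="15.0" y="60.0" width="80.0" height="25.0">
        <tag>WORLD</tag>
      </taggedRectangle>
    </taggedRectangles>
  </image>
</tagset>
"""


def parse_tagged_images(xml_text):
    """Return {image path: [(x, y, width, height, text_or_None), ...]}."""
    root = ET.fromstring(xml_text)
    images = {}
    for image in root.iter("image"):
        name = image.findtext("imageName")
        boxes = []
        for rect in image.iter("taggedRectangle"):
            # Rectangle geometry is assumed to be stored as attributes.
            box = tuple(float(rect.get(key)) for key in ("x", "y", "width", "height"))
            # A file without <tag> children (e.g. rectangles only) yields None.
            boxes.append(box + (rect.findtext("tag"),))
        images[name] = boxes
    return images


images = parse_tagged_images(SAMPLE_WORDS_XML)
for name, boxes in images.items():
    for x, y, w, h, text in boxes:
        print(name, (x, y, w, h), text)
```

If locations.xml follows the same assumed layout without the text element, the same function would return None in the text position of each tuple.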
Robust Word Recognition
The aim of this competition is to find the best system able to read single words that have been extracted from natural scenes.
Each dataset is provided as a zip file, and contains a set of JPEG images of single words and an XML tag file containing the ground truth transcriptions.
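As a rough illustration of how the ground truth transcriptions might be used, the Python sketch below loads a tag file into a dictionary and scores a set of predictions by exact string match. The XML layout (an imagelist of image elements with file and tag attributes), the sample entries, and the predictions are all assumptions made for illustration, and exact-match accuracy is only a simple example score, not necessarily the measure used in the competition.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample tag file. The <imagelist>/<image file=... tag=...>
# layout is an assumption; check it against the XML shipped in the zip.
SAMPLE_WORD_XML = """<imagelist>
  <image file="word/1/1.jpg" tag="Hello" />
  <image file="word/1/2.jpg" tag="WORLD" />
  <image file="word/1/3.jpg" tag="exit" />
</imagelist>
"""


def load_ground_truth(xml_text):
    """Map each image path to its ground truth transcription."""
    root = ET.fromstring(xml_text)
    return {img.get("file"): img.get("tag") for img in root.iter("image")}


def exact_match_accuracy(truth, predictions, case_sensitive=True):
    """Fraction of ground truth items whose prediction matches exactly.

    Images with no prediction count as errors.
    """
    if not truth:
        return 0.0
    normalise = (lambda s: s) if case_sensitive else str.lower
    correct = sum(
        1
        for path, label in truth.items()
        if path in predictions and normalise(predictions[path]) == normalise(label)
    )
    return correct / len(truth)


truth = load_ground_truth(SAMPLE_WORD_XML)
# Hypothetical recogniser output; the third image has no prediction.
predictions = {"word/1/1.jpg": "Hello", "word/1/2.jpg": "world"}
print(exact_match_accuracy(truth, predictions))
print(exact_match_accuracy(truth, predictions, case_sensitive=False))
```

Following the Trial dataset guidance above, such a score would be quoted on TrialTest after tuning on TrialTrain.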
Robust Character Recognition
The aim of this competition is to find the best system able to classify single characters that have been extracted from natural scenes.
Each dataset is provided as a zip file, and contains a set of JPEG images of single characters and an XML tag file containing the ground truth character classes.
References
- S.M. Lucas et al., "ICDAR 2003 Robust Reading Competitions: Entries, Results and Future Directions", Int. Journal on Document Analysis and Recognition, Vol. 7, Num. 2-3, pp. 105-122, 2005
- C. Wolf, J.-M. Jolion, "Object count/area graphs for the evaluation of object detection and segmentation algorithms", Int. Journal on Document Analysis and Recognition, Vol. 8, Num. 4, pp. 280-296, 2006
Submitted Files
Version 1.0
Robust Reading and Text Locating
- Sample Set (20 Images) (7.3 MB)
- TrialTrain Set (258 Images) (43.3 MB)
- TrialTest Set (251 Images) (69.6 MB)
Robust Word Recognition
- Sample Set (171 Words) (2.7 MB)
- TrialTrain Set (1157 Words) (17.5 MB)
- TrialTest Set (1111 Words) (19.6 MB)
Robust Character Recognition
- Sample Set (854 Characters) (3.0 MB)
- TrialTrain Set (6185 Characters) (22.3 MB)
- TrialTest Set (5430 Characters) (23.9 MB)