The Street View Text Dataset
Created: 2012-10-06
Revision as of 12:03, 16 October 2012
Contact Author
Kai Wang
EBU3B, Room 4148
Department of Comp. Sci. and Engr.
University of California, San Diego
9500 Gilman Drive, Mail Code 0404
La Jolla, CA 92093-0404
Email: k...@cs.ucsd.edu
Current Version
1.0 (also available from the author's website: http://vision.ucsd.edu/~kai/svt/)
Keywords
OCR, Real Scene, Urban Scene, Scene Text, Word Spotting, Scene Text Recognition, Scene Text Detection, Scene Text Localization
Description
The Street View Text (SVT) dataset was harvested from Google Street View. Image text in this data exhibits high variability and often has low resolution. Outdoor street-level imagery has two notable characteristics: (1) image text often comes from business signage, and (2) business names are easily available through geographic business searches. These factors make the SVT set uniquely suited for word spotting in the wild: given a street view image, the goal is to identify words from nearby businesses. More details about the dataset can be found in our paper, Word Spotting in the Wild [1]. For our up-to-date benchmarks on this data, see our paper, End-to-end Scene Text Recognition [2].
This dataset only has word-level annotations (no character bounding boxes) and should be used for
- cropped lexicon-driven word recognition and
- full image lexicon-driven word detection and recognition.
If you need character-level training data, consider the Chars74K and ICDAR datasets instead.
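
To illustrate what "lexicon-driven" means in the two tasks listed above, here is a minimal sketch of snapping a noisy reading of a cropped word to the closest entry in a small lexicon. It is not the method used in the referenced papers: the edit-distance matching, the function names `edit_distance` and `match_to_lexicon`, the sample lexicon, and the noisy reading are all assumptions made for illustration.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def match_to_lexicon(raw_reading: str, lexicon: Sequence[str]) -> str:
    """Return the lexicon word closest to a noisy reading (case-insensitive)."""
    if not lexicon:
        raise ValueError("lexicon must not be empty")
    query = raw_reading.upper()
    return min(lexicon, key=lambda word: edit_distance(query, word.upper()))


# Hypothetical lexicon of words from nearby businesses, and a hypothetical
# noisy reading of a cropped word image (not taken from the dataset).
lexicon = ["STARBUCKS", "COFFEE", "PHARMACY", "HOTEL"]
print(match_to_lexicon("C0FFE", lexicon))
```

Restricting the output to a lexicon turns open-vocabulary recognition into a choice among a handful of candidates, which is why a small list of nearby business names can compensate for low-resolution, highly variable image text.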