Character discovery in the sub-word shapes
Latest revision as of 16:29, 2 October 2011
Proposed By
Prof Mohamed Cheriet
Synchromedia Laboratory
ETS, Montréal, (QC) Canada
H3C 1K3
E-mail: mohamed.cheriet@etsmtl.ca
Tel: +1(514)396-8972
Fax: +1(514)396-8595
Description
Labels for 15 characters are provided in the ground truth. For each character, a classifier must predict whether that character is present in each sub-word shape; the classifier output is binary. Each character is evaluated separately. The Balanced Error Rate (BER) is used as the performance measure (see below for details).
As a reference, results can be compared with the published results in Table 2 of [1].
Evaluation Protocol
Cross-validation is proposed for evaluating this task. The average BER for each character is computed by repeating the training process 10 times. In each run, the database is randomly split into a training set and a test set, with the training set comprising 80 percent of the database. The proposed method is trained on the training set, and its performance is measured on the test set in terms of BER.
The BER is defined as:
BER = 0.5*(FP/(TN+FP) + FN/(FN+TP))
where
FP = False Positive
TP = True Positive
FN = False Negative
TN = True Negative
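The BER formula can be illustrated with a short Python sketch. It is only an illustration, not part of the official evaluation tools: the function name `balanced_error_rate`, the encoding of labels as 0 (character absent) and 1 (character present), and the toy label vectors are all assumptions made for this example.

```python
def balanced_error_rate(y_true, y_pred):
    """Return 0.5 * (FP / (TN + FP) + FN / (FN + TP)) for 0/1 labels.

    Assumption for this sketch: 1 means the character is present in the
    shape, 0 means it is absent.
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    if tn + fp == 0 or fn + tp == 0:
        # The formula divides by the size of each class, so both must occur.
        raise ValueError("BER is undefined when one class is absent")
    return 0.5 * (fp / (tn + fp) + fn / (fn + tp))


# Hypothetical toy data: 2 shapes contain the character, 8 do not.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

# TP = 1, FN = 1, FP = 1, TN = 7, so BER = 0.5 * (1/8 + 1/2) = 0.3125
print(balanced_error_rate(y_true, y_pred))
```

In this toy case the plain error rate would be 2/10 = 0.2, while the BER is 0.3125, because errors on the rare positive class weigh as much as errors on the frequent negative class. A classifier that always predicts "absent" would score a BER of 0.5.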
The average BER is calculated over the 10 runs.
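The protocol for a single character can be sketched as follows. This is a minimal sketch under stated assumptions rather than a reference implementation: the names `ber`, `average_ber` and `train_fn`, the fixed random seed, and the convention that `train_fn` returns a prediction function are all hypothetical and chosen for illustration. The split is purely random, as the protocol describes, so a run whose test set happens to contain only one class raises an error here.

```python
import random


def ber(y_true, y_pred):
    """Balanced Error Rate for 0/1 labels (1 = character present)."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)  # TN + FP
    positives = len(y_true) - negatives           # FN + TP
    if negatives == 0 or positives == 0:
        raise ValueError("test set must contain both classes")
    return 0.5 * (fp / negatives + fn / positives)


def average_ber(shapes, labels, train_fn, n_runs=10, train_fraction=0.8, seed=0):
    """Average BER over n_runs random train/test splits for one character.

    train_fn(train_shapes, train_labels) is a hypothetical callback that
    must return a function mapping one shape to a 0/1 prediction.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    indices = list(range(len(shapes)))
    n_train = round(train_fraction * len(indices))
    scores = []
    for _ in range(n_runs):
        rng.shuffle(indices)  # new random split in every run
        train_idx, test_idx = indices[:n_train], indices[n_train:]
        predict = train_fn([shapes[i] for i in train_idx],
                           [labels[i] for i in train_idx])
        y_true = [labels[i] for i in test_idx]
        y_pred = [predict(shapes[i]) for i in test_idx]
        scores.append(ber(y_true, y_pred))
    return sum(scores) / len(scores)
```

For the full task, a routine like this would be called once per character, each time with that character's binary labels, giving 15 average BER values.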
Related Dataset and Ground Truth Data
- IBN SINA: A database for research on processing and understanding of Arabic manuscripts images (originally proposed for v1.0 of the dataset)
References
[1] Reza Farrahi Moghaddam, Mohamed Cheriet, Mathias M. Adankon, Kostyantyn Filonenko, and Robert Wisnovsky, “IBN SINA: A database for research on processing and understanding of Arabic manuscripts images”, Proceedings of DAS’10, June 9-11, 2010, Boston, MA, USA.