TUTORIALS
The following tutorials will be organized at the ICDAR 2011 venue on September 18, prior to the main conference.
 

T0. Automated Forensic Handwriting Analysis
     Room #3 of Building 8

T1. Build Your Own Handwriting Recognizer
     Room #1 of Friendship Palace

T2. Performance Evaluation in Document Image Analysis
     Room #3 of Friendship Palace

T3. Ancient Document Analysis and Recognition Systems – Data Acquisition, Conception, Development, and Evaluation
     Room #3 of Friendship Palace

T4. Key Topics in Administrative Document Analysis
     Room #4 of Friendship Palace

T5. Discriminative Markovian Models for Sequence Recognition
     Room #6 of Friendship Palace

T6. Ebooks: Challenges and opportunities for Document Analysis Research
     Room #6 of Friendship Palace

 


Registration Fee

  • 100USD (650CNY) for one day (or two half-days), except Tutorial T0 (included in Workshop AFHA)
  • 50USD (325CNY) for a half-day

One-day (or two half-day) registration includes tutorial attendance, one lunch ticket and two coffee breaks
Half-day registration includes tutorial attendance and one coffee break

To register for tutorials, please visit the Registration page.
 


T0. Automated Forensic Handwriting Analysis

Registration fee: 140USD (910CNY) for regular, 80USD (520CNY) for students

Presenters: Marcus Liwicki, Michael Blumenstein, Bryan Found, Charles Berger (or Reinoud Stoel)
Location: Room #3 of Building 8

Abstract:
AFHA 2011 takes a novel approach: it brings together researchers in the field of automated handwriting analysis and signature verification and experts from the forensic handwriting examination community. It is organized as a two-day combined workshop and tutorial. On the first day, an introductory tutorial on forensic handwriting examination will be given. This includes a description of the forensic point of view and examples of real casework, as well as a summary of important approaches in the area of automated handwriting examination. On the second day, a workshop on recent research activities will be held. First, participants with accepted report papers will have the opportunity to talk about their research. Subsequently, in a panel discussion session, all participants will be able to state their points of view and discuss selected topics of interest to the community.

 
 

T1. Build Your Own Handwriting Recognizer

Full day

Presenters: Gernot A. Fink, Szilard Vajda
Location: Room #1 of Friendship Palace

Abstract:
Today, the automatic recognition of machine-printed text is considered an almost solved problem. Therefore, research now focuses on machine reading of handwriting, a considerably more challenging task. With the introduction of the statistical paradigm to the field of automatic handwriting recognition, it became possible to build extremely successful recognizers based on Markovian models, which offer parameter estimation from samples and perform segmentation and classification in an integrated manner. However, to build a successful handwriting recognizer using Markovian models, one has to be familiar not only with the theoretical concepts behind hidden Markov models and the often accompanying Markov chain models, but also with many practical and application-specific aspects that are hardly covered by pattern recognition textbooks.

The tutorial, which presents concepts, methods and algorithms, will be organized in two major parts: a (shorter) lecture part introducing the necessary conceptual background, including the architecture of a state-of-the-art statistical handwriting recognition system, and a hands-on lab part in which a handwriting recognition system for a small real-world task will be built using the open-source development environment ESMERALDA.

 

T2. Performance Evaluation in Document Image Analysis

Half-day (morning)

Presenters: Apostolos Antonacopoulos, Basilis Gatos, Stefan Pletschacher
Location: Room #3 of Friendship Palace

Abstract:
Performance evaluation, based on objective measures and representative datasets, is crucial to making real progress. This tutorial will cover the key issues in performance evaluation in the most widely researched but, at the same time, more difficult to assess areas of Document Image Analysis. Whilst the focus will be mostly on Layout Analysis and OCR, the evaluation of other areas such as binarisation and geometric correction will be mentioned. All aspects of performance evaluation will be examined, from collecting a representative sample to ground truthing to defining evaluation metrics and scenarios to interpreting the results. Participants will learn about the state of the art and gain valuable insights into the design, implementation and running of performance evaluation systems. Most importantly, they will be given copies of software tools and will be guided through example ground-truthing and evaluation workflows.

 

T3. Ancient Document Analysis and Recognition Systems – Data Acquisition, Conception, Development, and Evaluation

Half-day (afternoon)

Presenters: Volker Märgner and Haikal El Abed
Location: Room #3 of Friendship Palace

Abstract:
The aim of the tutorial “Ancient Document Analysis and Recognition Systems – Data Acquisition, Conception, Development, and Evaluation” is to lay the foundations and to encourage further discussions on the development of ancient document analysis and recognition systems, especially for the recognition of printed and handwritten historic text. Researchers and practitioners working in the field of pattern recognition will be introduced to handwritten text recognition systems in general, different state-of-the-art approaches, the steps of a system evaluation process, and techniques to improve recognition quality. An important objective of this tutorial is to provide a basis for designing a ground-truthing framework for metadata, structure, and content information for ancient documents. The presented methods include detailed analysis of a recognition system, the relation between structured data and system performance, reject/combination strategies, and post-processing approaches. After this tutorial, participants should be able to design their own recognition system or improve an existing one.

T4. Key Topics in Administrative Document Analysis

Full day

Presenters: Vincent Poulain d'Andecy, Jean Marc Ogier, Jose A. Rodriguez, Marçal Rusiñol, Dimosthenis Karatzas, Josep Llados
Location: Room #4 of Friendship Palace

Abstract:
Businesses and organizations, along with their clients, create a massive amount of documents (faxes, letters, forms, invoices, etc.) that more often than not have to be dealt with in close to real time. These are vital communications with clients, providers and other stakeholders that flow into, through, and out of the organization. Processing paper-based correspondence is a work-intensive task. Letters are opened, read, sorted, routed and delivered. Depending on the contents, the contained documents are then forwarded to the appropriate recipient for the required action. The needs of the market have been the driving force behind a huge amount of research and development across the document lifetime, from digitization to image analysis and from indexing and classification to knowledge management, re-purposing and routing. The collective application of the above processes to the management of document flows at large scale is known as the Digital Mail Room.

Document Analysis research provides solutions for automating the screening process and determining the document type (whether invoice, contract, letter, etc.), and for extracting the relevant information from each document with minimal human intervention. This information is stored in appropriate databases for future querying and for feeding outbound communications.

This tutorial will review the key document analysis techniques involved in document workflow management. The agenda is organized as self-contained lectures given by invited specialists. It is sponsored by the FP7 ADAO Project.

T5. Discriminative Markovian Models for Sequence Recognition

Half-day (morning)

Presenters: Thierry Artières, Alain Biem
Location: Room #6 of Friendship Palace

Abstract:
Sequence modeling is a key component in pattern recognition and data mining, and Hidden Markov Models (HMMs) are very popular models for this task. Maximum Likelihood learning of HMMs is widely used because it is both simple and efficient, and it scales well to large corpora. However, it does not focus on minimizing the classification (or segmentation) error rate.

Two approaches have been explored to improve the discriminative power of HMM-based systems. The first approach replaces the MLE criterion with a discriminative criterion such as the Maximum A Posteriori criterion, Conditional Maximum Likelihood, or a criterion that is even more closely related to the error rate, such as the Minimum Classification Error and Large Margin criteria. The second approach exploits new Markovian structures to achieve intrinsic discrimination, either through probabilistic models (Maximum Entropy Markov Networks, Conditional Random Fields) or through non-probabilistic models as proposed in the structured output prediction framework (Hidden Markov Support Vector Machines, Maximum Margin Markov Networks).

This tutorial aims to provide an overview of existing methods in the two families mentioned above. It will provide the technical basis for understanding the strengths and weaknesses of these methods and for identifying potential implementation difficulties.

 

T6. Ebooks: Challenges and opportunities for Document Analysis Research

Half-day (afternoon)

Presenter: Simone Marinai
Location: Room #6 of Friendship Palace

Abstract:
In recent years, interest in e-book readers has been growing, following the growth in sales of electronic books. Two main document formats are accepted by most devices: PDF and ePub. The PDF format is widely used to share documents because it allows cross-platform readability. However, it is not ideal for comfortable reading on small screens. In contrast, the ePub format is re-flowable and is well suited to e-book readers.

In this tutorial we analyze the challenges and opportunities for Document Analysis research with respect to these devices and document formats. In particular, we first describe the main features of dedicated e-book readers and of the various file formats supported by most devices. We will subsequently analyze the standard ePub format in more detail, with a hands-on demonstration of the most popular open-source software for conversion and editing of ebooks.

In the second part we point out the problems faced by most tools when converting complex documents such as scientific and technical papers. We first analyze one system that we developed for the conversion of PDF books to ePub. In this system we invert the text formatting applied during pagination. To this end, layout analysis techniques are performed at the book level in order to identify the book's table of contents and the main functional areas of the book, such as chapters, paragraphs, and notes.

In the last part we will address ongoing research related to the conversion of scientific and technical documents, which are more difficult to handle. In particular, the presence of mathematical equations, tables, and illustrations in multi-column layouts requires the integration of document analysis techniques with information extraction algorithms. Among others, techniques related to layout analysis, graphical symbol recognition, mathematical expression analysis, and table understanding are relevant in this application area. We will also discuss open problems related to properly displaying complex objects such as tables and chemical drawings on relatively small screens.

 

Updated: 2011-09-08