# IAPR TC-11 (Reading Systems) Newsletter

## July, 2018

**Online, phone-friendly version:** July 2018 Newsletter

**Richard Zanibbi, TC-11 Communications Officer**

Report: The 2nd IAPR TC10/TC11 Summer School (La Rochelle, France)
==================================================================

The 2nd IAPR TC-10/TC-11 Summer School on Document Analysis (SSDA) was held in La Rochelle, from July 2nd to 6th, 2018. The aim of this summer school was to give new students in the field of DAR (Document Analysis and Recognition) an overview of the traditional approaches to processing and analysing documents, and also to focus on new trends such as deep learning, the use of interactive devices, document fraud issues, and so on. Talks were given by international researchers from France, Switzerland, Spain, Japan and the United States.

In addition to traditional oral talks, three practical sessions were held to apply some of the techniques addressed in the talks. The interactive sessions (such as the poster and self-introduction sessions) provided a good environment for exchanges between participants and senior researchers. Finally, a tour of the L3i laboratory was given, including demonstrations of research activities in the field of DAR.

Two awards were given. The poster award was given to Florian Westphal (Blekinge Institute of Technology, Sweden) for his poster titled "Efficient Binarization using Heterogeneous Computing and Interactive Learning". The excellence award was given to Tarin Clanuwat (Center for Open Data in the Humanities, Tokyo, Japan) for her outstanding participation.

This event gathered PhD students and junior researchers from 11 countries:

- France: 7
- Vietnam: 4
- Sweden: 2
- Finland: 2
- Germany: 2
- Indonesia: 2
- Pakistan: 1
- Austria: 1
- Tunisia: 1
- Italy: 1
- Japan: 1

Thanks to the support of the IAPR, some travel grants and fee waivers were provided for participants with limited resources.
Moreover, 17 master's students (16 from China and 1 from Brazil) attended this summer school in order to discover the field of Document Analysis.

**Jean-Christophe Burie, SSDA General Chair**

Dates and Deadlines
===================

Deadlines
---------

- **Sept. 28:** Paper submission deadline for [IWRR](http://www.cvc.uab.es/iwrr2018/?c=home).
- **Dec. 7/15th:** Abstract/paper submission deadline for [ICDAR 2019](http://www.icdar2019.org). ([Call for Papers](www.icdar2019.org/resources/PreliminaryCFP-icdar2019.pdf))

Upcoming Conferences and Events
-------------------------------

**2018**

- [ICFHR 2018](https://icfhr2018.org). Niagara Falls, USA (August 5-8, 2018)
- [ICPR 2018](http://www.icpr2018.org). Beijing, China (August 20-24, 2018)
- [DocEng 2018](https://doceng.org/doceng2018). Halifax, Canada (August 28-31, 2018)
- [WoRMS 2018](https://sites.google.com/view/worms2018). Paris, France (Sept. 20, 2018)
- [IWRR 2018](http://www.cvc.uab.es/iwrr2018/?c=home). Perth, Australia (Dec. 2, 2018)

**2019 and Later**

- [ICDAR 2019](http://www.icdar2019.org). Sydney, Australia (September 22-25, 2019)
- [ICFHR 2020](http://www.icfhr2020.org). Dortmund, Germany (September 8-10, 2020)

Calls for Papers
================

CFP: International Workshop on Robust Reading (IWRR)
----------------------------------------------------

**3rd International Workshop on Robust Reading (IWRR)**

ACCV 2018, Perth, Australia - December 2, 2018

[IWRR Web Page](http://www.cvc.uab.es/iwrr2018/?c=home)

**Important Dates:**

- September 28, 2018: Submission Deadline
- December 2, 2018: Workshop at ACCV

The 3rd IWRR workshop will be held in Perth, Australia, in conjunction with ACCV2018. The workshop aims to bring together computer vision researchers and practitioners with an interest in reading systems that operate on images acquired in unconstrained conditions, such as scene images and video sequences, born-digital images, wearable camera and lifelog feeds, social media images, etc.
The particular focus of the workshop is on the automatic extraction and interpretation of textual content in images, and on applications that use textual information obtained automatically by such methods.

IWRR2018 invites the submission of original, previously unpublished work and welcomes re-submissions of improved versions of papers that have been rejected in the ACCV2018 conference reviewing process. Workshop proceedings with accepted papers will be published along with the main conference proceedings by Springer in the *Lecture Notes in Computer Science (LNCS)* series.

**The topics of interest include, among others:**

- Word spotting and end-to-end reading systems
- Scene text based image retrieval
- Joint modelling of textual and visual information
- Text localisation, segmentation, and recognition in scene and born-digital images
- Reading and tracking scene and/or overlaid text in video sequences
- Robust reading applications (e.g., translation, reading text for the blind, etc.)
- Performance evaluation and metrics

Visit the IWRR website for more information.

**Dimosthenis Karatzas, Workshop Co-Organizer**

IJDAR
=====

IJDAR: New Issue (Vol. 21, Issue 1 - repost)
--------------------------------------------

**Table of Contents**

Click on the links to go directly to the Springer Link page for each article.
- [Text and non-text separation in offline document images: a survey.](http://alerts.springer.com/re?l=D0In6ahl0I6hfgfd5Iq) Showmik Bhowmik, Ram Sarkar, Mita Nasipuri & David Doermann
- [Recognition-based character segmentation for multi-level writing style.](http://alerts.springer.com/re?l=D0In6ahl0I6hfgfd5It) Papangkorn Inkeaw, Jakramate Bootkrajang, Phasit Charoenkwan, Sanparith Marukatat, Shinn-Ying Ho & Jeerayut Chaijaruwanich
- [Efficient document image binarization using heterogeneous computing and parameter tuning.](http://alerts.springer.com/re?l=D0In6ahl0I6hfgfd5Iw) Florian Westphal, Håkan Grahn & Niklas Lavesson
- [Making scanned Arabic documents machine accessible using an ensemble of SVM classifiers.](http://alerts.springer.com/re?l=D0In6ahl0I6hfgfd5Iz) Randa Elanwar, Wenda Qin & Margrit Betke
- [A novel Arabic OCR post-processing using rule-based and word context techniques.](http://alerts.springer.com/re?l=D0In6ahl0I6hfgfd5I12) Iyad Abu Doush, Faisal Alkhateeb & Anwaar Hamdi Gharaibeh
- [Text box proposals for handwritten word spotting from documents.](http://alerts.springer.com/re?l=D0In6ahl0I6hfgfd5I15) Suman Ghosh & Ernest Valveny
- [Fusion of LLE and stochastic LEM for Persian handwritten digits recognition.](http://alerts.springer.com/re?l=D0In6ahl0I6hfgfd5I18) Rassoul Hajizadeh, A. Aghagolzadeh & M. Ezoji
- [Binarization of degraded document images based on contrast enhancement.](http://alerts.springer.com/re?l=D0In6ahl0I6hfgfd5I1b) Di Lu, Xin Huang & LiXue Sui
- [Handling noise in textual image resolution enhancement using online and offline learned dictionaries.](http://alerts.springer.com/re?l=D0In6ahl0I6hfgfd5I1e) Rim Walha, Fadoua Drira, Frank Lebourgeois, Christophe Garcia & Adel M. Alimi

IJDAR Discount for IAPR Members (repost)
----------------------------------------

IAPR is pleased to announce a partnership agreement with Springer, the publisher of IJDAR, the International Journal on Document Analysis and Recognition.
This new agreement will allow IAPR members to receive a subscription to the electronic version of IJDAR at a discount of nearly 50%. For additional details, see the links below:

- [http://www.iapr.org/publications/intjrnlsub.php](http://www.iapr.org/publications/intjournal.php)

**Koichi Kise, Daniel Lopresti and Simone Marinai, IJDAR Editors-in-Chief**

Datasets (repost)
=================

TC-11 maintains a collection of datasets that can be found online in the [TC-11 Datasets Repository](http://www.iapr-tc11.org/mediawiki/index.php/Datasets). If you have new datasets (e.g., from competitions) that you wish to share with the research community, please contact the TC-11 Dataset Curator (contact information is below).

**Andreas Fischer (TC-11 Dataset Curator)**

Careers
=======

Post-Doctoral Scholarship in Machine learning (Luleå)
-----------------------------------------------------

**Ref 1861-2018**

**Online Description:**

**Important Dates:**

- August 19th, 2018: Final day to apply

The newly established Machine Learning Research Laboratory, chaired by Prof. Marcus Liwicki, has several open positions in the area of Machine Learning. We can offer well-equipped laboratory facilities for performing research, and a good academic network both within Sweden and abroad. The Post-Doc Scholarship is 100% for a duration of at least one year (extendable to two years). Afterwards, hiring at LTU is also possible. The awarded candidate will receive a direct stipend of 28 000 SEK per month, which roughly corresponds to 2800 EUR and is 30% above the average net income in the area.

**Subject Description**

Machine learning focuses on computational methods by which computer systems use data to improve their own performance and understanding and to make accurate predictions. It has a close connection to applications.

**Post-Doctoral Projects**

Machine Learning, especially Deep Learning for applications in the area of Pattern Recognition (e.g., natural language processing, document processing, robotics, eHealth, unsupervised learning, and time series analysis). The Post-Docs should furthermore participate strongly in the acquisition of future projects.

**Qualifications**

To qualify for a position as a postdoctoral research fellow on scholarship, you must have a PhD or doctoral degree, or a foreign degree equivalent to a doctorate or doctoral degree. For this scholarship, a Ph.D. in computer science or the above-mentioned areas is expected, ideally with some experience in international research project collaboration.

We are looking for enthusiastic candidates capable of conducting state-of-the-art research. We expect candidates to have good knowledge of English, both in speech and writing, and the capacity to work independently as well as in teams. Participation in international research projects is a merit. As LTU is very strong in application-oriented research, industrial experience is very welcome.

**Information**

For further information, please contact: Professor (Chair) Marcus Liwicki and Professor Jonas Ekman (Head of Department)

**Reference number:** 1861-2018

**Marcus Liwicki, Chair, Machine Learning Research Laboratory (Luleå)**

IRISA/INSA Rennes (France): Research Engineer/PostDoc Position (3 Years)
------------------------------------------------------------------------

**Analysis systems for serial sources in collections of historical image documents**

**Pdf version:**

**Important Dates**

- September 1, 2018 - August 31, 2021: Contract period

**IRISA - Intuidoc**

IRISA is a joint research center for Informatics, including Robotics and Image and Signal Processing. Its 800 people, in 40 teams, explore the world of digital sciences to find applications in healthcare, ecology-environment, cyber-security, transportation, multimedia, and industry.
INSA Rennes is one of the 8 trustees of IRISA.

The Intuidoc team conducts research on the topic of document image recognition. For many years, the team has been developing a system, called the DMOS-PI method, for document structure analysis. The DMOS-PI method is used for document recognition and field extraction in archive documents, handwritten content, and damaged documents (musical scores, archives, newspapers, letters, electronic schema, etc.).

**EURHISFIRM project**

The EURHISFIRM European project aims at developing a research infrastructure to connect, collect, collate, align, and share reliable long-run company-level data for Europe, to enable researchers, policymakers and other stakeholders to analyze, develop, and evaluate effective strategies to promote investment and economic growth. To achieve this goal, EURHISFIRM develops innovative tools to spark a "Big data" revolution in the historical social sciences and to open access to cultural heritage.

EURHISFIRM is a project funded by the European Commission within the Infrastructure Development Program of Horizon 2020. The first phase of the Infrastructure Development Program lasts for three years. It aims at developing an in-depth design study of the Research Infrastructure. After this phase, Development and Consolidation Phases will follow if further applications are successful. EURHISFIRM brings together eleven research institutions in economics, history, information technologies and data science from seven European countries.

**Position to be filled**

- Position: Post-doctoral fellow / Research Engineer
- Time commitment: Full-time
- Duration of the contract: up to 36 months, starting as soon as possible
- Supervisors: Bertrand Coüasnon and Aurélie Lemaitre
- Indicative salary: Up to €36 000 gross annual salary (according to experience), with social security benefits
- Location: IRISA -- Rennes, France

**Missions**

The post-doctoral fellow / research engineer will work on two tasks of the EURHISFIRM workflow: the architecture of an adaptable system for document recognition, and the implementation of a generic structure layout extraction module.

The scientific challenge will be to extract information from various printed serial sources. Due to the large variety of those documents, a flexible and easy-to-adapt document recognition system will be designed. For that purpose, the system will be based on a modeling of knowledge not only at the page level but also at the collection level, in interaction with experts of the historical sources. Thus, redundancies between pages will be used to make the system more reliable and to reduce manual corrections while obtaining a high recognition quality.

The system will be based on the DMOS-PI method, which provides a framework for the analysis of collections of documents. Through an iterative analysis mechanism, it enables information from the collection to be shared between pages. This mechanism also makes it possible to integrate an asynchronous interaction between automatic analysis and human operators, in order to limit the time of interaction by avoiding mutual waiting. This modeling of the global analysis must be able to adapt to very different kinds of documents: from very structured documents, like stock exchange lists with redundancy and strong consistency between sequences of data, up to less structured documents, like yearbooks, even if, for them too, the sequence from one year to another is important for improving the recognition quality.
The implementation of a generic structure extraction module will be based on the DMOS-PI method. It uses a grammatical language, EPF (Enhanced Position Formalism), to describe a general page layout, with perceptive vision mechanisms and an iterative analysis. The system will also combine structural methods with Deep Learning. For new collections, an adapted description of the document layout will be developed. This has to be done over a large range of structure levels: from very structured pages, like table structures from stock exchange lists, up to paragraph-oriented structures from yearbooks.

**Applicant Requirements**

- PhD, Master degree or Engineering degree in computer science
- Experience in document recognition, statistical analysis or deep learning
- Fluent English
- Skills in grammars and languages and/or logical programming are nice-to-have

For further information, please contact Bertrand Coüasnon and Aurélie Lemaitre. Applicants should send a curriculum vitae with a list of publications and the names and email addresses of up to three references.

**Bertrand Coüasnon, Director, Media and Interactions Department (IRISA)**

PostDoc and Researcher Positions at Uppsala University (repost)
---------------------------------------------------------------

**Jobs Advertisements:**

The group at the Centre for Image Analysis active in the field of Handwritten Text Recognition at Uppsala University is recruiting two new team members (PostDoc or Researcher positions). We encourage strong candidates to apply and join us at Uppsala. The appointments are for a maximum of two years - details may be found through the links above. Questions should be directed to Dr. Ingela Nyström (email provided below).

**Ingela Nyström, Professor, Uppsala University**

Student Industrial Internship Opportunities (IAPR - repost)
-----------------------------------------------------------

[IAPR's Industrial Liaison Committee](http://www.iapr.org/committees/committees.php?id=5&subid=53) is pleased to announce the opening of its Company Internship Brokerage List. The web page lists internship opportunities for students at different levels of education and specialism. We expect many additional internship opportunities to be listed here as the community becomes more aware of the site.

IAPR Company Internship Brokerage List:

**Bob Fisher, Chair, IAPR Industrial Liaison Committee**

Contributions and Subscriptions
==================================

**Call for Contributions:** To contribute news items, please send a short email to the editor, [Richard Zanibbi](mailto:rxzvcs@cs.rit.edu). Contributions might include conference and workshop announcements or reports, career opportunities, book reviews, or anything else of interest to the TC-11 community.

**Subscription:** This newsletter is sent to subscribers of the IAPR TC11 mailing list. To join the TC-11 mailing list, please click on [this link](https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=iapr-tc11&A=1). To manage your subscription, please visit the [mailing list homepage](https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=IAPR-TC11).

------------------------------------------------------------------------

IAPR TC-11 HOMEPAGE: [http://www.iapr-tc11.org](http://www.iapr-tc11.org)

The IAPR is the International Association for Pattern Recognition. IAPR's Technical Committee No. 11 (TC-11) includes researchers and practitioners working with Optical Character Recognition (OCR), and more generally the analysis and recognition of information in documents.