Difference between revisions of "Softwares"

From TC11
== On-line handwriting ==
* [http://unipen.nici.kun.nl/uptools3/ '''uptools:'''] Tools for reading and processing files in the UNIPEN file format.
* [http://www.alphaworks.ibm.com/tech/comparehwr Comparison Tools for Handwriting Recognizers] using the UNIPEN format (Gene Ratzlaff, IBM)


== Off-line handwriting ==
* [http://esewww.essex.ac.uk/research/vasa/hueweb/huereal.html '''HUE:'''] a software toolkit which supports the rapid development and re-use of handwriting and document analysis systems (Univ. of Essex, UK).


== OCR ==
* [http://documents.cfar.umd.edu/ocr/ Public domain OCR software] (Univ. of Maryland, USA)
* [http://documents.cfar.umd.edu/resources/source/ Source code at the DIMUND server] (Univ. of Maryland, USA)
* [http://www.fmi.uni-passau.de/~buckley/OCR.html Optical Character Recognition sources]


== Pixels vs Vectors ==
* [http://sourceforge.net/projects/autotrace AutoTrace] bitmap to vector conversion


== Pattern classification ==
* [http://www-ai.cs.uni-dortmund.de/SOFTWARE/SVM_LIGHT/svm_light.eng.html Support-Vector Machine: SVM<sup>light</sup>] Well-designed light-weight package for experimentation with the support-vector classifier. Several kernel functions are supported. ASCII data files.
* [http://www.idiap.ch/learning/SVMTorch.html SVM Torch-II] is a new implementation of Vapnik's Support Vector Machine that works both for classification and regression problems, and that has been specifically tailored for large-scale problems (such as more than 20000 examples, even for input dimensions higher than 100).
* [http://www.itl.atr.co.jp/comp.speech/Section6/Recognition/myers.hmm.html Discrete-HMM kernel in C++] Originally developed for speech recognition, this generic package (ASCII data files!) allows for quick experimentation using discrete hidden-Markov modeling. The main program handles a single HMM, so multi-class recognition can be implemented with (Unix) scripts.
* [http://ic.arc.nasa.gov/ic/projects/bayes-group/autoclass/ '''AutoClass:'''] An unsupervised Bayesian classification program (NASA). Some data modeling (e.g., specifying all feature scale types) and structuring of the (ASCII) files is required.
* [http://astro.u-strasbg.fr/~fmurtagh/mda-sw/pca.c '''PCA:'''] Principal Components Analysis, compact single main program written in C. Reads ASCII input files.


== Information Retrieval ==
* [ftp://ftp.cs.cornell.edu/pub/smart/ '''SMART 11.0:'''] A package implementing the keyword vector-space approach for IR as introduced by Salton (1961). Source code is for SunOS, but has been ported to Linux by several [http://www.linuxgazette.com/issue13/smart.html groups]. There is extensive [http://broncho.ct.monash.edu.au/~maria/Smart/hands-on-tekst.html documentation] on the web.


== Tools for (linguistic) post processing ==
* [http://unipen.nici.kun.nl/scrawls/dictionaries/udi/ '''Word lists'''] of a few Western languages.
* [http://bobo.link.cs.cmu.edu/index.html/ftp-site/link-grammar/system-4.1/ '''Link Grammar 4.1:'''] A parser for English, written in C, by [http://bobo.link.cs.cmu.edu/index.html/ Temperley, Sleator and Lafferty] at Carnegie Mellon.
* [http://WWW-KSL-SVC.stanford.edu:5915/&service=frame-editor '''Ontolingua:'''] Semantic modeling tool on WWW by Stanford University. <br /> There is a European [http://ontolingua.nici.kun.nl mirror site.] Ontologies can be exported in a number of formats, including Kif, Clips, Loom and Prolog. This is a generic tool, but can be used for content-related or document-related modeling in the context of machine reading.

----

== Benchmarking Tools ==
* [algoval.html Algoval] Internet-based algorithm evaluation. Several benchmarks in the area of TC-11 are already present (digit recognition, dictionary search, region-of-interest (ROI) detection). Algorithms in Java can be uploaded and compared (Simon Lucas, Univ. of Essex).
* [http://documents.cfar.umd.edu/resources/source/ppanther/ PinkPanther] document-segmentation benchmarking.

----

= General =

== Learning and Optimization ==
* [http://www.emsl.pnl.gov:2080/proj/neuron/neural/systems/shareware.html Neural Networks]
* [http://www.aic.nrl.navy.mil/galist/src/ Genetic Algorithms]

Revision as of 11:40, 28 August 2009

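On-line handwriting

  • uptools: Tools for reading and processing files in the UNIPEN file format.
  • Comparison Tools for Handwriting Recognizers using the UNIPEN format (Gene Ratzlaff, IBM)
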
Off-line handwriting

  • HUE: a software toolkit which supports the rapid development and re-use of handwriting and document analysis systems (Univ. of Essex, UK).


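OCR

  • Public domain OCR software (Univ. of Maryland, USA)
  • Source code at the DIMUND server (Univ. of Maryland, USA)
  • Optical Character Recognition sources


Pixels vs Vectors

  • AutoTrace bitmap to vector conversion
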
Pattern classification

  • Support-Vector Machine: SVMlight Well-designed light-weight package for experimentation with the support-vector classifier. Several kernel functions are supported. ASCII data files.
  • SVM Torch-II is a new implementation of Vapnik's Support Vector Machine that works both for classification and regression problems, and that has been specifically tailored for large-scale problems (such as more than 20000 examples, even for input dimensions higher than 100).
  • Discrete-HMM kernel in C++ Originally developed for speech recognition, this generic package (ASCII data files!) allows for quick experimentation using discrete hidden-Markov modeling. The main program handles a single HMM, so multi-class recognition can be implemented with (Unix) scripts.
  • AutoClass: An unsupervised Bayesian classification program (NASA). Some data modeling (e.g., specifying all feature scale types) and structuring of the (ASCII) files is required.
  • PCA: Principal Components Analysis, compact single main program written in C. Reads ASCII input files.
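
The remark above that one HMM is handled per run, with multi-class recognition left to scripts, amounts to the usual scheme of one model per class: score the observation sequence under every class model and pick the best. The following is only a minimal sketch of that idea in Python, not the interface of the listed C++ package. The function names, the `models` dictionary and the toy parameters are all assumptions made for illustration.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Log P(obs | model) for a discrete HMM, via the scaled forward algorithm.

    obs   : non-empty list of symbol indices
    start : start[i]    = P(first state is i)
    trans : trans[i][j] = P(next state is j | current state is i)
    emit  : emit[i][k]  = P(symbol k | state i)
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for symbol in obs[1:]:
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        # Accumulate the log of each scaling factor to avoid underflow.
        log_p += math.log(scale)
        alpha = [
            sum(alpha[i] / scale * trans[i][j] for i in range(n)) * emit[j][symbol]
            for j in range(n)
        ]
    total = sum(alpha)
    return log_p + math.log(total) if total > 0.0 else float("-inf")


def classify(obs, models):
    """Return the label of the class model that best explains obs."""
    return max(models, key=lambda label: log_likelihood(obs, *models[label]))


# Hypothetical toy models: class "a" mostly emits symbol 0, class "b" symbol 1.
models = {
    "a": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]]),
    "b": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.2, 0.8]]),
}
print(classify([0, 0, 0, 1], models))
```

With a command-line tool that scores one model at a time, the same selection can be done by a shell script that runs the program once per class and compares the reported scores.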


Information Retrieval

  • SMART 11.0: A package implementing the keyword vector-space approach for IR as introduced by Salton (1961). Source code is for SunOS, but has been ported to Linux by several groups. There is extensive documentation on the web.
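
For readers unfamiliar with the keyword vector-space approach, the core idea can be sketched in a few lines of Python: documents and queries become term-count vectors, and documents are ranked by cosine similarity to the query. This is an illustrative sketch only. It does not reproduce SMART's own term-weighting schemes or file formats, and the function names and the use of raw term counts are assumptions.

```python
import math
from collections import Counter


def term_vector(text):
    """Map a text to a sparse vector of lower-cased term counts."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0


def rank(query, documents):
    """Return the documents sorted by decreasing similarity to the query."""
    q = term_vector(query)
    scored = [(cosine(q, term_vector(d)), d) for d in documents]
    return [d for _, d in sorted(scored, key=lambda pair: pair[0], reverse=True)]


docs = ["handwriting recognition tools", "genetic algorithms source code"]
print(rank("handwriting tools", docs))
```

A production system would normally replace the raw counts with weighted terms (for example tf-idf) and add stemming and stop-word removal.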


Tools for (linguistic) post processing

  • Word lists of a few Western languages.
  • Link Grammar 4.1: A parser for English, written in C, by Temperley, Sleator and Lafferty at Carnegie Mellon.
  • Ontolingua: Semantic modeling tool on WWW by Stanford University.
    There is a European mirror site. Ontologies can be exported in a number of formats, including Kif, Clips, Loom and Prolog. This is a generic tool, but can be used for content-related or document-related modeling in the context of machine reading.

Benchmarking Tools

  • Algoval: Internet-based algorithm evaluation. Several benchmarks in the area of TC-11 are already present (digit recognition, dictionary search, region-of-interest (ROI) detection). Algorithms in Java can be uploaded and compared (Simon Lucas, Univ. of Essex).
  • PinkPanther document-segmentation benchmarking.

General

Learning and Optimization

  • Neural Networks
  • Genetic Algorithms

The software packages listed on this page are, mostly and preferably, available as source code (C, C++, Tcl/Tk, Java) and require standard ASCII input files. Please do not hesitate to send me pointers to free source code in the area of text processing on the Internet.
schomaker@ai.rug.nl