Difference between revisions of "Softwares"
From TC11
Line 1: | Line 1: | ||
− | + | == On-line handwriting == | |
+ | * [http://unipen.nici.kun.nl/uptools3/ '''uptools:'''] Tools for reading and processing files in the UNIPEN file format. | ||
+ | * [http://www.alphaworks.ibm.com/tech/comparehwr Comparison Tools for Handwriting Recognizers] using the UNIPEN format (Gene Ratzlaff, IBM) | ||
+ | == Off-line handwriting == | ||
+ | * [http://esewww.essex.ac.uk/research/vasa/hueweb/huereal.html '''HUE:'''] a software toolkit that supports the rapid development and re-use of handwriting and document analysis systems (Univ. of Essex, UK). | ||
+ | == OCR == | ||
+ | * [http://documents.cfar.umd.edu/ocr/ Public domain OCR software] (Univ. of Maryland, USA) | ||
+ | * [http://documents.cfar.umd.edu/resources/source/ Source code at the DIMUND server] (Univ. of Maryland, USA) | ||
+ | * [http://www.fmi.uni-passau.de/~buckley/OCR.html Optical Character Recognition sources] | ||
+ | == Pixels vs Vectors == | ||
+ | * [http://sourceforge.net/projects/autotrace AutoTrace] bitmap-to-vector conversion | ||
+ | == Pattern classification == | ||
+ | * [http://www-ai.cs.uni-dortmund.de/SOFTWARE/SVM_LIGHT/svm_light.eng.html Support-Vector Machine: SVM<sup>light</sup>] Well-designed light-weight package for experimentation with the support-vector classifier. Several kernel functions are supported. ASCII data files. | ||
+ | * [http://www.idiap.ch/learning/SVMTorch.html SVM Torch-II] is a new implementation of Vapnik's Support Vector Machine that works both for classification and regression problems, and that has been specifically tailored for large-scale problems (such as more than 20000 examples, even for input dimensions higher than 100). | ||
+ | * [http://www.itl.atr.co.jp/comp.speech/Section6/Recognition/myers.hmm.html Discrete-HMM kernel in C++] Originally developed for speech recognition, this generic package (ASCII data files!) allows for quick experimentation with discrete hidden-Markov modeling. The main program handles a single HMM, so multi-class recognition can be realized using (Unix) scripts. | ||
+ | * [http://ic.arc.nasa.gov/ic/projects/bayes-group/autoclass/ '''AutoClass:'''] An unsupervised Bayesian classification program (NASA). Some data modeling (e.g., specifying all feature scale types) and structuring of the (ASCII) files is required. | ||
+ | * [http://astro.u-strasbg.fr/~fmurtagh/mda-sw/pca.c '''PCA:'''] Principal Components Analysis, a compact single main program written in C. Reads ASCII input files. | ||
+ | == Information Retrieval == | ||
+ | * [ftp://ftp.cs.cornell.edu/pub/smart/ '''SMART 11.0:'''] A package implementing the keyword vector-space approach for IR as introduced by Salton (1961). The source code targets SunOS but has been ported to Linux by several [http://www.linuxgazette.com/issue13/smart.html groups]. Extensive [http://broncho.ct.monash.edu.au/~maria/Smart/hands-on-tekst.html documentation] is available on the web. | ||
+ | == Tools for (linguistic) post processing == | ||
+ | * [http://unipen.nici.kun.nl/scrawls/dictionaries/udi/ '''Word lists'''] of a few Western languages. | ||
+ | * [http://bobo.link.cs.cmu.edu/index.html/ftp-site/link-grammar/system-4.1/ '''Link Grammar 4.1:'''] A parser for English, written in C, by [http://bobo.link.cs.cmu.edu/index.html/ Temperley, Sleator and Lafferty ] at Carnegie Mellon. | ||
+ | * [http://WWW-KSL-SVC.stanford.edu:5915/&service=frame-editor '''Ontolingua:'''] Web-based semantic modeling tool by Stanford University. <br /> There is a European [http://ontolingua.nici.kun.nl mirror site.] Ontologies can be exported in a number of formats, including KIF, CLIPS, Loom and Prolog. This is a generic tool, but it can be used for content-related or document-related modeling in the context of machine reading. | ||
+ | ---- | ||
+ | == Benchmarking Tools == | ||
+ | * [algoval.html Algoval] Internet-based algorithm evaluation. Several benchmarks in the area of TC-11 are already present (digit recognition, dictionary search, region-of-interest (ROI) detection). Algorithms in Java can be uploaded and compared (Simon Lucas, Univ. of Essex). | ||
+ | * [http://documents.cfar.umd.edu/resources/source/ppanther/ PinkPanther] document-segmentation benchmarking. | ||
+ | ---- | ||
+ | = General = | ||
+ | == Learning and Optimization == | ||
+ | * [http://www.emsl.pnl.gov:2080/proj/neuron/neural/systems/shareware.html Neural Networks] | ||
+ | * [http://www.aic.nrl.navy.mil/galist/src/ Genetic Algorithms] | ||
+ | * [http://www.aic.nrl.navy.mil/~aha/research/machine-learning.html Machine Learning] | ||
+ | |||
+ | ---- | ||
+ | |||
+ | <font size="-1"> The software packages mentioned on this page are, mostly and preferably, available in source-code format (C, C++, Tcl/Tk, Java) and require standard ASCII input files. Please do not hesitate to let me know about free source code in the area of text processing on the Internet. <br />[mailto:schomaker@ai.rug.nl schomaker@ai.rug.nl ] </font> |
Revision as of 11:38, 28 August 2009
On-line handwriting
- uptools: Tools for reading and processing files in the UNIPEN file format.
- Comparison Tools for Handwriting Recognizers using the UNIPEN format (Gene Ratzlaff, IBM)
Off-line handwriting
- HUE: a software toolkit that supports the rapid development and re-use of handwriting and document analysis systems (Univ. of Essex, UK).
OCR
- Public domain OCR software (Univ. of Maryland, USA)
- Source code at the DIMUND server (Univ. of Maryland, USA)
- Optical Character Recognition sources
Pixels vs Vectors
- AutoTrace: bitmap-to-vector conversion
Pattern classification
- Support-Vector Machine: SVMlight Well-designed light-weight package for experimentation with the support-vector classifier. Several kernel functions are supported. ASCII data files.
- SVM Torch-II is a new implementation of Vapnik's Support Vector Machine that works both for classification and regression problems, and that has been specifically tailored for large-scale problems (such as more than 20000 examples, even for input dimensions higher than 100).
- Discrete-HMM kernel in C++: Originally developed for speech recognition, this generic package (ASCII data files!) allows for quick experimentation with discrete hidden-Markov modeling. The main program handles a single HMM, so multi-class recognition can be realized using (Unix) scripts.
- AutoClass: An unsupervised Bayesian classification program (NASA). Some data modeling (e.g., specifying all feature scale types) and structuring of the (ASCII) files is required.
- PCA: Principal Components Analysis, a compact single main program written in C. Reads ASCII input files.
Information Retrieval
- SMART 11.0: A package implementing the keyword vector-space approach for IR as introduced by Salton (1961). The source code targets SunOS but has been ported to Linux by several groups. Extensive documentation is available on the web.
Tools for (linguistic) post processing
- Word lists of a few Western languages.
- Link Grammar 4.1: A parser for English, written in C, by Temperley, Sleator and Lafferty at Carnegie Mellon.
- Ontolingua: Web-based semantic modeling tool by Stanford University.
There is a European mirror site. Ontologies can be exported in a number of formats, including KIF, CLIPS, Loom and Prolog. This is a generic tool, but it can be used for content-related or document-related modeling in the context of machine reading.
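The Discrete-HMM entry under Pattern classification notes that its main program handles a single model, so multi-class recognition is left to scripting. The following is a minimal sketch of that idea in Python and is not the package's own code: keep one discrete HMM per class, score the observation sequence under each with the scaled forward algorithm, and pick the best-scoring class. The model parameters, the class labels and the names `forward_log_likelihood`, `classify` and `MODELS` are hypothetical and chosen for illustration only; the sketch assumes a non-empty sequence of integer symbol indices.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under one HMM.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][k]  : probability of emitting symbol k while in state i
    The forward variables are rescaled at every step to avoid underflow.
    """
    n = len(start)
    # Initialisation with the first symbol (obs must be non-empty).
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: propagate through the transitions, then emit obs[t].
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


def classify(obs, models):
    """Return the label whose HMM gives the highest log-likelihood."""
    return max(
        models,
        key=lambda label: forward_log_likelihood(obs, *models[label]),
    )


# Two hypothetical 2-state models over the binary alphabet {0, 1}.
# Each entry is (start, trans, emit).
MODELS = {
    "mostly_zeros": (
        [0.5, 0.5],
        [[0.9, 0.1], [0.1, 0.9]],
        [[0.9, 0.1], [0.8, 0.2]],
    ),
    "mostly_ones": (
        [0.5, 0.5],
        [[0.9, 0.1], [0.1, 0.9]],
        [[0.1, 0.9], [0.2, 0.8]],
    ),
}
```

With these toy models, `classify([0, 0, 0, 1, 0], MODELS)` selects `mostly_zeros`. A shell script that runs a single-model program once per class and compares the reported scores would follow the same pattern.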
Benchmarking Tools
- Algoval: Internet-based algorithm evaluation. Several benchmarks in the area of TC-11 are already present (digit recognition, dictionary search, region-of-interest (ROI) detection). Algorithms in Java can be uploaded and compared (Simon Lucas, Univ. of Essex).
- PinkPanther: document-segmentation benchmarking.
General
Learning and Optimization
- Neural Networks
- Genetic Algorithms
- Machine Learning
The software packages mentioned on this page are, mostly and preferably, available in source-code format (C, C++, Tcl/Tk, Java) and require standard ASCII input files. Please do not hesitate to let me know about free source code in the area of text processing on the Internet.
schomaker@ai.rug.nl