Ontea: Pattern based Semantic Annotation Platform
Ontea processes email or text documents and finds relevant objects. It also groups them into segments such as paragraphs and sentences, or into objects with properties, such as an address decomposed into street, zip and city.
The platform also contains a graphical user interface, which shows the identified objects in the text of an email message or text file.
The platform also analyses HTML, PDF and Word email attachments.
Ontea uses two main techniques for information extraction:
- patterns based on regular expressions
- gazetteers (now working with GATE or OntoText gazetteers)
Found objects are then transformed into object trees with properties.
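The two techniques above can be illustrated with a minimal, self-contained Java sketch. It does not use Ontea's actual API: the class `ExtractionSketch`, the `Annotation` type, the address pattern and the organization list are all hypothetical and exist only to show how a regular expression with named groups and a simple gazetteer lookup can yield objects with properties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: none of these names come from the Ontea code base.
public class ExtractionSketch {

    /** A found object: its type, the matched text, and its properties. */
    public static final class Annotation {
        public final String type;
        public final String value;
        public final Map<String, String> properties = new LinkedHashMap<>();

        Annotation(String type, String value) {
            this.type = type;
            this.value = value;
        }
    }

    // Assumed address layout: "<Street name> <number>, <5-digit zip> <City>".
    private static final Pattern ADDRESS = Pattern.compile(
        "(?<street>[A-Z][a-z]+(?: [a-z]+)* \\d+), (?<zip>\\d{5}) (?<city>[A-Z][a-z]+)");

    // Assumed gazetteer: a fixed list of known organization names.
    private static final List<String> ORGANIZATIONS = Arrays.asList(
        "Slovak Academy of Sciences", "Institute of Informatics");

    public static List<Annotation> annotate(String text) {
        List<Annotation> found = new ArrayList<>();

        // Technique 1: regular expression pattern; named groups become properties.
        Matcher m = ADDRESS.matcher(text);
        while (m.find()) {
            Annotation address = new Annotation("Address", m.group());
            address.properties.put("street", m.group("street"));
            address.properties.put("zip", m.group("zip"));
            address.properties.put("city", m.group("city"));
            found.add(address);
        }

        // Technique 2: gazetteer lookup by exact substring match.
        for (String name : ORGANIZATIONS) {
            if (text.contains(name)) {
                found.add(new Annotation("Organization", name));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Please send it to Dubravska cesta 9, 84507 Bratislava. "
            + "Regards, Institute of Informatics";
        for (Annotation a : annotate(text)) {
            System.out.println(a.type + ": " + a.value + " " + a.properties);
        }
    }
}
```

In this sketch the address is a single object whose street, zip and city are stored as properties, which mirrors the decomposition described above. A real gazetteer would typically match against much larger lists and handle tokenization and case.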
The CSV repository contains the latest working code, including all features not yet included in releases.
More information:
Publications:
- M. Laclavik, S. Dlugolinsky, M. Seleng, M. Kvassay, E. Gatial, Z. Balogh, L. Hluchy
Email Analysis and Information Extraction for Enterprise Benefit
In Computing and Informatics, 2011, vol. 30, no. 1, p. 57-87. ISSN 1335-9150, Special Issue on Business Collaboration Support for micro, small, and medium-sized Enterprises
- Michal Laclavik, Martin Seleng, Marek Ciglan, Ladislav Hluchy
Ontea: Platform for Pattern based Automated Semantic Annotation
In Computing and Informatics, Vol. 28, 2009, 555-579, ISSN 1335-9150
Release 1.0 has been issued and now includes a graphical user interface (picture below). It does not yet come with good documentation. This release is based on work developed within the FP7 Commius project.
Install and start the latest Ontea version 1.0:
Download version 1.0
Unzip it and start the executable jar file
Read the release README file for more details
Install and start Ontea version 0.8:
This distribution includes an implementation of regular expression patterns and integration with the Sesame API
Download version 0.8
Required: a Java installation with Ant
Run the Ontea example with the following command:
$ ant start
or by running the following class:
ontea.example.text.Annotation
Contact Information:
Authors: Stefan Dlugolinsky, Michal Laclavik, Martin Seleng
Intelligent and Knowledge oriented Technology Group
Institute of Informatics, Slovak Academy of Sciences, Bratislava, Slovakia