Ontea: Pattern based Semantic Annotation Platform


Ontea processes email or text documents and finds relevant objects, also grouping them into segments such as paragraphs, sentences, or objects with properties, for example an address decomposed into street, zip and city. The platform also contains a graphical user interface, which shows the identified objects in the text of an email message or text file. The platform also analyses HTML, PDF and Word email attachments. Ontea uses two main techniques for information extraction:

Found objects are then transformed into object trees with properties.
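
The idea of finding an object with a pattern and decomposing it into properties can be illustrated with a minimal Java sketch. It is not Ontea's actual API: the class AddressAnnotationSketch, the FoundObject type, the annotate method and the address pattern itself are all hypothetical names and assumptions made for illustration. The pattern assumes addresses written as "<street> <number>, <5-digit zip> <city>".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AddressAnnotationSketch {

    /** Hypothetical found object: a type, its properties, and child objects. */
    static final class FoundObject {
        final String type;
        final Map<String, String> properties = new LinkedHashMap<>();
        final List<FoundObject> children = new ArrayList<>();

        FoundObject(String type) {
            this.type = type;
        }
    }

    // Assumed address layout: "<street> <number>, <5-digit zip> <city>".
    // The street is one capitalized word, optionally followed by one lowercase word.
    private static final Pattern ADDRESS = Pattern.compile(
            "(?<street>\\p{Lu}\\p{L}+(?: \\p{Ll}+)? \\d+), (?<zip>\\d{5}) (?<city>\\p{Lu}\\p{L}+)");

    /** Finds addresses in the text and returns each one as a small object tree. */
    static List<FoundObject> annotate(String text) {
        List<FoundObject> found = new ArrayList<>();
        Matcher m = ADDRESS.matcher(text);
        while (m.find()) {
            FoundObject address = new FoundObject("address");
            address.properties.put("text", m.group());
            // Each named group becomes a child object with a "value" property.
            for (String part : new String[] {"street", "zip", "city"}) {
                FoundObject child = new FoundObject(part);
                child.properties.put("value", m.group(part));
                address.children.add(child);
            }
            found.add(address);
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Our office is at Dubravska cesta 9, 84507 Bratislava.";
        for (FoundObject address : annotate(text)) {
            System.out.println(address.type + ": " + address.properties.get("text"));
            for (FoundObject child : address.children) {
                System.out.println("  " + child.type + " = " + child.properties.get("value"));
            }
        }
    }
}
```

Running main prints the matched address followed by its street, zip and city parts. A real pattern set would need to cover many more address layouts than this single assumed one.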
The CVS repository contains the latest working code, including all features not yet included in releases.

More information:

Publications:

Release 1.0 has been issued. It now includes a graphical user interface (picture below). The new release does not yet have good documentation. It is based on work developed within the FP7 Commius project.

Install and start the latest Ontea version, 1.0:

Download version 1.0
Unzip it and start the executable jar file
Read the release README file for more details

Install and start Ontea version 0.8:

This distribution includes an implementation of regular expression patterns and integration with the Sesame API.

Download version 0.8
Required: a Java installation with Ant
Run the Ontea example with the following command:
 $ ant start
or by running the following class:
  ontea.example.text.Annotation

Contact Information:

Authors: Stefan Dlugolinsky, Michal Laclavik, Martin Seleng
Intelligent and Knowledge oriented Technology Group
Institute of Informatics, Slovak Academy of Sciences, Bratislava, Slovakia