Understand, index, use documents - the Tagging Service
The Tagging Service helps companies automatically generate standardized metadata, keywords, and tags from document text and content for use in company processes. For instance, companies that already use an Enterprise Search solution can offer high-quality filters or evaluate document contents more effectively. In addition, uniformly tagged documents and editorial content are much easier to find.
The IntraFind Tagging Service consists of different, standardized, but also configurable tagging types:
- Free statistical tagging
- Controlled, statistical and rule-based tagging
- Entity extraction
- Text classification
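The difference between the first two tagging types can be illustrated with a minimal sketch. It is not IntraFind's implementation: the stop-word list, the function names `free_tags` and `controlled_tags`, and the plain term-frequency ranking are all assumptions chosen for illustration. Free tagging proposes whatever terms stand out statistically, while controlled tagging only returns terms from a predefined vocabulary.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a far larger one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def free_tags(text, top_n=3):
    """Free statistical tagging (simplified): most frequent non-stop-word terms."""
    counts = Counter(t for t in tokenize(text) if t not in STOP_WORDS)
    return [term for term, _ in counts.most_common(top_n)]


def controlled_tags(text, vocabulary):
    """Controlled tagging (simplified): only terms from a fixed vocabulary."""
    tokens = set(tokenize(text))
    return sorted(tag for tag in vocabulary if tag in tokens)


text = "The invoice lists the invoice number and the contract. The contract covers payment."
print(free_tags(text, top_n=2))
print(controlled_tags(text, {"contract", "payment", "audit"}))
```

A production tagger would typically weight terms against a reference corpus and handle multi-word phrases; the sketch only shows why a controlled vocabulary yields more uniform tags than free extraction.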
Tagging Service automatically identifies semantic descriptors (keywords, entities, subjects) in documents and additionally delivers a contextual view of each document. The extracted semantic descriptors can then be integrated into applications or into the company-wide search, or can even trigger further process chains. In short, Tagging Service handles content-based information extraction and the tagging of unstructured text.
In connection with a company-wide search or an Enterprise CMS, these semantic descriptors form a knowledge base of the entire document inventory and enable, among other things, semantic search in company data and the automatic tagging of documents. Tagging Service can be operated as software as a service (SaaS): it runs as an independent service, either on the customer's local server or on a remote server, and provides all important functions through web services.
The output can be integrated into any usage context at the customer's site, for example as a Word or Explorer plug-in that generates tagging suggestions for a newly written text, or as a back-end component for analyzing, tagging, and enriching mass data (e.g. files on file servers) with metadata.
Depending on the desired return format, different web services are available for the extraction of semantic descriptors (keywords, subjects, entities) from an unstructured text.
Two return formats are currently supported: HTML and XML. The HTML return lists the results of the information extraction in a table and is intended for demonstration and test purposes. A customer-specific format, including a different HTML layout, can also be defined and integrated readily. The XML format makes the extraction results available for further use by a program, for example by Microsoft SharePoint plug-ins or via Ajax in a website.
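How a program might consume the XML return can be sketched as follows. The actual response schema is not documented here, so the element name `descriptor`, the attributes `type` and `score`, and the sample payload are purely hypothetical; only the parsing pattern is the point.

```python
import xml.etree.ElementTree as ET

# Hypothetical response payload; the real schema of the service may differ.
SAMPLE_RESPONSE = """\
<taggingResult>
  <descriptor type="keyword" score="0.92">enterprise search</descriptor>
  <descriptor type="entity" score="0.81">IntraFind</descriptor>
  <descriptor type="subject" score="0.67">metadata</descriptor>
</taggingResult>
"""


def parse_descriptors(xml_text):
    """Group (text, score) pairs by descriptor type."""
    root = ET.fromstring(xml_text)
    result = {}
    for node in root.iter("descriptor"):
        kind = node.get("type", "unknown")
        value = (node.text or "").strip()
        score = float(node.get("score", "0"))
        result.setdefault(kind, []).append((value, score))
    return result


descriptors = parse_descriptors(SAMPLE_RESPONSE)
for kind, items in descriptors.items():
    print(kind, items)
```

Grouping by descriptor type keeps keywords, entities, and subjects separate, which is convenient when each is mapped to a different search filter or metadata field.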
Tagging Service scales linearly and is therefore also suitable for large volumes of data.
Metadata is, by definition, data that describes texts or a collection of data and documents. When intelligently structured, it reflects the content of a document: attributes such as the author, creation date, place of creation, and person responsible, but also the particularly relevant text passages, so that a contextual overview of an individual document is available without laborious reading. Furthermore, such metadata can be combined across documents and thereby provide important insight into the company's data.
In companies, metadata creation is either not established as a process at all or is often a manual one: journalists index their new articles, and employees index the content they upload to a SharePoint infrastructure. Metadata is thus also meant to ensure that, for example, employees provide information in a form that can be retrieved comprehensively. But does this work as a manual process?
IntraFind sees a big risk here, because manual metadata creation in particular is error-prone: it is done by humans and is therefore subjective. Each of us has a different view of what is really important in a text and indexes it to the best of our knowledge. A colleague on the same team may evaluate the content completely differently and index it differently. The person searching for information then suffers: they do not receive the desired document and have to spend far more time and patience searching for content.
IntraFind offers a comprehensive solution with Tagging Service. Microsoft SharePoint instances, file systems, DMS, and archiving systems are only a few of the systems for which it can provide fully automatic metadata enrichment.
Other application examples:
The metadata created with Tagging Service can be used for the following purposes:
- Enrichment of information
- Improvement of the search
- Classification tasks
- Control of workflows
- Identification of compliance-relevant data
- Identification of documents that are to be transferred to a DMS or archive for audit reasons
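As an illustration of how tags could control workflows or flag documents for archiving, the following minimal sketch maps tags to follow-up actions. The tag names, the action names, and the function `route_document` are hypothetical and not part of the product; a real integration would call the respective DMS, archive, or workflow system.

```python
# Hypothetical rule table: (tag, action triggered when the tag is present).
RULES = [
    ("personal-data", "flag_for_compliance_review"),
    ("invoice", "transfer_to_archive"),
    ("contract", "transfer_to_dms"),
]


def route_document(tags):
    """Return the actions triggered by a document's tags, in rule order."""
    tag_set = set(tags)
    return [action for tag, action in RULES if tag in tag_set]


print(route_document(["invoice", "personal-data"]))
print(route_document(["memo"]))
```

Keeping the rules in a table rather than in code branches makes it easy to adjust which tags trigger which process without touching the routing logic.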