ENTERPRISE SEARCH & CONTENT ANALYTICS BLOG
IntraFind experts share their insights into the world of enterprise search, metadata extraction and usage, and content analytics. Please feel free to share our blog with your colleagues or business partners.
Even today, technical documentation is rarely valued as a good or useful source of information. Too many incomprehensible or unwieldy documents have shaped the past and left behind frustrated or desperate users, not to mention experts and technicians who would rather trust their experience than long texts and inappropriate illustrations.
Read more … Content Delivery of Technical Information
Google recently announced to its Gold Partners that from 2017 onward it no longer wants to sell its Google Search Appliance, a hardware-software bundle that has been on the market since 2002. Instead, Google wants to serve companies worldwide with a cloud-based application. What is the Google Search Cloud all about? Here is our take on it.
Read more … Hardware was yesterday? That’s right, we have always seen it this way.
What makes a good search engine? You type a sequence of letters and the document is searched for all occurrences of this combination. This way you can quickly and easily find certain text passages within a document. For a user who wants to find out more about a particular topic or the use of a particular word, such a basic search functionality would certainly not be enough.
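The basic search described above can be sketched in a few lines of Python. This is only an illustration of plain substring matching; the function name `find_all` and the sample sentences are made up for this example and do not come from any IntraFind product.

```python
def find_all(text: str, query: str) -> list[int]:
    """Return the start index of every occurrence of query in text."""
    if not query:
        return []
    positions = []
    start = text.find(query)
    while start != -1:
        positions.append(start)
        # Continue one character further so overlapping matches are found too.
        start = text.find(query, start + 1)
    return positions


# Hypothetical sample sentence, lower-cased for case-insensitive matching.
document = "Searching is easy. A search engine searches documents.".lower()
print(find_all(document, "search"))

# A related word form with a different letter sequence is not found at all.
print(find_all("she sought help", "search"))
```

The second call returns an empty list: a purely letter-based search has no notion that "sought" and "search" are related, which is the limitation the teaser points to.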
Read more … Difficult Relations – How Search Engines Can Be Expanded by Means of Word Families
IntraFind has long been known for high-quality information retrieval. For our new product generation, iFinder5 elastic, we completely overhauled our core search technology, which consists of our Lucene / Elasticsearch Analyzers and our Query Parser. In part 1 of this blog article, I talk about the advantages for the standard user and how we are able to reduce configuration effort.
Read more … New High Quality Search and Linguistics for iFinder5 elastic
Identifying the language of a given text is a crucial preprocessing step for almost all text analysis methods. It has been considered a solved problem for more than 20 years. Available solutions build on the simple observation that every language has typical letter sequences (letter n-grams) that occur significantly more frequently in that language than in other languages.
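The n-gram observation can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the method used in any particular product: the two tiny training sentences, the names `ngrams`, `PROFILES` and `identify`, and the simple frequency-product score are all chosen for this example. Real systems train on large corpora and use more careful statistics.

```python
from collections import Counter


def ngrams(text: str, n: int = 3) -> Counter:
    """Count letter n-grams in text, padded with spaces to capture word boundaries."""
    padded = f" {' '.join(text.lower().split())} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


# Hypothetical, deliberately tiny training samples (assumption for this sketch).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and then "
          "this is what the search engine finds",
    "de": "der schnelle braune fuchs springt über den faulen hund und "
          "das ist was die suchmaschine findet",
}
PROFILES = {lang: ngrams(text) for lang, text in SAMPLES.items()}


def identify(text: str) -> str:
    """Return the language whose trigram profile overlaps most with the text."""
    grams = ngrams(text)

    def score(profile: Counter) -> int:
        # Trigrams that are frequent in both the text and the profile count most.
        return sum(count * profile[gram] for gram, count in grams.items())

    return max(PROFILES, key=lambda lang: score(PROFILES[lang]))


print(identify("this is the engine"))
print(identify("das ist die maschine"))
```

With these samples the first call yields `en` and the second `de`, because trigrams such as " th" and "the" dominate the English profile while "sch" and "die" dominate the German one.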
Read more … Language Identification and Language Chunking
"Stemming" as well as "Lemmatization" are commonly used buzzwords in the field of Information Retrieval (IR), particularly in the development of powerful search engines. [...]
So what exactly is the difference between these two methods? What are the advantages and disadvantages and which one should be preferred? [...]
Read more … The difference between stemming and lemmatization
Some say software developers draw their motivation from minimizing or maximizing numbers in any given problem. That's a smug innuendo. From my experience, developers are always on the lookout for beautiful solutions, of which numbers are but a symptom. The usage of approximative data structures for language processing is one such example of a beautiful idea with nice numbers. [...]
Read more … Approximative data structures for natural language processing