The Research Department at Neofonie

Since our founding, we have consistently invested in research projects to keep developing and improving our technologies and solutions. This lets us guarantee our customers a competitive advantage, which they see in their own products in the form of powerful, innovative solutions, for example in search technology, dynamic communities, cloud computing, big data, language technology, and text mining.

Highly qualified computer scientists, engineers, mathematicians, physicists, and computational linguists lay the foundations for new technologies.

Digital documents are at the center of our research, which is fundamentally application-oriented. We investigate intelligent methods that make growing volumes of data usable, whether on the internet or within a company.

Neofonie is among the most distinguished companies in Germany in the field of semantics and search. We combine machine learning methods with semantic methods, i.e., linguistic methods for understanding the content of natural language. The goal: a technology that fully understands documents.



Natalja Friesen, Jörg Kindermann, Doris Maassen and Stefan Rüping:

Data Mining in Data-Intensive and Cognitively-Complex Settings: Lessons Learned from the Dicode Project

This book reports on cutting-edge research carried out within the context of the EU-funded Dicode project, which aims at facilitating and augmenting collaboration and decision making in data-intensive and cognitively complex settings.

(Published in Mastering Data-Intensive Collaboration and Decision Making, Springer, 2014)

Buy Online

Jürgen Bross, Heiko Ehrig:

Automatic Construction of Domain and Aspect Specific Sentiment Lexicons for Customer Review Mining

Automatically analyzing the opinions expressed in customer reviews is of high relevance in many application scenarios, e.g., market research, trend analysis, or reputation management. A great share of current sentiment analysis approaches makes use of special purpose lexicons that provide information about the polarity (e.g., positive or negative) of individual words and phrases. One major challenge is that the actual sentiment polarity of a specific expression is often context dependent (e.g., "long+ battery life" vs. "long- flash recycle time"). However, the vast majority of existing approaches focuses on creating general purpose lexicons. Especially in the context of mining customer review data, the use of such lexicons is rather suboptimal as they fail to adequately reflect the domain specific lexical usage. We propose a novel method that makes it possible to automatically adapt and extend existing lexicons to a specific product domain. We follow a corpus-based approach and exploit the fact that many customer reviews exhibit some form of semi-structure. The method is fully automatic and thus scales well across different product domains. Our experiments show that the extracted lexicons are highly accurate and significantly improve the performance in a sentiment classification scenario.

(Published in Proceedings of CIKM, 2013)

ACM Digital Library
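
The context dependence mentioned in the abstract above ("long" is positive for battery life but negative for flash recycle time) can be illustrated with a minimal sketch. This is not the authors' lexicon construction method; it only shows how an aspect-specific lexicon might be consulted before falling back to a general-purpose one. All names and entries (`GENERAL_LEXICON`, `ASPECT_LEXICON`, `polarity`) are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical general-purpose lexicon: word -> polarity (+1 positive, -1 negative).
GENERAL_LEXICON: Dict[str, int] = {"great": 1, "poor": -1}

# Hypothetical aspect-specific entries, modeled on the example in the abstract.
ASPECT_LEXICON: Dict[Tuple[str, str], int] = {
    ("battery life", "long"): 1,
    ("flash recycle time", "long"): -1,
}


def polarity(word: str, aspect: str) -> int:
    """Return the polarity of `word` for `aspect`; 0 means unknown or neutral."""
    key = (aspect.lower(), word.lower())
    # Prefer the aspect-specific entry when one exists.
    if key in ASPECT_LEXICON:
        return ASPECT_LEXICON[key]
    # Otherwise fall back to the general-purpose lexicon.
    return GENERAL_LEXICON.get(word.lower(), 0)
```

With these toy entries, `polarity("long", "battery life")` and `polarity("long", "flash recycle time")` return opposite signs, which a single general-purpose entry for "long" could not express.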

Jürgen Bross, Heiko Ehrig:

Terminology Extraction Approaches for Product Aspect Detection in Customer Reviews

In this paper, we address the problem of identifying relevant product aspects in a collection of online customer reviews. Being able to detect such aspects represents an important subtask of aspect-based review mining systems, which aim at automatically generating structured summaries of customer opinions. We cast the task as a terminology extraction problem and examine the utility of varying term acquisition heuristics, filtering techniques, variant aggregation methods, and relevance measures. We evaluate the different approaches on two distinct datasets (hotel and camera reviews). For the best configuration, we find significant improvements over a state-of-the-art baseline method.

(Published in Proceedings of CoNLL, 2013)

Download (PDF)

Matthias Wendt, Martin Gerlach, and Holger Düwiger:

Linguistic Modeling of Linked Open Data for Question Answering

While more and more semantic data is published on the Web, the question of how typical Web users can access this body of knowledge becomes of crucial importance. Therefore, there is a growing amount of research on interaction paradigms that allow end users to profit from the expressive power of Semantic Web standards while at the same time hiding the complexity behind an intuitive and easy-to-use interface.

(Published in Workshop co-located with the 9th Extended Semantic Web Conference, Heraklion, Greece, 2012)

Download (PDF)

Johannes Kirschnick, Torsten Kilias, Holmer Hemsen, Alexander Löser, Peter Adolphs, Heiko Ehrig, and Holger Düwiger:

A Marketplace for Web Scale Analytics and Text Annotation Services

In this paper, we present MIA, a cloud-based platform and a data marketplace with the goal to enable massive parallel processing of data from the German language Web. End users can describe their analytical task in a structured query language called MIAQL. We describe the functionality of the platform to gather relevant text data, extract information, join, aggregate, group and return results as database tables. In addition to this powerful functionality, MIA offers many cost savings through sharing annotations, text data, built-in analytical functions and third party text mining functions.

(Published in Proceedings of Coling 2014, Dublin, Ireland, 2014)


Steffen Kemmerer, Benjamin Großmann, Christina Müller, Peter Adolphs, Heiko Ehrig:

The Neofonie NERD System at the ERD Challenge 2014

This paper describes Neofonie NERD, our named entity recognition and disambiguation system submitted to the ERD Challenge 2014. The system consists of precomputed lexica and statistics from Wikipedia and Freebase, an efficient spotting component, and a context based disambiguation step. It was originally developed for the German language and has now been adapted for English for the first time. We achieved 70.0% F1-score in the final evaluation, which is 5.7 percentage points above the average of all participating teams.

(Published in ACM SIGIR 2014 Workshop, Gold Coast, Australia, 2014)

ACM SIGIR 2014 Workshops
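
The general pipeline named in the NERD abstract above (lexicon lookup, spotting, context-based disambiguation) can be sketched in a few lines. This is a toy illustration, not the Neofonie NERD implementation: the lexicon, the context sets, and the overlap-count scoring are all assumptions made for the example, and it only handles single-token mentions.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical lexicon: lowercase surface form -> candidate entity IDs.
LEXICON: Dict[str, List[str]] = {
    "paris": ["Paris_(France)", "Paris_Hilton"],
    "france": ["France"],
}

# Hypothetical context words associated with each entity.
CONTEXT: Dict[str, Set[str]] = {
    "Paris_(France)": {"france", "capital", "city"},
    "Paris_Hilton": {"hotel", "celebrity", "heiress"},
    "France": {"paris", "europe", "country"},
}


def spot(tokens: List[str]) -> List[Tuple[int, str]]:
    """Spotting: find token positions whose surface form is in the lexicon."""
    return [(i, tok) for i, tok in enumerate(tokens) if tok in LEXICON]


def disambiguate(tokens: List[str], position: int) -> str:
    """Pick the candidate whose context words overlap most with the other tokens."""
    context = set(tokens[:position] + tokens[position + 1:])
    candidates = LEXICON[tokens[position]]
    # On a tie, max() keeps the first candidate listed in the lexicon.
    return max(candidates, key=lambda entity: len(CONTEXT[entity] & context))


def link(text: str) -> List[Tuple[str, str]]:
    """Run spotting and disambiguation on whitespace-tokenized, lowercased text."""
    tokens = text.lower().split()
    return [(tok, disambiguate(tokens, i)) for i, tok in spot(tokens)]
```

For example, `link("Paris is the capital of France")` resolves "paris" to `Paris_(France)` because "capital" and "france" appear in its context set, while a sentence mentioning "hotel" and "heiress" would favor `Paris_Hilton`.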
→ All publications