OpeNER’s main goal is to provide a set of ready-to-use tools for natural language processing tasks that are free and easy for SMEs to adapt and integrate into their workflows. More precisely, OpeNER aims to detect and disambiguate entity mentions and to perform sentiment analysis and opinion detection on texts, for example to extract the sentiment and opinions of customers about a given resource (e.g. hotels and accommodation) from Web reviews.


Customer reviews and ratings on the Internet are increasingly important in the evaluation of products and services by potential customers. In certain sectors, they are even becoming a fundamental variable in the purchase decision. A recent Forrester study showed that more than 30% of Internet users have evaluated products online, and that 70% of those studied end-user-generated reviews.

This trend will continue with the growth of Social Media and access to Information and Communication Technologies (ICT). Consumers tend to trust the opinion of other consumers, especially those with prior experience of a product or service, rather than company marketing. The role of user comments is of particular importance when there is little differentiation between product offerings.

Sentiment Analysis and Opinion Mining are established, although still young, fields of research, development and innovation. The goal is always broadly the same: to know “Who” is speaking about “What”, “When” and in “What sense”.

These factors have led to a burgeoning industry, with a plethora of companies offering Sentiment Analysis services for Social Media. While most offer a generic service, typically in just one language, several companies have specialised in services specific to tourism because of its bounded domain, demonstrable value, and the high level of adoption of Internet technologies by both suppliers and consumers.

Tourism is also an application domain with limited scope and variation, and one that depends heavily on multilingual sentiment analysis and on the detection and classification of a wide range of common Named Entities.

Named Entity Recognition and Classification (NERC) is important in determining roles. Once multilingualism and cultural skew are introduced, the complexity of the challenge increases manifold. OpeNER will create base technologies for Cross-lingual NERC and Sentiment Analysis that will enable industry users to both implement and contribute to a basic set of core technologies that they all require, allowing them to focus their efforts on providing tailored and innovative solutions at the rules and analysis levels.

The OpeNER project will provide a rich Named Entity Data Source in a simple, structured and standardised format. The Named Entity Detection will be capable of marking Named Entities in the same format irrespective of the text under analysis or the language of the text. The project will also provide linking modules that are capable of matching locally detected Named Entities with generic data.


OpeNER aims to provide enterprise and society with base technologies for Cross-lingual Named Entity Recognition and Classification and Sentiment Analysis through the reuse of existing resources and the open development of complementary technologies. The key objectives of the project are:

- Repurposing of existing language resources and generation of a reference generic multilingual sentiment lexicon with cultural normalisation and scales.
- An extension lexicon for the tourism sector in different languages (Spanish, Dutch, German, Italian, English and French).
- Named Entity Recognition and Classification in the same set of target languages as the Sentiment Lexicon, extensible to other languages by leveraging multilingual resources such as Wikipedia and Linked Data.
- Development and open availability of validated reference Sentiment and Opinion Mining techniques and tools based on the results of the project.
- Validation of the project results, principally in the tourism sector, with leading SMEs in the sector and with the support of several stakeholders as part of the End User Advisory Board.
- Research and trialling of models that will ensure that the project results are self-sustainable and economically viable in the long term.
- Achievement of the project's objectives by repurposing and leveraging existing state-of-the-art and established language resources.

Tags: library   nlp   ruby   python   jvm  

Last modified 03 May 2022