Monday, May 16, 2011

Sentiment Analysis Processing For .NET Solutions

Sentiment Analysis (SA) is the process of determining 'tone' within the context of a given text. Today's world of CRM systems, blogging, and social networking has sparked an interest in harnessing the overall sentiment of that content (positive, negative, neutral, etc.). The algorithmic computations and processing required to perform sentiment analysis are based on the field of Natural Language Processing (NLP), and either NLP or SA could make up someone's career all on its own. Needless to say, it is probably not something a .NET developer wants to attempt from scratch.

The trend tends to be publicly available RESTful APIs, either free or paid, that let a developer submit data, have SA processing performed against it, and get the resulting sentiment information back.
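To make that round trip concrete, here is a minimal sketch in Python (the same shape carries over to .NET or any other language that can make an HTTP call). Everything specific in it is an assumption for illustration: the endpoint URL, the `text` request field, the bearer-token header, and the `label`/`score` response fields are hypothetical, and each real API defines its own URL, authentication, and JSON schema.

```python
import json
import urllib.request
from typing import Tuple

# Hypothetical endpoint, for illustration only; substitute the URL
# documented by whichever API you are evaluating.
API_URL = "https://api.example.com/v1/sentiment"


def build_request(text: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the text to analyze."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; many APIs use a query-string key instead.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def parse_response(raw: bytes) -> Tuple[str, float]:
    """Extract an assumed (label, score) pair from a JSON response body."""
    payload = json.loads(raw.decode("utf-8"))
    return payload["label"], float(payload["score"])


if __name__ == "__main__":
    request = build_request("The support team was fantastic.", api_key="YOUR-KEY")
    # Sending it would be: urllib.request.urlopen(request).read()
    sample = b'{"label": "positive", "score": 0.87}'  # made-up response body
    print(request.get_method(), request.full_url)
    print(parse_response(sample))
```

The request is built separately from sending it so the payload can be inspected, and swapped for a different API's format, without making a network call.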

There are typically two main approaches to processing the data to determine sentiment: trained systems using custom data sets (not ADO.NET DataSets, actual text data), and out-of-the-box non-trained systems using a default data set. The latter is the simplest approach because you can be up and processing data in a matter of minutes. The downside is much less accurate results (my tests averaged about 60-70% accuracy across the board using non-trained systems). A trained system using custom data sets with terminology and keywords specific to your need or industry would be the best approach and yield the most accurate results. The downside is that it is much more time consuming and involved, and it will most likely be available only with paid solutions.
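An accuracy figure like the one above can be measured by hand-labelling a small sample of your own text and counting how often the API agrees. The sketch below shows one simple way to do that; the function name and the sample labels are made up for illustration and are not the data behind my own tests.

```python
from typing import List


def accuracy(predicted: List[str], expected: List[str]) -> float:
    """Fraction of items where the API's label matches the hand-assigned label."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("need at least one labelled sample")
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)


# Made-up labels for illustration; not results from any real API.
hand_labels = ["positive", "negative", "neutral", "negative", "positive"]
api_labels = ["positive", "negative", "positive", "negative", "neutral"]

print(f"accuracy: {accuracy(api_labels, hand_labels):.0%}")  # prints "accuracy: 60%"
```

Running the same labelled sample through each candidate API gives a like-for-like comparison on your own kind of content, which matters more than any vendor's general claim.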

When I first researched sentiment analysis in relation to .NET applications, I truthfully knew nothing about it (not even that it was called sentiment analysis) and was looking for an out-of-the-box widget from CodePlex or something in the form of a .dll. Wrong! But I did learn about the different APIs available and compiled a list. Below are the names and links of several Sentiment Analysis APIs that could be used for .NET applications, and really for other programming languages too, because most are RESTful APIs. (I apologize for the URL data below being an image, but formatting it all or entering it manually was going to be a nightmare.)

Table 1.0: Sentiment Analysis

As I mentioned before, this isn't such a straightforward need that I can tell you, "Pick this one best sentiment analysis API." You really need to try them out independently, as I did, and become familiar with each tool and its capabilities. Depending on load, a lot of these tools can get very expensive. If you are just processing a few blog articles or some basic customer feedback now and then, you can probably use any of the solutions for free or close to it. However, the minute you want to process the data stored in a CRM system or from a social networking site like Facebook or Twitter, you could be looking at a multi-thousand-dollar ongoing investment. For this reason you should research, become familiar with, and test each API listed to see how it performs.

Lastly, I am no expert on this subject, and the purpose of this article was really just to create the more organized starting point that I did not have. The two links below give a decent high-level description of both Sentiment Analysis and Natural Language Processing. I also welcome any comments on additional APIs others may have used and the details about them.

Sentiment analysis

Natural language processing