At SAPPHIRE NOW and ASUG 2013, Mantis Technology Group CEO Doug Turner and Senior Consultant Jim Egan gave a presentation explaining how they provide their customers with sophisticated, customizable “voice of the customer” solutions, based on SAP’s powerful text analysis technology.
Their product, Mantis Pulse Analytics, uses a combination of SAP BusinessObjects technologies to gather information from social feeds, analyze sentiment, and combine it with other corporate metrics to make the data actionable.
Voice of the Customer Analysis
There are lots of different sentiment analysis tools on the market, but many of them provide only shallow classification based on the presence of words like “bad” or “good” in a tweet.
Others, like SAP Social Media Analytics by NetBase, are very powerful and easy to use, but are not primarily designed to flexibly combine information from both internal and external data sources. Mantis uses SAP’s text analysis functionality to combine sentiment data with other information from the web, or from internal systems, to help make the data more actionable.
Originally acquired from a company called Inxight, SAP’s out-of-the-box “entity extraction” technology breaks documents down into sentences, then phrases, then concepts, and then applies sentiment analysis to those concepts.
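To make the sentence, phrase, concept, sentiment sequence concrete, here is a minimal sketch in Python. It is not SAP's engine: the tiny `CONCEPTS` and `SENTIMENT` lexicons, the splitting rules, and all function names are assumptions made up for illustration, and a real linguistic engine is far more sophisticated.

```python
import re

# Hypothetical toy lexicons; a real engine relies on much richer linguistic rules.
CONCEPTS = {"cabin", "food", "staff", "pool"}
SENTIMENT = {"loved": 1, "great": 1, "friendly": 1,
             "hated": -1, "dirty": -1, "cold": -1}

def split_sentences(document):
    """Break a document into sentences on terminal punctuation."""
    return [s.strip() for s in re.split(r"[.!?]+", document) if s.strip()]

def split_phrases(sentence):
    """Break a sentence into phrases on commas and the words 'but' and 'and'."""
    parts = re.split(r",|\bbut\b|\band\b", sentence, flags=re.IGNORECASE)
    return [p.strip() for p in parts if p.strip()]

def analyze(document):
    """Return (concept, score) pairs, one per known concept found in a phrase."""
    results = []
    for sentence in split_sentences(document):
        for phrase in split_phrases(sentence):
            words = re.findall(r"[a-z']+", phrase.lower())
            # Sentiment is scored per phrase, then attached to its concepts.
            score = sum(SENTIMENT.get(w, 0) for w in words)
            for word in words:
                if word in CONCEPTS:
                    results.append((word, score))
    return results

print(analyze("The staff were friendly, but the cabin was dirty. We loved the food!"))
# [('staff', 1), ('cabin', -1), ('food', 1)]
```

Scoring at the phrase level rather than the document level is what keeps the positive comment about the staff from cancelling out the complaint about the cabin.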
Here’s an example analysis on customer survey data from a cruise ship company:
Different social sources present a spectrum of different opportunities. Emoticon and acronym-filled teenage tweets may produce little useful sentiment information for a brand, while Facebook posts or entries on a customer feedback page are more likely to provide useful information, and a full customer survey (“please tell us what else we can do…”) will likely result in highly qualified, very clean information. Individual tweets can be processed very quickly, while entire documents naturally take much longer.
Using Sentiment Data to Improve Business
The number one question Mantis is asked is “what do I do with sentiment data?” The answer, of course, depends on the organization. It could be improving brand loyalty or customer service perception, or increasing sales – the goal is to improve the metrics organizations currently use and take better advantage of information that is currently being ignored.
Customers are now starting to expect the same level of response from a tweet or a post to a Facebook page as a call to your customer service center. If the tweet is ignored, it’s like not picking up the call, with negative consequences to your brand image. The good news is that the most important signals are likely to be the clearest. If, as a consumer, I decide to purposely leverage social media to get the attention of a brand, then I’m likely to use direct, easily understood language.
Once the relevant information has been processed, it can be presented to customer service representatives in priority order:
The Mantis Pulse Analytics system lets users reply directly on social media, or integrate it into existing email-based service ticket systems:
Over time, analysis has become more sophisticated and customized to individual needs. Data is collected from more channels including custom sources such as web surveys and information about particularly influential social media users. Existing CRM customer records can be expanded to include social media profile information.
Time-based correlation analysis allows the comparison of sentiment with things like sales promotions. This helps make the information more actionable. The example below shows sentiment combined with Google Analytics web metrics to show the correlation and success of a marketing campaign by different social channels.
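As a rough illustration of the time-based correlation idea (this is not the Mantis report itself), the sketch below computes a Pearson correlation between a daily sentiment series and a daily web-traffic series. The numbers are invented, and the function name `pearson` and both variable names are assumptions chosen for this example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Made-up daily figures for one week around a promotion.
daily_sentiment = [0.10, 0.15, 0.30, 0.45, 0.40, 0.20, 0.05]
daily_web_visits = [1200, 1350, 1900, 2600, 2400, 1500, 1100]

print(round(pearson(daily_sentiment, daily_web_visits), 2))
```

A value close to 1 would suggest that sentiment and traffic rose and fell together during the campaign; it says nothing about which one caused the other. Running the same calculation separately per social channel would give the per-channel comparison described above.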
Other examples of Mantis Pulse reporting:
The Challenges of Sentiment Analysis
Examples of problems and issues encountered by Mantis (if you’re interested in the subject, you should read articles by Seth Grimes of Alta Plana, and see presentations from the Sentiment Analysis Symposium he organizes each year):
- If somebody says “I hated the install of the new version, but loved the new features,” the sentiments don’t cancel each other out, but have to be separated and ranked separately for the top positive and negative lists.
- A long list of problems followed by a single “resolved” may badly underestimate the real sentiment.
- A consumer might wax poetic about chocolate in general for five enthusiastic sentences, but then say “but I hate M&Ms” in two different ways. In this case, it would probably be more useful to calculate the ratio of positive to negative sentiment as 1 to 1 rather than 5 to 2.
- Many words can be interpreted as either a verb or a noun, and quality will be negatively affected by poor sentence structure, punctuation, or run-on sentences that combine several different unrelated concepts.
- Not all clear brand sentiment is very actionable. For example, one of Mantis’ customers had predictable spikes of drunken “I love Bud beer!” tweets on Friday and Saturday nights.
- The band “Big Bad Voodoo Daddy” generated negative sentiment, as did the tag line of a clothing campaign “dress irresponsibly”, so those are “tokenized” to take them out of the equation. In addition, specific fields and language may be required for certain industries – a cruise ship is interested in recording the name of a ship, the cabin number, etc.
- Phrases that weren’t recognized had to be added, such as “abend” (from “abnormal end,” or crash).
- Some expletives have to be classed as positive sentiment, in phrases such as “just bought a guitar, it’s f****g awesome.”
- Call centers exist in order to deal with problems, so rather than just classifying “problem” as negative, it’s important to distinguish between phrases such as “you solved my problem” and “I have a new problem.”
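Two of the points above, keeping mixed sentiments separate and counting the chocolate example as 1 to 1 rather than 5 to 2, come down to aggregating by topic instead of by sentence. The following is a minimal sketch of that idea, assuming the text analysis step has already produced `(topic, score)` pairs; the function names and the input format are hypothetical.

```python
from collections import defaultdict

def topic_polarity(mentions):
    """Collapse repeated mentions into one net polarity (+1, -1 or 0) per topic."""
    totals = defaultdict(int)
    for topic, score in mentions:
        totals[topic] += score
    # (s > 0) - (s < 0) is the sign of s as an integer.
    return {topic: (s > 0) - (s < 0) for topic, s in totals.items()}

def positive_negative_counts(mentions):
    """Count topics, not sentences, on each side of the ledger."""
    polarities = topic_polarity(mentions).values()
    positives = sum(1 for p in polarities if p > 0)
    negatives = sum(1 for p in polarities if p < 0)
    return positives, negatives

# Five enthusiastic sentences about chocolate, two complaints about M&Ms.
chocolate = [("chocolate", 1)] * 5 + [("M&Ms", -1)] * 2
print(positive_negative_counts(chocolate))  # (1, 1)

# "I hated the install of the new version, but loved the new features"
mixed = [("install", -1), ("new features", 1)]
print(topic_polarity(mixed))  # {'install': -1, 'new features': 1}
```

Because each topic keeps its own polarity, “install” can appear on a top-negative list while “new features” appears on a top-positive list, instead of the two cancelling to zero.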
Mantis found that tuning the system for different customers meant adding specific custom dictionaries to the system. The underlying text analysis “CGUL” files are the same for all Mantis Pulse Analytics customers, and modifying them can generate unwanted side effects, since rules are based on other rules. Rather than change them, Mantis used SAP Data Services to make appropriate upfront changes to the data.
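The idea of changing the data upfront, rather than the shared rules, can be sketched in a few lines of Python. This is only an illustration of the approach, not Mantis's SAP Data Services job: the dictionary contents, the placeholder token names, and the function `mask_phrases` are all assumptions.

```python
import re

# Hypothetical per-customer dictionary: phrase -> neutral placeholder token.
NEUTRAL_PHRASES = {
    "Big Bad Voodoo Daddy": "BAND_NAME_1",
    "dress irresponsibly": "CAMPAIGN_TAGLINE_1",
}

def mask_phrases(text, phrases=NEUTRAL_PHRASES):
    """Replace known phrases with placeholder tokens, ignoring case."""
    lookup = {phrase.lower(): token for phrase, token in phrases.items()}
    if not lookup:
        return text
    # Longest phrases first, so a longer entry wins over a shorter one it contains.
    ordered = sorted(lookup, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(p) for p in ordered), re.IGNORECASE)
    # A single pass avoids re-matching inside an already inserted token.
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)

print(mask_phrases("Saw big bad voodoo daddy last night, great show"))
# Saw BAND_NAME_1 last night, great show
```

After masking, words such as “bad” or “irresponsibly” never reach the sentiment step, while the rest of the sentence (“great show”) is still scored normally.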
In general, it’s important to realize that sentiment analysis can never be an exact science, given the complexity of human language – and much language is fundamentally ambiguous, with different people classifying sentiment in different ways. Manual evaluation of data samples by Mantis and their customers indicates an accuracy rate of around 80% compared to human interpretation of sentiment.
In particular, humans have the benefit of a greater amount of “context knowledge” than can be discerned just from the text itself – which means there will always be false positives and negatives. This means that the results are much, much better than ignoring the data entirely, and can be used in aggregate to detect important trends, but have to be used with caution when deciding, for example, whether a customer service representative should keep his or her job.
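An accuracy figure like the one quoted above is typically obtained by comparing machine labels with human labels on the same sample. The sketch below shows one simple way to do that; the labels are invented, and the function names are assumptions rather than anything Mantis has described.

```python
from collections import Counter

def agreement_rate(machine, human):
    """Fraction of documents where the machine label matches the human label."""
    if len(machine) != len(human) or not machine:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(m == h for m, h in zip(machine, human)) / len(machine)

def disagreements(machine, human):
    """Count (human_label, machine_label) pairs where the two differ."""
    return Counter((h, m) for m, h in zip(machine, human) if m != h)

# Invented sample of ten manually reviewed documents.
human   = ["pos", "neg", "neg", "pos", "neu", "neg", "pos", "neu", "pos", "neg"]
machine = ["pos", "neg", "pos", "pos", "neu", "neg", "pos", "neg", "pos", "neg"]

print(agreement_rate(machine, human))  # 0.8
print(disagreements(machine, human))
```

Looking at which label pairs disagree, not just the overall rate, shows whether the errors are mostly missed complaints or mostly false alarms, which matters when the output is used to prioritize customer service work.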
The Underlying Technology
Mantis Pulse Analytics has existed for several years, in both on-premise and Amazon-cloud-based versions. Custom tools are used to collect the data, using standard social APIs and code to crawl specialized forms and blogs. In the original architecture, collected data was put into a MySQL holding area, then the sentiment was extracted using SAP BusinessObjects Data Services Text Data Processing to create a data mart, with reporting using SAP BusinessObjects.
Mantis Pulse Analytics was rumored to be one of the largest BusinessObjects text analytics installations ever, with a Text Analytics Farm of 20-25 concurrent versions of the text analytics engine.
Customers wanted ever-more powerful analysis, ever faster, and Mantis turned to SAP HANA One running in the Amazon cloud, which now provides the same text analysis engines, but built directly into the platform, and processed in-memory. The result is much faster, with fewer moving parts.
The data is collected directly in a SAP HANA table, and a full-text index is created, without any need for rollups or aggregate tables. Reporting is unchanged – except that sentiment analysis against millions of documents now runs in seconds.
Currently, all customer data is processed using the same configuration – in the future, the company may move to using multiple tables, each with its own customizable table index.
The SAP HANA-based solution currently provides a little less visibility into the text processing than the previous solution, but this is more than made up for by the dramatic advantages when it comes to reporting and analysis. “It’s coming to the state where the processing is at the speed of thought,” says Doug Turner. “Our customers are trying to get to the point where they can respond to a tweet before a customer leaves a store.”
There’s also a version of the solution designed specifically for sports organizations called Mantis Pulse Athletics, designed to “monitor, measure, and visualize social media and to reduce the risk of regulatory and organizational compliance issues.”