Natural-Language Processing: After the Initial Buzz | Portfolio for the Future

A new white paper from Deutsche Bank Markets Research cautions traders that although they can enhance the value they get from traditional quantitative signals by overlaying information from the web and news sources, the use of such sources is far from a “magic bullet.” Its efficacy will be somewhat less than its enthusiasts circa 2009 hoped.

At that time, “news sentiment and natural language processing was one of the hottest topics in quant,” they write. The new tools have lost some of their mojo since, and the mavens of Deutsche Bank Quantitative Strategy think they know why.

This paper is the third part in an ongoing research series.

The First and Second Papers

The first paper in the series (2010) offered non-linear learning models that could be employed to turn news flow into an alpha signal. The second (2012) expanded the analysis to include web data, showing in particular (in the words of the introduction to the new one) that “co-mentions of two companies on the web can be a useful way to uncover relationships between companies that often transcend the usual sector or industry lines.”

Co-mentions can be employed to create a sort of … well … web-like diagram of the connections among listed companies as mentioned on social media, in blogs, and so forth.

In the figure below the thickness of the lines between the companies illustrates the frequency of co-mentions. The figure can be a bit tricky to read. Notice for example that one of the thickest of lines in that graph connects Microsoft (MSFT) to Google (GOOG), but this line has to be understood as passing beneath the circle representing Citicorp (C).
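For readers who want to see the idea in miniature, here is a rough sketch of how co-mention counts (the line thicknesses in such a diagram) might be tallied. The function name, the toy documents, and the choice to count each pair once per document are all assumptions made for illustration; the article does not describe how the paper actually builds its network.

```python
from collections import Counter
from itertools import combinations


def co_mention_counts(documents):
    """Count how often each unordered pair of tickers appears in the same document."""
    counts = Counter()
    for tickers in documents:
        # Sort and de-duplicate so (MSFT, GOOG) and (GOOG, MSFT) map to one edge.
        for pair in combinations(sorted(set(tickers)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: each inner list holds the tickers mentioned on one web page.
docs = [
    ["MSFT", "GOOG"],
    ["GOOG", "MSFT", "C"],
    ["C", "MSFT"],
    ["GOOG"],
]

edges = co_mention_counts(docs)
for (a, b), weight in edges.most_common():
    # A larger weight would correspond to a thicker line between the two companies.
    print(a, b, weight)
```

Each key in `edges` is one line in the web, and its count plays the role of the line's thickness.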

The new paper suggests other ways to use web and news data. It also addresses the issue of why the initial buzz surrounding this subject has died down.

Third Paper: Signal Processing

The new paper stresses that there is only a moderate overlap between news sentiment and web sentiment as to listed companies.

The authors rank information coefficients for web sentiment and news sentiment respectively. The average web sentiment IC for the period between March 2005 and March 2012 was about 0.85%, in contrast to the 0.45% IC for news. This means that news is the inferior predictor of next-day stock performance.
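To make the term concrete: an information coefficient is commonly computed as the rank correlation between a signal and the return that follows it. The sketch below assumes that definition (Spearman rank correlation against next-day returns); the article does not say exactly how the paper computes its ICs, and the scores and returns here are invented toy numbers.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def information_coefficient(signal, next_day_returns):
    """Assumed definition: Spearman rank correlation of signal vs. next-day return."""
    return pearson(ranks(signal), ranks(next_day_returns))


# Hypothetical cross-section of five stocks on one day.
sentiment = [0.9, 0.1, -0.4, 0.3, -0.8]
returns = [0.012, -0.003, 0.004, 0.001, -0.010]

ic = information_coefficient(sentiment, returns)
print(round(ic, 2))
```

A toy cross-section like this produces a far larger IC than anything seen in practice; the sub-1% averages quoted above are what such correlations look like once they are averaged over thousands of stocks and many days.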

When they convert these ICs into risk-adjusted IC ranks and compare them to the wider universe of quant factors, they find that the one-month quantification of web sentiment does pretty well, though not spectacularly so. It doesn’t get as high a risk-adjusted IC as, say, free cash flow yield, or asset turnover (sales to total assets), but it does better than CAPM idiosyncratic volatility or skewness. Furthermore, even the news sentiment IC does better as a predictor than operating profit margin or expected dividend yield.
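One common convention for "risk-adjusting" an IC is to divide its average by its standard deviation over time, so that a steady signal outranks an erratic one with the same mean. The sketch below assumes that convention; the article does not spell out the paper's exact adjustment, and both IC series are made up.

```python
from statistics import mean, stdev


def risk_adjusted_ic(ic_series):
    """Assumed convention: mean IC divided by the sample standard deviation of IC."""
    return mean(ic_series) / stdev(ic_series)


# Two hypothetical factors with the same average IC (1%) but different stability.
steady = [0.010, 0.008, 0.012, 0.009, 0.011]
erratic = [0.060, -0.040, 0.050, -0.030, 0.010]

for name, series in [("steady", steady), ("erratic", erratic)]:
    print(name, round(mean(series), 4), round(risk_adjusted_ic(series), 2))
```

On this measure the steady factor ranks well ahead, which is the sense in which a factor with a modest raw IC can still place respectably in a league table of quant factors.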

Nonetheless, you can’t expect to harvest much alpha if you simply buy on good sentiment as measured through news or on the web, and sell on negative sentiment. As the authors put it in quant-speak, such predictive value as these measures have “does not translate … cleanly into return space.” Indeed, this is likely why the use of sentiment-based considerations has fallen out of favor since the buzzy days of 2009: the way to make proper use of these considerations turns out to be a complicated matter.

Filtering the Significance of the Lottery Effect

They suggest three specific non-linear ways of making use of such information. I’ll just mention one of them here. One quant strategy involves shorting stocks that recently experienced a sizeable one-day jump. Short them, that is, on the behavioral premise that the one-day jump was larger than warranted, or has itself produced a crowding-in effect that will soon produce a reversal. Short ahead of and profit from that reversal.

The authors call this the “lottery” strategy. If the number 4444 won your state’s lottery recently, then there may well be a lot of crowding to bet on the fours, and you might, in effect, want to bet against that crowd.

The problem with this strategy is of course that once in a while the one-day jump is warranted. The strategy works if it really is a matter of a random number coming up, not if you’re shorting against some actual improvement in fundamentals, or an upcoming beneficial merger or the like.

So … use the news flow and the web to filter out one-day jumps that are based upon positive news. It will help, too, if you remember that “positive” in the relevant sense has to mean “better than expected.”
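As a minimal sketch of that filter, assuming a sentiment score already scaled so that anything above zero means "better than expected": keep only the big one-day jumpers whose jump is not backed by positive news, and treat those as the short candidates. The class, the field names, the 10% jump threshold, and the tickers are all hypothetical; the paper's actual rules are not given in this article.

```python
from dataclasses import dataclass


@dataclass
class Stock:
    ticker: str
    one_day_return: float  # most recent daily return, e.g. 0.12 for +12%
    news_sentiment: float  # hypothetical score in [-1, 1]; > 0 means better-than-expected news


def lottery_short_candidates(stocks, jump_threshold=0.10, sentiment_cutoff=0.0):
    """Return tickers that jumped sharply without positive news to justify the move."""
    return [
        s.ticker
        for s in stocks
        if s.one_day_return >= jump_threshold and s.news_sentiment <= sentiment_cutoff
    ]


# Hypothetical universe; the tickers are placeholders, not real companies.
universe = [
    Stock("AAA", 0.15, 0.7),   # big jump on good news: filtered out
    Stock("BBB", 0.12, -0.1),  # big jump, no supporting news: short candidate
    Stock("CCC", 0.02, 0.0),   # no jump: not part of the strategy at all
]

print(lottery_short_candidates(universe))
```

The filter does not try to predict anything on its own; it only removes the cases where the "lottery number" came up for a reason.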