Evaluating Google Trends Data to the Task of Predicting U.S. Stock Returns
Daniel Oliveira 1, Felipe Salvatore 2, Pedro Valls Pereira 3, Andre Fujita 1
1 : Institute of Mathematics and Statistics - University of São Paulo (IME-USP)
2 : Institute of Mathematics and Statistics - University of São Paulo (IME-USP)
3 : Sao Paulo School of Economics - FGV
Rua Itapeva 474 room 1006 01332-000 Sao Paulo, Sao Paulo -  Brazil

Predicting financial asset returns is one of the main problems in the empirical finance literature. One of its main challenges is evaluating the usefulness of so-called alternative data. Google Trends data is an alternative dataset that has gained popularity in recent years. In this article, we evaluate the usefulness of this dataset for the task of predicting the returns of U.S. stock indices (specifically, the S&P 500, NASDAQ, and Dow Jones indices). We perform an extensive search to tackle this problem: we employ eight feature selection methods and seven forecasting models. We use both standard machine learning methods and recently proposed feature selection methods based on causal inference. When comparing aggregated AUC results across all indices, we find evidence that none of the feature selection methods or forecasting models significantly improves on random guessing. Moreover, when we transform the models' predictions into a trading strategy, we find similar results in terms of the Sharpe ratio, maximum drawdown, and other risk metrics.
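
To make the evaluation metrics named above concrete, the following is a minimal sketch of how AUC, the Sharpe ratio, and maximum drawdown could be computed from directional predictions. It is an illustration only, not the authors' implementation: the function names, the long/short rule with a 0.5 threshold, the zero risk-free rate, and the 252 trading periods per year are all assumptions introduced here.

```python
import math
import statistics


def auc_score(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def strategy_returns(prob_up, asset_returns, threshold=0.5):
    """Assumed rule: go long (+1) if the predicted probability of an up move
    exceeds the threshold, otherwise go short (-1)."""
    return [(1.0 if p > threshold else -1.0) * r
            for p, r in zip(prob_up, asset_returns)]


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    return math.sqrt(periods_per_year) * statistics.mean(returns) / statistics.stdev(returns)


def max_drawdown(returns):
    """Largest peak-to-trough loss of cumulative wealth, starting from 1.0."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1.0)
    return worst
```

Under this definition, an AUC of 0.5 corresponds to random guessing, which is the baseline against which the models are compared.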

