Perhaps more than any other kind of time series data, financial markets have been scrutinized by countless mathematicians, economists, investors and speculators over hundreds of years. Even in modern times, despite all scientific advances, the effort of predicting future movements of the stock market sometimes still bears resemblance to the ancient alchemistic aspirations of turning base metals into gold. That is not to say that there is no genuine scientific effort in studying financial markets, but distinguishing serious research from charlatanism (or even fraud) remains remarkably difficult.
We neither aspire to develop a crystal ball for investors nor do we expect to contribute to the economic and econometric literature. However, we find the wealth of data in the financial markets to be fertile ground for experimenting with knowledge discovery algorithms and for generating knowledge representations in the form of Bayesian networks. This area can perhaps serve as a very practical proof of the powerful properties of Bayesian networks, as we can quickly compare machine-learned findings with our own understanding of market dynamics. For instance, the prevailing opinions among investors regarding the relationships between major stocks should be reflected in any structure that is to be discovered by our algorithms.
More specifically, we will utilize the unsupervised and supervised learning algorithms of the BayesiaLab software package to automatically generate Bayesian networks from daily stock returns over a six-year period. We will examine 459 stocks from the S&P 500 index, for which observations are available over the entire timeframe. We selected the S&P 500 as the basis for our study, as the companies listed on this index are presumably among the best-known corporations worldwide, so even a casual observer should be able to critically review the machine-learned findings. In other words, we are trying to machine-learn the obvious, as any mistakes in this process would automatically become self-evident. Quite often experts’ reaction to such machine-learned findings is, “well, we already knew that.” That is the very point we want to make, as machine-learning can — within seconds — catch up with human expertise accumulated over years, and then rapidly expand beyond what is already known.