In this era of cloud computing, big data, server farms, and the smartphone in your pocket that’s vastly more powerful than a roomful of computers of previous generations, it can be easy to lose sight of the very definition of a supercomputer. The key is “capability,” or processing speed, rather than capacity, or memory.
For financial forecasters, the particular computing capability of interest is the probabilistic analysis of multiple, interrelated, high-speed, complex data streams. The extreme speed of global financial systems, their hyperconnectivity, their sheer complexity, and the massive data volumes they produce are often seen as problems. Moreover, the system components themselves increasingly make autonomous decisions. For example, supercomputers now perform the majority of financial transactions.
High-frequency (HF) trading firms represent approximately 2% of the nearly 20,000 trading firms operating in the U.S. markets, but since 2009 they have accounted for over 70% of the volume in U.S. equity markets and are approaching a similar level of volume in futures markets. This velocity has shortened the timeline of finance from days to hours to nanoseconds, and it means not only faster trade executions but also faster investment turnover.
At the end of World War II, the average holding period for a stock was four years. By 2000, it was eight months; by 2008, two months; and by 2011, twenty-two seconds. The “flash crash” of May 6, 2010 made it eminently clear to the financial community (i.e., regulators, traders, exchanges, funds, and researchers) that the capacity to understand what had actually occurred, and why, was not then in place. In the aftermath of that event, a push began to apply supercomputers to the problem of modeling the financial system, in order to provide advance warning of potentially disastrous anomalous events. Places such as the Center for Innovative Financial Technology (CIFT) at the Lawrence Berkeley National Laboratory (LBNL) and the National Energy Research Scientific Computing (NERSC) center assumed leading roles in this exploration.
Fortunately for many forecasters, you no longer need to be affiliated with a government-funded megalaboratory to access high-performance computing power. Although the only way to get high performance out of an application is to program it for multiple processing cores, the cost of a processor with many cores has dropped drastically. With the advent of multicore architectures, inexpensive computers are now routinely capable of parallel processing, a capability that in the past was mostly reserved for advanced scientific applications. Today it can be applied to other disciplines, such as econometrics and financial computing.
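To make this concrete, here is a minimal sketch (my own illustration, not taken from the article) of the kind of parallel workload a forecaster might run on an ordinary multicore desktop: a Monte Carlo simulation of price paths split across worker processes with Python's standard multiprocessing module. The geometric Brownian motion model and its parameters (drift, volatility, path count) are illustrative assumptions.

```python
# Minimal sketch: parallel Monte Carlo price simulation on a multicore machine.
# The model (geometric Brownian motion) and all parameters are illustrative
# assumptions, not figures from the article.
import numpy as np
from multiprocessing import Pool, cpu_count

def simulate_batch(args):
    """Simulate one batch of terminal prices under geometric Brownian motion."""
    n_paths, n_steps, s0, mu, sigma, seed = args
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    # Each row holds the per-step log-returns of one simulated path.
    log_returns = rng.normal((mu - 0.5 * sigma**2) * dt,
                             sigma * np.sqrt(dt),
                             size=(n_paths, n_steps))
    return s0 * np.exp(log_returns.sum(axis=1))  # terminal prices for this batch

if __name__ == "__main__":
    workers = cpu_count()
    total_paths = 1_000_000
    batch = total_paths // workers
    # One job per core; a distinct seed keeps the batches independent.
    jobs = [(batch, 252, 100.0, 0.05, 0.2, seed) for seed in range(workers)]

    with Pool(workers) as pool:
        results = pool.map(simulate_batch, jobs)

    terminal = np.concatenate(results)
    print(f"{workers} cores, {terminal.size} simulated paths")
    print(f"mean terminal price: {terminal.mean():.2f}")
    print(f"5% quantile (a crude downside estimate): {np.quantile(terminal, 0.05):.2f}")
```

Because each batch is independent, the work scales almost linearly with the number of cores; the same pattern carries over to bootstrapped forecasts, scenario analysis, and other embarrassingly parallel econometric tasks.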
It is worth taking a moment here to look at the size of the market data problem. Mary Schapiro, chair of the SEC from 2009 through 2012, estimated the flow rate of the data stream at about twenty terabytes per month. This is certainly an underestimate, especially when one considers securities outside the jurisdiction of the SEC, or bids and offers that are posted to and removed from the markets (sometimes within milliseconds). Nevertheless, supercomputers used for scientific modeling, such as weather forecasting, nuclear-explosion simulation, or astronomy, process this much data every second! And, after all, only certain, highly specialized forecasting applications are going to require real-time input of the entire global financial market. Many forecasting applications do well enough with only a small fraction of this data.
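For a rough sense of that gap, a back-of-the-envelope calculation (mine, not from the article) compares the two rates; the thirty-day month and decimal terabytes are simplifying assumptions.

```python
# Back-of-the-envelope comparison of the two data rates (rough assumptions).
TB = 1e12                             # bytes in a terabyte (decimal convention)
seconds_per_month = 30 * 24 * 3600    # about 2.6 million seconds

market_rate = 20 * TB / seconds_per_month   # Schapiro's ~20 TB/month, sustained
science_rate = 20 * TB                      # the same volume every second

print(f"market data stream: ~{market_rate / 1e6:.1f} MB per second")
print("scientific modeling: 20 TB per second")
print(f"ratio: roughly {science_rate / market_rate:,.0f} to 1")
```

In other words, the consolidated market feed works out to only a few megabytes per second when spread over a month, millions of times less than what large scientific simulations routinely consume.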
Note: This blog was excerpted from my article published in the Fall 2013 issue of FORESIGHT: THE INTERNATIONAL JOURNAL OF APPLIED FORECASTING. To see the entire article, as well as an interview with the author, click here: Future of Financial Forecasting