The Average Investor's Blog

A software developer view on the markets

Archive for July, 2011

Index Levels to Watch for Friday and for July

Posted by The Average Investor on Jul 29, 2011

This Friday is the last trading day not only of the week, but also of the month of July. With the significant pullback lately, the S&P 500 has moved even closer to its 10-month EMA. The level to watch at this month's end is 1275.61: a close below this value would put the index below the MA. It looks like the S&P 500 is likely to stay above this long-term MA for another month.

Using the 20-week EMA, the levels and the situation are different:

Asset             Symbol  Thursday's Close  EMA
US REIT           VNQ     $61.03            $59.89
S&P 500           ^GSPC   $1300.67          $1311.20
Nasdaq 100        ^NDX    $2371.77          $2320.08
Emerging Markets  EEM     $46.91            $47.25

It looks like the S&P 500 is heading for a close below the 20-week MA; let's see.
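For reference, the 20-week EMA behind these levels can be sketched in a few lines of base R. This is a hand-rolled recursion over made-up weekly closes, purely for illustration (with real data, TTR::EMA computes the same thing):

```r
# Recursive EMA: ema[t] = alpha * price[t] + (1 - alpha) * ema[t-1],
# seeded with the first observation; alpha = 2 / (n + 1)
ema <- function(prices, n) {
  alpha <- 2 / (n + 1)
  out <- numeric(length(prices))
  out[1] <- prices[1]
  for (t in 2:length(prices)) {
    out[t] <- alpha * prices[t] + (1 - alpha) * out[t - 1]
  }
  out
}

# Toy weekly closes; the signal is "long" while the close sits above the EMA
closes <- c(1280, 1295, 1310, 1305, 1290, 1300)
e <- ema(closes, n = 20)
signal <- ifelse(closes > e, "long", "flat")
print(round(e, 2))
print(signal)
```

With a real series, the "level to watch" is simply the price at which next week's close would equal the updated EMA.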


Posted in Market Timing, Trades | Leave a Comment »

The Weekly Update

Posted by The Average Investor on Jul 17, 2011

An ugly week for the stock markets, with the S&P 500 losing 2.05%, but another splendid week for the ARMA strategy, which gained 1.56% over the same period!

Date        GSPC Gain  Position  Position Gain
2011-07-11  -1.81%     Short      1.81%
2011-07-12  -0.44%     Long      -0.44%
2011-07-13   0.31%     Long       0.31%
2011-07-14  -0.67%     Long      -0.67%
2011-07-15   0.56%     Long       0.56%

The ARMA strategy was short only on Monday, but what a difference it made, -1.8% on the S&P 500! After this last successful week, the ARMA strategy has registered a growth of 5.98% in 2011, back ahead of the S&P 500 performance of 4.65%.
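As a sanity check, the weekly figure is just the daily position gains compounded. Re-deriving the 1.56% from the table above:

```r
# Daily S&P 500 gains from the table (as fractions) and the ARMA position
# held each day: -1 = short, +1 = long
gspc_gain <- c(-0.0181, -0.0044, 0.0031, -0.0067, 0.0056)
position  <- c(-1, 1, 1, 1, 1)

# The position gain each day is the index gain times the sign of the
# position; the weekly result compounds them
daily  <- position * gspc_gain
weekly <- prod(1 + daily) - 1
cat(sprintf("weekly ARMA gain: %.2f%%\n", 100 * weekly))
```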

On the 20-week EMA front, all indexes lost their previous gains and went into negative territory, but only the Emerging Markets ETF (EEM) closed below its 20-week EMA on Friday.

Asset       Symbol  Position  Date In     Gain
US REIT     VNQ     Long      2011-07-01   0.39%
Nasdaq 100  ^NDX    Long      2011-07-01  -0.20%
S&P 500     ^GSPC   Long      2011-07-01  -1.76%

A long position taken on the last signal from EEM would have resulted in a loss of about 3.11% – the fifth unsuccessful signal from this instrument.

Posted in Market Timing, Trades | Leave a Comment »

Yet another reason to avoid loops in R

Posted by The Average Investor on Jul 12, 2011

In some previous posts I have mentioned my struggles with the performance of the computations needed to implement the ARMA strategies in practice. Finally I have found a worthy solution, and as usual, there is a programming pattern to learn from it – avoid loops in R. 🙂

My first approach was to optimize the algorithms. Willing to trade some model quality for performance, I tried a few alternatives, but I didn't like any of them. Then I concentrated on improving overall R performance. After applying a few easy wins, I had to look for something more substantial.

For a little bit I toyed with the idea of using a GPU, but although GPUs can provide massive performance improvements, they often require a lot of specialized code, and that alone can postpone putting the system to use for months.

Then I took a step back and reconsidered the issue. I am running two expensive tasks each day on an 8-thread machine (an Intel i7 2600K: 4 cores with hyper-threading). Since each task is a single R process, I realized I wasn't using the CPU's full capacity. So I considered splitting each task into pieces manually, but (luckily) before doing so, I decided to google for R parallelism.

The solution I finally came to was the multicore R package. The only change I needed to make to my code was to remove the loops! As an illustration, let's take the simplest possible example and suppose we are computing sqrt with the following code:

for( ii in 1:100 )
   print( sqrt( ii ) )

The transformed, multicore-friendly code looks like:

ll = c()
for( ii in 1:100 )
   ll[ii] = ii

print( lapply( ll, sqrt ) )

Why is the last code multicore-friendly? Because one can transparently switch to mclapply from the multicore package:

library( multicore )

ll = c()
for( ii in 1:100 )
   ll[ii] = ii

print( mclapply( ll, sqrt, mc.cores=multicore:::detectCores( ) ) )

The last version will "lapply" sqrt over each element of the vector using as many worker processes as there are cores in the system! Assuming an 8-core system, the first 8 sqrts will be computed in parallel, and then a new sqrt will be started as soon as one of the previous ones finishes. Notice the mc.cores argument: the package is supposed to detect the number of cores on initialization, but that's not the case on my system.

This pattern worked perfectly for the ARMA strategy (and, for that matter, for any other strategy computing all required outcomes in similar fashion): on each day, we need to compute the different actions to be taken for different closing prices. The only thing that varies in the loop body, i.e. between iterations, is the particular closing price for that iteration. So I did exactly the same as for sqrt – computed all interesting prices in a loop and then passed everything to mclapply to do the work!

A small and low-risk code change (easy to verify using a known-to-work function) resulted in an almost 4x performance improvement (I run each of the two instruments I currently trade with mc.cores == 4, hence the factor of only 4)!
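The speedup is easy to measure for yourself. Here is a small timing sketch using the parallel package (which later absorbed multicore into base R); the slow_sqrt function and the worker count are made up for illustration:

```r
library(parallel)  # successor to the multicore package, in base R since 2.14

# An artificially slow function, so the parallelism is visible in the timings
slow_sqrt <- function(x) { Sys.sleep(0.01); sqrt(x) }
xs <- 1:40

# mclapply forks worker processes; it cannot fork on Windows, so fall back to 1
cores <- if (.Platform$OS.type == "windows") 1 else 2

t_serial   <- system.time(r1 <- lapply(xs, slow_sqrt))["elapsed"]
t_parallel <- system.time(r2 <- mclapply(xs, slow_sqrt, mc.cores = cores))["elapsed"]

stopifnot(identical(unlist(r1), unlist(r2)))  # same results either way
cat(sprintf("serial: %.2fs  parallel: %.2fs\n", t_serial, t_parallel))
```

With two workers the parallel run should take roughly half the serial time, minus the forking overhead.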

Make sure to remember this pattern next time you consider using a loop – I certainly will.

Posted in coding, R | 4 Comments »

The Weekly Update

Posted by The Average Investor on Jul 11, 2011

A short but still positive week for the markets: the S&P 500 ended the week up 0.31%. The ARMA strategy would have returned 1.78% over the same week:

Date        GSPC Gain  Position  Position Gain
2011-07-05  -0.13%     Short      0.13%
2011-07-06   0.10%     Short     -0.10%
2011-07-07   1.05%     Long       1.05%
2011-07-08  -0.70%     Short      0.70%

This method guessed the market direction correctly on three of the last four days. The success of this strategy continues to impress me. Still, with a gain of 4.36%, it lags the S&P 500's gain of 6.85% this year.

On the 20-week EMA front, all indexes are in buy mode and already showing small gains, except for the Emerging Markets.

Asset             Symbol  Position  Date In     Gain
US REIT           VNQ     Long      2011-07-01   2.67%
Nasdaq 100        ^NDX    Long      2011-07-01   1.88%
S&P 500           ^GSPC   Long      2011-07-01   0.31%
Emerging Markets  EEM     Long      2011-07-01  -0.48%

Posted in Market Timing, Trades | Leave a Comment »

ARMA Models for Trading, Part VI

Posted by The Average Investor on Jul 6, 2011

All posts in this series were combined into a single, extended tutorial and posted on my new blog.

In the fourth post in this series, we saw a performance comparison between the ARMA strategy and buy-and-hold over roughly the last 10 years. Over the last few weeks (it does take time, believe me) I back-tested the ARMA strategy over the full 60 years (since 1950) of S&P 500 history. Let's take a look at the full results.

[Chart: ARMA vs Buy-and-Hold]

It looks quite good to me. In fact, it looks so impressive that I have been looking for bugs in the code ever since. 🙂 Even on a logarithmic chart the performance of this method is stunning. Moreover, the ARMA strategy achieves this performance with a maximum drawdown of only 41.18% vs 56.78% for the S&P 500. Computing the S&P 500 returns and drawdowns is simple:


library( quantmod )              # for getSymbols, Ad
library( PerformanceAnalytics )  # for table.Drawdowns

getSymbols("^GSPC", from="1900-01-01")
gspcRets = Ad(GSPC) / lag(Ad(GSPC)) - 1
gspcRets[as.character(head(index(Ad(GSPC)),1))] = 0
gspcBHGrowth = cumprod( 1 + gspcRets )
table.Drawdowns( gspcRets, top=10 )

The above code will produce the 10 biggest drawdowns in the return series. To compute the ARMA strategy growth, we first need the daily indicator. This indicator is what took so long to compute. It is in Excel format (since WordPress doesn’t allow csv files). To use the file in R, save it as csv, without any quotes, and then import it in R via:

library( xts )  # read.zoo comes from the zoo package, which xts loads

gspcArmaInd = as.xts( read.zoo(file="gspc.all.csv", format="%Y-%m-%d", header=T, sep=",") )

The first column is the date; the second is the position for that day: 1 for long, -1 for short, 0 for none. Note that the position is already aligned with the day of the return (it is computed at the close of the previous day); in other words, there is no need to shift it via lag. The indicator needs to be multiplied with the S&P 500 daily returns, and then we can follow the same path as above. The next two columns are the number of autoregressive and moving average coefficients of the best-fitting model used for the prediction. The GARCH components are always (1,1).
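A minimal sketch of that multiplication step, with a hand-made stand-in for the indicator column since the actual csv is not reproduced in this post:

```r
# Hypothetical daily S&P 500 returns and the matching indicator column
# (1 = long, -1 = short, 0 = flat); both vectors are made up for illustration
rets <- c(0.004, -0.011, 0.007, 0.002)
ind  <- c(1, -1, 1, 0)

# The indicator is already aligned with the return day, so no lag() is needed:
# just multiply and compound
armaRets   <- ind * rets
armaGrowth <- cumprod(1 + armaRets)
print(round(armaGrowth, 5))
```

The resulting armaGrowth vector is what gets plotted against gspcBHGrowth below.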

The only thing I didn't like was the number of trades: 6,346, or a trade every 2.35 days on average, a bit high for my taste. This has the potential to eat most of the profits, more so in the past than today, however (lower transaction costs, higher liquidity now). Still, the gains from this strategy, together with the exceptional liquidity of S&P 500 instruments (SPY, for instance, has been trading about 167.7 million shares a day lately), should suffice to preserve a significant portion of the profits.

Last, the simple R-code that produced this nice chart from the two growth vectors is worth showing:

png(width=480, height=480, filename="~/ttt/growth.png")
plot(log(gspcArmaGrowth), col="darkgreen", main="Arma vs Buy-And-Hold")
lines(log(gspcBHGrowth), col="darkblue")
legend(x="topleft", legend=c("ARMA", "Buy and Hold"), col=c("darkgreen", "darkblue"), lty=c(1,1), bty="n")
dev.off()  # close the device so the png file is actually written

Pretty neat if you ask me! Code snippets like this one are what make me believe the command-line interface is the most powerful interface.

Posted in Market Timing, R, Strategies | Tagged: , , , | 7 Comments »

The Weekly Update

Posted by The Average Investor on Jul 5, 2011

A huge week for the markets, enough for all indexes to close above their 20-week moving averages. The correction was not deep, the surge on the upside was quite powerful, and the re-entry ended up about 3% higher than the previous exit on the S&P 500. The situation in the other indexes was similar.

Contrast this with the behavior on the (longer term) 10-month EMA. Although the S&P was certainly within a striking range, it ended up closing the month of June above it, thus, continuing the long position established at the end of September 2010.

Another interesting (but commonly overlooked) factor to consider when following a MA strategy is the dividend situation. Following the 20-week MA, for instance, the sell signal came before the ex-date for the second quarter. This amounts to another 0.43% of "missed" gains, bringing the total cost of exiting and re-entering to about 3.4%. Certainly random, but an important factor nonetheless, and worth modeling especially for higher-yielding instruments.

Posted in Market Timing, Trades | Leave a Comment »

Low-hanging R Optimizations on Ubuntu

Posted by The Average Investor on Jul 1, 2011

A friend of mine recently brought to my attention that the default R install is way too generic and thus sub-optimal. While I didn't go all the way and rebuild everything from scratch, I did find a few cheap steps one can take to help things a little.

Simply install the libatlas3gf-base package. That’s all, but it boosts the performance on some R calculations many fold. This package is an optimized BLAS library, which R can use out of the box.
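One quick way to see whether the optimized BLAS is being picked up: time a dense matrix multiply before and after installing the package and compare the numbers. A throwaway benchmark:

```r
# Dense matrix multiplication goes straight through BLAS, so its timing
# reflects whichever BLAS library R is linked against
set.seed(1)
n <- 500
m <- matrix(rnorm(n * n), n, n)

elapsed <- system.time(p <- m %*% m)["elapsed"]
cat(sprintf("%dx%d matrix multiply: %.3fs\n", n, n, elapsed))
```

Run it once against the stock reference BLAS and once after installing libatlas3gf-base; the drop in elapsed time is the "many fold" improvement mentioned above.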

My next step was to enable some SSE instructions when packages are compiled. To do that, one needs to override some compiler settings. First, I needed to find the R home path on my system: R.home() returned /usr/lib64/R on my Ubuntu 11.04. Then I created a file under /usr/lib64/R/etc/ with the following content:

CFLAGS = -std=gnu99 -O3 -pipe -msse4.2
CXXFLAGS = -O3 -pipe -msse4.2

FCFLAGS = -O3 -pipe -msse4.2
FFLAGS = -O3 -pipe -msse4.2

How did I figure out what to add? Well, I looked up the defaults for these settings in /usr/lib64/R/etc/Makeconf and combined them with what I had in mind (adding SSE4.2 by default). I also removed the default -g. Now, when a new package is installed and compiled, you should see these options. For existing packages, uninstall them using remove.packages and then reinstall them with install.packages. I start R as root (sudo R) for these operations.

Does your CPU support SSE, and which version? Run grep -i sse /proc/cpuinfo.

Last, I noticed these two variables in Makeconf:

LAPACK_LIBS = -llapack
BLAS_LIBS = -lblas

The next thing I may try when my current simulations finish is to install optimized development libraries for BLAS and LAPACK (libatlas-dev for instance) and then change the above definitions …

Posted in coding, R | 6 Comments »
