The Average Investor's Blog

A software developer view on the markets

ARMA Models for Trading, Part II

Posted by The Average Investor on Apr 21, 2011

All posts in this series were combined into a single, extended tutorial and posted on my new blog.

We left the last post at the point of determining the best ARMA model. Before continuing the discussion, however, I would like to make a few points that might seem a bit questionable or unclear:

  • We model the daily returns instead of the prices. There are multiple reasons: this way financial series usually become stationary, we need some way to “normalize” a series, etc.
  • We use the diff and log functions to compute the daily returns instead of percentages. Not only is this standard practice in statistics, but it also provides a damn good approximation of the percentage returns.
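To make the second point concrete, here is a minimal sketch comparing the two ways of computing returns. The prices are made up for illustration, and the variable names are mine, not part of the original code.

```r
# Hypothetical closing prices, made up for illustration only.
prices = c( 100.0, 101.0, 99.5, 100.2, 102.0 )

# Log returns: the difference of the log prices.
logReturns = diff( log( prices ) )

# Simple (percentage) returns, for comparison.
simpleReturns = diff( prices ) / head( prices, -1 )

# For small daily moves the two are very close.
maxGap = max( abs( logReturns - simpleReturns ) )
print( round( cbind( logReturns, simpleReturns ), 5 ) )
print( maxGap )

# Log returns add up across days: their sum is the log of the total growth.
totalLog = sum( logReturns )
print( c( totalLog, log( tail( prices, 1 ) / prices[1] ) ) )
```

The gap between the two grows with the size of the move, so the approximation is best for typical daily changes of a percent or two.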

Now back to choosing the best-fitting ARMA model. A well-known statistic for measuring goodness of fit is the AIC (Akaike Information Criterion). Once the fitting is done, the value of the AIC statistic is accessible via:

xxArma = armaFit( xx ~ arma( 5, 1 ), data=xx )
xxArma@fit$aic

There are other statistics, of course. Some, for instance, penalize additional parameters more heavily than AIC does to avoid over-parametrization (BIC is one example). Typically, however, the results are quite similar.
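As a rough illustration of how the penalty differs between criteria, the sketch below fits two models of different size and compares AIC with BIC. It is an assumption-laden stand-in: it uses arima from base R's stats package instead of armaFit so that it runs without extra packages, and the data is simulated.

```r
set.seed( 1 )

# Simulated AR(1) series, used here only as stand-in data.
xx = arima.sim( model=list( ar=0.5 ), n=500 )

# A small and a deliberately over-parametrized model.
small = arima( xx, order=c( 1, 0, 0 ), method="ML" )
large = arima( xx, order=c( 3, 0, 0 ), method="ML" )

aicValues = c( small=AIC( small ), large=AIC( large ) )
bicValues = c( small=BIC( small ), large=BIC( large ) )
print( rbind( aicValues, bicValues ) )

# BIC minus AIC is the extra penalty; it grows with the parameter count.
extraPenalty = bicValues - aicValues
print( extraPenalty )
```

Both criteria start from the same log-likelihood, so they only disagree when the extra parameters buy a small improvement in fit.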

To summarize, all we need is a loop over all parameter combinations we deem reasonable, for instance from 0 to 5, inclusive, for both the AR order (the first component) and the MA order (the second component). For each pair we fit the model, and at the end we pick the model with the lowest AIC or some other statistic. The full code for findBestArma is at the end of the post.

In the code below, note that armaFit sometimes fails to find a fit and signals an error, which would otherwise terminate the loop immediately. findBestArma handles this by using the tryCatch function to catch any error or warning and return a logical value (FALSE) instead of aborting. We can then distinguish a failed fit from a successful one just by checking the type of the result. A bit messy probably, but it works.
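The same pattern can be tried in isolation. In this small sketch, riskyFit is a hypothetical stand-in for a fitting routine that fails for some inputs; it is not part of fArma.

```r
# Hypothetical stand-in for a fitting routine that sometimes fails.
riskyFit = function( order )
{
   if( order == 2 ) stop( "no fit found" )
   if( order == 3 ) warning( "convergence problem" )
   list( order=order, aic=100 - order )
}

fitted = c()
for( p in 1:4 )
{
   # Errors and warnings are both turned into a plain FALSE.
   fit = tryCatch( riskyFit( p ),
                   error=function( err ) FALSE,
                   warning=function( warn ) FALSE )

   # A successful fit is a list, so the type check tells the cases apart.
   if( !is.logical( fit ) ) fitted = c( fitted, p )
}
print( fitted )
```

Note that the warning handler discards the result as well, so a fit that merely warns is treated the same as one that fails outright.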

library( fArma )   # provides armaFit

findBestArma = function( xx, minOrder=c(0,0), maxOrder=c(5,5), trace=FALSE )
{
   bestAic = 1e9
   len = NROW( xx )
   for( p in minOrder[1]:maxOrder[1] ) for( q in minOrder[2]:maxOrder[2] )
   {
      # ARMA(0,0) is not a model worth fitting, skip it
      if( p == 0 && q == 0 )
      {
         next
      }

      formula = as.formula( paste( sep="", "xx ~ arma(", p, ",", q, ")" ) )

      # Turn any error or warning into FALSE so the loop keeps going
      fit = tryCatch( armaFit( formula, data=xx ),
                      error=function( err ) FALSE,
                      warning=function( warn ) FALSE )
      if( !is.logical( fit ) )
      {
         # Successful fit: keep it if it has the lowest AIC so far
         fitAic = fit@fit$aic
         if( fitAic < bestAic )
         {
            bestAic = fitAic
            bestFit = fit
            bestModel = c( p, q )
         }

         if( trace )
         {
            ss = paste( sep="", "(", p, ",", q, "): AIC = ", fitAic )
            print( ss )
         }
      }
      else
      {
         # Failed fit
         if( trace )
         {
            ss = paste( sep="", "(", p, ",", q, "): None" )
            print( ss )
         }
      }
   }

   if( bestAic < 1e9 )
   {
      return( list( aic=bestAic, fit=bestFit, model=bestModel ) )
   }

   # No model could be fitted
   return( FALSE )
}

3 Responses to “ARMA Models for Trading, Part II”

  1. […] there are more robust statistical methods to do that. More on that in the next post … […]

  2. bgpl said

    hi, nice post.
    however, I am confused..
    You only have two parameters for the ARMA anyway. So, using AIC to compare different iterations of this essentially becomes comparison of the actual fit, not of the degree of parametrization. in other words AIC reflects both fit and number of parameters, but since your number of parameters are the same across different iterations, it devolves to the fit.

    AIC probably makes sense to be employed for comparison among multiple strategies with different parameters and different numbers of parameters, but in this context, I fail to see how it helps – I am likely missing something here.

    would the same results be obtained merely by using the least-error instead of AIC ?

    thanks !

    • Hi, the two parameters are in fact the number of parameters used in the ARMA model. (3,5,1,1) describes the best fit using three AR components, five MA components, and GARCH parameters. Once we find a fit for (3,5,1,1) and (5,3,1,1) how do we choose between them? Which one is better? Using AIC for this purpose seems to be a common practice. I have toyed with other ideas – one of them is to use the in-sample returns of the model to choose the better one, a very greedy approach. Never implemented it though …
