
Workshop 7: Machine Learning

Zorro's advise function can be used for applying machine learning functions to candle patterns, and using the most profitable patterns for a trade signal. Here's a simple example (Workshop7.c):

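function run()
{
  StartDate = 2005;   // use > 10 years data
  BarPeriod = 1440;   // 1 day
  BarZone = WET;      // West European midnight
  Weekend = 1;        // separate Friday and Sunday bars
  LookBack = 3;       // only 3 bars needed
  NumWFOCycles = 10;  // mandatory for machine learning functions
  set(RULES+TESTNOW); // generate rules, test after training
  if(Train) Hedge = 2; // allow long + short
  LifeTime = 5;        // = one week
  MaxLong = MaxShort = 1; // only 1 open trade

  // signals: High, Low, Close of the last 3 candles (the opening of this call
  // is reconstructed from the description below; the original flags may differ)
  if(adviseLong(PATTERN,0,
    priceHigh(2),priceLow(2),priceClose(2),
    priceHigh(1),priceLow(1),priceClose(1),
    priceHigh(0),priceLow(0),priceClose(0)) > 40)
    enterLong();   // long pattern above threshold
  if(adviseShort() > 40)
    enterShort();  // short pattern above threshold
}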

Many lines in this code should be familiar, but there are also some new concepts. The adviseLong function takes price candles or other data, and generates a trade signal when the return value is above a threshold. It can be called like a normal indicator, but internally uses various machine learning training and prediction algorithms. Here the function is called with the PATTERN classification method and the High, Low, and Close prices of the last 3 candles. Aside from PATTERN, other and more complex machine learning methods can be used, such as a deep learning neural net. A detailed introduction to pattern detection with adviseLong/Short can be found in the Black Book. An application for deep learning is described in this article.
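To make the idea of a learned candle pattern more concrete, here is a minimal sketch in plain C of one way to turn a set of prices into a discrete pattern: compare every signal with every other signal and record the greater/smaller/equal relations. This is only an illustration of the general approach, not Zorro's internal implementation; the function name pattern_key and its string encoding are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Encode the pairwise order of n signals as a string of '<', '=', '>'.
   Pairs are visited in the order (0,1), (0,2), ..., (n-2,n-1).
   out must have room for n*(n-1)/2 characters plus the terminator.
   Returns the number of pairs written. Hypothetical helper, for illustration. */
size_t pattern_key(const double *signals, size_t n, char *out)
{
    size_t k = 0;
    assert(signals != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (signals[i] < signals[j])
                out[k++] = '<';
            else if (signals[i] > signals[j])
                out[k++] = '>';
            else
                out[k++] = '=';
        }
    }
    out[k] = '\0';
    return k;
}
```

Bars with the same key belong to the same pattern. A training run could then accumulate the result of the trades that followed each key and keep only the keys whose trades were profitable on average.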

WFO or some other out-of-sample test method is mandatory for machine learning or pattern classification strategies. All machine learning systems tend to overfit, so any in-sample result from price patterns, decision trees, or perceptrons would be far too optimistic and thus meaningless. The number 10 is a compromise. Higher numbers produce more WFO cycles and therefore fewer bars for training each cycle, so fewer patterns are found and the results become more random. Lower numbers produce more bars per cycle, so more patterns are found, but they come from a longer time period - above one year - within which the market can have changed substantially. So the results can become more random, too.
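The trade-off between the number of cycles and the bars available for training can be estimated with a small calculation. The sketch below, in plain C, assumes a simplified rolling walk-forward model: the test windows are consecutive and do not overlap, each is preceded by one training window, and training and test windows have a fixed length ratio. The function name wfo_train_bars, the model, and the 85% split used in the example are assumptions for illustration, not a description of how Zorro divides the data.

```c
#include <assert.h>

/* Bars in one training window under a simplified rolling walk-forward model:
     total_bars = train + cycles * test
     train : test = split : (1 - split)
   Hypothetical helper, for illustration only. */
int wfo_train_bars(int total_bars, int cycles, double split)
{
    assert(total_bars > 0 && cycles > 0 && split > 0.0 && split < 1.0);
    double ratio = split / (1.0 - split);            /* training bars per test bar */
    double test_bars = total_bars / (ratio + cycles); /* length of one test window */
    return (int)(ratio * test_bars);                  /* truncated to whole bars */
}
```

Under these assumptions, 3000 bars with a split of 0.85 give 1593 training bars per cycle for 5 cycles, 1085 for 10 cycles, and 662 for 20 cycles. Doubling the cycles from 10 to 20 removes roughly 40% of the bars from which patterns can be learned.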

The number of bars from the same time period could theoretically be increased with oversampling. But oversampling is useless for daily bars because the High, Low, and Close prices depend on a certain bar start and end time. Resampled bars would produce very different patterns. So we must use more than 10 years for the simulation period to get enough data for training. (If price data from a certain year is not included in the Zorro program, it can be downloaded either automatically from the broker server, or with the historic price package from the Zorro download page.)

Normally, daily bars begin and end at UTC midnight. But for price patterns the time zone of the bars is critical. A good time for low EUR/USD volatility is midnight in Western Europe. BarZone determines the time zone of a daily bar; WET is Western European Time, the time zone of London, including daylight saving time. The Weekend variable determines how the simulator deals with weekend bars. Normally, no bar is allowed to start or end within a weekend. This means that for daily bars, the bar starting at midnight on Friday would end at midnight on Monday. This is not desired here because this bar would then contain prices from Friday as well as from Sunday evening, and spoil the candle pattern. Thus, Weekend = 1 forces the Friday bar to end at midnight on Saturday, although due to the weekend no trades can be entered on that bar. The week then consists of 6 instead of 5 bars.

Because the strategy needs only the last 3 candles for trade decisions, we can set the lookback period from its default 80 down to 3 bars. This gives us three months more for training and testing. The RULES flag is required for generating price patterns with the advise function. TESTNOW runs a test automatically after training - this saves a button click when experimenting with different pattern finding methods.

The next code line behaves differently in training and in test or trade mode:

if(Train) Hedge = 2;

Train is true in [Train] mode. In this mode we want to determine the profitability of a trade that follows a certain pattern. Hedge is set to 2, which allows long and short positions at the same time. This is required for training the patterns; otherwise the short trade after adviseShort would immediately close the long position that was just opened after adviseLong, and thus assign a wrong profit/loss value to its candle pattern. Hedge is not set in test and trade mode, where it makes sense that positions are closed when opposite patterns appear.

LifeTime sets the duration of a trade to 5 bars, equivalent to about one week. If a trade is not closed by an opposite pattern, it is closed after a week. The trade results after one week are also used for training the candle patterns and generating the trade rules. MaxLong and MaxShort limit the number of open long and open short trades in test or trade mode to 1 each.

The result

Click [Train]. Depending on the PC speed, Zorro will need a few seconds for running through the 10 WFO cycles and finding about 50 profitable long or short patterns in every cycle. Click [Result] for the equity curve:

The machine learning algorithm with daily candle patterns seems to give us a relatively steadily rising equity curve and symmetric results in long and short trading. But can the same result be achieved in live trading? Or was it just a lucky test? To find out, you have to do a Reality Check. There are several methods. A simple one is to run the test many times (use NumTotalCycles) with a randomized price curve (Detrend = SHUFFLE), plot a histogram of the results, and compare it with the result from the real price curve. How to do such a reality check is also covered in the Black Book.
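The comparison step of such a reality check can be reduced to a single number: the fraction of randomized runs that did at least as well as the run on the real price curve. The following plain C sketch shows only that calculation. The function name reality_check_pvalue is hypothetical, and the results array is assumed to have been collected beforehand, for instance one performance figure per randomized test cycle.

```c
#include <assert.h>
#include <stddef.h>

/* Fraction of randomized-run results that are at least as good as the
   result from the real price curve. A small value suggests the real
   result is unlikely to be pure luck. Hypothetical helper, for illustration. */
double reality_check_pvalue(double real_result,
                            const double *random_results, size_t n)
{
    size_t at_least_as_good = 0;
    assert(random_results != NULL && n > 0);
    for (size_t i = 0; i < n; i++) {
        if (random_results[i] >= real_result)
            at_least_as_good++;
    }
    return (double)at_least_as_good / (double)n;
}
```

If, for example, 1 out of 10 randomized runs reaches the real result, the function returns 0.1. A common convention, not taken from this workshop, is to distrust a system when this fraction is above a few percent. In practice far more than 10 randomized runs are needed for a meaningful histogram.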

We're now at the end of the coding course. When writing your own systems, it can save you a lot of time if you flip through this manual and make yourself familiar with Zorro's math and statistics functions. Often-used code snippets for your own scripts and strategies can be found on the Tips & tricks page. Writing good code and fixing bugs is described under Troubleshooting. If you worked with a different trade platform before, read the Conversion page about how to convert your old scripts or EAs to C. For serious strategy development, some knowledge of the leading data analysis software R can be an advantage - check out the R lectures.

What have we learned in this workshop?

Further reading: ► advise, RULES