Exporting and importing historical data, asset data, account data

Historical price data and data streams

Historical price data is used in [Test] and [Train] mode; in [Trade] mode the prices required for the LookBack period are normally automatically downloaded from the broker at start of the strategy. The used data period can be determined with StartDate and EndDate. If historical prices from the desired period are missing, they can be downloaded with the assetHistory function or the Download script (see below). All downloaded prices are automatically aligned to UTC time. The broker's time zone does not matter. Therefore it's normally no problem to use price data from different sources for the simulation and for trading.

Zorro stores historical price data in the History subfolder in files that can contain tick, minute, or day based ask prices. Each file name is composed from the asset (with special characters such as '/' removed), the year number, and the extension ".t6" for candles with volume data, or ".t1" for plain price quotes (see also M1 vs. T1). For instance, "EURUSD_2010.t6" contains all prices of the EUR/USD asset in 2010. The price data files included with Zorro are minute based, but any other time base can also be used as long as it's equal or less than the bar period.
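The naming rule above can be sketched in plain C. This is only an illustration, not Zorro code: the helper name history_file_name is hypothetical, and it assumes that '/' is the only special character that needs to be stripped from the asset name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a history file name such as "EURUSD_2010.t6".
   Assumption: only '/' is removed from the asset name; other
   characters are copied unchanged. Returns 1 on success, 0 if
   the result did not fit into the output buffer. */
int history_file_name(char *out, size_t out_size,
                      const char *asset, int year, const char *ext)
{
    char clean[32];
    size_t n = 0;
    const char *p;
    int len;

    assert(out && asset && ext);
    for (p = asset; *p && n < sizeof clean - 1; p++)
        if (*p != '/')
            clean[n++] = *p;
    clean[n] = '\0';

    /* snprintf returns the length it would have needed */
    len = snprintf(out, out_size, "%s_%d%s", clean, year, ext);
    return len >= 0 && (size_t)len < out_size;
}
```

Called with "EUR/USD", 2010, and ".t6", the helper produces the file name from the example above.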

Price data in different formats cannot be mixed. You can use either .t1, .t6, or .bar price data in the backtest of a strategy, but not a combination, for instance .t6 from 2016 and .t1 from 2015.

A .t1 or .t6 price data file is a simple list of T1 or T6 structs in descending order (newest price first). They are defined in include\trading.h:

typedef struct T1
{
  DATE  time; // time of the tick in UTC, OLE date/time format
  float fVal; // price data, positive for ask and negative for bid
} T1;

typedef struct T6
{
  DATE  time; // time of the tick in UTC, OLE date/time format
  float fHigh,fLow;
  float fOpen,fClose;
  float fVal,fVol; // additional data, like ask-bid spread, volume etc.
} T6;

For historical data of options and futures, there's also a .t8 file format that includes options specific parameters, such as strike and expiration date. It is a list of CONTRACT structs in descending order:

typedef struct CONTRACT
{
  DATE  time;   // time in UTC, OLE date/time format
  float fAsk,fBid; // premium without multiplier
  float fVal;   // open interest
  float fVol;   // volume
  float fUnl;   // underlying price
  float fStrike;
  long  Expiry; // YYYYMMDD
  long  Type;   // PUT, CALL, FUTURE, EUROPEAN, BINARY
} CONTRACT;

For downloading and reading .t1 files, Zorro S is required. For converting price data from an external source to the Zorro format, separate it into years, convert it to a list of T6 or T1 structs, and store it in a binary file with the name described above. The structs are stored in reverse order, i.e. the most recent tick comes first. The OLE DATE format is a double float value, counting days since midnight 30 December 1899; hours, minutes, and seconds are represented as fractions of days. Zorro generally uses UTC time: if you have price data based on local time, convert it to UTC (you can find some example code here). If the price data contains bid prices, you may convert them to ask prices by adding your broker's spread. Using ask or bid price history has however normally no effect on backtest results.
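The conversion steps described above can be sketched in standard C, outside of Zorro. The helper names unix_to_ole, reverse_t6, and save_t6 are hypothetical; the sketch assumes that the external source delivers Unix timestamps in UTC and candles in ascending time order, and that the compiler lays out T6 without padding (32 bytes).

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

typedef double DATE;   /* OLE date: days since midnight 30 December 1899 */

typedef struct T6 {
    DATE  time;
    float fHigh, fLow;
    float fOpen, fClose;
    float fVal, fVol;
} T6;

/* Unix seconds (UTC) to OLE date; 25569 is the OLE date of 1 January 1970 */
DATE unix_to_ole(time_t t)
{
    return (double)t / 86400.0 + 25569.0;
}

/* Reverse an ascending candle array in place, so the newest comes first */
void reverse_t6(T6 *c, size_t n)
{
    size_t i, j;
    assert(c || n == 0);
    if (n < 2) return;
    for (i = 0, j = n - 1; i < j; i++, j--) {
        T6 tmp = c[i];
        c[i] = c[j];
        c[j] = tmp;
    }
}

/* Store the (already reversed) candles of one year in a binary file.
   Returns 1 on success, 0 on failure. */
int save_t6(const char *path, const T6 *c, size_t n)
{
    size_t written;
    FILE *f = fopen(path, "wb");
    if (!f) return 0;
    written = fwrite(c, sizeof(T6), n, f);
    return fclose(f) == 0 && written == n;
}
```

A converter would fill one T6 array per year, call reverse_t6, and then save_t6 with a file name composed as described above.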

The fOpen, fClose, fHigh and fLow data streams can be separately accessed in the script. The fVal and fVol data can be used to store and access additional data, such as spread, quote frequency, trade volume, or rollover. Data in .t1 format can be used for additional information, such as external indicators or statistics such as the VIX (in that case set Detrend at 16 for disabling the automatic fixing of price data). All data streams are automatically synchronized to the streams of the first asset call in the script; so it does not matter if some data is only available in daily or weekly 'ticks'. The asset function can be used to select between different data files, while the price functions select between the four data streams inside the file.

Historical data in arbitrary CSV formats can be converted to .t1 or .t6 with the dataParse function. Exporting or converting historical data to and from other formats is also easy due to the simple format. In the Strategy folder you can find several small scripts for this purpose. Download.c (see below) downloads historical data in .t6 or .t1 format from Yahoo, IB, FXCM, or Oanda. CSVtoHistory converts data from arbitrary .csv formats, f.i. from HistData, Quandl, or Yahoo, to the Zorro .t6 format. CSVExport.c exports the complete price history of the selected asset to a .csv spreadsheet, and CSVFromHistory converts a particular .t6 file to a .csv spreadsheet. Example code for automatically downloading .csv price data from the Internet or from a provider API can be found on the http page. Here's a small script that converts .t1 price history to a .t6 file:

#define NN 400000	// > 252*24*60
T6 Buffer[NN];
int Ticks = 0;

void Tick() { Ticks++; }

void run()
{
  NumYears = 1;
  BarPeriod = 1;
  LookBack = 0;
  History = ".t1";
  set(TICKS);
  asset(Asset);

// store candles in reverse order
  static T6 *Candle;
  if(is(INITRUN)) Candle = Buffer + NN;
  Candle--;
  Candle->fOpen = priceOpen();
  Candle->fHigh = priceHigh();
  Candle->fLow = priceLow();
  Candle->fClose = priceClose();
  Candle->fVal = 0;
  Candle->fVol = max(1,Ticks);
  Candle->time = wdate();
  Ticks = 0;
  
  if(is(EXITRUN)) {
    string Name = strf("History\\%s_%d.t6",Asset,year());
    int Size = (int)(Buffer+NN)-(int)Candle;
    file_write(Name,Candle,Size);
    printf("\n%d bars written!",Size/sizeof(T6));
  }
}

A script for generating .t8 data for options from the underlying price history can be found on the Financial Hacker blog.

The Download script

Download.c is a small script for downloading price history from Yahoo or from a broker or signal service. It offers also a convenient way for retrieving broker-specific asset data, such as lot size, margin, and rollover, directly from the broker API. The script opens a control panel with buttons (blue) and entry fields (yellow):

Asset List

Download or update historical price data for the current year

Adding a new asset to the Asset scrollbox

Downloading historical M1 price data for a new asset

Downloading historical tick-based price data

Downloading historical daily data from Yahoo™

Simulating a different broker account

Updating the recent price history of all assets

The asset list

Any used asset must have an entry in an asset list in the History folder. The default asset list, AssetsFix.csv, also determines the assets that appear in the scroll box. Since every broker has its own asset parameters, different asset lists can be used for simulating different brokers and accounts. If no particular asset list is given in a script, AssetsFix.csv is also used in the strategy script. The parameters in the asset list affect training and testing. In live trading, asset parameters are normally not read from the list, but loaded from the broker API in real time. But when the broker API does not provide certain parameters, their values from the asset list are used.

Different asset lists for backtesting and training can be selected either by script through the assetList command, or automatically with an account list (see below). Asset lists are simple comma separated spreadsheet files that can be edited with Excel or with a text editor for adding new assets, or for modifying parameters of the asset and the broker account. The parameters are stored in this format (example):

Asset List

The first line must be the header line. Assets can be temporarily commented out: A line is ignored when it begins with "#". Names and symbols must contain no blanks, commas, or semicolons. Zorro also accepts semicolon separated csv files with commas instead of decimal points, but not a mix of both in the same file. Excel uses your local separation character, so make sure in your PC regional settings that it's a comma and the decimal is a point, otherwise Excel cannot read standard csv files.
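For processing an asset list outside of Zorro, a line parser in standard C could look like the following sketch. It is not Zorro's own parser. The struct AssetEntry and the function parse_asset_line are hypothetical names, and the sketch assumes comma separated files with decimal points and the column order Name, Price, Spread, RollLong, RollShort, PIP, PipCost, MarginCost, Leverage, LotAmount, Commission, Symbol.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct AssetEntry {
    char   name[16];      /* up to 15 characters */
    double price, spread, roll_long, roll_short;
    double pip, pip_cost, margin_cost, leverage, lot_amount, commission;
    char   symbol[32];    /* up to 31 characters, may be empty */
} AssetEntry;

/* Parse one line of the list. Returns 1 if the line held an asset,
   0 if it was commented out, the header line, or malformed. */
int parse_asset_line(const char *line, AssetEntry *a)
{
    int fields;
    assert(line && a);
    if (line[0] == '#') return 0;          /* commented out */
    memset(a, 0, sizeof *a);
    fields = sscanf(line,
        "%15[^,],%lf,%lf,%lf,%lf,%lf,%lf,%lf,%lf,%lf,%lf,%31[^,\r\n]",
        a->name, &a->price, &a->spread, &a->roll_long, &a->roll_short,
        &a->pip, &a->pip_cost, &a->margin_cost, &a->leverage,
        &a->lot_amount, &a->commission, a->symbol);
    if (fields < 11) return 0;             /* header or broken line */
    if (fields == 11)                      /* empty Symbol field: */
        strcpy(a->symbol, a->name);        /* use the asset name */
    return 1;
}
```

The header line is rejected automatically, because its second field is not a number.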

Every asset is represented by a line in the csv file. New assets can be added either manually - by editing the file and entering a new line - or automatically as described below. The asset parameters have the following meanings:

Name Name of the asset, f.i. "EUR/USD" (up to 15 characters, with no blanks and no special characters except for slash '/' and underline '_'). This name is used in the script and in the Asset scrollbox. The asset is ignored when it begins with "#".
Price Ask price of one contract, in counter currency units. Accessible with the InitialPrice variable. For non-Forex assets the counter currency is usually the currency of the exchange place, such as USD for US stocks, or EUR for the DAX (GER30). For information only; not used in the backtest.
Spread The difference of ask and bid price of the asset, in counter currency units. Accessible with the Spread variable.
RollLong, RollShort Daily rollover fee ("swap") for long resp. short trades per contract, resp. per 10000 contracts for currencies. This is the interest that is added to or subtracted from the account for holding trades overnight. Account currency units; accessible with the RollLong/Short variables. On Wednesdays it is often three times higher for compensating the weekend when no rollover fee is charged. When manually entering them, make sure to convert them to 10,000 contracts for currency pairs.
PIP Size of 1 pip in counter currency units; accessible with the PIP variable. Roughly 1/10000 of the asset price. The pip size is normally 0.0001 for assets (such as currency pairs) with a single digit price, 0.01 for assets with a price between 10 and 200 (such as USD/JPY and most stocks), and 1 for assets with a 4- or 5-digit price. For consistency, use the same pip sizes for all your asset lists.
PipCost Value of 1 pip profit or loss per lot, in units of the account currency. Accessible with the PipCost variable and internally used for calculating the trade profit. When the asset price rises or falls by x, the equivalent profit or loss of a trade in account currency is x * Lots * PipCost / PIP. For assets with pip size 1 and one contract per lot, the pip cost is just the conversion factor from counter currency to account currency. For calculating it manually, multiply LotAmount with PIP and divide by the price of the account currency in the asset's counter currency. Example 1: AUD/USD on a micro lot EUR account has PipCost of 1000 * 0.0001 / 1.11 (current EUR/USD price) = 0.09 EUR. Example 2: AAPL stock on a USD account has PipCost of 1 * 0.01 / 1.0 = 0.01 USD = 1 cent. Example 3: S&P500 E-Mini futures on a USD account have PipCost of 50 USD (1 point price change of the underlying is equivalent to $50 profit/loss of an S&P500 E-Mini contract).
MarginCost Initial margin for purchasing 1 lot of the asset in units of the account currency. Depends on account leverage, account currency, and counter currency; accessible with the MarginCost variable. Internally used for the conversion from trade Margin to Lot amount: the number of lots that can be purchased with a given trade margin is Margin / MarginCost. Also affects the Required Capital and the Annual Return in the performance report. Can be left at 0 when Leverage (see below) is used for determining the margin.
Leverage Account leverage for the asset, f.i. 100 for 100:1 leverage. Accessible with the Leverage variable. MarginCost and Leverage are different methods for determining a margin for a given purchase volume. If the price is known, they can be converted into each other: MarginCost = Asset price / Leverage * PipCost / PIP. When the broker uses Leverage, the margin per purchased lot depends on the current price of the asset. When the broker uses MarginCost, the margin is independent of the asset price, therefore the broker will adapt MarginCost from time to time when the price has moved far enough. When only Leverage is entered in the asset list, the MarginCost variable is calculated from the Leverage value and the current price. When MarginCost is nonzero, Leverage is ignored and the Leverage variable is calculated from MarginCost and the initial price.
LotAmount Number of contracts for 1 lot of the asset; accessible with the LotAmount variable. It's the smallest amount that you can buy or sell without getting the order rejected or an "odd lot size" warning. For currencies the lot size is normally 1000 on a micro lot account, 10000 on a mini lot account, and 100000 on standard lot accounts. Some CFDs can have a lot size less than one contract, such as 0.1 contracts. For most other assets it's normally 1 contract per lot.
Commission Roundturn commission for opening and closing one contract, resp. 10000 contracts for currencies. Accessible with the Commission variable. When manually entering the commission, double it if it's single turn. For currency pairs make sure to convert it to 10,000 contracts.
Symbol Broker symbol of the asset (up to 31 characters, no blanks, but special characters such as '.' or '-' are allowed). Additional information such as the asset type and exchange name can be coded in the symbol, dependent on the Broker API (f.i. AAPL-STK-NYSE-USD, see IB API). If asset prices have to be downloaded from Quandl or Yahoo EOD datasets, enter QUANDL: or YAHOO: followed by the code or symbol (f.i. QUANDL:CHRIS/CME_EC1). Asset prices are not downloaded at all when the symbol begins with '#'. When the field is empty, the asset name is used for the symbol.
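The PipCost, profit, and margin formulas from the parameter descriptions above can be checked with a few lines of standard C. This is a sketch for manual calculations, not Zorro's internal code; all function names are hypothetical, and the AUD/USD price of 0.70 in the usage note is an assumed value for illustration.

```c
#include <assert.h>

/* PipCost = LotAmount * PIP / price of the account currency,
   expressed in the asset's counter currency */
double pip_cost(double lot_amount, double pip, double account_ccy_price)
{
    assert(account_ccy_price > 0.0);
    return lot_amount * pip / account_ccy_price;
}

/* Profit or loss in account currency when the price moves by price_change */
double trade_profit(double price_change, double lots,
                    double pipcost, double pip)
{
    assert(pip > 0.0);
    return price_change * lots * pipcost / pip;
}

/* MarginCost = Asset price / Leverage * PipCost / PIP */
double margin_cost_from_leverage(double price, double leverage,
                                 double pipcost, double pip)
{
    assert(leverage > 0.0 && pip > 0.0);
    return price / leverage * pipcost / pip;
}

/* The same relation solved for Leverage */
double leverage_from_margin_cost(double price, double margin_cost,
                                 double pipcost, double pip)
{
    assert(margin_cost > 0.0 && pip > 0.0);
    return price / margin_cost * pipcost / pip;
}
```

With the values of Example 1, pip_cost(1000, 0.0001, 1.11) gives about 0.09 EUR. At an assumed AUD/USD price of 0.70 and 100:1 leverage, margin_cost_from_leverage then gives about 6.3 EUR per micro lot.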

Some broker APIs, such as IB or FIX, do not provide asset parameters. So they must be calculated and entered manually. The values can normally be taken from the broker's website. If the broker uses a very complex structure of fees, margin, and commission, enter estimated or average values. They need not be 100% accurate, but they should not be too far off in the interest of realistic backtests. An alternative way to get the data is opening a minimum position of the asset in a demo account. The commission and margin are then often displayed in the broker's trade platform.

Since the price of the account currency is not constant, PipCost and MarginCost are only valid for the time at which the asset list was created or downloaded. In live trading they are automatically updated from the broker API. In the backtest they could be updated by script. But this is normally not required, since the deviations in trade profit and margin value are negligible in comparison to bias and randomness that affect strategy performance results.

Up to 8 additional asset-specific parameters - either numbers or strings - can be entered in the asset list behind the Symbol column. They can be accessed in the script through the AssetVar/AssetStr variables, and can be used for storing additional asset or strategy specific information, for instance the minimum and maximum portfolio weights, or the trading class for options. Asset-specific strings must not have more than 7 characters.

For backtesting or trading a certain asset, make sure that historical price data files for the tested period are available, and the asset is contained in the selected asset list. US citizens are restricted in trading, as high leverage and CFDs are usually not available to them. FXCM US accounts have often only 10:1 leverage and offer only currencies; the other assets should be removed from the list when simulating US accounts. For simulating many different accounts, place several Asset...csv files in the History folder and call assetList with the desired asset file name in the script for simulating the corresponding account.

There is a special asset list named Assets.csv in the Log folder. This list is updated on every [Trade] session with the current parameters of all assets contained in the script. It's a convenient way to simulate your current broker account: just connect to the broker with a script that selects all needed assets (f.i. the Download script), then copy Assets.csv to the History folder under the name of the asset list that you use for testing. It's preferable to do this on a Monday, Tuesday, Thursday, or Friday, as on weekends most assets have an unrealistic spread and no rollover fee, and on Wednesdays the rollover is often three times as high for compensating the weekend. For permanently simulating a certain asset/account state, copy the corresponding asset line from Assets.csv into the History\AssetsFix.csv file. Zorro must be restarted when AssetsFix.csv was modified. Below you'll find examples for adding assets and downloading price data with the Download script.

The included asset list AssetsCur.csv contains about 35 currency pairs, including all pairs of the 7 major currencies EUR, USD, AUD, GBP, CHF, NZD, CAD. It can be used for trading with a multitude of currencies. For making this list permanent, copy AssetsCur.csv to AssetsFix.csv. The included asset lists AssetsIB.csv and AssetsOanda.csv contain selected assets for particular brokers.

For simulating direct market access (DMA) with no broker interference, set the parameters in the following way: Spread to a realistic bid/ask spread; Commission, RollLong, RollShort, and MarginCost to 0; Leverage to 1; LotAmount to 1; and PipCost to PIP multiplied with the price of the assets resp. the counter currency in account currency units.

The account list

By default, only a Demo and a Real account can be selected with the [Account] scrollbox. With Zorro S a list of additional accounts with extra parameters can be added through a simple spreadsheet file named Accounts.csv in the History folder. This file is a convenient way to manage many broker accounts with different logins, passwords, asset parameters, and special modes. The Accounts.csv file can be edited with Excel or a simple text editor. It contains the account info in plain comma separated spreadsheet format (example):

Asset List

An example file AccountsExample.csv is contained in the History folder as a template for creating your own Accounts.csv file. Every account in the scrollbox corresponds to a line in the file with the following parameters, separated with commas:

Name Account name (no blanks) that appears in the account scrollbox.
Broker Name of the broker (with no blanks).
Account Account ID, or 0 if only one account belongs to the login data.
User User name for the login, or 0 for manually entering the user name.
Pass Password for the login, or 0 for manually entering the password.
Assets Default name of the AssetList file used for simulations with this account (see above). Only affects the script, not the Asset scrollbox.
CCY Name of the account currency, f.i. EUR or USD.
Real Real account (1), real account with no trading (3) or demo account (0). When at 3, all trades are opened in Phantom Mode and not sent to the broker.
NFA Compliance of the account. Affects default state of NFA flag and Hedge mode: 0 for no restrictions, 2 for Hedge = 0 (no hedging), 14 or 15 for NFA = on (full NFA compliance).
Plugin Name of the broker plugin (without .dll extension).

The first line in the csv file must be the header line. Names and strings must contain no blanks and no special characters except for '-' and '_'. Zorro also accepts semicolon separated csv files with commas instead of decimal point, but not a mix of both in the same file. User names and passwords are stored in unencrypted text in the spreadsheet, so leave those parameters at 0 when other people have access to your PC. Zorro must be restarted when Accounts.csv was modified.
 

The trade log

Zorro exports events in two files in the Log folder when the LOGFILE flag is set: a *.log event log and a *.csv trade spreadsheet. The *.log file records the profit at every bar, and all events such as opening or closing a trade, adjusting a trailing stop, or a printf message by the script. If VERBOSE is set, it also records the daily state of all open trades and the daily profit or loss. Event logs have individual names composed from the strategy name and the selected asset. They are normal text files and can be opened with any text editor; their content looks like this (the trade messages are explained under Trading):

--- Monday 03.08. - daily profit +363$ ---
[GER30:EA:L7407] +684.97 / +685.0 pips
[GER30:HU:L7408] +684.97 / +685.0 pips
[UK100:EA:L7409] +580.55 / +464.4 pips
[USD/CAD:LP:S7612] +136.29 / +851.8 pips
[USD/CAD:LS:S7613] +68.15 / +851.8 pips
[AUD/USD:LP:L8412] +115.92 / +483.0 pips
[AUD/USD:MA:L1511] +42.50 / +265.6 pips
[USD/CAD:HU:S5412] +14.63 / +182.9 pips
[AUD/USD:HU:L5413] +28.05 / +175.3 pips
[GER30:EA:L7407] Trail 1@4746 Stop 4907
 
[5770: 04.08. 04:00]  6: 9289p 143/299
[GER30:EA:L7407] Trail 1@4746 Stop 4914
[AUD/USD:MA:L1511] Exit after 56 bars
[AUD/USD:MA:L1511] Exit 2@0.8406: +38.71 07:36
 
[5771: 04.08. 08:00]  6: 9773p 143/299
[GER30:EA:L7407] Trail 1@4746 Stop 4920
 
[5772: 04.08. 12:00]  6: 9773p 143/299
[GER30:EA:L7407] Trail 1@4746 Stop 4927
[USD/JPY:LP:S7308] Short 1@94.71 Risk 8
[USD/JPY:LS:S7309] Short 2@94.71 Risk 29
[USD/JPY:HU:S7310] Short 1@94.71 Risk 13

New log files always overwrite old log files of the same name. If you want to compare log files or charts generated with different parameters from the same script, rename them or copy them to a backup folder. To keep the Log folder from getting cluttered with thousands of files, Zorro automatically deletes log files that are older than 1 month (it will ask you before doing so).

Trade spreadsheets

Three exported spreadsheet files - testtrades.csv, demotrades.csv, trades.csv - contain a description of every trade in comma separated format for import in Excel™ or other spreadsheet or database programs. They can be used for evaluating trade statistics or for the tax declaration. testtrades.csv is exported in [Test] mode when LOGFILE is set. demotrades.csv and trades.csv are exported in [Trade] mode, dependent on whether a demo or real account was selected. The latter two are perpetual files, meaning that they are never overwritten, but their content is preserved and any new trade sent to the broker is just added at the end. So they will grow longer and longer until they are deleted manually or moved to a different folder. Depending on the Comma setting, numbers are exported with either a decimal comma or point, and separated with either a semicolon or a comma; this is because German spreadsheet programs require CSV data to be separated with semicolon. A trade spreadsheet (generated with Comma) looks like this:

Meaning of the fields:

Name Algo identifier (see algo), or the script name when no identifier is used.
Type Trade type, Long or Short.
Asset Traded asset.
ID Trade identifier number, also used in the broker's records.
Lots Number of lots. Multiply this with the asset's lot size (see above) to get the number of contracts.
Open Date and time when the trade was opened, in the format Day.Month.Year Hour:Minute.
Close Date and time when the trade was closed, in the format Day.Month.Year Hour:Minute.
Entry Trade entry price (Bid or Ask, dependent on the trade type). Dependent on broker plugin and SET_PATCH parameters, the price is either the open price of the trade, or the market price at trade entry, which can slightly differ.
Exit Trade exit price (Bid or Ask, dependent on the trade type). Dependent on broker plugin and SET_PATCH parameters, the price is either the close price of the trade, or the market price at trade exit, which can slightly differ.
Profit Profit or loss in units of the account currency, as returned by the broker API. Includes spread, commission, and slippage.
Rollover Interest received from or paid to the broker for keeping the trade open overnight, in units of the account currency.
ExitType Sold (by exitTrade), Reverse (by enterTrade), Stop (stop loss), Target (profit target), Time (ExitTime), Exit (by a TMF that returned 1) or Closed (externally closed in the broker platform).

Importing trade lists and other data from CSV

Sometimes you want to import further data from a spreadsheet or data file, or export price data or backtest results for further evaluation, or convert a data file into a different format. All this is possible using the file and string functions. The sscanf function can read comma separated data from a .csv file, while the strf or sprintf functions can write comma separated data into a .csv file.
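The general pattern of writing a record as a comma separated line and reading it back can be sketched in standard C with snprintf and sscanf. The Record struct and its column layout are hypothetical and chosen only for illustration; they are not the layout of the Zorro trade spreadsheets. The date uses the Day.Month.Year Hour:Minute notation.

```c
#include <assert.h>
#include <stdio.h>

typedef struct Record {
    char   asset[16];
    int    lots;
    int    day, month, year, hour, minute;
    double profit;
} Record;

/* Write one record as a CSV line. Returns 1 on success, 0 if it did not fit. */
int record_to_csv(char *out, size_t size, const Record *r)
{
    int len;
    assert(out && r);
    len = snprintf(out, size, "%s,%d,%02d.%02d.%02d %02d:%02d,%.2f\n",
                   r->asset, r->lots,
                   r->day, r->month, r->year, r->hour, r->minute,
                   r->profit);
    return len > 0 && (size_t)len < size;
}

/* Read one CSV line back. Returns 1 if all 8 fields were found. */
int record_from_csv(const char *line, Record *r)
{
    assert(line && r);
    return sscanf(line, "%15[^,],%d,%2d.%2d.%2d %2d:%2d,%lf",
                  r->asset, &r->lots,
                  &r->day, &r->month, &r->year, &r->hour, &r->minute,
                  &r->profit) == 8;
}
```

The scan set %15[^,] reads a text field up to the next comma, which also works for asset names that contain a slash.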

For reading back the CSV file format above, an example function can be found in the Simulate.c script:

string readTrade(string csv,
  string*  tAsset,
  string*  tType,
  int*    tLots,
  DATE*    tOpen,
  DATE*    tClose,
  var*    tProfit)
{
  string nextline = strstr(csv,"\n");
  if(nextline) nextline++;
  
  string separator = ",";
  
  *tLots = 0;  // no valid trade
  string s = strtok(csv,separator);
  if(!s) return nextline;
  if(s != "Long" && s != "Short")  // s = Algo name?
    s = strtok(0,separator);
  if(s != "Long" && s != "Short")  // invalid line?
    return nextline;

  *tType = s;      
  *tAsset = strtok(0,separator);
  strtok(0,separator); // ID  
  sscanf(strtok(0,separator),"%u",tLots);
  
  int Year,Month,Day,Hour,Minute;
  sscanf(strtok(0,separator),"%2u.%2u.%2u %2u:%2u",
    &Day,&Month,&Year,&Hour,&Minute);
  *tOpen = ConvertTime(Year,Month,Day,Hour,Minute);
  
  sscanf(strtok(0,separator),"%2u.%2u.%2u %2u:%2u",
    &Day,&Month,&Year,&Hour,&Minute);
  *tClose = ConvertTime(Year,Month,Day,Hour,Minute);

  strtok(NULL,separator); // Entry  
  strtok(NULL,separator); // Exit
  s = strtok(NULL,separator); // Profit
  *tProfit = strvar(s,0,0);

  return nextline;
}

Under Tips & Tricks more examples can be found for exporting data to .csv files or for importing data from an external text file, for instance to set up strategy parameters.

Exporting P&L curves

When the LOGFILE flag is set in [Test] or [Trade] mode, the equity or balance curve is exported to a ..test.csv file in the Log folder. If the BALANCE flag is set, the balance curve is exported, otherwise the equity curve. Some Zorro versions also generate a .dbl file that simply consists of a double array containing the daily balance or equity values.

When LOGFILE and a Curves file name are set in [Train] mode, the P&L curves of all optimize parameter steps are exported. The exported curves can then be evaluated for research purposes, f.i. for a White's Reality Check. All curves are attached to the end of the Curves file, so training runs from different scripts can add curves to the same file. The file can be deleted or renamed for getting rid of old curves. Each curve is stored in the following format in the file:

  1. string Name, the identifier of the curve in the form "Script_Asset_Algo_ParameterNumber_StepNumber". Example: "Workshop6_EUR/USD_TRND_2_10".
  2. int Size, the size of the subsequent Values array in bytes.
  3. double Values[], array containing the daily balance or equity values. The number of elements is Size/8.

Here's a code snippet that reads all curves from a file and prints their identifiers and end values:

byte *Content = file_content("Log\\Balance.curves");
while(*Content) 
{ 
  string Name = Content;
  Content += strlen(Name)+1; // skip the name
  int *Size = Content;
  Content += 4;     // skip the size
  var *Values = Content;
  Content += *Size; // skip the balance array
  int Num = *Size/8; // number of values
  var Profit = Values[Num-1]; // end balance
  printf("\n%s: %.2f",Name,Profit);
}
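For evaluating curve files with an external program, the record layout can also be handled in standard C. The following sketch packs one curve into a memory buffer and reads its end value back; curve_pack and curve_end_value are hypothetical helper names. It assumes a 4-byte Size field, 8-byte doubles, and the byte order of the machine that wrote the file. memcpy is used because the Size field and the values are not necessarily aligned in the buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pack one curve: name with terminating 0, 4-byte size, double values.
   Returns the number of bytes written, or 0 if the record does not fit. */
size_t curve_pack(unsigned char *buf, size_t cap, const char *name,
                  const double *values, int32_t count)
{
    size_t name_len, total;
    int32_t size;

    assert(buf && name && values && count > 0);
    name_len = strlen(name) + 1;
    size = count * (int32_t)sizeof(double);
    total = name_len + sizeof size + (size_t)size;
    if (total > cap) return 0;

    memcpy(buf, name, name_len);
    memcpy(buf + name_len, &size, sizeof size);
    memcpy(buf + name_len + sizeof size, values, (size_t)size);
    return total;
}

/* Return the last value of the curve at buf; if next is not NULL,
   it receives the address of the following record. */
double curve_end_value(const unsigned char *buf, const unsigned char **next)
{
    size_t name_len, data;
    int32_t size;
    double last;

    assert(buf);
    name_len = strlen((const char *)buf) + 1;
    memcpy(&size, buf + name_len, sizeof size);
    data = name_len + sizeof size;
    memcpy(&last, buf + data + (size_t)size - sizeof last, sizeof last);
    if (next) *next = buf + data + (size_t)size;
    return last;
}
```

Calling curve_end_value repeatedly with the returned next pointer walks through all curves in a buffer, in the same way as the loop above.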

See also:

Bars, file, string, asset parameters, assetHistory
