# Transformation and statistics functions

A library of general functions for evaluating statistical and other mathematical properties of data series (not necessarily prices).

## Correlation(vars Data1, vars Data2, int TimePeriod): var

Pearson's correlation coefficient between two data series over the given **TimePeriod**, in the range **-1..+1**. A coefficient of +1.0, a "perfect positive correlation," means that changes in **Data2** are matched by proportional changes in **Data1** in the same direction (e.g., a change in the indicator coincides with a proportional change in the asset price). A coefficient of -1.0, a "perfect negative correlation," means that the changes are proportional, but in the opposite direction. A coefficient of zero means there is no linear relationship between the two series. This function can also be used to get the autocorrelation of a series by calculating the correlation coefficient between the original series and the same series lagged by one or two bars (**series+1** or **series+2**).
## Covariance(vars Data1, vars Data2, int TimePeriod): var

Covariance between two data series. Can be used to generate a covariance matrix, for instance for the Markowitz efficient frontier calculation.
## Fisher(vars Data): var

Fisher Transform; transforms a normalized **Data** series to a normally distributed range. The return value has no theoretical limit, but most values are between **-1 .. +1**. All **Data** values must be in the **-1 .. +1** range, for instance by normalizing with the **AGC**, **Normalize**, or cdf function. The minimum **Data** length is **1**. Source available in **indicators.c**.
## FisherInv(vars Data): var

Inverse Fisher Transform; compresses the **Data** series to be between **-1** and **+1**. The minimum length of the **Data** series is **1**. Source available in **indicators.c**.
## FisherN(vars Data, int TimePeriod): var

Fisher Transform with normalizing; normalizes the **Data** series with the given **TimePeriod** and then transforms it to a normally distributed range. Similar to a **Normalize** filter (see below), but more selective due to the normal distribution of the output. The return value has no theoretical limit, but most values are in the **-1.5 .. +1.5** range. The minimum length of the **Data** series is equal to **TimePeriod**. The function internally creates series and thus must be called in a fixed order in the script. Source available in **indicators.c**.
## FractalDimension(vars Data, int TimePeriod): var

Fractal dimension of the **Data** series; normally **1..2**. Smaller values mean more 'jaggies'. Can be used to detect the current market regime or to adapt moving averages to the fluctuations of a price series. Source available in **indicators.c**.
## Gauss(vars Data, int TimePeriod): var

Gauss Filter, returns a weighted average of the data within the given time period, with the weight curve equal to the Gauss Normal Distribution. Useful for removing noise by smoothing raw data. The minimum length of the **Data** series is equal to **TimePeriod**, the lag is half the **TimePeriod**.
## HTDcPeriod(vars Data): var

Hilbert Transform - Dominant Cycle Period, developed by John Ehlers. Hilbert transform algorithms are explained in Ehlers' book "Rocket Science for Traders" (see book list). This function is equivalent to, but less accurate than, the DominantPeriod function.

## HTDcPhase(vars Data): var

Hilbert Transform - Dominant Cycle Phase.

## HTPhasor(vars Data): var

Hilbert Transform - Phasor Components. Result in **rInPhase**, **rQuadrature**.

## HTSine(vars Data): var

Hilbert Transform - SineWave. Result in **rSine**, **rLeadSine**.

## HTTrendline(vars Data): var

Hilbert Transform - Instantaneous Trendline.

## HTTrendMode(vars Data): int

Hilbert Transform trend indicator - returns **1** for Trend Mode, **0** for Cycle Mode.
## Hurst(vars Data, int TimePeriod): var

Hurst exponent of the **Data** series; between **0..1**. The Hurst exponent measures the 'memory' of a series. It quantifies the autocorrelation, i.e. the tendency either to revert to the mean (**Hurst < 0.5**) or to continue trending in a direction (**Hurst > 0.5**). This way the Hurst exponent can detect if the market is in a trending state. The **TimePeriod** window (minimum **20**) must have sufficient length to catch the long-term trend. The function internally creates a series and thus must be called in a fixed order in the script. Source available in **indicators.c**.
## Laguerre(vars Data, var alpha): var

4-element Laguerre filter. Used for smoothing data similar to an EMA, but with less lag and a wide tuning range given by the smoothing factor **alpha** (**0..1**). The low frequency components are delayed much more than the high frequency components, which enables very smooth filters with only a short amount of data. The minimum length of the **Data** series is 1, the minimum lookback period is 4. The function internally creates series and thus must be called in a fixed order in the script. Source available in **indicators.c**.
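
A 4-element Laguerre filter can be sketched as follows, based on the recursion published by John Ehlers. It is an illustration only: mapping **alpha** to Ehlers' damping factor as `gamma = 1 - alpha`, priming the elements with the first sample, and the names `laguerre_state` and `laguerre_update` are all assumptions, not taken from **indicators.c**.

```c
#include <assert.h>

/* State of a 4-element Laguerre filter; zero-initialise before first use. */
typedef struct {
    double l[4];
    int primed;   /* 0 until the first sample has been seen */
} laguerre_state;

/* Feed one sample and return the filtered value.
   alpha in (0, 1]: larger alpha = less smoothing (assumed convention). */
double laguerre_update(laguerre_state *s, double x, double alpha)
{
    assert(alpha > 0.0 && alpha <= 1.0);
    if (!s->primed) {
        /* Start all elements at the first sample to avoid a start-up ramp. */
        for (int i = 0; i < 4; i++)
            s->l[i] = x;
        s->primed = 1;
    }
    double g = 1.0 - alpha;   /* damping factor (Ehlers' gamma) */
    double p0 = s->l[0], p1 = s->l[1], p2 = s->l[2];
    double p3 = s->l[3];

    /* Each element is an all-pass stage fed by the previous one. */
    s->l[0] = alpha * x + g * p0;
    s->l[1] = -g * s->l[0] + p0 + g * p1;
    s->l[2] = -g * s->l[1] + p1 + g * p2;
    s->l[3] = -g * s->l[2] + p2 + g * p3;

    /* Weighted sum of the four elements. */
    return (s->l[0] + 2.0 * s->l[1] + 2.0 * s->l[2] + s->l[3]) / 6.0;
}
```

The filter keeps its own state between calls, which is consistent with the remark above that the function creates series internally and must be called in a fixed order.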

## LinearReg(vars Data, int TimePeriod): var

Linear Regression, also known as the "least squares method" or "best fit." Linear Regression attempts to fit a straight trendline between several data points in such a way that the distance between each data point and the trendline is minimized. For each point, the straight line over the specified previous bar period is determined in terms of **y = b + m*x**. The **LinearReg** function returns **b+m*(TimePeriod-1)**.

**LinearReg** is a TA-LIB function. Alternatively, or for higher order regression, use the polyfit / polynom functions. For logistic regression with multiple variables, use the advise(PERCEPTRON,...) function.

## LinearRegAngle(vars Data, int TimePeriod): var

Linear Regression Angle. Returns **m** converted to degrees. Due to the different x and y units of a price chart, the angle is normally of little use.

## LinearRegIntercept(vars Data, int TimePeriod): var

Linear Regression Intercept. Returns **b**.

## LinearRegSlope(vars Data, int TimePeriod): var

Linear Regression Slope. Returns **m** as data difference per bar.

## MaxVal(vars Data, int TimePeriod): var

Highest value over a specified period.

## MaxIndex(vars Data, int TimePeriod): int

Index of highest value over a specified period. **0** = highest value is at current bar, **1** = at one bar ago, and so on. If the series was shifted (+N), add the offset **N** to the returned index for getting the index of the series.
## Median(vars Data, int TimePeriod): var

Median Filter; sorts the elements of the **Data** series and returns their middle value within the given time period. Useful for removing noise spikes by eliminating extreme values. The minimum length of the **Data** series is equal to **TimePeriod**, the lag is half the **TimePeriod**. See also Percentile.
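
The sort-and-pick-the-middle step can be sketched as below. The helper name `median` is hypothetical, and averaging the two middle values for an even period is an assumption; the built-in function may handle that case differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Three-way comparison of two doubles for qsort. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of the first n elements of data; the input is left unmodified.
   For even n the two middle values are averaged (an assumption).
   Returns 0 if the temporary buffer cannot be allocated. */
double median(const double *data, size_t n)
{
    assert(n > 0);
    double *tmp = malloc(n * sizeof *tmp);
    if (!tmp)
        return 0.0;
    memcpy(tmp, data, n * sizeof *tmp);
    qsort(tmp, n, sizeof *tmp, cmp_double);
    double m = (n % 2) ? tmp[n / 2]
                       : 0.5 * (tmp[n / 2 - 1] + tmp[n / 2]);
    free(tmp);
    return m;
}
```

With the input 1, 100, 2, 3, 2 the spike of 100 has no influence on the result, which is 2.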

## MinVal(vars Data, int TimePeriod): var

Lowest value over a specified period.

## MinIndex(vars Data, int TimePeriod): int

Index of lowest value over a specified period. **0** = lowest value is at current bar, **1** = at one bar ago, and so on. If the series was shifted (+N), add the offset **N** to the returned index for getting the index of the series.

## MinMax(vars Data, int TimePeriod): var

Lowest and highest values over a specified period. Result in **rMin**, **rMax**.

## MinMaxIndex(vars Data, int TimePeriod): int

Indexes of lowest and highest values over a specified period. Result in **rMinIdx**, **rMaxIdx**. **0** = current bar, **1** = one bar ago, and so on.
## Moment(vars Data, int TimePeriod, int N): var

The statistical moment **N** (**1..4**) of the **Data** series section given by **TimePeriod**. The first moment is the mean, the second is the variance, the third is the skewness, and the fourth is the kurtosis. The standard deviation is the square root of the second moment. Source available in **indicators.c**.
## Normalize(vars Data, int TimePeriod): var

Transforms the **Data** series to the **-1...+1** range within the given **TimePeriod**. Similar to the **AGC** function, but does not differentiate between attack and decay. The minimum length of the **Data** series is equal to **TimePeriod**. Source available in **indicators.c**. See also scale.
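
One simple way to map a value to the **-1...+1** range is min-max scaling over the period, sketched below. That **Normalize** works exactly this way is an assumption; the name `normalize_minmax` is hypothetical. The array follows the series convention that index 0 is the newest value.

```c
#include <assert.h>
#include <stddef.h>

/* Rescale the newest value data[0] to -1..+1 relative to the minimum and
   maximum of the last n values (data[0] = newest, as in a series).
   Returns 0 when all n values are equal. */
double normalize_minmax(const double *data, size_t n)
{
    assert(n > 0);
    double lo = data[0], hi = data[0];
    for (size_t i = 1; i < n; i++) {
        if (data[i] < lo) lo = data[i];
        if (data[i] > hi) hi = data[i];
    }
    if (hi == lo)
        return 0.0;
    return 2.0 * (data[0] - lo) / (hi - lo) - 1.0;
}
```

A newest value equal to the period high yields +1, one equal to the period low yields -1, and one in the middle of the range yields 0.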
## NumInRange(vars Low, vars High, var Min, var Max, int Length): var

Number of data ranges, given by their **Low** and **High** values, that lie completely inside the interval from **Min** to **Max** within the given **Length**. Can be used to calculate the distribution of prices or candles. **Low** and **High** can be set to the same value for counting all values in the interval, or swapped for counting all candles that touch the interval. Returns a value of **1..Length**. Source available in **indicators.c**. See also **PercentRank**.
## NumRiseFall(vars Data, int TimePeriod): var

Length of the last streak of rising or falling values in the **Data** series, back to the given **TimePeriod**. For a rising sequence its length is returned, for a falling sequence the negative length (f.i. **-3** when **Data[3] < Data[2] > Data[1] > Data[0]**). Range: **1..TimePeriod-1**. See the **RandomWalk** script for an example. Source available in **indicators.c**.
## NumUp(vars Data, int TimePeriod, var Threshold): var

## NumDn(vars Data, int TimePeriod, var Threshold): var

Number of upwards or downwards **Data** changes by more than the given **Threshold** within the **TimePeriod**, from **0** to **TimePeriod-1**. See also **SumUp**, **SumDn**. Source code in **indicators.c**.
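
The counting logic can be sketched as follows. It assumes that a "change" is the difference between one value and the value one bar earlier, and that the array is ordered like a series (index 0 is the newest value). The names `count_up` and `count_down` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Count bar-to-bar rises larger than threshold among the last n values.
   data[0] is the newest value, data[n-1] the oldest, so n values give
   n-1 changes and the result lies in 0..n-1. */
int count_up(const double *data, size_t n, double threshold)
{
    assert(n > 0);
    int count = 0;
    for (size_t i = 0; i + 1 < n; i++)
        if (data[i] - data[i + 1] > threshold)
            count++;
    return count;
}

/* Same for bar-to-bar falls larger than threshold. */
int count_down(const double *data, size_t n, double threshold)
{
    assert(n > 0);
    int count = 0;
    for (size_t i = 0; i + 1 < n; i++)
        if (data[i + 1] - data[i] > threshold)
            count++;
    return count;
}
```

Because n values contain only n-1 changes, the upper bound of the result is one less than the period, matching the range given above.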

## Percentile(vars Data, int Length, var Percent): var

Returns the given percentile of the **Data** series with given **Length**; for instance, **Percent = 95** returns the **Data** value that is above 95% of all other values. **Percent = 50** returns the **Median** of the **Data** series.

## PercentRank(vars Data, int Length, var Value): var

The opposite of **Percentile**: Returns the percentage of **Data** values within the given **Length** that are smaller than or equal to the given **Value**; returns 100 when **Value** is the greatest in the data range. Can transform any series to a range of 0..100. See also **NumInRange**.
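
The percentage described above is a straightforward count. A minimal sketch, with the hypothetical name `percent_rank`:

```c
#include <assert.h>
#include <stddef.h>

/* Percentage (0..100) of the first n values that are <= value. */
double percent_rank(const double *data, size_t n, double value)
{
    assert(n > 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (data[i] <= value)
            count++;
    return 100.0 * (double)count / (double)n;
}
```

For the data 1, 2, 3, 4 the value 4 ranks at 100 and the value 2 at 50.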
## ShannonGain(vars Data, int TimePeriod): var

Expected logarithmic gain rate of the **Data** series in the range of about **+/-0.0005**. The gain rate is derived from the Shannon probability **P = (1 + Mean(Gain) / RootMeanSquare(Gain)) / 2**, which is the likelihood of a rise or fall of a high entropy data series in the next bar period. A positive gain rate indicates that the series is more likely to rise, a negative gain rate indicates that it is more likely to fall. The zero crossover could be used for a trade signal. Algorithm by **John Conover**. Source available in **indicators.c**.
## ShannonEntropy(vars Data, int Length, int PatternSize): var

Entropy of patterns in the **Data** series, in bits; can be used to determine the 'randomness' of the data. **PatternSize** (2..8) determines the partitioning of the data into patterns of up to 8 bits. Each **Data** value is either higher than the previous value, or it is not; this is binary information and constitutes one bit of the pattern. The more randomly the patterns are distributed, the higher the Shannon entropy. Totally random data has a Shannon entropy identical to the pattern size. Algorithm explained on the Financial Hacker blog; source available in **indicators.c**.
## Spearman(vars Data, int TimePeriod): var

Spearman's rank correlation coefficient; correlation between the original **Data** series and the same series sorted in ascending order within **TimePeriod** (**1..256**). Returns the similarity to a steadily rising series and can be used to determine trend intensity and turning points. Range **= -1..+1**, lag = **TimePeriod/2**. For usage and details, see Stocks & Commodities magazine 2/2011. Source available in **indicators.c**.

## StdDev(vars Data, int TimePeriod): var

Standard Deviation of the **Data** series in the time period, from the **TA-Lib**; accuracy = **0.0001**. Use the square root of the second **Moment** instead when high accuracy or long time periods are required.

## Sum(vars Data, int TimePeriod): var

Sum of all **Data** elements in the time period.
## SumUp(vars Data, int TimePeriod): var

## SumDn(vars Data, int TimePeriod): var

Sum of all upwards or downwards **Data** changes within the **TimePeriod**. See also **NumUp**, **NumDn**. Source code in **indicators.c**.

## TSF(vars Data, int TimePeriod): var

Time Series Forecast. Returns **b + m*(TimePeriod)**, the Linear Regression forecast for the next bar.

## Variance(vars Data, int TimePeriod): var

Variance of the **Data** series in the time period, from the **TA-Lib**; accuracy = **0.0001**. Use the second **Moment** instead when high accuracy or long time periods are required.

### Standard parameters:

| Parameter | Description |
| --- | --- |
| **TimePeriod** | The number of bars for the time period of the function, if any; or **0** for using a default period. |
| **Length** | The length of the **Data** series. |
| **Data** | A data series, often directly derived from the price functions **price(), priceClose()** etc. Alternatively a user created series or any other double float array with the given minimum length can be used. If not mentioned otherwise, the minimum length of the **Data** series is **TimePeriod**. Some functions require a second data array **Data2**. |

### Usage example:

**Volatility(Prices,20)** calculates the standard volatility of a daily price series over the last 20 days.
### Remarks:

- The **TA-Lib** function prototypes are defined in **include\ta.h**. Information about the usage and the indicator algorithms can be found online at www.tadoc.org. The C source code of all included TA-Lib indicators is contained in **Source\ta_lib.zip** and can be studied for examining the algorithms. Some TA-Lib indicators that originally didn't work properly, such as Correlation or SAR, have been replaced by working versions. The C source code of most additional indicators that are not part of the TA-Lib is contained in **Source\indicators.c**.
- All TA functions are applied on series and normally do not accept other data arrays. In the INITRUN, all TA functions return **0**, and LookBack is automatically increased to the largest lookback time required by a TA function.
- TimeFrame affects subsequent data series and thus also affects all indicators that use the data series as input. The **TimePeriod** is then not in bar units, but in time frame units. **TimeFrame** has no effect on indicators that do not use data series.

### Examples:

    // plot some indicators
    function run()
    {
      set(PLOTNOW);
      var* Price = series(price());

      // plot Bollinger bands
      BBands(Price,30,2,2,MAType_SMA);
      plot("Bollinger1",rRealUpperBand,BAND1,0x00CC00);
      plot("Bollinger2",rRealLowerBand,BAND2,0xCC00FF00);
      plot("SAR",SAR(0.02,0.02,0.2),DOT,RED);
      ZigZag(Price,20*PIP,5,BLUE);

      // plot some other indicators
      plot("ATR (PIP)",ATR(20)/PIP,NEW,RED);
      plot("Doji",CDLDoji(),NEW+BARS,BLUE);
      plot("FractalDim",FractalDimension(Price,30),NEW,RED);
      plot("ShannonGain",ShannonGain(Price,40),NEW,RED);
    }

### See also:

Spectral filters, indicators, normalization, candle patterns, machine learning
