**BIS Working Papers **

**No 689**

**Estimating unknown arbitrage costs: evidence from a three-regime threshold vector error correction model **

by Kristyna Ters and Jörg Urban

**Monetary and Economic Department **

January 2018

JEL classification: G12, G14 and G15

Keywords: Transaction cost, arbitrage, basis, threshold, regime switch, intraday, nonlinear, non-stationary, error correction

BIS Working Papers are written by members of the Monetary and Economic Department of the Bank for International Settlements, and from time to time by other economists, and are published by the Bank. The papers are on subjects of topical interest and are technical in character. The views expressed in them are those of their authors and not necessarily the views of the BIS.

This publication is available on the BIS website (www.bis.org).

*© Bank for International Settlements 2017. All rights reserved. Brief excerpts may be reproduced or translated provided the source is stated.*

ISSN 1020-0959 (print)

ISSN 1682-7678 (online)

**Estimating unknown arbitrage costs: Evidence from a 3-regime threshold vector error correction model**

Kristyna Ters† J¨org Urban

**Abstract **

We present a methodology for estimating a 3-regime threshold vector error correction model (TVECM) with an unknown cointegrating vector based on a new dynamic grid evaluation. This model is particularly suited to estimating deviations from parity conditions such as unknown arbitrage costs in markets with a persistent non-zero basis between two similar financial market instruments traded in the spot and the derivative markets. Our proposed 3-regime TVECM can estimate the area where arbitrageurs have no incentives for trading. Only when the basis exceeds a critical threshold, where the potential gain from the basis trade exceeds the overall transaction costs, do we expect arbitrageurs to step in and carry out the respective trade. This leads to non-linear adjustment dynamics and regimes with different characteristics. The overall transaction costs for the basis trades can be inferred from the estimated no-arbitrage regime. Our methodology allows us to quantify overall transaction costs for an arbitrage trade in markets where trading costs are opaque or unknown, as in credit risk or index arbitrage trading. The key contributions of this paper are the further development of the 2-threshold VECM, together with the numerical evaluation of the model through numerous simulations to prove its robustness. We present two short applications of the model in arbitrage trades in the palladium market and index trading for the S&P 500.

JEL classification: G12, G14 and G15.

Keywords: Transaction cost, arbitrage, basis, threshold, regime switch, intraday, nonlinear, non-stationary, error correction.

The authors thank Darrell Duffie, Leonardo Gambacorta, Peter H¨ordahl, Erik Theissen and OTC markets seminar participants at the SNB Gerzensee Study Center, the Bank for International Settlements, and the EFMA 2016 conference for useful discussions and comments. J¨org Urban developed this paper while he was working at the BIS. The views expressed in this paper do not necessarily reflect those of the Bank for International Settlements. The research leading to these results has received funding from the Swiss National Science Foundation under career grant agreement no.: PMPDP1-158369.

- University of Basel, Basel, Switzerland. email: [email protected]
- University of Basel, Basel, Switzerland. email: [email protected] (corresponding author)

**1 Introduction**

We present a methodology for estimating three-regime vector error correction models with two thresholds and an unknown cointegrating vector. Our proposed methodology is particularly suited to modelling arbitrage in markets with frictions and allows us to estimate unknown transaction costs for such trades.

The theoretical no-arbitrage condition between two similar financial market instruments traded in the spot and derivative markets, is a cornerstone for the empirical research on economic parity relationships. The no-arbitrage condition requires that the pricing in the spot market must be equal to the derivative market. If not, any pricing discrepancy would present investors with an arbitrage opportunity which will disappear rapidly, as arbitrageurs will exploit any mispricing. This mispricing is measured by the so-called basis. For the arbitrage condition to hold, markets must be perfect and frictionless. In practice, however, frictions and imperfections often make such arbitrage trades difficult and costly to varying degrees. These imperfections include, amongst others, limited and time-varying liquidity across market segments, unavailability of instruments with identical maturity and payout structures, and the fact that some arbitrage trades require large amounts of capital to be tied up for extended periods of time.

The basis trading strategy, in which an arbitrageur believes that two similar financial market instruments are mispriced relative to each other, aims to take opposing long and short positions in these two securities in order to make a gain on the convergence of their values. In the case of a positive basis, arbitrageurs will bet on a weakening basis (short basis position) and in the case of a negative basis, arbitrageurs bet on a strengthening basis (long basis position). There exists no universal definition of the basis, and different definitions are more commonly used in different markets. In credit risk markets the basis is defined as derivatives minus spot price (Gyntelberg et al.; 2013) or more specifically as the CDS spread minus the spread on a par risky fixed-rate bond over the risk-free rate. Lien and Yang (2008) define the basis as the difference between spot and future prices in their application in commodity markets. Fama and French (1987) and McMillan (2005), on the other hand, define the basis as future minus spot prices. Our proposed methodology is independent of the chosen specification.

A substantial part of the transaction costs of an arbitrage transaction is unknown when the trade is initiated, making the arbitrage trade risky. For index trades, for example, Sutcliffe (2006) states that this risk can occur because the bid-ask spread and brokers’ commission, when unwinding the spot position at delivery, vary with the value of the index basket and that there may be a transaction tax which varies in proportion to the index (Sutcliffe; 2006). Adams and van Deventer (1993) suggest, that in the case of unknown arbitrage costs, traders should depart from the usual one-to-one ratio for the size of the spot and futures positions. In order to eliminate the transaction cost risk, while buying shares and selling futures, the arbitrageur should buy, for every one futures contract sold, 1/(1 — p) index baskets. *p* is the proportion of the value of the index basket that must be paid in transaction costs at delivery. When selling shares and buying futures, the arbitrageurs should sell 1/(1 + p) index baskets for every futures contract bought. Adams and van Deventer (1993) state that this will remove the transaction cost risk from the arbitrage trade. However, they do not propose a methodology that can estimate the unknown transaction costs of such trades.

As a result of existing transaction costs on arbitrage trades, the difference between the prices in the spot and derivatives market for two similar financial market instruments, the basis, is typically not zero. Moreover, the basis can become sizeable and persistent in times of market stress. Finally, when entering into a basis trade, the arbitrageur is exposed to the risk that the trade will move in the wrong direction.

A persistent non-zero basis is therefore likely to reflect the unwillingness of arbitrageurs to try to exploit it, unless the pricing mismatch is greater than the cost of undertaking the arbitrage trade. Empirically, we would therefore expect to see such arbitrage forces intensifying as the magnitude of the basis exceeds some level that reflects the costs that traders face in the market. This suggests that the adjustment process towards the long-run equilibrium is non-linear, in that it differs depending on the level of the basis.

Some research specifically aims to estimate the effect of transaction costs on arbitrage such as Stevens (2015) in the market for crude oil. Stevens (2015) finds that transaction costs increase the persistence of the basis in the market for crude oil. He explains the non-zero basis by the absence of arbitrage. Forbes et al. (1999) investigate index futures arbitrage for the S&P500 stock index and the nearest-to-delivery futures contract and find significant transaction costs that prevent arbitrage. Forbes et al. (1999) also find clear indications for arbitrage trading when the basis breaks out beyond a threshold. However, both Forbes et al. (1999) and Stevens (2015) employ a univariate structural change test to the cointegrating residual based on Tsay (1989). This approach would, however, only be valid when the cointegrating vector is known a priori. Neither of the above-mentioned papers provide a solution for this problem. Forbes et al. (1999) also state in their conclusion, that the problem of an unknown cointegrating relationship in multiple threshold error correction models has not yet been resolved.

Hansen and Seo (2002) provide a methodology for estimating 2-regime threshold vector error correction (TVECM) models with an unknown cointegrating relationship. However, they do not provide a solution for the case beyond two regimes. An extension to a 3-regime TVECM with two thresholds is important from an economic viewpoint, as it allows us to test for positive and negative deviations from theoretical parity relationships. For example it allows to test for the existence of transaction costs in positive and negative basis trading strategies. Also, the model setup as proposed by Hansen and Seo (2002) is not ideal for economic and financial market problems, where often a significant deviation from the theoretical parity relationship exists, because they did not account for a persistent deviation in their long-term equilibrium condition.

We contribute to the existing literature by developing an estimation procedure for threshold error correction models with three regimes (two thresholds) and an unknown cointegrating vector which is especially suited to modelling arbitrage in markets with frictions and unknown transaction costs. The estimation of an unknown cointegrating vector is particularly important for distorted parity relationships such as in financial markets and economic applications that exhibit a significant non-zero deviation from the theoretical parity relationship. Our proposed model allows for non-linear adjustment of prices, in derivative and spot markets, towards the long-run equilibrium. Consequently, we can estimate the region where arbitrageurs step into the market as the trading opportunity is ‘sufficiently profitable’ for both positive and negative basis trades. The overall transaction costs can be inferred from the no-arbitrage regime.

The rest of the paper is structured as follows: Section 2 discusses the setup and estimation of the TVECM. Section 3 provides some empirical applications for the illustration of our methodology and Section 4 concludes.

**2 3-regime threshold vector error correction model**

Threshold cointegration was introduced by Balke and Fomby (1997) as a feasible mean to combine regime switches and cointegration. In the case of a persistent non-zero basis between the spot and the derivative market we expect to see that if the basis is lower than the cost of undertaking an arbitrage trade, arbitrageurs have no incentives to carry out the trade. Only when the deviation from the long-term equilibrium exceeds a critical threshold, such that the expected profit exceeds the costs, will economic agents act to move the basis back towards its long-term equilibrium. As a result, adjustments to the long-term equilibrium are likely to be regime-dependent, with a relatively weak adjustment mechanism in the regime where arbitrageurs have no incentive for trading as the overall transaction costs exceed the expected profit from the arbitrage trade. By extending the linear VECM approach to a threshold vector error correction model (TVECM) we can model these non-linearities in the adjustment dynamics.

The TVECM approach extends the VECM by allowing the behaviour of price quotes for spots St and derivatives Dt for a specific reference entity or underlying to depend on the state of the system. One can formulate a general TVECM with l regimes, ie with l — 1

with the error correction term ec_{t-1}(b_{0},b_{1}) = (S — b_{1}D — b_{0})_{t-1}. The indicator function I(9^{j-i} < ec_{t-1}(b_{0},b_{1}) < 9^{j}) is 1 if the error correction term ec_{t-1}(b_{0},b_{1}) is in the interval [9^{j-i}, 9^{j}) and otherwise 0. Further, by definition the threshold 9^{0} is —to and *9 ^{l}* is to in Equation (1). We focus on the bi-variate case where Ay

Equation (1) constitutes a vector autoregressive model in first-order differences with r^{j} (L) = fc=_{1} L^{k} and *L* as lag operator, m as the number of VAR lags and an additional error correction term A^{j}ec_{t-1}(b_{0}, b_{1}). = (A ^2)^{T} and the lagged VAR terms are regime-dependent conditioned on the state of the error correction term ec_{t}(b_{0},b_{1}). The speed of adjustment parameters characterize to what extent the price changes in Ay_{t} = (AS_{t} AD_{t})^{T} react to deviations from the long-term equilibrium.

The Schwarz (Bayesian) information criterion (SIC) is used to determine the VAR order. Liitkepohl (2006) states that in large samples for multivariate models when T ^ to only the SIC criterion is seen to be strongly consistent for any *K*-variate system.

The error correction term represents the long-term equilibrium of the two time series which has to be an AR(1) process by construction (Johansen; 1988). The VAR-term represents the short-run dynamics coming from market imperfections (Baillie et al.; 2002).

Contrary to the 1-threshold TVECM proposed by Hansen and Seo (2002) we introduce an intercept b_{0} in the error correction term which denotes the deviation from the theoretical parity relationship (long-term equilibrium). This is motivated by our no-arbitrage discussion in Section 1 as the local constant b0 represents the persistent non-zero basis. In frictionless markets (without deviations from the theoretical parity relationship), the error correction term in Equation (1) is equal to the observed basis (S — D)_{t}, with b_{0} = 0 and b1 = 1.

The transaction costs for a basis trade prevent a complete adjustment towards a zero basis. As such, in markets with frictions there may be a neutral band between the derivative and the spot market in which the error correction term in Equation (1) may fluctuate without incentives for market participants to switch funds between the spot and derivatives market. Outside of that neutral band there might however be strong incentives for market participants to switch funds, which results in an adjustment towards the long-term equilibrium. We expect to find the speed of adjustment parameters to indicate that arbitrageurs engaging in St — Dt basis trades as soon as the basis exceeds a threshold. In a market with a positive basis (S_{t} > D_{t}), arbitrageurs bet on a declining basis and will go short in the spot market and go long in the derivative market. In case of a negative basis (S_{t} < D_{t}), arbitrageurs bet on an increasing basis while carrying out the reverse trade.

According to arbitrage theory we would in general expect to find a 3-regime TVECM with two thresholds when the basis fluctuates between positive and negative figures with sizeable and persistent deviations from zero. The lower regime is defined as ec_{t-1}(b_{0}, b_{1}) < 9^{1}, the middle regime as 9^{1} < ec_{t-1}(b_{0},b_{1}) < 9^{2}, and the upper regime is defined as 9^{2} < ec_{t-1}(b_{0},b_{1}). The middle regime is the neutral band where no arbitrage trading occurs while the outer regimes (lower and upper) are the arbitrage regimes. There may also be certain markets or time periods with only a persistent positive basis. In that case we expect to find at most two regimes (l = 2) with only one threshold 9^{1}. The lower regime (neutral regime) is defined as ec_{t-1}(b_{0},b_{1}) < 9^{1}, and the upper regime as 9^{1} < ec_{t-1}(b_{0}, b_{1}). The regimes are reversed in case of a persistent negative basis market.

Therefore, we will discuss three classes of nested models: model class H_{1} is a 1-regime VECM, with no statistical significant threshold, ie where markets are efficient enough to not allow the basis to deviate too far from zero. In markets described by model class H_{1} we have no market frictions such as transaction costs, hence arbitrageurs will step in as soon as the basis deviates from zero. Model class H_{2} is a 2-regime TVECM and model class H_{3} is a 3-regime TVECM.

The threshold 9^{j} is computed relative to the estimated basis S — b_{1}D — b_{0}. Therefore, transaction costs, which are defined relative to the observed/real basis S — D, are the sum 9^{j} + b_{0}. In the case of 9^{j} + b_{0} < 0, we have transaction costs for an arbitrage trade on basis strengthening and in the case of 9^{j} + b_{0} > 0 we have transaction costs for an arbitrage trade on basis weakening. In the case of 9^{j} + b_{0} < 0, the transaction costs have to be interpreted in absolute terms.

From Equation (1), the VECM (model class *Hi)* can be rewritten as:

All parameters are allowed to switch between the regimes except for P^ Following Hansen and Seo (2002) we the model classes by imposing the following additional constraint for each regime:

where n_{0} > 0 is a trimming parameter and P is the share of observations in each regime. This constraint allows us to identify a threshold effect only if the share of observations in each regime is large enough, ie is greater than n_{0}. If this condition is not met the model H_{i} reduces to H_{i-1} for *i >* 2 (eg from to H_{2}). Andrews (1993) argues that setting n_{0} between 0.05 and 0.15 are typically good choices. We chose as a baseline setup n_{0} = 0.1, but perform robustness tests also for n_{0} = 0.05 and n_{0} = 0.15.

**2.1 Estimating the model**

The most important statistical issue for threshold models is estimating the unknown threshold(s) and test for their significance. Balke and Fomby (1997) suggest transforming the TVECM into a univariate arranged autoregression, while Tsay (1989) reformulates the problem into a univariate structural change test to the cointegrating residual. However, these approaches are valid only in the univariate case when the cointegrating vector is known.

We contribute to the existing literature by examining the case of an unknown coin- tegrating vector for a 3-regime TVECM. We implement, as Hansen and Seo (2002) for the 2-regime TVECM, a maximum likelihood estimation of a bivariate TVECM with the assumption of i.i.d. Gaussian error terms. The likelihood function to be maximized for a l regime model takes the form:

with E = *E(e _{t}* e

Hansen and Seo (2002) suggest, that it is computationally convenient to hold P0 and Pi as well as e^{j} fixed and compute the concentrated maximum likelihood estimations for Ai, Ti and E. Due to the linearity of the model, this is simply an OLS regression. As shown in Hansen and Seo (2002) the concentrated likelihood function for p_{0}, p_{1} and thresholds e^{j} for a l regime model is:

where again j = 1 to l — 1, i = 1 to l in a l regime case and all variables with a hat are OLS estimators. We consider the bi-variate case with p = 2. The remaining task of finding the maximum likelihood estimation of P0, P1 and the thresholds is therefore to minimize ln|E(P_{0}, P_{1}, e^{j})|, subject to the constraint in Equation (6).

Unfortunately, the function in Equation (8) is not smooth (see for example the left- hand panel of Figure 2), hence conventional hill-climbing algorithms cannot be used to find the minimum. Therefore, Hansen and Seo (2002) suggest a joint grid search. We present evidence of the good performance of the proposed grid search in Section 2.3.

Two issues remain to be discussed with respect to the parameter estimation. The 2- regime model used by Hansen and Seo (2002) requires a two-dimensional grid search over (P_{1}, e^{1}). Our proposed model requires a search over a three dimensional grid (P_{0}, P_{1}, e^{1}) in the 2-regime case and a four-dimensional grid search (P_{0}, P_{1}, e^{1}, e^{2}) in the 3-regime case. The 3-regime model requires the evaluation of Equation (8) at 100 million grid points, if we evaluate each variable at 100 grid points. In order to keep the computation feasible, we suggest a sequential threshold search as it was proven to be consistent by Bai (1997) and Bai and Perron (1998).

A standard sequential search would require 2 x 100^{3} grid point evaluations for the two thresholds, ie 100^{3} for p_{0}, p_{1} and e^{1} during the first grid search with 100 grid points each and then another 100^{3} grid point evaluations for p_{0}, p_{1} and e^{2} during the second search with 100 grid points each. We will however show that it is efficient to fix p_{1} in the second threshold search to the value found in the first threshold search and hence reduce the second search to a two-dimensional space. This reduces the computational burden dramatically to 100^{3} grid points for the first search and 100^{2} grid points for the second search. We will discuss and justify this proposal in a comprehensive simulation exercise in Section 2.3. In the same section we will also show that we cannot fix p_{0} in the threshold search for e^{2}, unlike P_{1}, as the p_{0} estimate suffers from a large uncertainty.

The second remaining issue to be addressed is the setup of the “correct” search area for each parameter. The search region [e^, e^] for the thresholds is straightforward as it must be identical to the interval [min(ec_{t}(p_{0}, P_{1})), max(ec_{t}(p_{0}, P_{1}))] given by the error correction term for P0 and P1 . The region for the P0 and P1 parameters can be calibrated based on the estimates of the linear VECM model and the theoretical values P1 = 1 and P_{0} = 0, which would constitute the observed basis (S — P_{1}D *—* P_{0})_{t} = (S — D)_{t}. It is important to keep the search area for P0 and P1 large enough to include the minimum, but not too large to reduce the precision of the grid search. The grid search for 9^{2} will be reduced by the constraint 9^{2} + b_{0} > 0 if the first search resulted in 9^{1} + b_{0} < 0 and vice versa if 9^{1} + b_{0} > 0. This is purely based on our previous arbitrage discussion, where we expect to find at most two transaction costs, one for a positive basis trade and one for a negative basis trade.

Obviously, the precision of the estimated parameters will depend on the distance between two neighbouring grid points. We will test in Section 2.3 various grid sizes, such as 10, 50 or 100 grid points for each search dimension as well as a dynamic grid setting. The dynamic grid setting has no fixed number of grid points, but a precision parameter fixes the distance between two neighbouring grid points. For our simulation exercise we have chosen Ab_{0} = 0.5, Ab_{1} = 0.01 and A9^{j} = 0.5. This choice is motivated by the applications in Section 3 and the data generating process used in Section 2.3, which restrains our time series to become larger in absolute terms than 500. That means, our basis can become as large as ± 1,000 in extreme cases, leading to approximately 4,000 grid points for b_{0} and 9^{j}. bi is in a range of around 0 and 10, leading to maximal 1,000 grid points based on Ab_{1} = 0.01. The dynamic grid setting is computationally more expensive but we will show that, as expected, it yields the best results.

The next step is to determine whether the estimated thresholds 9^{j} are statistically significant. As the model class H_{1} is nested in H_{2} and H_{2} is nested in class H_{3}, we start with the discussion of a 1-threshold model H_{2}. In that case, under the null hypothesis, there is no threshold, so the model reduces to a conventional linear VECM model class H_{1} where A2 = A2 = A^{1} and ^(L) = T^L) = ri(L). The 1-threshold TVECM model class H_{2} is detected under the alternative hypothesis with A2 = A2 under the constraint in Equation (6). As the models are linear, a regular LM test with an asymptotic x^{2}(N)-distribution can be calculated from a linear regression on the model in Equation (4). However, the LM test statistic can only be applied if bi and the threshold variable 9^{1} are known a priori (Hansen and Seo; 2002). While the point estimates of b_{0} and b_{1} under the null hypothesis are b_{0} and bi from the linear model, there is no estimate of 9^{1} under the null hypothesis. This implies that there is no distribution theory for the parameter estimates and no conventionally defined LM-like statistic.

We follow Hansen and Seo (2002) who derive the LM-like statistic for the one threshold case. We test for a linear VECM H_{1} under the null and a 1-threshold model H_{2} under the alternative hypothesis. Our models H_{1} and H_{2} are defined in Equations (3) and (4).

where o denotes elementwise multiplication. Y* contains only non-zero entries if 9^{i-i} < ec_{t-}i < 9*. For the here considered 2-regime or 1-threshold TVECM model class H_{2} we have i e {1, 2} with 9^{0} = —to and 9^{2} = to.

£* is defined as e_{t} C Yt_{-1}(b_{0}, b_{1}) ◦ d_{t}(b_{0}, b_{1}, 9^{i-i}, 9*), with e_{t} is the OLS estimate of the residual vector from the linear model and C is the Kronecker product.

with b* being the point estimates obtained under the null hypothesis (linear VECM, model class H_{1}).

According to the constraint in Equation (6) we set the search region [9_{L}, 9^] such that 9_{l} is the n_{0} percentile of the error correction term, and 9^ is the (1 — n_{0}) percentile. This grid evaluation over [9l,9u] is necessary to implement the maximisation defined in Equation (16) as the function LM(b_{0}, bi, 9^{1}) is non-differentiable in 9^{1} and hence conventional hill-climbing algorithm cannot be used to find the extremum.

The value of 9^{1} which maximizes Equation (16) is different from the MLE 9^{1} in Section 2.1, as Equation (16) are LM tests that are based on parameter estimates obtained under the null hypothesis, ie H_{1}. Also, the test statistic is calculated with HAC-consistent covariance matrix estimates which leads to differing estimates compared to the estimate in Section 2.1 (see also the discussion in Hansen and Seo (2002)).

For three sample simulations we present the maximum likelihood function and the supremum LM estimator in Figure 2, as well as the corresponding estimators of b0 and bi used to compute the functions presented there. The left-hand side of Figure 2 shows the bi estimates under the alternative hypothesis, ie in this case the TVECM (class H_{2}), and therefore the minimum is very close to the theoretical threshold. The right-hand side shows the bi estimates obtained from the linear VECM (Hi) estimation, which are far off from the values used in the simulation (b_{0} = 10, bi = 1.1), and therefore the maxima of the LM estimator are also far from the theoretical threshold, which is 3.

This issue was also discussed by Hansen and Seo (2002). The displayed difference in the estimated thresholds is generic and not special to threshold cointegration. It is important to stress that the supremum LM test is performed to compute the critical value, which is later used to decide if the computed threshold is statistically significant. The supremum LM test is not used to estimated the model parameters.

Figure 2: Maximum likelihood estimator versus supremum LM estimator

Just like in the one threshold case (H_{2}), the next step is to determine whether the estimated 2-threshold TVECM (H_{3}) is statistically significant. Under the null hypothesis, there is one threshold 0^{1}, so the model reduces to a TVECM with two regimes (H_{2}) described in Section 2.1 with A| = A3 = A|. The 2-threshold TVECM is detected under the alternative hypothesis H_{3} with A| = A3 = A3. The constraint in Equation (6) is again applied. To be explicit, now the null hypothesis is the model H_{2}:

Following the same steps and arguments discussed before we can define the following LM-like statistics:

with the point estimates bi and 9 ^{1} obtained under the null (1-threshold TVECM). There is no point estimate of 9^{2} under the null hypothesis. We perform again a grid search with the search region for 9^{2} subject to the constraint in Equation (6). Furthermore, based on our arbitrage discussion in Section 1 and the assumption, that we have at most one positive (basis weakening trade) and one negative (basis strengthening trade) transaction cost, we further impose for the grid search of 9^{2} and b_{0} the constraints that 9^{2} + b_{0} > 0 if the first search resulted in transaction costs 9^{1} + b_{0} < 0 and 9^{2} + b_{0} < 0 if the first search resulted in transaction costs 9^{1} + b_{0} > 0.

As there is no formal distribution theory in the case under discussion we follow the proposition by Hansen and Seo (2002) and perform two different bootstrap methodologies in order to estimate the asymptotic distribution for our model specification in Equation (1).

**2.2.1 Fixed regressor bootstrap**

We implement a non-parametric bootstrap of the residuals, called the ’’fixed regressor bootstrap”, which resamples (Monte-Carlo) the residuals from the estimated linear VECM or 1-threshold TVECM, in the case of the threshold search for 9^{1} or 9^{2}, respectively.

We follow the discussion of Hansen and Seo (2002), Hansen (2000) and Hansen (1996). We take estimates under the null hypothesis of b_{i}, denoted as b, and define e~c_{t-1} = ec_{t-1}(b_{i}) and Yt_{-1} = Y_{t-}i(bi), whereby ec_{t-1} denotes the error correction term (see definition below Equation (2)) and Y_{t-1} is defined in Equation (14). Further, 9_{t} are the residuals of the null. The name ’fixed regressor bootstrap” conveys the message that b, 9_{t}, ec_{t-1} and Yt_{-1} are kept fixed at their sample values.

Next, we compute a large number of times (eg 1,000) y_{bt} = e_{t}e_{bt}, whereby e_{bt} is i.i.d. N(0,1) and in each draw e_{bt} is independently chosen. For each draw (identified by the index b), we perform an LM test. e_{bt} is computed by regressing y_{bt} on Y_{t-1}. A,(b_{i}, 9^{1})_{b} (see Equation (10) for the 1-threshold case) and e_{bt}(b_{i}, 9^{1}) are computed by regressing y_{bt} on Yt_{-1}d_{1}_{t}(b_{i}, 9^{1}) and Yt_{-1}d_{2}_{t}(b_{i}, 9^{1}), whereby ^{9} is kept fixed. Further, for each draw b we compute Vj(b_{i}, 9^{1})_{b} (see Equation (11) for the 1-threshold TVECM), whereby ^{9} is kept fixed again. Similar to Equations (9) and (16) we compute for each draw b:

where ^{A}2_{b} and are functions of the fixed ^{9} and 9^{1}.

Hansen (1996) has shown that SupLM* is a valid first-order approximation to the asymptotic null distribution of SupLM. Despite having the computational cost of a bootstrap, it only approximates the asymptotic distribution.

The p-value is calculated as the percentage of SupLM^{*} values which exceed the actual SupLM value of the original time series.

**2.2.2 Residual bootstrap**

This method is fully parametric with respect to the data generating process, that means for the one threshold case we use the complete specification for the null as given by H_{2} (single regime VECM versus 1-threshold TVECM as alternative) and for the 2-threshold TVECM we use the complete specification for the null as given by H_{2} (1-threshold TVECM versus 2-threshold TVECM as alternative). We further assume to be i.i.d. from an unknown distribution and fixed initial conditions. To be specific, random draws are made from 9, which are the residuals under the null. Using the given initial conditions from the data, and the parameters estimated under the null (V, bo, ^{9} and r) we recursively generate the bivariate vector series for the given model (H_{2} or H_{2}). For each draw the SupLM* value is computed and then again the percentage of the SupLM^{*} values which exceed the actual SupLM value (computed from the original time series) gives the p-value.

Hansen and Seo (2002) conjecture that this bootstrap method gives better finite sample performance at the computational cost of being fully parametric with respect to the data generating process.

**2.3 Simulation**

We perform numerous simulations in order to test the power of the proposed LM tests using different data generating processes for the VECM and TVECM model specifications from Equation (1), using ec_{t-1}(b_{0},b_{1}) = (S — AD — b_{0})_{t-1}. In particular, we aim to test the power of the two different bootstrap methodologies, the fixed regressor bootstrap

Figure 3: Null versus alternative hypothesis in the sequential search for *9 ^{l}* and

The fixed regressor bootstrap and the residual bootstrap test assume that there is no threshold in the threshold search for *9 ^{1},* ie for the first threshold search the null hypothesis is a VECM (Hi). For the threshold search for

In this subsection we use the following notation: / and *6 ^{%}* are the parameters fixed in the data generating process, while / and

The aim of the simulation is to understand how precise we can estimate / and *9*^{%}, as well as to understand the power of the fixed regressor and the residual bootstrap.

We begin our simulation to test the proposed methodology for the 2-regime TVECM H_{2} as specified in Equation (4). The aim of the simulations is to understand how precise we can estimate /0, */ _{l}* and

we do not enforce these restrictions. We label this simulation in the following tables as ’unrestricted” as opposed to the ’restricted” cases. We investigate different time series lengths, however our baseline sample size is 1,000 periods. The grid search is performed using an equidistant grid size of 10, 50 and 100 as well as a dynamic grid setting. The dynamic grid setting is determined by a minimum distance, between two individual grid points. The dynamic setting is of course potentially very expensive, as it may lead to a large number of grid points. In our simulation exercise, we have chosen 0.01 as the distance between two grid points for the b

All parameters in the data generating process given in Equation (4) are generated with a random number generator, except for b_{0}, bi and the threshold 9^{1} (in case of a threshold model). The means of the estimates b_{0}, bi and 9^{1} from the 1,000 simulations are then compared to the values b_{0}, b_{1} and 9^{1} chosen in the data generating process. We have fixed the VAR lag in the data generating process to one. Relaxing this restriction to more VAR lags is straight forward and yields the same results, however with lower precision due to the larger number of parameters to be estimated.

The detailed results of the 1-threshold TVECM model specification are presented in Table 1. Table 1 yields two immediate observations, the mean of bi is practically independent of the grid setup and the outcomes of the bootstrap methodologies are also independent. The fixed regressor and residual bootstrap have very small b-errors, below 2%. As expected the standard deviation of A is getting larger for coarser grid settings. In other words, we find a clear dependence of the quality of the estimation precision on the grid size. The dynamic grid setting shows the best performance. Independent of the grid size, the mean values of b_{0} and 9^{1} from our 1,000 simulations are imprecise, ie far-off from their theoretical values, fixed in the data generating process. No relationship between the grid setup and the precision can be inferred.

However and most importantly, if we compute the sum of /_{0} and *d*^{1} for each simulation and compute the average and standard deviation, we find that the results of the point estimate of the sum /_{0} + *d ^{1},* which are the estimated transaction costs, have a much higher estimation precision compared to the point estimates of the individual parameter estimates. The results are displayed in Table 2.

The best performance is again found for the dynamic grid setting for the important restricted case. This result is also graphically shown in Figure 4. This is a very positive result, which we have tested for arbitrary parameters, because for arbitrageurs the sum /_{0} + *0*^{1}, which represents the transaction costs, is important and not the individual parameter estimates /_{0} and *d ^{1}.*

The graphs in Figure 4 contain, despite their similarity, several interesting features. As already shown in the Table 1, the means generated from our 1,000 simulations are close to the theoretically expected value. The range of the distribution is partly determined by the chosen grid. That is why the plot for the estimator of b_{1} is shifted to the right, because we have calibrated the grid to include the value /3_{1} = 1 (observed basis). For the /_{0} and ' estimators we find that the weight of the distribution is heavily centred around the values of the data generating process. However, we find several outliers far away from that theoretical value fixed in the data generating process. The most important and interesting finding is, that if we look at the very right graph in Figure 4, namely the sum of /_{0} + *9 ^{1},* we see that the dispersion is dramatically lower compared to its individual parameter estimates (two graphs in the middle). The individual parameter estimates of /0 and

As a next and important step we generalize the simulation to search and test for two thresholds. We are making one assumption, based on a purely economic viewpoint, namely that, we have at most one positive and/or one negative threshold, which correspond to the two transaction costs for a positive basis trade and a negative basis trade. This assumption relates to our no-arbitrage discussion in Section 1 which in turn links straight to the three classes of models *H _{i}* (see Figure 1). We want to test for the performance of the methodology when searching for a second threshold in a data generating process that only contains one threshold.

The previous analysis has shown that the dynamic grid point setting is the most precise methodology to estimate the transaction costs (see Table 1 for /_{1} and Table 2 for /_{0} + 0^{1}), even though computationally more costly. Therefore, we will from now on use this grid setting only.

We use the same model parameters and the same model, a 1-threshold TVECM (model class *H*_{2}), as in the previous subsection, however, test if our two bootstrap methodologies reject the existence of a second threshold. The data generating process has no second threshold, ie *0*^{2} does not exist and the estimate 0^{2} should be distributed around zero.

Our previous simulation has shown that /_{1} is estimated with a high degree of precision if we use a dynamic grid setup. In order to save computational time we fix /_{1} during the second threshold search to the parameter value found in the first threshold search and hence reduce the search for the second threshold to a 2-dimensional grid search for /_{0} and *0 ^{2}.*

As in the previous simulations, the estimators /_{1} as well as 0^{1} +/_{0} are very precise, with the highest precision achieved in the restricted case and especially in the case where the tested time series has a length of 2,000. We use a dynamic grid setup, which is the reason why we find similar standard deviations for the estimators bo and 0^{1} in the unrestricted case compared to the restricted case for a time series length 1,000. In contrast, we have tested fixed and dynamic grid points in Table 1 which results in much wider standard deviations of the unrestricted case compared to the figures of the restricted case. The estimator *0 ^{2}* is distributed around zero, as expected, because we do not have a second threshold in our data generating process.

The results of the reliability of the bootstrap algorithms for both threshold searches are given in Table 4. Again, we find that both tests, the fixed regressor and the residual bootstrap, produce consistent and precise estimates in terms of a- and b-errors.

In order to test the robustness of the method, where we keep b1 fixed in the threshold search for 0^{2}, we repeat the simulation and search now also for an optimal b_{1} in the threshold search for 0^{2} using the log-likelihood method. The threshold search for 0^{1} is performed in the usual manner, hence we only need to present the figures for the threshold search for 0^{2}. As an illustration, we focus on the restricted case and a time series length of 1,000 in our data generating process. The results presented in Table 5 lead to two immediate observations: firstly, the estimator /3_{1} is dramatically less precise compared to the results with a fixed b_{1} (Table 3) and the a-errors are very similar to the ones presented in the first row and the last six columns of Table 4.

We can conclude that the performance of the tests is higher for the second threshold search if bi is kept fixed to the value found in search for 0^{1}. This also reduces the computational burden.

Finally, we present the results for a data generating process with two thresholds. Without loss of generality we present results for the following choice of parameters: / = 1.1, /0 = 1 as well as two thresholds *d ^{1}* = —4 and Q

The power of the two bootstrap tests is very high, as expected from the previous simulation results (see Table 7), except for the case where the tested series have a length of 100 data points. Even though the tests have still an acceptable power in this case, it is visibly worse compared to the test statistics where we consider time series with a length of 1,000 and more observations.

The figures for the second threshold search, presented in Table 8, are extremely promising, because the two transaction costs are well estimated. In addition to the simulations presented in Tables 6 and 7, we add for the second threshold search also the simulation, where /1 is kept variable in the search for the second threshold. In this simulation we do not fix /1 to the value found in the threshold search for 0^{1}, but perform a grid search to find the ’’optimal” value based on the maximum likelihood method.

We find in Table 8 a similar picture as in the first threshold search, however as expected with slightly bigger standard deviations. The threshold parameter estimate is poor. However, the sum of /_{0} and the threshold is for the negative values very close to /_{0} + 0^{1} = —3 and for the positive values very close to /_{0} + 0^{2} = 7. In general, we find higher precisions for longer time series length and for smaller grid spacing.

The /1 estimation in the variable case has a low degree of precision (last simulation in Table 8). This result is in line with previous simulation findings from model class H_{2}.

The negative log-likelihood function for 3 different sample simulations is shown in Figure 6. The negative log-likelihood function for the threshold estimation is now more complex, reflecting the fact that we have two thresholds. The functional form shows immediately the non-differentiable structure. Further, it is again evident, why the individual parameter estimates of the threshold and /0 are imprecise. The estimation of /1 is as in the previous cases very accurate.

This table shows the results of the second threshold search for our 2-threshold TVECM, respective 3- regime TVECM. The point estimate of 0 is imprecise, simply because we have a chance to find either of the two thresholds (-4 or 6). In addition to the first threshold search presented in Table 6 we also include a simulation where we keep */3 _{i}* variable (last two rows), ie where we have not fixed

The distribution of the estimators /0, /31, Q* and the transaction costs for the first and second threshold search are presented in Figure 7. The distribution of /0 and the thresholds is wide in line with the form of the negative likelihood function in Figure 6. In each search we can either find Q1 or Q2, which cannot be distinguished due to the imprecise estimation. However, looking at the transaction costs (right-hand panel), which are estimated at a good precision, we can clearly unravel the two different transaction costs.

The comprehensive simulation exercise has revealed the following outcomes: The method discussed in Sections 2.1 and 2.2 leads to stable and robust results for the 1- and 2- threshold TVECM. The transaction costs *d ^{%}* + /

**3 Application: Index trading and Commodity arbitrage**

We apply our proposed 3-regime TVECM methodology to different markets that exhibit a non-zero basis (either in contango or backwardation) for two similar financial market instruments traded in the spot and the derivatives market.

We empirically estimate unknown transaction costs for the S&P 500 and for palladium by analysing the basis defined as the spot price minus the price of the future on the same underlying.

In our application we present results for our dynamic grid setup, for different periods and we run the model for 1 lag up to 5 lags in the VAR term. We have used the Schwarz information criterion (SIC) to determine the model with the best fit. We furthermore have also performed unit root, stationarity and cointegration tests. We find cointegration for the spot and futures prices of the S&P 500 and palladium, while the individual time series are I(1).

To estimate the Equations (22) we rely on the estimated covariance matrix from the VECM to capture the terms of, a_{12} and of. In the following we define HAS as the average of HAS1 and HAS2. For simplicity, we have skipped the superscripts indicating the regime.

The second indicator for price discovery, the GG measure (Gonzalo and Granger; 1995) decomposes the common factor itself, but ignores the correlation of the innovations in the two markets. The following two measures exist

whereby it is obvious that GG^{spot} + GG^{futures} = 1. Therefore, we will use GG^{spot} only and skip the superscript.

HAS and GG measures greater than 0.5 imply that more than 50% of the price discovery occurs in the spot market. When the measures are close to 0.5 both markets contribute to price discovery without evidence on which market is dominant. GG and HAS below 0.5 suggest price leadership of the futures market.

Finally, we are interested in examining the speed of adjustment towards the longterm equilibrium. As the spot and futures time series in the bivariate VECM share a common stochastic trend, the impulse response function for the cointegrating residual can be used to determine the speed of adjustment to the long-run equilibrium (Zivot and Wang (2006)). The vector error correction mechanism directly links the speed of adjustments to the cointegration residual u_{t} which follows an implied AR(1) process:

Regulators can use the model in a variety of different ways. They can study which market moves first, how large are transactions costs and how long does it take for the markets to return to the long-run equilibrium after an exogenous shock. Changes in the speed of adjustment or half-life might be especially interesting, after regulatory changes have been introduced, in order to study the impact of the implemented changes on market participants behaviour and also on relative or absolute liquidity of both markets. In markets, where high frequency data is available (eg. index-trading), a pre and post-event analysis could be performed to evaluate the effectiveness of policy actions.

**3.0.1 S&P 500**

As an illustration we present the analysis of the Standard & Poor’s 500, abbreviated as the S&P 500 index, and its futures index for the period starting in 2000. The S&P 500 is an American stock market index based on the market capitalizations of 500 large companies having common stock listed on the New York stock exchange or NASDAQ. The index and the future prices, as well as the basis are presented in Figure 8.

The threshold search for Q^{1} estimates transaction costs for a negative basis trade of around /_{0} + Q^{1} = -10 index points. The transaction costs for a negative basis trade have to be interpreted in absolute terms. The local minimum of the SIC is at lag=3, for which

the threshold is also significant. The estimated transaction costs are very stable, except for lag 2.

The spot market is leading in the two arbitrage regimes and as expected, the half-life in the neutral regime is the longest. The HAS, GG and half-lives for the upper and lower regime in Table 12 are based on the point estimates of insignificant speed of adjustments. This seems to be contradicting to the general notion, that speed of adjustments must be unequal to zero in the extreme regimes (arbitrage regimes). However, the upper and lower regimes contain only around 10% of the data points and therefore it is challenging to achieve statistical significance. This issue could be avoided by using intraday data (see for example Gyntelberg et al. (2013) and Gyntelberg et al. (2017)).

Palladium is a precious metal, which has similar properties like Platinum and belongs to the same chemical group. It is used as an investment and industrial commodity. Its main industrial use is in catalytic converters.

We have analysed several periods and present the period starting in 2000 for which we have found evidence for two thresholds, resulting in the two transaction costs, for a negative basis trade /_{0} + Q^{1} = —5.7USD per contract size of 100troy oz and a positive basis trade /_{0} + Q^{2} = 1.1USD per contract size of 100troy oz for a model with one VAR lag. The transaction costs for the negative basis trade have to be interpreted in absolute terms. The time series of the spot, futures as well as the basis are graphically presented in Figure 9. The two transactions costs are shown as horizontal lines in the right-hand panel of that figure.

The estimation procedure shows the model fit based on SIC is not straight forward as the value of the SIC is declining with a weak slope, reaching a plateaux at around 7 lags before it declines further with increasing number of lags. The most parsimonious model suggests the existences of two thresholds with high significance (see Tables 13 and 14) for the period starting in 2000. Apart from lag=2 the transaction costs for a negative basis trade are stable and around /0 + Q1 = —5.7USD for a contract size of 100troy oz which have to be interpreted in absolute terms. The lower regime is an arbitrage regime, which as expected contains only 11% of all observations.

In the upper regime we find some indication that the futures market leads the price discovery process, whereas in the lower regime there is leadership of the spot market. Both markets seem to absorb one unit shocks rather quickly. The neutral regime has as expected a longer half life, however the difference is not very large. Furthermore, as expected we have no clear indication as to which market leads the price finding in the neutral regime.

**4 Conclusion**

The identification of thresholds in non-linear vector correction models is of rather complex origin with several unresolved problems. In this paper, we present the solution for one unresolved issue which is the estimation of a 3-regime TVECM with an unknown cointegrating vector. Our proposed methodology extends the 2-regime TVECM model as proposed by Hansen and Seo (2002). In contrast to Hansen and Seo (2002) we also introduce an intercept /_{0} in the cointegrating relation (S — — /_{0})_{t} to account for a

distorted parity relationship, eg the non-zero basis. Using a sequential grid search for the first and the second threshold search, we estimate the cointegration relationship as well as the two thresholds (0^{1} and 0^{2}) by employing a maximum likelihood approach. As there are practically no empirical studies for TVECMs even for the 1-threshold case we present a comprehensive simulation study to investigate the reliability of our proposed method. Hansen and Seo (2002) have suggested a grid point search to estimate the variables / and *d ^{i}* with a fixed number of grid points. We show that a dynamic grid point setting, where the distance between two grid points is set via a precision parameter, instead of defining a fixed number of grid points, yields the best results, albeit at potentially high computational costs. The slope / in the cointegrating relationship is estimated at a very high degree of precision. We show that the value / found in the first threshold search can be fixed in the second search. This lowers the dimension of the grid space and hence reduces the computing time in the second grid search. The thresholds

Our proposed methodology is particularly appealing for the analysis of distorted parity relationships in economics, such as no-arbitrage relationships with a non-zero basis. A persistent non-zero basis between two similar financial market instruments traded in the spot and in the derivative markets points towards the presence of transaction costs on arbitrage trades that prevent a complete adjustment of market prices to the theoretical no-arbitrage condition of a zero basis. In our applications, we show that our proposed 3- regime TVECM methodology can estimate unknown transaction costs on arbitrage trades (sum of *d ^{i}* + /

**References**

Adams, K. and van Deventer, D. (1993). Comment on intra-day arbitrage opportunities and price behavior of the Hang Seng index futures, Review of Futures Markets .

Andrews, D. W. K. (1993). Tests for parameter instability and structural change with unknown change point, Econometrica 61: 821-1414.

Bai, J. (1997). Estimating multiple breaks one at a time, Econometric Theory 13(3): 315¬352.

Bai, J. and Perron, P. (1998). Estimating and testing linear models with multiple struc¬tural changes, Econometrica 66(1): 47-78.

Baillie, R. T., Booth, G. G., Tse, Y. and Zabotina, T. (2002). Price discovery and common factor models, Journal of Financial Markets 5: 309-321.

Balke, N. S. and Fomby, T. B. (1997). Threshold cointegration, International Economic Review 38(3): 627-645.

Davies, R. B. (1987). Hypothesis testing when a nuisance parameter is present only under the alternative, Biometrika 74(1): 33-43.

Fama, E. F. and French, K. R. (1987). Commodity futures prices: Some evidence on forecast power, premiums, and the theory of storage, The Journal of Business 60: 55¬73.

Forbes, C. S., Kalb, G. R. J. and Kofhian, P. (1999). Bayesian arbitrage threshold analysis, Journal of Business & Economic Statistics 17: 364-372.

Gonzalo, J. and Granger, C. W. J. (1995). Estimation of common long-memory compo¬nents in cointegrated systems, Journal of Business & Economic Statistics 13(1): 27-36.

Gyntelberg, J., Hoerdahl, P., Ters, K. and Urban, J. (2013). Intraday dynamics of euro area sovereign cds and bonds, BIS Working Papers 423, Bank for International Settle¬ments.

Gyntelberg, J., Hoerdahl, P., Ters, K. and Urban, J. (2017). Arbitrage costs and the persistent non-zero cds-bond basis: Evidence from intraday euro area sovereign debt markets, BIS Working Papers 631, Bank for International Settlements.

Hansen, B. E. (1996). Inference when an nuisance parameter is not identified under the null hypothesis, Econometrica 64: 413-430.

Hansen, B. E. (2000). Testing for structural change in conditional models, Journal of Econometrics 97: 93-115.

Hansen, B. E. and Seo, B. (2002). Testing for two-regime threshold cointegration in vector error-correction models, Journal of Econometrics 110(2): 293-318.

Johansen, S. (1988). Statistical analysis of cointegrated vectors, Journal of Economic Dynamics and Control 12: 231-254.

Lien, D. and Yang, L. (2008). Asymmetric effect of basis on dynamic futures hedging: Empirical evidence from commodity markets, Journal of Banking and Finance 32: 187¬198.

Lutkepohl, H. (2006). New Introduction to Multiple Time Series Analysis, Springer.

Man, K., Wang, J. and Wu, C. (2013). Price discovery in the u.s. treasury market: Automation versus intermediation, Management Science 59: 695-714.

McMillan, D. G. (2005). Threshold adjustment in spot-futures metal prices, Applied Financial Economics Letters 1: 5-8.

Stevens, J. (2015). Do transaction costs prevent arbitrage in the market for crude oil? Evidence from a threshold autoregression, Applied Economic Letters 22: 169-172.

Sutcliffe, C. M. S. (2006). Stock Index Futures, Ashgate Publishing.

Tsay, R. S. (1989). Testing and modeling threshold autoregressive processes, Journal of the American Statistical Association 84(405): 231-240.

Zivot, E. and Wang, J. (2006). Modeling Financial Time Series with S-PLUS, Springer¬Verlag New York, Inc.