**BIS Working Papers**

**No 682**

**Bank business models: popularity and performance**

by Rungporn Roengpitya, Nikola Tarashev, Kostas Tsatsaronis and Alan Villegas

**Monetary and Economic Department**

December 2017

JEL classification: D20, G21, L21, L25

Keywords: Balance sheet characteristics, cluster analysis, discriminant analysis, model transitions, bank performance

BIS Working Papers are written by members of the Monetary and Economic Department of the Bank for International Settlements, and from time to time by other economists, and are published by the Bank. The papers are on subjects of topical interest and are technical in character. The views expressed in them are those of their authors and not necessarily the views of the BIS.

This publication is available on the BIS website (www.bis.org).

*© Bank for International Settlements 2017. All rights reserved. Brief excerpts may be reproduced or translated provided the source is stated.*

ISSN 1020-0959 (print)

ISSN 1682-7678 (online)

**Abstract**

We allocate banks to distinct business models by experimenting with various combinations of balance sheet characteristics as inputs in cluster analysis. Using a panel of 178 banks for the period 2005–15, we identify a retail-funded and a wholesale-funded commercial banking model that are robust to the choice of inputs. In comparison, a model emphasising trading activities and a universal banking model are less robustly identified. Both commercial banking models exhibit lower cost-to-income ratios and more stable return-on-equity than the trading model. In a reversal of a pre-crisis trend, the crisis aftermath witnessed mainly switches away from wholesale-funded and into retail-funded banking. Over the entire sample period, banks that switched into the retail-funded model saw their return-on-equity improve by 2.5 percentage points on average relative to non-switchers. By contrast, the relative performance of banks switching into the wholesale-funded model deteriorated by 5 percentage points on average.

We would like to thank Stijn Claessens, Ulf Lewrick and participants in a BIS seminar for useful comments. The views expressed in this article are those of the authors and do not necessarily reflect those of the Bank of Thailand or the Bank for International Settlements.

**Introduction**

Banks are not all alike. Just as any other firm, a bank seeks a competitive edge by exploiting its comparative advantages in terms of access to specialised resources, available market opportunities and managerial skill. The result of this effort is a business model that emphasises some activities as opposed to others, and that is reflected, inter alia, in the bank's balance sheet composition. A good match between available opportunities and the bank's business model is a basis for healthy and sustainable profitability. Conversely, changes in the business mix of an underperforming bank are often part of a turnaround plan. Ultimately, the business model influences the bank's value and is thus of natural interest to stakeholders.

The business models banks choose are also of interest to policymakers. For one thing, different business models may be systematically associated with differences in bank performance. If so, information about business models would allow prudential supervisors to better gauge institutions' ability to generate reliable earnings that would support bank resilience. Likewise, to the extent that there are model-specific risk factors, the distribution of business models across the banking sector can point to concentration of risk exposures in the banking system.

Despite frequent references to bank business models in the press and analyst reports, there is no established notion of what constitutes a model and what distinguishes one model from another. In this paper, we systematise the identification of bank business models on the basis of balance-sheet characteristics. We use a sample of 178 banks across 34 jurisdictions and over 11 years (from 2005 to 2015), and an approach that is primarily data-driven but also uses judgment in a structured way. To implement this approach, we experiment with variables from both the asset and liability sides of the balance sheet. These are *input variables* in the identification of business models and the allocation of banks across models. Specific parsimony and clear-discrimination criteria guide our search for input variables that deliver sharp and interpretable results. In addition, stability checks help us focus on business models that are robust to changes in the data sample.

We classify banks into four business models. Two models are alternative versions of a commercial banking model, one that relies mainly on *retail* sources of funding and one that puts more emphasis on *wholesale* sources. These two models are quite stable, in the sense that the balance sheet characteristics of the typical constituent bank change little as we experiment with different sets of input variables. A third, *trading* model, where banks hold larger securities portfolios funded in the interbank and wholesale markets, also emerges from the data but is less stable. Likewise, the characteristics of the members of a *universal* model vary somewhat with the underlying input variables. This model blends characteristics of the other three business models.

We find that bank characteristics that are not used as input variables also differ systematically across the four models. For example, universal banks tend to be larger than banks in other models. In turn, trading banks tend to have more volatile profitability without performing better on average. Efficiency - as measured by a cost-to-income ratio - is highest for institutions emphasising commercial banking.

Banks' business models evolve over time in response to changes in the economic and financial environment as well as to new rules and regulations. The panel structure of the data allows us to identify patterns in this evolution. Over our sample period, we observe that the trading model is the most insular: few banks transition in or out of it. The transition patterns of the other business models change in a very distinct way with the global financial crisis. Most post-crisis switches are towards retail-funded commercial banking at the expense of the wholesale-funded model. This represents a reversal of a pre-crisis trend on the back of regulatory restrictions that increase the relative cost of wholesale funding.

We also analyse differences in relative performance around the time that a bank switches business models. We benchmark the performance of a switching bank against that of a peer group of non-switchers that stick to their original business model. Somewhat surprisingly, there is no evidence that underperformers are inclined to switch models. That said, banks that switch into the retail-funded model improve their relative performance. These banks see their return-on-equity (RoE) rise by 2.5 percentage points on average relative to the RoE in their peer group over a five-year period around the switch. By contrast, banks switching into the wholesale-funded model suffer a 5 percentage-point decline in relative performance. In each case, the finding stems from switches between the retail- and wholesale-funded models, implying that differences in banks' liabilities drive differences in profitability.

The business models literature has a rather long history. Initially, the notion of a business model - or a "strategic group," as originally referred to in Hunt (1978) - was used mainly in the field of management studies (Zott and Amit (2011) provide a recent overview). Since then, a business model has been understood as a strategy that translates into similar balance sheet and income statement ratios. Studies of bank business models can be traced to early work by Amel and Rhoades (1988) and Mehra (1996).

Only recently have researchers adopted more systematic quantitative approaches to identifying and analysing bank business models. A number of papers focus on the link between banks' revenue mix and their profitability. Adopting this perspective, Stiroh and Rumble (2006) examine US bank holding companies and Kohler (2015) studies listed and unlisted EU banks. In contrast to this literature, we focus exclusively on balance sheet ratios as identifiers of bank business models. We maintain that balance sheet composition is more directly and more stably linked to banks' strategic choices than income composition. In implementing a strategy, a bank's management has close control over the types of exposures and thus the balance sheet of the institution. Income, on the other hand, tends to vary over time due to conjunctural drivers even if the strategy of the bank has not changed. Admittedly, the revenue and profit mix will tend to be broadly consistent with the balance sheet choices over the long term (assuming no changes in bank business strategy). But the year-to-year volatility will also tend to reflect a host of factors that are not under management's control, such as strategy execution risks and cyclical factors stemming from the economic and financial environment.

In fact, Mergaerts and Vander Vennet (2016) present evidence on the variability of different balance sheet and income ratios that supports our approach. For a panel of bank data covering several years they find that the ratio of within (ie over time for the same bank) to between (ie across banks) variability is higher for income ratios than for balance sheet ratios. Thus, by identifying business models only on the basis of balance sheet ratios, we can be more confident that a model bears a direct relationship to choices of the bank that remain relatively stable over time and, by extension, that a transition from one model to another can be attributed to managerial choice rather than exogenous events or conjunctural factors.

Papers that follow our general approach are Ayadi et al (2011), Ayadi and de Groen (2014), and Farne and Vouldis (2017). All these papers deploy balance sheet ratios in cluster analysis in ways that are similar - albeit not identical - to ours. Ayadi and de Groen (2014) focus on 173 European banks from 2006 to 2013 and Farne and Vouldis (2017) on 365 euro area banks at end-2014. Both studies converge on four banking models. Even though they use a dataset with a smaller geographical coverage and a shorter time span, they also find that business models with a stronger commercial-banking focus - similar to our retail- and wholesale-funded models - are more robustly identified than models with more extensive capital markets activities - similar to our trading and universal models.

These similarities notwithstanding, our paper differs from Ayadi and de Groen (2014) and Farne and Vouldis (2017) in two important ways. For one, we push the analysis of banks' transitions across business models further, looking in particular for systematic effects that such transitions may have on banks' performance. The second difference stems from the respective approaches to the selection of input variables for the allocation of banks to distinct models. In particular, we employ a more structured and transparent approach to the use of judgmental elements in the selection process. In addition, we focus on stable results by filtering out model allocations that are sensitive to the specific sample. In assessing the stability of our results, we also employ discriminant analysis to test the capacity of input variables to distinguish banks in line with the models implied by cluster analysis. Our strategy to avoid over-reliance on a single allocation method bears some resemblance to the approach of Mergaerts and Vander Vennet (2016). They go further to argue against a classification of banks into distinct groups and favour a characterisation of business models on the basis of two factors that take values in a continuous range.

The rest of this paper is organised in five sections. In the first, we lay out a methodology for classifying banks into distinct business models. In the second section, we characterise the four business models in terms of banks' balance sheet composition. In the third, we highlight systematic differences in banks' size, measured riskiness, efficiency and performance across business models. In the fourth section, we consider transitions from one model to another, looking for time-dependent patterns and performance changes. The last section concludes.

Our approach to the allocation of banks to business models - or clusters representing different mixes of banking activities - is primarily data-driven but, as typical for the related literature (eg Ayadi and de Groen (2014)), it also incorporates judgmental elements. We provide a structure within which to exercise this judgment. This section presents a broad-strokes description of the approach and Annex A has a more detailed exposition.

The allocation approach comprises three stages. At the first stage, we select a set of bank balance sheet ratios that hold the promise of differentiating business models. We use different combinations of these ratios as *input variables* to a cluster analysis algorithm. Each combination of inputs defines a *trial*. For each trial, cluster analysis delivers a series of allocations, or *trial variants*, each representing a different number of clusters. As the first stage produces a very large number of trial variants, we sort through them at the other two stages of the approach. At the second stage, we narrow down this number on the basis of a goodness-of-fit metric and the judgmental criteria of parsimony and clear discrimination. At the third stage, we further narrow down the trial variants by combining cluster and discriminant analyses with the criterion of stability of the results.

In what follows we flesh out the three stages and how we implement them with our data.

**First stage**

We consider eight balance sheet ratios as possible input variables for the cluster analysis. These variables are available over several years for a wide range of banks, domiciled in a wide range of countries. We interpret them as variables reflecting *strategic managerial choices*.^{2} At this stage, we do not take a stand as to which of the eight input variables are more important in discriminating across business models. That said, in order to avoid redundancies, we exclude combinations in which two or more input variables are highly correlated.

We allocate observations to distinct clusters by applying the agglomerative clustering algorithm of Ward (1963) to various subsets of the input variables, ie we construct different trials. The algorithm produces a hierarchical clustering by successively merging smaller clusters into larger ones. Read from the top down, the hierarchy first splits the universe of observations into two clusters so that observations within the same cluster are more alike in terms of the underlying input variables than observations in different clusters. The next level splits the more loosely connected cluster into two smaller ones, thus generating a more granular allocation with three clusters. At the most granular level, there are as many clusters as observations (ie no grouping at all).^{4} For a given set of input variables, an allocation outcome with a specific number of clusters is a trial variant.
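The following is a minimal sketch of Ward-style agglomerative clustering on a toy dataset, meant only to illustrate how one trial yields a nested family of trial variants. It uses the textbook Ward criterion (merge the pair of clusters whose union adds least to the within-cluster sum of squares). The function names and the six hypothetical bank-year observations are illustrative assumptions, not the authors' implementation or data.

```python
from itertools import combinations

def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2

def ward_hierarchy(points):
    """Return {k: clusters} for k = n, n-1, ..., 1; each cluster is a list of points."""
    clusters = [[p] for p in points]
    levels = {len(clusters): [list(c) for c in clusters]}
    while len(clusters) > 1:
        # Pick the pair of clusters that is cheapest to merge.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
        levels[len(clusters)] = [list(c) for c in clusters]
    return levels

# Hypothetical bank-years: (loans/assets, wholesale debt/assets).
toy = [(0.70, 0.10), (0.68, 0.12), (0.72, 0.35),
       (0.74, 0.38), (0.20, 0.20), (0.18, 0.22)]
levels = ward_hierarchy(toy)
three = sorted(sorted(c) for c in levels[3])  # the three-cluster variant
```

Each entry of `levels` corresponds to one variant of the same trial. In this toy case the three-cluster variant separates high-loan/low-wholesale, high-loan/high-wholesale and low-loan observations, and the two-cluster variant merges the first two groups.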

**Second stage**

At the second stage, we narrow down the number of trial variants using two judgmental criteria: *parsimony* and *clear discrimination*. The process simultaneously selects the input variables with the strongest discriminatory power and the number of clusters that can be reasonably distinguished in the data.

For **parsimony**, we restrict the trial variants to those that involve three to five clusters. We judge that a two-cluster classification would be rather crude and uninformative, whereas a classification with six or more clusters would make it difficult to articulate clearly the distinctions among business models. At the same time, however, the range of three to five clusters allows us to gauge the extent to which the data are compatible with groupings of different granularity.

**Clear discrimination** favours classification schemes with outcomes that stand out from alternative candidate schemes. In implementing the criterion, we make use of the F-index, which is a goodness-of-fit measure for a clustering outcome proposed by Calinski and Harabasz (1974). The F-index is calculated for each trial variant (see Annex A). Succinctly, it promotes a small number of clusters - in effect, it has its own parsimony objective that reinforces ours - as well as classifications where the clusters are sufficiently distinct from each other.
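As an illustration, the sketch below computes the index in its standard textbook form: between-cluster dispersion per degree of freedom divided by within-cluster dispersion per degree of freedom. The exact formula used in the paper is given in Annex A, which is not reproduced here, so the function name `f_index` and the toy data should be read as assumptions.

```python
def f_index(clusters):
    """Calinski-Harabasz index: [B / (k - 1)] / [W / (n - k)].

    clusters is a list of clusters, each a list of equal-length tuples.
    Requires 2 <= k < n and some within-cluster variation.
    """
    k = len(clusters)
    points = [p for c in clusters for p in c]
    n = len(points)
    dims = range(len(points[0]))
    grand = [sum(p[d] for p in points) / n for d in dims]
    between = within = 0.0
    for c in clusters:
        mean = [sum(p[d] for p in c) / len(c) for d in dims]
        # Dispersion of the cluster mean around the grand mean, weighted by size.
        between += len(c) * sum((mean[d] - grand[d]) ** 2 for d in dims)
        # Dispersion of the cluster's members around their own mean.
        within += sum((p[d] - mean[d]) ** 2 for p in c for d in dims)
    return (between / (k - 1)) / (within / (n - k))

# Two tight, well-separated toy clusters on a single variable.
score = f_index([[(0.0,), (2.0,)], [(10.0,), (12.0,)]])
```

The index rises when clusters are far apart relative to their internal spread, and the division by `k - 1` penalises adding clusters that do not reduce within-cluster dispersion much.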

We use two alternative operationalisations of the clear discrimination criterion. Under the first operationalisation, we look for a *clear winner*. Specifically, for each trial, we check whether the maximum F-index (i) occurs for a variant with three, four or five clusters and (ii) is at least 10% higher than the F-index for any other variant of the same trial.^{5} When this is the case, we retain only the variant with the highest F-index. The second operationalisation considers trials in which the F-indices are similar inside the 3-to-5 cluster range but drop off outside this range. Concretely, we consider trials that are not retained under the first operationalisation and check whether the maximum value of the F-index (i) occurs for a variant with three to five clusters and (ii) is at least 10% higher than that of any variant outside the 3-to-5 cluster range. When this is the case, we keep all three variants. We discard all other trial variants.
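The two operationalisations can be sketched as a single selection rule applied to one trial at a time. The function name `select_variants`, its parameters and the F-index values below are hypothetical; only the 3-to-5 range and the 10% margin come from the text.

```python
def select_variants(f_by_k, lo=3, hi=5, margin=0.10):
    """Return the cluster counts retained for one trial.

    f_by_k maps a number of clusters to the F-index of that variant.
    """
    best_k = max(f_by_k, key=f_by_k.get)
    if not lo <= best_k <= hi:
        return []  # the maximum lies outside the parsimony range
    best = f_by_k[best_k]
    # First operationalisation: a clear winner over every other variant.
    if all(best >= (1 + margin) * f for k, f in f_by_k.items() if k != best_k):
        return [best_k]
    # Second operationalisation: the maximum beats every variant outside the range.
    if all(best >= (1 + margin) * f for k, f in f_by_k.items() if not lo <= k <= hi):
        return list(range(lo, hi + 1))
    return []

# Hypothetical F-index profiles for three trials.
clear_winner = select_variants({2: 100, 3: 150, 4: 120, 5: 110, 6: 90})
plateau = select_variants({2: 100, 3: 150, 4: 145, 5: 140, 6: 120})
rejected = select_variants({2: 160, 3: 150, 4: 145, 5: 140, 6: 120})
```

In the first profile the three-cluster variant dominates and is kept alone. In the second the F-indices are close inside the range, so all three variants are kept. The third is discarded because the maximum occurs at two clusters.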

**Third stage**

At the third stage, we filter out additional trial variants by applying the judgmental criterion of **stability**. We consider a trial variant to be stable if it is robust to excluding subsets of the original data and to using a different allocation method, concretely, discriminant analysis. We perform three stability exercises.

First, we rerun the cluster algorithm on reduced samples, dropping one year of data at a time. Having omitted a specific year of data, we compare the new allocation of banks into models with the initial allocation that is based on the entire dataset. Concretely, we compute a discrepancy rate by dividing the number of observations that are now allocated to a different model by the total number of observations in the reduced sample. Repeating this exercise for each year in the sample, we obtain the average discrepancy rate. We drop trial variants that have an average discrepancy rate higher than 20%.
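A minimal sketch of the discrepancy-rate calculation follows. It assumes that the reduced-sample allocations have already been computed and that cluster labels have been matched across runs, since the labels produced by separate clustering runs are arbitrary. The data layout and names are hypothetical.

```python
def discrepancy_rate(full, reduced):
    """Share of reduced-sample observations allocated differently than in the full sample.

    Both arguments map an observation key, here (bank, year), to a model label.
    """
    changed = sum(1 for obs, model in reduced.items() if full[obs] != model)
    return changed / len(reduced)

def average_discrepancy(full, reduced_by_year):
    """Average the discrepancy rate over all leave-one-year-out runs."""
    rates = [discrepancy_rate(full, alloc) for alloc in reduced_by_year.values()]
    return sum(rates) / len(rates)

# Hypothetical two-bank, two-year panel.
full = {("A", 2005): "retail", ("A", 2006): "retail",
        ("B", 2005): "wholesale", ("B", 2006): "retail"}
reduced_by_year = {
    # Allocation obtained after dropping 2005 (only 2006 observations remain).
    2005: {("A", 2006): "retail", ("B", 2006): "wholesale"},
    # Allocation obtained after dropping 2006.
    2006: {("A", 2005): "retail", ("B", 2005): "wholesale"},
}
avg = average_discrepancy(full, reduced_by_year)
is_stable = avg <= 0.20  # threshold from the text for cluster analysis
```

Here one of the two runs reallocates half of its observations, so the average discrepancy rate is 25% and the toy variant would be dropped.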

Our second exercise parallels the first but replaces cluster analysis with discriminant analysis. We take a cluster-analysis allocation (based on the entire sample) and deploy discriminant analysis to estimate a functional mapping from the input variables to the cluster allocation. The mapping generates a set of probabilities that an observation belongs to each cluster (see Annex A).^{6} We allocate an observation to the cluster with the highest probability. For each trial variant, this delivers a discriminant-analysis allocation that is based on the entire sample. Then, we obtain discriminant analysis allocations for subsamples - omitting one year of data at a time - and drop all trial variants with average discrepancy rates greater than 10%. The threshold is lower than in the first exercise because discriminant analysis provides less volatile outcomes than cluster analysis.

In the third exercise, we gauge the agreement between cluster analysis and discriminant analysis. Specifically, we use the so-called leave one out (LOO) procedure, which is typical for discriminant analysis. The procedure re-estimates the discriminant functions on the basis of all but one observation and then allocates the omitted observation to a cluster using the newly estimated functions. If this allocation is different from that implied by cluster analysis, the procedure records an error. Performing the LOO exercise sequentially across all observations delivers an error rate, ie the percentage of misclassified observations. The higher this rate, the weaker the agreement between cluster analysis and discriminant analysis.
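The leave-one-out loop itself is independent of the classifier. The sketch below uses a nearest-centroid rule as a stand-in for discriminant analysis; this is a simplification (it coincides with linear discriminant analysis only under equal priors and a common spherical covariance) and not the authors' estimator. All names and data points are hypothetical.

```python
def nearest_centroid(train, x):
    """Assign x to the label whose mean feature vector in train is closest."""
    groups = {}
    for feats, label in train:
        groups.setdefault(label, []).append(feats)

    def dist2(label):
        pts = groups[label]
        mean = [sum(col) / len(pts) for col in zip(*pts)]
        return sum((a - b) ** 2 for a, b in zip(x, mean))

    return min(groups, key=dist2)

def loo_error_rate(data, classify=nearest_centroid):
    """Share of observations misclassified when each is left out in turn.

    data is a list of (features, cluster_label) pairs, where the label
    comes from the cluster-analysis allocation.
    """
    errors = 0
    for i, (feats, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # re-estimate without observation i
        if classify(rest, feats) != label:
            errors += 1
    return errors / len(data)

# Hypothetical (loans/assets, wholesale debt/assets) observations.
data = [((0.70, 0.10), "retail"), ((0.68, 0.12), "retail"),
        ((0.66, 0.15), "retail"), ((0.72, 0.35), "wholesale"),
        ((0.74, 0.38), "wholesale"), ((0.70, 0.20), "wholesale")]
error_rate = loo_error_rate(data)
```

The last observation sits close to the retail group, so the stand-in classifier disagrees with its cluster label and the error rate is one in six.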

**The methodology applied to our data**

We apply the methodology outlined above to a panel of annual data, covering 178 banks from 34 countries from 2005 to 2015. The data are extracted from Bankscope, and the unit of our analysis - ie one data point - is a bank-year. As the panel is unbalanced - ie available data do not cover the entire period for each bank - we work with roughly 1600 bank-year observations. The exact number of observations is trial-specific because input variables differ across trials and data coverage differs across input variables. By focusing on bank-years, our approach allows institutions to switch between business models at any point over the period we cover.

The selection of banks seeks to cover a wide range of activities and countries. We thus started with the top 1000 banks in *The Banker's* 2013 list and then narrowed down the sample by looking for a balanced geographical coverage (for instance, limiting the number of banks from the same country) and for consistent data coverage of key variables (see below). Another aspect of our approach is to consider consolidated data at the bank level or, if such data are not available, at the level of the bank holding company. There are two exceptions in this respect. First, there are no consolidated data for four institutions in our sample. Second, for twelve institutions, we consider consolidated data for two to three large bank subsidiaries with potentially distinct business orientations.^{7} Table B.1 (Annex B) provides the full list of banks, the corresponding Bankscope consolidation codes and the years for which each bank is included in our dataset.

As candidate input variables, we select eight balance sheet characteristics - four from the asset side and four from the liability side of the ledger - and express them as ratios of balance sheet size. More specifically, we work with: loans-to-assets, trading book-to-assets, trading assets-to-total assets, interbank lending-to-assets, interbank borrowing-to-assets, deposits-to-assets, wholesale funding-to-assets, and the stable funding ratio (see Table B.2 in Annex B for exact definitions).^{8} Because the banks in our panel are subject to different accounting standards that result in differences in the reporting of derivative positions, we net these positions out in calculating our measure of total assets. Even though this is a necessary correction in a cross-country study, it is often omitted in the literature.

At the first stage, we run the clustering algorithm on combinations of three to eight input variables. From the full set of combinations, we exclude at this stage those in which input variables have a correlation of 65% or higher in absolute value.^{9} This means that we run cluster analysis on 48 trials. For each trial, we calculate the F-index scores for variants featuring two to 15 clusters. The result is 672 trial variants in total.
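The enumeration of trials can be sketched as follows. The short variable names and the single correlated pair in the usage example are hypothetical; the paper's actual correlation matrix is not reported in this section.

```python
from itertools import combinations

# Shorthand names for the eight candidate ratios (labels are assumptions).
RATIOS = ["loans", "trading_book", "trading_assets", "interbank_lending",
          "interbank_borrowing", "deposits", "wholesale_funding", "stable_funding"]

def candidate_trials(corr, threshold=0.65, min_size=3):
    """Yield input combinations in which no pair has |correlation| >= threshold.

    corr maps frozenset({a, b}) to the correlation between ratios a and b;
    pairs missing from corr are treated as uncorrelated.
    """
    for size in range(min_size, len(RATIOS) + 1):
        for combo in combinations(RATIOS, size):
            if all(abs(corr.get(frozenset(pair), 0.0)) < threshold
                   for pair in combinations(combo, 2)):
                yield combo

unrestricted = len(list(candidate_trials({})))
# Hypothetical example: a single highly correlated pair of ratios.
screened = len(list(candidate_trials({frozenset(("deposits", "stable_funding")): 0.8})))
variants = 48 * len(range(2, 16))  # 48 trials, 14 cluster counts each
```

Without any exclusion, eight ratios give 219 combinations of three to eight inputs. Excluding one correlated pair alone would already cut this to 156, and the paper's full screen leaves 48 trials. With 14 cluster counts per trial (2 to 15), this yields the 672 trial variants.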

At the second stage, we narrow down the trial variants by applying the parsimony and clear discrimination criteria. The application of the parsimony criterion (ie a maximum F-index within the 3-to-5 cluster range) leaves us with 12 trials, each with a three-, four- and five-cluster variant. The first operationalisation of the clear discrimination criterion (ie a clear winner in the 3-to-5 cluster range) eliminates all but four trial variants, each of them featuring three clusters. As we raise the bar for a clear winner, the "survivors" are trial variants with fewer clusters, a sign that provides support to the rationale behind our parsimony criterion. The second operationalisation of the clear discrimination criterion (ie all F-index scores outside the 3-to-5 cluster range being at least 10% lower than the maximum) leaves us with four trials, each with three-, four- and five-cluster allocations. So, at the end of the second stage, there are 16 trial variants that remain under consideration.

At the third stage, we apply the three stability criteria. This leaves us with four trial variants, which we focus on in the next section.

**2. Distinct business models and their characteristics**

In this section we consider the four trial variants that we are left with. Since each of them stems from a different trial, we label them trial A, trial B, trial C and trial D. We describe the business models identified by these four trials and draw general lessons from their commonalities.

Table 1 provides a characterisation of the business models in terms of all eight balance sheet ratios, with each trial's underlying input variables appearing in bold. For each ratio, the cells report two descriptive statistics of the observations classified in a given business model. The first is a simple average and the second (in square brackets underneath) is the range between the 10th and 90th percentiles. The last row reports the number of observations - ie bank-years - in each model.
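The two statistics reported in each cell can be computed along the following lines. The row layout, the function names and the choice of the "inclusive" percentile method are assumptions, as the paper does not specify its percentile convention here.

```python
from statistics import mean, quantiles

def summarise(values):
    """Mean plus 10th and 90th percentiles of one ratio within one model."""
    deciles = quantiles(values, n=10, method="inclusive")  # nine cut points
    return {"mean": mean(values), "p10": deciles[0], "p90": deciles[-1]}

def table_by_model(rows, ratio):
    """Group bank-year rows by model and summarise one ratio per model.

    rows is a list of dicts with a 'model' key and one key per ratio.
    """
    grouped = {}
    for row in rows:
        grouped.setdefault(row["model"], []).append(row[ratio])
    return {model: summarise(vals) for model, vals in grouped.items()}

# Hypothetical loan shares (per cent of assets) for eleven bank-years.
rows = [{"model": "retail", "loans": v} for v in range(60, 71)]
stats = table_by_model(rows, "loans")
```

The number of bank-years per model, reported in the last row of the table, is simply the length of each group.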

Trial A discriminates across banks on the basis of three input variables, from both the asset and liability sides of the ledger: loans, interbank lending and wholesale debt. The first column for this trial paints a picture of banks with a large loan book (67.9% of assets on average) and a rather small trading book (3.3%). These banks are funded mainly through stable sources (76% of assets) and primarily deposits (68.8%) and have limited interbank activity. The algorithm classifies 556 bank-years in this group (about one-third of the bank-year observations in our sample). We label this group the **retail-funded** business model.

The banks in the second column for trial A tend to be similar to those in the first in terms of asset side but not liability side characteristics. Namely, loan books are again large and trading books small, at 73.2% and 5.3% of assets, respectively. However, there is now a larger weight of wholesale debt (36.1%) relative to deposits (41%) and more active borrowing than lending in the interbank market. We label this profile, which comprises 344 bank-years, the **wholesale-funded** business model.

The third business model in trial A is more **trading**-oriented. The average bank in this model has a small loan book (17.2%) and a sizeable trading book (17.4%) relative to other models. It also has a small deposit base as it raises funds mainly in the wholesale (21%) and interbank markets (18.9%). This is the smallest group among the four with 171 observations (bank-years).

The fourth group of banks in trial A is the second most populous, with 532 observations, and appears to blend characteristics of other models. Banks in this group have a rather moderate loan book but hold a rather sizeable portfolio of tradeable securities (33.2%). While they have a good deposit funding base, they are quite active in the interbank market both as borrowers and lenders. We label these banks **universal** because of their hybrid characteristics.

The other three panels of Table 1 present statistics for the other three trials: trial B, which also points to four business models, and trials C and D, which include only three models. The average statistics reveal that each of these three trials points to two commercial banking models that are very similar to the retail- and wholesale-funded models in trial A. This is despite the substantial differences in the number of bank-years across trials. This number varies from 556 (trial A) to 810 (trial C) for the retail-funded model, and from 324 (trial C) to 587 (trial B) for the wholesale-funded model.

The trading model also emerges in all four allocations. It appears most purely captured in trial B, where it includes a small group - only 55 bank-years - characterised by a negligible deposit base and a large trading book, which is funded in wholesale and interbank markets. In trials C and D, which feature only three business models, the average characteristics of the trading banks are a mixture of the trading and universal bank characteristics in the four-model trials.

To further assess the consistency of the classification across the four trials, we refer to the 10th-90th percentile ranges in Table 1 and calculate the overlap between these ranges for trials B, C and D, on the one hand, and the corresponding ranges for trial A, on the other. The greater this overlap, the more similar the business models across trials. As presented in Table C.1 (Annex C), the overlap is consistently bigger for the commercial banking - retail- and wholesale-funded - models than for the trading model. A partial exception to this picture stems from the comparison between trials A and D, where the instances of the biggest overlap are roughly equally split between the commercial banking and trading models. The stability of the characteristics of the retail- and wholesale-funded models is quite remarkable given that the trials are all based on different combinations of input variables. We interpret this as a sign of robustness of the overall approach.
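One possible way to quantify such an overlap is sketched below: the length of the intersection of two percentile ranges, expressed as a share of the reference (trial A) range. The precise definition behind Table C.1 is set out in Annex C and may differ, so this normalisation, the function name and the numbers are assumptions.

```python
def range_overlap(reference, other):
    """Share of the reference range (lo, hi) that is also covered by other."""
    lo = max(reference[0], other[0])
    hi = min(reference[1], other[1])
    return max(0.0, hi - lo) / (reference[1] - reference[0])

# Hypothetical 10th-90th percentile ranges of a ratio, in per cent of assets.
partial = range_overlap((50, 80), (60, 90))   # ranges share 60 to 80
disjoint = range_overlap((50, 80), (85, 95))  # no common interval
```

A value of 1 would mean the other trial's range fully covers the trial A range for that ratio and model, and 0 would mean the two ranges do not intersect.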

For trial A, Graph 1 provides a visualisation of how the methodology separates the four business models on the basis of the underlying input variables.^{10} The top scatter plots depict bank-year observations that are colour-coded according to the business model to which they have been allocated. The top left-hand panel shows that banks separate neatly into three groups on the basis of the size of their loan book (horizontal axis). Wholesale-funded (blue dots) and retail-funded (red dots) banks tend to have larger loan portfolios than universal banks (yellow dots), which in turn have larger loan portfolios than trading banks (black dots). Staying with the same panel, we see that the wholesale debt ratio is the variable that distinguishes between the retail- and wholesale-funded models. The top right-hand panel indicates that the third input variable in trial A - interbank lending - is weaker in discriminating across models.

Importantly, the comparison does not point to any distortion in terms of the implied characteristics of each business model. It does underscore, however, the less robust identification of the trading and universal models: (i) a relatively smaller fraction of the bank-years in these models make it to the respective cores and (ii) core observations are weaker determinants of these models' characteristics (see Table C.2 in Annex C and the accompanying discussion).

From this point on, we will report and discuss results for trial A only, relegating parallel results for trials B, C and D to Annex D. By focusing on only one trial, we keep the exposition shorter. Our choice of trial A is based on three observations. For one, the trial A allocation of bank-years to models is most robust to switching from cluster to discriminant analysis (see Annex A). Second, no pair of input variables for trials B, C or D delivers a cleaner delineation of the models than the one in the left-hand panel of Graph 1 (see Graphs C.1.b-d in Annex C). Third, in comparison to all other trials, the core banks in trial A are stronger drivers of the characteristics of the wholesale-funded models (see Table C.2 in Annex C and the accompanying discussion). Core banks are also more important for the retail-funded model of trial A than for the corresponding model of trial B. The retail- and wholesale-funded models are those that banks switch the most into and out of during our sample period (see Section 4 and Annex D).

For trial A, Table 2 provides information about the distribution of European, North American, Asia-Pacific and emerging market economy (EME) banks across business models. In the left-hand panel of the table - which focuses on 2015, the last year in our sample - the first entry in each cell is equal to the number of banks in a particular model and region. The second entry, in parentheses, equals the share of these banks in the total assets of all the banks from the same region. We see that the universal model comprises most of the larger banks: its high asset share in each region stems from a relatively small number of banks. At the same time, quite a few European and EME banks are in the retail-funded model. The number of North American institutions is rather evenly distributed across the retail-funded, trading and universal models.

The 2015 distribution of banks across models is influenced by preceding transitions from one model to another. The right-hand panel of Table 2 reports the number of region-specific transitions into particular models between 2006 and 2015. Transitions were quite common in Europe. Amounting to 84 in aggregate, they imply that the 67 European banks switched models 1.25 times on average during the sample period. And, in a sign that banks' strategic repositioning is widespread, the average number of switches is almost 1 for EME banks. In comparison, the intensity of transitions is much lower for North American and, especially, Asia-Pacific banks. We analyse in some detail the timing and direction of model transitions in Section 4.

Table 2 also zooms in on global systemically important banks (G-SIBs), as identified by the Basel Committee on Banking Supervision (see BCBS (2013b)). In 2015, most of the G-SIBs are in the universal model, which also accounts for the vast majority of G-SIB assets (left-hand panel). Before 2015, there were quite a few transitions across models, with the average G-SIB switching models 0.84 times. The vast majority of G-SIB transitions were into the retail-funded and universal models.

**3. Size, riskiness, efficiency and performance across models**

To better understand how banks in one business model differ from those in another, we look beyond the input variables that we considered for the allocation of banks to models. As we argued above, the input variables are useful for capturing the roles that banks choose to play in the financial system: eg issuers of retail vs wholesale debt, or providers of retail vs interbank loans, or intermediaries on the interbank market and active traders on capital markets. By contrast, the variables we analyse in this section relate more closely to the effect of those choices. We try to understand whether the banks in different business models differ in terms of: size, risk profile, efficiency and profitability.^{11} We interpret consistent patterns of these characteristics as a result of the interaction between a strategically chosen business model and the macroeconomic and regulatory environments.

For each variable in Graph 2, we plot medians across banks that belong to a specific business model in a given year. The top panels refer to advanced economy banks and the bottom panels to EME banks in the retail-funded and universal models.^{12} We are interested not only in differences across business models but also in changes over time.

We first consider size, as measured by total assets (net of derivatives positions). As shown in the left-hand panels of Graph 2, the largest banks have tended to be in the trading and universal models. That said, the aggregate time trends in these two models have been in opposite directions. The median bank in the trading model downsized during the crisis, shrinking from more than $1 trillion in assets in 2007 to less than $800 billion in 2015. By contrast, the median universal bank more than doubled its size over our sample period. This reflected expansions by both advanced economy and EME banks. In comparison, the median banks in the retail- and wholesale-funded models have consistently been of a smaller and rather constant size, at around $150 billion.

One metric capturing banks' risk profile is the capital-to-assets ratio, which corresponds roughly to the percentage drop in assets that could drive an institution into insolvency. Advanced economy banks have steadily increased this ratio post-crisis (Graph 2, top centre panel), as regulation tightened and appetite for risk declined.^{13} While this upward trend was evident across all models, it was strongest for advanced economy trading banks, whose median capital-to-assets ratio in 2015 exceeded that of EME retail-funded banks (bottom centre panel).

A complementary risk metric is the ratio of risk-weighted to total assets, or the risk density (Graph 2, right-hand panels). Assuming that risk weights are measured correctly and reported truthfully, this ratio would increase with the likelihood of a sizeable asset drop (see, for instance, BCBS (2013a)). Taken at face value, the time profiles of the risk densities suggest that the riskiness of the median EME bank has stayed roughly constant and that advanced economy banks in the retail-, wholesale-funded and universal models have been de-risking. Findings about de-risking should be taken with a grain of salt, however, given evidence that banks underreport risk-weighted assets in order to economise on regulatory capital (Behn et al (2016)).^{14} In turn, advanced economy trading banks preserved their main activities during most of the period under study. As a result, these banks experienced a marked increase in risk density on the back of a post-crisis rise in the regulatory risk weights on trading assets.

Turning to performance indicators, we first examine banks' net interest income. Relative to total assets, this source of earnings would be expected to be largest for banks with sizeable loan books - ie retail- and wholesale-funded commercial banks - and lowest for banks that rely mostly on trading-generated revenue - ie trading banks. Graph 3 (centre panels) reveals that this is indeed the case. Moving beyond levels, there is a slight downward trend post-crisis, which is consistent with exceptionally low monetary policy rates squeezing interest margins in advanced economies and having a spill-over effect in emerging markets. Controlling for the concurrent role of different factors, Borio et al (2017) and Claessens et al (2017) find that banks' net interest income-to-total assets drops as policy rates decline towards zero.

**4. Transitions across models**

Model popularity has evolved materially over time. Measuring popularity as the number of banks in a particular model in a given year, we see that the global financial crisis was a pivotal point for the wholesale- and retail-funded models (Graph 4). On the back of favourable funding conditions in the run-up to the crisis, banks tended to switch into the wholesale-funded model and out of the universal and retail-funded models between 2005 and 2008. Afterwards, banks reassessed their strategies in order to adapt to new regulatory requirements, more demanding markets and a new financial environment. This surfaces as a steady increase in the popularity of the retail-funded model from 2009 to 2015. The flip side was a concurrent decline in the popularity of the wholesale-funded and universal models. The exception to this distinct pattern of bank repositioning comes from the trading model, whose popularity was quite stable over the entire sample period.

These findings shed light on the evolution of bank characteristics over time. For instance, we find that the median size of advanced economy banks in the retail-funded model increases when the popularity of the retail-funded model declines, and vice versa (compare the red line in Graph 2, top left-hand panel, with that in Graph 4). This is consistent with the evolution of the retail-funded model's popularity being driven by smaller banks. The picture is similar for universal banks (yellow lines). By contrast, the stable popularity of the trading model suggests that a constant set of banks experienced a strongly volatile RoE over our sample period (recall Graph 3, top right-hand panel).

Table 3 presents additional information on transitions across business models, focusing on three distinct sub-periods: pre-crisis (2005 to 2007), the crisis and its immediate aftermath (2007 to 2013) and the most recent sub-period in our sample, characterised by relative financial tranquillity (2013 to 2015).^{15} Each cell in the core of this table reports the number of banks that started the relevant sub-period in the model in the row heading and finished it in the model in the column heading. The large numbers along the diagonal of each panel indicate that the majority of institutions remains in the same business model over time.^{16} The overall number of banks that start (finish) a sub-period in a given model is in the rightmost column (bottom row). And the grand total number of banks in each sub-period - ie the banks allocated to any of the four models at both the beginning and end of that sub-period - is in the lower right cell.
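The layout of such a transition table can be illustrated with a minimal sketch. The bank identifiers and model labels below are hypothetical and are not taken from the paper's data; the function name `transition_table` is likewise an assumption made for illustration.

```python
from collections import Counter

MODELS = ["retail-funded", "wholesale-funded", "trading", "universal"]

def transition_table(start, end):
    """Count banks by (model at start, model at end) of a sub-period.

    `start` and `end` map a bank identifier to its model label. Only banks
    allocated to a model at both dates enter the table.
    """
    common = start.keys() & end.keys()
    counts = Counter((start[b], end[b]) for b in common)
    # Rows: model at the start of the sub-period; columns: model at the end.
    table = [[counts[(r, c)] for c in MODELS] for r in MODELS]
    row_totals = [sum(row) for row in table]        # banks starting in each model
    col_totals = [sum(col) for col in zip(*table)]  # banks finishing in each model
    return table, row_totals, col_totals

# Hypothetical allocations for five banks (illustrative only).
start = {"A": "wholesale-funded", "B": "retail-funded", "C": "universal",
         "D": "trading", "E": "wholesale-funded"}
end = {"A": "retail-funded", "B": "retail-funded", "C": "retail-funded",
       "D": "trading", "E": "wholesale-funded"}

table, row_totals, col_totals = transition_table(start, end)
grand_total = sum(row_totals)  # the lower right cell of the table
```

Banks that remain in the same model show up on the diagonal of `table`, while off-diagonal cells count the switches.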

The transition patterns are quite different across the three periods. Pre-crisis, the number of banks in the wholesale-funded model increased by almost half (from 23 to 33), mainly as a result of transitions from the retail-funded model. This pattern changed dramatically after the crisis. Transitions out of the wholesale-funded - as well as the universal - models resulted in the number of retail-funded banks almost doubling (from 34 to 60) between 2007 and 2013. In the most recent period, there were few transitions, resulting in an only slight reversal of the preceding phenomenon. In each of the three sub-periods, the trading model experienced few transitions. Overall, the transition patterns suggest that the global financial crisis marked a distinct turning point in banks' strategic choices: it increased the appeal of the most traditional of the four business models.

**5. Transitions and performance**

Does the transition from one model to another pay off? To answer this question, we focus on banks' RoE as a performance indicator and strip out the effect of systematic factors by calculating changes in *relative* performance. Concretely, focusing on a bank that transitioned from model M to model N in year T, we subtract the bank's average RoE in years T-2 and T-1 from the corresponding average in years T+1 and T+2. This is the change in the bank's performance, or ΔBP. To calculate the change in the bank's peer group performance - ΔPP - we perform the same calculation for each bank that stayed in model M from T-2 to T+2 and then take the median across these banks. Note that the peer group is the same for banks that switched out of a given model in a given year, but would likely differ from the peer group of all other banks that switched models.^{17} Finally, the difference ΔBP - ΔPP is the change in the bank's relative performance, so that a positive value indicates an improvement relative to peers.
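The calculation can be sketched as follows. The RoE figures are hypothetical, the function names are assumptions made for illustration, and the sign is chosen so that a positive value means that the switching bank's RoE improved relative to its peers.

```python
from statistics import mean, median

def roe_change(roe, t):
    """Average RoE in years t+1 and t+2 minus the average in years t-2 and t-1.

    `roe` maps a year to the bank's RoE in percent.
    """
    post = mean([roe[t + 1], roe[t + 2]])
    pre = mean([roe[t - 2], roe[t - 1]])
    return post - pre

def relative_change(switcher_roe, peer_roes, t):
    """Switcher's RoE change minus the median RoE change of its peer group.

    `peer_roes` holds the RoE series of banks that stayed in the switcher's
    original model from t-2 to t+2.
    """
    bank = roe_change(switcher_roe, t)
    peers = median([roe_change(r, t) for r in peer_roes])
    return bank - peers

# Hypothetical example: a switch in year T = 2010.
switcher = {2008: 8, 2009: 10, 2011: 12, 2012: 14}   # change: 13 - 9 = 4
peers = [
    {2008: 10, 2009: 10, 2011: 11, 2012: 11},        # change: +1
    {2008: 12, 2009: 12, 2011: 14, 2012: 14},        # change: +2
    {2008: 9, 2009: 9, 2011: 8, 2012: 8},            # change: -1
]
result = relative_change(switcher, peers, 2010)      # 4 - 1 = 3 percentage points
```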

We find that, while it paid off to switch into the retail-funded model, switching into the wholesale-funded model was associated with a worsening of relative performance (Table 4, column 1). Concretely, a switch into the retail-funded model resulted in a median increase in relative RoE of roughly 2.5 percentage points. By contrast, a switch into the wholesale-funded model resulted in a corresponding decline of roughly 5 percentage points. These changes are statistically significant in the sense that there is a probability of less than 1% that they arise from a random labelling of the models that banks switch into.^{18} The changes are also economically significant, given a median RoE of 14% and 15% in the retail- and wholesale-funded models, respectively.

Could it be that switches into the wholesale-funded model led to underperformance not because of this model's structural drawbacks but because of the timing of most of these switches: just before the global financial crisis (recall Table 3)? Arguably, the crisis and its immediate aftermath offered a particularly hostile environment to those banks that relied on wholesale funding sources. This prompts us to dig deeper and calculate changes in relative performance over two sub-periods. We find that switches into the wholesale-funded model occurring up to 2009 resulted in a median RoE deterioration of roughly 5 percentage points relative to peer institutions. But the deterioration was not a crisis phenomenon: switches into the same model occurring in 2010 or later resulted in an even bigger relative decline, of more than 9 percentage points. Likewise, the crisis does not come across as materially influencing the relative performance of banks switching into the retail-funded model: the corresponding calculations for the two sub-periods point to 3- and 2-percentage point improvements in relative performance. All this leads us to conclude that the statistically significant changes reported in Table 4 (column 1) reflect inherent properties of bank business models.

There is evidence that adjustments to the structure of banks' liabilities have contributed to these changes in relative performance. Of the 40 switches into the retail-funded model (Table 4, column 2), 22 are from the wholesale-funded model and the remainder from the universal model. The individual improvements in relative performance associated with switches from the wholesale-funded model are distinctly larger: in their distribution, all percentiles above the 30th are higher than the corresponding percentiles for the switches from the universal model. A symmetric finding in the case of switches into the wholesale-funded model reveals that the deterioration in relative performance is predominantly the result of switches from the retail-funded model.^{19} Thus, the statistically significant changes in relative performance are driven by switches between the wholesale- and retail-funded models, which - as reported in Table 1 and Graph 1 - feature similar asset side characteristics but differ markedly in terms of the importance of wholesale debt and deposits on the liability side.

What is the likelihood that a bank switching into a specific model would outperform the vast majority of its peers and what drives the switches? These questions are important but formal tests to address them - while controlling for confounding factors - are bound to have low power in our dataset. For instance, the switches into the retail-funded model are not only few (40, as indicated in column 2 of Table 4) but also occurred in different macroeconomic and financial conditions. Concretely, they occurred in 9 different years and involved banks headquartered in 19 different countries. And the smaller number of switches into other models is even less amenable to formal analysis. Nevertheless, we do investigate whether changes in relative performance exhibit systematic patterns.

We compare the change in each switching bank's performance to the distribution of performance changes in its peer group. First, we check whether a switching bank's performance improved by more than that of at least two-thirds of its peers, and we record the number of banks satisfying this criterion in column 3 of Table 4. Second, in column 4, we record the number of banks whose performance deteriorated by more (or improved by less) than the performance of at least two-thirds of their peers.
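A minimal sketch of the two-thirds criterion follows. The performance changes are hypothetical, and the treatment of ties (strict inequalities) is an assumption, since the text does not specify it.

```python
def classify(bank_change, peer_changes, cutoff=2 / 3):
    """Compare a switcher's performance change with its peers' changes.

    Returns "outperformer" if the bank's change exceeds that of at least
    `cutoff` of its peers, "underperformer" if it falls short of at least
    `cutoff` of its peers, and "neither" otherwise.
    """
    n = len(peer_changes)
    beaten = sum(p < bank_change for p in peer_changes) / n      # peers the bank beat
    beaten_by = sum(p > bank_change for p in peer_changes) / n   # peers that beat the bank
    if beaten >= cutoff:
        return "outperformer"
    if beaten_by >= cutoff:
        return "underperformer"
    return "neither"

# Hypothetical changes in RoE, in percentage points.
label_up = classify(4, [1, 2, -1, 5, 0, 3])    # beats five of six peers
label_down = classify(-2, [1, 2, -1])          # beaten by all three peers
label_mid = classify(1.5, [1, 2, 0, 3])        # beats half, beaten by half
```

Counting the banks labelled "outperformer" and "underperformer" for each destination model would give entries analogous to those in columns 3 and 4.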

The takeaways of this exercise differ across models that banks switched into. There is no systematic pattern as regards switches into the retail-funded, universal or trading models: the entries in columns 3 and 4 for these models are roughly balanced and/or low relative to the corresponding total number of switches, in column 2. By contrast, there is a distinct pattern for the wholesale-funded model. For 14 of the 19 banks switching into this model, performance deterioration is larger than that of at least two-thirds of the corresponding peers. Moreover, only one of these 19 banks saw its performance improve by more than that of two-thirds of its peers.

We also investigate whether the switch into a model could be explained by poor pre-switch performance. For this, we examine whether banks switched into a given model after performing worse than at least half of their respective peer group. Columns 2 and 5 in Table 4 reveal that this is the case for 50-60% of the banks switching into the wholesale-funded, universal or retail-funded models. Since this could have been the outcome of a coin toss, we conclude that there is no evidence that poor performance drove banks to reassess their business strategy. The reasons for the reassessment could thus stem from policy measures or a revised post-crisis attitude towards risks in the banking sector.^{20}
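The coin-toss comparison can be made concrete with an exact binomial calculation. This is a sketch under stated assumptions: the counts (23 of 40) are illustrative values inside the 50-60% range mentioned above, and the two-sided p-value definition used here is one common convention, not necessarily the one the authors had in mind.

```python
from math import comb

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than the
    observed count k under a success probability p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Small tolerance so that outcomes tied with k are included despite rounding.
    return sum(q for q in pmf if q <= pmf[k] * (1 + 1e-9))

# Illustrative counts: 23 pre-switch underperformers among 40 switchers (57.5%).
p_value = binom_two_sided_p(23, 40)
consistent_with_coin_toss = p_value > 0.05
```

A large p-value means that a share of this size would not be unusual if pre-switch underperformance were as likely as not.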

Finally, what about those underperformers that did switch their model? As revealed by a comparison between columns 5 and 6 in Table 4, most of the underperformers switching into the retail-funded model saw their performance improve in relative terms. Concretely, of the 23 underperforming banks that switched into the retail-funded model over our sample period, 17 saw their performance improve by more than the peer median within two years. By contrast, there is no similar evidence for banks that switched into the wholesale-funded, universal or trading models.

To recap, the clearest transition pattern is consistent with post-crisis business strategy reassessments leading many banks in our sample to switch from the wholesale- to the retail-funded model. With hindsight, this strategy benefited bank shareholders, as it was accompanied by a statistically and economically significant improvement in banks' RoE relative to their peers' RoE. This was true in particular for those banks that were relative underperformers before the switch. By contrast, banks switching into the wholesale-funded model experienced a substantial deterioration in relative performance.

**Conclusions**

The global financial crisis - and markets' and regulators' reaction to it - drove banks to reassess their business strategies. The two robustly identified models, retail-funded and wholesale-funded commercial banking, were at the centre of the resulting repositioning. While the wholesale-funded model was quite popular in the run-up to the crisis, many banks abandoned it during the first five post-crisis years, attracted by the retail-funded model. Smaller institutions were the main drivers of this repositioning, while a third, universal banking model preserved its appeal to the larger institutions over the years. A fourth model, emphasising trading activity, stood out with the stability of its composition: banks avoided switching either into or out of it despite tectonic shifts elsewhere in the banking sector.

These developments carry mixed messages for bank supervisors. The good news is the increased popularity of retail-funded commercial banking, which has exhibited consistent and sustainable performance supported by greater operational efficiency. Indeed, banks switching into this model saw a significant improvement in their relative profitability, which - all else the same - would put them in a better position than peers to strengthen their balance sheets. The persistent sub-par performance of trading banks presents a puzzle. Profitability of trading banks has been very volatile and costs have been persistently higher than in other models. Yet relatively few banks transition out of this business model in our sample, despite the very poor performance in the post-crisis years.

**References**

Altunbas, Y, S Manganelli and D Marques-Ibanez (2011): "Bank risk during the financial crisis: do business models matter?", *ECB Working Paper Series,* no 1394, November.

Amel, D F and S A Rhoades (1988): "Strategic Groups in Banking," *Review of Economics and Statistics* 70 (4), pp 685-689.

Ayadi, R, E Arbak and W P de Groen (2011): Business Models in European Banking: A pre-and post-crisis screening, Centre for European Policy Studies (CEPS), Brussels.

Ayadi, R and W de Groen (2014): *Banking business models monitor 2014 - Europe,* Centre for European Policy Studies and International Observatory on Financial Services Cooperatives.

Basel Committee on Banking Supervision (2017): Twelfth progress report on adoption of the Basel regulatory framework, April.

Basel Committee on Banking Supervision (2016): Reducing variation in credit risk-weighted assets - constraints on the use of internal model approaches, Consultative document, March.

Basel Committee on Banking Supervision (2013a): Regulatory Consistency Assessment Programme (RCAP). Analysis of risk-weighted assets for credit risk in the banking book, July.

Basel Committee on Banking Supervision (2013b): Global systemically important banks: updated assessment methodology and the higher loss absorbency requirement, July.

Behn, M, R Haselmann and V Vig (2016): *The limits of model-based regulation,* ECB Working Paper No. 1928.

Borio, C, L Gambacorta and B Hofmann (2017): *The influence of monetary policy on bank profitability,* International Finance, vol 20, issue 1 (also BIS Working Paper, No 514).

Calinski, T and J Harabasz (1974): "A dendrite method for cluster analysis", *Communications in Statistics,* no 3, pp 1-27.

Claessens, S, N Coleman and M Donnelly (2017): "'Low-For-Long' interest rates and banks' interest margins and profitability: Cross-country evidence," Journal of Financial Intermediation, forthcoming.

Egan, M, S Lewellen and A Sunderam (2017): *"The Cross Section of Bank Value",* NBER Working Paper 23291.

Everitt, B S, S Landau, M Leese and D Stahl (2011): *Cluster analysis,* fifth edition, John Wiley.

Farne, M and A Vouldis (2017): *Business models of the banks in the euro area,* ECB Working Paper No 2070.

Fisher, R A (1936): "The use of multiple measurements in taxonomic problems", *Annals of Eugenics,* no 7, pp 179-188.

Fix, E and J L Hodges (1951): "Discriminatory analysis, nonparametric discrimination: consistency properties", in technical report no 4, project no 21-49-004, Brooks Air Force Base, USAF School of Aviation Medicine.

Hunt, M S (1978): "Competition in the major home appliance industry, 1960-1970," Unpublished PhD Dissertation, Business Economics committee, Harvard University.

Koepke, R (2015): What drives capital flows to emerging markets? A Survey of the empirical literature, IIF Working Paper, April 23.

Kohler, M (2015): "Which banks are more risky? The impact of business models on bank stability," *Journal of Financial Stability* 16, pp 195-212.

Lachenbruch, P A and M R Mickey (1968): "Estimation of error rates in discriminant analysis", *Technometrics,* vol 10, no 1, pp 1-11.

McLachlan, G J (2004): Discriminant analysis and statistical pattern recognition, John Wiley.

Mergaerts, F and R Vander Vennet (2016): "Business models and bank performance: A long-term perspective," *Journal of Financial Stability* 22, pp 57-75.

Mehra, A (1996): "Resource and Market Based Determinants of Performance in the U.S. Banking Industry," *Strategic Management Journal,* vol. 17, no. 4 (April), pp. 307-322.

Smith, C A B (1947): "Some examples of discrimination", *Annals of Eugenics,* no 13, pp 272-282.

Stiroh, K J and A Rumble (2006): "The dark side of diversification: the case of US financial holding companies," *Journal of Banking and Finance* 30 (8), pp 2131-2161.

Ward, J H Jr (1963): "Hierarchical grouping to optimise an objective function", *Journal of the American Statistical Association,* no 58, pp 236-44.

Wilkinson, L, L Engelman, J Corter and M Coward (2007): "Cluster Analysis" in SYSTAT® 12 Statistics I, pp I- 66-122.

Zott, C and R Amit (2011): "The business model: Recent developments, and future research," *Journal of Management,* vol. 37 no. 4, pp 1019-1042.

**Annex A. General methodology**

In allocating banks to models, we use annual balance sheet data for 178 banks from 2005 to 2015, cluster analysis (CA) and "expert" judgment. On the basis of the balance sheet data, we construct eight ratios for each bank-year pair (see Table B.2 below). To run CA, we need to choose which ratios - ie which bank-year characteristics - to use as "input" variables. We adopt a rather agnostic approach, considering all combinations of three to eight inputs. From the resulting 219 combinations, we drop those in which at least two of the inputs have a correlation coefficient that is greater than 65% in absolute value (see Table B.3). Each combination of inputs defines a "trial" for which CA allocates the bank-year pairs in the sample to distinct clusters, or "models". Finally, we narrow down the number of trials we analyse by considering (i) measures of the discriminatory power of the underlying inputs; (ii) measures of the stability of the allocation outcome, for which we rely inter alia on discriminant analysis (DA); and (iii) the ease with which the allocation outcomes can be described in terms of distinct real-world banking models.
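The count of 219 combinations and the correlation filter can be reproduced with a short sketch. The ratio names `r1` to `r4` and the correlation values are hypothetical placeholders, not the ratios of Table B.2 or the correlations of Table B.3.

```python
from itertools import combinations
from math import comb

N_RATIOS = 8
# Number of ways to pick three to eight inputs out of eight ratios.
n_trials = sum(comb(N_RATIOS, k) for k in range(3, N_RATIOS + 1))  # 219

def admissible(inputs, corr, threshold=0.65):
    """True if no pair of inputs has an absolute correlation above the threshold."""
    return all(abs(corr[frozenset(pair)]) <= threshold
               for pair in combinations(inputs, 2))

# Hypothetical example with four placeholder ratios.
ratios = ["r1", "r2", "r3", "r4"]
corr = {frozenset(pair): 0.1 for pair in combinations(ratios, 2)}
corr[frozenset(("r1", "r2"))] = -0.8   # one highly correlated pair

candidates = [c for k in (3, 4) for c in combinations(ratios, k)]
kept = [c for c in candidates if admissible(c, corr)]
# Every combination containing both r1 and r2 is dropped.
```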

The rest of this annex is organised as follows. First, we provide a non-technical overview of CA and DA, referring the interested reader to existing detailed descriptions in the related literature. Then, we explain how we use CA to arrive at a model allocation for each combination of inputs, ie for each trial. Finally, we describe a step-by-step elimination procedure that allows us to narrow down the number of trials and the number of models in each trial. DA plays a role in this elimination procedure.

**Cluster analysis** employs the algorithm proposed by Ward (1963). This is an *agglomerative* algorithm. For a given trial of input variables, the algorithm forms progressively larger clusters (or models), at each step minimising the "distance" between the observations (in our case, bank-years) within a model and maximising the corresponding distance across models. The algorithm measures the distance between two observations by the sum of squared differences of the respective values of the input variables. Thus, one can derive many variants for each trial, where each variant features a different number of models. The number of models in a trial variant ranges from one to the total number of bank-years.
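To illustrate the agglomerative logic, the following is a naive sketch of Ward-type clustering in plain Python. It is not the implementation used for the paper's results: the data points are hypothetical, and the merge rule (joining the two clusters whose union adds least to the total within-cluster sum of squares) is the textbook form of Ward's criterion.

```python
def sq_dist(a, b):
    """Sum of squared differences between two observations."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Coordinate-wise mean of a list of observations."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def ward(points, n_clusters):
    """Start from singleton clusters and merge until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                # Increase in within-cluster sum of squares if a and b merge.
                cost = (len(a) * len(b) / (len(a) + len(b))
                        * sq_dist(centroid(a), centroid(b)))
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]   # j > i, so index i is unaffected
    return clusters

# Hypothetical bank-years: (loan share, wholesale debt share).
points = [(0.60, 0.10), (0.62, 0.12), (0.65, 0.08),
          (0.63, 0.40), (0.60, 0.42), (0.66, 0.38),
          (0.25, 0.20), (0.22, 0.18), (0.28, 0.22)]
clusters = ward(points, 3)
```

Stopping the loop at different values of `n_clusters` yields the variants of a trial, from one cluster per observation down to a single cluster.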

The (pseudo) F-index proposed by Calinski and Harabasz (1974) is a metric that distinguishes among the variants of a given trial according to their information content. The index is akin to an adjusted R^{2} in that it increases with the "goodness of fit" and decreases in the number of clusters (or models). In our context, goodness of fit is measured as the average distance between the characteristics of bank-years belonging to different clusters divided by the corresponding average for bank-years in the same cluster. The F-index is not comparable across trials. For a given trial, however, the highest F-index indicates the most informative variant.
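As an illustration, the index can be computed in its standard form: between-cluster dispersion per (k - 1) divided by within-cluster dispersion per (n - k), for n observations and k clusters. The sketch below assumes this standard form and uses hypothetical data; the exact computation in the authors' software may differ.

```python
def sq_dist(a, b):
    """Sum of squared differences between two observations."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Coordinate-wise mean of a list of observations."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def pseudo_f(clusters):
    """Calinski-Harabasz pseudo-F for a partition (requires 1 < k < n)."""
    points = [p for c in clusters for p in c]
    n, k = len(points), len(clusters)
    overall = centroid(points)
    # Dispersion of cluster centroids around the overall centroid.
    between = sum(len(c) * sq_dist(centroid(c), overall) for c in clusters)
    # Dispersion of observations around their own cluster centroid.
    within = sum(sq_dist(p, centroid(c)) for c in clusters for p in c)
    return (between / (k - 1)) / (within / (n - k))

# Hypothetical bank-years: (loan share, wholesale debt share).
g1 = [(0.60, 0.10), (0.62, 0.12), (0.65, 0.08)]
g2 = [(0.63, 0.40), (0.60, 0.42), (0.66, 0.38)]
g3 = [(0.25, 0.20), (0.22, 0.18), (0.28, 0.22)]

f_three = pseudo_f([g1, g2, g3])   # three-model variant
f_two = pseudo_f([g1 + g2, g3])    # two-model variant of the same data
```

In this example the three-model variant has the higher index, so it would be regarded as the more informative variant of the trial.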

**Discriminant analysis** (DA) can be used to study the discriminant power of input variables for bank-years with a known model membership (see Fisher (1936), Smith (1947), Fix and Hodges (1951) and Lachenbruch and Mickey (1968)). Two sets of assumptions are necessary. The first one is on the prior probability that any given observation belongs to a particular model. We adopt the agnostic assumption that this prior is uniform across models. The second assumption is on the functional form of the probability density function of input variables in each model. Again, we choose an agnostic approach and consider three options in Stata: "linear", where the densities are multivariate normal, with equal covariance matrices across models; "quadratic", which is similar to "linear" but allows the covariance matrices to differ across models; and "logistic", where the likelihood ratios have an exponential form.

DA uses these assumptions to derive posterior probabilities: in our case, the probability that a bank-year belongs to a particular model, conditional on the values of the input variables. In a first step, DA uses the input variables for bank-years with a known model to estimate the joint density of these variables, conditional on the model (for the density's assumed functional form). Then, in a second step, DA combines this density with the assumed priors and invokes Bayes' theorem to derive the desired posterior. Importantly, DA may assign a bank-year's maximum posterior probability to a model that is different from the one that this bank-year is "known" to belong to. In such a case, the input variables generate an allocation error.

In the above process, DA derives posterior probability *functions,* ie mappings from the values of input variables into a probability that the associated bank-year belongs to a particular model. With such functions in place, DA can attribute a model to additional bank-years, for which the model is a priori unknown.
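The two steps can be illustrated with a deliberately simplified sketch of the "linear" case: a single input variable, normal densities with a common variance and a uniform prior. The sample values and function names are hypothetical; the actual analysis uses several inputs and multivariate densities.

```python
from math import exp
from statistics import mean

def fit_linear_da(samples):
    """Step 1: estimate per-model means and a pooled (common) variance.

    `samples` maps a model label to the values of one input variable for
    bank-years with that "known" model.
    """
    means = {m: mean(xs) for m, xs in samples.items()}
    n = sum(len(xs) for xs in samples.values())
    ss = sum((x - means[m]) ** 2 for m, xs in samples.items() for x in xs)
    return means, ss / (n - len(samples))

def posterior(x, means, var):
    """Step 2: Bayes' theorem with a uniform prior.

    With equal variances and equal priors, the normalising constants cancel,
    so the posterior is proportional to exp(-(x - mean)^2 / (2 * var)).
    """
    dens = {m: exp(-(x - mu) ** 2 / (2 * var)) for m, mu in means.items()}
    total = sum(dens.values())
    return {m: d / total for m, d in dens.items()}

# Hypothetical wholesale debt shares for bank-years with a "known" model.
samples = {"retail-funded": [0.08, 0.10, 0.12],
           "wholesale-funded": [0.38, 0.40, 0.42]}
means, var = fit_linear_da(samples)

post = posterior(0.15, means, var)        # a new bank-year with unknown model
assigned = max(post, key=post.get)        # model with the highest posterior
```

An allocation error in the sense described above would arise if `assigned` differed from the model that the bank-year is "known" to belong to.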

Trial elimination procedure

1. For each set of input variables (ie each trial), we calculate the CA F-scores for versions with two to fifteen models. We drop those trials for which the maximum F-score is not associated with three, four or five models. This reflects the judgment that fewer than three models do not allow for a meaningful differentiation of banks and more than five models are difficult to interpret.
2. From the remaining trials, we construct two sets. In the first - set I - the maximum F-score is (i) at least 10% higher than the F-scores associated with two models or with six or more models for the same trial and (ii) less than 10% higher than the other F-scores associated with three to five models for the same trial. In the second - set II - the maximum F-score is at least 10% higher than all the other F-scores for the same trial.
3. From set I, we keep the three-, four- *and* five-model variants of each trial. From set II, we keep the three-, four- *or* five-model variant that has the highest F-score in the corresponding trial. We drop all the other trial variants.
4. For each of the remaining trial variants, we consider 11 subsamples. To construct each subsample, we drop observations that belong to a specific year, from 2005 to 2015. We then rerun cluster analysis on one subsample at a time and record each time the number of the observations that are now allocated to a different model. Dividing this number by the total number of observations in the subsample gives a "discrepancy rate". Averaging across years delivers the "CA discrepancy rate" for each trial variant.
5. In applying DA to a given trial variant, we treat CA-based allocations as indicating the "known" model of each bank-year. Leaving out one bank-year at a time, we use DA to derive the mapping from input variables to posterior probabilities of belonging to particular models. On the basis of the so-determined mapping, we obtain a DA-implied model for the left-out observation and record if it is different than the CA-implied one. Combining the results for all bank-years, we obtain a "DA leave-one-out (LOO) error rate". This is available only for the linear and quadratic versions of DA.
6. We conduct a similar exercise as in step 4 but use a DA-based model allocation over the whole sample of bank-years as the baseline. For each trial variant, this delivers a "DA discrepancy rate".
7. We eliminate trials for which at least one of the three metrics in steps 4-6 exceeds some judgment-based thresholds.
   - As regards set I, we eliminate all trial variants for which: the CA discrepancy rate is greater than 20%, *or* the average DA leave-one-out error rate is greater than 10%, *or* the DA discrepancy rate is greater than 10%. When referring to the DA-based metrics, we eliminate trial variants if the corresponding threshold is crossed for the linear, quadratic *or* logistic specification (when available). This leaves us with two trials, each with three-, four- and five-model versions:
     - Trial A, where the input variables are gross loans, interbank lending and wholesale debt.
     - Trial B, where the input variables are trade, interbank lending, interbank borrowing and deposits.
   - As regards set II, we perform a similar robustness check and end up with two additional trials, each with a three-model version:
     - Trial C, where the input variables are: trading book, interbank borrowing and wholesale debt.
     - Trial D, where the input variables are: gross loans, trading book, interbank lending, interbank borrowing and deposits.
8. We also consider scatter plots of input variables in order to determine whether their distributions within and across models allow for an intuitive interpretation (see Graphs 1 and C.1.b-d). Background (unreported) plots reveal that the three- and five-model versions of trials A and B generate significant overlaps of the input variables, preventing a clear distinction across models. We thus focus only on the four-model versions of A and B. In each of these two trials, the four-model version has the highest F-score. In the main text, we use "trial A and B" to refer to the four-model variants of these trials.
9. Most of our discussion in the main text is based on trial A. There are three reasons for this choice.

- The allocation of bank-years to models under trial A is most robust to switching from CA to DA. To derive this, we conduct DA on the whole sample of bank-years while assuming that the "known" models are as implied by CA. The CA to DA switch results in a different model allocation for 5.5% of the bank-years in trial A, compared to 6.6% for trial B and 6% for trials C and D. In addition, when the model allocation changes, we calculate the difference between (i) the DA-implied probability for the CA-implied model and (ii) the DA-implied probability for the DA-implied model. The smaller these differences, the greater the agreement between CA and DA for the given trial, all else equal. The 10th, 50th and 90th percentiles of these differences are 3%, 26% and 66%, respectively, for trial A. The corresponding statistics are 4%, 45% and 94% for trial B, 5%, 31% and 80% for trial C, and 7%, 46% and 89% for trial D.
- No pair of input variables for trials B, C or D delivers as clean a delineation of the models as the loan and wholesale debt ratios for trial A (compare Graph 1 and Graphs C.1.b-d).

- In comparison to the other three trials, trial A exhibits a distinctly smaller role of periphery bank-years in the two models that banks switch into and out of the most during our sample period. Annex C outlines our methodology for identifying core and periphery bank-years and Table C.2 reports a measure of the importance of periphery banks in each model and trial variant.
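The CA discrepancy rate from step 4 and its 20% threshold for set I can be illustrated with a minimal sketch. It is not the code behind the reported results: the function names, the `loans` field and the three-year toy sample are assumptions, and `toy_allocate` is a hypothetical stand-in for cluster analysis that splits observations at the subsample median of a single ratio. The sketch compares each subsample allocation with the full-sample allocation, which is one reading of "allocated to a different model".

```python
from statistics import mean, median


def discrepancy_rate(obs, allocate, years):
    """Average, over dropped years, of the share of bank-years that change model.

    obs      : list of dicts with keys "bank", "year" and the input variables
    allocate : maps a list of observations to a list of model labels
               (hypothetical stand-in for cluster analysis)
    years    : the years to drop, one at a time
    """
    keys = [(o["bank"], o["year"]) for o in obs]
    baseline = dict(zip(keys, allocate(obs)))  # full-sample allocation
    rates = []
    for year in years:
        sub = [o for o in obs if o["year"] != year]  # drop one year
        labels = allocate(sub)                       # rerun the allocation
        changed = sum(
            label != baseline[(o["bank"], o["year"])]
            for o, label in zip(sub, labels)
        )
        rates.append(changed / len(sub))
    return mean(rates)


def toy_allocate(obs):
    """Hypothetical allocator: split at the median loan ratio (in per cent)."""
    cut = median(o["loans"] for o in obs)
    return ["high" if o["loans"] >= cut else "low" for o in obs]


# Toy sample: two banks, three years (the paper uses 2005 to 2015).
sample = [
    {"bank": "A", "year": 2005, "loans": 20},
    {"bank": "B", "year": 2005, "loans": 30},
    {"bank": "A", "year": 2006, "loans": 50},
    {"bank": "B", "year": 2006, "loans": 70},
    {"bank": "A", "year": 2007, "loans": 80},
    {"bank": "B", "year": 2007, "loans": 90},
]

rate = discrepancy_rate(sample, toy_allocate, [2005, 2006, 2007])
keep_variant = rate <= 0.20  # set I threshold for the CA discrepancy rate
```

With real cluster analysis the model labels are arbitrary, so labels from each subsample run would first have to be matched to the full-sample models before the comparison; the toy allocator sidesteps this by using fixed labels.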

The two commercial banking - ie retail- and wholesale-funded - models tend to feature more stable characteristics than the trading and universal models. We establish this on the basis of the following comparisons of intersection-to-union ratios in Table C.1. Focusing on the left-hand panel and on one input variable at a time, we first record the number of instances in which an intersection-to-union ratio for the trading model is smaller than that for the retail-funded model. This number is 6, out of a maximum of 8. The corresponding number is 7 (again out of a maximum of 8) when we compare the universal and retail-funded models. And the findings are identical if we replace the retail-funded with the wholesale-funded model for the comparisons. These findings imply that - across trials A and B - the characteristics of the two commercial banking models are more stable than those of the trading and universal models. The message from the centre panel is similar: the trading model tends to have smaller ratios and thus has less stable characteristics across trials A and C than the two commercial banking models. In a slight departure from this general picture, the right-hand panel reveals that the characteristics of the trading model are less stable across trials A and D than the characteristics of the wholesale-funded model, but more stable than those of the retail-funded model.

**Identifying core banks**

We first state the general condition for specific bank-years to belong to the "core" of a trial variant's model. Core banks should be so similar to each other that they would tend to be allocated to the same model for different sets of the input variables used in cluster analysis. To operationalise this condition, we require that the core banks be allocated to the same model for at least three-quarters of the trials that are left after stage 3 of the trial elimination procedure in Annex A.

Focusing on model X in trial variant Y, we filter out the non-core - or periphery - banks as follows:

Stage I: Score matrix

- The other trial variants we consider for this exercise feature the same number of models as trial variant Y. We denote by Z the set comprised of Y and these trial variants.
- We pick one bank-year, i, in model X of trial variant Y.
- We pair this observation with another bank-year in the same model and trial variant, while disregarding pairs related to the same bank in different years.
- We then record the number, N, of trial variants in Z for which the pair from step 3 is allocated to the same model. We also record the number, M, of trial variants in Z that feature this pair in any model. A bank-year would be excluded from a trial because of missing data for any of the relevant input variables. In our particular application, the maximum value of M is 12. To abstract from banks with too few data points, we drop bank-year pairs for which M < 9.
- We define the score of the bank-year pair as the ratio N/M.
- We repeat steps 3-5 for all bank-years other than *i* that are in model X of trial variant Y. This delivers a vector of scores for *i*, which we place as a row vector in a score matrix.
- The score matrix is constructed by repeating steps 2-5 for all the bank-years in model X of trial variant Y. This matrix is symmetric.
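Stage I can be sketched as follows. This is a minimal illustration under stated assumptions: the data layout (a dictionary from trial variant to a dictionary from bank-year to model label), the function names and the toy sample are hypothetical, and disregarded or dropped pairs are stored as `None`. Only agreement within each trial variant matters, so model labels need not be comparable across trial variants.

```python
def pair_score(a, b, allocations, min_trials):
    """Return N/M for the pair (a, b), or None if the pair is dropped.

    M counts the trial variants in Z that feature both bank-years;
    N counts those in which both are allocated to the same model.
    """
    m = n = 0
    for alloc in allocations.values():
        if a in alloc and b in alloc:
            m += 1
            if alloc[a] == alloc[b]:
                n += 1
    if m == 0 or m < min_trials:
        return None  # too few data points for this pair
    return n / m


def score_matrix(members, allocations, min_trials=9):
    """members: bank-years (bank, year) in model X of trial variant Y."""
    size = len(members)
    matrix = [[None] * size for _ in range(size)]
    for r in range(size):
        for c in range(r + 1, size):
            a, b = members[r], members[c]
            if a[0] == b[0]:
                continue  # same bank in different years: pair disregarded
            score = pair_score(a, b, allocations, min_trials)
            matrix[r][c] = matrix[c][r] = score  # symmetric by construction
    return matrix


# Toy set Z with three trial variants (the paper has up to 12, with M >= 9).
members = [("A", 2010), ("B", 2010), ("C", 2010), ("A", 2011)]
Z = {
    "Y": {m: "X" for m in members},
    "T1": {("A", 2010): 1, ("B", 2010): 1, ("C", 2010): 2, ("A", 2011): 1},
    "T2": {("A", 2010): 1, ("B", 2010): 2, ("A", 2011): 1},  # C missing
}
scores = score_matrix(members, Z, min_trials=2)
```

In the toy sample, bank C has missing data in trial variant T2, so its pairs are scored over two trial variants rather than three.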

Stage II: Filtering out periphery observations

- Let row *j* (equivalently, column *j*) in the score matrix obtained at Stage I correspond to bank-year *j* in model X of trial variant Y.
- If more than P% of the scores on row *j* are smaller than 0.75, we mark this row and the corresponding column for deletion.
- Once we have conducted step 2 for all rows, we delete the marked rows/columns. This produces a downsized matrix.
- We pick the value of P% in step 2 to be as low (ie as conservative) as possible so that at least 95% of the upper-triangle entries in the downsized matrix are at least as high as 0.75. The value of P% is model-specific. In our application, it ranges between 70 and 77%.
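The filtering in Stage II can be sketched for a given value of P. The function names and the toy matrix are assumptions, as is the treatment of missing scores (`None` entries are ignored, and a row with no valid scores is dropped). The sketch does not select P; it only reports, for a candidate P, which bank-years survive and what share of the upper-triangle scores in the downsized matrix is at least 0.75.

```python
def filter_periphery(matrix, p, cutoff=0.75):
    """Indices of rows kept when rows with more than p% low scores are deleted.

    All rows are assessed against the original matrix before any deletion.
    """
    keep = []
    for j, row in enumerate(matrix):
        scores = [s for k, s in enumerate(row) if k != j and s is not None]
        low = sum(1 for s in scores if s < cutoff)
        if not scores or 100 * low > p * len(scores):
            continue  # marked for deletion (periphery)
        keep.append(j)
    return keep


def upper_triangle_share(matrix, keep, cutoff=0.75):
    """Share of upper-triangle scores among kept rows that are >= cutoff."""
    scores = [
        matrix[r][c]
        for i, r in enumerate(keep)
        for c in keep[i + 1:]
        if matrix[r][c] is not None
    ]
    if not scores:
        return None
    return sum(1 for s in scores if s >= cutoff) / len(scores)


# Toy symmetric score matrix for four bank-years.
pairs = {(0, 1): 1.0, (0, 2): 0.8, (0, 3): 0.5,
         (1, 2): 0.9, (1, 3): 0.6, (2, 3): 0.7}
scores = [[None] * 4 for _ in range(4)]
for (r, c), s in pairs.items():
    scores[r][c] = scores[c][r] = s

core_50 = filter_periphery(scores, 50)    # bank-year 3 is filtered out
share_50 = upper_triangle_share(scores, core_50)
core_100 = filter_periphery(scores, 100)  # nothing is filtered out
share_100 = upper_triangle_share(scores, core_100)
condition_50 = share_50 >= 0.95           # the 95% requirement in step 4
```

Evaluating the 95% condition over a grid of candidate values of P, separately for each model, would then support the model-specific choice described in step 4.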

Applying this procedure to the models in trials A, B, C and D, we obtain the core observations portrayed in Graphs 1 and C.1.b-d. In turn, Table C.2 illustrates the relative roles of core and periphery observations in the model's summary characteristics.

To see that the cores of the commercial banking - ie retail- and wholesale-funded - models tend to include the highest fractions of initial observations, refer to the last rows in each panel of Table C.2 and the corresponding rows in Table 1 in the main text. For trials A, C and D, 50 to 86% of the observations in a retail- or wholesale-funded model are also in that model's core. The corresponding shares are 15 to 37% for a trading or universal model. An exception to this general pattern comes from trial B, where the trading model's core includes the highest fraction of initial observations: 75%.

The higher a number in parentheses, the more important core bank-years are as drivers of the corresponding model's characteristic. A comparison of these numbers reveals that core banks are distinctly more important for the characteristics of retail- and wholesale-funded models than for those of trading and universal models in trials A, C and D and of the universal model in trial B. The high numbers in parentheses for the trading model in trial B are the only exception to the general pattern.

The same numbers indicate that - as far as the wholesale-funded model is concerned - the core bank-years are most important in trial A. That said, as far as the retail-funded model is concerned, the core bank-years in trial A are more important only relative to trial B.