I hope to break even this week. I need the money. –veteran gambler

Everyone has to model slippage on their own, based on historical execution data, since it depends on account size, instruments traded, the time of day you trade, and so on; there is no good general model. Slippage matters because it can turn an alpha-generating strategy into a money-losing one. Modelling slippage matters because once you can measure it, you can optimize trade size and frequency to minimize transaction costs. We want everything measured as accurately as possible so that the optimization can be automated.

This section is naturally organized around the development of a few definitions.

**Participation Rate** = Percent of ADV (average daily volume).

If you transact 100,000 shares of GOOG in one day, and it trades an average of 3.0M shares per day, then your participation rate is 100,000/3,000,000 ≈ 3.3%.

Actually, we can generalize percent of ADV to a time range other than a day: participation rate is the percentage of the average total volume transacted in the stock over a given time range that your own trading accounts for. So if 500,000 shares of GOOG trade every hour, and you buy a 100,000-share position over the course of an hour, then your participation rate is 20%. Higher participation rates mean higher slippage, and vice versa. Even 1% could be considered a high participation rate (for GOOG at roughly $500/share, this is 3,000,000 × 500 × 0.01 = $15M of stock per day), although I'm just trying to give you a rough feel for how low a "high" participation rate can be.
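Participation rate is trivial to compute, but worth pinning down precisely; here is a minimal R sketch using the GOOG numbers above:

```r
# Participation rate = your volume / average total market volume
# over the same window of time
participation_rate <- function(my_shares, market_shares) {
  my_shares / market_shares
}

participation_rate(100000, 3000000)  # daily: 100,000 shares vs. 3.0M ADV, ~3.3%
participation_rate(100000, 500000)   # hourly: 100,000 shares vs. 500,000/hour, 20%
```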

# Liquidity Impact

Liquidity impact is the money you lose when there aren't enough shares offered at the best price, so you have to bid higher to draw out more supply (i.e. more sellers). The two key determinants here are spread and depth. Here's what they look like in terms of an order book:

# Slicing Up Orders

Spread hits you with a cost every time you trade; low depth eats into big orders. Higher participation rates generally cause higher liquidity impact. There are some simple but widely used strategies to decrease participation rate and reduce liquidity impact: basically, you slice your order up over time. There are two simple patterns of slicing: TWAP (time-weighted average price), which sends equal-sized slices at regular intervals, and VWAP (volume-weighted average price), which sizes each slice in proportion to the volume that historically trades in that part of the day.

VWAP is better than TWAP. There are other execution strategies with creative names, like guerrilla, sonar, etc., which actually interact with the market during the day. Instead of targeting liquidity based on historical average volumes, they ping the market looking for "icebergs", large hidden orders, from which the algo can extract liquidity without impact until the iceberg is exhausted. These more sophisticated execution algorithms are generally classified as liquidity-seeking rather than passive.
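Here is a minimal R sketch of the two slicing schedules for a hypothetical 100,000-share order; the intraday volume profile is made up (a stylized U-shape, heavy at the open and close):

```r
total_shares <- 100000
n_slices <- 5

# TWAP: equal-sized slices at regular time intervals
twap <- rep(total_shares / n_slices, n_slices)

# VWAP: slices proportional to the volume that historically trades in each
# interval (hypothetical U-shaped intraday profile)
hist_volume <- c(300000, 150000, 100000, 150000, 300000)
vwap <- total_shares * hist_volume / sum(hist_volume)

twap  # 20000 20000 20000 20000 20000
vwap  # 30000 15000 10000 15000 30000
```

Both schedules transact the full order; VWAP concentrates the trading where the market is historically deepest, which is why it tends to keep the participation rate lower at any given moment.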

In the news you'll hear that spreads are widening because of high volatility. We've covered how participation rate causes liquidity impact when market depth is too shallow; now I'll briefly explain how the market-making business works, to show why spreads widen and to complete this part of the cost picture. A market maker is like a warehouse; his business model is warehousing inventory (sounds kind of boring, huh?). Instead of selling to another trader, you sell to the market maker, and then he sits around and waits for a buyer so you don't have to. He's worried that the price will fall before he finds that buyer, so he charges you a spread. Of course there are many market makers, so no single one can set the spread, but competition drives them to more or less agree.

When volatility is high, the price tends to move faster, so market makers charge wider spreads to compensate for the greater risk of holding inventory until someone else comes along. Market making is one of the most consistently profitable trading strategies, and some exchanges will even pay rebates (the opposite of commissions) to "liquidity providers", i.e. traders who rest limit orders, as market makers do.

Now let’s look at the other component of slippage.

# Alpha Loss

You only attempt to trade into a position that you expect to make money, so the longer you take to get in, the more of that profit has already been absorbed by the market and missed. Whereas we wanted to slow execution down to decrease liquidity impact, here we want to execute as fast as possible. This suggests an optimization problem; we'll return to it in a moment.

Estimate short-term alpha either by asking the portfolio manager or trader where, on average, he expects the market to go over the few hours after a buy/sell order is issued, or by statistically analysing a large sample of past executions and the average path of prices before and after them. (Any time we say "statistical analysis" we generally mean dealing with averages or frequencies rather than looking for trends or meaning in individual points.)

Here we assume alpha is generated linearly from the time the strategy issues a signal to the time the alpha signal is totally exhausted. For example, a momentum strategy might signal at time T that supply and demand dynamics indicate the market will rise over the next hour; after that hour the signal is exhausted. If it takes you 20 minutes to scale into the position while minimizing liquidity impact, then you only get to ride the momentum (alpha) from T + 20 to T + 60, i.e. for 40 minutes instead of 60. Since we assume the alpha accrues linearly, the gain from T + 20 to T + 30 is the same as from T + 40 to T + 50, and so on. Finding alpha is hard enough, so when modelling its exact shape, keep it simple.
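Under this linear assumption, the alpha you capture is just the fraction of the signal's horizon remaining when you finish scaling in. A toy R check of the momentum example (60-minute signal, 20 minutes to get in; the 30bp signal value is made up):

```r
# Alpha accrues linearly until the signal is exhausted at T + horizon.
# If scaling in takes t_exec, you only ride the move from t_exec to horizon.
alpha_captured <- function(total_alpha_bp, horizon, t_exec) {
  total_alpha_bp * (horizon - t_exec) / horizon
}

alpha_captured(30, 60, 20)  # 20bp captured; the other 10bp is alpha loss
```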

Let's look at a diagram of how liquidity impact and alpha loss combine. Notice that moving up in this chart means losing more money, and moving right shows how the price impact evolves over time. You start executing at the left side; as you execute more shares, the liquidity impact increases until you're done, at which point the price corrects itself. The longer you take, the more alpha loss you suffer.

This is a very important diagram. It shows how you can estimate the liquidity impact even though it's confounded by the alpha loss (we'll need estimates of both components later). Basically, estimate alpha loss first, then estimate liquidity impact after subtracting out the effect of alpha loss. For alpha loss, draw the line of best fit through the average price path of all the stocks you got signals on but then didn't trade; this is the dashed line in the picture. Once you have this baseline, draw the line of best fit through the stocks you did trade and subtract out the baseline. (Alternatively, you could execute a lot of shares with no signal to estimate the liquidity impact first, but it's cheaper the other way.) Now you have each component.
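A toy R sketch of that two-step estimation. The two average price paths are entirely made up; `untraded` plays the role of the dashed baseline (pure alpha), and `traded` includes your own impact:

```r
# Average cumulative price moves (in bp) vs. minutes since signal, hypothetical:
t <- 0:10
untraded <- 2 * t                                        # signals not traded: alpha only
traded   <- c(0, 5, 9, 12, 14, 15, 15, 16, 17, 19, 20)  # signals traded: alpha + impact

baseline   <- lm(untraded ~ t)           # line of best fit = alpha loss baseline
liq_impact <- traded - predict(baseline) # subtract the baseline to isolate impact
round(unname(liq_impact), 1)             # rises while executing, then decays back to 0
```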

# Execution Shortfall

This motivates another definition: Execution Shortfall = Liquidity Impact + Alpha Loss

Shortfall is the number we'll try to minimize by choosing how long to spread execution over.

Here’s a diagram of where the minimizing point lies:

Notice that here the x-axis gives different choices of how long you might take to execute, whereas in the previous chart it showed what happened after you chose a certain time and then started to execute. These two curves were estimated in the previous step (though now alpha loss is somewhat curved because we're looking over a longer time range). To find the minimum in Excel or Matlab, add the two curves together and take the minimum of the sum-curve. For example, with liquidity impact $x = (10, 9, 8, 6, 5, 4, 2, 1, 1)$ and alpha loss $y = (1, 2, 3, 4, 5, 6, 7, 8, 9)$ along execution times $T = 1, \dots, 9$, the sum is $x+y = (11, 11, 11, 10, 10, 10, 9, 9, 10)$, which is minimized at the 7th and 8th entries, i.e. execution times of $T = 7$ or $T = 8$.
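The same minimization in R, with the numbers from the example (x is liquidity impact, y is alpha loss, indexed by candidate execution times T = 1..9):

```r
x <- c(10, 9, 8, 6, 5, 4, 2, 1, 1)  # liquidity impact for each execution time
y <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)   # alpha loss for each execution time
shortfall <- x + y                  # execution shortfall
shortfall                           # 11 11 11 10 10 10 9 9 10
which(shortfall == min(shortfall))  # T = 7 and T = 8 tie for the minimum
```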

This is what you need to optimize the duration of execution. It should be clear that these issues matter more for a large investor than a small one. This framework can help you analyse your strategy's execution, but of course it's not one-size-fits-all. For example, if you trade around very sudden events, the alpha profile might not be linear; and if you trade micro caps, liquidity impact could be extremely hard (or useless) to model.

# Capacity

One final thing to mention: this model is also useful for getting a grasp on the capacity of a strategy. Capacity is the amount of money you can manage following a certain strategy before transaction costs wipe out the alpha. Transaction costs rise as you manage and move more money. Look at how much transaction costs would have to rise before there is no optimal execution time at all, i.e. before it's best never to trade. Then, based on your transaction cost model, see what that rise corresponds to in number of shares traded, and back out the portfolio value that would produce orders of that size. There's your capacity. Here's what it means to be beyond capacity, in a simple diagram:

# Practical Transaction Costs

It is really good to have a correct, intuitive, theoretical understanding of these systems. However, transaction costs are hard to model, and usually you just want a rough idea of whether a strategy will be profitable. If the details of the transaction costs matter, it is often better to find a model with more edge.

Take out a pen and an envelope or scrap of paper to jot down the numbers and calculations that follow, or it will all blur together and you won't retain anything.

Let's say you finish a backtest that ignores transaction costs, and on average your strategy makes a couple of basis points per trade. Trades are medium-high frequency, with a holding period of 15 minutes and a few signals per day. It's an equities strategy that works in highly liquid names like MSFT, with one-to-two-penny spreads. Specifically, say it makes on average 0.00022 = 2.2bp per trade in MSFT, which was at $26.04 at the time of this analysis. We'll trade through Interactive Brokers, which has a decent API for automated strategies. Look at the profits and costs on 200 shares.

The dollar size, 200 shares at $26.04, is $5208, so the dollar alpha per trade is 5208 × 0.00022 ≈ $1.15. Conservatively, say the spread is $0.02.
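Those per-trade numbers as a quick R calculation:

```r
price  <- 26.04
shares <- 200
alpha  <- 0.00022              # 2.2bp per trade from the backtest

dollar_size  <- shares * price       # $5208 notional per trade
dollar_alpha <- dollar_size * alpha  # ~$1.15 of edge per trade
spread_cost  <- 0.02 * shares        # $4 if you cross the full spread
```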

Here are the choices we have for execution strategies, with their cost per 200 shares, difficulty of programming, and other risks/uncertainty.

- IB bundled, take liquidity. Commission = $0.005/share = 0.005 × 200 = $1; total = commission + spread = 1 + 0.02 × 200 = $5. Easy to code.
- Interactive Brokers (IB) bundled, add liquidity. Total = commission = $1. Harder to code; adverse selection; earn the spread.
- IB unbundled, Island, add liquidity, peg to midpoint. Commission = $0.0035/share, rebate = $0.0020/share; total = 0.0015 × 200 = $0.30. Not too hard to code; probably not much adverse selection nor price improvement.
- IB unbundled, BATS, add liquidity. Commission = $0.0035/share, rebate = $0.0027/share; total = 0.0008 × 200 = $0.16. Harder to code.
- Lime (another broker with a good API), ?. Commission = ?. High volume per month can lower the commissions to as low as $0.001/share.

Adverse selection is when you're using limit orders and you only get filled on the shares that are moving against you (e.g. a bid typically gets filled when the price is moving down, so you end up with a long position as the market drops). Programming liquidity-adding strategies takes more work because you have to work orders, and eventually cancel and hedge if an order never gets filled; this adds a few more steps to the strategy logic.

#3 looks like the best: it's low cost and not too hard to program, and its uncertain risks mitigate each other. #1 is too expensive. #2 is harder to code and costs more than #3. #4 is harder to code and worse than #3 for a slow trader (i.e. anyone who's not co-located), since BATS doesn't have the pegged-to-midpoint order type that Island/INET/NASDAQ has. I don't know the details of #5 since they don't publish rates, but in the past they were competitive with IB, offered colocation, and had a better API.

This strategy might seem really bad: you put about $5000 at risk to make $1.15 and pay $0.30 in costs.

The economics that make automated strategies work: trade each of the 100 most liquid symbols on average 5 times per day on each of the 20 trading days per month, in 400-share positions instead of 200. The total volume is then 100 × 5 × 20 × 400 = 4,000,000 shares/month. At that level, IB drops commissions from $0.0035/share to $0.0015/share, which is less than the rebate, so transaction costs come out to roughly $0 (assuming adverse selection slightly outweighs the extra profit from collecting spreads). Profits are 100 × 5 × 20 × 2 × $1.15 ≈ $23,000/month, or about $275,000/year.

The back-of-the-envelope amount of capital you need: the average holding period is 15 minutes, there are 390 minutes per trading day, and we are watching 100 symbols and making on average 5 trades in each per day, so on average we will have 15/390 × 100 × 5 ≈ 20 positions on at any time. Stock prices are highly correlated and the strategy is based on price formation, though, so to be safe multiply that by 5. 100 positions times roughly $10,000 each gives gross exposure of about $1,000,000. Intraday it is easy to get 4-20x leverage, so the most we would probably have to put up is $1,000,000.
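The whole back-of-the-envelope fits in a few lines of R. Everything here restates the assumptions in the text (volumes, tiered commissions netting to roughly zero cost, ~$10,000 positions, a 5x correlation buffer); none of it is new data:

```r
symbols <- 100; trades_per_day <- 5; days <- 20; shares <- 400

monthly_volume <- symbols * trades_per_day * days * shares  # 4,000,000 shares/month
# At this tier IB's commission (~$0.0015/share) is below the rebate, so assume
# transaction costs net out to roughly zero.

dollar_alpha <- 5208 * 0.00022                                      # per 200-share trade
monthly_pnl  <- symbols * trades_per_day * days * 2 * dollar_alpha  # 2x for 400 shares
annual_pnl   <- 12 * monthly_pnl

# Capital: 15-min holds, 390-min day, ~20 concurrent positions, 5x safety buffer
avg_positions <- 15 / 390 * symbols * trades_per_day * 5  # ~96, call it 100
capital       <- 100 * 10000                              # ~$1,000,000 gross
annual_pnl / capital                                      # ~27.5% unlevered
```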

Bottom line: a roughly 27% annual return is not a huge win, but the strategy can probably be improved by filtering. It may be possible to trade more symbols at larger size and get a better commission structure, down to $0.001/share. Pushing leverage from 5x to 20x, perhaps by sitting at a prop shop, boosts returns a lot too.

# Extra: Measuring Impact

You can measure the average price impact of a trade of a certain size using public data. Basically you bin all trades by their size and then check the average price change on the next quote.

In practice there are more steps:

- Load the trade data
- Aggregate trades with the same timestamp
- Load the quote data
- Clean the quote data
- Classify trades as buys or sells according to the Lee-Ready Rule
- Measure the impact per trade
- Bin trades by size and average following returns

Then you get nice log-log plots like:

This says that if you execute an order 0.05 times the average order size, you will on average move the price by about 0.00001 (as a log-price shift); an order 20 times the average size moves it by about 0.00025. This is in large-cap equities, where the average spread is around 0.01.

Notice that since this is measured as the impact on the very next quote update, it might miss reactions by market makers to pull away after a few microseconds.

This comes from “Single Curve Collapse of the Price Impact Function for the New York Stock Exchange” by Lillo, Farmer, and Mantegna.

Here is R code to replicate it using WRDS's TAQ data (see the example of trade and quote data above, in the simulation section). Although this is very specific, I include it because it's a good example of a side project that you can show an employer to impress them:

```
# Single Curve Collapse of the Price Impact Function for the New York Stock Exchange
# Fabrizio Lillo, J. Doyne Farmer and Rosario N. Mantegna
# 2002
require(xts)
require(Hmisc)
path = 'your/taq/data/'
csv_mypath = function(file) {
read.csv(paste(path,file,sep=''), stringsAsFactors=FALSE)
}
# Load GE Trade Data
ge.t = csv_mypath('ge_trade_1995.csv')
ge.t = ge.t[-1]
ge.t$TIME = mapply(paste, as(ge.t$DATE,'character'), ge.t$TIME)
ge.t$TIME = as.POSIXct(ge.t$TIME,tz='EST',format='%Y%m%d %H:%M:%S')
require(xts)
ge.t = xts(ge.t[c(3,4)],order.by=ge.t$TIME)
colnames(ge.t) = c('price','size')
# Aggregate GE trades with the same timestamp
agg = coredata(ge.t)
dates = as.integer(index(ge.t))
for (i in 2:dim(agg)[1]) {
if (dates[i]==dates[i-1]) {
if (agg[i,1] == agg[i-1,1]) {
agg[i,2] = agg[i-1,2] + agg[i,2]
}
else
{
agg[i,1] = (agg[i-1,1]*agg[i-1,2]+agg[i,1]*agg[i,2])/(agg[i-1,2]+agg[i,2])
agg[i,2] = agg[i-1,2] + agg[i,2]
}
agg[i-1,2] = -1
}
}
ge.t = xts(agg[,c(1,2)],order.by=index(ge.t))
ge.t = ge.t[ge.t$size!=-1]
# Load GE Quote Data
ge.q = rbind(csv_mypath('ge_quote_1995_JanJun.csv'),
csv_mypath('ge_quote_1995_JulDec.csv'))
ge.q = ge.q[-1]
ge.q$TIME = mapply(paste, as(ge.q$DATE,'character'), ge.q$TIME)
ge.q$TIME = as.POSIXct(ge.q$TIME,tz='EST',format='%Y%m%d %H:%M:%S')
ge.q = xts(ge.q[c(3,4)],order.by=ge.q$TIME)
colnames(ge.q) = c('bid','ofr')
# Clean GE Quote Data
ge.q = ge.q[!(ge.q$bid==0 | ge.q$ofr==0),] # Bid/Offer of 0 is a data error
ge = merge(ge.t, ge.q)
# Final Result: variable 'ge' is an xts with NA's showing trades / quotes
#ge.t = NULL
#ge.q = NULL
# Classify the direction of trades according to Lee-Ready (1991)
ge$mid = (ge$bid+ge$ofr)/2
ge$dir = matrix(NA,dim(ge)[1],1)
d = coredata(ge) # data.frame(coredata(ge)) is insanely slower
p1 = d[1,1]
d[1,6] = 1 # Assume the first trade was up
dir1 = d[1,6]
q1 = d[2,5]
for (i in 3:dim(d)[1])
{
p2 = d[i,1] # current price
if (!is.na(p2)) # Trade
{
# Quote Rule
if (p2 > q1)
{
d[i,6] = 1 # Direction
}
else if (p2 < q1)
{
d[i,6] = -1
}
else # p == midpoint
{
# Tick Rule
if (p2 > p1)
{
d[i,6] = 1
}
else if (p2 < p1) # 'else' needed so an uptick isn't overwritten by the branch below
{
d[i,6] = -1
}
else # prices stayed the same
{
d[i,6] = dir1
}
}
p1 = p2
dir1 = d[i,6]
}
else # Quote
{
q1 = d[i,5] # Update most recent midpoint
}
}
# Measure impact per trade
d2 = cbind(d[,6],d[,2],log(d[,5]),matrix(NA,dim(d)[1],1))
colnames(d2) = c('dir','size','logmid','impact')
trade_i1 = 1
quote_i1 = 2
for (i2 in 3:dim(d2)[1])
{
dir_i2 = d2[i2,1]
if (!is.na(dir_i2)) # Trade
{
if (i2-trade_i1 == 1) # Following another a trade
{
d2[trade_i1,4] = 0 # \delta p = 0
}
trade_i1 = i2
}
else # Quote
{
if (i2-trade_i1 == 1) # Following a trade
{
d2[trade_i1,4] = d2[i2,3]-d2[quote_i1,3] # diff(logmids)
}
quote_i1 = i2
}
}
# Look only at buyer initiated trades
buy = d2[!is.na(d2[,4]) & d2[,1]==1,]
buy = buy[,c(2,4)]
buy[,1] = buy[,1]/mean(buy[,1])
require(Hmisc)
max_bins = 50
sizes = as.double(levels(cut2(buy[,1],g=max_bins,levels.mean=TRUE)))
buy[,1] = cut2(buy[,1],g=max_bins,levels.mean=TRUE)
ge_imp = aggregate(buy[,2], list(buy[,1]), mean)
plot(sizes,ge_imp[,2],log='xy',type='o',pch=15,ylab='price shift',
xlab='normalized volume',main='GE')
# Find how the price impact increases with volume
imp_fit = function(pow){summary(lm(ge_imp[,2] ~ I(sizes^pow)))$r.squared}
fits = sapply(seq(0,1,.01), imp_fit)
print(paste('R^2 maximizing beta in volume^beta=impact:', seq(0,1,.01)[which.max(fits)]))
print('Difference is likely because of Lee-Ready flat trade/quote ambiguity')
print('and timestamp aggregation ambiguity.')
print('A huge number of trades were labeled as having 0 impact.')
```