Integrating Custom Data
By combining your own data with the feed data (order book and trades), you can enrich your strategy while still using the full capabilities of hftbacktest.
Accessing Spot Price
In this example, we combine the spot BTCUSDT mid price with the USDM-Futures BTCUSDT feed data. This lets you estimate a fair value price that takes the underlying price into account.
The spot data is used only on the local side, so it should carry a local timestamp. In your backtesting logic, you then need to find the most recent spot data point whose local timestamp is at or before the current timestamp.
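The lookup described above can be sketched with NumPy alone. This is a minimal illustration, assuming the timestamps are sorted in ascending order; the helper name `latest_at_or_before` and the sample values are hypothetical and not part of hftbacktest.

```python
import numpy as np


def latest_at_or_before(timestamps, values, current_timestamp):
    """Return the latest value whose timestamp is <= current_timestamp.

    Hypothetical helper for illustration; assumes `timestamps` is sorted
    in ascending order. Returns NaN if no data is available yet.
    """
    # Number of rows with timestamp <= current_timestamp.
    idx = np.searchsorted(timestamps, current_timestamp, side='right')
    return values[idx - 1] if idx > 0 else np.nan


# Made-up local timestamps (ns) and spot mid prices.
ts = np.array([100, 200, 300], dtype=np.int64)
mid = np.array([61000.0, 61010.0, 61005.0])

print(latest_at_or_before(ts, mid, 250))  # uses the row stamped 200
print(latest_at_or_before(ts, mid, 50))   # no data yet, so NaN
```

Inside a backtest loop, where the current timestamp only moves forward, advancing a row index incrementally gives the same result without repeating the search.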
The raw spot feed is processed into an array with two columns: the local timestamp and the spot mid price.
[1]:
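import numpy as np
import gzip
import json

# Preallocates (local_timestamp, mid) rows; unused rows are trimmed at the end.
spot = np.full((100_000, 2), np.nan, np.float64)
i = 0

with gzip.open('spot/btcusdt_20240809.gz', 'r') as f:
    while True:
        line = f.readline()
        # readline() returns an empty bytes object at the end of the file.
        if not line:
            break

        line = line.decode().strip()
        # The first 19 characters are the local timestamp in nanoseconds,
        # followed by a one-character separator and the raw JSON message.
        local_timestamp = int(line[:19])
        obj = json.loads(line[20:])
        if obj['stream'] == 'btcusdt@bookTicker':
            data = obj['data']
            # 'b' is the best bid price and 'a' is the best ask price.
            mid = (float(data['b']) + float(data['a'])) / 2.0
            spot[i] = [local_timestamp, mid]
            i += 1

spot = spot[:i]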
The following code prints the futures mid price, the spot mid price, and the basis, using the latest point-in-time spot data at or before the current timestamp.
[2]:
from numba import njit
from hftbacktest import BacktestAsset, HashMapMarketDepthBacktest

out_dtype = np.dtype([('timestamp', 'i8'), ('mid_price', 'f8'), ('spot_mid_price', 'f8')])

@njit
def print_basis(hbt, spot):
    out = np.empty(1_000_000, out_dtype)

    t = 0
    spot_row = 0

    # Checks every 1 second (the interval is given in nanoseconds).
    while hbt.elapse(1_000_000_000) == 0:
        # Advances past every spot row stamped at or before the current
        # timestamp, so the previous row holds the latest spot mid value.
        while spot_row < len(spot) and spot[spot_row, 0] <= hbt.current_timestamp:
            spot_row += 1
        spot_mid_price = spot[spot_row - 1, 1] if spot_row > 0 else np.nan

        depth = hbt.depth(0)

        mid_price = (depth.best_bid + depth.best_ask) / 2.0
        basis = mid_price - spot_mid_price

        # Prints every 10th sample, i.e. every 10 seconds.
        if t % 10 == 0:
            print(
                'current_timestamp:',
                hbt.current_timestamp,
                'futures_mid:',
                round(mid_price, 2),
                ', spot_mid:',
                round(spot_mid_price, 2),
                ', basis:',
                round(basis, 2)
            )

        out[t].timestamp = hbt.current_timestamp
        out[t].mid_price = mid_price
        out[t].spot_mid_price = spot_mid_price

        t += 1
    return out[:t]
asset = (
    BacktestAsset()
        .data(['usdm/btcusdt_20240809.npz'])
        .initial_snapshot('usdm/btcusdt_20240808_eod.npz')
        .linear_asset(1.0)
        .constant_latency(10_000_000, 10_000_000)
        .risk_adverse_queue_model()
        .no_partial_fill_exchange()
        .trading_value_fee_model(0.0002, 0.0007)
        .tick_size(0.1)
        .lot_size(0.001)
)

hbt = HashMapMarketDepthBacktest([asset])

out = print_basis(hbt, spot)

_ = hbt.close()
current_timestamp: 1723161602500000000 futures_mid: 61659.85 , spot_mid: 61688.0 , basis: -28.14
current_timestamp: 1723161612500000000 futures_mid: 61713.95 , spot_mid: 61727.8 , basis: -13.85
current_timestamp: 1723161622500000000 futures_mid: 61713.45 , spot_mid: 61728.94 , basis: -15.5
current_timestamp: 1723161632500000000 futures_mid: 61666.05 , spot_mid: 61690.08 , basis: -24.02
current_timestamp: 1723161642500000000 futures_mid: 61638.45 , spot_mid: 61661.5 , basis: -23.06
current_timestamp: 1723161652500000000 futures_mid: 61632.05 , spot_mid: 61663.98 , basis: -31.93
current_timestamp: 1723161662500000000 futures_mid: 61578.15 , spot_mid: 61600.0 , basis: -21.85
current_timestamp: 1723161672500000000 futures_mid: 61524.25 , spot_mid: 61562.0 , basis: -37.74
current_timestamp: 1723161682500000000 futures_mid: 61552.45 , spot_mid: 61570.0 , basis: -17.54
current_timestamp: 1723161692500000000 futures_mid: 61593.05 , spot_mid: 61606.0 , basis: -12.96
current_timestamp: 1723161702500000000 futures_mid: 61587.45 , spot_mid: 61608.0 , basis: -20.54
current_timestamp: 1723161712500000000 futures_mid: 61561.15 , spot_mid: 61589.88 , basis: -28.73
current_timestamp: 1723161722500000000 futures_mid: 61589.95 , spot_mid: 61614.08 , basis: -24.14
current_timestamp: 1723161732500000000 futures_mid: 61608.95 , spot_mid: 61632.13 , basis: -23.18
current_timestamp: 1723161742500000000 futures_mid: 61653.45 , spot_mid: 61681.74 , basis: -28.29
current_timestamp: 1723161752500000000 futures_mid: 61673.45 , spot_mid: 61700.0 , basis: -26.54
current_timestamp: 1723161762500000000 futures_mid: 61663.95 , spot_mid: 61683.84 , basis: -19.89
current_timestamp: 1723161772500000000 futures_mid: 61640.85 , spot_mid: 61664.0 , basis: -23.15
current_timestamp: 1723161782500000000 futures_mid: 61634.15 , spot_mid: 61654.0 , basis: -19.85
current_timestamp: 1723161792500000000 futures_mid: 61618.05 , spot_mid: 61666.0 , basis: -47.94
current_timestamp: 1723161802500000000 futures_mid: 61626.65 , spot_mid: 61648.34 , basis: -21.69
current_timestamp: 1723161812500000000 futures_mid: 61586.25 , spot_mid: 61612.0 , basis: -25.74
current_timestamp: 1723161822500000000 futures_mid: 61624.65 , spot_mid: 61649.98 , basis: -25.33
current_timestamp: 1723161832500000000 futures_mid: 61611.55 , spot_mid: 61644.0 , basis: -32.46
current_timestamp: 1723161842500000000 futures_mid: 61633.95 , spot_mid: 61658.4 , basis: -24.46
current_timestamp: 1723161852500000000 futures_mid: 61635.95 , spot_mid: 61656.02 , basis: -20.07
current_timestamp: 1723161862500000000 futures_mid: 61671.45 , spot_mid: 61689.92 , basis: -18.47
current_timestamp: 1723161872500000000 futures_mid: 61651.55 , spot_mid: 61664.0 , basis: -12.46
current_timestamp: 1723161882500000000 futures_mid: 61614.15 , spot_mid: 61640.0 , basis: -25.84
current_timestamp: 1723161892500000000 futures_mid: 61605.95 , spot_mid: 61622.12 , basis: -16.18
current_timestamp: 1723161902500000000 futures_mid: 61583.95 , spot_mid: 61607.98 , basis: -24.04
[3]:
import polars as pl
import holoviews as hv

df = pl.DataFrame(out).with_columns(
    pl.from_epoch('timestamp', time_unit='ns').alias('timestamp')
)

hv.extension('bokeh')
df.plot(x='timestamp')
Although this is a short-period sample, you can observe that the basis is mean-reverting. There may be statistical arbitrage opportunities, particularly if you are eligible for rebates or zero fees.
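One hedged way to quantify this mean reversion is a rolling z-score of the basis. The sketch below is an illustration only and not part of the original example: the function name `rolling_zscore`, the window length, and the sample values are all assumptions.

```python
import numpy as np


def rolling_zscore(x, window):
    """Z-score of each point against the trailing `window` points.

    Hypothetical helper; the first `window - 1` entries are NaN because
    the trailing window is not yet full.
    """
    x = np.asarray(x, dtype=np.float64)
    z = np.full(len(x), np.nan)
    for k in range(window - 1, len(x)):
        w = x[k - window + 1:k + 1]
        std = w.std()
        # A flat window has no dispersion, so the z-score stays undefined.
        if std > 0:
            z[k] = (x[k] - w.mean()) / std
    return z


# Made-up basis values (futures mid minus spot mid).
basis = np.array([-28.0, -14.0, -16.0, -24.0, -23.0, -32.0, -22.0, -38.0])
z = rolling_zscore(basis, window=4)
print(np.round(z, 2))
```

A large absolute z-score suggests the basis is far from its recent average; whether that is tradable depends on fees, latency, and queue position, which only the backtest can answer.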
[4]:
((df['mid_price'] - df['spot_mid_price']) / df['mid_price'] * 10000).alias('basis bp').plot(x='timestamp')
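As a worked example of the expression above, the basis in basis points is the price difference divided by the futures mid price, scaled by 10,000. The sketch below plugs in the rounded values from the first printed sample; the function name `basis_bp` is hypothetical.

```python
def basis_bp(futures_mid, spot_mid):
    """Basis relative to the futures mid price, in basis points."""
    return (futures_mid - spot_mid) / futures_mid * 10_000


# Rounded futures and spot mid prices from the first sample printed above.
bp = basis_bp(61659.85, 61688.0)
print(round(bp, 2))
```

Expressing the basis in basis points makes it directly comparable with fee rates, which are quoted as fractions of the traded value.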