Integrating Custom Data

By combining your custom data with the feed data (order book and trades), you can enhance your strategy while harnessing the full potential of hftbacktest.

Accessing Spot Price

In this example, we’ll combine the spot BTCUSDT mid-price with the USDM-Futures BTCUSDT feed data. This will enable you to estimate the fair value price, taking the underlying price into consideration.

The spot data is used only on the local side, so each row must carry a local timestamp. In your backtesting logic, you then look up the most recent row whose local timestamp is at or before the current timestamp.
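
The point-in-time lookup described above can also be written as a binary search. The following is a minimal sketch, assuming the rows are sorted by local timestamp; the helper name latest_value_at and the sample rows are hypothetical and only serve as an illustration.

```python
import numpy as np

def latest_value_at(data, current_timestamp):
    """Return the value of the last row whose timestamp is <= current_timestamp.

    `data` is an (N, 2) array of [local_timestamp, value], sorted by timestamp.
    Returns NaN when no row has been received yet.
    """
    # side='right' yields the number of rows with timestamp <= current_timestamp.
    idx = np.searchsorted(data[:, 0], current_timestamp, side='right')
    return data[idx - 1, 1] if idx > 0 else np.nan

# Synthetic rows for illustration only: [local_timestamp, mid_price].
example = np.array([
    [1_000, 28164.0],
    [2_000, 28165.5],
    [3_000, 28163.0],
])

print(latest_value_at(example, 500))    # nan: nothing received yet
print(latest_value_at(example, 2_000))  # 28165.5
print(latest_value_at(example, 2_999))  # 28165.5
```

A forward-moving row pointer is usually cheaper inside a backtest loop, because the current timestamp only increases; the binary search is convenient for one-off lookups.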

The raw spot feed is processed to create spot data, which includes both a local timestamp and the spot mid price.

[1]:
import numpy as np
import gzip
import json

spot = np.full((100_000, 2), np.nan, np.float64)
i = 0

with gzip.open('spot/btcusdt_20230405.dat.gz', 'r') as f:
    # Iterates over the raw lines until the end of the file.
    for line in f:
        line = line.decode().strip()
        local_timestamp = int(line[:16])

        obj = json.loads(line[17:])
        if obj['stream'] == 'btcusdt@bookTicker':
            data = obj['data']
            mid = (float(data['b']) + float(data['a'])) / 2.0
            # Stores the local timestamp and the spot mid-price.
            spot[i] = [local_timestamp, mid]
            i += 1

spot = spot[:i]
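
To make the line format used by the parsing code above explicit, here is a small sketch that parses a single raw line. It assumes, as the slicing in the cell implies, that each line is a 16-digit local timestamp, one space, and a JSON message. The function name parse_book_ticker and the sample line are made up for illustration.

```python
import json

def parse_book_ticker(line, stream='btcusdt@bookTicker'):
    """Parse one raw line into (local_timestamp, mid_price).

    Returns None when the message belongs to a different stream.
    """
    local_timestamp = int(line[:16])   # 16-digit local timestamp
    obj = json.loads(line[17:])        # JSON payload after the separator
    if obj['stream'] != stream:
        return None
    data = obj['data']
    mid = (float(data['b']) + float(data['a'])) / 2.0
    return local_timestamp, mid

# Synthetic sample line, not taken from the real data file.
sample = (
    '1680652800000123 '
    '{"stream":"btcusdt@bookTicker","data":{"b":"28164.40","a":"28164.50"}}'
)
other = '1680652800000456 {"stream":"btcusdt@trade","data":{}}'

print(parse_book_ticker(sample))
print(parse_book_ticker(other))  # None
```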

The following function prints the basis and the spot mid-price every 60 seconds, using the latest point-in-time spot row at or before the current timestamp.

[2]:
from numba import njit
from hftbacktest import HftBacktest, FeedLatency, Linear

@njit
def print_basis(hbt, spot):
    spot_row = 0

    # Checks every 60-sec (in microseconds)
    while hbt.elapse(60_000_000):
        # Finds the latest spot mid value.
        while spot_row < len(spot) and spot[spot_row, 0] <= hbt.current_timestamp:
            spot_row += 1
        spot_mid_price = spot[spot_row - 1, 1] if spot_row > 0 else np.nan

        mid_price = (hbt.best_bid + hbt.best_ask) / 2.0
        basis = mid_price - spot_mid_price

        print(
            'current_timestamp:',
            hbt.current_timestamp,
            'futures_mid:',
            round(mid_price, 2),
            ', spot_mid:',
            round(spot_mid_price, 2),
            ', basis:',
            round(basis, 2)
        )

hbt = HftBacktest(
    [
        'btcusdt_20230405_m.npz'
    ],
    tick_size=0.1,
    lot_size=0.001,
    maker_fee=0.0002,
    taker_fee=0.0007,
    order_latency=FeedLatency(),
    asset_type=Linear,
    snapshot='btcusdt_20230404_eod.npz'
)

print_basis(hbt, spot)
Load btcusdt_20230405_m.npz
current_timestamp: 1680652860032116 futures_mid: 28150.75 , spot_mid: 28164.42 , basis: -13.67
current_timestamp: 1680652920032116 futures_mid: 28144.15 , spot_mid: 28155.82 , basis: -11.67
current_timestamp: 1680652980032116 futures_mid: 28149.95 , spot_mid: 28163.48 , basis: -13.53
current_timestamp: 1680653040032116 futures_mid: 28145.75 , spot_mid: 28158.88 , basis: -13.12
current_timestamp: 1680653100032116 futures_mid: 28140.55 , spot_mid: 28156.06 , basis: -15.51
current_timestamp: 1680653160032116 futures_mid: 28143.85 , spot_mid: 28155.82 , basis: -11.97
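
As a rough illustration of how the basis could feed a fair value estimate, the sketch below averages the basis over the first three rows of the output above and adds it to the latest spot mid-price. The averaging rule and the name fair_value are assumptions made for this example, not something hftbacktest provides.

```python
import numpy as np

# Values copied from the first three lines of the output above.
futures_mid = np.array([28150.75, 28144.15, 28149.95])
spot_mid = np.array([28164.42, 28155.82, 28163.48])

# Basis: futures mid-price minus spot mid-price.
basis = futures_mid - spot_mid
avg_basis = basis.mean()

# Hypothetical fair value estimate: latest spot mid plus the average basis.
fair_value = spot_mid[-1] + avg_basis

print('basis:', np.round(basis, 2))
print('avg_basis:', round(avg_basis, 2))
print('fair_value:', round(fair_value, 2))
```

In practice you would likely replace the simple mean with a rolling or exponentially weighted estimate updated inside the backtest loop.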

Combining Spot Price

Merging custom data into the feed data is more involved than simply accessing it as in the first example, but it may be necessary if you intend to develop your own custom exchange model. Making the custom data visible on the exchange side can also enable a more comprehensive backtest, for example when accounting for funding.

[3]:
tmp = np.full((100_000, 6), np.nan, np.float64)
i = 0

with gzip.open('spot/btcusdt_20230405.dat.gz', 'r') as f:
    # Iterates over the raw lines until the end of the file.
    for line in f:
        line = line.decode().strip()
        local_timestamp = int(line[:16])

        obj = json.loads(line[17:])
        if obj['stream'] == 'btcusdt@bookTicker':
            data = obj['data']
            mid = (float(data['b']) + float(data['a'])) / 2.0
            # Sets the event ID to 110 and assigns an invalid exchange timestamp (-1),
            # as it's not utilized in the exchange simulation.
            # The mid-price is stored in the price column.
            tmp[i] = [110, -1, local_timestamp, 0, mid, 0]
            i += 1

tmp = tmp[:i]
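
The six-element row built above is easier to read with named column positions. The sketch below assumes the layout [event, exchange timestamp, local timestamp, side, price, quantity]; the event, timestamp and price positions follow the comments in the cell, while the names for the fourth and sixth columns are assumptions. The constants are defined locally for illustration and are not imported from hftbacktest.

```python
import numpy as np

# Column positions, assumed from the row built in the cell above.
COL_EVENT, COL_EXCH_TS, COL_LOCAL_TS, COL_SIDE, COL_PRICE, COL_QTY = range(6)

SPOT_MID_EVENT = 110  # custom event ID used in this example

def make_user_row(event_id, local_timestamp, value):
    """Build one custom-data row carrying `value` in the price column."""
    row = np.zeros(6, np.float64)
    row[COL_EVENT] = event_id
    row[COL_EXCH_TS] = -1               # invalid: unused by the exchange simulation
    row[COL_LOCAL_TS] = local_timestamp
    row[COL_PRICE] = value
    return row

row = make_user_row(SPOT_MID_EVENT, 1680652800000123, 28164.45)
print(row[COL_EVENT], row[COL_LOCAL_TS], row[COL_PRICE])
```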

You can merge the two data sets using merge_on_local_timestamp and then proceed to validate the data.

[4]:
from hftbacktest import merge_on_local_timestamp, validate_data

usdm_feed_data = np.load('btcusdt_20230405_m.npz')['data']

merged = merge_on_local_timestamp(usdm_feed_data, tmp)

validate_data(merged)
[4]:
0
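
Conceptually, merging on the local timestamp means interleaving the rows of both arrays so that local timestamps are non-decreasing. The NumPy sketch below illustrates that idea only. It is not the implementation of merge_on_local_timestamp, which may treat ties and exchange timestamps differently, and the event ID 1 in the feed rows is a placeholder.

```python
import numpy as np

COL_EVENT = 0
COL_LOCAL_TS = 2  # assumed position of the local timestamp column

def merge_by_local_timestamp(a, b):
    """Illustrative only: stack two event arrays and stable-sort by local timestamp."""
    stacked = np.concatenate([a, b])
    # A stable sort keeps rows of `a` ahead of rows of `b` on equal timestamps.
    order = np.argsort(stacked[:, COL_LOCAL_TS], kind='stable')
    return stacked[order]

# Synthetic rows: [event, exch_ts, local_ts, side, price, qty].
feed = np.array([
    [1, 9, 10, 1, 28150.0, 0.5],
    [1, 29, 30, -1, 28151.0, 0.2],
])
custom = np.array([
    [110, -1, 20, 0, 28164.0, 0.0],
])

merged_example = merge_by_local_timestamp(feed, custom)
print(merged_example[:, COL_LOCAL_TS])  # [10. 20. 30.]
```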

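You can obtain the spot mid-price by calling the get_user_data function with event ID 110. The example below also reads the funding rate, which is stored with event ID 102 in feed data converted by the built-in utility.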

[5]:
from hftbacktest import reset, COL_PRICE

@njit
def print_basis(hbt):
    # Checks every 60-sec (in microseconds)
    while hbt.elapse(60_000_000):
        funding_rate = hbt.get_user_data(102)
        spot_mid_price = hbt.get_user_data(110)
        mid_price = (hbt.best_bid + hbt.best_ask) / 2.0
        basis = mid_price - spot_mid_price[COL_PRICE]

        print(
            'current_timestamp:',
            hbt.current_timestamp,
            'futures_mid:',
            round(mid_price, 2),
            'funding_rate:',
            funding_rate[COL_PRICE],
            ', spot_mid:',
            round(spot_mid_price[COL_PRICE], 2),
            ', basis:',
            round(basis, 2)
        )

reset(
    hbt,
    [
        merged
    ],
    snapshot='btcusdt_20230404_eod.npz'
)

print_basis(hbt)
current_timestamp: 1680652860004231 futures_mid: 28150.75 funding_rate: 2.76e-05 , spot_mid: 28164.42 , basis: -13.67
current_timestamp: 1680652920004231 futures_mid: 28144.15 funding_rate: 2.813e-05 , spot_mid: 28155.82 , basis: -11.67
current_timestamp: 1680652980004231 futures_mid: 28149.95 funding_rate: 2.826e-05 , spot_mid: 28163.48 , basis: -13.53
current_timestamp: 1680653040004231 futures_mid: 28145.75 funding_rate: 2.826e-05 , spot_mid: 28158.88 , basis: -13.12
current_timestamp: 1680653100004231 futures_mid: 28140.55 funding_rate: 2.841e-05 , spot_mid: 28156.06 , basis: -15.51
current_timestamp: 1680653160004231 futures_mid: 28143.85 funding_rate: 2.85e-05 , spot_mid: 28155.82 , basis: -11.97

Combining Funding Rate by Using Built-in Data Utility

If your data was converted from the raw feed by the built-in utility, you can easily incorporate the markPrice stream data as well. See the hftbacktest documentation for more details.

[6]:
from hftbacktest.data.utils import binancefutures

data = binancefutures.convert('usdm/btcusdt_20230405.dat.gz', opt='m')
np.savez('btcusdt_20230405_m', data=data)
local_timestamp is ahead of exch_timestamp by 26932.0
found 6555 rows that exch_timestamp is ahead of the previous exch_timestamp
Correction is done.

You can obtain the funding rate by calling the get_user_data function with event ID 102.

[7]:
@njit
def print_funding_rate(hbt):
    # Checks every 60-sec (in microseconds)
    while hbt.elapse(60_000_000):
        # funding_rate data is stored with event id 102.
        funding_rate = hbt.get_user_data(102)
        mid_price = (hbt.best_bid + hbt.best_ask) / 2.0

        print(
            'current_timestamp:',
            hbt.current_timestamp,
            'futures_mid:',
            round(mid_price, 2),
            'funding_rate:',
            funding_rate[COL_PRICE]
        )

reset(
    hbt,
    [
        'btcusdt_20230405_m.npz'
    ],
    snapshot='btcusdt_20230404_eod.npz'
)

print_funding_rate(hbt)
Load btcusdt_20230405_m.npz
current_timestamp: 1680652860032116 futures_mid: 28150.75 funding_rate: 2.76e-05
current_timestamp: 1680652920032116 futures_mid: 28144.15 funding_rate: 2.813e-05
current_timestamp: 1680652980032116 futures_mid: 28149.95 funding_rate: 2.826e-05
current_timestamp: 1680653040032116 futures_mid: 28145.75 funding_rate: 2.826e-05
current_timestamp: 1680653100032116 futures_mid: 28140.55 funding_rate: 2.841e-05
current_timestamp: 1680653160032116 futures_mid: 28143.85 funding_rate: 2.85e-05
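
To show how a funding rate might enter a custom model, the sketch below estimates a funding payment for a linear perpetual position. It assumes the common convention that the payment equals position size times mark price times funding rate, with longs paying when the rate is positive. The function name funding_payment is hypothetical, and the futures mid-price from the first output line stands in for the mark price.

```python
def funding_payment(position_qty, mark_price, funding_rate):
    """Estimated funding paid (positive) or received (negative) by the position.

    Assumes a linear contract: notional value = position_qty * mark_price.
    A long position (positive qty) pays when funding_rate is positive.
    """
    return position_qty * mark_price * funding_rate

# Values taken from the first output line above; the mid-price is a stand-in
# for the mark price, which is an assumption of this sketch.
mark_price = 28150.75
funding_rate = 2.76e-05

long_payment = funding_payment(1.0, mark_price, funding_rate)
short_payment = funding_payment(-1.0, mark_price, funding_rate)

print('long 1 BTC pays:', round(long_payment, 4))
print('short 1 BTC pays:', round(short_payment, 4))
```

Funding is only exchanged at the exchange's funding times, so a model built on this would apply the payment at those timestamps rather than on every step.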