Data

To collect and convert the feed data, see https://github.com/nkaz001/collect-binancefutures or Data Preparation.

Format

hftbacktest can read a NumPy file (npz or npy) or a pickled pandas.DataFrame. The data has six columns, in the following order.

  • event: the type of event

DEPTH_EVENT = 1
TRADE_EVENT = 2
DEPTH_CLEAR_EVENT = 3
DEPTH_SNAPSHOT_EVENT = 4

Event codes above 100 are used for user-defined events.

  • exch_timestamp: exchange timestamp

  • local_timestamp: local timestamp at which your system receives the event

  • side: 1 for Buy (Bid), -1 for Sell (Ask)

  • price: price

  • qty: quantity
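As a rough illustration of the column layout above, the sketch below builds a two-row array and round-trips it through the npz format. The float64 dtype and the `data` key inside the npz archive are assumptions made for this example, not something this page specifies; check Data Preparation for the layout hftbacktest actually expects.

```python
import io

import numpy as np

# Event codes listed above
DEPTH_EVENT = 1
TRADE_EVENT = 2
DEPTH_CLEAR_EVENT = 3
DEPTH_SNAPSHOT_EVENT = 4

# Column order described above
COLUMNS = ["event", "exch_timestamp", "local_timestamp", "side", "price", "qty"]

# Two rows taken from the example on this page
rows = [
    [DEPTH_EVENT, 1676419205108000, 1676419207212527, 1, 22172.30, 0.551],
    [TRADE_EVENT, 1676419205116000, 1676419207212584, -1, 22177.90, 0.001],
]

# Assumption: a single float64 array. Microsecond timestamps of this size
# are far below 2**53, so float64 stores them exactly.
data = np.array(rows, dtype=np.float64)

# Write to an in-memory buffer; pass a file name instead to write to disk.
# The key name "data" is an assumption for this sketch.
buf = io.BytesIO()
np.savez(buf, data=data)
buf.seek(0)
loaded = np.load(buf)["data"]

print(dict(zip(COLUMNS, loaded[1])))
```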

Example

Raw data

1676419207212527 {'stream': 'btcusdt@depth@0ms', 'data': {'e': 'depthUpdate', 'E': 1676419206974, 'T': 1676419205108, 's': 'BTCUSDT', 'U': 2505118837831, 'u': 2505118838224, 'pu': 2505118837821, 'b': [['2218.80', '0.603'], ['5000.00', '2.641'], ['22160.60', '0.008'], ['22172.30', '0.551'], ['22173.40', '0.073'], ['22174.50', '0.006'], ['22176.80', '0.157'], ['22177.90', '0.425'], ['22181.20', '0.260'], ['22182.30', '3.918'], ['22182.90', '0.000'], ['22183.40', '0.014'], ['22203.00', '0.000']], 'a': [['22171.70', '0.000'], ['22187.30', '0.000'], ['22194.30', '0.270'], ['22194.70', '0.423'], ['22195.20', '2.075'], ['22209.60', '4.506']]}}
1676419207212584 {'stream': 'btcusdt@trade', 'data': {'e': 'trade', 'E': 1676419206976, 'T': 1676419205116, 's': 'BTCUSDT', 't': 3288803053, 'p': '22177.90', 'q': '0.001', 'X': 'MARKET', 'm': True}}

Normalized data

event  exch_timestamp    local_timestamp   side  price     qty
1      1676419205108000  1676419207212527  1     2218.8    0.603
1      1676419205108000  1676419207212527  1     5000.00   2.641
1      1676419205108000  1676419207212527  1     22160.60  0.008
1      1676419205108000  1676419207212527  1     22172.30  0.551
1      1676419205108000  1676419207212527  1     22173.40  0.073
1      1676419205108000  1676419207212527  1     22174.50  0.006
1      1676419205108000  1676419207212527  1     22176.80  0.157
1      1676419205108000  1676419207212527  1     22177.90  0.425
1      1676419205108000  1676419207212527  1     22181.20  0.260
1      1676419205108000  1676419207212527  1     22182.30  3.918
1      1676419205108000  1676419207212527  1     22182.90  0.000
1      1676419205108000  1676419207212527  1     22183.40  0.014
1      1676419205108000  1676419207212527  1     22203.00  0.000
1      1676419205108000  1676419207212527  -1    22171.70  0.000
1      1676419205108000  1676419207212527  -1    22187.30  0.000
1      1676419205108000  1676419207212527  -1    22194.30  0.270
1      1676419205108000  1676419207212527  -1    22194.70  0.423
1      1676419205108000  1676419207212527  -1    22195.20  2.075
1      1676419205108000  1676419207212527  -1    22209.60  4.506
2      1676419205116000  1676419207212584  -1    22177.90  0.001
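The mapping from the raw lines to the normalized rows above can be sketched as follows. This is a minimal illustration, not the converter used by hftbacktest: the function name `normalize` is hypothetical, it handles only the two message types shown here, and it assumes that the exchange timestamp comes from the `T` field in milliseconds and is scaled to microseconds to match the local timestamp, as the example values suggest.

```python
DEPTH_EVENT = 1
TRADE_EVENT = 2
BUY, SELL = 1, -1


def normalize(local_timestamp, message):
    """Turn one raw message into (event, exch_ts, local_ts, side, price, qty) rows."""
    data = message["data"]
    # 'T' is in milliseconds; the local timestamp is in microseconds.
    exch_timestamp = data["T"] * 1000
    rows = []
    if data["e"] == "depthUpdate":
        # Bids become side 1 rows, asks become side -1 rows, one row per level.
        for side, levels in ((BUY, data["b"]), (SELL, data["a"])):
            for price, qty in levels:
                rows.append((DEPTH_EVENT, exch_timestamp, local_timestamp,
                             side, float(price), float(qty)))
    elif data["e"] == "trade":
        # 'm' is True when the buyer is the maker, so the taker sold.
        side = SELL if data["m"] else BUY
        rows.append((TRADE_EVENT, exch_timestamp, local_timestamp,
                     side, float(data["p"]), float(data["q"])))
    return rows


# The depth message is shortened to one bid and one ask level for brevity.
depth = {"stream": "btcusdt@depth@0ms",
         "data": {"e": "depthUpdate", "E": 1676419206974, "T": 1676419205108,
                  "s": "BTCUSDT", "b": [["22172.30", "0.551"]],
                  "a": [["22194.30", "0.270"]]}}
trade = {"stream": "btcusdt@trade",
         "data": {"e": "trade", "E": 1676419206976, "T": 1676419205116,
                  "s": "BTCUSDT", "t": 3288803053, "p": "22177.90",
                  "q": "0.001", "X": "MARKET", "m": True}}

depth_rows = normalize(1676419207212527, depth)
trade_rows = normalize(1676419207212584, trade)
for row in depth_rows + trade_rows:
    print(row)
```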

Validation

Before you start backtesting, check that the data is valid. Data received from crypto exchanges needs cleaning and validation.

  1. All timestamps must be in chronological order.

  2. You might find that local_timestamp is earlier than exch_timestamp because of clock synchronization error. Because local_timestamp - exch_timestamp is used as the latency, the value must be positive.

  3. Even when local_timestamp is in the correct order, exch_timestamp can be out of order.

In the following example, the exch_timestamp of the depth feed (T = 1676419205108) is earlier than that of the preceding trade feed (T = 1676419205111), even though the depth feed is received after the trade feed.

1676419207212385 {'stream': 'btcusdt@trade', 'data': {'e': 'trade', 'E': 1676419206968, 'T': 1676419205111, 's': 'BTCUSDT', 't': 3288803051, 'p': '22177.90', 'q': '0.300', 'X': 'MARKET', 'm': True}}
1676419207212480 {'stream': 'btcusdt@trade', 'data': {'e': 'trade', 'E': 1676419206968, 'T': 1676419205111, 's': 'BTCUSDT', 't': 3288803052, 'p': '22177.90', 'q': '0.119', 'X': 'MARKET', 'm': True}}
1676419207212527 {'stream': 'btcusdt@depth@0ms', 'data': {'e': 'depthUpdate', 'E': 1676419206974, 'T': 1676419205108, 's': 'BTCUSDT', 'U': 2505118837831, 'u': 2505118838224, 'pu': 2505118837821, 'b': [['2218.80', '0.603'], ['5000.00', '2.641'], ['22160.60', '0.008'], ['22172.30', '0.551'], ['22173.40', '0.073'], ['22174.50', '0.006'], ['22176.80', '0.157'], ['22177.90', '0.425'], ['22181.20', '0.260'], ['22182.30', '3.918'], ['22182.90', '0.000'], ['22183.40', '0.014'], ['22203.00', '0.000']], 'a': [['22171.70', '0.000'], ['22187.30', '0.000'], ['22194.30', '0.270'], ['22194.70', '0.423'], ['22195.20', '2.075'], ['22209.60', '4.506']]}}
1676419207212584 {'stream': 'btcusdt@trade', 'data': {'e': 'trade', 'E': 1676419206976, 'T': 1676419205116, 's': 'BTCUSDT', 't': 3288803053, 'p': '22177.90', 'q': '0.001', 'X': 'MARKET', 'm': True}}
1676419207212621 {'stream': 'btcusdt@trade', 'data': {'e': 'trade', 'E': 1676419206976, 'T': 1676419205116, 's': 'BTCUSDT', 't': 3288803054, 'p': '22177.90', 'q': '0.005', 'X': 'MARKET', 'm': True}}
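The three checks listed above can be sketched as a single pass over normalized rows. The function name `validate` and the tuple layout are illustrative assumptions; this only reports problems and does not repair them.

```python
def validate(rows):
    """Return (row_index, problem) pairs for rows of
    (event, exch_timestamp, local_timestamp, side, price, qty)."""
    problems = []
    prev_exch = prev_local = None
    for i, (_event, exch_ts, local_ts, _side, _price, _qty) in enumerate(rows):
        # Check 2: latency (local - exchange) must be positive.
        if local_ts - exch_ts <= 0:
            problems.append((i, "non-positive latency"))
        # Check 1: local timestamps must not go backwards.
        if prev_local is not None and local_ts < prev_local:
            problems.append((i, "local_timestamp out of order"))
        # Check 3: exchange timestamps must not go backwards either.
        if prev_exch is not None and exch_ts < prev_exch:
            problems.append((i, "exch_timestamp out of order"))
        prev_exch, prev_local = exch_ts, local_ts
    return problems


# One trade, one depth level and one later trade from the example above,
# in the order they were received.
rows = [
    (2, 1676419205111000, 1676419207212480, -1, 22177.90, 0.119),
    (1, 1676419205108000, 1676419207212527, 1, 22172.30, 0.551),
    (2, 1676419205116000, 1676419207212584, -1, 22177.90, 0.001),
]
problems = validate(rows)
print(problems)
```

Here the depth row is flagged because its exch_timestamp is earlier than that of the trade received just before it.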

This should be converted into the following form. hftbacktest provides a method named correct that fixes this kind of ordering problem automatically.

...
1676419207212385 {'stream': 'btcusdt@trade', 'data': {'e': 'trade', 'E': 1676419206968, 'T': -1, 's': 'BTCUSDT', 't': 3288803051, 'p': '22177.90', 'q': '0.300', 'X': 'MARKET', 'm': True}}
1676419207212480 {'stream': 'btcusdt@trade', 'data': {'e': 'trade', 'E': 1676419206968, 'T': -1, 's': 'BTCUSDT', 't': 3288803052, 'p': '22177.90', 'q': '0.119', 'X': 'MARKET', 'm': True}}
1676419207212527 {'stream': 'btcusdt@depth@0ms', 'data': {'e': 'depthUpdate', 'E': 1676419206974, 'T': 1676419205108, 's': 'BTCUSDT', 'U': 2505118837831, 'u': 2505118838224, 'pu': 2505118837821, 'b': [['2218.80', '0.603'], ['5000.00', '2.641'], ['22160.60', '0.008'], ['22172.30', '0.551'], ['22173.40', '0.073'], ['22174.50', '0.006'], ['22176.80', '0.157'], ['22177.90', '0.425'], ['22181.20', '0.260'], ['22182.30', '3.918'], ['22182.90', '0.000'], ['22183.40', '0.014'], ['22203.00', '0.000']], 'a': [['22171.70', '0.000'], ['22187.30', '0.000'], ['22194.30', '0.270'], ['22194.70', '0.423'], ['22195.20', '2.075'], ['22209.60', '4.506']]}}
...
-1 {'stream': 'btcusdt@trade', 'data': {'e': 'trade', 'E': 1676419206968, 'T': 1676419205111, 's': 'BTCUSDT', 't': 3288803051, 'p': '22177.90', 'q': '0.300', 'X': 'MARKET', 'm': True}}
-1 {'stream': 'btcusdt@trade', 'data': {'e': 'trade', 'E': 1676419206968, 'T': 1676419205111, 's': 'BTCUSDT', 't': 3288803052, 'p': '22177.90', 'q': '0.119', 'X': 'MARKET', 'm': True}}
1676419207212584 {'stream': 'btcusdt@trade', 'data': {'e': 'trade', 'E': 1676419206976, 'T': 1676419205116, 's': 'BTCUSDT', 't': 3288803053, 'p': '22177.90', 'q': '0.001', 'X': 'MARKET', 'm': True}}
1676419207212621 {'stream': 'btcusdt@trade', 'data': {'e': 'trade', 'E': 1676419206976, 'T': 1676419205116, 's': 'BTCUSDT', 't': 3288803054, 'p': '22177.90', 'q': '0.005', 'X': 'MARKET', 'm': True}}

Normalized data

event  exch_timestamp    local_timestamp   side  price     qty
2      -1                1676419207212385  -1    22177.90  0.300
2      -1                1676419207212480  -1    22177.90  0.119
1      1676419205108000  1676419207212527  1     2218.8    0.603
1      1676419205108000  1676419207212527  1     5000.00   2.641
1      1676419205108000  1676419207212527  1     22160.60  0.008
1      1676419205108000  1676419207212527  1     22172.30  0.551
1      1676419205108000  1676419207212527  1     22173.40  0.073
1      1676419205108000  1676419207212527  1     22174.50  0.006
1      1676419205108000  1676419207212527  1     22176.80  0.157
1      1676419205108000  1676419207212527  1     22177.90  0.425
1      1676419205108000  1676419207212527  1     22181.20  0.260
1      1676419205108000  1676419207212527  1     22182.30  3.918
1      1676419205108000  1676419207212527  1     22182.90  0.000
1      1676419205108000  1676419207212527  1     22183.40  0.014
1      1676419205108000  1676419207212527  1     22203.00  0.000
1      1676419205108000  1676419207212527  -1    22171.70  0.000
1      1676419205108000  1676419207212527  -1    22187.30  0.000
1      1676419205108000  1676419207212527  -1    22194.30  0.270
1      1676419205108000  1676419207212527  -1    22194.70  0.423
1      1676419205108000  1676419207212527  -1    22195.20  2.075
1      1676419205108000  1676419207212527  -1    22209.60  4.506
2      1676419205111000  -1                -1    22177.90  0.300
2      1676419205111000  -1                -1    22177.90  0.119
2      1676419206976000  1676419207212584  -1    22177.90  0.001
2      1676419206976000  1676419207212621  -1    22177.90  0.005

A value of -1 in exch_timestamp means that the event is not processed by exchange-side logic such as order fills. A value of -1 in local_timestamp means that the event is not seen by the local side.
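The splitting shown in the table above can be sketched as a merge of two orderings of the same rows: the order in which they were received locally and the order of their exchange timestamps. This is an illustrative sketch under that assumption, not the implementation inside hftbacktest, and the function name `split_out_of_order` is hypothetical. A row that sits at the same position in both orderings is kept whole; otherwise it is emitted once with exch_timestamp set to -1 (local view) and once with local_timestamp set to -1 (exchange view).

```python
def split_out_of_order(rows):
    """rows: (event, exch_ts, local_ts, side, price, qty) tuples sorted by local_ts."""
    n = len(rows)
    # Row indices in exchange-timestamp order; sorted() is stable, so rows
    # with equal exchange timestamps keep their local order.
    exch_order = sorted(range(n), key=lambda k: rows[k][1])
    out = []
    i = j = 0  # i walks the local order, j walks the exchange order
    while i < n or j < n:
        if i < n and j < n and exch_order[j] == i:
            # Same row is next in both orderings: keep it whole.
            out.append(rows[i])
            i += 1
            j += 1
        elif j < n and exch_order[j] < i:
            # Already emitted for the local side: emit the exchange-only copy.
            event, exch_ts, _, side, price, qty = rows[exch_order[j]]
            out.append((event, exch_ts, -1, side, price, qty))
            j += 1
        else:
            # Received locally before the exchange order reaches it:
            # emit the local-only copy.
            event, _, local_ts, side, price, qty = rows[i]
            out.append((event, -1, local_ts, side, price, qty))
            i += 1
    return out


# Two trades, one depth level and one later trade, in local receive order.
rows = [
    (2, 1676419205111000, 1676419207212385, -1, 22177.90, 0.300),
    (2, 1676419205111000, 1676419207212480, -1, 22177.90, 0.119),
    (1, 1676419205108000, 1676419207212527, 1, 22172.30, 0.551),
    (2, 1676419205116000, 1676419207212584, -1, 22177.90, 0.001),
]
fixed = split_out_of_order(rows)
for row in fixed:
    print(row)
```

In the result, every row with a valid local_timestamp is in local order and every row with a valid exch_timestamp is in exchange order, which is the property the validation rules require.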