Data Validation
- convert_to_struct_arr(data, add_exch_local_ev=True)
Converts the 2D ndarray currently used in Python hftbacktest into the structured array that can be used in Rust hftbacktest.
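As a rough illustration of what such a conversion involves, the sketch below copies the columns of a 2D array into a NumPy structured array. The column order, the field names (ev, exch_ts, local_ts, px, qty), and the helper to_struct_arr are all assumptions made for this example; they are not taken from hftbacktest and may differ from the layout it actually uses.

```python
import numpy as np

# Hypothetical field layout, chosen only for illustration.
EVENT_DTYPE = np.dtype([
    ("ev", "i8"),
    ("exch_ts", "i8"),
    ("local_ts", "i8"),
    ("px", "f8"),
    ("qty", "f8"),
])


def to_struct_arr(data):
    """Copy each column of a 2D array into the matching named field."""
    data = np.asarray(data)
    if data.ndim != 2 or data.shape[1] != len(EVENT_DTYPE.names):
        raise ValueError("expected a 2D array with one column per field")
    out = np.empty(data.shape[0], dtype=EVENT_DTYPE)
    for col, name in enumerate(EVENT_DTYPE.names):
        out[name] = data[:, col]  # assignment casts to the field's dtype
    return out


# Two example rows in the assumed column order.
rows = np.array([
    [1.0, 100.0, 105.0, 10.5, 2.0],
    [2.0, 101.0, 107.0, 10.4, 1.5],
])
events = to_struct_arr(rows)
```

Each field can then be read by name, for example events["exch_ts"], instead of by column index.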
- Parameters:
- Returns:
Converted structured array.
- Return type:
- correct(data, base_latency, tick_size=None, lot_size=None, err_bound=1e-08, method='separate')
Validates the specified data and automatically corrects negative latency and unordered rows. See validate_data(), correct_local_timestamp(), correct_exch_timestamp(), and correct_exch_timestamp_adjust().
- Parameters:
data (ndarray[Any, dtype[ScalarType]] | DataFrame) – Data to be checked and corrected.
base_latency (float) – The value to be added to the feed latency. See correct_local_timestamp().
tick_size (float | None) – Minimum price increment for the specified data.
lot_size (float | None) – Minimum order quantity for the specified data.
err_bound (float) – Error bound used to verify if the specified tick_size or lot_size aligns with the price and quantity.
method (Literal['separate', 'adjust']) –
The method to correct reversed exchange timestamp events.
separate: Use correct_exch_timestamp().
adjust: Use correct_exch_timestamp_adjust().
- Returns:
Corrected data
- Return type:
- correct_event_order(sorted_exch, sorted_local, add_exch_local_ev)
Corrects exchange timestamps that are reversed by splitting each row into separate events, ordered by both exchange and local timestamps, through duplication. See data for details.
- Parameters:
- Returns:
Adjusted data with corrected exchange timestamps.
- Return type:
- correct_exch_timestamp(data, num_corr)
Corrects exchange timestamps that are reversed by splitting each row into separate events, ordered by both exchange and local timestamps, through duplication. See data for details.
- correct_exch_timestamp_adjust(data)
Corrects reversed exchange timestamps by adjusting the local timestamp value for proper ordering. It sorts the data by exchange timestamp and fixes out-of-order local timestamps by setting their value to the previous value, ensuring correct ordering.
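The sort-then-fix idea described above can be sketched with plain NumPy. This is a minimal illustration, not the library's implementation: it assumes the two timestamps are available as separate 1D arrays, and the helper name adjust_local_after_exch_sort is hypothetical.

```python
import numpy as np


def adjust_local_after_exch_sort(exch_ts, local_ts):
    """Sort rows by exchange timestamp and make local timestamps non-decreasing."""
    exch_ts = np.asarray(exch_ts)
    local_ts = np.asarray(local_ts)
    # A stable sort keeps rows with equal exchange timestamps in input order.
    order = np.argsort(exch_ts, kind="stable")
    sorted_exch = exch_ts[order]
    sorted_local = local_ts[order]
    # A running maximum replaces any local timestamp that is smaller than
    # its predecessor with the predecessor's (already adjusted) value.
    adjusted_local = np.maximum.accumulate(sorted_local)
    return sorted_exch, adjusted_local


exch = np.array([100, 103, 101, 104])
local = np.array([110, 111, 112, 113])
sorted_exch, adjusted_local = adjust_local_after_exch_sort(exch, local)
```

In this example the row with exchange timestamp 103 moves behind the row with 101, and its local timestamp is raised from 111 to 112 so that both columns end up ordered.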
- correct_local_timestamp(data, base_latency)
Adjusts the local timestamp if the feed latency is negative by offsetting the maximum negative latency value as follows:
feed_latency = local_timestamp - exch_timestamp
adjusted_local_timestamp = local_timestamp - min(feed_latency, 0) + base_latency
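A small worked example may make the adjustment concrete. The sketch below assumes that "the maximum negative latency" means the most negative feed latency across all rows, and that every local timestamp is shifted forward by its magnitude plus base_latency, but only when a negative latency exists. The helper shift_local_timestamps and the use of separate timestamp arrays are hypothetical.

```python
import numpy as np


def shift_local_timestamps(exch_ts, local_ts, base_latency):
    """Shift local timestamps so the smallest feed latency equals base_latency."""
    exch_ts = np.asarray(exch_ts)
    local_ts = np.asarray(local_ts)
    feed_latency = local_ts - exch_ts
    worst = feed_latency.min()  # most negative latency, if any
    if worst >= 0:
        return local_ts.copy()  # no negative latency, nothing to correct
    # Subtracting a negative value moves the timestamps forward in time.
    return local_ts - worst + base_latency


exch = np.array([1000, 1010, 1020])
local = np.array([1002, 1005, 1021])  # second row has latency -5
adjusted = shift_local_timestamps(exch, local, base_latency=3)
```

Here the feed latencies are 2, -5, and 1, so every local timestamp is shifted by 5 + 3 = 8, and the smallest adjusted latency becomes 3, the chosen base_latency.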
- Parameters:
data (ndarray[Any, dtype[ScalarType]] | DataFrame) – Data to be corrected.
base_latency (float) – Due to discrepancies in system time between the exchange and the local machine, latency may be measured inaccurately, resulting in negative latency values. The conversion process automatically adjusts for positive latency but may still produce zero latency cases. By adding base_latency, more realistic values can be obtained. The unit should be the same as the feed data’s timestamp unit.
- Returns:
Adjusted data with corrected timestamps
- Return type:
- validate_data(data, tick_size=None, lot_size=None, err_bound=1e-08)
Validates the specified data for the following aspects, excluding user events, and prints the validation results:
Ensures data’s price aligns with tick_size.
Ensures data’s quantity aligns with lot_size.
Ensures data’s local timestamp is ordered.
Ensures data’s exchange timestamp is ordered.
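The checks listed above can be approximated with a few lines of NumPy. This is only a sketch under stated assumptions: how the library applies err_bound is not described here, so the comparison against the nearest multiple is a guess, the filtering of user events is omitted, and the helpers count_misaligned and count_reversed are hypothetical.

```python
import numpy as np


def count_misaligned(values, step, err_bound=1e-8):
    """Count values that are not a multiple of `step` within `err_bound`."""
    values = np.asarray(values, dtype=float)
    ratio = values / step
    # Distance from the nearest whole number of steps.
    return int(np.sum(np.abs(ratio - np.round(ratio)) > err_bound))


def count_reversed(timestamps):
    """Count rows whose timestamp is smaller than the previous row's."""
    timestamps = np.asarray(timestamps)
    return int(np.sum(np.diff(timestamps) < 0))


prices = np.array([100.1, 100.2, 100.25])  # 100.25 is off a 0.1 tick grid
exch_ts = np.array([1, 3, 2, 4])           # the third row goes backwards
bad_prices = count_misaligned(prices, step=0.1)
reversed_rows = count_reversed(exch_ts)
```

The same alignment check applies to quantities with a lot size as the step, and the ordering check applies equally to local and exchange timestamps.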
- Parameters:
data (ndarray[Any, dtype[ScalarType]] | DataFrame) – Data to be validated.
tick_size (float | None) – Minimum price increment for the given asset.
lot_size (float | None) – Minimum order quantity for the given asset.
err_bound (float) – Error bound used to verify if the specified tick_size or lot_size aligns with the price and quantity.
- Returns:
The number of rows with reversed exchange timestamps.
- Return type: