Market Raker Beta Progress Report
Introduction
Since successfully releasing the first iteration of our ‘Alpha’ model, we have been working hard to improve the model’s accuracy and flexibility; the result will become the ‘Beta’ model. Items on our shopping list included a stop-loss recommendation algorithm (previously written about here:), incorporating leverage into our indicator recommendations, expanding the forecast horizons, and improving the performance of our current +12hr forecast. This report details some of the progress made towards the last two items on this list: we discuss the training process and performance of the current best +12hr forecast candidate model and +24hr forecast candidate model, and provide a bonus comparison with our currently operating model.
The data
As with the Alpha model, the data set used for the Beta model/s is derived from price and indicator data downloaded from Tradingview. To improve the model’s generalisation capacity, and to increase the data set size, we downloaded data for the 500 stock symbols in the S&P 500 and approximately 300 crypto trading pairs based on Binance’s top coins at the time. This left us with over 800 symbols, each with several months’ to years’ worth of samples.
To enable more flexibility in our data pipeline, we made a few changes:
• Rather than downloading the VMC-cipher indicators directly, we implement their calculations in the data pipeline itself and download price data and basic indicators. This allows for expanding the indicator selection without the need to re-download data.
• We improved the Data Generator implementation to enable shuffling data when selecting batches during training.
• We added the capacity to calculate either the price difference in X hours, or the ratio between the current price and the price in X hours, depending on a setting in the configuration file.
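The two target options in the last bullet can be sketched as follows. This is a minimal illustration and not the pipeline's actual code: the function name `make_targets`, the `mode` values, and the direction of the ratio (future price divided by current price) are all assumptions, and `horizon` is counted in samples rather than hours.

```python
from typing import List, Optional


def make_targets(prices: List[float], horizon: int, mode: str = "ratio") -> List[Optional[float]]:
    """Build one target per time step from a price series.

    mode="difference": future price minus current price.
    mode="ratio":      future price divided by current price (assumed direction).
    Steps whose future price lies beyond the end of the series get None.
    """
    if mode not in ("ratio", "difference"):
        raise ValueError("mode must be 'ratio' or 'difference'")
    if horizon < 1:
        raise ValueError("horizon must be at least 1 sample")

    targets: List[Optional[float]] = []
    for i, current in enumerate(prices):
        j = i + horizon
        if j >= len(prices):
            targets.append(None)  # no future price available yet
        elif mode == "difference":
            targets.append(prices[j] - current)
        else:
            targets.append(prices[j] / current)
    return targets


# Hypothetical hourly closes, one-sample horizon.
closes = [100.0, 150.0, 75.0]
ratio_targets = make_targets(closes, horizon=1, mode="ratio")
diff_targets = make_targets(closes, horizon=1, mode="difference")
```

A ratio target is scale-free, so symbols trading at very different price levels produce targets in a comparable range, whereas a raw difference depends on the price level of each symbol.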
Beyond the data pipeline itself, we changed the data pre-processing steps for the Beta model as follows:
• We selected longer input time windows.
• We decided to predict the ratio of current price to future price rather than the price difference.
The modeling pipeline
The modeling pipeline mostly remained the same, with the exception of expanding the capacity to vary loss functions, improvements to bias node settings, and generally increased flexibility during hyperparameter selection. The TCN implementation was revised to ensure that the residual blocks, dilation rate, and receptive field size behaved as was theoretically expected. We also added greater flexibility to optimisation choices, and made it easier to implement and use custom loss functions.
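As a sanity check of the kind described above, the receptive field of a TCN can be computed from its kernel size and dilation rates. The sketch below assumes the common layout of residual blocks with two dilated causal convolutions each, both sharing the block's dilation rate; the report does not state the actual block layout, so `convs_per_block` and the example values are assumptions.

```python
from typing import Sequence


def receptive_field(kernel_size: int, dilations: Sequence[int], convs_per_block: int = 2) -> int:
    """Number of input time steps that can influence one output step.

    Each dilated causal convolution with kernel size k and dilation d
    extends the receptive field by (k - 1) * d steps. A residual block
    is assumed to contain `convs_per_block` such convolutions.
    """
    if kernel_size < 1:
        raise ValueError("kernel_size must be at least 1")
    field = 1  # a single output step always sees its own input step
    for dilation in dilations:
        field += convs_per_block * (kernel_size - 1) * dilation
    return field


def covers_window(kernel_size: int, dilations: Sequence[int], window: int) -> bool:
    """True if the receptive field spans the whole input window."""
    return receptive_field(kernel_size, dilations) >= window


# Hypothetical setup: kernel size 3, dilations doubling per block.
field = receptive_field(kernel_size=3, dilations=[1, 2, 4, 8])
```

With kernel size 3 and dilations 1, 2, 4, 8, this gives 1 + 2 × 2 × 15 = 61 time steps. Comparing such a figure with the input window length shows whether the deepest layer can actually see the whole window.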
Model training
Model training followed the same Bayesian search method as discussed in previous reports: searching over the number and size of dense layers, convolutional layer setups, batch sizes, learning rate, dropout, and weight decay. Activation functions on the output, as well as on the dense layers, were varied to determine the effect on convergence speed and performance.
We also monitored two new metrics during the training process, in addition to the Coefficient of Determination (R2): directional accuracy (the % of predictions that have the same sign/direction as the target) and % valid trades. Note that % valid trades, as measured here, is the % of predictions that have the same sign/direction as the target and either fall within a 10% tolerance of the target value or are more extreme in value than the target. The % valid trades as measured by the Market Raker site and reported in other evaluations is the % of trades that are equal to or more extreme in value than the true price change within an accepted time range. It is important to remember this distinction as we discuss the results.
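The two metrics, as defined in this report, can be sketched as below. This is an illustration of the definitions, not the training code: it assumes predictions and targets are signed changes (for example, a price ratio minus 1), that the 10% tolerance is relative to the magnitude of the target, and that a value of exactly zero has no direction. The function names are hypothetical.

```python
from typing import Sequence


def same_direction(pred: float, target: float) -> bool:
    """True if both values are strictly positive or both strictly negative."""
    return (pred > 0 and target > 0) or (pred < 0 and target < 0)


def directional_accuracy(preds: Sequence[float], targets: Sequence[float]) -> float:
    """% of predictions with the same sign/direction as the target."""
    if not targets or len(preds) != len(targets):
        raise ValueError("preds and targets must be non-empty and equal in length")
    hits = sum(same_direction(p, t) for p, t in zip(preds, targets))
    return 100.0 * hits / len(targets)


def pct_valid_trades(preds: Sequence[float], targets: Sequence[float], tolerance: float = 0.10) -> float:
    """% of predictions in the target's direction that are within
    `tolerance` of the target or more extreme than it."""
    if not targets or len(preds) != len(targets):
        raise ValueError("preds and targets must be non-empty and equal in length")
    valid = 0
    for p, t in zip(preds, targets):
        within_tolerance = abs(p - t) <= tolerance * abs(t)
        more_extreme = abs(p) >= abs(t)
        if same_direction(p, t) and (within_tolerance or more_extreme):
            valid += 1
    return 100.0 * valid / len(targets)


# Made-up values for illustration only.
targets = [10.0, 10.0, 10.0, 10.0]
preds = [20.0, -10.0, 9.5, 5.0]
accuracy = directional_accuracy(preds, targets)  # 3 of 4 share the target's sign
valid = pct_valid_trades(preds, targets)         # only 20.0 and 9.5 qualify
```

In the made-up example, the prediction of 5.0 counts towards directional accuracy but not towards % valid trades, because it undershoots the target by more than 10%.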
Results
The initial hyperparameter searches showed that convolutional layers with successively decreasing filter sizes and a small kernel size worked best, along with 3 dense layers of decreasing width.
The winning +12hr configuration reached a validation R2 score of 19.36 within 110 epochs, and the winning +24hr configuration reached a validation R2 score of 12.20 after 140 epochs. The learning curves of these models are shown in Figure 1.
Figure 1: Learning Curves
In terms of performance on the evaluation set, the +12hr configuration achieves an average R2 score of -249.41. This is mostly attributed to individual symbols achieving extreme negative values, since a large majority of the symbols in the evaluation set achieved positive values. The average R2 score of the symbols achieving positive values is 20.16, with the best R2 score of the evaluation set being 35.
Similarly, the +24hr configuration achieves an average evaluation R2 score of -141.80, indicating that, on average, +24hr does ‘less bad’ than +12hr. However, when looking only at the symbols that achieved positive values, the +24hr configuration achieved an average positive R2 score of 12.22 (compared with 20.16 for +12hr), with the best R2 score of the evaluation set being 23.81.
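The gap between the overall average and the positive-only average follows from the fact that R2 is bounded above by 1 but unbounded below, so a handful of symbols with extreme negative scores can dominate a plain mean. The sketch below illustrates this with made-up per-symbol scores on the unscaled R2 range; the symbol names, the values, and the helper names are hypothetical and are not the report's results.

```python
from typing import Dict, Sequence


def r2_score(targets: Sequence[float], preds: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_target = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, preds))
    ss_tot = sum((t - mean_target) ** 2 for t in targets)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when the targets are constant")
    return 1.0 - ss_res / ss_tot


def summarise(per_symbol_r2: Dict[str, float]) -> Dict[str, float]:
    """Mean over all symbols, mean over positive-scoring symbols only,
    share of symbols with a positive score, and the best score."""
    values = list(per_symbol_r2.values())
    positive = [v for v in values if v > 0]
    return {
        "mean": sum(values) / len(values),
        "mean_positive": sum(positive) / len(positive) if positive else float("nan"),
        "share_positive": len(positive) / len(values),
        "best": max(values),
    }


# Made-up scores: three reasonable symbols and one extreme outlier.
scores = {"AAA": 0.25, "BBB": 0.5, "CCC": 0.75, "DDD": -21.5}
summary = summarise(scores)
```

Here three of the four hypothetical symbols score positively, yet the plain mean is -5.0 because of the single outlier. Reporting the share of positive symbols alongside the mean, or a median, makes this kind of skew visible.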
Table 1 summarises the results for these two models, and Figures 2 and 3 show a kernel density plot and time-series comparison plot for the predictions and target values for a selected symbol (BTC/USD).
Table 1: Evaluation set performance statistics.
Figure 2: KDE plots of target vs. prediction for BTCUSD
Figure 3: Time-series plot of target vs. prediction for BTCUSD
Comparison with current +12hr model
The average R2 value achieved by the currently operating Alpha model on currently available indicators is -2278.33. No symbol with more than 5 samples in the indicator data has a positive R2. The average directional accuracy for the currently implemented model is 49%, and % valid trades, as defined in this report, is 3.34% on average.
Table 2 compares the statistics for the Alpha model with those for the Beta model, for the symbols supported on the Alpha model only.
Table 2: Evaluation set performance statistics for Alpha model vs. Beta model
Conclusion and next steps
When the evaluation performance of the current Beta model/s is compared with the indicator performance of the Alpha model, the Beta model clearly outperforms the Alpha model on all metrics. It is also encouraging that all the symbols currently supported by the Alpha model perform very well on the Beta model.
The increase in data set size, the improvement in data management strategy, and the investment in training time appear to have truly paid off.
Next steps include refining and fine-tuning these models in the hope of further performance gains, implementing a leverage recommendation, and improving the stop-loss algorithm for continuous recommendation. We are excited to see how these models perform once they are released into the wild at the end of May 2024.
To summarize (Simple version)
We’re thrilled to share an update on our progress in developing the next generation of our AI trading assistant, which we’re calling the MarketRaker Beta model. Our team has been working hard to build upon the success of the Alpha model by making it even smarter and more capable.
Here are the key things we’ve been focusing on:
- Giving the AI a lot more data to learn from by expanding our dataset to include 500 major US stocks and 300 top cryptocurrencies. The more examples the AI sees, the better it can identify patterns and make accurate predictions.
- Making improvements under the hood to our data processing, AI architecture, and training methods. This allows us to more easily tweak and optimize the model to achieve peak performance.
- Extending the model’s prediction window to forecast price movements up to 12 and 24 hours in advance, going beyond what the Alpha model could do.
- Comparing the new Beta model’s performance to the Alpha version on our testing dataset. We’re seeing the Beta model significantly outperform when it comes to key measures like accuracy, ability to predict price direction, and percentage of trades that fall within an acceptable range.
While there’s still more work to do before we release the Beta model, these early results are extremely promising. We’re especially excited that the Beta version is showing strong predictive ability for all the trading pairs you can currently use with the Alpha model.
Next up, we’ll be further fine-tuning the models, working on adding the ability to recommend how much leverage to use on trades, and making the stop-loss feature even smarter. We’re so appreciative of your ongoing support and can’t wait to put this next-level AI powerhouse in your hands to supercharge your trading.
The future of algorithmic trading is looking brighter than ever!
Cheers, The MarketRaker Team