AI Trading with Sentiment Analysis, the chronicles

6 min readAug 30, 2024

Introduction

It is well discussed that stock price prediction with technical indicator analysis alone is insufficient. Analysis based on price behavior alone fails to take into account factors that influence the price without being obviously ‘baked in’ to its technicals. Unfortunately, if we avoid illegally obtaining tips from folks immediately and intimately in-the-know, we are left with our gut-feelings, the news, and whatever sleuthing skills we can deploy on Social Media and Forums. Luckily, Sentiment Analysis provides a solution to this dilemma. While we can never truly model any intuitively blessed trader’s ‘gut feeling’, Sentiment Analysis, when deployed correctly, is very useful in tandem with technical analysis.

The obvious problems with using Sentiment Analysis, such as lag in news vs. price impact, accuracy, and recency bias, does not detract from Sentiment Analysis in combination with technical indicators providing a more holistic picture than technical analysis alone. We suspect that this
increased ‘holisticity’ will have a positive impact on our predictive power, regardless of how the information tends to ‘lag’ behind the actual price trend.

We plan to balance the sentiment analysis of longer term/more verified information sources such as news articles with the recency/on-the-pulse nature of X posts. The combination of these two sources, with various degrees of bias towards one or the other, along with technical indicators,
provide a very rich dataset for our models to learn from. This report discusses the data we will use, how it will be pre-processed and balanced, and the workflow design of our Sentiment Analysis components. We will also discuss why we make certain design decisions, and how we plan to integrate this component into the larger Market Raker framework.

Data collection

In the new post-A.I. paradigm of the internet, everyone and their cousin wants a piece of the internet to train their next model/s on. As a result, several traditional avenues for obtaining online data have now been ‘gatekept’ with costly API subscriptions, limiting the amount of data that we can properly obtain from sources such as X or other social media platforms. Since the data that we are 1able to retrieve from X is not alone sufficient for our use case, we supplement the data we are able to obtain from X with headlines obtained from Google News. This has the added benefit of being able to balance recency (X) with reliability (Google News). Essentially, for a given prediction time, the average sentiment of Google News articles published in the preceding 24 hours is provided alongside the average sentiment of currently circulating X posts.

Workflow design

The proposed workflow for our Sentiment Analysis integration is diagrammized in Figure 1.

Figure 1: Workflow Diagram for Sentiment Analysis

Here, SYMBOL (or symbol) represents the requested cryptocurrency trading pair or stock ticker. If a prediction request for a particular symbol is lodged, the system will first determine if sentiment analysis is supported for that symbol or not. This is since, logically, not all symbols are equally
represented in the news and online, which will affect effectiveness of the sentiment analysis. If that symbol is not supported, the prediction request will be redirected to the current vanilla (technical- 2analysis only) model. If that symbol is supported, the system will then determine if there is currently valid sentiment data available in the database, before sending the request to the sentiment-enabled model for prediction. In the scenario that the sentiment data is not already in the database, the sentiment analysis
routine needs to be run. This is diagrammized in Figure 2.

Figure 2: Flow diagram for sentiment analysis routine

The data is first retrieved via and A.P.I. request from Google News and X. This is then sent, item by item (where item could be a single article title or a single X post) to the sentiment analysis component. This component consists of an A.P.I. call to the open source Groq A.P.I., using the “llama-3.1–70b-versatile” model. The main rationale for using an LLM is this way is the flexibility of prompt/input design. With this workflow, the desired behaviour can be explicitly extracted from a third-party model in absence of our own purpose-built model. The particular aspect of prompt
design was to have the model consider the sentiment with regards to the content’s subject. The content’s subject should usually be the symbol in question. Once the sentiment value for each item, which would be encoded as -1 for a negative sentiment, 0 for a neutral sentiment, and 1 for a positive sentiment, the current average sentiment is calculated for this symbol. The current average sentiment consists of the average 24-hourly sentiment as per Google News combined with the average sentiment of the currently circulating X posts that could be retrieved.

Design justifications

The first major design decision is the use of an ‘average sentiment’. This is to try and limit noise or undue variance in the Sentiment Analysis signal, and to avoid skewing opinion too quickly. This strategy could be adjusted pending model predictive strength results. This ‘average sentiment’ strategy could also be leveraged to deal with or incorporate the reliability/recency biases. This could be done in several ways, such as where the average sentiment of the longer term news is modulated by the sentiment of the more recent X posts to ‘skew’ the score to more recent opinions, or vice-versa. The average sentiment of the longer term news could be averaged out with the average of the more recent sentiment to treat these two quantities equally. Beyond the average sentiment strategy,
different data sources (such as different news providers or different X users with larger/smaller followings) could be weighted to adjust the sentiment according to viewership or reliability as a proxy for impact.
Another major design decision is the choice of sentiment analysis agent. While ‘cooking’ a completely bespoke Sentiment Analysis model from scratch could be the most ideal solution, this is highly impractical for an initial feature rollout. The investment of design and development time in
3a feature such as this before truly knowing the impact on the predictive power of the Final Model is imprudent at best. In this case, publicly available and open-source models that already have been developed are an appropriate stand-in solution. But still, why an LLM, when there are several
suitable text-classifiers available on Hugging Face and similar sources? At present, one of the major issues with using these models are their lack of ‘context’. It is very difficult to specify what the subject in a sentence should be — for example, a sentence stating that a vehicle manufacturer has
outpaced Tesla is ‘positive’ at face-value, but negative for Tesla. Vanilla text classifiers will have difficulty providing the correct classification in this case. For this reason, the flexibility of prompting an LLM with specific instructions on how to parse a sentence correctly is in our favour when it comes to having to trust the sentiment output.

Conclusion

We know that technical analysis data alone cannot sustain our models forever, and we are very excited to see what a more holistic dataset containing sentiment data for various tickers and trading pairs will bring to our predictive power. We expect to see an increase in accuracy and reliability of our models, and will likely uncover interesting patterns in the relationship of news/events to price movement.

The sentiment analysis component of our system is set to be fully implemented and released along with our Final Model in November. We trust that this update serves as a well-timed view into our design and planning.