This article and the data-driven insights were created by Bouke Stam, Data Scientist at Nuklai.
Between February and April 2024, Nuklai used the power of its Smart Data Technology to launch a Web3-focused research campaign: the Bitcoin Price Research.
The objective was to predict the price of Bitcoin at the halving event on April 20. To qualify, participants needed to make at least five predictions, and the winner would be the one whose prediction came closest to the actual price. The prize for the closest prediction was 1 Bitcoin ($64,948.57 at the halving).
All these predictions were stored in a dataset, alongside a comment describing the participant's sentiment about the current market conditions.
In this article we will look at the data and analyse it to find interesting relationships. We will also tap into our partner Finage's pricing data to create insights from the combined data.
Tracking the Surge: Daily Prediction Trends in the Research
In the graph below you can see the number of predictions that were made on each day.
Participation started out strong and then averaged around 50 predictions per day for most of the research period. Towards the end, predictions started to increase again, perhaps because people wanted to predict as close to the deadline as possible. In total, 3,068 predictions were made.
And here are some rows in the dataset to get an idea of what the data looks like:
We can also look at how many predictions each user made. The graph below reveals that the majority of users made only a single prediction. The frequency diminishes progressively, with a slight bump at five predictions, as this is the requirement to qualify for the prize. Only a small number of users made more than ten predictions.
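As a rough illustration, these counts can be derived from the dataset in a few lines of pandas. This is a minimal sketch, assuming the prediction rows have been exported to a local file (the file name and the column names "date", "user" and "prediction" are hypothetical):

import pandas as pd
# load the prediction rows, e.g. from a CSV export of the dataset
df = pd.read_csv("btc_predictions.csv", parse_dates=["date"])
# number of predictions made on each day (the first graph)
predictions_per_day = df.groupby(df["date"].dt.date).size()
# predictions per user, then how many users made 1, 2, 3, ... predictions (the second graph)
predictions_per_user = df["user"].value_counts()
users_per_prediction_count = predictions_per_user.value_counts().sort_index()
print(predictions_per_day)
print(users_per_prediction_count)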
For reference, the dataset can be queried directly through the Nuklai public API. First, set up the endpoint and authentication headers (JavaScript, Python and PHP):

JavaScript:
const ApiUrl = "https://api.nukl.ai/api/public/v1/datasets/:datasetId/queries";
const ApiKey = "[API_KEY]";
const DatasetId = "[DATASET_ID]";
const headers = {
  "Content-Type": "application/json",
  "authentication": ApiKey
};

Python:
ApiUrl = "https://api.nukl.ai/api/public/v1/datasets/:datasetId/queries"
ApiKey = "[API_KEY]"
DatasetId = "[DATASET_ID]"
headers = {
    "Content-Type": "application/json",
    "authentication": ApiKey
}

PHP:
$ApiUrl = "https://api.nukl.ai/api/public/v1/datasets/:datasetId/queries";
$ApiKey = "[API_KEY]";
$DatasetId = "[DATASET_ID]";
$headers = [
    "Content-Type: application/json",
    "authentication: $ApiKey"
];
Looking for Insights
The best way to get a good overview of the predictions was to plot them against the BTC price. To do this we calculated the median prediction per day. We took the median (the middle value) instead of the mean (the average) because there are a lot of outliers in our data.
In this kind of scenario the median provides a better measure of central tendency than the mean. We also used BTC price data from Finage's historical price API. The resulting graph can be seen below:
Next, define the SQL query body that is sent with the request:

JavaScript:
// @dataset represents your dataset rows as a table
const body = {
  sqlQuery: "select * from @dataset limit 5"
};

Python:
# @dataset represents your dataset rows as a table
body = {
    "sqlQuery": "select * from @dataset limit 5"
}

PHP:
// @dataset represents your dataset rows as a table
$body = [
    "sqlQuery" => "select * from @dataset limit 5"
];
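To give an idea of the median calculation described above, here is a minimal sketch. It reuses the hypothetical DataFrame df from the earlier sketch and assumes the daily BTC close prices from Finage have already been saved to a local file btc_prices.csv with columns "date" and "close" (both the file and column names are assumptions):

import pandas as pd
import matplotlib.pyplot as plt
# median prediction per day; the median is more robust to outliers than the mean
median_per_day = df.groupby(df["date"].dt.floor("D"))["prediction"].median()
# daily BTC close prices, previously fetched from Finage's historical price API
prices = pd.read_csv("btc_prices.csv", parse_dates=["date"]).set_index("date")["close"]
# plot both series on the same axes
ax = median_per_day.plot(label="Median prediction")
prices.plot(ax=ax, label="BTC price")
ax.legend()
plt.show()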
Some observations:
- The predictions are highly correlated with the current price
- The predictions seem to be consistently $5,000 to $10,000 above the current price, with only one day dipping below it
- On certain days with lots of upward or downward price movement, the predictions become extremely positive or negative
These observations seem to indicate that people are highly influenced by daily price swings and have a short-term view of the market.
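Both observations can be checked numerically from the same two daily series. A short continuation of the previous sketch:

# correlation between the daily median prediction and the BTC price
print(median_per_day.corr(prices))
# how far the median prediction sits above the current price on a typical day
gap = median_per_day - prices
print(gap.describe())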
Analyzing Trends: Sentiment and Quality Breakdown
In order to analyse the comments that were left, we needed to divide them into concrete categories. We decided to categorise each comment based on sentiment and quality.
Sentiment can be “bearish”, “bullish” or “neutral”, and quality can be “bad”, “short” or “detailed”. The analysis was done with GPT-4o, the newest model from OpenAI at the time. We used the API and a short script to analyse all 3,000 comments, which cost around 80 cents. In the graph below you can see the sentiment for each day:
Putting it all together, here is the complete request that submits a query to the dataset:

JavaScript:
const ApiUrl = "https://api.nukl.ai/api/public/v1/datasets/:datasetId/queries";
const ApiKey = "[API_KEY]";
const DatasetId = "[DATASET_ID]";
const headers = {
  "Content-Type": "application/json",
  "authentication": ApiKey
};
// @dataset represents your dataset rows as a table
const body = {
  sqlQuery: "select * from @dataset limit 5"
};
// make request
fetch(ApiUrl.replace(':datasetId', DatasetId), {
  method: "POST",
  headers: headers,
  body: JSON.stringify(body), // serialize the body as a JSON string
})
  .then((response) => response.json())
  .then((data) => {
    console.log(data);
  })
  .catch((error) => {
    console.error(error);
  });

Python:
import requests
import json
ApiUrl = "https://api.nukl.ai/api/public/v1/datasets/:datasetId/queries"
ApiKey = "[API_KEY]"
DatasetId = "[DATASET_ID]"
headers = {
    "Content-Type": "application/json",
    "authentication": ApiKey
}
# @dataset represents your dataset rows as a table
body = {
    "sqlQuery": "select * from @dataset limit 5"
}
# make request
url = ApiUrl.replace(':datasetId', DatasetId)
try:
    response = requests.post(url, headers=headers, data=json.dumps(body))
    data = response.json()
    print(data)
except requests.RequestException as error:
    print(f"Error: {error}")

PHP:
$ApiUrl = "https://api.nukl.ai/api/public/v1/datasets/:datasetId/queries";
$ApiKey = "[API_KEY]";
$DatasetId = "[DATASET_ID]";
$headers = [
    "Content-Type: application/json",
    "authentication: $ApiKey"
];
// @dataset represents your dataset rows as a table
$body = [
    "sqlQuery" => "select * from @dataset limit 5"
];
// make request
$ch = curl_init(str_replace(':datasetId', $DatasetId, $ApiUrl));
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($body)); // encode the body as JSON
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$result = curl_exec($ch);
curl_close($ch);
echo $result;

cURL:
curl -X POST 'https://api.nukl.ai/api/public/v1/datasets/[DATASET_ID]/queries' \
  -H 'Content-Type: application/json' \
  -H 'authentication: [API_KEY]' \
  -d '{"sqlQuery":"select * from @dataset limit 5"}'
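The comment classification itself can be scripted in a few lines. Below is a minimal sketch of the approach, not the exact script used for this research; it assumes the openai Python package, an OPENAI_API_KEY environment variable, and an illustrative prompt:

from openai import OpenAI
client = OpenAI()  # reads OPENAI_API_KEY from the environment
PROMPT = (
    "Classify this Bitcoin market comment. "
    "Answer with two words: sentiment (bearish, bullish or neutral) "
    "and quality (bad, short or detailed)."
)
def classify(comment: str) -> str:
    # one short chat completion per comment; roughly 3,000 comments cost well under a dollar
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": PROMPT},
            {"role": "user", "content": comment},
        ],
    )
    return response.choices[0].message.content
print(classify("Price will dump after the halving, miners will have to sell."))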
With this new data, we can now calculate the average prediction error for each of the categories. The prediction error is the difference between the predicted price and the actual price at the halving. The higher the error, the worse the prediction was.
The results of a query job can then be retrieved with the job ID returned by the /queries request:

JavaScript:
const ApiUrl = "https://api.nukl.ai/api/public/v1/datasets/:datasetId/queries/:jobId";
const ApiKey = "[API_KEY]";
const DatasetId = "[DATASET_ID]";
const JobId = "[JOB_ID]"; // retrieved from the /queries request
const headers = {
  "Content-Type": "application/json",
  "authentication": ApiKey
};
// make request
fetch(ApiUrl.replace(':datasetId', DatasetId).replace(':jobId', JobId), {
  method: "GET",
  headers: headers
})
  .then((response) => response.json())
  .then((data) => {
    console.log(data);
  })
  .catch((error) => {
    console.error(error);
  });

Python:
import requests
ApiUrl = "https://api.nukl.ai/api/public/v1/datasets/:datasetId/queries/:jobId"
ApiKey = "[API_KEY]"
DatasetId = "[DATASET_ID]"
JobId = "[JOB_ID]"  # retrieved from the /queries request
headers = {
    "Content-Type": "application/json",
    "authentication": ApiKey
}
# make request
url = ApiUrl.replace(':datasetId', DatasetId).replace(':jobId', JobId)
try:
    response = requests.get(url, headers=headers)
    data = response.json()
    print(data)
except requests.RequestException as error:
    print(f"Error: {error}")

PHP:
$ApiUrl = "https://api.nukl.ai/api/public/v1/datasets/:datasetId/queries/:jobId";
$ApiKey = "[API_KEY]";
$DatasetId = "[DATASET_ID]";
$JobId = "[JOB_ID]"; // retrieved from the /queries request
$headers = [
    "Content-Type: application/json",
    "authentication: $ApiKey"
];
// make request
$ch = curl_init(str_replace(array(':datasetId', ':jobId'), array($DatasetId, $JobId), $ApiUrl));
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$result = curl_exec($ch);
curl_close($ch);
echo $result;

cURL:
curl 'https://api.nukl.ai/api/public/v1/datasets/[DATASET_ID]/queries/[JOB_ID]' \
  -H 'Content-Type: application/json' \
  -H 'authentication: [API_KEY]'
The bearish predictions had a lower error, likely because the actual price was lower than most people anticipated. We can also apply the same analysis to the comment quality.
Here we see that low-quality comments had the highest error, and the predictions with short comments did the best.
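For completeness, here is a minimal sketch of how these averages can be computed, assuming the classified predictions live in a pandas DataFrame df with hypothetical columns "prediction", "sentiment" and "quality", and using the absolute difference as the error:

HALVING_PRICE = 64_948.57  # actual BTC price at the halving
# prediction error per row: distance between the prediction and the actual price
df["error"] = (df["prediction"] - HALVING_PRICE).abs()
# average error per sentiment category and per comment quality
print(df.groupby("sentiment")["error"].mean())
print(df.groupby("quality")["error"].mean())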
Insights Unveiled: Leveraging Smart Data Research for 3,000+ Predictions
So, what have we learned? Our Smart Data Research gathered data from many participants on Nuklai and proved to be a great way to collect a diverse set of data. By incentivizing users with a reward, we managed to gather more than 3,000 individual predictions.
We also leveraged our partner Finage's historical price API to create even better insights into the data.
By analysing this data we saw that the predictions reflected a short-term market view, being highly dependent on current price swings. Using GPT-4o we determined the sentiment and quality of the comments and saw that low-quality comments also came with worse predictions.
Want to collaborate with us on creating community-driven Smart Datasets? Get in touch here.
About Finage Ltd
Finage Ltd is a leading enterprise-grade financial data provider. It provides instant access to real-time and historical data on the stock, currency, and cryptocurrency markets.
Finage's technology-first approach to finance and market data feeds provides its partners with innovative features that simplify financial data and computing.
The Finage Widget Collection and APIs & Web Sockets provide a low-barrier app-building ecosystem for developers and their teams. Besides a developer-friendly, flexible platform, Finage provides wide data coverage, including financial and ownership statements, news sentiment, analyst estimates, mergers and acquisitions news, and earnings call transcripts on request.
About Nuklai
Nuklai is a collaborative data marketplace and infrastructure provider for data ecosystems. It combines the power of community-driven data analysis with the datasets of successful modern businesses.
The marketplace allows grassroots data enthusiasts and institutional partners to find new ways to use untapped data and generate new revenue streams.
Our vision is to unify the fragmented data landscape. We fulfill this mandate by providing a user-friendly, streamlined, and inclusive approach to sharing, requesting, and evaluating data for key insights.
We also provide better processes and new business opportunities, empowering next-generation large language models and AI.
Follow Nuklai on X and join Telegram to stay up to date with the latest Nuklai news.