RKL Boost Analysis
This analysis was originally a Twitter thread (10/11/21). I wanted to reformat it as a Medium article so it would be easier to read and reference. At the end I will link the GitHub repository and the original Twitter thread. Let’s dive in!
Rumble Kong League (RKL) is an upcoming 3v3 basketball game being built on the Ethereum network. The project has drawn lots of attention from the NFT world, and NBA players such as Stephen Curry and Paul George have bought Kongs in anticipation of its release.
There has been significant price action and traction within this project since the original date of this analysis, but the idea still holds. I plan to expand on this analysis with some more advanced methods later. For now, let’s investigate whether there is any correlation between Rumble Kong sale prices and their boost totals, and maybe do some simple machine learning.
To begin, let’s double-check the RNG of the boost distributions across all Kongs. As expected, the distributions look fairly normal for each individual boost and for the cumulative total.
The dataset reports sale prices in every currency on the same numeric scale (i.e. a DAI price and an ETH price share one scale), so I narrowed it down to only the Kongs whose last sale was in ETH. After doing that, we get these totals for the number of Kongs listed and the number of Kongs that have at least one sale:
Now, we can see a distribution of the last sales price for all Kongs that have had at least one sale.
In addition, we can look at the correlation between some variables of interest and the last sale price. For the most part, there seems to be little to no correlation between these variables, except for the very highest ranked boost values.
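For reference, the usual measure behind this kind of comparison is Pearson's correlation coefficient. Below is a minimal sketch in plain Python. The boost totals, prices, and variable names are made up for illustration and are not taken from the RKL dataset.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: cumulative boost and last sale price (ETH) for five Kongs.
boost_total = [210, 245, 260, 280, 310]
last_sale_eth = [0.9, 0.6, 1.4, 0.8, 1.1]

r = pearson_r(boost_total, last_sale_eth)
print(f"r = {r:.3f}")
```

Values near 0 mean little linear relationship, while values near 1 or -1 mean a strong one.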
However, there is a lot of skew because of the role that rare aesthetic traits play in sale price. Take a look at how different the average and median sale prices are:
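As a quick illustration of the effect, here is a small sketch with invented prices. A single rare-trait sale is enough to pull the average well above the median.

```python
from statistics import mean, median

# Hypothetical last sale prices in ETH: mostly near the floor, one rare-trait sale.
prices_eth = [0.5, 0.6, 0.6, 0.7, 0.8, 12.0]

avg = mean(prices_eth)
med = median(prices_eth)
print(f"mean = {avg:.2f} ETH, median = {med:.2f} ETH")
```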
To combat that, we can apply some normalizing transformations to our data. It is important to normalize our data if we want to use parametric predictive models (spoiler alert, simple linear regression).
First, I cut off the sales that were extreme outliers, which gives us a much more normal-looking distribution, though it is still kind of skewed.
Applying a square-root transformation then gives us a nice little stat table with more normalized values, which reduces the influence of rare-trait outlier sales.
And now, our Last Sale Price histogram looks kind of normal.
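The two steps above can be sketched roughly as follows. The exact cutoff used for "extreme outliers" is not stated here, so the 1.5 × IQR upper fence below is an assumption, and the prices are invented.

```python
import math
from statistics import quantiles

# Hypothetical last sale prices in ETH.
prices_eth = [0.4, 0.5, 0.6, 0.6, 0.7, 0.8, 0.9, 1.0, 1.2, 15.0]

# Assumed cutoff rule (not from the original analysis):
# drop anything above Q3 + 1.5 * IQR.
q1, _, q3 = quantiles(prices_eth, n=4)
upper_fence = q3 + 1.5 * (q3 - q1)
trimmed = [p for p in prices_eth if p <= upper_fence]

# The square-root transform compresses the remaining right tail.
sqrt_prices = [math.sqrt(p) for p in trimmed]
print(len(prices_eth), "->", len(trimmed), "sales kept")
```

Predictions made on the square-root scale can be squared to get back to ETH.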
Our correlations between traits and sale price look less sporadic, but they are still weak. The last graph looks promising, though.
So now, for some fun, we are going to try to predict the Last Sale Price using the following variables:
- Cumulative Boost
- Each of the four boost types (Shooting, Defense, Vision, Finish)
- Last Sale Date
We will be using a simple linear regression with both the raw values and the normalized values to see if we find anything of interest.
Before we do that, two important metrics to know:
- R-square: The share of the variance in the value being predicted (Y) that is explained by the variables used to make the prediction (X). For example, an R-square of 0.75 means our X variables explain 75% of the variance in Y.
- MSE: Mean squared error. This squares the difference between the predicted value and the actual value for each observation, and then averages them all.
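Both metrics are short enough to write out by hand. This is a sketch of the standard definitions with made-up numbers, not the code used for the results below.

```python
def mse(actual, predicted):
    """Mean squared error: the average of the squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical actual vs. predicted sale prices in ETH.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]

print(mse(actual, predicted), r_squared(actual, predicted))
```

A model that always predicts the average of Y scores an R-square of 0, so values near 0 mean the predictors add almost nothing.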
Okay, now let’s run some models.
Cumulative Boost:
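With raw values, we get an R-square of 0.0152 and an MSE of 6.43. With normalized values, we get an R-square of 0.0155 and an MSE of 0.119. The error is much smaller, but part of that drop comes from the square-root scale itself, since MSE is measured in squared units of whatever is being predicted. Either way, cumulative boost is proving not to be a reliable predictor.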
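Here is a minimal sketch of this kind of single-predictor fit, written in plain Python rather than whatever library the original analysis used (the GitHub repository is the place to check that). The data is invented. It fits the same predictor against the raw price and the square-root price, which shows how the MSE shrinks with the scale of the target.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope

def evaluate(xs, ys):
    """Fit a line and return (R-square, MSE) measured on the same data."""
    intercept, slope = fit_line(xs, ys)
    preds = [intercept + slope * x for x in xs]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot, ss_res / len(ys)

# Hypothetical cumulative boosts and last sale prices (ETH).
boost_total = [200, 220, 240, 260, 280, 300]
price_eth = [0.6, 4.0, 0.9, 1.2, 6.5, 1.0]

r2_raw, mse_raw = evaluate(boost_total, price_eth)
r2_sqrt, mse_sqrt = evaluate(boost_total, [math.sqrt(p) for p in price_eth])
print(f"raw:  R2={r2_raw:.3f}  MSE={mse_raw:.3f}")
print(f"sqrt: R2={r2_sqrt:.3f}  MSE={mse_sqrt:.3f}")
```

R-square is unitless, so it is the safer number to compare between the raw and transformed models.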
Individual Boosts:
With raw values, we get an R-square of 0.0277 and an MSE of 6.352. Large error again, but an R-square almost twice as high (still very poor). With normalized values, we get an R-square of ~0 (???) with a much smaller MSE.
So, like cumulative boost, the individual boost stats do not paint an accurate picture of sales price.
Last Sale Date:
Raw values give us an R-square of 0.0337 and an MSE of 6.3135. Same ballpark as the previous predictors. Normalized values give us an R-square of 0.504 (!!!) and an MSE of 0.06.
We finally have something that can work as a predictor! We can interpret this regression as indicating that ~50% of the variability in the normalized Last Sale Price can be explained by the Last Sale Date of the Kong. The other 50% is likely trait rarity. Looking back at a previous graph, that seems about right.
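A regression needs the date as a number. How the date was encoded is not stated here, so the sketch below assumes one common choice: days elapsed since the earliest sale in the sample. The dates are invented.

```python
from datetime import date

# Hypothetical last sale dates.
sale_dates = [
    date(2021, 9, 12),
    date(2021, 9, 25),
    date(2021, 10, 3),
    date(2021, 10, 10),
]

# Assumed encoding: days elapsed since the earliest sale in the sample.
first_sale = min(sale_dates)
days_since_first = [(d - first_sale).days for d in sale_dates]
print(days_since_first)
```

With this encoding, the slope of the regression reads as the change in (normalized) price per day.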
Why is Last Sale Date the only significant predictor? Well, the RKL floor has been rising significantly over the past month. Because of that, even normalization isn’t accurately illustrating how people are buying.
My prediction is that re-running this analysis once the game has been released will give us results showing more significant influence from stat ranking.