Mapping the NFT revolution: market trends, trade networks, and visual features

Non-Fungible Tokens (NFTs) are digital assets that represent objects like art, collectibles, and in-game items. They are traded online, often with cryptocurrency, and are generally encoded within smart contracts on a blockchain. Public attention towards NFTs exploded in 2021, when their market experienced record sales, but little is known about the overall structure and evolution of this market. Here, we analyse data concerning 6.1 million trades of 4.7 million NFTs between June 23, 2017 and April 27, 2021, obtained primarily from the Ethereum and WAX blockchains. First, we characterize statistical properties of the market. Second, we build the network of interactions and show that traders typically specialize in NFTs associated with similar objects and form tight clusters with other traders who exchange the same kind of objects. Third, we cluster objects associated with NFTs according to their visual features and show that collections contain visually homogeneous objects. Finally, we investigate the predictability of NFT sales using simple machine learning algorithms and find that sale history and, secondarily, visual features are good predictors of price. We anticipate that these findings will stimulate further research on NFT production, adoption, and trading in different contexts.

Breakdown of NFT categories. Overall statistics of each NFT category under consideration.

S1.2 Generation of the trader and NFT networks of interaction
Table S3. Link creation mechanism of the NFT network. Directed links are generated using the trader network as reference and following three rules. The first two rules consider the same buyer, while the third considers another buyer, all interacting with the same three NFTs. Visualization is done using Graph-tool 5 .
While the trader network was directly obtained from our data collection, the NFT network was created by linking NFTs that are purchased in sequential order by the same buyer. Let us consider NFT i , NFT j , NFT k , and NFT h as identifiers of generic NFTs, and t α , t β , and t γ as identifiers of time instants (with a temporal resolution of seconds). Table S3 illustrates two meaningful examples of how the NFT network is created. (i) When a buyer, who purchased NFT i at time t α , buys NFT j at time t β > t α , a directed link from NFT i to NFT j is created at time t β . If the same buyer purchases NFT k at a later time t γ > t β , a directed link from NFT j to NFT k is drawn at time t γ . (ii) When a buyer, who purchased NFT i at time t α , buys NFT j and NFT k at the same instant t β > t α , a directed link from NFT i to NFT j and another from NFT i to NFT k are drawn. If the same buyer purchases a fourth NFT h at a later time t γ > t β , directed links from NFT j to NFT h and from NFT k to NFT h are drawn at time t γ . The NFT network hereby constructed includes 4 657 713 NFTs out of a total of 4 704 479. The NFTs that are left out belong to buyers who performed only one transaction. The network analysis is done by leveraging selected functions in the networkx Python package.
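The construction above can be sketched as follows. This is a minimal illustration with networkx, assuming toy trade records of the form (buyer, timestamp, NFT identifier); the field names and data are hypothetical, not the paper's actual data schema.

```python
# Sketch of the NFT-network construction: NFTs bought by the same buyer are
# linked in order of purchase; purchases made at the same instant all receive
# links from the NFTs bought at the buyer's previous purchase instant.
from collections import defaultdict

import networkx as nx

# hypothetical (buyer, timestamp, nft_id) records
trades = [
    ("alice", 1, "nft_i"),
    ("alice", 2, "nft_j"),
    ("alice", 2, "nft_k"),   # bought at the same instant as nft_j
    ("alice", 3, "nft_h"),
]

def build_nft_network(trades):
    G = nx.DiGraph()
    purchases_by_buyer = defaultdict(list)
    for buyer, t, nft in sorted(trades, key=lambda r: r[1]):
        purchases_by_buyer[buyer].append((t, nft))
    for purchases in purchases_by_buyer.values():
        # group purchases that happen at the same second
        groups = defaultdict(list)
        for t, nft in purchases:
            groups[t].append(nft)
        times = sorted(groups)
        for prev_t, cur_t in zip(times, times[1:]):
            # every NFT of the previous instant links to every NFT bought now
            for src in groups[prev_t]:
                for dst in groups[cur_t]:
                    G.add_edge(src, dst, time=cur_t)
    return G

G = build_nft_network(trades)
```

With the toy trades above, the sketch reproduces rule (ii): nft_i links to both nft_j and nft_k, which in turn both link to nft_h.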

S1.3 Generation of random networks
Random networks relative to the trader and NFT networks are generated in a similar fashion, preserving each node's outgoing and incoming strength. We consider the pool of observed links with repetition, that is, a link appears a number of times equal to its weight. Two links l 1 and l 2 are randomly extracted from this pool, where link l 1 goes from node n 1 l 1 to node n 2 l 1 and link l 2 goes from node n 1 l 2 to node n 2 l 2 . These links are swapped if the four nodes are all different. The swap consists in redrawing link l 1 as a directed link from node n 1 l 1 to node n 2 l 2 , and link l 2 as a directed link from node n 1 l 2 to node n 2 l 1 . We repeat this procedure a number of times equal to the total number of links in the network. We create 100 independent realizations of this random network for the trader network and 100 for the NFT network.

S1.4 NFT features
We characterize NFTs with a set of 11 features, partitioned into three groups. An NFT's features were calculated only from the data that could be collected until the day before its primary sale, t s . We used these features in two separate tasks: regression (Section S1.5) and classification (Section S1.6).
The first group of features includes network centrality scores obtained from the trader network. Specifically, we considered the degree centrality (k), and the PageRank centrality (PR) of the seller and the buyer, for a total of 4 features. The degree centrality of a node is the count of all its incoming and outgoing unique links 6 , and its PageRank centrality measures the stationary probability that a random walk on the network ends up in that node 7 .
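Both centrality scores can be computed directly with networkx; the toy directed trader network below is illustrative.

```python
# Degree centrality (count of unique incoming plus outgoing links) and
# PageRank centrality on a small illustrative directed trader network.
import networkx as nx

G = nx.DiGraph([("seller", "buyer1"), ("seller", "buyer2"), ("buyer1", "buyer2")])

k = dict(G.degree())              # in-links plus out-links per node
pr = nx.pagerank(G, alpha=0.85)   # stationary random-walk probability
```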
The second group includes the visual features of the object associated with the NFT, namely 5 PCA components extracted from the AlexNet vector of the object (PCA 1...5 ). We experimented with a number of components varying from 2 to 10, and results varied only slightly: fewer components caused a small decrease in the quality of the regression and prediction results, while additional components did not add any predictive power.
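A sketch of the dimensionality reduction step, assuming 4096-dimensional AlexNet activation vectors (a common choice for AlexNet embeddings; the source does not specify the layer). The random matrix stands in for the real embeddings.

```python
# Reduce AlexNet image embeddings to 5 PCA components (illustrative data).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 4096))   # one row per NFT object (stand-in)

pca = PCA(n_components=5)
visual_features = pca.fit_transform(embeddings)   # PCA 1...5 per object
```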
The third and last group includes two features to account for the previous sale history in the NFT's collection. The first is the median price of primary and secondary sales made in the collection of interest during a time window prior to t s . The second models the prior probability of secondary sale. We acknowledge that the likelihood that an NFT gets transacted in a secondary sale might depend on the collection it belongs to. For example, NFTs corresponding to collectible items from very popular collections may be more likely to be resold than an NFT serving a specific purpose, such as determining the ownership of a name server. We defined the probability of secondary sale, p resale , as 0.5 (random probability) when the NFT is the first to be sold in its collection; else, the probability of secondary sale is calculated as

p resale = (s + 0.5) / (n + 1),

where n represents the NFTs with a primary sale up to the day before the first purchase and s the number of these NFTs with at least one secondary sale. When the collection is large, the probability of secondary sale becomes p (n → +∞) = s/n and corresponds to the ratio between items with secondary sales over all items with one sale.

The frequency distributions of our features have different skews and ranges. To make them comparable and suitable for regression and prediction tasks, we first transform their values to make their distributions closer to a Normal distribution. Specifically, we calculate the logarithm of the network degree and the median sale price (after adding 1, so that zero values are preserved), and we apply a Box-Cox transformation 8 to the PageRank centrality and to p resale ; Box-Cox uses power functions to create a monotonic transformation that stabilizes variance and makes the data closer to a normal distribution. No transformation was needed for the PCA features. Last, we scale all the variables to the range [0, 1] (i.e., min-max scaling).
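The pre-processing pipeline can be sketched as follows, with illustrative values standing in for the real degree and PageRank distributions.

```python
# Feature pre-processing sketch: log(1 + x) for degree and median price,
# Box-Cox for PageRank and p_resale, then min-max scaling to [0, 1].
import numpy as np
from scipy.stats import boxcox
from sklearn.preprocessing import minmax_scale

degree = np.array([0, 1, 3, 10, 25, 60, 120, 250])                       # stand-in
pagerank = np.array([1e-4, 5e-4, 7e-4, 1e-3, 2e-3, 3e-3, 4e-3, 1e-2])    # stand-in

log_degree = np.log1p(degree)        # +1 preserves zero values
bc_pagerank, _ = boxcox(pagerank)    # requires strictly positive input

# min-max scale each feature column to [0, 1]
scaled = minmax_scale(np.column_stack([log_degree, bc_pagerank]))
```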

S1.5 Sale price regression
We perform linear regressions to estimate an NFT's primary and secondary sale prices. Linear regression is an approach for modeling a linear relationship between a dependent variable (secondary sale price, in our experiments) and a set of independent variables (features describing the NFT at the time it was first sold); it does so by associating a so-called β-coefficient with each independent variable such that the sum of all independent variables multiplied by their respective β-coefficients approximates the value of the dependent variable with minimal error. Specifically, we used an Ordinary Least Squares regression model to estimate coefficients such that the sum of the squared residuals between the estimation and the actual value is minimized.
We use the NFT features described in Section S1.4 as independent variables, and either the price of primary sale or the median secondary sale price calculated over a time window starting at t s as dependent variables. For secondary sale price, the results changed only slightly when using aggregations other than the median (e.g., mean, maximum). We experimented with different lengths of the time window, ranging from one week after the primary sale up to two years after. To make sure that the secondary sale price of each NFT was calculated over time windows of equal length, we excluded from the regression NFTs that were sold for the first time too recently, namely those NFTs whose t s was within one time window before the most recent timestamp in our dataset. In the regression, we considered only NFTs with at least one secondary sale in the time window considered.
We evaluated the goodness of the linear fit using the coefficient of determination R 2 , a score in the range [0, 1] that measures the proportion of the variance in the dependent variable that the linear model is able to predict from the independent variables. In particular, we used its 'adjusted' version R 2 adj , which discounts the effect of R 2 spuriously increasing as more independent variables are added to the model.
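An OLS fit and the adjustment formula can be sketched on synthetic data (the features and prices below are random stand-ins, not the paper's data): R 2 adj = 1 − (1 − R 2 )(n − 1)/(n − p − 1), where n is the number of samples and p the number of independent variables.

```python
# OLS regression with 11 features, plus the adjusted R^2 computed explicitly.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 11))                            # 11 NFT features (stand-in)
y = X @ rng.normal(size=11) + 0.1 * rng.normal(size=500)  # synthetic log-price

reg = LinearRegression().fit(X, y)   # least-squares fit of the beta-coefficients
r2 = reg.score(X, y)

n, p = X.shape
r2_adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)  # penalizes extra regressors
```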

S1.6 Secondary sale prediction
We performed a binary classification task to predict whether an NFT will be transacted in a secondary sale after its primary sale at time t s . We adopted a standard supervised learning approach. In supervised learning, instances in a dataset (the NFTs) are described with a number of features (those presented in Section S1.4) and marked with a target label (1 if the NFT was transacted in a secondary sale, 0 otherwise). A mathematical model learns a function that maps the features to the target label based on a number of training instances from the dataset. The performance of the model is later assessed on a test set of unseen instances. In our experiments, we emulate a prediction on future data based on past knowledge. To do so, we sort the NFTs according to their time of primary sale t s , and we use the first 95% of NFTs for training and the latest 5% for testing. Our dataset is sufficiently large that the test set, albeit small in relative terms, still includes tens of thousands of instances. Similar to the regression task, we consider multiple time windows of varying size to determine the target label (i.e., whether the NFT was resold or not), and we exclude from the dataset recent NFTs whose t s is within one time window before the last timestamp in our dataset.
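The time-ordered 95/5 split can be sketched as follows (timestamps are illustrative): sorting by t s guarantees that every test sale happens after every training sale, emulating prediction of future data from past knowledge.

```python
# Temporal 95/5 train-test split by primary-sale time (illustrative data).
import numpy as np

t_s = np.array([5, 1, 9, 3, 7, 2, 8, 4, 6, 10])   # primary-sale timestamps
order = np.argsort(t_s)                            # oldest sale first

cut = int(0.95 * len(order))
train_idx, test_idx = order[:cut], order[cut:]

# no training sale occurs after any test sale
assert t_s[train_idx].max() <= t_s[test_idx].min()
```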
There are several classes of models that can be used for supervised learning 9 . We pick AdaBoost 10 , an ensemble of weak learners (in our case, decision trees) whose outputs are combined into a single score through a weighted sum. In particular, we initialized the AdaBoost classifier with 100 decision tree stumps (i.e., trees of depth 1), and trained it with a learning rate of 1. Despite its relatively simple design, AdaBoost can achieve good performance compared to more complex models, and it effectively limits overfitting of the learned function on the training data.
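This configuration maps directly onto scikit-learn's AdaBoostClassifier, whose default weak learner is a depth-1 decision tree (a stump); the synthetic dataset below is a stand-in for the NFT features and labels.

```python
# AdaBoost with 100 decision stumps and learning rate 1 (illustrative data).
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=500, n_features=11, random_state=0)

clf = AdaBoostClassifier(n_estimators=100, learning_rate=1.0, random_state=0)
clf.fit(X, y)
train_acc = clf.score(X, y)
```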
The labels of our dataset are imbalanced: the number of negative labels is much higher than the number of positive ones (i.e., 80% of the NFTs in our dataset are not resold). Imbalanced datasets can affect the ability of the model to learn a function that effectively associates the correct label to both positive and negative instances. To mitigate this problem, we perform random oversampling 11 to balance the classes. Specifically, within the training set, we add multiple copies of positive samples picked at random until the sizes of the two classes are balanced. Compared to other oversampling techniques 12,13 , random oversampling does not generate synthetic data points, which may exhibit unrealistic features. By applying oversampling, we effectively set the model to assign higher importance to positive samples: misclassifying a positive instance causes a loss in performance that is proportional to the number of its replicas.
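Random oversampling amounts to duplicating minority-class indices at random until the classes are balanced; a minimal numpy sketch on illustrative labels:

```python
# Random oversampling of the positive (minority) class within a training set.
import numpy as np

rng = np.random.default_rng(0)
y_train = np.array([0] * 80 + [1] * 20)   # illustrative 80/20 imbalance

pos = np.flatnonzero(y_train == 1)
neg = np.flatnonzero(y_train == 0)
# draw duplicates of positive samples until both classes have equal size
extra = rng.choice(pos, size=len(neg) - len(pos), replace=True)

balanced_idx = np.concatenate([neg, pos, extra])
y_balanced = y_train[balanced_idx]
```

The same index array would be used to replicate the corresponding feature rows, so no synthetic feature values are created.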
To evaluate the performance on the test set, we measure two quantities. The first is the F1-score, namely the harmonic mean of the precision (fraction of instances classified as positive that are indeed positive) and the recall (fraction of positive instances that are correctly classified). The second is the "Area Under the ROC Curve" (AUC); it measures the ability of the model to correctly rank positive and negative samples by confidence score, independent of any fixed decision threshold. AUC is equal to 0.5 for a random classification and is equal to 1 for a perfect ranking.

Figure S5. Top: R 2 adj of a linear regression fit to predict (a) the price of primary sales, and (b) the median price of secondary sales 1 month after their respective primary sale from the historical median price of sale in the collection calculated over varying time windows (one week to two years) preceding the primary sale. Bottom: R 2 adj of a linear regression fit to predict (c) the price of secondary sales from the price of their respective primary sales, and (d) the price of secondary sales from the median price of sales in the NFT's collection in the previous week; we perform different regressions to predict the median price of secondary sales over varying time windows (one week to two years) after the primary sale. All results are broken down by NFT categories.

Figure S6. The vast majority of NFTs does not have a secondary sale. Fraction of NFTs that were sold in at least one secondary sale n days after their primary sale.

Figure S7. Result of predicting the existence of a secondary sale. F1 score of a binary classification task aimed at predicting whether an NFT will be sold in a secondary sale within 1 year after its primary sale. Results are broken down by different feature sets and NFT categories.

Figure S8. Classification task with different time windows. F1 score of a binary classification task aimed at predicting whether an NFT will be sold in a secondary sale within varying time windows after its primary sale. We used all available features for training and testing the models. Results are broken down by different NFT categories.
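Both evaluation metrics are available in scikit-learn; a sketch on illustrative predictions, where F1 uses hard labels and AUC uses the classifier's confidence scores:

```python
# F1-score (harmonic mean of precision and recall) and ROC AUC (ranking
# quality, threshold-independent), on illustrative test-set predictions.
from sklearn.metrics import f1_score, roc_auc_score

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1]               # hard labels for F1
y_score = [0.9, 0.2, 0.8, 0.4, 0.1, 0.6]  # confidence scores for AUC

f1 = f1_score(y_true, y_pred)       # precision = recall = 2/3 here, so F1 = 2/3
auc = roc_auc_score(y_true, y_score)
```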