Fig. 2: Comparison of typical features in training. | Communications Physics

From: Mode-assisted unsupervised learning of restricted Boltzmann machines

Training performance of contrastive divergence (CD) with k = 1 Markov chain step (blue) compared with mode-assisted training (orange) across 25 randomly generated 6 × 6 restricted Boltzmann machines with a random uniform data set of size n_d = 10. a, b Kullback-Leibler (KL) divergence as a function of training iteration for CD and mode-assisted training, respectively. Solid curves show the median KL divergence; the shaded region spans the minimum and maximum KL divergence at each point in training. The dotted line in b shows the mode sampling probability, P_mode, used in mode-assisted training. c, d Median log-differences in probability between the data (q_i) and model (p_i) distributions for CD and mode-assisted training, respectively. In both cases, the learning rate was held constant at ϵ_CD = 0.05 for 100,000 iterations.
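For a machine as small as the 6 × 6 case in this figure, the KL divergence between the data distribution q and the model distribution p can be evaluated exactly by enumerating all 2^6 visible states. The following is a minimal sketch of that calculation, not the authors' code. It assumes binary {0, 1} units, the standard RBM free energy, a data set made of 10 distinct visible states with equal weight, and small Gaussian initial weights. The function names and the sign convention of the log-difference (log q minus log p) are illustrative choices rather than details taken from the caption.

```python
import itertools
import math
import random


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def free_energy(v, W, a, b):
    """Free energy F(v) of a binary RBM; p(v) is proportional to exp(-F(v))."""
    visible_term = sum(ai * vi for ai, vi in zip(a, v))
    hidden_term = 0.0
    for j in range(len(b)):
        # Total input to hidden unit j for the visible state v.
        x = b[j] + sum(v[i] * W[i][j] for i in range(len(v)))
        hidden_term += softplus(x)
    return -visible_term - hidden_term


def model_distribution(W, a, b):
    """Exact p(v) over all visible states, by brute-force enumeration."""
    states = list(itertools.product((0, 1), repeat=len(a)))
    neg_f = [-free_energy(v, W, a, b) for v in states]
    # Log-sum-exp for the log partition function.
    m = max(neg_f)
    log_z = m + math.log(sum(math.exp(x - m) for x in neg_f))
    return {v: math.exp(x - log_z) for v, x in zip(states, neg_f)}


def kl_divergence(q, p):
    """KL(q || p) summed over the support of q."""
    return sum(qv * math.log(qv / p[v]) for v, qv in q.items() if qv > 0)


random.seed(0)
n_visible, n_hidden, n_data = 6, 6, 10

# Assumption: small random Gaussian weights and zero biases at initialization.
W = [[random.gauss(0.0, 0.1) for _ in range(n_hidden)] for _ in range(n_visible)]
a = [0.0] * n_visible
b = [0.0] * n_hidden

# Assumption: the data set is 10 distinct visible states, each with weight 1/10.
all_states = list(itertools.product((0, 1), repeat=n_visible))
data = random.sample(all_states, n_data)
q = {v: 1.0 / n_data for v in data}

p = model_distribution(W, a, b)
kl = kl_divergence(q, p)

# Per-data-point log-difference; the sign convention here is an assumption.
log_diff = {v: math.log(q[v]) - math.log(p[v]) for v in data}
```

Recomputing `kl` after each parameter update would give a curve of the kind plotted in panels a and b. The enumeration costs 2^6 free-energy evaluations per call, which is only practical for machines of roughly this size.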
