Wasserstein dictionary learning is an unsupervised approach to learning a collection of probability distributions that generate observed distributions as Wasserstein barycentric combinations. Existing methods for Wasserstein dictionary learning optimize an objective that seeks a dictionary with sufficient representation capacity, via barycentric interpolation, to approximate the observed training data, but they impose no additional structure on the coefficients associated with the dictionary. This yields a dense representation of the observed data, which makes the coefficients difficult to interpret and can lead to poor empirical performance when the learned coefficients are used in downstream tasks. In contrast, motivated by sparse dictionary learning in Euclidean space, we propose a geometrically sparse regularizer for Wasserstein space that promotes representation of each data point using only nearby dictionary elements. We show that this approach leads to sparse representations in Wasserstein space and addresses the problem of non-uniqueness of barycentric representations. Moreover, when the data are generated as Wasserstein barycenters of fixed distributions, this regularizer facilitates recovery of the generating distributions in settings where unregularized Wasserstein dictionary learning cannot. Through experiments on synthetic and real data, we show that our geometrically regularized approach yields sparser, more interpretable dictionaries in Wasserstein space that perform better in downstream applications.
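To make the idea of geometric sparsity concrete, one hedged way to formalize such an objective is sketched below; the barycenter operator $B$, the trade-off weight $\rho$, and the exact form of the penalty are illustrative assumptions rather than the precise formulation adopted in this work. Given observed distributions $\mu_1,\dots,\mu_N$, a dictionary $\mathcal{D}=\{\nu_1,\dots,\nu_K\}$, and simplex-constrained coefficients $\lambda^{(i)}\in\Delta^{K}$, one could consider
\[
\min_{\mathcal{D},\,\{\lambda^{(i)}\}}\;
\sum_{i=1}^{N} W_2^2\!\bigl(B(\lambda^{(i)};\mathcal{D}),\,\mu_i\bigr)
\;+\;\rho \sum_{i=1}^{N}\sum_{k=1}^{K} \lambda^{(i)}_{k}\, W_2^2(\mu_i,\nu_k),
\]
where $B(\lambda;\mathcal{D})$ denotes the Wasserstein barycenter of the atoms $\nu_k$ with weights $\lambda$. The first term is the standard barycentric reconstruction error, while the second term penalizes coefficient mass placed on atoms far from $\mu_i$ in Wasserstein distance, encouraging each datum to be represented only by nearby dictionary elements.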