Parallel family trees for transfer matrices in the Potts model
Author
dc.contributor.author
Navarro, Cristóbal A.
Author
dc.contributor.author
Canfora, Fabrizio
Author
dc.contributor.author
Hitschfeld Kahler, Nancy
Author
dc.contributor.author
Navarro, Gonzalo
Admission date
dc.date.accessioned
2015-07-09T19:37:16Z
Available date
dc.date.available
2015-07-09T19:37:16Z
Publication date
dc.date.issued
2014-10-18
Cita de ítem
dc.identifier.citation
Computer Physics Communications 187 (2015) 55–71
en_US
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/131905
General note
dc.description
Artículo de publicación ISI
en_US
Abstract
dc.description.abstract
The computational cost of transfer matrix methods for the Potts model is related to the question in how
many ways can two layers of a lattice be connected? Answering the question leads to the generation of a
combinatorial set of lattice configurations. This set defines the configuration space of the problem, and
the smaller it is, the faster the transfer matrix can be computed. The configuration space of generic (q, v)
transfer matrix methods for strips is in the order of the Catalan numbers, which grows asymptotically
as O(4m) where m is the width of the strip. Other transfer matrix methods with a smaller configuration
space indeed exist but they make assumptions on the temperature, number of spin states, or restrict the
structure of the lattice. In this paper we propose a parallel algorithm that uses a sub-Catalan configuration
space of O(3m) to build the generic (q, v) transfer matrix in a compressed form. The improvement is
achieved by grouping the original set of Catalan configurations into a forest of family trees, in such a way
that the solution to the problem is now computed by solving the root node of each family. As a result, the
algorithm becomes exponentially faster than the Catalan approach while still highly parallel. The resulting
matrix is stored in a compressed form using O(3m × 4m) of space, making numerical evaluation and
decompression to be faster than evaluating the matrix in its O(4m×4m) uncompressed form. Experimental
results for different sizes of strip lattices show that the parallel family trees (PFT) strategy indeed runs
exponentially faster than the Catalan Parallel Method (CPM), especially when dealing with dense transfer
matrices. In terms of parallel performance, we report strong-scaling speedups of up to 5.7×when running
on an 8-core shared memory machine and 28× for a 32-core cluster. The best balance of speedup and
efficiency for the multi-core machine was achieved when using p = 4 processors, while for the cluster
scenario it was in the range p ∈ [8, 10]. Because of the parallel capabilities of the algorithm, a large-scale
execution of the parallel family trees strategy in a supercomputer could contribute to the study of wider
strip lattices.