Skip to main content

Bulk-synchronous parallelism

Bulk-synchronous parallelism (BSP) is a parallel computation framework, which encompasses algorithm theory, programming theory, software and hardware. I use the term BSP in a broad sense, including various similar models and frameworks, e.g. CGM. I am particularly interested in:
  • the design of efficient BSP algorithms;
  • advanced programming models based on BSP.

Below is a list of BSP resources that I find useful in my work.

Seminal papers

L. G. Valiant. A Bridging Model for Parallel Computation. Communications of the ACM, 33(8), pp. 103—111, August 1990. [DOI]

L. G. Valiant. General Purpose Parallel Architectures. In J. van Leeuwen, ed., Handbook of Theoretical Computer Science, pp. 943—971. Elsevier, 1990.

L. G. Valiant. Bulk-synchronous parallel computers. In M. Reeve, ed., Parallel Processing and Artificial Intelligence, pp. 15—22. John Wiley & Sons, 1989.

Introductory papers and tutorials

BSP computing by Bill McColl.

W. F. McColl. Scalable Computing. In J. van Leeuwen, ed., Computer Science Today: Recent Trends and Developments, number 1000 of Lecture Notes in Computer Science, pp. 41—61. Springer-Verlag, 1996. [DOI]

D. B. Skillicorn, J. M. D. Hill, and W. F. McColl. Questions and answers about BSP. Scientific Programming, 6(3): 249—274, Fall 1997. [CiteSeer]

W. F. McColl. General Purpose Parallel Computing. In A. Gibbons and P. Spirakis, eds., Lectures on parallel computation, vol. 4 of Cambridge International Series on Parallel Computation, pp. 337—391. Cambridge University Press, 1993. [CiteSeer]

D. Pountain. Parallel Processing in Bulk. BYTE, November 1996. [Publisher’s HTML]


R. Bisseling. Parallel Scientific Computation: A Structured Approach using BSP and MPI. Oxford University Press, 2004. [Publisher’s HTML]

Book chapters

J. Hill. Portability of Performance in the BSP Model. In K. Hammond and G. Michaelson, eds., Research Directions in Parallel Functional Programming, p. 267—287. Springer-Verlag, 1999.

D. B. Skillicorn. Predictable Parallel Performance: The BSP Model. In R. Correa et al., eds., Models for Parallel and Distributed Computation: Theory, Algorithmic Techniques and Applications, Vol. 67 of Applied Optimization series, p. 85—115. Kluwer Academic Publishers, 2002.

A. Ferreira and I. Guerin Lassous. Discrete Computing with CGM. In R. Correa et al., eds., Models for Parallel and Distributed Computation: Theory, Algorithmic Techniques and Applications, Vol. 67 of Applied Optimization series, p. 117—143. Kluwer Academic Publishers, 2002.

S. W. Song. Parallel Graph Algorithms for Coarse-Grained Multicomputers. In R. Correa et al., eds., Models for Parallel and Distributed Computation: Theory, Algorithmic Techniques and Applications, Vol. 67 of Applied Optimization series, p. 147—178. Kluwer Academic Publishers, 2002.

A. Tiskin. Bulk-synchronous parallelism: An emerging paradigm of high-performance computing. In L. T. Yang and M. Guo, eds., High Performance Computing: Paradigm and Infrastructure, pp. 69—80. John Wiley and Sons, 2005.

Doctorate theses

D. Lecomber. Methods of BSP Programming. University of Oxford, 1998.

M. Marin. Discrete-Event Simulation on the Bulk-Synchronous Parallel Model. University of Oxford, 1998. [CiteSeer]

R. Calinescu. Architecture-independent Loop Parallelisation. University of Oxford, 1999. Also published by Springer-Verlag, 2000.

A. Tiskin. The Design and Analysis of Bulk-Synchronous Parallel Algorithms. University of Oxford, 1999. [Local PS]

M. Alves de Inda. Constructing Parallel Algorithms for Discrete Transforms. Universiteit Utrecht, 2000.

M. Bamha. Parallelism and load balancing in the treatment of join and multi-join in SN architectures (in French). Universite d’Orleans, 2000. [HTML+PS]

F. Loulergue. Design of functional languages for massively parallel programming (in French). Universite d’Orleans, 2000. [PDF]

H. Mongelli. CGM Algorithms for One- and Two-dimensional Pattern Matching with and without Scaling (in Portuguese). Universidad de Sao Paulo, 2000. [HTML+PS]

A. Zavanella. Skeletons and BSP: Performance Portability for Parallel Programming. Universita di Pisa, 2000.

T. L. Williams. A General-Purpose Model for Heterogeneous Computation. University of Central Florida, 2000. [CiteSeer]

Chun-Hsi Huang. Communication-Efficient Bulk Synchronous Parallel Algorithms. The State University of New York at Buffalo, 2001. [CiteSeer]

C. E. R. Alves. Coarse-grained Parallel Algorithms for String Alignment Problems (in Portuguese). Universidad de Sao Paulo, 2002. [HTML+PS]

A. H. Gebremedhin. Practical Parallel Algorithms for Graph Coloring Problems in Numerical Optimization. University of Bergen, 2003. [HTML+PS]

T. Garcia. String matching algorithms: from systolic model to CGM model (in French). Universite de Picardie Jules Verne, 2003. [HTML+PS+PDF]

F. Abu Salem. Factorisation Algorithms for Univariate and Bivariate Polynomials over Finite Fields. University of Oxford, 2004. [HTML+PS]

F. Gava. Functional Approach to Parallel Programming and Meta-Computing: Semantics, Implementation and Cerification (in French). Universite Paris XII Val de Marne, 2005. [HTML+PS+PDF]

BSP algorithms

Basic BSP algorithms

B. Juurlink, P. Kolman, F. Meyer auf der Heide, and I. Rieping. Optimal Broadcast on Parallel Locality Models. Journal of Discrete Algorithms, 1(2): 151—166, 2003. [DOI]

J. Sibeyn. List-ranking on interconnection networks. Information and Computation, 181(2): 75—87, 2003. [DOI]

Advanced BSP algorithms

References to current work on BSP algorithms can be found in my publications. Some people active in BSP algorithm research (not an exhaustive list): Rob Bisseling, Edson Caceres, Frank Dehne, Alex Gerbessiotis, Friedhelm Meyer auf der Heide, Andrea Pietracaprina, Geppino Pucci, Andrew Rau-Chaplin, Jop Sibeyn, Siang Wun Song and others.

BSP computation in the wider algorithmic context

F. Dehne, W. Dittrich, D. Hutchinson, and A. Maheshwari. Bulk Synchronous Parallel Algorithms for the External Memory Model. Theory of Computing Systems, 35: 567—597, 2002. [DOI]

F. Dehne, W. Dittrich, and D. Hutchinson. Efficient external memory algorithms by simulating coarse-grained parallel algorithms. Algorithmica, 36: 97—122, 2003. [DOI]

BSP algorithm engineering

Project Edupack: dense matrix algebra, Fast Fourier Transform, sparse matrix algebra.

Project CGMLab, includes project CGMlib/CGMgraph: prefix sums, sorting, minimum spanning tree, connectivity, other.

Project parXXL: BSP-style cellular networks. Succeeds project SSCRAP: prefix sums, sorting, list contraction.

Alex Gerbessiotis’s Parallel Software: broadcast, prefix sums, sorting, dense matrix algebra, other.

Project NestStep: prefix sums, sorting.

BSP programming

The BSPlib standard and its predecessors

J. M. D. Hill, W. F. McColl, D. C. Stefanescu, M. W. Goudreau, K. Lang, S. B. Rao, T. Suel, T. Tsantilas, and R. H. Bisseling. BSPlib: The BSP programming library. Parallel Computing, 24(14): 1947—1980, December 1998. [DOI]

The Oxford BSP Toolset

M. W. Goudreau, K. Lang, S. B. Rao, T. Suel and T. Tsantilas. Portable and efficient parallel computing using the BSP model. IEEE Transactions on Computers, 48(7): 670—689, July 1999. [DOI]

O. Bonorden, B. Juurlink, I. von Otte, and I. Rieping. The Paderborn University BSP (PUB) library. Parallel Computing, 29(2): 187—207, February 2003. [DOI]

Project PUB

Beyond BSPlib

K. Hinsen. High-level parallel software development with Python and BSP. Parallel Processing Letters, 13(3): 473—484, 2003. [DOI]

Project ScientificPython

C. W. Kessler. NestStep: Nested Parallelism and Virtual Shared Memory for the BSP Model. Journal of Supercomputing, 17(3): 245—262, 2000. [DOI]

Project NestStep

Frederic Loulergue’s projects

Project ZPL

BSP and Grid computing

Project InteGrade

Project GridNestStep


BSP Worldwide

BSP resources by Bill McColl