Communication in parallel simulations
In parallel simulations, communication between the cpus may be a challenge. In most cases, this is handled automatically by Asap, but sometimes you will need to communicate in your own modules. This is most likely to happen in on-the-fly data analysis.
This page falls into three parts. The first part explains how normal Asap communication works. The second explains how you can use these pre-existing communication patterns in your own module. The third explains how you can roll your own communication patterns from scratch, using the Message Passing Interface (MPI).
In the following, a task refers to the process running on each cpu/core (this is consistent with usual MPI terminology). Thus, if a calculation is running on eight cpus/cores, there will be eight tasks, each taking care of atoms belonging to one part of space. A neighboring task is a task responsible for an adjacent part of space.
Normal Asap communication
When an Asap calculator calculates forces, energies or stresses in a parallel simulation, the following communication steps occur:
Global communication of changes: Each task determines if the atoms have been changed, and if so, whether any atom has moved far enough to trigger a migration. This is then communicated globally, so that all tasks proceed to the following steps even if atoms have only been modified in some tasks.
Migration (not every time): If any atom has moved sufficiently outside the part of space handled by its task, it is necessary to migrate atoms. All data about atoms that should be handled by another task is moved to that task.
Ghost atoms update: Ghost atoms are atoms that are not in the part of space handled by this task, but are sufficiently close that they interact with atoms handled by this task. Before a calculation can proceed, the positions and atomic numbers of all ghost atoms must be transmitted from the task responsible for them. The calculation can now proceed.
Additional ghost update in some calculators: Some calculators need an additional communication step halfway through the calculation, where intermediate results for the ghost atoms are communicated. This is for example the case for EMT, but not for LennardJones.
Using Asap’s communication patterns
Migrating your own data
If you have your own data about the atoms, you will need to migrate that data along with the atoms as they migrate. Example: During an MD simulation, you calculate the average position of all the atoms. This average value is associated with the atoms, and needs to remain associated with the right atoms when atoms are migrating between processors.
Fortunately, Asap can do this for you. If you store data on the atoms with atoms.set_data(name, array), that data is associated with the atoms and will migrate along with them. Access the data with atoms.get_data(name). Just be sure that all such data is stored on the atoms whenever the calculator is invoked (and migration may happen): after a call to atoms.get_forces(), atoms.get_potential_energy() etc., any data in local variables of your script may no longer be associated with the right atoms.
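
As a sketch of the running-average example above (atoms is assumed to be a parallel Asap atoms object with an attached calculator, and dyn a hypothetical MD dynamics object; set_data and get_data are the hooks just described):

    import numpy as np

    # Store the running sum on the atoms, so that Asap migrates it
    # together with the atoms.
    atoms.set_data("possum", np.zeros_like(atoms.get_positions()))

    nsteps = 100
    for _ in range(nsteps):
        dyn.run(1)  # One MD step; atoms may migrate between tasks here.
        # Re-read the array *after* the step: the copy stored on the
        # atoms has been reordered and migrated with the atoms, while
        # any local variable from before the step has not.
        possum = atoms.get_data("possum") + atoms.get_positions()
        atoms.set_data("possum", possum)

    # Average position of each atom currently handled by this task.
    avgpos = atoms.get_data("possum") / nsteps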
Piggybacking on the ghost atom mechanism
If your on-the-fly analysis needs information not only about the atoms but also about their neighbors, difficulties arise, as the neighboring atoms may be on other processors. In some cases, Asap’s internal mechanisms may help.
If you need information about an atom’s neighbors at distances closer than the cutoff used in the calculator, you can access the calculator’s neighbor list object. This object will return information about neighbors even if they are on other processors. The positions are returned directly by the XXXX method, which also returns the indices of the atoms. If an index is larger than the number of atoms on this processor, it refers to a “ghost atom” on another processor. Data for the ghost atoms can be found in the atoms.ghosts dictionary, which normally contains ‘positions’ and ‘numbers’ (atomic numbers). To access e.g. the atomic numbers of the ghost atoms, use atoms.ghosts['numbers'][i - len(atoms)], where i is the index returned by the neighbor list.
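
As a sketch, a loop over neighbors might look like the following (nl is assumed to be the calculator’s neighbor list object, and get_neighbors(i) is a hypothetical name for the per-atom query mentioned above, assumed to return the neighbor indices together with their positions):

    # Look up the atomic number of every neighbor, including ghosts.
    nlocal = len(atoms)
    numbers = atoms.get_atomic_numbers()
    ghost_numbers = atoms.ghosts['numbers']

    for i in range(nlocal):
        # Hypothetical signature: indices and positions of the neighbors.
        indices, positions = nl.get_neighbors(i)
        for j in indices:
            if j < nlocal:
                z = numbers[j]                 # Neighbor on this task.
            else:
                z = ghost_numbers[j - nlocal]  # Ghost atom.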
Other kinds of data may be added to the atoms.ghosts dictionary. Add an array with the same name and type as an array in atoms.arrays, and with the appropriate shape (i.e. the first dimension should be the number of ghosts, and the remaining dimensions should be the same as in atoms.arrays). After the next force/energy calculation, the ghost data will be updated. If you add the ghost data before the first calculation, the first dimension of the array should be 0.
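
A sketch of registering such an array, using a hypothetical per-atom array named "myvalue" and assuming the parallel atoms object supports the standard ASE set_array call:

    import numpy as np

    # The local array must exist in atoms.arrays under the same name
    # and with the same dtype as the ghost array.
    atoms.set_array("myvalue", np.zeros(len(atoms)))

    # Before the first calculation there are no ghosts yet, so the
    # first dimension of the ghost array is zero.
    atoms.ghosts["myvalue"] = np.zeros(0)

    # The next force/energy calculation updates the ghost data.
    atoms.get_potential_energy()
    ghost_values = atoms.ghosts["myvalue"]  # One value per ghost atom.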
Beware that calling e.g. atoms.get_forces() does not guarantee an update of the ghost data if some other kind of calculation has been performed since the last time the atoms were moved. An update may be guaranteed by invalidating the neighbor list; this will also trigger a migration.
Communication using message passing

The module asap3.mpi exposes the default MPI communicator, world, which can be used to perform your own communication. The interface is the same as that of the corresponding module in GPAW; any divergences that may arise should be regarded as bugs.
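
A minimal example, to be launched with mpirun, that sums one number from each task using the interface described below:

    import numpy as np
    from asap3.mpi import world

    # Each task contributes its own rank; sum() combines the values so
    # that every task ends up with the total.
    data = np.array([world.rank], dtype=float)
    world.sum(data)  # Arrays are summed in place; all tasks get the result.
    print("Task %d of %d: total = %g" % (world.rank, world.size, data[0]))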
The MPI Communicator object

The MPI communicator object (e.g. world) has the following attributes:

size
    The number of MPI tasks.

rank
    The rank of this task.
In addition, it has the following methods:
send(a, dest, tag=123, block=True)
    Send the array a to task dest using tag tag. If block is True, a blocking send is used and the method returns None. If block is False, a non-blocking send is used, and the array a may not be modified until the communication has completed. The method returns a Request object that can be used to detect when this occurs (see the wait method of the communicator object). The Request object keeps a reference to the array a until the communication has completed.

receive(a, src=-1, tag=123, block=True)
    Receive data from task src into the array a; if src is -1, data is accepted from any task. Important: the array a must be contiguous, and of the correct type and size. This is not checked! If block is True, the method returns when the data has been received, and the return value is the task it was received from. If block is False, the method returns immediately, returning a Request object that can be used to query whether the communication has completed. The array a should not be accessed until the communication has completed.

sendreceive(a, dest, b, src, desttag=123, recvtag=123)
    Simultaneously send the array a to task dest and receive the array b from task src. The same effect may be obtained by overlapping a non-blocking send with a blocking receive or vice versa, but sendreceive may give better performance with some MPI implementations.

probe(src=-1, tag=-1, block=False)
    Check if data is available from task src with tag tag (-1 means any task/tag). If data is available, the method returns a tuple containing the number of bytes, the source and the tag. If no data is available, None is returned. If block is True, None is never returned; instead the method waits until data is available.

barrier()
    Wait until all the other tasks have also called this method.

wait(request)
    Wait until the non-blocking communication represented by the Request object request has completed. Equivalent to request.wait().

test(request)
    Test if a communication has completed; returns True if complete, False if not. Cleans up after the communication if it is complete, so you do not have to call wait(), but doing so is allowed. Equivalent to request.test().

waitall(requests)
    Takes a sequence (list, tuple or similar) of Request objects, and waits until all of them have completed.

testall(requests)
    Takes a sequence of Request objects and tests if they have all completed. If even one of them is still incomplete, False is returned.

sum(data, root=-1)
    Sums the scalar or array data over all tasks. If root is -1, all tasks receive the result; otherwise only task root receives it. If data is a scalar, the method returns the result; if it is an array, the array is overwritten with the result.

product(data, root=-1), min(data, root=-1) and max(data, root=-1)
    As sum(...), but performs a product, a minimum or a maximum operation instead of a sum. Unlike sum(...), these methods do not support complex data.

broadcast(data, root)
    Send the array data from task root to all other tasks. The array must have the same size and type on all tasks.

scatter(a, b, root)
    Send data from the array a on task root to the array b on all tasks, such that task N receives the Nth part of a. The size of the array a must thus be the size of b times the number of tasks.

all_gather(a, b)
    The data from the a arrays on all tasks are gathered into the b array. The size of the b array must be the size of a times the number of tasks. This version of the call updates b on all tasks.

gather(a, root, b=None)
    As all_gather, but the result is only available on task root.

new_communicator(ranks)
    Create a new communicator object containing the tasks listed in ranks.

name()
    Returns the processor name.

abort()
    Aborts all tasks. For emergency use only!
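
As an example of the point-to-point methods above, the following sketch passes a small array around a ring of tasks by overlapping a non-blocking send with a blocking receive:

    import numpy as np
    from asap3.mpi import world

    left = (world.rank - 1) % world.size
    right = (world.rank + 1) % world.size

    outgoing = np.array([float(world.rank)])
    incoming = np.zeros(1)

    # Non-blocking send to the right, blocking receive from the left.
    request = world.send(outgoing, right, tag=7, block=False)
    world.receive(incoming, left, tag=7)
    world.wait(request)  # Complete the send before reusing 'outgoing'.

    print("Task %d received %g from task %d" % (world.rank, incoming[0], left))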
Request object

The send and receive methods of the communicator object return a Request object when called with block=False. The Request object should be used to wait for the communication to complete, by calling its wait and/or test method.
The Request object has the following methods:

wait()
    Wait until the communication has completed, and release the reference to the array used for the communication.

test()
    Test if the communication has completed; returns True if complete, False if not. After True has been returned, the reference to the array used for the communication has been discarded.
It is safe to call wait() and test() multiple times on the same object. Once test() has returned True or wait() has been called, any subsequent call to test() returns True.
You should not discard a Request object until wait() has been called, test() has returned True, or you have by some other means ensured that the communication has completed. Failing to ensure this may cause your script to hang while the Request object is garbage collected, since collecting it waits for the communication to complete.
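
As a sketch, a Request object can be used to overlap communication with computation (do_local_work and process are hypothetical stand-ins for the work your script performs):

    import numpy as np
    from asap3.mpi import world

    incoming = np.zeros(1000)
    request = world.receive(incoming, src=-1, block=False)

    # Poll for completion while doing something useful in the meantime.
    while not request.test():
        do_local_work()  # Hypothetical: useful work while waiting.

    # test() returned True, so the communication is complete and the
    # reference to 'incoming' has been released; wait() is not needed.
    process(incoming)  # Hypothetical consumer of the received data.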