10. parallel

Functions for parallel computation on a single multi-core machine using the standard library multiprocessing.

This is not about programming details, but about how to speed up some computations.

  • If your computation is already fast (e.g. <1 s), go on without parallelisation. In the optimal case the speedup equals the number of CPU cores.

  • If you want to use a cluster with all its CPUs, this is not the way (you need MPI).

Parallelisation is no magic, and this module is a convenience for non-specialists in parallel computing. The main idea is to pass additional parameters to the processes (a pool of workers) and loop only over one parameter given as a list. Opening and closing of the pool is hidden inside the function. In this way we can use all CPUs of a multi-core machine.
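A minimal sketch of the pattern these functions wrap (not the actual implementation), using only the standard library; model and its parameters are placeholders:

from functools import partial
from multiprocessing import Pool

def model(q, a, b):
    # placeholder computation: one loop value q, fixed parameters a, b
    return q * a + b

if __name__ == '__main__':
    qlist = range(100)
    # fix the additional parameters; the pool loops in parallel only over q
    with Pool() as pool:
        results = pool.map(partial(model, a=1.0, b=2.0), qlist)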

During testing I found that shared memory does not really speed things up if we just want to calculate a function, e.g. for a list of different Q values depending on model parameters. Here the pickling of numpy arrays is efficient enough compared to the computation we do. The amount of data pickled should not be too large, as each process gets a copy and pickling needs time.

If speed is an issue and shared memory gets important, I advise using Fortran with OpenMP, as done for ff.cloudScattering with parallel computation and shared memory. For me this was easier than the various solutions around.

We use here only unmodified input data and return a new dataset, so we don't need to care about what happens if one process changes data needed in another process (race conditions, …); the data is not shared anyway. Please keep this in mind and don't complain if you find a way to modify input data.

For easier debugging (to find the position of an error in the pdb debugger) use the option debug. In this case multiprocessing is not used and the debugger finds the error correctly.
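For example, with a function f as in the doForList examples below:

import numpy as np
import jscatter as js

def f(x, a, b, c, d):
    return [x, x+a+b+c+d]

# debug > 0 runs serially; pdb then reports the real error location
res = js.parallel.doForList(f, np.arange(10), a=1, b=2, c=3, d=11, debug=1)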

See example in doForList.


Parallel functions

doForList(funktion, looplist, *args, **kwargs)

Apply a function to the values in looplist using a pool of parallel workers (multiprocessing).

doForQlist(funktion, qList, *args, **kwargs)

Calculates the function for all values in qList using a pool of workers (multiprocessing).

psphereAverage(funktion[, relError])

Parallel evaluation of spherical average of function.

Helper functions

randomPointsOnSphere(NN[, r, skip])

NN quasi-random points on a sphere of radius r, based on a low-discrepancy sequence.

randomPointsInCube(NN[, skip, dim])

NN quasi-random points in a cube of edge 1, based on a low-discrepancy sequence.

rphitheta2xyz(RPT)

Transformation of spherical coordinates [r,phi,theta] to cartesian coordinates [x,y,z].

fibonacciLatticePointsOnSphere(NN[, r])

Fibonacci lattice points on a sphere with radius r (default r=1)

haltonSequence(size, dim[, skip])

Quasi-random numbers from the Halton sequence in the interval [0,1].



jscatter.parallel.doForList(funktion, looplist, *args, **kwargs)[source]

Apply a function to the values in looplist using a pool of parallel workers (multiprocessing).

Like multiprocessing map_async, but automatically distributes all given arguments.

Parameters
funktion : function

Function to process with arguments (args, loopover[i]=looplist[j,i], kwargs). The return value of funktion should contain the parameters, or at least the loopover value, to allow a check if desired.

loopover : list of string, default None

Names of the arguments to loop over (in sync) with the values in looplist.
  • If not given, the first argument of funktion is used.
  • If loopover is a single argument, it gets looplist[i,:].

looplist : list or array N x len(loopover)

List of values to loop over.

ncpu : int, optional

Number of cpus in the pool.
  • not given or 0: all cpus are used
  • int > 0: use min(ncpu, mp.cpu_count()) cpus
  • int < 0: use all but |ncpu| cpus

cb : None or function

Callback after each calculation.

debug : int

debug > 0 runs the loop serially to allow testing and debugging.

output : bool

If False no output is shown.

Returns
list : list of function return values as [result1, result2, …]

The order of the return values is not explicitly synced to looplist.

Notes

The return array of function may be prepended with the value looplist[i] as reference. E.g.:

def f(x,a,b,c,d):
    result = x+a+b+c+d
    return [x, result]
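Because the order of the results is not synced to looplist, the prepended loop value can be used to restore it, e.g.:

res.sort(key=lambda r: r[0])   # sort results by the prepended loop value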

Examples

import jscatter as js
import numpy as np

def f(x,a,b,c,d):
   res=x+a+b+c+d
   return [x,res]

# loop over first argument, here x
res = js.parallel.doForList(f,looplist=np.arange(100),a=1,b=2,c=3,d=11)
# loop over 'd' ignoring the given d=11 (which can be omitted here)
res = js.parallel.doForList(f,looplist=np.arange(100),loopover='d',x=0,a=1,b=2,c=3,d=11)

# using a list of 2 values for x (is first argument)
def f(x,a,b,c,d):
   res=x[0]+x[1]+a+b+c+d
   return [x[0],res]
loop = np.arange(100).reshape(-1,2)  # has 2 values in second dimension
res = js.parallel.doForList(f,looplist=loop,a=1,b=2,c=3,d=11)

# looping over several variables in sync
loop = np.arange(100).reshape(-1,2)
res = js.parallel.doForList(f,looplist=loop,loopover=['a','b'],x=[100,200],a=1,b=2,c=3,d=11)
jscatter.parallel.doForQlist(funktion, qList, *args, **kwargs)[source]

Calculates the function for all values in qList using a pool of workers (multiprocessing).

Calculates [funktion(Qi, *args, **kwargs) for Qi in qList] in parallel. The return value of funktion will be prepended with the value Qi as reference.

Parameters
funktion : function

Function to process with arguments (qList[i], args, kwargs).

qList : list

List of values for the first argument of funktion. The qList value prepends the arguments args.

ncpu : int, optional

Number of cpus in the pool.
  • not given or 0: all cpus are used
  • int > 0: use min(ncpu, mp.cpu_count()) cpus
  • int < 0: use all but |ncpu| cpus
cb : function, optional

Callback after each calculation.

debug : int

debug > 0 runs the loop serially to allow testing and debugging.

Returns
list : ndim = function_return.ndim + 1

The list elements are prepended with the value qList[i] as reference.

Examples

import jscatter as js
import numpy as np

def f(x,a,b,c,d):
    return [x+a+b+c+d]

# loop over first argument, here x
js.parallel.doForQlist(f, qList=np.arange(100), a=1, b=2, c=3, d=11)
jscatter.parallel.fibonacciLatticePointsOnSphere(NN, r=1)[source]

Fibonacci lattice points on a sphere with radius r (default r=1)

This can be used to integrate efficiently over a sphere with well distributed points.

Parameters
NN : integer

Number of points = 2*NN+1.

r : float, default 1

Radius of sphere.

Returns
list of [r,phi,theta] pairs in radians

phi : azimuth, -pi < phi < pi; theta : polar angle, 0 < theta < pi.

References

[1] Á. González, Measurement of Areas on a Sphere Using Fibonacci and Latitude–Longitude Lattices, Mathematical Geosciences 42, 49-64 (2009).

Examples

import jscatter as js
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
points=js.formel.fibonacciLatticePointsOnSphere(1000)
pp=list(filter(lambda a:(a[1]>0) & (a[1]<np.pi/2) & (a[2]>0) & (a[2]<np.pi/2),points))
pxyz=js.formel.rphitheta2xyz(pp)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(pxyz[:,0],pxyz[:,1],pxyz[:,2],color="k",s=20)
ax.set_xlim([-1,1])
ax.set_ylim([-1,1])
ax.set_zlim([-1,1])
ax.set_aspect("equal")
plt.tight_layout()
plt.show(block=False)

points=js.formel.fibonacciLatticePointsOnSphere(1000)
pp=list(filter(lambda a:(a[2]>0.3) & (a[2]<1) ,points))
v=js.formel.rphitheta2xyz(pp)
R=js.formel.rotationMatrix([1,0,0],np.deg2rad(-30))
pxyz=np.dot(R,v.T).T
# rotated points in polar coordinates
prpt=js.formel.xyz2rphitheta(pxyz)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(pxyz[:,0],pxyz[:,1],pxyz[:,2],color="k",s=20)
ax.set_xlim([-1,1])
ax.set_ylim([-1,1])
ax.set_zlim([-1,1])
ax.set_aspect("equal")
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.tight_layout()
plt.show(block=False)
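The lattice can also be used to approximate a spherical average directly; a minimal sketch (the exact average of z**2 over the unit sphere is 1/3):

import numpy as np
import jscatter as js
points = js.formel.fibonacciLatticePointsOnSphere(500)  # 2*500+1 points [r,phi,theta]
xyz = js.formel.rphitheta2xyz(points)                   # cartesian coordinates
# average of f(x,y,z) = z**2 over the well distributed points
print((xyz[:, 2]**2).mean())                            # ~ 0.3333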
jscatter.parallel.haltonSequence(size, dim, skip=0)[source]

Quasi-random numbers from the Halton sequence in the interval [0,1].

To use them as coordinate points transpose the array.
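A short sketch of the shape handling (inferred from the examples below, where one row per dimension is returned):

import jscatter as js
seq = js.parallel.haltonSequence(5, 2)   # shape (2, 5): one row per dimension
points = seq.T                           # shape (5, 2): one [x, y] point per row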

Parameters
size : int

Number of samples from the sequence.

dim : int

Number of dimensions.

skip : int

Number of points to skip in the Halton sequence.

Returns
array

Notes

The visual difference between pseudorandom and random in 2D. See [2] for more details.

[figure: comparisonRandom-Pseudorandom]

References

[1] https://mail.python.org/pipermail/scipy-user/2013-June/034741.html (Sebastien Paris, Josef Perktold; translation from C)

[2] https://en.wikipedia.org/wiki/Low-discrepancy_sequence

Examples

import jscatter as js
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
for i,color in enumerate(['b','g','r','y']):
   # continue the Halton sequence via skip and scale points from [0,1] to [-1,1]
   pxyz=js.parallel.haltonSequence(400,3,skip=400*i).T*2-1
   ax.scatter(pxyz[:,0],pxyz[:,1],pxyz[:,2],color=color,s=20)
ax.set_xlim([-1,1])
ax.set_ylim([-1,1])
ax.set_zlim([-1,1])
ax.set_aspect("equal")
plt.tight_layout()
plt.show(block=False)
jscatter.parallel.psphereAverage(funktion, relError=300, *args, **kwargs)[source]

Parallel evaluation of spherical average of function.

A Fibonacci lattice or Monte Carlo integration with a pseudo random grid is used.

Parameters
funktion : function

Function to evaluate. The first argument of funktion gets the cartesian coordinates [x,y,z] of a point on the unit sphere.

relError : float, default 300

Determines how points on the sphere are selected.
  • relError > 1: Fibonacci lattice with relError*2+1 points.
  • 0 < relError < 1: pseudo random points on the sphere (see randomPointsOnSphere). Stops if the relative improvement of the mean is less than relError (using steps of 40 new points). The final error is (stddev of N points)/sqrt(N), as for Monte Carlo methods, even if it is not a correct 1-sigma error in this case.

args, kwargs :

Forwarded to funktion.

Returns
Array-like with the values from funktion and an appended error.

Notes

  • Works also on single core machines.

  • For integration over a continuous function, such as a form factor in scattering, the random points are not statistically independent: neighbouring points on an isosurface are correlated, so the standard deviation is biased. In this case the Fibonacci lattice is the better choice, as the standard deviation of a random sample is then not a measure of error but rather a measure of the variation over the isosurface.

Examples

import jscatter as js

def f(x,r):
    return [js.formel.xyz2rphitheta(x)[1:].sum()*r]

js.parallel.psphereAverage(f,relError=500,r=1)
js.parallel.psphereAverage(f,relError=0.01,r=1)
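The appended error can be separated from the function values (assuming the error is the last element, as described under Returns):

res = js.parallel.psphereAverage(f, relError=0.01, r=1)
values, error = res[:-1], res[-1]   # function values and appended error estimate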
jscatter.parallel.randomPointsInCube(NN, skip=0, dim=3)[source]

NN quasi-random points in a cube of edge 1, based on a low-discrepancy sequence.

For numerical integration quasi-random numbers are better than random samples, as the error drops faster [1]. Here the Halton sequence is used to generate the points. Skipping points makes the sequence additive without repeating points, as the sketch below demonstrates.
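A short sketch of the additivity via skip (assuming one point per row, as in the Returns description):

import numpy as np
import jscatter as js
a = js.parallel.randomPointsInCube(100)            # 100 points at once
b = js.parallel.randomPointsInCube(50)             # first 50 points
c = js.parallel.randomPointsInCube(50, skip=50)    # next 50 points, no repeats
print(np.allclose(a, np.vstack([b, c])))           # True: the sequence is additive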

Parameters
NN : int

Number of points to generate.

skip : int

Number of points to skip in the Halton sequence.

dim : int, default 3

Dimension of the cube.

Returns
array of [x,y,z]

References

[1] https://en.wikipedia.org/wiki/Low-discrepancy_sequence

Examples

The visual difference between pseudorandom and random in 2D. See [1] for more details.

import jscatter as js
import matplotlib.pyplot as pyplot
fig = pyplot.figure(figsize=(10, 5))
fig.add_subplot(1, 2, 1, projection='3d')
fig.add_subplot(1, 2, 2, projection='3d')
js.sf.randomLattice([2,2],3000).show(fig=fig, ax=fig.axes[0])
fig.axes[0].set_title('random lattice')
js.sf.pseudoRandomLattice([2,2],3000).show(fig=fig, ax=fig.axes[1])
fig.axes[1].set_title('pseudo random lattice \n less holes more homogeneous')
fig.axes[0].view_init(elev=85, azim=10)
fig.axes[1].view_init(elev=85, azim=10)
#fig.savefig(js.examples.imagepath+'/comparisonRandom-Pseudorandom.jpg')
[figure: comparisonRandom-Pseudorandom]

Random cubes of random points in cube.

# random cubes of random points in cube
import jscatter as js
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
N=30
cubes=js.parallel.randomPointsInCube(20)*3
for i,color in enumerate(['b','g','r','y','k']*3):
   points=js.parallel.randomPointsInCube(N,skip=N*i).T
   pxyz=points*0.3+cubes[i][:,None]
   ax.scatter(pxyz[0,:],pxyz[1,:],pxyz[2,:],color=color,s=20)
ax.set_xlim([0,3])
ax.set_ylim([0,3])
ax.set_zlim([0,3])
ax.set_aspect("equal")
plt.tight_layout()
plt.show(block=False)
#fig.savefig(js.examples.imagepath+'/randomRandomCubes.jpg')
[figure: randomRandomCubes]
jscatter.parallel.randomPointsOnSphere(NN, r=1, skip=0)[source]

NN quasi-random points on a sphere of radius r, based on a low-discrepancy sequence.

For numerical integration quasi-random numbers are better than random samples, as the error drops faster [1]. Here the Halton sequence is used to generate the points. Skipping points makes the sequence additive without repeating points. A short check of the returned coordinates is shown below.
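A minimal sketch checking the returned [r,phi,theta] coordinates:

import numpy as np
import jscatter as js
points = js.parallel.randomPointsOnSphere(1000, r=2)
print(np.allclose(points[:, 0], 2))          # all radii equal r
xyz = js.formel.rphitheta2xyz(points)        # convert to cartesian coordinates
print(np.allclose(np.linalg.norm(xyz, axis=1), 2))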

Parameters
NN : int

Number of points to generate.

r : float

Radius of sphere.

skip : int

Number of points to skip in the Halton sequence.

Returns
array of [r,phi,theta] pairs in radians

References

[1] https://en.wikipedia.org/wiki/Low-discrepancy_sequence

Examples

A random sequence of points on sphere surface.

import jscatter as js
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
for i,color in enumerate(['b','g','r','y']):
   points=js.parallel.randomPointsOnSphere(400,skip=400*i)
   points=points[points[:,1]>0,:]
   pxyz=js.formel.rphitheta2xyz(points)
   ax.scatter(pxyz[:,0],pxyz[:,1],pxyz[:,2],color=color,s=20)
ax.set_xlim([-1,1])
ax.set_ylim([-1,1])
ax.set_zlim([-1,1])
fig.axes[0].set_title('random points on sphere (half shown)')
plt.tight_layout()
plt.show(block=False)
#fig.savefig(js.examples.imagepath+'/randomPointsOnSphere.jpg')
[figure: randomPointsOnSphere]
jscatter.parallel.rphitheta2xyz(RPT)[source]

Transformation of spherical coordinates [r,phi,theta] to cartesian coordinates [x,y,z].

Parameters
RPT : array Nx3

Array of dimension Nx3 with [r,phi,theta] coordinates:
r : float, length
phi : float, azimuth, -pi < phi < pi
theta : float, polar angle, 0 < theta < pi
Returns
Array with same dimension as RPT.
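A minimal usage sketch, assuming the standard physics convention x = r*sin(theta)*cos(phi), y = r*sin(theta)*sin(phi), z = r*cos(theta):

import numpy as np
import jscatter as js
rpt = np.array([[1, 0, np.pi/2]])        # r=1, phi=0, theta=pi/2
print(js.parallel.rphitheta2xyz(rpt))    # ~ [[1, 0, 0]], a point on the x axis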