# Working with percolation clusters in Python

ma. 25 mars 2013We're working on a new project in FYS4460 about percolation. In the introduction of this project, we are given a few commands to help us demonstrate a few properties of percolation clusters using MATLAB.

As the Python-fan I am, I of course had to see if I could find equivalent commands in Python, and thankfully that was quite easy. Below I will summarize the commands that will generate a random matrix of filled and unfilled areas, label each cluster in this matrix and calculate the area of each such cluster. Finally, we'll draw a bounding box around the largest cluster.

First of all, we need to include the **pylab** package together with the
**scipy.ndimage.measurements** package:

```
from pylab import *
from scipy.ndimage import measurements
```

Now, let's create a random matrix of filled and unfilled regions and visualize it:

```
L = 100
r = rand(L,L)
p = 0.4
z = r < p
imshow(z, origin='lower', interpolation='nearest')
colorbar()
title("Matrix")
show()
```

This creates a matrix of random numbers **r** before creating a map
**z** of this matrix where each element is set to True if **r** < p
and False if **r** > p:

[caption id="attachment_852" align="aligncenter" width="300"]{.size-medium .wp-image-852 width="300" height="225"} Visualization of the map z.[/caption]

Furthermore we want to label each cluster. This is performed by the scipy.ndimage.measurements.label function:

```
lw, num = measurements.label(z)
imshow(lw, origin='lower', interpolation='nearest')
colorbar()
title("Labeled clusters")
show()
```

This labels each cluster with a number which may be visualized using the imshow function as above:

[caption id="attachment_872" align="aligncenter" width="300"]{.size-medium .wp-image-872 width="300" height="225"} Matrix with labeled clusters, so it is a bit hard to distinguish each cluster.[/caption]

Beacause the label starts from the bottom up, the cluster colors are a bit too ordered, making it hard to distinguish two clusters close to each other. To fix this, we may shuffle the labeling with the following code:

```
b = arange(lw.max() + 1)
shuffle(b)
shuffledLw = b[lw]
imshow(shuffledLw, origin='lower', interpolation='nearest')
colorbar()
title("Labeled clusters")
show()
```

Now it is way easier to see the different clusters:

[caption id="attachment_871" align="aligncenter" width="300"]{.size-medium .wp-image-871 width="300" height="225"} The same matrix with all clusters labeled with a number.[/caption]

After this we may want to extract some properties of the clusters, such as the area. This is possible using the scipy.ndimage.measurements.sum function:

```
area = measurements.sum(z, lw, index=arange(lw.max() + 1))
areaImg = area[lw]
im3 = imshow(areaImg, origin='lower', interpolation='nearest')
colorbar()
title("Clusters by area")
show()
```

The above code now plots the same matrix as above, but this time with all clusters colored by area:

[caption id="attachment_873" align="aligncenter" width="300"]{.size-medium .wp-image-873 width="300" height="225"} Matrix with the clusters colored by area.[/caption]

Finally, we may want to find the bounding box of the largest cluster, so we again may see if there is a path from one side to the other. This is possible by using the function scipy.ndimage.measurements.find_objects. Note however that this is a bit risky. If there are two clusters that are largest with the same area, find_objects will find the bounding box of both clusters. I'll leave it up to you to figure out a method to avoid this if necessary. (Pro tip: Make a mapping of the cluster labels to the areas.)

With the find_objects function, we are able to plot a rectangle around the largest cluster:

```
sliced = measurements.find_objects(areaImg == areaImg.max())
if(len(sliced) > 0):
sliceX = sliced[0][1]
sliceY = sliced[0][0]
plotxlim=im3.axes.get_xlim()
plotylim=im3.axes.get_ylim()
plot([sliceX.start, sliceX.start, sliceX.stop, sliceX.stop, sliceX.start],
[sliceY.start, sliceY.stop, sliceY.stop, sliceY.start, sliceY.start],
color="red")
xlim(plotxlim)
ylim(plotylim)
show()
```

This shows up like this:

[caption id="attachment_874" align="aligncenter" width="300"]{.size-medium .wp-image-874 width="300" height="225"} A bounding box on top of the largest cluster. The rest are colored by area.[/caption]

The whole program then becomes as follows:

```
from pylab import *
from scipy.ndimage import measurements
L = 100
r = rand(L,L)
p = 0.4
z = r<p
figure(figsize=(16,5))
subplot(1,3,1)
imshow(z, origin='lower', interpolation='nearest')
colorbar()
title("Matrix")
# Show image of labeled clusters (shuffled)
lw, num = measurements.label(z)
subplot(1,3,2)
b = arange(lw.max() + 1) # create an array of values from 0 to lw.max() + 1
shuffle(b) # shuffle this array
shuffledLw = b[lw] # replace all values with values from b
imshow(shuffledLw, origin='lower', interpolation='nearest') # show image clusters as labeled by a shuffled lw
colorbar()
title("Labeled clusters")
# Calculate areas
subplot(1,3,3)
area = measurements.sum(z, lw, index=arange(lw.max() + 1))
areaImg = area[lw]
im3 = imshow(areaImg, origin='lower', interpolation='nearest')
colorbar()
title("Clusters by area")
# Bounding box
sliced = measurements.find_objects(areaImg == areaImg.max())
if(len(sliced) > 0):
sliceX = sliced[0][1]
sliceY = sliced[0][0]
plotxlim=im3.axes.get_xlim()
plotylim=im3.axes.get_ylim()
plot([sliceX.start, sliceX.start, sliceX.stop, sliceX.stop, sliceX.start],
[sliceY.start, sliceY.stop, sliceY.stop, sliceY.start, sliceY.start],
color="red")
xlim(plotxlim)
ylim(plotylim)
show()
```