Is there some way to avoid the *for loops* and speed up the code below?

All arrays are NumPy arrays, and I can't figure out how to vectorize this operation:

```
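import numpy as np

# Y is a 2-D NumPy array of shape (rows, cols)
rows, cols = Y.shape
M = np.zeros((cols, cols))
for i in range(cols):
    for j in range(i, cols):
        # fraction of rows in which columns i and j agree
        M[i, j] = sum(Y[:, i] == Y[:, j]) / rows
        M[j, i] = M[i, j]  # M is symmetric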
```

Implementation of this expression:
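Read off from the loop above, the expression is presumably the fraction of rows in which two columns agree. This is a reconstruction from the code rather than the original formula; the symbols $n$ (number of rows), $d$ (number of columns), and the indicator $\mathbf{1}[\cdot]$ are assumed notation:

```latex
M_{ij} = \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\left[ Y_{ki} = Y_{kj} \right], \qquad 1 \le i, j \le d
```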

PS: After writing this post, I **replaced `sum` with `np.sum`** and the speed-up is already significant. But I would still appreciate input, especially on ways to eliminate the `for` loops.
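For reference, here is a minimal sketch of the two variants being discussed: the loop with `np.sum`, and one possible loop-free form based on broadcasting. The function names `agreement_loop` and `agreement_broadcast` and the random test data are hypothetical, and the broadcast version assumes a temporary boolean array of shape `(rows, cols, cols)` fits in memory, which may not hold for large `Y`.

```python
import numpy as np

def agreement_loop(Y):
    """Loop version from the question, with np.sum instead of the built-in sum."""
    rows, cols = Y.shape
    M = np.zeros((cols, cols))
    for i in range(cols):
        for j in range(i, cols):
            M[i, j] = np.sum(Y[:, i] == Y[:, j]) / rows
            M[j, i] = M[i, j]
    return M

def agreement_broadcast(Y):
    """Loop-free sketch: compare every pair of columns at once via broadcasting."""
    # Y[:, :, None] has shape (rows, cols, 1); Y[:, None, :] has shape (rows, 1, cols).
    # The comparison broadcasts to a (rows, cols, cols) boolean array.
    equal = Y[:, :, None] == Y[:, None, :]
    # Averaging over the row axis gives the fraction of matching rows per column pair.
    return equal.mean(axis=0)

# Hypothetical test data: 50 rows, 6 columns of small integers.
rng = np.random.default_rng(0)
Y = rng.integers(0, 3, size=(50, 6))
assert np.allclose(agreement_loop(Y), agreement_broadcast(Y))
```

The broadcast form trades memory for speed: it does all `cols * cols` column comparisons in a single NumPy call, but it also computes both halves of the symmetric matrix, so it is not guaranteed to win for every shape of `Y`.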