Is there some way to avoid the for loops and speed up the code below?
All arrays are NumPy arrays, and I can't figure out a way to vectorize this operation.
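rows, cols = Y.shape
M = np.zeros((cols, cols))
for i in range(cols):
    # Only the upper triangle is computed; M is symmetric.
    for j in range(i, cols):
        # Fraction of rows in which columns i and j hold the same value
        M[i, j] = sum(Y[:, i] == Y[:, j])/rows
        M[j, i] = M[i, j]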
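Implementation of this expression, written in the notation of the code: $M_{ij} = \frac{1}{n}\sum_{k=1}^{n} \mathbf{1}[Y_{ki} = Y_{kj}]$, where $n$ is the number of rows of Y.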
PS: After writing this post, I replaced `sum` with `np.sum`, and the speedup is already significant. But I would still appreciate input, especially on ways to eliminate the for loops.
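
One possible way to drop both loops is NumPy broadcasting: compare every column with every other column in a single operation, then average over the rows. The following is only a sketch, not a tuned solution. The function name `pairwise_match_fraction` and the random sample `Y` are made up for illustration, and the intermediate boolean array has shape (rows, cols, cols), so this assumes that array fits in memory.

```python
import numpy as np

def pairwise_match_fraction(Y):
    """Fraction of rows in which columns i and j of Y hold equal values."""
    # Y[:, :, None] has shape (rows, cols, 1) and Y[:, None, :] has shape
    # (rows, 1, cols), so the comparison broadcasts to (rows, cols, cols).
    equal = Y[:, :, None] == Y[:, None, :]
    # The mean of booleans over the row axis is (number of matches) / rows.
    return equal.mean(axis=0)

# Hypothetical sample data, only to show the call
rng = np.random.default_rng(0)
Y = rng.integers(0, 3, size=(50, 6))
M = pairwise_match_fraction(Y)
```

The result is symmetric by construction, so there is no need to mirror the upper triangle. If Y is large, the temporary array may become the bottleneck; in that case processing the columns in blocks, or keeping a single loop over `i` and comparing `Y[:, [i]] == Y`, would trade some speed for much lower memory use.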