Before going into more detail, it’s important to understand how vectorization works in Python. When performing a calculation on an array/matrix, there are two main approaches:

The first is to go through the list and perform the calculation element by element, known as an iterative approach. The second method consists of applying the calculation to the entire array/matrix at once, which is known as vectorization.

Computing every element truly simultaneously is not always feasible without real parallelism (on a GPU, for example). Even so, we speak of vectorization whenever we use the built-in functions of TensorFlow, NumPy or Pandas.

There is still an iterative loop, but it is executed in lower-level code (C). As with built-in functions in general, the loop runs in compiled, optimized code rather than in the Python interpreter, so execution is much faster and therefore emits less CO2.
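To make the distinction concrete, here is a minimal sketch that scales and shifts every element of a sequence, first with an explicit Python loop and then with a single NumPy expression. The function names and sample values are illustrative assumptions and are not part of the original study.

```python
import numpy as np


def scale_iterative(values, factor, offset):
    # Explicit Python loop: each element is handled by the interpreter.
    out = []
    for v in values:
        out.append(v * factor + offset)
    return out


def scale_vectorized(values, factor, offset):
    # One NumPy expression: the loop over elements runs in compiled code.
    return np.asarray(values) * factor + offset


data = [1.0, 2.0, 3.0, 4.0]
loop_result = scale_iterative(data, 2.0, 1.0)
vector_result = scale_vectorized(data, 2.0, 1.0)
```

Both functions produce the same values; only the place where the loop runs differs.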

Non compliant Code Example

# A and B are nested lists (or arrays) with compatible shapes.
results = [[0 for _ in range(len(B[0]))] for _ in range(len(A))]

for i in range(len(A)):
    for j in range(len(B[0])):
        for k in range(len(B)):
            results[i][j] += A[i][k] * B[k][j]

Compliant Solution

import numpy as np  # NumPy, the Python library for array computing

results = np.dot(A, B)

Relevance Analysis

The following results were obtained through local experiments.

Configuration

  • Processor: Intel® Core™ Ultra 5 135U, 2100 MHz, 12 cores, 14 logical processors

  • RAM: 16 GB

  • CO2 Emissions Measurement: Using CodeCarbon
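The original does not show its measurement harness, so the following is only a sketch of how such a measurement could be set up. It assumes the `codecarbon` package and its `EmissionsTracker` class with `start()` and `stop()`. The `measure` helper and the sample workload are hypothetical, and the sketch falls back to timing only when CodeCarbon is not installed.

```python
import time

try:
    # CodeCarbon is an optional third-party dependency in this sketch.
    from codecarbon import EmissionsTracker
except ImportError:
    EmissionsTracker = None


def measure(func, *args):
    """Run func(*args) once; return (result, seconds, emissions or None)."""
    tracker = EmissionsTracker() if EmissionsTracker is not None else None
    if tracker is not None:
        tracker.start()
    started = time.perf_counter()
    try:
        result = func(*args)
    finally:
        elapsed = time.perf_counter() - started
        # stop() returns the estimated emissions in kg CO2-equivalent.
        emissions = tracker.stop() if tracker is not None else None
    return result, elapsed, emissions


# Hypothetical workload: any of the functions compared in this study could go here.
result, seconds, emissions = measure(sum, range(1_000_000))
```

A single run is noisy, so in practice each function would be called many times on large inputs and the measurements averaged.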

Context

This study is divided into 3 parts, each comparing a vectorized method with an iterative one: a dot product between two vectors, an outer product between two vectors, and a matrix product.

Impact Analysis

1. dot product:

Non compliant

def iterative_dot_product(x, y):
    total = 0
    for i in range(len(x)):
        total += x[i] * y[i]
    return total

Compliant

def vectorized_dot_product(x, y):
    return np.dot(x, y)
[Figure: dot product]

2. Outer product:

Non compliant

def iterative_outer_product(x, y):
    o = np.zeros((len(x), len(y)))
    for i in range(len(x)):
        for j in range(len(y)):
            o[i][j] = x[i] * y[j]
    return o

Compliant

def vectorized_outer_product(x, y):
    return np.outer(x, y)
[Figure: outer product]
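As a side note, the vectorized outer product can also be expressed with NumPy broadcasting instead of `np.outer`. The sketch below is an illustration with made-up sample vectors, not something measured in the study.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([10.0, 20.0])

# Reshape x into a column of shape (3, 1) and y into a row of shape (1, 2);
# broadcasting then multiplies every pair without a Python loop.
broadcast_outer = x[:, np.newaxis] * y[np.newaxis, :]

# Same values as the built-in helper used in the compliant version.
assert np.array_equal(broadcast_outer, np.outer(x, y))
```

Either form keeps the double loop out of Python; `np.outer` is simply the more readable choice for this case.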

3. Matrix product:

Non compliant

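def iterative_matrix_product(A, B):
    # Allocate the output matrix: rows of A by columns of B.
    results = [[0 for _ in range(len(B[0]))] for _ in range(len(A))]
    for i in range(len(A)):
        for j in range(len(B[0])):
            for k in range(len(B)):
                results[i][j] += A[i][k] * B[k][j]
    return results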

Compliant

def vectorized_matrix_product(A, B):
    return np.dot(A, B)
[Figure: matrix product]
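Before comparing speed or emissions, it is worth checking that both methods return the same values. The sketch below does this on small random matrices. The shapes and the seed are arbitrary assumptions, and it also shows that `np.matmul` and the `@` operator give the same result as `np.dot` for 2-D arrays.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed, arbitrary choice
A = rng.random((4, 3))
B = rng.random((3, 5))


def iterative_matrix_product(A, B):
    # Triple loop over rows of A, columns of B and the shared inner dimension.
    rows, inner, cols = len(A), len(B), len(B[0])
    results = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                results[i][j] += A[i][k] * B[k][j]
    return results


expected = iterative_matrix_product(A, B)

# For 2-D arrays, np.dot, np.matmul and the @ operator compute the same product.
same_dot = np.allclose(np.dot(A, B), expected)
same_matmul = np.allclose(np.matmul(A, B), expected)
same_operator = np.allclose(A @ B, expected)
```

`np.allclose` is used rather than exact equality because the two methods may sum the terms in a different order, which can change the last floating-point digits.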

Conclusion

The results show that the vectorized method is significantly faster than the iterative method. The CO2 emissions are also lower. This is a clear example of how using built-in functions can lead to more efficient code, both in terms of performance and environmental impact.