C | Python | Program Optimization | Unrolling, SIMD Instructions, OpenMP

Specification

In this project, I first implemented a subset of numpy called numc using a naive solution for matrix operations. Then, I experimented with a setup file in Python to install the numc module. After that, I constructed the Python-C interface by overloading some operators and defining some instance methods for numc.Matrix objects. Finally, I sped up my naive solution, thus making numc.Matrix operations faster.

To speed up the naive solution, I:

  • used conventional code optimizations to speed up the matrix operation algorithms themselves
  • SIMD instructions, which perform operations on 256 bit values (4 64 bit operations can be performed in parallel)
  • parallel computation using OpenMP, being careful to keep the threads from overwriting one another