Describe the solution you'd like
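SIMD vectorization (loop-level parallelism on superscalar hardware) of the floating-point code here could significantly speed up FP calculations, depending on how much floating-point precision is needed. Vectorizing a loop can change the order in which floating-point operations are evaluated, so results may differ slightly from the strictly sequential version. I would recommend evaluating how much precision is actually required and, if there is room for small inaccuracies, considering enabling this compiler optimization for a potentially large speedup.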
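To make the precision trade-off concrete, here is a minimal sketch of what reordering does to a floating-point sum. It is not code from this project and it does not invoke any compiler; the function names `sequential_sum` and `lane_sum` are hypothetical, and the 4-lane layout with a pairwise final reduction is only an assumed model of how a vectorized reduction might accumulate values.

```python
def sequential_sum(values):
    """Add values strictly left to right, as a scalar loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


def lane_sum(values, lanes=4):
    """Model a SIMD-style reduction (assumed layout, lanes must be a power of two).

    Each lane keeps its own partial sum over every `lanes`-th element,
    then the lanes are combined pairwise at the end.
    """
    acc = [0.0] * lanes
    for i, v in enumerate(values):
        acc[i % lanes] += v
    # Pairwise horizontal reduction: (l0 + l1) + (l2 + l3) for 4 lanes.
    while len(acc) > 1:
        acc = [acc[j] + acc[j + 1] for j in range(0, len(acc), 2)]
    return acc[0]


if __name__ == "__main__":
    # Chosen so that rounding differs between the two orders;
    # the exact mathematical sum is 2.0.
    data = [1e16, 1.0, -1e16, 1.0]
    print(sequential_sum(data))  # 1.0
    print(lane_sum(data))        # 0.0
```

The input is deliberately adversarial: with values of similar magnitude the two orders usually agree to within a few units in the last place, which is the kind of "small inaccuracy" that would need to be judged acceptable before relaxing strict evaluation order.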
A paper with more on the topic can be found here: https://ieeexplore.ieee.org/document/234917