Program Template to Fuse Loop
How can I fuse nested for loops using a template?
Method 1 fuses the three loops by hand into a single loop.
Method 2 does not seem to fuse them (or has the compiler already done it?).
gcc on Godbolt (Compiler Explorer) for method 1
gcc on Godbolt (Compiler Explorer) for method 2
code:
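#include <iostream>

// Recursive case: runs the loop for dimension d = size - Depth,
// then recurses into the next inner dimension.
template <int Depth>
inline void loop(int* dim, int* counter, int size){
    int d = size - Depth;
    for( counter[d] = 0;
         counter[d] < dim[d];
         ++counter[d]){
        loop<Depth - 1>(dim, counter, size);
    }
}

// Base case: the innermost body, executed once per index combination.
// The signature must match the primary template (pointers, not plain ints).
template <>
inline void loop<0>(int*, int*, int){
    std::cout<<"#";
}

int main(void){
    int dim[] = {2,3,2};        // extent of each nested loop
    int counter[] = {0,0,0};    // current index of each nested loop
    //Method 1
    for(int i = 0; i < 12; ++i) std::cout <<"#";
    /*
    //Method 2
    loop<3>(dim, counter, 3);
    */
}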
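For comparison, here is a minimal sketch of fusing the nest by hand when the extents are only known at run time. It is not the method from the question: `total_iterations` and `fused_loop` are hypothetical names introduced for illustration, and the extents are assumed to be non-negative. The idea is to loop once over the product of the extents and advance the counter array like an odometer, so the body still sees the same indices as in the nested version.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: number of iterations of the whole nest,
// i.e. the product of all extents (assumed non-negative).
inline std::size_t total_iterations(const std::vector<int>& dim) {
    std::size_t total = 1;
    for (int extent : dim) total *= static_cast<std::size_t>(extent);
    return total;
}

// Hypothetical helper: calls body(counter) once per index combination
// using one flat loop. The last dimension varies fastest, matching the
// iteration order of the nested loops.
template <typename Body>
void fused_loop(const std::vector<int>& dim, Body body) {
    std::vector<int> counter(dim.size(), 0);
    const std::size_t total = total_iterations(dim);
    for (std::size_t flat = 0; flat < total; ++flat) {
        body(counter);
        // Odometer step: bump the innermost index and carry outwards.
        for (std::size_t d = dim.size(); d-- > 0;) {
            if (++counter[d] < dim[d]) break;
            counter[d] = 0;
        }
    }
}
```

With `dim = {2, 3, 2}` the body runs 12 times, the same count as Method 1. Whether this is actually faster than the recursive template depends on the compiler and on what the body does, so it is worth comparing the generated assembly.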
clang seems to completely unroll the loops. I want to fuse the loops because in actual usage the array is quite large. Maybe the compiler will behave differently if I use a larger array?
– rxu
Sep 23 '16 at 14:18
Of course. It will do what is 'optimal'. It has to be said that, since IO is involved in this example, unrolling doesn't really optimise anything.
– Richard Hodges
Sep 23 '16 at 16:36
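If the extents are known at compile time, as with {2,3,2} in the question, one possible way to make the fusion explicit instead of relying on the optimiser is to compute the product in a template and run a single loop over it. This is only a sketch: `Product` and `fused_static_loop` are hypothetical names, and the body receives just the flat index rather than a counter array.

```cpp
#include <cstddef>

// Hypothetical helper: compile-time product of a list of extents.
template <std::size_t... Extents>
struct Product;

// Empty list: the product is 1.
template <>
struct Product<> {
    static constexpr std::size_t value = 1;
};

// Non-empty list: head times the product of the rest.
template <std::size_t Head, std::size_t... Tail>
struct Product<Head, Tail...> {
    static constexpr std::size_t value = Head * Product<Tail...>::value;
};

// Hypothetical helper: one flat loop whose trip count is a compile-time
// constant. body is called with the flat index 0 .. total-1.
template <std::size_t... Extents, typename Body>
void fused_static_loop(Body body) {
    constexpr std::size_t total = Product<Extents...>::value;
    for (std::size_t flat = 0; flat < total; ++flat) {
        body(flat);
    }
}
```

A call such as `fused_static_loop<2, 3, 2>([](std::size_t) { std::cout << "#"; });` (with `<iostream>` included) runs the body 12 times. The compiler is still free to unroll or not; the only thing this guarantees is that the source contains one loop rather than three.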
clang inlines and unrolls all loops with method 2, gcc seems to miss the optimisation opportunity.
– Richard Hodges
Sep 22 '16 at 20:01