If cuda-gdb throws Program received signal CUDA_EXCEPTION_4, Warp Illegal Instruction. for the following code line:
(note debugger always points to the next address after problematic instruction, i.e. 0xe8 + 0x8 = 0xf0 in this case)
then this means used register index is outside of the legal bounds set by kernel's register count.