Add a few ARM coprocessor intrinsics. Test cases included.