OpenCL macOS support
Either my laptop's GPU (Intel Iris Graphics 550) or Apple's OpenCL implementation does not support double precision. This patch checks all kernel arguments for double-precision types. There is probably an easier way to check the entire AST instead, but I couldn't figure out how.
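As a rough illustration of the argument check described above, here is a minimal sketch. It assumes the argument types are available as strings; the helper name `uses_double_precision` and the string representation are hypothetical and not taken from the patch.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the patch): returns true if any kernel
// argument type name mentions "double", e.g. "double", "double*" or
// "__global double4*". This is a plain substring match, so it may flag
// unrelated type names that happen to contain "double".
bool uses_double_precision(const std::vector<std::string>& arg_types) {
    for (const std::string& type_name : arg_types) {
        if (type_name.find("double") != std::string::npos) {
            return true;
        }
    }
    return false;
}
```

A check like this only sees the kernel signature, so a `double` used purely inside the kernel body would slip through, which is why walking the whole AST would be more thorough.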
Also, `get_local_id` et al. return `size_t` per the OpenCL specification, while CUDA's `threadIdx` et al. return an `int`, so a cast is needed to silence a conversion warning.
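The cast can be sketched as follows. This is plain C++ with a stand-in for the OpenCL built-in so that it compiles on its own; `fake_get_local_id` and `local_id_as_int` are hypothetical names, and in actual OpenCL C the equivalent would be a C-style cast such as `(int)get_local_id(0)`.

```cpp
#include <cstddef>

// Stand-in for OpenCL's get_local_id, which returns size_t.
// The values are made up for illustration only.
std::size_t fake_get_local_id(unsigned dim) {
    static const std::size_t ids[3] = {3, 1, 0};
    return dim < 3 ? ids[dim] : 0;
}

// The explicit cast makes the size_t -> int narrowing intentional,
// so the compiler no longer emits a conversion warning.
int local_id_as_int(unsigned dim) {
    return static_cast<int>(fake_get_local_id(dim));
}
```

The narrowing should be harmless in practice, since local work-item IDs are bounded by the work-group size and fit comfortably in an `int`.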