@ymgve
Last active December 20, 2017 06:18
C:\Users\ymgve\Documents\Visual Studio 2015\Projects\bugtest\bugtest>nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:08:12_Central_Daylight_Time_2017
Cuda compilation tools, release 9.1, V9.1.85
C:\Users\ymgve\Documents\Visual Studio 2015\Projects\bugtest\bugtest>nvcc kernel.cu
kernel.cu
Creating library a.lib and object a.exp
C:\Users\ymgve\Documents\Visual Studio 2015\Projects\bugtest\bugtest>a.exe
result is 2340
C:\Users\ymgve\Documents\Visual Studio 2015\Projects\bugtest\bugtest>nvcc -G kernel.cu
kernel.cu
Creating library a.lib and object a.exp
C:\Users\ymgve\Documents\Visual Studio 2015\Projects\bugtest\bugtest>a.exe
result is 12340
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <stdio.h>

// Multiplies two 16-bit values with PTX mul.wide.u16 and stores the 32-bit product.
__global__ void testKernel(unsigned int *output)
{
    unsigned short al, ah;
    al = 0x1234;
    ah = 0x10;
    // 0x10 * 0x1234 = 0x12340, which does not fit in 16 bits.
    asm volatile ("mul.wide.u16 %0, %1, %2;\n\t" : "=r"(output[0]) : "h"(ah), "h"(al));
}

int main()
{
    unsigned int *dev_output = 0;
    unsigned int output[1] = { 0 };

    cudaSetDevice(0);
    cudaMalloc((void**)&dev_output, 1 * sizeof(unsigned int));

    // A single thread is enough to reproduce the differing results.
    testKernel<<<1, 1>>>(dev_output);
    cudaDeviceSynchronize();

    cudaMemcpy(output, dev_output, 1 * sizeof(unsigned int), cudaMemcpyDeviceToHost);
    printf("result is %x\n", output[0]);

    cudaFree(dev_output);
    cudaDeviceReset();
}
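
For reference, the arithmetic behind the two printed values can be checked on the host without a GPU. The sketch below is not part of the original repro: `wide_mul_u16` and `truncated_mul_u16` are hypothetical helper names, and the idea that the non-`-G` build keeps only the low 16 bits of the product is an assumption that merely happens to match the `2340` output, not a confirmed diagnosis.

```cpp
#include <cstdint>

// Host-side model of what mul.wide.u16 is expected to compute:
// two 16-bit operands multiplied into a full 32-bit product.
constexpr std::uint32_t wide_mul_u16(std::uint16_t a, std::uint16_t b)
{
    return static_cast<std::uint32_t>(a) * static_cast<std::uint32_t>(b);
}

// Hypothetical model of the faulty result (an assumption, not a confirmed
// cause): the same product with everything above the low 16 bits discarded.
constexpr std::uint32_t truncated_mul_u16(std::uint16_t a, std::uint16_t b)
{
    return wide_mul_u16(a, b) & 0xFFFFu;
}

// Compile-time checks against the two values printed in the transcript.
static_assert(wide_mul_u16(0x10, 0x1234) == 0x12340u,
              "full product matches the -G build output");
static_assert(truncated_mul_u16(0x10, 0x1234) == 0x2340u,
              "low 16 bits match the default build output");
```

Both checks run at compile time, so building the file is enough to confirm that `12340` from the `-G` build is the full product and that `2340` from the default build equals that product with the upper bits dropped.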