Created September 17, 2023 18:41
2023-09 ctransformer amd build failure
CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ CT_HIPBLAS=1 pip install ctransformers --no-binary ctransformers
Collecting ctransformers
Using cached ctransformers-0.2.27.tar.gz (376 kB)
Installing build dependencies: started
Installing build dependencies: finished with status 'done'
Getting requirements to build wheel: started
Getting requirements to build wheel: finished with status 'done'
Preparing metadata (pyproject.toml): started
Preparing metadata (pyproject.toml): finished with status 'done'
Collecting py-cpuinfo<10.0.0,>=9.0.0
Using cached py_cpuinfo-9.0.0-py3-none-any.whl (22 kB)
Collecting huggingface-hub
Using cached huggingface_hub-0.17.1-py3-none-any.whl (294 kB)
Collecting fsspec
Using cached fsspec-2023.9.1-py3-none-any.whl (173 kB)
Collecting packaging>=20.9
Using cached packaging-23.1-py3-none-any.whl (48 kB)
Collecting pyyaml>=5.1
Using cached PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (705 kB)
Collecting typing-extensions>=3.7.4.3
Using cached typing_extensions-4.7.1-py3-none-any.whl (33 kB)
Collecting filelock
Using cached filelock-3.12.4-py3-none-any.whl (11 kB)
Collecting requests
Using cached requests-2.31.0-py3-none-any.whl (62 kB)
Collecting tqdm>=4.42.1
Using cached tqdm-4.66.1-py3-none-any.whl (78 kB)
Collecting charset-normalizer<4,>=2
Using cached charset_normalizer-3.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (201 kB)
Collecting urllib3<3,>=1.21.1
Using cached urllib3-2.0.4-py3-none-any.whl (123 kB)
Collecting certifi>=2017.4.17
Using cached certifi-2023.7.22-py3-none-any.whl (158 kB)
Collecting idna<4,>=2.5
Using cached idna-3.4-py3-none-any.whl (61 kB)
Building wheels for collected packages: ctransformers
Building wheel for ctransformers (pyproject.toml): started
Building wheel for ctransformers (pyproject.toml): finished with status 'error'
error: subprocess-exited-with-error
× Building wheel for ctransformers (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [2479 lines of output]
--------------------------------------------------------------------------------
-- Trying 'Ninja' generator
--------------------------------
---------------------------
----------------------
-----------------
------------
-------
--
CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Not searching for unused variables given on the command line.
-- The C compiler identification is Clang 16.0.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/rocm/llvm/bin/clang - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- The CXX compiler identification is Clang 16.0.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/rocm/llvm/bin/clang++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done (0.6s)
-- Generating done (0.0s)
-- Build files have been written to: /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/_cmake_test_compile/build
--
-------
------------
-----------------
----------------------
---------------------------
--------------------------------
-- Trying 'Ninja' generator - success
--------------------------------------------------------------------------------
Configuring Project
Working directory:
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/_skbuild/linux-x86_64-3.10/cmake-build
Command:
/tmp/pip-build-env-v3f_z5es/overlay/local/lib/python3.10/dist-packages/cmake/data/bin/cmake /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5 -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-v3f_z5es/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.6 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-v3f_z5es/overlay/local/lib/python3.10/dist-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/usr/bin/python3 -DPYTHON_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython_EXECUTABLE:PATH=/usr/bin/python3 -DPython_ROOT_DIR:PATH=/usr -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython3_EXECUTABLE:PATH=/usr/bin/python3 -DPython3_ROOT_DIR:PATH=/usr -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/usr/include/python3.10 -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-v3f_z5es/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja -DCT_HIPBLAS=1 -DCMAKE_BUILD_TYPE:STRING=Release
Not searching for unused variables given on the command line.
-- The C compiler identification is Clang 16.0.0
-- The CXX compiler identification is Clang 16.0.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/rocm/llvm/bin/clang - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/rocm/llvm/bin/clang++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- CT_INSTRUCTIONS: avx2
-- CT_CUBLAS: OFF
-- CT_HIPBLAS: 1
-- CT_METAL: OFF
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- x86 detected
CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Call Stack (most recent call first):
CMakeLists.txt:177 (find_package)
-- hip::amdhip64 is SHARED_LIBRARY
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success
CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Call Stack (most recent call first):
/tmp/pip-build-env-v3f_z5es/overlay/local/lib/python3.10/dist-packages/cmake/data/share/cmake-3.27/Modules/CMakeFindDependencyMacro.cmake:76 (find_package)
/opt/rocm/lib/cmake/hipblas/hipblas-config.cmake:90 (find_dependency)
CMakeLists.txt:178 (find_package)
-- hip::amdhip64 is SHARED_LIBRARY
-- HIP and hipBLAS found
-- Configuring done (0.9s)
-- Generating done (0.0s)
-- Build files have been written to: /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/_skbuild/linux-x86_64-3.10/cmake-build
[1/8] Building C object CMakeFiles/ctransformers.dir/models/ggml/ggml-alloc.c.o
[2/8] Building C object CMakeFiles/ctransformers.dir/models/ggml/ggml.c.o
FAILED: CMakeFiles/ctransformers.dir/models/ggml/ggml.c.o
/opt/rocm/llvm/bin/clang -DCC_TURING=1000000000 -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMQ_Y=64 -DGGML_CUDA_MMV_Y=1 -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -DK_QUANTS_PER_ITERATION=2 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -Dctransformers_EXPORTS -I/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models -isystem /opt/rocm/include -isystem /opt/rocm-5.6.0/include -O3 -DNDEBUG -std=gnu11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -mfma -mavx2 -mf16c -mavx -MD -MT CMakeFiles/ctransformers.dir/models/ggml/ggml.c.o -MF CMakeFiles/ctransformers.dir/models/ggml/ggml.c.o.d -o CMakeFiles/ctransformers.dir/models/ggml/ggml.c.o -c /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml.c
In file included from /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml.c:252:
In file included from /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.h:46:
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda-ggllm.h:15:37: error: array has incomplete element type 'struct cudaDeviceProp'
struct cudaDeviceProp device_props[GGML_CUDA_MAX_DEVICES];
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda-ggllm.h:15:10: note: forward declaration of 'struct cudaDeviceProp'
struct cudaDeviceProp device_props[GGML_CUDA_MAX_DEVICES];
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml.c:2413:5: warning: implicit conversion increases floating-point precision: 'float' to 'ggml_float' (aka 'double') [-Wdouble-promotion]
GGML_F16_VEC_REDUCE(sumf, sum);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml.c:2045:37: note: expanded from macro 'GGML_F16_VEC_REDUCE'
#define GGML_F16_VEC_REDUCE GGML_F32Cx8_REDUCE
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml.c:2035:33: note: expanded from macro 'GGML_F32Cx8_REDUCE'
#define GGML_F32Cx8_REDUCE GGML_F32x8_REDUCE
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml.c:1981:11: note: expanded from macro 'GGML_F32x8_REDUCE'
res = _mm_cvtss_f32(_mm_hadd_ps(t1, t1)); \
~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml.c:3456:9: warning: implicit conversion increases floating-point precision: 'float' to 'ggml_float' (aka 'double') [-Wdouble-promotion]
GGML_F16_VEC_REDUCE(sumf[k], sum[k]);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml.c:2045:37: note: expanded from macro 'GGML_F16_VEC_REDUCE'
#define GGML_F16_VEC_REDUCE GGML_F32Cx8_REDUCE
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml.c:2035:33: note: expanded from macro 'GGML_F32Cx8_REDUCE'
#define GGML_F32Cx8_REDUCE GGML_F32x8_REDUCE
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml.c:1981:11: note: expanded from macro 'GGML_F32x8_REDUCE'
res = _mm_cvtss_f32(_mm_hadd_ps(t1, t1)); \
~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 warnings and 1 error generated.
[3/8] Building C object CMakeFiles/ctransformers.dir/models/ggml/k_quants.c.o
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/k_quants.c:186:11: warning: variable 'sum_x' set but not used [-Wunused-but-set-variable]
float sum_x = 0;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/k_quants.c:187:11: warning: variable 'sum_x2' set but not used [-Wunused-but-set-variable]
float sum_x2 = 0;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/k_quants.c:182:14: warning: unused function 'make_qkx1_quants' [-Wunused-function]
static float make_qkx1_quants(int n, int nmax, const float * restrict x, uint8_t * restrict L, float * restrict the_min,
^
3 warnings generated.
[4/8] Building CXX object CMakeFiles/ctransformers.dir/models/llm.cc.o
FAILED: CMakeFiles/ctransformers.dir/models/llm.cc.o
/opt/rocm/llvm/bin/clang++ -DCC_TURING=1000000000 -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMQ_Y=64 -DGGML_CUDA_MMV_Y=1 -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -DK_QUANTS_PER_ITERATION=2 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -Dctransformers_EXPORTS -I/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models -isystem /opt/rocm/include -isystem /opt/rocm-5.6.0/include -O3 -DNDEBUG -std=gnu++11 -fPIC -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -mfma -mavx2 -mf16c -mavx -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -x hip --offload-arch=gfx900 --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx1030 -MD -MT CMakeFiles/ctransformers.dir/models/llm.cc.o -MF CMakeFiles/ctransformers.dir/models/llm.cc.o.d -o CMakeFiles/ctransformers.dir/models/llm.cc.o -c /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llm.cc
In file included from /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llm.cc:1:
In file included from /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llm.h:4:
In file included from /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/common.h:24:
In file included from /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.h:46:
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda-ggllm.h:15:25: error: field has incomplete type 'struct cudaDeviceProp'
struct cudaDeviceProp device_props[GGML_CUDA_MAX_DEVICES];
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda-ggllm.h:15:10: note: forward declaration of 'cudaDeviceProp'
struct cudaDeviceProp device_props[GGML_CUDA_MAX_DEVICES];
^
In file included from /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llm.cc:1:
In file included from /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llm.h:4:
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/common.h:239:35: warning: braces around scalar initializer [-Wbraced-scalar-init]
return ct_new_tensor(ctx, type, {x}, gpu);
^~~
In file included from /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llm.cc:1:
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llm.h:28:58: warning: unused parameter 'add_bos_token' [-Wunused-parameter]
const bool add_bos_token) const {
^
In file included from /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llm.cc:7:
In file included from /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llms/llama.cc:5:
In file included from /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/llama.cpp:6:
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/llama.h:31:13: warning: 'DEPRECATED' macro redefined [-Wmacro-redefined]
# define DEPRECATED(func, hint) func __attribute__((deprecated(hint)))
^
/opt/rocm/include/hip/hip_runtime_api.h:494:9: note: previous definition is here
#define DEPRECATED(msg) __attribute__ ((deprecated(msg)))
^
In file included from /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llm.cc:7:
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llms/llama.cc:7:51: warning: unused parameter 'level' [-Wunused-parameter]
static void ct_llama_log_callback(llama_log_level level, const char *text,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llms/llama.cc:7:70: warning: unused parameter 'text' [-Wunused-parameter]
static void ct_llama_log_callback(llama_log_level level, const char *text,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llms/llama.cc:8:41: warning: unused parameter 'user_data' [-Wunused-parameter]
void *user_data) {}
^
In file included from /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llm.cc:9:
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llms/replit.cc:631:50: warning: unused parameter 'add_bos_token' [-Wunused-parameter]
const bool add_bos_token) const override {
^
In file included from /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llm.cc:15:
In file included from /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/llms/falcon.cc:5:
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/libfalcon.cpp:18:10: fatal error: 'cuda_runtime.h' file not found
#include <cuda_runtime.h>
^~~~~~~~~~~~~~~~
7 warnings and 2 errors generated when compiling for gfx1030.
[5/8] Building CXX object CMakeFiles/ctransformers.dir/models/ggml/cmpnct_unicode.cpp.o
[6/8] Building CXX object CMakeFiles/ctransformers.dir/models/ggml/ggml-cuda.cu.o
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:166:41: warning: cast from 'const signed char *' to 'unsigned short *' drops const qualifier [-Wcast-qual]
const uint16_t * x16 = (uint16_t *) (x8 + sizeof(int) * i32); // assume at least 2 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:176:41: warning: cast from 'const unsigned char *' to 'unsigned short *' drops const qualifier [-Wcast-qual]
const uint16_t * x16 = (uint16_t *) (x8 + sizeof(int) * i32); // assume at least 2 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:186:22: warning: cast from 'const signed char *' to 'int *' drops const qualifier [-Wcast-qual]
return *((int *) (x8 + sizeof(int) * i32)); // assume at least 4 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:190:22: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
return *((int *) (x8 + sizeof(int) * i32)); // assume at least 4 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2047:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2047:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual]
const block_q4_0 * bx0 = (block_q4_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2057:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2058:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2108:37: warning: cast from 'const __half2 *' to 'float *' drops const qualifier [-Wcast-qual]
const float * x_dmf = (float *) x_dm;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2104:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2104:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2141:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2141:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual]
const block_q4_1 * bx0 = (block_q4_1 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2151:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2152:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2195:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2195:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2233:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2233:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual]
const block_q5_0 * bx0 = (block_q5_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2243:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2244:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2307:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2307:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2347:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2347:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual]
const block_q5_1 * bx0 = (block_q5_1 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2357:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2358:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2418:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2418:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2453:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q8_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2453:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q8_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual]
const block_q8_0 * bx0 = (block_q8_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2463:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2464:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2508:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2508:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2542:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q2_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual]
const block_q2_K * bx0 = (block_q2_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2554:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2611:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual]
const block_q3_K * bx0 = (block_q3_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2767:41: warning: cast from 'const int *' to 'signed char *' drops const qualifier [-Wcast-qual]
const int8_t * scales = ((int8_t *) (x_sc + i * (WARP_SIZE/4) + i/4 + kbx*4)) + ky/4;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2881:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual]
const block_q4_K * bx0 = (block_q4_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
const int * scales = (int *) bxi->scales;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2893:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2962:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3062:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual]
const block_q5_K * bx0 = (block_q5_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
const int * scales = (int *) bxi->scales;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3074:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3154:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3191:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q6_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual] | |
const block_q6_K * bx0 = (block_q6_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3203:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3274:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:5693:72: warning: unused parameter 'i02' [-Wunused-parameter] | |
float * src0_ddf_i, float * src1_ddf_i, float * dst_ddf_i, int64_t i02, int64_t i01_low, int64_t i01_high, int i1, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual] | |
const block_q4_0 * bx0 = (block_q4_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3423:9: note: in instantiation of function template specialization 'load_tiles_q4_0<64, 8, false>' requested here | |
load_tiles_q4_0<mmq_y, nwarps, need_check>, VDR_Q4_0_Q8_1_MMQ, vec_dot_q4_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4520:9: note: in instantiation of function template specialization 'mul_mat_q4_0<false>' requested here | |
mul_mat_q4_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual] | |
const block_q4_0 * bx0 = (block_q4_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3423:9: note: in instantiation of function template specialization 'load_tiles_q4_0<64, 8, true>' requested here | |
load_tiles_q4_0<mmq_y, nwarps, need_check>, VDR_Q4_0_Q8_1_MMQ, vec_dot_q4_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4524:9: note: in instantiation of function template specialization 'mul_mat_q4_0<true>' requested here | |
mul_mat_q4_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual] | |
const block_q4_1 * bx0 = (block_q4_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3461:9: note: in instantiation of function template specialization 'load_tiles_q4_1<64, 8, false>' requested here | |
load_tiles_q4_1<mmq_y, nwarps, need_check>, VDR_Q4_1_Q8_1_MMQ, vec_dot_q4_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4557:9: note: in instantiation of function template specialization 'mul_mat_q4_1<false>' requested here | |
mul_mat_q4_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual] | |
const block_q4_1 * bx0 = (block_q4_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3461:9: note: in instantiation of function template specialization 'load_tiles_q4_1<64, 8, true>' requested here | |
load_tiles_q4_1<mmq_y, nwarps, need_check>, VDR_Q4_1_Q8_1_MMQ, vec_dot_q4_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4561:9: note: in instantiation of function template specialization 'mul_mat_q4_1<true>' requested here | |
mul_mat_q4_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual] | |
const block_q5_0 * bx0 = (block_q5_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3495:9: note: in instantiation of function template specialization 'load_tiles_q5_0<64, 8, false>' requested here | |
load_tiles_q5_0<mmq_y, nwarps, need_check>, VDR_Q5_0_Q8_1_MMQ, vec_dot_q5_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4594:9: note: in instantiation of function template specialization 'mul_mat_q5_0<false>' requested here | |
mul_mat_q5_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual] | |
const block_q5_0 * bx0 = (block_q5_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3495:9: note: in instantiation of function template specialization 'load_tiles_q5_0<64, 8, true>' requested here | |
load_tiles_q5_0<mmq_y, nwarps, need_check>, VDR_Q5_0_Q8_1_MMQ, vec_dot_q5_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4598:9: note: in instantiation of function template specialization 'mul_mat_q5_0<true>' requested here | |
mul_mat_q5_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual] | |
const block_q5_1 * bx0 = (block_q5_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3529:9: note: in instantiation of function template specialization 'load_tiles_q5_1<64, 8, false>' requested here | |
load_tiles_q5_1<mmq_y, nwarps, need_check>, VDR_Q5_1_Q8_1_MMQ, vec_dot_q5_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4631:9: note: in instantiation of function template specialization 'mul_mat_q5_1<false>' requested here | |
mul_mat_q5_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual] | |
const block_q5_1 * bx0 = (block_q5_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3529:9: note: in instantiation of function template specialization 'load_tiles_q5_1<64, 8, true>' requested here | |
load_tiles_q5_1<mmq_y, nwarps, need_check>, VDR_Q5_1_Q8_1_MMQ, vec_dot_q5_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4635:9: note: in instantiation of function template specialization 'mul_mat_q5_1<true>' requested here | |
mul_mat_q5_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual] | |
const block_q8_0 * bx0 = (block_q8_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3563:9: note: in instantiation of function template specialization 'load_tiles_q8_0<64, 8, false>' requested here | |
load_tiles_q8_0<mmq_y, nwarps, need_check>, VDR_Q8_0_Q8_1_MMQ, vec_dot_q8_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4668:9: note: in instantiation of function template specialization 'mul_mat_q8_0<false>' requested here | |
mul_mat_q8_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual] | |
const block_q8_0 * bx0 = (block_q8_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3563:9: note: in instantiation of function template specialization 'load_tiles_q8_0<64, 8, true>' requested here | |
load_tiles_q8_0<mmq_y, nwarps, need_check>, VDR_Q8_0_Q8_1_MMQ, vec_dot_q8_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4672:9: note: in instantiation of function template specialization 'mul_mat_q8_0<true>' requested here | |
mul_mat_q8_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual] | |
const block_q2_K * bx0 = (block_q2_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3597:9: note: in instantiation of function template specialization 'load_tiles_q2_K<64, 8, false>' requested here | |
load_tiles_q2_K<mmq_y, nwarps, need_check>, VDR_Q2_K_Q8_1_MMQ, vec_dot_q2_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4705:9: note: in instantiation of function template specialization 'mul_mat_q2_K<false>' requested here | |
mul_mat_q2_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual] | |
const block_q2_K * bx0 = (block_q2_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3597:9: note: in instantiation of function template specialization 'load_tiles_q2_K<64, 8, true>' requested here | |
load_tiles_q2_K<mmq_y, nwarps, need_check>, VDR_Q2_K_Q8_1_MMQ, vec_dot_q2_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4709:9: note: in instantiation of function template specialization 'mul_mat_q2_K<true>' requested here | |
mul_mat_q2_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual] | |
const block_q3_K * bx0 = (block_q3_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3635:9: note: in instantiation of function template specialization 'load_tiles_q3_K<64, 8, false>' requested here | |
load_tiles_q3_K<mmq_y, nwarps, need_check>, VDR_Q3_K_Q8_1_MMQ, vec_dot_q3_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4744:9: note: in instantiation of function template specialization 'mul_mat_q3_K<false>' requested here | |
mul_mat_q3_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual] | |
const block_q3_K * bx0 = (block_q3_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3635:9: note: in instantiation of function template specialization 'load_tiles_q3_K<64, 8, true>' requested here | |
load_tiles_q3_K<mmq_y, nwarps, need_check>, VDR_Q3_K_Q8_1_MMQ, vec_dot_q3_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4748:9: note: in instantiation of function template specialization 'mul_mat_q3_K<true>' requested here | |
mul_mat_q3_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual] | |
const block_q4_K * bx0 = (block_q4_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3673:9: note: in instantiation of function template specialization 'load_tiles_q4_K<64, 8, false>' requested here | |
load_tiles_q4_K<mmq_y, nwarps, need_check>, VDR_Q4_K_Q8_1_MMQ, vec_dot_q4_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4782:9: note: in instantiation of function template specialization 'mul_mat_q4_K<false>' requested here | |
mul_mat_q4_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual] | |
const block_q4_K * bx0 = (block_q4_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3673:9: note: in instantiation of function template specialization 'load_tiles_q4_K<64, 8, true>' requested here | |
load_tiles_q4_K<mmq_y, nwarps, need_check>, VDR_Q4_K_Q8_1_MMQ, vec_dot_q4_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4786:9: note: in instantiation of function template specialization 'mul_mat_q4_K<true>' requested here | |
mul_mat_q4_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual] | |
const block_q5_K * bx0 = (block_q5_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3707:9: note: in instantiation of function template specialization 'load_tiles_q5_K<64, 8, false>' requested here | |
load_tiles_q5_K<mmq_y, nwarps, need_check>, VDR_Q5_K_Q8_1_MMQ, vec_dot_q5_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4819:9: note: in instantiation of function template specialization 'mul_mat_q5_K<false>' requested here | |
mul_mat_q5_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual] | |
const block_q5_K * bx0 = (block_q5_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3707:9: note: in instantiation of function template specialization 'load_tiles_q5_K<64, 8, true>' requested here | |
load_tiles_q5_K<mmq_y, nwarps, need_check>, VDR_Q5_K_Q8_1_MMQ, vec_dot_q5_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4823:9: note: in instantiation of function template specialization 'mul_mat_q5_K<true>' requested here | |
mul_mat_q5_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual] | |
const block_q6_K * bx0 = (block_q6_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3745:9: note: in instantiation of function template specialization 'load_tiles_q6_K<64, 8, false>' requested here | |
load_tiles_q6_K<mmq_y, nwarps, need_check>, VDR_Q6_K_Q8_1_MMQ, vec_dot_q6_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4856:9: note: in instantiation of function template specialization 'mul_mat_q6_K<false>' requested here | |
mul_mat_q6_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual] | |
const block_q6_K * bx0 = (block_q6_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3745:9: note: in instantiation of function template specialization 'load_tiles_q6_K<64, 8, true>' requested here | |
load_tiles_q6_K<mmq_y, nwarps, need_check>, VDR_Q6_K_Q8_1_MMQ, vec_dot_q6_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4860:9: note: in instantiation of function template specialization 'mul_mat_q6_K<true>' requested here | |
mul_mat_q6_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
85 warnings generated when compiling for gfx1030. | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:166:41: warning: cast from 'const signed char *' to 'unsigned short *' drops const qualifier [-Wcast-qual]
const uint16_t * x16 = (uint16_t *) (x8 + sizeof(int) * i32); // assume at least 2 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:176:41: warning: cast from 'const unsigned char *' to 'unsigned short *' drops const qualifier [-Wcast-qual]
const uint16_t * x16 = (uint16_t *) (x8 + sizeof(int) * i32); // assume at least 2 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:186:22: warning: cast from 'const signed char *' to 'int *' drops const qualifier [-Wcast-qual]
return *((int *) (x8 + sizeof(int) * i32)); // assume at least 4 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:190:22: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
return *((int *) (x8 + sizeof(int) * i32)); // assume at least 4 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2047:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2047:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual]
const block_q4_0 * bx0 = (block_q4_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2057:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2058:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2108:37: warning: cast from 'const __half2 *' to 'float *' drops const qualifier [-Wcast-qual]
const float * x_dmf = (float *) x_dm;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2104:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2104:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2141:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2141:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual]
const block_q4_1 * bx0 = (block_q4_1 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2151:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2152:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2195:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2195:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2233:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2233:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual]
const block_q5_0 * bx0 = (block_q5_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2243:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2244:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2307:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2307:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2347:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2347:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual]
const block_q5_1 * bx0 = (block_q5_1 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2357:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2358:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2418:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2418:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2453:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q8_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2453:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q8_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual]
const block_q8_0 * bx0 = (block_q8_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2463:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2464:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2508:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2508:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2542:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q2_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual]
const block_q2_K * bx0 = (block_q2_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2554:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2611:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual] | |
const block_q3_K * bx0 = (block_q3_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2767:41: warning: cast from 'const int *' to 'signed char *' drops const qualifier [-Wcast-qual] | |
const int8_t * scales = ((int8_t *) (x_sc + i * (WARP_SIZE/4) + i/4 + kbx*4)) + ky/4; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2881:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual] | |
const block_q4_K * bx0 = (block_q4_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2893:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2962:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3062:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual] | |
const block_q5_K * bx0 = (block_q5_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3074:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3154:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3191:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q6_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual] | |
const block_q6_K * bx0 = (block_q6_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3203:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3274:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:5693:72: warning: unused parameter 'i02' [-Wunused-parameter] | |
float * src0_ddf_i, float * src1_ddf_i, float * dst_ddf_i, int64_t i02, int64_t i01_low, int64_t i01_high, int i1, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual] | |
const block_q4_0 * bx0 = (block_q4_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3423:9: note: in instantiation of function template specialization 'load_tiles_q4_0<64, 8, false>' requested here | |
load_tiles_q4_0<mmq_y, nwarps, need_check>, VDR_Q4_0_Q8_1_MMQ, vec_dot_q4_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4520:9: note: in instantiation of function template specialization 'mul_mat_q4_0<false>' requested here | |
mul_mat_q4_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual] | |
const block_q4_0 * bx0 = (block_q4_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3423:9: note: in instantiation of function template specialization 'load_tiles_q4_0<64, 8, true>' requested here | |
load_tiles_q4_0<mmq_y, nwarps, need_check>, VDR_Q4_0_Q8_1_MMQ, vec_dot_q4_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4524:9: note: in instantiation of function template specialization 'mul_mat_q4_0<true>' requested here | |
mul_mat_q4_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual] | |
const block_q4_1 * bx0 = (block_q4_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3461:9: note: in instantiation of function template specialization 'load_tiles_q4_1<64, 8, false>' requested here | |
load_tiles_q4_1<mmq_y, nwarps, need_check>, VDR_Q4_1_Q8_1_MMQ, vec_dot_q4_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4557:9: note: in instantiation of function template specialization 'mul_mat_q4_1<false>' requested here | |
mul_mat_q4_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual] | |
const block_q4_1 * bx0 = (block_q4_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3461:9: note: in instantiation of function template specialization 'load_tiles_q4_1<64, 8, true>' requested here | |
load_tiles_q4_1<mmq_y, nwarps, need_check>, VDR_Q4_1_Q8_1_MMQ, vec_dot_q4_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4561:9: note: in instantiation of function template specialization 'mul_mat_q4_1<true>' requested here | |
mul_mat_q4_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual] | |
const block_q5_0 * bx0 = (block_q5_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3495:9: note: in instantiation of function template specialization 'load_tiles_q5_0<64, 8, false>' requested here | |
load_tiles_q5_0<mmq_y, nwarps, need_check>, VDR_Q5_0_Q8_1_MMQ, vec_dot_q5_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4594:9: note: in instantiation of function template specialization 'mul_mat_q5_0<false>' requested here | |
mul_mat_q5_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual] | |
const block_q5_0 * bx0 = (block_q5_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3495:9: note: in instantiation of function template specialization 'load_tiles_q5_0<64, 8, true>' requested here | |
load_tiles_q5_0<mmq_y, nwarps, need_check>, VDR_Q5_0_Q8_1_MMQ, vec_dot_q5_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4598:9: note: in instantiation of function template specialization 'mul_mat_q5_0<true>' requested here | |
mul_mat_q5_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual] | |
const block_q5_1 * bx0 = (block_q5_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3529:9: note: in instantiation of function template specialization 'load_tiles_q5_1<64, 8, false>' requested here | |
load_tiles_q5_1<mmq_y, nwarps, need_check>, VDR_Q5_1_Q8_1_MMQ, vec_dot_q5_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4631:9: note: in instantiation of function template specialization 'mul_mat_q5_1<false>' requested here | |
mul_mat_q5_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual] | |
const block_q5_1 * bx0 = (block_q5_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3529:9: note: in instantiation of function template specialization 'load_tiles_q5_1<64, 8, true>' requested here | |
load_tiles_q5_1<mmq_y, nwarps, need_check>, VDR_Q5_1_Q8_1_MMQ, vec_dot_q5_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4635:9: note: in instantiation of function template specialization 'mul_mat_q5_1<true>' requested here | |
mul_mat_q5_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual] | |
const block_q8_0 * bx0 = (block_q8_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3563:9: note: in instantiation of function template specialization 'load_tiles_q8_0<64, 8, false>' requested here | |
load_tiles_q8_0<mmq_y, nwarps, need_check>, VDR_Q8_0_Q8_1_MMQ, vec_dot_q8_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4668:9: note: in instantiation of function template specialization 'mul_mat_q8_0<false>' requested here | |
mul_mat_q8_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual] | |
const block_q8_0 * bx0 = (block_q8_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3563:9: note: in instantiation of function template specialization 'load_tiles_q8_0<64, 8, true>' requested here | |
load_tiles_q8_0<mmq_y, nwarps, need_check>, VDR_Q8_0_Q8_1_MMQ, vec_dot_q8_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4672:9: note: in instantiation of function template specialization 'mul_mat_q8_0<true>' requested here | |
mul_mat_q8_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual] | |
const block_q2_K * bx0 = (block_q2_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3597:9: note: in instantiation of function template specialization 'load_tiles_q2_K<64, 8, false>' requested here | |
load_tiles_q2_K<mmq_y, nwarps, need_check>, VDR_Q2_K_Q8_1_MMQ, vec_dot_q2_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4705:9: note: in instantiation of function template specialization 'mul_mat_q2_K<false>' requested here | |
mul_mat_q2_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual] | |
const block_q2_K * bx0 = (block_q2_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3597:9: note: in instantiation of function template specialization 'load_tiles_q2_K<64, 8, true>' requested here | |
load_tiles_q2_K<mmq_y, nwarps, need_check>, VDR_Q2_K_Q8_1_MMQ, vec_dot_q2_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4709:9: note: in instantiation of function template specialization 'mul_mat_q2_K<true>' requested here | |
mul_mat_q2_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual] | |
const block_q3_K * bx0 = (block_q3_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3635:9: note: in instantiation of function template specialization 'load_tiles_q3_K<64, 8, false>' requested here | |
load_tiles_q3_K<mmq_y, nwarps, need_check>, VDR_Q3_K_Q8_1_MMQ, vec_dot_q3_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4744:9: note: in instantiation of function template specialization 'mul_mat_q3_K<false>' requested here | |
mul_mat_q3_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual] | |
const block_q3_K * bx0 = (block_q3_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3635:9: note: in instantiation of function template specialization 'load_tiles_q3_K<64, 8, true>' requested here | |
load_tiles_q3_K<mmq_y, nwarps, need_check>, VDR_Q3_K_Q8_1_MMQ, vec_dot_q3_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4748:9: note: in instantiation of function template specialization 'mul_mat_q3_K<true>' requested here | |
mul_mat_q3_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual] | |
const block_q4_K * bx0 = (block_q4_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3673:9: note: in instantiation of function template specialization 'load_tiles_q4_K<64, 8, false>' requested here | |
load_tiles_q4_K<mmq_y, nwarps, need_check>, VDR_Q4_K_Q8_1_MMQ, vec_dot_q4_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4782:9: note: in instantiation of function template specialization 'mul_mat_q4_K<false>' requested here | |
mul_mat_q4_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual] | |
const block_q4_K * bx0 = (block_q4_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3673:9: note: in instantiation of function template specialization 'load_tiles_q4_K<64, 8, true>' requested here | |
load_tiles_q4_K<mmq_y, nwarps, need_check>, VDR_Q4_K_Q8_1_MMQ, vec_dot_q4_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4786:9: note: in instantiation of function template specialization 'mul_mat_q4_K<true>' requested here | |
mul_mat_q4_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual] | |
const block_q5_K * bx0 = (block_q5_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3707:9: note: in instantiation of function template specialization 'load_tiles_q5_K<64, 8, false>' requested here | |
load_tiles_q5_K<mmq_y, nwarps, need_check>, VDR_Q5_K_Q8_1_MMQ, vec_dot_q5_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4819:9: note: in instantiation of function template specialization 'mul_mat_q5_K<false>' requested here | |
mul_mat_q5_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual] | |
const block_q5_K * bx0 = (block_q5_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3707:9: note: in instantiation of function template specialization 'load_tiles_q5_K<64, 8, true>' requested here | |
load_tiles_q5_K<mmq_y, nwarps, need_check>, VDR_Q5_K_Q8_1_MMQ, vec_dot_q5_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4823:9: note: in instantiation of function template specialization 'mul_mat_q5_K<true>' requested here | |
mul_mat_q5_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual] | |
const block_q6_K * bx0 = (block_q6_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3745:9: note: in instantiation of function template specialization 'load_tiles_q6_K<64, 8, false>' requested here | |
load_tiles_q6_K<mmq_y, nwarps, need_check>, VDR_Q6_K_Q8_1_MMQ, vec_dot_q6_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4856:9: note: in instantiation of function template specialization 'mul_mat_q6_K<false>' requested here | |
mul_mat_q6_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual] | |
const block_q6_K * bx0 = (block_q6_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3745:9: note: in instantiation of function template specialization 'load_tiles_q6_K<64, 8, true>' requested here | |
load_tiles_q6_K<mmq_y, nwarps, need_check>, VDR_Q6_K_Q8_1_MMQ, vec_dot_q6_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4860:9: note: in instantiation of function template specialization 'mul_mat_q6_K<true>' requested here | |
mul_mat_q6_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
85 warnings generated when compiling for gfx900. | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:166:41: warning: cast from 'const signed char *' to 'unsigned short *' drops const qualifier [-Wcast-qual]
const uint16_t * x16 = (uint16_t *) (x8 + sizeof(int) * i32); // assume at least 2 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:176:41: warning: cast from 'const unsigned char *' to 'unsigned short *' drops const qualifier [-Wcast-qual]
const uint16_t * x16 = (uint16_t *) (x8 + sizeof(int) * i32); // assume at least 2 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:186:22: warning: cast from 'const signed char *' to 'int *' drops const qualifier [-Wcast-qual]
return *((int *) (x8 + sizeof(int) * i32)); // assume at least 4 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:190:22: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
return *((int *) (x8 + sizeof(int) * i32)); // assume at least 4 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2047:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2047:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual]
const block_q4_0 * bx0 = (block_q4_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2057:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2058:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2108:37: warning: cast from 'const __half2 *' to 'float *' drops const qualifier [-Wcast-qual]
const float * x_dmf = (float *) x_dm;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2104:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2104:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2141:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2141:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual]
const block_q4_1 * bx0 = (block_q4_1 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2151:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2152:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2195:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2195:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2233:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2233:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual]
const block_q5_0 * bx0 = (block_q5_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2243:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2244:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2307:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2307:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2347:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2347:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual]
const block_q5_1 * bx0 = (block_q5_1 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2357:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2358:24: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2418:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2418:125: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2453:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q8_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2453:129: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q8_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual] | |
const block_q8_0 * bx0 = (block_q8_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2463:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2464:24: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2508:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2508:125: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2542:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q2_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual] | |
const block_q2_K * bx0 = (block_q2_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2554:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2611:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual] | |
const block_q3_K * bx0 = (block_q3_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2767:41: warning: cast from 'const int *' to 'signed char *' drops const qualifier [-Wcast-qual] | |
const int8_t * scales = ((int8_t *) (x_sc + i * (WARP_SIZE/4) + i/4 + kbx*4)) + ky/4; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2881:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual] | |
const block_q4_K * bx0 = (block_q4_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2893:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2962:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3062:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual] | |
const block_q5_K * bx0 = (block_q5_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3074:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3154:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3191:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q6_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual] | |
const block_q6_K * bx0 = (block_q6_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3203:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3274:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:5693:72: warning: unused parameter 'i02' [-Wunused-parameter] | |
float * src0_ddf_i, float * src1_ddf_i, float * dst_ddf_i, int64_t i02, int64_t i01_low, int64_t i01_high, int i1, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual] | |
const block_q4_0 * bx0 = (block_q4_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3423:9: note: in instantiation of function template specialization 'load_tiles_q4_0<64, 8, false>' requested here | |
load_tiles_q4_0<mmq_y, nwarps, need_check>, VDR_Q4_0_Q8_1_MMQ, vec_dot_q4_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4520:9: note: in instantiation of function template specialization 'mul_mat_q4_0<false>' requested here | |
mul_mat_q4_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual] | |
const block_q4_0 * bx0 = (block_q4_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3423:9: note: in instantiation of function template specialization 'load_tiles_q4_0<64, 8, true>' requested here | |
load_tiles_q4_0<mmq_y, nwarps, need_check>, VDR_Q4_0_Q8_1_MMQ, vec_dot_q4_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4524:9: note: in instantiation of function template specialization 'mul_mat_q4_0<true>' requested here | |
mul_mat_q4_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual] | |
const block_q4_1 * bx0 = (block_q4_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3461:9: note: in instantiation of function template specialization 'load_tiles_q4_1<64, 8, false>' requested here | |
load_tiles_q4_1<mmq_y, nwarps, need_check>, VDR_Q4_1_Q8_1_MMQ, vec_dot_q4_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4557:9: note: in instantiation of function template specialization 'mul_mat_q4_1<false>' requested here | |
mul_mat_q4_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual] | |
const block_q4_1 * bx0 = (block_q4_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3461:9: note: in instantiation of function template specialization 'load_tiles_q4_1<64, 8, true>' requested here | |
load_tiles_q4_1<mmq_y, nwarps, need_check>, VDR_Q4_1_Q8_1_MMQ, vec_dot_q4_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4561:9: note: in instantiation of function template specialization 'mul_mat_q4_1<true>' requested here | |
mul_mat_q4_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual] | |
const block_q5_0 * bx0 = (block_q5_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3495:9: note: in instantiation of function template specialization 'load_tiles_q5_0<64, 8, false>' requested here | |
load_tiles_q5_0<mmq_y, nwarps, need_check>, VDR_Q5_0_Q8_1_MMQ, vec_dot_q5_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4594:9: note: in instantiation of function template specialization 'mul_mat_q5_0<false>' requested here | |
mul_mat_q5_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual] | |
const block_q5_0 * bx0 = (block_q5_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3495:9: note: in instantiation of function template specialization 'load_tiles_q5_0<64, 8, true>' requested here | |
load_tiles_q5_0<mmq_y, nwarps, need_check>, VDR_Q5_0_Q8_1_MMQ, vec_dot_q5_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4598:9: note: in instantiation of function template specialization 'mul_mat_q5_0<true>' requested here | |
mul_mat_q5_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual] | |
const block_q5_1 * bx0 = (block_q5_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3529:9: note: in instantiation of function template specialization 'load_tiles_q5_1<64, 8, false>' requested here | |
load_tiles_q5_1<mmq_y, nwarps, need_check>, VDR_Q5_1_Q8_1_MMQ, vec_dot_q5_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4631:9: note: in instantiation of function template specialization 'mul_mat_q5_1<false>' requested here | |
mul_mat_q5_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual] | |
const block_q5_1 * bx0 = (block_q5_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3529:9: note: in instantiation of function template specialization 'load_tiles_q5_1<64, 8, true>' requested here | |
load_tiles_q5_1<mmq_y, nwarps, need_check>, VDR_Q5_1_Q8_1_MMQ, vec_dot_q5_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4635:9: note: in instantiation of function template specialization 'mul_mat_q5_1<true>' requested here | |
mul_mat_q5_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual] | |
const block_q8_0 * bx0 = (block_q8_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3563:9: note: in instantiation of function template specialization 'load_tiles_q8_0<64, 8, false>' requested here | |
load_tiles_q8_0<mmq_y, nwarps, need_check>, VDR_Q8_0_Q8_1_MMQ, vec_dot_q8_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4668:9: note: in instantiation of function template specialization 'mul_mat_q8_0<false>' requested here | |
mul_mat_q8_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual] | |
const block_q8_0 * bx0 = (block_q8_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3563:9: note: in instantiation of function template specialization 'load_tiles_q8_0<64, 8, true>' requested here | |
load_tiles_q8_0<mmq_y, nwarps, need_check>, VDR_Q8_0_Q8_1_MMQ, vec_dot_q8_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4672:9: note: in instantiation of function template specialization 'mul_mat_q8_0<true>' requested here | |
mul_mat_q8_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual] | |
const block_q2_K * bx0 = (block_q2_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3597:9: note: in instantiation of function template specialization 'load_tiles_q2_K<64, 8, false>' requested here | |
load_tiles_q2_K<mmq_y, nwarps, need_check>, VDR_Q2_K_Q8_1_MMQ, vec_dot_q2_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4705:9: note: in instantiation of function template specialization 'mul_mat_q2_K<false>' requested here | |
mul_mat_q2_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual] | |
const block_q2_K * bx0 = (block_q2_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3597:9: note: in instantiation of function template specialization 'load_tiles_q2_K<64, 8, true>' requested here | |
load_tiles_q2_K<mmq_y, nwarps, need_check>, VDR_Q2_K_Q8_1_MMQ, vec_dot_q2_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4709:9: note: in instantiation of function template specialization 'mul_mat_q2_K<true>' requested here | |
mul_mat_q2_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual] | |
const block_q3_K * bx0 = (block_q3_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3635:9: note: in instantiation of function template specialization 'load_tiles_q3_K<64, 8, false>' requested here | |
load_tiles_q3_K<mmq_y, nwarps, need_check>, VDR_Q3_K_Q8_1_MMQ, vec_dot_q3_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4744:9: note: in instantiation of function template specialization 'mul_mat_q3_K<false>' requested here | |
mul_mat_q3_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual] | |
const block_q3_K * bx0 = (block_q3_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3635:9: note: in instantiation of function template specialization 'load_tiles_q3_K<64, 8, true>' requested here | |
load_tiles_q3_K<mmq_y, nwarps, need_check>, VDR_Q3_K_Q8_1_MMQ, vec_dot_q3_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4748:9: note: in instantiation of function template specialization 'mul_mat_q3_K<true>' requested here | |
mul_mat_q3_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual] | |
const block_q4_K * bx0 = (block_q4_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3673:9: note: in instantiation of function template specialization 'load_tiles_q4_K<64, 8, false>' requested here | |
load_tiles_q4_K<mmq_y, nwarps, need_check>, VDR_Q4_K_Q8_1_MMQ, vec_dot_q4_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4782:9: note: in instantiation of function template specialization 'mul_mat_q4_K<false>' requested here | |
mul_mat_q4_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual] | |
const block_q4_K * bx0 = (block_q4_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3673:9: note: in instantiation of function template specialization 'load_tiles_q4_K<64, 8, true>' requested here | |
load_tiles_q4_K<mmq_y, nwarps, need_check>, VDR_Q4_K_Q8_1_MMQ, vec_dot_q4_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4786:9: note: in instantiation of function template specialization 'mul_mat_q4_K<true>' requested here | |
mul_mat_q4_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual] | |
const block_q5_K * bx0 = (block_q5_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3707:9: note: in instantiation of function template specialization 'load_tiles_q5_K<64, 8, false>' requested here | |
load_tiles_q5_K<mmq_y, nwarps, need_check>, VDR_Q5_K_Q8_1_MMQ, vec_dot_q5_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4819:9: note: in instantiation of function template specialization 'mul_mat_q5_K<false>' requested here | |
mul_mat_q5_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual] | |
const block_q5_K * bx0 = (block_q5_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3707:9: note: in instantiation of function template specialization 'load_tiles_q5_K<64, 8, true>' requested here | |
load_tiles_q5_K<mmq_y, nwarps, need_check>, VDR_Q5_K_Q8_1_MMQ, vec_dot_q5_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4823:9: note: in instantiation of function template specialization 'mul_mat_q5_K<true>' requested here | |
mul_mat_q5_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual] | |
const block_q6_K * bx0 = (block_q6_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3745:9: note: in instantiation of function template specialization 'load_tiles_q6_K<64, 8, false>' requested here | |
load_tiles_q6_K<mmq_y, nwarps, need_check>, VDR_Q6_K_Q8_1_MMQ, vec_dot_q6_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4856:9: note: in instantiation of function template specialization 'mul_mat_q6_K<false>' requested here | |
mul_mat_q6_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual] | |
const block_q6_K * bx0 = (block_q6_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3745:9: note: in instantiation of function template specialization 'load_tiles_q6_K<64, 8, true>' requested here | |
load_tiles_q6_K<mmq_y, nwarps, need_check>, VDR_Q6_K_Q8_1_MMQ, vec_dot_q6_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4860:9: note: in instantiation of function template specialization 'mul_mat_q6_K<true>' requested here | |
mul_mat_q6_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
85 warnings generated when compiling for gfx906. | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:166:41: warning: cast from 'const signed char *' to 'unsigned short *' drops const qualifier [-Wcast-qual]
const uint16_t * x16 = (uint16_t *) (x8 + sizeof(int) * i32); // assume at least 2 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:176:41: warning: cast from 'const unsigned char *' to 'unsigned short *' drops const qualifier [-Wcast-qual]
const uint16_t * x16 = (uint16_t *) (x8 + sizeof(int) * i32); // assume at least 2 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:186:22: warning: cast from 'const signed char *' to 'int *' drops const qualifier [-Wcast-qual]
return *((int *) (x8 + sizeof(int) * i32)); // assume at least 4 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:190:22: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
return *((int *) (x8 + sizeof(int) * i32)); // assume at least 4 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2047:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2047:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual]
const block_q4_0 * bx0 = (block_q4_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2057:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2058:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2108:37: warning: cast from 'const __half2 *' to 'float *' drops const qualifier [-Wcast-qual]
const float * x_dmf = (float *) x_dm;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2104:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2104:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2141:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2141:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual]
const block_q4_1 * bx0 = (block_q4_1 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2151:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2152:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2195:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2195:125: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2233:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2233:129: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual] | |
const block_q5_0 * bx0 = (block_q5_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2243:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2244:24: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2307:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2307:125: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2347:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2347:129: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual] | |
const block_q5_1 * bx0 = (block_q5_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2357:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2358:24: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2418:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2418:125: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2453:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q8_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2453:129: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q8_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual] | |
const block_q8_0 * bx0 = (block_q8_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2463:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2464:24: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2508:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2508:125: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2542:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q2_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual] | |
const block_q2_K * bx0 = (block_q2_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2554:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2611:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual] | |
const block_q3_K * bx0 = (block_q3_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2767:41: warning: cast from 'const int *' to 'signed char *' drops const qualifier [-Wcast-qual] | |
const int8_t * scales = ((int8_t *) (x_sc + i * (WARP_SIZE/4) + i/4 + kbx*4)) + ky/4; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2881:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual] | |
const block_q4_K * bx0 = (block_q4_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2893:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2962:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3062:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual] | |
const block_q5_K * bx0 = (block_q5_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3074:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3154:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3191:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q6_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual] | |
const block_q6_K * bx0 = (block_q6_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3203:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3274:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:5693:72: warning: unused parameter 'i02' [-Wunused-parameter] | |
float * src0_ddf_i, float * src1_ddf_i, float * dst_ddf_i, int64_t i02, int64_t i01_low, int64_t i01_high, int i1, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual] | |
const block_q4_0 * bx0 = (block_q4_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3423:9: note: in instantiation of function template specialization 'load_tiles_q4_0<64, 8, false>' requested here | |
load_tiles_q4_0<mmq_y, nwarps, need_check>, VDR_Q4_0_Q8_1_MMQ, vec_dot_q4_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4520:9: note: in instantiation of function template specialization 'mul_mat_q4_0<false>' requested here | |
mul_mat_q4_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual] | |
const block_q4_0 * bx0 = (block_q4_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3423:9: note: in instantiation of function template specialization 'load_tiles_q4_0<64, 8, true>' requested here | |
load_tiles_q4_0<mmq_y, nwarps, need_check>, VDR_Q4_0_Q8_1_MMQ, vec_dot_q4_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4524:9: note: in instantiation of function template specialization 'mul_mat_q4_0<true>' requested here | |
mul_mat_q4_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual] | |
const block_q4_1 * bx0 = (block_q4_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3461:9: note: in instantiation of function template specialization 'load_tiles_q4_1<64, 8, false>' requested here | |
load_tiles_q4_1<mmq_y, nwarps, need_check>, VDR_Q4_1_Q8_1_MMQ, vec_dot_q4_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4557:9: note: in instantiation of function template specialization 'mul_mat_q4_1<false>' requested here | |
mul_mat_q4_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual] | |
const block_q4_1 * bx0 = (block_q4_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3461:9: note: in instantiation of function template specialization 'load_tiles_q4_1<64, 8, true>' requested here | |
load_tiles_q4_1<mmq_y, nwarps, need_check>, VDR_Q4_1_Q8_1_MMQ, vec_dot_q4_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4561:9: note: in instantiation of function template specialization 'mul_mat_q4_1<true>' requested here | |
mul_mat_q4_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual] | |
const block_q5_0 * bx0 = (block_q5_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3495:9: note: in instantiation of function template specialization 'load_tiles_q5_0<64, 8, false>' requested here | |
load_tiles_q5_0<mmq_y, nwarps, need_check>, VDR_Q5_0_Q8_1_MMQ, vec_dot_q5_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4594:9: note: in instantiation of function template specialization 'mul_mat_q5_0<false>' requested here | |
mul_mat_q5_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual] | |
const block_q5_0 * bx0 = (block_q5_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3495:9: note: in instantiation of function template specialization 'load_tiles_q5_0<64, 8, true>' requested here | |
load_tiles_q5_0<mmq_y, nwarps, need_check>, VDR_Q5_0_Q8_1_MMQ, vec_dot_q5_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4598:9: note: in instantiation of function template specialization 'mul_mat_q5_0<true>' requested here | |
mul_mat_q5_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual] | |
const block_q5_1 * bx0 = (block_q5_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3529:9: note: in instantiation of function template specialization 'load_tiles_q5_1<64, 8, false>' requested here | |
load_tiles_q5_1<mmq_y, nwarps, need_check>, VDR_Q5_1_Q8_1_MMQ, vec_dot_q5_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4631:9: note: in instantiation of function template specialization 'mul_mat_q5_1<false>' requested here | |
mul_mat_q5_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual] | |
const block_q5_1 * bx0 = (block_q5_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3529:9: note: in instantiation of function template specialization 'load_tiles_q5_1<64, 8, true>' requested here | |
load_tiles_q5_1<mmq_y, nwarps, need_check>, VDR_Q5_1_Q8_1_MMQ, vec_dot_q5_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4635:9: note: in instantiation of function template specialization 'mul_mat_q5_1<true>' requested here | |
mul_mat_q5_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual] | |
const block_q8_0 * bx0 = (block_q8_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3563:9: note: in instantiation of function template specialization 'load_tiles_q8_0<64, 8, false>' requested here | |
load_tiles_q8_0<mmq_y, nwarps, need_check>, VDR_Q8_0_Q8_1_MMQ, vec_dot_q8_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4668:9: note: in instantiation of function template specialization 'mul_mat_q8_0<false>' requested here | |
mul_mat_q8_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual] | |
const block_q8_0 * bx0 = (block_q8_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3563:9: note: in instantiation of function template specialization 'load_tiles_q8_0<64, 8, true>' requested here | |
load_tiles_q8_0<mmq_y, nwarps, need_check>, VDR_Q8_0_Q8_1_MMQ, vec_dot_q8_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4672:9: note: in instantiation of function template specialization 'mul_mat_q8_0<true>' requested here | |
mul_mat_q8_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual] | |
const block_q2_K * bx0 = (block_q2_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3597:9: note: in instantiation of function template specialization 'load_tiles_q2_K<64, 8, false>' requested here | |
load_tiles_q2_K<mmq_y, nwarps, need_check>, VDR_Q2_K_Q8_1_MMQ, vec_dot_q2_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4705:9: note: in instantiation of function template specialization 'mul_mat_q2_K<false>' requested here | |
mul_mat_q2_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual] | |
const block_q2_K * bx0 = (block_q2_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3597:9: note: in instantiation of function template specialization 'load_tiles_q2_K<64, 8, true>' requested here | |
load_tiles_q2_K<mmq_y, nwarps, need_check>, VDR_Q2_K_Q8_1_MMQ, vec_dot_q2_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4709:9: note: in instantiation of function template specialization 'mul_mat_q2_K<true>' requested here | |
mul_mat_q2_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual] | |
const block_q3_K * bx0 = (block_q3_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3635:9: note: in instantiation of function template specialization 'load_tiles_q3_K<64, 8, false>' requested here | |
load_tiles_q3_K<mmq_y, nwarps, need_check>, VDR_Q3_K_Q8_1_MMQ, vec_dot_q3_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4744:9: note: in instantiation of function template specialization 'mul_mat_q3_K<false>' requested here | |
mul_mat_q3_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual] | |
const block_q3_K * bx0 = (block_q3_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3635:9: note: in instantiation of function template specialization 'load_tiles_q3_K<64, 8, true>' requested here | |
load_tiles_q3_K<mmq_y, nwarps, need_check>, VDR_Q3_K_Q8_1_MMQ, vec_dot_q3_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4748:9: note: in instantiation of function template specialization 'mul_mat_q3_K<true>' requested here | |
mul_mat_q3_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual] | |
const block_q4_K * bx0 = (block_q4_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3673:9: note: in instantiation of function template specialization 'load_tiles_q4_K<64, 8, false>' requested here | |
load_tiles_q4_K<mmq_y, nwarps, need_check>, VDR_Q4_K_Q8_1_MMQ, vec_dot_q4_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4782:9: note: in instantiation of function template specialization 'mul_mat_q4_K<false>' requested here | |
mul_mat_q4_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual] | |
const block_q4_K * bx0 = (block_q4_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3673:9: note: in instantiation of function template specialization 'load_tiles_q4_K<64, 8, true>' requested here | |
load_tiles_q4_K<mmq_y, nwarps, need_check>, VDR_Q4_K_Q8_1_MMQ, vec_dot_q4_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4786:9: note: in instantiation of function template specialization 'mul_mat_q4_K<true>' requested here | |
mul_mat_q4_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual] | |
const block_q5_K * bx0 = (block_q5_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3707:9: note: in instantiation of function template specialization 'load_tiles_q5_K<64, 8, false>' requested here | |
load_tiles_q5_K<mmq_y, nwarps, need_check>, VDR_Q5_K_Q8_1_MMQ, vec_dot_q5_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4819:9: note: in instantiation of function template specialization 'mul_mat_q5_K<false>' requested here | |
mul_mat_q5_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual] | |
const block_q5_K * bx0 = (block_q5_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3707:9: note: in instantiation of function template specialization 'load_tiles_q5_K<64, 8, true>' requested here
load_tiles_q5_K<mmq_y, nwarps, need_check>, VDR_Q5_K_Q8_1_MMQ, vec_dot_q5_K_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4823:9: note: in instantiation of function template specialization 'mul_mat_q5_K<true>' requested here
mul_mat_q5_K<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
const int * scales = (int *) bxi->scales;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual]
const block_q6_K * bx0 = (block_q6_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3745:9: note: in instantiation of function template specialization 'load_tiles_q6_K<64, 8, false>' requested here
load_tiles_q6_K<mmq_y, nwarps, need_check>, VDR_Q6_K_Q8_1_MMQ, vec_dot_q6_K_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4856:9: note: in instantiation of function template specialization 'mul_mat_q6_K<false>' requested here
mul_mat_q6_K<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual]
const block_q6_K * bx0 = (block_q6_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3745:9: note: in instantiation of function template specialization 'load_tiles_q6_K<64, 8, true>' requested here
load_tiles_q6_K<mmq_y, nwarps, need_check>, VDR_Q6_K_Q8_1_MMQ, vec_dot_q6_K_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4860:9: note: in instantiation of function template specialization 'mul_mat_q6_K<true>' requested here
mul_mat_q6_K<need_check><<<block_nums, block_dims, 0, stream>>>
^
85 warnings generated when compiling for gfx908.
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:166:41: warning: cast from 'const signed char *' to 'unsigned short *' drops const qualifier [-Wcast-qual]
const uint16_t * x16 = (uint16_t *) (x8 + sizeof(int) * i32); // assume at least 2 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:176:41: warning: cast from 'const unsigned char *' to 'unsigned short *' drops const qualifier [-Wcast-qual]
const uint16_t * x16 = (uint16_t *) (x8 + sizeof(int) * i32); // assume at least 2 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:186:22: warning: cast from 'const signed char *' to 'int *' drops const qualifier [-Wcast-qual]
return *((int *) (x8 + sizeof(int) * i32)); // assume at least 4 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:190:22: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
return *((int *) (x8 + sizeof(int) * i32)); // assume at least 4 byte alignment
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2047:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2047:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual]
const block_q4_0 * bx0 = (block_q4_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2057:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2058:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2108:37: warning: cast from 'const __half2 *' to 'float *' drops const qualifier [-Wcast-qual]
const float * x_dmf = (float *) x_dm;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2104:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2104:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2141:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2141:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual]
const block_q4_1 * bx0 = (block_q4_1 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2151:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2152:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2195:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2195:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2233:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2233:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual]
const block_q5_0 * bx0 = (block_q5_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2243:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2244:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2307:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2307:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2347:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2347:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual]
const block_q5_1 * bx0 = (block_q5_1 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2357:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2358:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2418:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2418:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2453:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q8_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2453:129: warning: unused parameter 'x_sc' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q8_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual]
const block_q8_0 * bx0 = (block_q8_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2463:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2464:24: warning: unused parameter 'x_sc' [-Wunused-parameter]
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2508:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2508:125: warning: unused parameter 'x_sc' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2542:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q2_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual]
const block_q2_K * bx0 = (block_q2_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2554:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2611:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual]
const block_q3_K * bx0 = (block_q3_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2767:41: warning: cast from 'const int *' to 'signed char *' drops const qualifier [-Wcast-qual]
const int8_t * scales = ((int8_t *) (x_sc + i * (WARP_SIZE/4) + i/4 + kbx*4)) + ky/4;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2881:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual]
const block_q4_K * bx0 = (block_q4_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
const int * scales = (int *) bxi->scales;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2893:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2962:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3062:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual]
const block_q5_K * bx0 = (block_q5_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
const int * scales = (int *) bxi->scales;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3074:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3154:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3191:116: warning: unused parameter 'x_qh' [-Wunused-parameter]
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q6_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) {
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual]
const block_q6_K * bx0 = (block_q6_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3203:106: warning: unused parameter 'x_qh' [-Wunused-parameter]
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3274:94: warning: unused parameter 'x_qh' [-Wunused-parameter]
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:5693:72: warning: unused parameter 'i02' [-Wunused-parameter]
float * src0_ddf_i, float * src1_ddf_i, float * dst_ddf_i, int64_t i02, int64_t i01_low, int64_t i01_high, int i1,
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual]
const block_q4_0 * bx0 = (block_q4_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3423:9: note: in instantiation of function template specialization 'load_tiles_q4_0<64, 8, false>' requested here
load_tiles_q4_0<mmq_y, nwarps, need_check>, VDR_Q4_0_Q8_1_MMQ, vec_dot_q4_0_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4520:9: note: in instantiation of function template specialization 'mul_mat_q4_0<false>' requested here
mul_mat_q4_0<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual]
const block_q4_0 * bx0 = (block_q4_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3423:9: note: in instantiation of function template specialization 'load_tiles_q4_0<64, 8, true>' requested here
load_tiles_q4_0<mmq_y, nwarps, need_check>, VDR_Q4_0_Q8_1_MMQ, vec_dot_q4_0_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4524:9: note: in instantiation of function template specialization 'mul_mat_q4_0<true>' requested here
mul_mat_q4_0<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual]
const block_q4_1 * bx0 = (block_q4_1 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3461:9: note: in instantiation of function template specialization 'load_tiles_q4_1<64, 8, false>' requested here
load_tiles_q4_1<mmq_y, nwarps, need_check>, VDR_Q4_1_Q8_1_MMQ, vec_dot_q4_1_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4557:9: note: in instantiation of function template specialization 'mul_mat_q4_1<false>' requested here
mul_mat_q4_1<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual]
const block_q4_1 * bx0 = (block_q4_1 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3461:9: note: in instantiation of function template specialization 'load_tiles_q4_1<64, 8, true>' requested here
load_tiles_q4_1<mmq_y, nwarps, need_check>, VDR_Q4_1_Q8_1_MMQ, vec_dot_q4_1_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4561:9: note: in instantiation of function template specialization 'mul_mat_q4_1<true>' requested here
mul_mat_q4_1<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual]
const block_q5_0 * bx0 = (block_q5_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3495:9: note: in instantiation of function template specialization 'load_tiles_q5_0<64, 8, false>' requested here
load_tiles_q5_0<mmq_y, nwarps, need_check>, VDR_Q5_0_Q8_1_MMQ, vec_dot_q5_0_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4594:9: note: in instantiation of function template specialization 'mul_mat_q5_0<false>' requested here
mul_mat_q5_0<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual]
const block_q5_0 * bx0 = (block_q5_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3495:9: note: in instantiation of function template specialization 'load_tiles_q5_0<64, 8, true>' requested here
load_tiles_q5_0<mmq_y, nwarps, need_check>, VDR_Q5_0_Q8_1_MMQ, vec_dot_q5_0_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4598:9: note: in instantiation of function template specialization 'mul_mat_q5_0<true>' requested here
mul_mat_q5_0<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual]
const block_q5_1 * bx0 = (block_q5_1 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3529:9: note: in instantiation of function template specialization 'load_tiles_q5_1<64, 8, false>' requested here
load_tiles_q5_1<mmq_y, nwarps, need_check>, VDR_Q5_1_Q8_1_MMQ, vec_dot_q5_1_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4631:9: note: in instantiation of function template specialization 'mul_mat_q5_1<false>' requested here
mul_mat_q5_1<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual]
const block_q5_1 * bx0 = (block_q5_1 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3529:9: note: in instantiation of function template specialization 'load_tiles_q5_1<64, 8, true>' requested here
load_tiles_q5_1<mmq_y, nwarps, need_check>, VDR_Q5_1_Q8_1_MMQ, vec_dot_q5_1_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4635:9: note: in instantiation of function template specialization 'mul_mat_q5_1<true>' requested here
mul_mat_q5_1<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual]
const block_q8_0 * bx0 = (block_q8_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3563:9: note: in instantiation of function template specialization 'load_tiles_q8_0<64, 8, false>' requested here
load_tiles_q8_0<mmq_y, nwarps, need_check>, VDR_Q8_0_Q8_1_MMQ, vec_dot_q8_0_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4668:9: note: in instantiation of function template specialization 'mul_mat_q8_0<false>' requested here
mul_mat_q8_0<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual]
const block_q8_0 * bx0 = (block_q8_0 *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3563:9: note: in instantiation of function template specialization 'load_tiles_q8_0<64, 8, true>' requested here
load_tiles_q8_0<mmq_y, nwarps, need_check>, VDR_Q8_0_Q8_1_MMQ, vec_dot_q8_0_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4672:9: note: in instantiation of function template specialization 'mul_mat_q8_0<true>' requested here
mul_mat_q8_0<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual]
const block_q2_K * bx0 = (block_q2_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3597:9: note: in instantiation of function template specialization 'load_tiles_q2_K<64, 8, false>' requested here
load_tiles_q2_K<mmq_y, nwarps, need_check>, VDR_Q2_K_Q8_1_MMQ, vec_dot_q2_K_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4705:9: note: in instantiation of function template specialization 'mul_mat_q2_K<false>' requested here
mul_mat_q2_K<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual]
const block_q2_K * bx0 = (block_q2_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3597:9: note: in instantiation of function template specialization 'load_tiles_q2_K<64, 8, true>' requested here
load_tiles_q2_K<mmq_y, nwarps, need_check>, VDR_Q2_K_Q8_1_MMQ, vec_dot_q2_K_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4709:9: note: in instantiation of function template specialization 'mul_mat_q2_K<true>' requested here
mul_mat_q2_K<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual]
const block_q3_K * bx0 = (block_q3_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3635:9: note: in instantiation of function template specialization 'load_tiles_q3_K<64, 8, false>' requested here
load_tiles_q3_K<mmq_y, nwarps, need_check>, VDR_Q3_K_Q8_1_MMQ, vec_dot_q3_K_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4744:9: note: in instantiation of function template specialization 'mul_mat_q3_K<false>' requested here
mul_mat_q3_K<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual]
const block_q3_K * bx0 = (block_q3_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3635:9: note: in instantiation of function template specialization 'load_tiles_q3_K<64, 8, true>' requested here
load_tiles_q3_K<mmq_y, nwarps, need_check>, VDR_Q3_K_Q8_1_MMQ, vec_dot_q3_K_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4748:9: note: in instantiation of function template specialization 'mul_mat_q3_K<true>' requested here
mul_mat_q3_K<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual]
const block_q4_K * bx0 = (block_q4_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3673:9: note: in instantiation of function template specialization 'load_tiles_q4_K<64, 8, false>' requested here
load_tiles_q4_K<mmq_y, nwarps, need_check>, VDR_Q4_K_Q8_1_MMQ, vec_dot_q4_K_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4782:9: note: in instantiation of function template specialization 'mul_mat_q4_K<false>' requested here
mul_mat_q4_K<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
const int * scales = (int *) bxi->scales;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual]
const block_q4_K * bx0 = (block_q4_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3673:9: note: in instantiation of function template specialization 'load_tiles_q4_K<64, 8, true>' requested here
load_tiles_q4_K<mmq_y, nwarps, need_check>, VDR_Q4_K_Q8_1_MMQ, vec_dot_q4_K_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4786:9: note: in instantiation of function template specialization 'mul_mat_q4_K<true>' requested here
mul_mat_q4_K<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
const int * scales = (int *) bxi->scales;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual]
const block_q5_K * bx0 = (block_q5_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3707:9: note: in instantiation of function template specialization 'load_tiles_q5_K<64, 8, false>' requested here
load_tiles_q5_K<mmq_y, nwarps, need_check>, VDR_Q5_K_Q8_1_MMQ, vec_dot_q5_K_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4819:9: note: in instantiation of function template specialization 'mul_mat_q5_K<false>' requested here
mul_mat_q5_K<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
const int * scales = (int *) bxi->scales;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual]
const block_q5_K * bx0 = (block_q5_K *) vx;
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3707:9: note: in instantiation of function template specialization 'load_tiles_q5_K<64, 8, true>' requested here
load_tiles_q5_K<mmq_y, nwarps, need_check>, VDR_Q5_K_Q8_1_MMQ, vec_dot_q5_K_q8_1_mul_mat>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4823:9: note: in instantiation of function template specialization 'mul_mat_q5_K<true>' requested here
mul_mat_q5_K<need_check><<<block_nums, block_dims, 0, stream>>>
^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual] | |
const block_q6_K * bx0 = (block_q6_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3745:9: note: in instantiation of function template specialization 'load_tiles_q6_K<64, 8, false>' requested here | |
load_tiles_q6_K<mmq_y, nwarps, need_check>, VDR_Q6_K_Q8_1_MMQ, vec_dot_q6_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4856:9: note: in instantiation of function template specialization 'mul_mat_q6_K<false>' requested here | |
mul_mat_q6_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual] | |
const block_q6_K * bx0 = (block_q6_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3745:9: note: in instantiation of function template specialization 'load_tiles_q6_K<64, 8, true>' requested here | |
load_tiles_q6_K<mmq_y, nwarps, need_check>, VDR_Q6_K_Q8_1_MMQ, vec_dot_q6_K_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4860:9: note: in instantiation of function template specialization 'mul_mat_q6_K<true>' requested here | |
mul_mat_q6_K<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
85 warnings generated when compiling for gfx90a. | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:166:41: warning: cast from 'const signed char *' to 'unsigned short *' drops const qualifier [-Wcast-qual] | |
const uint16_t * x16 = (uint16_t *) (x8 + sizeof(int) * i32); // assume at least 2 byte alignment | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:176:41: warning: cast from 'const unsigned char *' to 'unsigned short *' drops const qualifier [-Wcast-qual] | |
const uint16_t * x16 = (uint16_t *) (x8 + sizeof(int) * i32); // assume at least 2 byte alignment | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:186:22: warning: cast from 'const signed char *' to 'int *' drops const qualifier [-Wcast-qual] | |
return *((int *) (x8 + sizeof(int) * i32)); // assume at least 4 byte alignment | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:190:22: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
return *((int *) (x8 + sizeof(int) * i32)); // assume at least 4 byte alignment | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2047:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2047:129: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual] | |
const block_q4_0 * bx0 = (block_q4_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2057:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2058:24: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2108:37: warning: cast from 'const __half2 *' to 'float *' drops const qualifier [-Wcast-qual] | |
const float * x_dmf = (float *) x_dm; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2104:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2104:125: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2141:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2141:129: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual] | |
const block_q4_1 * bx0 = (block_q4_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2151:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2152:24: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2195:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2195:125: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2233:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2233:129: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual] | |
const block_q5_0 * bx0 = (block_q5_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2243:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2244:24: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2307:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2307:125: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2347:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2347:129: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_1(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual] | |
const block_q5_1 * bx0 = (block_q5_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2357:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2358:24: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2418:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2418:125: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2453:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q8_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2453:129: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q8_0(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual] | |
const block_q8_0 * bx0 = (block_q8_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2463:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2464:24: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
int * __restrict__ x_sc, const int & i_offset, const int & i_max, const int & k, const int & blocks_per_row) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2508:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2508:125: warning: unused parameter 'x_sc' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2542:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q2_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual] | |
const block_q2_K * bx0 = (block_q2_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2554:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2611:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual] | |
const block_q3_K * bx0 = (block_q3_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2767:41: warning: cast from 'const int *' to 'signed char *' drops const qualifier [-Wcast-qual] | |
const int8_t * scales = ((int8_t *) (x_sc + i * (WARP_SIZE/4) + i/4 + kbx*4)) + ky/4; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2881:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q4_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual] | |
const block_q4_K * bx0 = (block_q4_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2893:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2962:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3062:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q5_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual] | |
const block_q5_K * bx0 = (block_q5_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual] | |
const int * scales = (int *) bxi->scales; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3074:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3154:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3191:116: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
template <int mmq_y> static __device__ __forceinline__ void allocate_tiles_q6_K(int ** x_ql, half2 ** x_dm, int ** x_qh, int ** x_sc) { | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual] | |
const block_q6_K * bx0 = (block_q6_K *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3203:106: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const void * __restrict__ vx, int * __restrict__ x_ql, half2 * __restrict__ x_dm, int * __restrict__ x_qh, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3274:94: warning: unused parameter 'x_qh' [-Wunused-parameter] | |
const int * __restrict__ x_ql, const half2 * __restrict__ x_dm, const int * __restrict__ x_qh, const int * __restrict__ x_sc, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:5693:72: warning: unused parameter 'i02' [-Wunused-parameter] | |
float * src0_ddf_i, float * src1_ddf_i, float * dst_ddf_i, int64_t i02, int64_t i01_low, int64_t i01_high, int i1, | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual] | |
const block_q4_0 * bx0 = (block_q4_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3423:9: note: in instantiation of function template specialization 'load_tiles_q4_0<64, 8, false>' requested here | |
load_tiles_q4_0<mmq_y, nwarps, need_check>, VDR_Q4_0_Q8_1_MMQ, vec_dot_q4_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4520:9: note: in instantiation of function template specialization 'mul_mat_q4_0<false>' requested here | |
mul_mat_q4_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2068:45: warning: cast from 'const void *' to 'block_q4_0 *' drops const qualifier [-Wcast-qual] | |
const block_q4_0 * bx0 = (block_q4_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3423:9: note: in instantiation of function template specialization 'load_tiles_q4_0<64, 8, true>' requested here | |
load_tiles_q4_0<mmq_y, nwarps, need_check>, VDR_Q4_0_Q8_1_MMQ, vec_dot_q4_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4524:9: note: in instantiation of function template specialization 'mul_mat_q4_0<true>' requested here | |
mul_mat_q4_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual] | |
const block_q4_1 * bx0 = (block_q4_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3461:9: note: in instantiation of function template specialization 'load_tiles_q4_1<64, 8, false>' requested here | |
load_tiles_q4_1<mmq_y, nwarps, need_check>, VDR_Q4_1_Q8_1_MMQ, vec_dot_q4_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4557:9: note: in instantiation of function template specialization 'mul_mat_q4_1<false>' requested here | |
mul_mat_q4_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2162:45: warning: cast from 'const void *' to 'block_q4_1 *' drops const qualifier [-Wcast-qual] | |
const block_q4_1 * bx0 = (block_q4_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3461:9: note: in instantiation of function template specialization 'load_tiles_q4_1<64, 8, true>' requested here | |
load_tiles_q4_1<mmq_y, nwarps, need_check>, VDR_Q4_1_Q8_1_MMQ, vec_dot_q4_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4561:9: note: in instantiation of function template specialization 'mul_mat_q4_1<true>' requested here | |
mul_mat_q4_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual] | |
const block_q5_0 * bx0 = (block_q5_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3495:9: note: in instantiation of function template specialization 'load_tiles_q5_0<64, 8, false>' requested here | |
load_tiles_q5_0<mmq_y, nwarps, need_check>, VDR_Q5_0_Q8_1_MMQ, vec_dot_q5_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4594:9: note: in instantiation of function template specialization 'mul_mat_q5_0<false>' requested here | |
mul_mat_q5_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2254:45: warning: cast from 'const void *' to 'block_q5_0 *' drops const qualifier [-Wcast-qual] | |
const block_q5_0 * bx0 = (block_q5_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3495:9: note: in instantiation of function template specialization 'load_tiles_q5_0<64, 8, true>' requested here | |
load_tiles_q5_0<mmq_y, nwarps, need_check>, VDR_Q5_0_Q8_1_MMQ, vec_dot_q5_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4598:9: note: in instantiation of function template specialization 'mul_mat_q5_0<true>' requested here | |
mul_mat_q5_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual] | |
const block_q5_1 * bx0 = (block_q5_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3529:9: note: in instantiation of function template specialization 'load_tiles_q5_1<64, 8, false>' requested here | |
load_tiles_q5_1<mmq_y, nwarps, need_check>, VDR_Q5_1_Q8_1_MMQ, vec_dot_q5_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4631:9: note: in instantiation of function template specialization 'mul_mat_q5_1<false>' requested here | |
mul_mat_q5_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2368:45: warning: cast from 'const void *' to 'block_q5_1 *' drops const qualifier [-Wcast-qual] | |
const block_q5_1 * bx0 = (block_q5_1 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3529:9: note: in instantiation of function template specialization 'load_tiles_q5_1<64, 8, true>' requested here | |
load_tiles_q5_1<mmq_y, nwarps, need_check>, VDR_Q5_1_Q8_1_MMQ, vec_dot_q5_1_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4635:9: note: in instantiation of function template specialization 'mul_mat_q5_1<true>' requested here | |
mul_mat_q5_1<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual] | |
const block_q8_0 * bx0 = (block_q8_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3563:9: note: in instantiation of function template specialization 'load_tiles_q8_0<64, 8, false>' requested here | |
load_tiles_q8_0<mmq_y, nwarps, need_check>, VDR_Q8_0_Q8_1_MMQ, vec_dot_q8_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4668:9: note: in instantiation of function template specialization 'mul_mat_q8_0<false>' requested here | |
mul_mat_q8_0<need_check><<<block_nums, block_dims, 0, stream>>> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2475:45: warning: cast from 'const void *' to 'block_q8_0 *' drops const qualifier [-Wcast-qual] | |
const block_q8_0 * bx0 = (block_q8_0 *) vx; | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3563:9: note: in instantiation of function template specialization 'load_tiles_q8_0<64, 8, true>' requested here | |
load_tiles_q8_0<mmq_y, nwarps, need_check>, VDR_Q8_0_Q8_1_MMQ, vec_dot_q8_0_q8_1_mul_mat> | |
^ | |
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4672:9: note: in instantiation of function template specialization 'mul_mat_q8_0<true>' requested here
        mul_mat_q8_0<need_check><<<block_nums, block_dims, 0, stream>>>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual]
    const block_q2_K * bx0 = (block_q2_K *) vx;
                                            ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3597:9: note: in instantiation of function template specialization 'load_tiles_q2_K<64, 8, false>' requested here
        load_tiles_q2_K<mmq_y, nwarps, need_check>, VDR_Q2_K_Q8_1_MMQ, vec_dot_q2_K_q8_1_mul_mat>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4705:9: note: in instantiation of function template specialization 'mul_mat_q2_K<false>' requested here
        mul_mat_q2_K<need_check><<<block_nums, block_dims, 0, stream>>>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2565:45: warning: cast from 'const void *' to 'block_q2_K *' drops const qualifier [-Wcast-qual]
    const block_q2_K * bx0 = (block_q2_K *) vx;
                                            ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3597:9: note: in instantiation of function template specialization 'load_tiles_q2_K<64, 8, true>' requested here
        load_tiles_q2_K<mmq_y, nwarps, need_check>, VDR_Q2_K_Q8_1_MMQ, vec_dot_q2_K_q8_1_mul_mat>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4709:9: note: in instantiation of function template specialization 'mul_mat_q2_K<true>' requested here
        mul_mat_q2_K<need_check><<<block_nums, block_dims, 0, stream>>>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual]
    const block_q3_K * bx0 = (block_q3_K *) vx;
                                            ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3635:9: note: in instantiation of function template specialization 'load_tiles_q3_K<64, 8, false>' requested here
        load_tiles_q3_K<mmq_y, nwarps, need_check>, VDR_Q3_K_Q8_1_MMQ, vec_dot_q3_K_q8_1_mul_mat>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4744:9: note: in instantiation of function template specialization 'mul_mat_q3_K<false>' requested here
        mul_mat_q3_K<need_check><<<block_nums, block_dims, 0, stream>>>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2686:45: warning: cast from 'const void *' to 'block_q3_K *' drops const qualifier [-Wcast-qual]
    const block_q3_K * bx0 = (block_q3_K *) vx;
                                            ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3635:9: note: in instantiation of function template specialization 'load_tiles_q3_K<64, 8, true>' requested here
        load_tiles_q3_K<mmq_y, nwarps, need_check>, VDR_Q3_K_Q8_1_MMQ, vec_dot_q3_K_q8_1_mul_mat>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4748:9: note: in instantiation of function template specialization 'mul_mat_q3_K<true>' requested here
        mul_mat_q3_K<need_check><<<block_nums, block_dims, 0, stream>>>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual]
    const block_q4_K * bx0 = (block_q4_K *) vx;
                                            ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3673:9: note: in instantiation of function template specialization 'load_tiles_q4_K<64, 8, false>' requested here
        load_tiles_q4_K<mmq_y, nwarps, need_check>, VDR_Q4_K_Q8_1_MMQ, vec_dot_q4_K_q8_1_mul_mat>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4782:9: note: in instantiation of function template specialization 'mul_mat_q4_K<false>' requested here
        mul_mat_q4_K<need_check><<<block_nums, block_dims, 0, stream>>>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
        const int * scales = (int *) bxi->scales;
                                     ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2904:45: warning: cast from 'const void *' to 'block_q4_K *' drops const qualifier [-Wcast-qual]
    const block_q4_K * bx0 = (block_q4_K *) vx;
                                            ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3673:9: note: in instantiation of function template specialization 'load_tiles_q4_K<64, 8, true>' requested here
        load_tiles_q4_K<mmq_y, nwarps, need_check>, VDR_Q4_K_Q8_1_MMQ, vec_dot_q4_K_q8_1_mul_mat>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4786:9: note: in instantiation of function template specialization 'mul_mat_q4_K<true>' requested here
        mul_mat_q4_K<need_check><<<block_nums, block_dims, 0, stream>>>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:2949:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
        const int * scales = (int *) bxi->scales;
                                     ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual]
    const block_q5_K * bx0 = (block_q5_K *) vx;
                                            ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3707:9: note: in instantiation of function template specialization 'load_tiles_q5_K<64, 8, false>' requested here
        load_tiles_q5_K<mmq_y, nwarps, need_check>, VDR_Q5_K_Q8_1_MMQ, vec_dot_q5_K_q8_1_mul_mat>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4819:9: note: in instantiation of function template specialization 'mul_mat_q5_K<false>' requested here
        mul_mat_q5_K<need_check><<<block_nums, block_dims, 0, stream>>>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
        const int * scales = (int *) bxi->scales;
                                     ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3085:45: warning: cast from 'const void *' to 'block_q5_K *' drops const qualifier [-Wcast-qual]
    const block_q5_K * bx0 = (block_q5_K *) vx;
                                            ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3707:9: note: in instantiation of function template specialization 'load_tiles_q5_K<64, 8, true>' requested here
        load_tiles_q5_K<mmq_y, nwarps, need_check>, VDR_Q5_K_Q8_1_MMQ, vec_dot_q5_K_q8_1_mul_mat>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4823:9: note: in instantiation of function template specialization 'mul_mat_q5_K<true>' requested here
        mul_mat_q5_K<need_check><<<block_nums, block_dims, 0, stream>>>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3141:38: warning: cast from 'const unsigned char *' to 'int *' drops const qualifier [-Wcast-qual]
        const int * scales = (int *) bxi->scales;
                                     ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual]
    const block_q6_K * bx0 = (block_q6_K *) vx;
                                            ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3745:9: note: in instantiation of function template specialization 'load_tiles_q6_K<64, 8, false>' requested here
        load_tiles_q6_K<mmq_y, nwarps, need_check>, VDR_Q6_K_Q8_1_MMQ, vec_dot_q6_K_q8_1_mul_mat>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4856:9: note: in instantiation of function template specialization 'mul_mat_q6_K<false>' requested here
        mul_mat_q6_K<need_check><<<block_nums, block_dims, 0, stream>>>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3214:45: warning: cast from 'const void *' to 'block_q6_K *' drops const qualifier [-Wcast-qual]
    const block_q6_K * bx0 = (block_q6_K *) vx;
                                            ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:3745:9: note: in instantiation of function template specialization 'load_tiles_q6_K<64, 8, true>' requested here
        load_tiles_q6_K<mmq_y, nwarps, need_check>, VDR_Q6_K_Q8_1_MMQ, vec_dot_q6_K_q8_1_mul_mat>
        ^
/tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/models/ggml/ggml-cuda.cu:4860:9: note: in instantiation of function template specialization 'mul_mat_q6_K<true>' requested here
        mul_mat_q6_K<need_check><<<block_nums, block_dims, 0, stream>>>
        ^
85 warnings generated when compiling for host.
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/tmp/pip-build-env-v3f_z5es/overlay/local/lib/python3.10/dist-packages/skbuild/setuptools_wrap.py", line 674, in setup
    cmkr.make(make_args, install_target=cmake_install_target, env=env)
  File "/tmp/pip-build-env-v3f_z5es/overlay/local/lib/python3.10/dist-packages/skbuild/cmaker.py", line 697, in make
    self.make_impl(clargs=clargs, config=config, source_dir=source_dir, install_target=install_target, env=env)
  File "/tmp/pip-build-env-v3f_z5es/overlay/local/lib/python3.10/dist-packages/skbuild/cmaker.py", line 742, in make_impl
    raise SKBuildError(msg)
An error occurred while building with CMake.
  Command:
    /tmp/pip-build-env-v3f_z5es/overlay/local/lib/python3.10/dist-packages/cmake/data/bin/cmake --build . --target install --config Release --
  Install target:
    install
  Source directory:
    /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5
  Working directory:
    /tmp/pip-install-7cb4q_dn/ctransformers_4e4f626657934362ba44b5b0332d47c5/_skbuild/linux-x86_64-3.10/cmake-build
Please check the install target is valid and see CMake's output for more information.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for ctransformers
Failed to build ctransformers
ERROR: Could not build wheels for ctransformers, which is required to install pyproject.toml-based projects