
@martindevans
Created October 30, 2023 21:43
using System;
using System.Diagnostics;
using System.Linq;
using LLama;
using LLama.Common;
using LLama.Native;

var modelParams = new ModelParams(@"C:\Users\Martin\Documents\Python\oobabooga_windows\text-generation-webui\models\llama-2-7b-chat.Q5_K_M.gguf");
using var weights = LLamaWeights.LoadFromFile(modelParams);
using var ctx = weights.CreateContext(modelParams);

// Evaluate 100 arbitrary token IDs as a single batch and time the call.
var tokensArr = Enumerable.Range(1, 100).ToArray();

unsafe
{
    fixed (int* tokens = tokensArr)
    {
        var timer = new Stopwatch();
        timer.Start();

        // n_past = 0: evaluate the whole batch from the start of the context.
        if (NativeApi.llama_eval(ctx.NativeHandle, tokens, tokensArr.Length, 0) != 0)
            throw new InvalidOperationException("llama_eval failed");

        timer.Stop();
        Console.WriteLine($"Batch ({tokensArr.Length}): {timer.ElapsedMilliseconds}ms");
    }
}
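For comparison, here is a sketch of timing the same tokens evaluated one per call, with `n_past` advancing each iteration. This assumes the `ctx` and `tokensArr` from above; note that a fair benchmark would use a fresh context, since the batch call above has already populated the KV cache:

```csharp
unsafe
{
    fixed (int* tokens = tokensArr)
    {
        var timer = new Stopwatch();
        timer.Start();

        // One token per llama_eval call; i doubles as n_past (tokens already evaluated).
        for (var i = 0; i < tokensArr.Length; i++)
        {
            if (NativeApi.llama_eval(ctx.NativeHandle, tokens + i, 1, i) != 0)
                throw new InvalidOperationException($"llama_eval failed at token {i}");
        }

        timer.Stop();
        Console.WriteLine($"Sequential ({tokensArr.Length}): {timer.ElapsedMilliseconds}ms");
    }
}
```

Sequential evaluation is typically far slower than one batched call, since each call pays the full kernel-launch and synchronization overhead for a single token.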