Abacus.ai:

We recently released Smaug-72B-v0.1, which has taken first place on the Open LLM Leaderboard by HuggingFace. It is the first open-source model to have an average score above 80.

  • Miss Brainfarts@lemmy.blahaj.zone
    9 months ago

    I’m currently playing around with the Jan client, which uses the nitro engine. I think I need to read up on it more, because when I set the ngl value to 15 in order to offload 50% to GPU like the Jan guide says, nothing happens. Though that could be an issue specific to Jan.
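For reference, the link between a GPU share and an `ngl` value can be sketched as simple arithmetic. This assumes `ngl` counts whole model layers to offload (as with llama.cpp's `--n-gpu-layers` option) and that the model has 30 layers in total. The layer count and the helper name `ngl_for_fraction` are illustrative assumptions, not taken from the Jan guide.

```python
def ngl_for_fraction(total_layers: int, gpu_fraction: float) -> int:
    """Return how many whole layers to offload for a desired GPU share."""
    if not 0.0 <= gpu_fraction <= 1.0:
        raise ValueError("gpu_fraction must be between 0 and 1")
    # Truncate so the result never exceeds the requested share.
    return int(total_layers * gpu_fraction)


# Hypothetical 30-layer model with half of the layers on the GPU.
print(ngl_for_fraction(30, 0.5))  # 15
```

If VRAM usage stays flat with a value like this, the number itself is probably not the problem, and the model format or the client is the next thing to check.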

    • Fisch@lemmy.ml
      9 months ago

      Maybe 50% GPU is already using too much VRAM and it crashes. You could try to set it to 0% GPU and see if that works.

      • Miss Brainfarts@lemmy.blahaj.zone
        9 months ago

        I may need to lower it a bit more, yeah. Though when I try to use offloading, I can see that VRAM usage doesn’t increase at all.

        When I leave the setting at its default value of 100, on the other hand, I see VRAM usage climb until it stops because there isn’t enough of it.

        So I guess not all models support offloading?

        • General_Effort@lemmy.world
          9 months ago

          Most formats don’t support it. It has to be in GGUF format, afaik. You can usually find a conversion on Hugging Face. Prefer offerings by TheBloke for the detailed documentation, if nothing else.

        • Fisch@lemmy.ml
          9 months ago

          The models you have should be .gguf files, right? I think those are the only ones where that’s supported.
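
A quick way to check whether a downloaded model really is a GGUF file, regardless of its extension, is to look at its first four bytes, which are the ASCII characters `GGUF` in that format. The sketch below is a minimal check only; the helper name `is_gguf` and the example path are hypothetical, and it does not validate the rest of the file.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def is_gguf(path: str) -> bool:
    """Return True if the file starts with the GGUF magic bytes."""
    with Path(path).open("rb") as f:
        return f.read(4) == GGUF_MAGIC
```

Usage would look like `is_gguf("models/my-model.gguf")`, where the path is a placeholder for wherever the client stores its models.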