• ikidd@lemmy.world
    11 months ago

    Using the Voice Assist pipeline via the HASS cloud subscription works a heck of a lot better than running it locally. Locally it takes about 15 seconds to respond; via the Nabu Casa server it’s about 1 second. I’ve considered dedicating a separate box to the containers it spins up for this, to get a faster response.

    • Blackmist@feddit.uk
      11 months ago

      What hardware is it running on that takes 15 seconds? I’ve not actually tried it myself as I’ve got a poor little RPi 3, and I don’t want to scare it.

      • ikidd@lemmy.world
        11 months ago

        The M5Stack Atom Echo. The hardware is the same either way; the delay comes from which pipeline you select in the back end. You can run the Whisper stack on the same machine or on another box on your network, but I think you’d want a good GPU on it to offload the NL processing to. Which is probably what happens when you’re using the Nabu Casa pipeline.

    • bitwolf@lemmy.one
      11 months ago

      Do you think throwing a Coral TPU on there would help?

      I saw it helps a ton with Frigate facial recognition.
      I was planning to do that on my Yellow once I can get the display thing that’s pictured in the article.

      • ikidd@lemmy.world
        11 months ago

        Idk if any LLMs are set up to run on anything except GPUs; it’s an interesting question.