A virtual assistant is a software agent that can perform tasks or services for an individual based on commands or questions.
Of the open source solutions, Kalliope is the one I've liked most. I've also looked at Mycroft, but it seems less oriented to self-hosted setups, although self-hosting it is possible. Mycroft has a bigger community behind it, though.
To interact with it I may start with the Android app, but later I'll probably install a Raspberry Pi Zero with Pirate Audio and an Akaso external mic in the kitchen to speed up the Grocy inventory management.
The only self-hosted Speech-To-Text (STT) solution available in Kalliope right now is CMUSphinx, which is based on pocketsphinx. That project has 2.8k stars, but its last update was on the 28th of March 2020.
The CMUSphinx documentation suggests using Vosk instead, which is based on vosk-api, has 1.2k stars and was last updated 2 days ago. There is an open issue to support it in Kalliope that already has a French proposal.
That led me to the issue to support DeepSpeech, Mozilla's STT solution, which has 16.5k stars and was updated 3 days ago, so in my opinion it would be the way to go if the existing one fails. Right now there is no support, but this would be the place to start. For Spanish, following the Mozilla Discourse thread I arrived at DeepSpeech-Polyglot, which has taken many datasets, such as the Common Voice one, and generated the models from them.
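To get a feeling for what the Vosk option mentioned above would involve before Kalliope supports it, here is a minimal sketch of offline transcription with the `vosk` Python package. It is not Kalliope's integration, just a standalone test. I'm assuming a mono 16-bit PCM WAV recording and a model directory already downloaded from the Vosk site; the helper names `extract_text`, `join_segments` and `transcribe` are mine.

```python
import json
import wave


def extract_text(result_json: str) -> str:
    """Pull the recognised text out of one Vosk JSON result string."""
    return json.loads(result_json).get("text", "").strip()


def join_segments(segments: list[str]) -> str:
    """Join the non-empty utterance segments into a single transcript."""
    return " ".join(s for s in segments if s)


def transcribe(wav_path: str, model_path: str) -> str:
    """Transcribe a mono 16-bit PCM WAV file fully offline."""
    # Imported here so the helpers above work without vosk installed.
    from vosk import KaldiRecognizer, Model

    model = Model(model_path)
    segments = []
    with wave.open(wav_path, "rb") as wav:
        recognizer = KaldiRecognizer(model, wav.getframerate())
        while True:
            chunk = wav.readframes(4000)
            if not chunk:
                break
            # AcceptWaveform returns True when an utterance has ended.
            if recognizer.AcceptWaveform(chunk):
                segments.append(extract_text(recognizer.Result()))
        # Flush whatever is still buffered after the last chunk.
        segments.append(extract_text(recognizer.FinalResult()))
    return join_segments(segments)
```

Calling something like `transcribe("order.wav", "model")` (both paths are placeholders) should return the recognised sentence as plain text, which would be enough to check how well a given language model copes with kitchen noise before investing in the Kalliope side.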