azayrahmad Posted January 10

I think speech synthesis technology has improved significantly by now. I'm not sure about open-source alternatives, but sites like https://elevenlabs.io/ provide free speech synthesis for a limited number of words. This could be very useful for voicing narration, especially tutorials.

I have attached an example excerpt that I generated on the website from the 0 A.D. tutorial text, using one of the pre-generated voices from the site's voice library:

Quote
Select the Civic Center again and advance to Town Phase by clicking on the pause icon. You have to wait for the barracks to be built first. This will allow Town Phase buildings to be constructed.

Anyone familiar with the original Rome Total War will perhaps recognize the voice, and I suspect that this particular voice was trained on audio files from that game, so there may be a legal issue with it. But it also means that we could train our own unique voices based on audio files already in the game.

So, what do you think?

ElevenLabs_2024-01-10T08 15 35_Victoria - Classy British Mature Woman Voice _gen_s66_sb75_se0_m2.mp3
Stan` Posted January 10

The problem is that the existing voices wouldn't be enough to train a model. So until legislation is in force, I'd be really careful about using AI in anything other than mods.
alre Posted January 10

There are open-source text-to-speech models: https://huggingface.co/models?pipeline_tag=text-to-speech&sort=trending
hyperion Posted January 10

I suppose TTS would have to be done similarly to how PNG-to-DDS conversion with nvtt is done currently, and not by using some web service. And I strongly agree with @Stan` about being careful with models, as pirated training data is pretty much standard.
Obelix Posted February 3

I'm thinking about how much bigger the installation file would be if we were to add all these audio files, or a TTS model. Does anyone have a clue?
hyperion Posted February 3

2 hours ago, Obelix said:
I'm thinking about how much bigger the installation file would be if we were to add all these audio files-or a tts model. Does anyone have a clue?

A TTS model is typically less than 100 MB but is likely limited to a single language, so one might not be enough. The output could be generated and bundled while creating the mod if performance requires it. The size of the generated content obviously depends on the total amount of text spoken and the quality/bitrate of the encoding; it might come to another 100 MB per hour. Splitting into add-on mods is possible.

However, this would be a larger project (not just TTS but the whole integration of speech), and I don't think it will be coming anytime soon, though for immersion it would be nice to have. Waiting for video support.
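
As a rough sanity check on the size estimate above, here is a minimal Python sketch relating encoding bitrate to file size for constant-bitrate audio. The function names and the example bitrates are hypothetical, picked only for illustration; nothing here reflects an actual 0 A.D. build setting, and container overhead is ignored.

```python
def encoded_audio_size_mb(duration_hours: float, bitrate_kbps: float) -> float:
    """Approximate size in MB (10^6 bytes) of constant-bitrate audio."""
    seconds = duration_hours * 3600
    bits = bitrate_kbps * 1000 * seconds
    return bits / 8 / 1_000_000


def bitrate_for_size_kbps(size_mb: float, duration_hours: float) -> float:
    """Bitrate in kbit/s that yields size_mb for the given duration."""
    bits = size_mb * 1_000_000 * 8
    return bits / (duration_hours * 3600) / 1000


if __name__ == "__main__":
    # Hypothetical bitrates, chosen only to show the scaling.
    for kbps in (64, 128, 192):
        size = encoded_audio_size_mb(1, kbps)
        print(f"{kbps} kbit/s for 1 hour: about {size:.1f} MB")

    # Bitrate implied by a budget of 100 MB per hour of speech.
    print(f"100 MB/hour: about {bitrate_for_size_kbps(100, 1):.0f} kbit/s")
```

Under these assumptions, 100 MB per hour corresponds to roughly 222 kbit/s, so speech encoded at a lower bitrate would take proportionally less space.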