Text-to-Mic is a free text-to-speech and speech-to-text-to-speech (TTS and STTTS) to-microphone tool that turns typed text into speech audio with AI and then plays that audio to your speakers, headset, or microphone feed.

Here is a video example of how it looks when running on Windows: 

This is perfect to enable you to speak in online video meetings using text-to-speech AI. It can also manipulate text with AI in real-time which has lots of practical uses, such as tidying up speech or live translation. (See download links below).

Text-to-Mic uses the OpenAI text-to-speech engine, which surpasses the standard text-to-speech tools available on Windows and Mac. This app is available to use for free.

  • Seamless Text-to-Speech-to-Microphone (or speakers) Conversion:
    Utilizes OpenAI's API to convert text into natural-sounding speech in real-time.
  • Multiple Voices:
    Choose from a variety of OpenAI voices to find the tone that best suits your presentation or meeting style. Supported voices: Alloy, Echo, Fable, Onyx, Nova, Shimmer (Listen to samples).
  • Dual Output Capability:
    Outputs audio simultaneously to both headphones and a virtual microphone, ensuring you can monitor and share your presentation effectively.
  • STTTS - Speech-to-text-to-speech capabilities.
    Record your voice, even if you are struggling to speak, which saves as text, which you can then immediately playback over the selected audio feeds.
  • Hotkeys for Quick Access
    Trigger speech recording, conversion and playback using hotkeys (like ctrl+shift+0) to make using Text-to-mic feel more natural, quick and seamless.
  • Automatic ChatGPT AI text Manipulation
    This allows you to automatically translate what you've typed or recorded into another language, or automatically manipulate the input text in some desired way, speeding up the communications process

Watch the video above to see the power of the AI-enabled Text-to-Mic in action!

Download

For Windows

For Mac

You will need to download, extract, and then run the .app file

Getting Started

  1. Install VB-Cable
    Install VB-Cable from https://vb-audio.com/Cable/ if you haven't already. This tool creates a virtual microphone on your Windows computer or Mac. Once installed, you can trigger audio to play through this virtual cable.
  2. Add an OpenAI API Key
    Open the Text-to-Mic app by Scorchsoft and input your OpenAPI key (Tutorial video on setting up an API Key).
    If you don't yet have an API key, visit platform.openai.com, sign up for a free account, set up billing and add some credit, generate an API Key, and copy that key into text-to-mic.
    (It's not that expensive but OpenAI will bill you for text-to-speech generation - see pricing, see the text-to-speech and speech-to-text pricing, as well as GPT models if you enable AI manipulation)
  3. Set voice
    Select your preferred voice for speech synthesis in the app UI.
  4. Choose playback devices
    Choose a playback device. I recommend selecting your headphones as one device and the virtual microphone (usually labelled "Cable Input (VB-Audio)") as the other.
  5. Set Microphone to Cable Input VB-Audio in an online meeting
    When you join a meeting on platforms like Teams, Zoom, or Google Meets, select the Cable Input audio channel in the meeting tool's settings. This will play back any audio submitted via the tool when you hit play. However, please be aware that your own microphone will not function simultaneously. You will need to switch back if you need to speak.

    Example of virtual microphone selection in Google Meet:
    example virtual mic selection in google meet

  6. Type
    Enter the text you want to convert to speech in the provided text area.
  7. Play
    Click 'Play Audio' to listen to the spoken version of your text. This replays the previously generated audio clip to prevent unnecessary use of your OpenAI API Key.
  8. Repeat what you said last
    Use the 'Play Last Audio' button to replay the last generated speech output.
  9. Housekeeping
    You can change the API key at any time under the 'Settings' menu.
  10. Experiment with AI manipulation
    Play with the settings in "Settings > ChatGPT Manipulation" to automatically use AI to translate, change, or enhance recorded or spoken words. Useful for expanding on paraphrased content to increase the speed you can communicate, or reduce vocal strain.

Practical Applications

  • Education: Teachers can use Text-to-mic to provide clear, consistent instruction in virtual classrooms.
  • Business Meetings: Professionals who require voice rest can use this tool to communicate effectively in meetings without straining their voices.
  • Accessibility: Helps those with speech impairments communicate clearly and effectively in online meetings.
  • Translation: Translate your voice to another language and then immediately play as AI generated voice to a virtual mic feed
  • Expand paraphrasing: Talk or type in shorthand and have AI automatically convert it to longer form, and then speak that longer form version.

 v1.0.5 screenshot of text to mic app

 v1.0.5 screenshot of chatgpt manipulation settings

We created Text-to-Mic originally because a member of our team lost their voice, and we needed a simple solution to allow them to use text-to-speech (TTS) to speak with colleagues naturally, as this is much more engaging than typing in a parallel chat channel, which can often be overlooked.

If you enjoy using Text to Mic, you might also appreciate partnering with Scorchsoft on other technology projects. We specialise in developing technically complex web and mobile.

Frequently Asked Questions

How can I find or set up my OpenAI API Key?

You must sign up for an account and create a key in their developer's area. It sounds complex, but it's fairly straightforward; Here is a tutorial video.

What is the difference between GPT 3.5 and GPT 4 in AI manipulation settings?

This setting determines which AI 'model' is used to manipulate input or recorded text based on the provided prompt. Think of it as picking which AI brain to use.

  • GPT 3.5 is cheaper per word to manipulate text and is faster but less intelligent than GPT4.
  • GPT 4 is a much more powerful AI and is more likely to be able to deal with complex instructions, but it costs more per word to run and is a littler slower.

We recommend trying GPT3 first due to its speed benefits and switching to GPT4 should you find you want it to perform certain AI manipulations better.

What is the "Prompt" in the AI manipulation settings?

The prompt is the set of instructions you want the AI to use when manipulating your input or output text. The AI reads the instructions you've set in the prompt, and applies them to any converted text. Here are some example promps:

  • "Convert from English to Spanish"
  • "Expand paraphrased utterances to fully formed sentences."
  • "If I ask a question, reply to that question followed with a potential answer."
  • "Edit my input. You are a clown at an amusement park; convert to speak as this persona."
  • "Edit my input. You are a character in a computer game with a dark sense of humour. Convert text to speak as this persona. Remain concise"
  • "Copy edit my input. My mood today: upbeat, focused. Match this tone".

We recommend trying different prompts and making up your own too. You can also write much longer prompts than the above examples should you want it to do something very specific. Remember to switch from GPT 3 to GPT 4 if your prompt is particularly complex or requires more accuracy. If the response doesn't manipulate what you've said, and replies to it, then add something like "Copy edit my input" or "Transform my input" to the prompt and this should fix that.

Remember AI can "hallucinate" false information and give wrong answers, so make sure to evaluate responses before considering them to be true.

I have ideas for new features or custom extensions that would benefit my business. Can you help me with that?

If you notice a bug or small quality-of-life enhancement, please let us know, and we will consider implementing it in the tool for free.

We can also accommodate more substantial enhancements, such as custom extensions for business; Though please be aware these are likely to carry a development charge. Please contact us to let us know what you have in mind.

Changelog

  • v1.0.7 - Added support for hotkeys (ctrl+shift+0; ctrl+shift+9; ctrl+shift+8)
  • v1.0.6 - Fix audio channel sample rate mismatch issues
  • v1.0.5 - Adds ChatGPT manipulations functionality to auto-manipulate input text
  • v1.0.4 - Adds input device selection option
  • v1.0.3 - Fixes the record button and styles better
  • v1.0.2 - Added mac support, plus record voice button (But the app crashes if audio over around 3-seconds)
  • v1.0.1 - First working version of the app

Terms of Use, Disclaimer, and Licence Information

Text to Mic is provided "as is" and on an "as available" basis, without any warranties of any kind, either express or implied. Scorchsoft Ltd expressly disclaims all warranties, whether express, implied, statutory, or otherwise, including but not limited to the implied warranties of merchantability, fitness for a particular purpose, and non-infringement. We do not warrant that the software will function uninterrupted, that it is error-free, or that any errors or defects will be corrected.

Limitation of Liability

In no event will Scorchsoft Ltd be liable for any indirect, incidental, special, consequential, or punitive damages resulting from or related to your use or inability to use Text to Mic, including but not limited to damages for loss of profits, goodwill, use, data, or other intangible losses, even if Scorchsoft Ltd has been advised of the possibility of such damages.

Use at Your Own Risk

By using Text to Mic, you acknowledge and agree that you assume full responsibility for your use of the software, and that any information you send or receive during your use of the software may not be secure and may be intercepted or later acquired by unauthorized parties. Use of Text to Mic is at your sole risk.

License Agreement

Users are granted a non-exclusive, revocable license to use Text to Mic solely for personal or commercial purposes. While the software remains the intellectual property of Scorchsoft Ltd., users are permitted to share the software with others under the condition that they attribute it to Scorchsoft Ltd. explicitly. This license does not grant users any ownership rights in the software and prohibits the creation of derivative works or the sale of the software. Users must ensure that Scorchsoft Ltd. is credited appropriately when sharing or demonstrating the software in any public or private setting.