Podcast Translator
As a non-native English speaker, I find that keeping up with long-form technical podcasts carries a high cognitive load. I built this tool so I can focus on the logic and information rather than mentally translating every sentence.
Existing AI dubbing tools were expensive, hard-to-customize black boxes, so I built this CLI tool to automate the pipeline: extracting transcripts (YouTube/HTML/URL), translating them, and generating audio with various TTS providers (Gemini/OpenAI/self-hosted).
Workflow
- Extract: Retrieve accurate dialogue text from the information source.
- Translate: Translate the dialogue text into the target language.
- Generate Audio: Convert the translated text into audio.
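The three stages above compose as a simple pipeline. The following is a minimal sketch of that data flow, not the project's actual code: `Segment`, `run_pipeline`, and the callables passed in are hypothetical names chosen for illustration, and the real scripts appear to pass JSON files between stages.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Segment:
    """One line of dialogue (hypothetical structure)."""
    speaker: str
    text: str


def run_pipeline(
    segments: Iterable[Segment],
    translate: Callable[[str], str],
    synthesize: Callable[[Segment], bytes],
) -> List[bytes]:
    """Translate each extracted segment, then convert it to audio."""
    audio_chunks = []
    for segment in segments:
        # Keep the speaker, swap in the translated text.
        translated = replace(segment, text=translate(segment.text))
        audio_chunks.append(synthesize(translated))
    return audio_chunks


# Stand-ins for a real translator and TTS provider, to show the data flow only.
demo = run_pipeline(
    [Segment("Host", "Hello"), Segment("Guest", "Thanks")],
    translate=lambda text: f"[fr] {text}",
    synthesize=lambda seg: f"{seg.speaker}: {seg.text}".encode("utf-8"),
)
```

Passing the translator and synthesizer in as callables mirrors the idea of swapping TTS providers without touching the rest of the pipeline.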
Audio Samples
- [Japanese Sample](https://github.com/…
Audio Generation Options
- Gemini: Often failed to generate audio.
- OpenAI: Voices are not perfect, but the service is stable and fast.
- ElevenLabs: Tried for its voice cloning capabilities, but it proved too expensive (~$30 USD for a 4-hour podcast).
- Self-Hosted: Works well for voice cloning at ~$4-5 USD for a 4-hour podcast (roughly 6x to 7.5x cheaper than ElevenLabs).
Conclusion: For daily use, OpenAI TTS is sufficient and reliable.
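For reference, the cost comparison can be worked out per hour of audio. This is a small arithmetic sketch using only the approximate figures quoted above; the variable names are illustrative.

```python
PODCAST_HOURS = 4

# Approximate costs quoted above, in USD for a 4-hour podcast.
ELEVENLABS_COST = 30.0
SELF_HOSTED_COST_RANGE = (4.0, 5.0)

elevenlabs_per_hour = ELEVENLABS_COST / PODCAST_HOURS
self_hosted_per_hour = tuple(c / PODCAST_HOURS for c in SELF_HOSTED_COST_RANGE)

# Dividing by the higher self-hosted cost gives the smaller savings factor.
savings_factor = tuple(ELEVENLABS_COST / c for c in reversed(SELF_HOSTED_COST_RANGE))

print(f"ElevenLabs: ${elevenlabs_per_hour:.2f}/hour")
print(f"Self-hosted: ${self_hosted_per_hour[0]:.2f}-${self_hosted_per_hour[1]:.2f}/hour")
print(f"Self-hosted is {savings_factor[0]:.1f}x to {savings_factor[1]:.1f}x cheaper")
```

This works out to about $7.50 per hour for ElevenLabs against $1.00 to $1.25 per hour self-hosted.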
Usage
pip install -r requirements.txt
# Edit .env file
OPENAI_API_KEY=your_openai_api_key_here
# 1. Extract transcript from source
# Option A: YouTube
python3 scraper/youtube.py 'https://www.youtube.com/watch?v=IDSAMqip6ms'
python3 helper/txt_to_json.py IDSAMqip6ms.txt prompt/youtube.md
# Option B: Colossus
python3 scraper/colossus.py sample/invest-like-the-best-483.html --output transcript_colossus.json
# Option C: Lex Fridman
python3 scraper/lex.py https://lexfridman.com/pavel-durov-transcript
# 2. Translate the transcript to French (or another target language)
python3 translate.py transcript_colossus.json french
# 3. Generate Audio
# Note: Audio generation takes time. It is recommended to test with a small sample first.
python3 tts_services/tts_generator_openai.py sample/claude_code_youtube_talk_small-french.json
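One way to follow the advice about testing with a small sample is to cut a translated transcript down to its first few entries before generating audio. This is a minimal sketch that assumes the transcript JSON is a top-level list of entries, which the text above does not confirm; `make_small_sample` is a hypothetical helper, not part of the repository.

```python
import json
from pathlib import Path


def make_small_sample(src: str, dst: str, max_entries: int = 5) -> int:
    """Copy the first `max_entries` transcript entries from src to dst.

    Assumes the JSON file holds a top-level list; raises ValueError otherwise.
    Returns the number of entries written.
    """
    entries = json.loads(Path(src).read_text(encoding="utf-8"))
    if not isinstance(entries, list):
        raise ValueError("expected a top-level JSON list of transcript entries")
    sample = entries[:max_entries]
    # ensure_ascii=False keeps translated, non-ASCII text readable in the file.
    Path(dst).write_text(
        json.dumps(sample, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return len(sample)
```

For example, `make_small_sample("transcript-french.json", "transcript_small-french.json")` (file names are placeholders) would produce a short file to pass to the TTS script, if the real schema matches the assumption.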