Convert EPUB e-books into high-quality audiobooks using multiple Text-to-Speech providers.
- EPUB Support: Compatible with EPUB 2 and EPUB 3 formats
- Multiple TTS Providers: Supports Azure and Doubao TTS services
- Auto-Detection: Automatically detects configured provider
- Multi-Language Support: Supports various languages and voices
- M4B Output: Generates standard M4B audiobook format with chapter navigation
- CLI Interface: Easy-to-use command-line tool with progress tracking
epub2speech input.epub output.m4b --voice zh-CN-XiaoxiaoNeural

- Python 3.11 or higher
- FFmpeg (for audio processing)
- TTS provider credentials (Azure or Doubao)
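
The first two requirements can be checked before installing anything else. The following is a minimal sketch, not part of epub2speech: `check_prerequisites` is a hypothetical helper, and it assumes FFmpeg is installed under the executable name `ffmpeg`.

```python
import shutil
import sys

# Minimum interpreter version stated in the requirements list.
MIN_PYTHON = (3, 11)


def check_prerequisites(version_info=None, which=shutil.which):
    """Return a list of problems; an empty list means both checks passed.

    `version_info` and `which` are injectable so the check can be tested
    without changing the real environment.
    """
    if version_info is None:
        version_info = sys.version_info
    version = tuple(version_info[:2])

    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, "
            f"found {version[0]}.{version[1]}"
        )
    # Assumption: the FFmpeg executable is named "ffmpeg" and is on PATH.
    if which("ffmpeg") is None:
        problems.append("FFmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(f"Missing prerequisite: {problem}")
```

Running the script prints one line per missing prerequisite and nothing when both are satisfied. TTS credentials are not checked here.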
# Install Python dependencies
pip install poetry
poetry install
# Install FFmpeg
# macOS: brew install ffmpeg
# Ubuntu/Debian: sudo apt install ffmpeg
# Windows: Download from https://ffmpeg.org/download.html

Set the Azure environment variables and run:
export AZURE_SPEECH_KEY="your-subscription-key"
export AZURE_SPEECH_REGION="your-region"
epub2speech input.epub output.m4b --voice zh-CN-XiaoxiaoNeural

Where to get credentials:
- Create an Azure account at https://azure.microsoft.com
- Create a Speech Service resource in Azure Portal
- Get your subscription key and region from the dashboard
Available voices:
- Voice list: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support?tabs=tts#voice-styles-and-roles
- Voice gallery (preview): https://speech.microsoft.com/portal/voicegallery
Set the Doubao environment variables and run:
export DOUBAO_ACCESS_TOKEN="your-access-token"
export DOUBAO_BASE_URL="your-api-base-url"
epub2speech input.epub output.m4b --voice zh_male_lengkugege_emo_v2_mars_bigtts

Where to get credentials:
- Get your Doubao access token and API base URL from Volcengine console
Available voices: https://www.volcengine.com/docs/6561/1257544 (Find voice IDs in the Doubao TTS documentation)
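
Both providers read their credentials from the environment variables shown above, so it can be useful to confirm which ones are fully set before starting a long conversion. The sketch below is an illustration only: `configured_providers` and `choose_provider` are hypothetical helpers, not functions exported by epub2speech, and the tool's own detection logic may differ (for example, it also accepts credentials as command-line arguments).

```python
import os

# Environment variables named in this README, grouped by provider.
PROVIDER_ENV_VARS = {
    "azure": ("AZURE_SPEECH_KEY", "AZURE_SPEECH_REGION"),
    "doubao": ("DOUBAO_ACCESS_TOKEN", "DOUBAO_BASE_URL"),
}


def configured_providers(environ=None):
    """Return the providers whose variables are all set and non-empty."""
    environ = os.environ if environ is None else environ
    return [
        name
        for name, keys in PROVIDER_ENV_VARS.items()
        if all(environ.get(key) for key in keys)
    ]


def choose_provider(requested=None, environ=None):
    """Pick a provider: an explicit request wins, otherwise exactly one must be set."""
    available = configured_providers(environ)
    if requested is not None:
        if requested not in available:
            raise ValueError(f"provider {requested!r} is not fully configured")
        return requested
    if len(available) == 1:
        return available[0]
    if not available:
        raise ValueError("no TTS provider is configured")
    raise ValueError("several providers are configured; choose one explicitly")


if __name__ == "__main__":
    print("Configured providers:", configured_providers())
```

A partially configured provider (for example, a key without a region) is treated as not configured, which surfaces the mistake before any audio is requested.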
If you have configured only one provider, it will be automatically detected and used. If multiple providers are configured, specify which one to use:
# Explicitly use Azure
epub2speech input.epub output.m4b --provider azure --voice zh-CN-XiaoxiaoNeural
# Explicitly use Doubao
epub2speech input.epub output.m4b --provider doubao --voice zh_male_lengkugege_emo_v2_mars_bigtts

# Limit to first 5 chapters
epub2speech input.epub output.m4b --voice zh-CN-XiaoxiaoNeural --max-chapters 5
# Use custom workspace directory
epub2speech input.epub output.m4b --voice zh-CN-YunxiNeural --workspace /tmp/my-workspace
# Quiet mode (no progress output)
epub2speech input.epub output.m4b --voice ja-JP-NanamiNeural --quiet
# Set maximum characters per TTS segment (default: 500)
epub2speech input.epub output.m4b --voice zh-CN-XiaoxiaoNeural --max-tts-segment-chars 800

Pass Azure credentials via command-line arguments:
epub2speech input.epub output.m4b \
    --voice zh-CN-XiaoxiaoNeural \
    --azure-key YOUR_KEY \
    --azure-region YOUR_REGION

Pass Doubao credentials via command-line arguments:
epub2speech input.epub output.m4b \
    --voice zh_male_lengkugege_emo_v2_mars_bigtts \
    --doubao-token YOUR_TOKEN \
    --doubao-url YOUR_BASE_URL

- EPUB Parsing: Extracts text content and metadata from EPUB files
- Chapter Detection: Identifies chapters using EPUB navigation data
- Text Processing: Cleans and segments text for optimal speech synthesis
- Audio Generation: Converts text to speech using your chosen TTS provider
- M4B Creation: Combines audio files with chapter metadata into M4B format
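
To make the text-processing step more concrete, here is a rough sketch of how chapter text could be cut into segments that respect a character limit such as `--max-tts-segment-chars`. It is an assumption about the general technique (greedy packing of sentences), not the project's actual segmentation code; `split_into_segments` is a hypothetical name.

```python
import re

# One "sentence": the shortest run of text that ends in Latin or CJK
# sentence punctuation (plus trailing whitespace) or at the end of the text.
_SENTENCE = re.compile(r".+?(?:[.!?。！？]+\s*|\Z)", re.S)


def split_into_segments(text, max_chars=500):
    """Greedily pack sentences into segments of at most `max_chars` characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    segments = []
    current = ""
    for sentence in _SENTENCE.findall(text):
        # A single sentence longer than the limit is cut into fixed-size pieces.
        pieces = [
            sentence[i:i + max_chars] for i in range(0, len(sentence), max_chars)
        ]
        for piece in pieces:
            # Start a new segment when this piece would overflow the current one.
            if current and len(current) + len(piece) > max_chars:
                segments.append(current)
                current = ""
            current += piece
    if current:
        segments.append(current)

    # Drop surrounding whitespace and any whitespace-only segments.
    return [segment.strip() for segment in segments if segment.strip()]


if __name__ == "__main__":
    for segment in split_into_segments("One. Two. Three.", max_chars=10):
        print(repr(segment))
```

Splitting on sentence boundaries keeps each request within the limit while avoiding cuts in the middle of a sentence where possible. The punctuation-based rule is deliberately simple and will also split at decimal points and abbreviations.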
You can integrate epub2speech into your own Python application:
from pathlib import Path
from epub2speech import convert_epub_to_m4b, ConversionProgress
from epub2speech.tts.azure_provider import AzureTextToSpeech
# Or use: from epub2speech.tts.doubao_provider import DoubaoTextToSpeech
# Initialize TTS provider
tts = AzureTextToSpeech(
    subscription_key="your-key",
    region="your-region"
)
# Optional: Define progress callback
def on_progress(progress: ConversionProgress):
    print(f"{progress.progress:.1f}% - Chapter {progress.current_chapter}/{progress.total_chapters}")
# Convert EPUB to M4B
result = convert_epub_to_m4b(
    epub_path=Path("input.epub"),
    workspace=Path("./workspace"),
    output_path=Path("output.m4b"),
    tts_protocol=tts,
    voice="zh-CN-XiaoxiaoNeural",
    max_chapters=None,  # Optional: limit chapters
    max_tts_segment_chars=500,  # Optional: max characters per TTS segment (default: 500)
    progress_callback=on_progress  # Optional
)
if result:
    print(f"Success: {result}")

python test.py

Run specific test modules:
python test.py --test test_epub_picker
python test.py --test test_tts

Contributions are welcome! Please feel free to submit issues or pull requests.
This project is licensed under the MIT License - see the LICENSE file for details.
For issues and questions:
- Check existing GitHub issues
- Create a new issue with detailed information
- Include EPUB file samples if relevant (ensure no copyright restrictions)