thatupiso committed
Commit 0961d1b · verified · 1 Parent(s): b7805e9

Upload folder using huggingface_hub

.github/workflows/sync.yml ADDED
@@ -0,0 +1,17 @@
+ name: Sync to Hugging Face hub
+ on:
+   push:
+     branches: [main]
+
+ jobs:
+   sync-to-hub:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v3
+         with:
+           fetch-depth: 0
+           lfs: true
+       - name: Push to hub
+         env:
+           HF_TOKEN: ${{ secrets.HF_TOKEN }}
+         run: git push https://thatupiso:$HF_TOKEN@huggingface.co/spaces/thatupiso/Podcastfy.ai_demo main
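The push step authenticates by placing the token in the password slot of the HTTPS remote URL. As a rough illustration only, the same URL could be assembled in Python as below. `build_space_remote` is a hypothetical helper, not part of the workflow or of `huggingface_hub`, and the percent-encoding is an added precaution that the workflow itself does not perform.

```python
from urllib.parse import quote


def build_space_remote(username: str, space_id: str, token: str) -> str:
    """Return an HTTPS remote URL with the token embedded as the password.

    Hypothetical helper for illustration; mirrors the URL shape used by the
    workflow's `git push` step.
    """
    # quote(..., safe="") percent-encodes characters that are not valid in
    # the user-info part of a URL (an assumption about possible token contents).
    user = quote(username, safe="")
    secret = quote(token, safe="")
    return f"https://{user}:{secret}@huggingface.co/spaces/{space_id}"
```

In the workflow the token comes from the `HF_TOKEN` secret exposed through the step's `env:` block; a script using this sketch would read it with `os.environ["HF_TOKEN"]` rather than hard-coding it.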
.gitignore CHANGED
@@ -2,7 +2,6 @@
 
  specs/
  docs/
- data/
 
  *.ipynb
 
.gradio/certificate.pem ADDED
@@ -0,0 +1,31 @@
+ -----BEGIN CERTIFICATE-----
+ MIIFazCCA1OgAwIBAgIRAIIQz7DSQONZRGPgu2OCiwAwDQYJKoZIhvcNAQELBQAw
+ TzELMAkGA1UEBhMCVVMxKTAnBgNVBAoTIEludGVybmV0IFNlY3VyaXR5IFJlc2Vh
+ cmNoIEdyb3VwMRUwEwYDVQQDEwxJU1JHIFJvb3QgWDEwHhcNMTUwNjA0MTEwNDM4
+ WhcNMzUwNjA0MTEwNDM4WjBPMQswCQYDVQQGEwJVUzEpMCcGA1UEChMgSW50ZXJu
+ ZXQgU2VjdXJpdHkgUmVzZWFyY2ggR3JvdXAxFTATBgNVBAMTDElTUkcgUm9vdCBY
+ MTCCAiIwDQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBAK3oJHP0FDfzm54rVygc
+ h77ct984kIxuPOZXoHj3dcKi/vVqbvYATyjb3miGbESTtrFj/RQSa78f0uoxmyF+
+ 0TM8ukj13Xnfs7j/EvEhmkvBioZxaUpmZmyPfjxwv60pIgbz5MDmgK7iS4+3mX6U
+ A5/TR5d8mUgjU+g4rk8Kb4Mu0UlXjIB0ttov0DiNewNwIRt18jA8+o+u3dpjq+sW
+ T8KOEUt+zwvo/7V3LvSye0rgTBIlDHCNAymg4VMk7BPZ7hm/ELNKjD+Jo2FR3qyH
+ B5T0Y3HsLuJvW5iB4YlcNHlsdu87kGJ55tukmi8mxdAQ4Q7e2RCOFvu396j3x+UC
+ B5iPNgiV5+I3lg02dZ77DnKxHZu8A/lJBdiB3QW0KtZB6awBdpUKD9jf1b0SHzUv
+ KBds0pjBqAlkd25HN7rOrFleaJ1/ctaJxQZBKT5ZPt0m9STJEadao0xAH0ahmbWn
+ OlFuhjuefXKnEgV4We0+UXgVCwOPjdAvBbI+e0ocS3MFEvzG6uBQE3xDk3SzynTn
+ jh8BCNAw1FtxNrQHusEwMFxIt4I7mKZ9YIqioymCzLq9gwQbooMDQaHWBfEbwrbw
+ qHyGO0aoSCqI3Haadr8faqU9GY/rOPNk3sgrDQoo//fb4hVC1CLQJ13hef4Y53CI
+ rU7m2Ys6xt0nUW7/vGT1M0NPAgMBAAGjQjBAMA4GA1UdDwEB/wQEAwIBBjAPBgNV
+ HRMBAf8EBTADAQH/MB0GA1UdDgQWBBR5tFnme7bl5AFzgAiIyBpY9umbbjANBgkq
+ hkiG9w0BAQsFAAOCAgEAVR9YqbyyqFDQDLHYGmkgJykIrGF1XIpu+ILlaS/V9lZL
+ ubhzEFnTIZd+50xx+7LSYK05qAvqFyFWhfFQDlnrzuBZ6brJFe+GnY+EgPbk6ZGQ
+ 3BebYhtF8GaV0nxvwuo77x/Py9auJ/GpsMiu/X1+mvoiBOv/2X/qkSsisRcOj/KK
+ NFtY2PwByVS5uCbMiogziUwthDyC3+6WVwW6LLv3xLfHTjuCvjHIInNzktHCgKQ5
+ ORAzI4JMPJ+GslWYHb4phowim57iaztXOoJwTdwJx4nLCgdNbOhdjsnvzqvHu7Ur
+ TkXWStAmzOVyyghqpZXjFaH3pO3JLF+l+/+sKAIuvtd7u+Nxe5AW0wdeRlN8NwdC
+ jNPElpzVmbUq4JUagEiuTDkHzsxHpFKVK7q4+63SM1N95R1NbdWhscdCb+ZAJzVc
+ oyi3B43njTOQ5yOf+1CceWxG1bQVs5ZufpsMljq4Ui0/1lvh+wjChP4kqKOJ2qxq
+ 4RgqsahDYVvTH9w7jXbyLeiNdd8XM2w9U/t7y0Ff/9yi0GE44Za4rF2LN9d11TPA
+ mRGunUHBcnWEvgJBQl9nJEiU0Zsnvgc/ubhPgXRR4Xq37Z0j4r7g1SgEEzwxA57d
+ emyPxgcYxn/eR44/KJ4EBs+lVDR3veyJm+kXQ99b21/+jh5Xos1AnX5iItreGCc=
+ -----END CERTIFICATE-----
README.md CHANGED
@@ -2,109 +2,34 @@
  title: Podcastfy.ai_demo
  app_file: podcastfy-app/app.py
  sdk: gradio
- sdk_version: 4.44.1
- python_version: 3.11
+ sdk_version: 5.4.0
+ python_version: "3.11"
+ header: mini
  ---
- # Podcastfy.ai
- [![CodeFactor](https://www.codefactor.io/repository/github/souzatharsis/podcastfy/badge)](https://www.codefactor.io/repository/github/souzatharsis/podcastfy)
- [![PyPi Status](https://img.shields.io/pypi/v/podcastfy)](https://pypi.org/project/podcastfy/)
- [![Downloads](https://pepy.tech/badge/podcastfy)](https://pepy.tech/project/podcastfy)
- [![Issues](https://img.shields.io/github/issues-raw/souzatharsis/podcastfy)](https://github.com/souzatharsis/podcastfy/issues)
- [![License: CC BY-NC-SA 4.0](https://img.shields.io/badge/License-CC%20BY--NC--SA%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by-nc-sa/4.0/)
- [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/python/black)
+ # Podcastfy.ai demo
+
+ Created with ❤️ by Open Source [Podcastfy](https://www.podcastfy.ai)
 
  Transforming Multi-Sourced Text into Captivating Multi-Lingual Audio Conversations with GenAI
 
  https://github.com/user-attachments/assets/f1559e70-9cf9-4576-b48b-87e7dad1dd0b
 
- Podcastfy is an open-source Python package that transforms web content, PDFs, and text into engaging, multi-lingual audio conversations using GenAI.
+ Try [HuggingFace 🤗 space app](https://huggingface.co/spaces/thatupiso/Podcastfy.ai_demo) for a simple use case (URLs -> Audio).
+
+ See [Open Source Python package](https://www.podcastfy.ai) and CLI at the original github repo for full customization options.
 
- Unlike UI-based tools focused primarily on note-taking or research synthesis (e.g. NotebookLM ❤️), Podcastfy focuses on the programmatic and bespoke generation of engaging, conversational transcripts and audio from a multitude of text sources therefore enabling customization and scale.
+ WARNING: This UI App was not as thoroughly tested as the underlying Python package.
 
  ## Audio Examples
 
  This sample collection is also [available at audio.com](https://audio.com/thatupiso/collections/podcastfy):
- - [English] Book Networks, Crowds, and Markets: [audio](https://audio.com/thatupiso/audio/networks)
- - [English] Research paper: ([audio](https://audio.com/thatupiso/audio/agro-paper) | [pdf](./data/pdf/s41598-024-58826-w.pdf))
+ - [English] Youtube Video from YCombinator on LLMs: ([audio](https://audio.com/thatupiso/audio/ycombinator-llms) | [youtube](https://www.youtube.com/watch?v=eBVi_sLaYsc))
+ - [English] Book pdf Networks, Crowds, and Markets: [audio](https://audio.com/thatupiso/audio/networks)
+ - [English] Research paper on Climate Change in France: ([audio](https://audio.com/thatupiso/audio/agro-paper) | [pdf](./data/pdf/s41598-024-58826-w.pdf))
  - [English] Personal website: ([audio](https://audio.com/thatupiso/audio/tharsis) | [website](https://www.souzatharsis.com))
  - [English] Personal website + youtube video: ([audio](https://audio.com/thatupiso/audio/tharsis-ai) | [website](https://www.souzatharsis.com) | [youtube](https://www.youtube.com/watch?v=sJE1dE2dulg))
  - [French] Website: ([audio](https://audio.com/thatupiso/audio/podcast-fr-agro) | [website](https://agroclim.inrae.fr/))
  - [Portuguese-BR] News article: ([audio](https://audio.com/thatupiso/audio/podcast-thatupiso-br) | [website](https://noticias.uol.com.br/eleicoes/2024/10/03/nova-pesquisa-datafolha-quem-subiu-e-quem-caiu-na-disputa-de-sp-03-10.htm))
-
- ## Quickstart
-
- ### Setup
- Before installing, ensure you have Python 3.12 or higher installed on your system.
-
- 1. Install from PyPI
-
- `$ pip install podcastfy`
-
- 2. Set up your [API keys](usage/config.md)
-
- 3. Ensure you have ffmpeg installed on your system, required for audio processing
- ```
- sudo apt update
- sudo apt install ffmpeg
- ```
-
- ### Python
- ```python
- from podcastfy.client import generate_podcast
-
- audio_file = generate_podcast(urls=["<url1>", "<url2>"])
- ```
- ### CLI
- ```
- python -m podcastfy.client --url <url1> --url <url2>
- ```
-
- ## Usage
-
- - [Python Package](podcastfy.ipynb)
-
- - [CLI](usage/cli.md)
-
-
- ## Contributing
-
- Contributions are welcome! Please feel free to submit a Pull Request - see [Open Issues](https://github.com/souzatharsis/podcastfy/issues) for ideas. But even more excitingly feel free to fork the repo and create your own app! Please let me know if I could be of help.
-
- ## Features
-
- - Generate engaging, AI-powered conversational content from multiple sources (URLs and PDFs)
- - Create high-quality transcripts from diverse textual information sources
- - Convert pre-existing transcript files into dynamic podcast episodes
- - Support for multiple advanced text-to-speech models (OpenAI and ElevenLabs) for natural-sounding audio
- - Support for multiple languages, enabling global content creation
- - Seamlessly integrate CLI for streamlined workflows
-
- ## Example Use Cases
-
- 1. **Content Summarization**: Busy professionals can stay informed on industry trends by listening to concise audio summaries of multiple articles, saving time and gaining knowledge efficiently.
-
- 2. **Language Localization**: Non-native English speakers can access English content in their preferred language, breaking down language barriers and expanding access to global information.
-
- 3. **Website Content Marketing**: Companies can increase engagement by repurposing written website content into audio format, providing visitors with the option to read or listen.
-
- 4. **Personal Branding**: Job seekers can create unique audio-based personal presentations from their CV or LinkedIn profile, making a memorable impression on potential employers.
-
- 5. **Research Paper Summaries**: Graduate students and researchers can quickly review multiple academic papers by listening to concise audio summaries, speeding up the research process.
-
- 6. **Long-form Podcast Summarization**: Podcast enthusiasts with limited time can stay updated on their favorite shows by listening to condensed versions of lengthy episodes.
-
- 7. **News Briefings**: Commuters can stay informed about daily news during travel time with personalized audio news briefings compiled from their preferred sources.
-
- 8. **Educational Content Creation**: Educators can enhance learning accessibility by providing audio versions of course materials, catering to students with different learning styles.
-
- 9. **Book Summaries**: Avid readers can preview books efficiently through audio summaries, helping them make informed decisions about which books to read in full.
-
- 10. **Conference and Event Recaps**: Professionals can stay updated on important industry events they couldn't attend by listening to audio recaps of conference highlights and key takeaways.
-
-
- ## License
-
- This project is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-nc-sa/4.0/).
 
  ## Disclaimer
 
podcastfy-app/app.py CHANGED
@@ -1,127 +1,430 @@
  import gradio as gr
- from podcastfy.client import generate_podcast
  import os
+ import tempfile
+ import logging
+ from podcastfy.client import generate_podcast
  from dotenv import load_dotenv
 
- # Load environment variables from .env file
+ # Configure logging
+ logging.basicConfig(level=logging.DEBUG)
+ logger = logging.getLogger(__name__)
+
+ # Load environment variables
  load_dotenv()
 
  def get_api_key(key_name, ui_value):
      return ui_value if ui_value else os.getenv(key_name)
 
- def create_podcast(urls, openai_key, jina_key, gemini_key):
-     try:
-         # Set API keys, prioritizing UI input over .env file
-         os.environ["OPENAI_API_KEY"] = get_api_key("OPENAI_API_KEY", openai_key)
-         os.environ["JINA_API_KEY"] = get_api_key("JINA_API_KEY", jina_key)
-         os.environ["GEMINI_API_KEY"] = get_api_key("GEMINI_API_KEY", gemini_key)
-
-         url_list = [url.strip() for url in urls.split(',') if url.strip()]
-
-         if not url_list:
-             return "Please provide at least one URL."
-
-         audio_file = generate_podcast(urls=url_list)
-         return audio_file
-     except Exception as e:
-         return str(e)
-
- # Create the Gradio interface
- with gr.Blocks(title="Podcastfy.ai", theme=gr.themes.Default()) as iface:
-     gr.Markdown("# Podcastfy.ai demo")
-     gr.Markdown("Generate a podcast from multiple URLs using Podcastfy.")
-     gr.Markdown("For full customization, please check [Podcastfy package](https://github.com/souzatharsis/podcastfy).")
-
-     with gr.Accordion("API Keys", open=False):
-         with gr.Row(variant="panel"):
-             with gr.Column(scale=1):
-                 openai_key = gr.Textbox(label="OpenAI API Key", type="password", value=os.getenv("OPENAI_API_KEY", ""))
-                 gr.Markdown('<a href="https://platform.openai.com/api-keys" target="_blank">Get OpenAI API Key</a>')
-             with gr.Column(scale=1):
-                 jina_key = gr.Textbox(label="Jina API Key", type="password", value=os.getenv("JINA_API_KEY", ""))
-                 gr.Markdown('<a href="https://jina.ai/reader/#apiform" target="_blank">Get Jina API Key</a>')
-             with gr.Column(scale=1):
-                 gemini_key = gr.Textbox(label="Gemini API Key", type="password", value=os.getenv("GEMINI_API_KEY", ""))
-                 gr.Markdown('<a href="https://makersuite.google.com/app/apikey" target="_blank">Get Gemini API Key</a>')
-
-     urls = gr.Textbox(lines=2, placeholder="Enter URLs separated by commas...", label="URLs")
-
-     generate_button = gr.Button("Generate Podcast", variant="primary")
-
-     with gr.Column():
-         gr.Markdown('<p style="color: #666; font-style: italic; margin-bottom: 5px;">Note: Podcast generation may take a couple of minutes.</p>', elem_id="generation-note")
-         audio_output = gr.Audio(type="filepath", label="Generated Podcast")
-
-     generate_button.click(
-         create_podcast,
-         inputs=[urls, openai_key, jina_key, gemini_key],
-         outputs=audio_output
-     )
-
-     gr.Markdown('<p style="text-align: center;">Created with ❤️ by <a href="https://github.com/souzatharsis/podcastfy" target="_blank">Podcastfy</a></p>')
-
-     # Add JavaScript for splash screen and positioning the disclaimer
-     iface.load(js="""
-     function addSplashScreen() {
-         const audioElement = document.querySelector('.audio-wrap');
-         if (audioElement) {
-             const splashScreen = document.createElement('div');
-             splashScreen.id = 'podcast-splash-screen';
-             splashScreen.innerHTML = '<p>Generating podcast... This may take a couple of minutes.</p>';
-             splashScreen.style.cssText = `
-                 position: absolute;
-                 top: 0;
-                 left: 0;
-                 right: 0;
-                 bottom: 0;
-                 background-color: rgba(0, 0, 0, 0.7);
-                 color: white;
-                 display: flex;
-                 justify-content: center;
-                 align-items: center;
-                 z-index: 1000;
-             `;
-             audioElement.style.position = 'relative';
-             audioElement.appendChild(splashScreen);
-         }
-     }
-
-     function removeSplashScreen() {
-         const splashScreen = document.getElementById('podcast-splash-screen');
-         if (splashScreen) {
-             splashScreen.remove();
-         }
-     }
-
-     function positionGenerationNote() {
-         const noteElement = document.getElementById('generation-note');
-         const audioElement = document.querySelector('.audio-wrap');
-         if (noteElement && audioElement) {
-             noteElement.style.position = 'absolute';
-             noteElement.style.top = '-25px';
-             noteElement.style.left = '0';
-             noteElement.style.zIndex = '10';
-             audioElement.style.position = 'relative';
-         }
-     }
-
-     document.querySelector('#generate_podcast').addEventListener('click', addSplashScreen);
+ def process_inputs(
+     text_input,
+     urls_input,
+     pdf_files,
+     image_files,
+     gemini_key,
+     openai_key,
+     elevenlabs_key,
+     word_count,
+     conversation_style,
+     roles_person1,
+     roles_person2,
+     dialogue_structure,
+     podcast_name,
+     podcast_tagline,
+     tts_model,
+     creativity_level,
+     user_instructions
+ ):
+     try:
+         logger.info("Starting podcast generation process")
+
+         # API key handling
+         logger.debug("Setting API keys")
+         os.environ["GEMINI_API_KEY"] = get_api_key("GEMINI_API_KEY", gemini_key)
+
+         if tts_model == "openai":
+             logger.debug("Setting OpenAI API key")
+             if not openai_key and not os.getenv("OPENAI_API_KEY"):
+                 raise ValueError("OpenAI API key is required when using OpenAI TTS model")
+             os.environ["OPENAI_API_KEY"] = get_api_key("OPENAI_API_KEY", openai_key)
+
+         if tts_model == "elevenlabs":
+             logger.debug("Setting ElevenLabs API key")
+             if not elevenlabs_key and not os.getenv("ELEVENLABS_API_KEY"):
+                 raise ValueError("ElevenLabs API key is required when using ElevenLabs TTS model")
+             os.environ["ELEVENLABS_API_KEY"] = get_api_key("ELEVENLABS_API_KEY", elevenlabs_key)
+
+         # Process URLs
+         urls = [url.strip() for url in urls_input.split('\n') if url.strip()]
+         logger.debug(f"Processed URLs: {urls}")
+
+         temp_files = []
+         temp_dirs = []
+
+         # Handle PDF files
+         if pdf_files is not None and len(pdf_files) > 0:
+             logger.info(f"Processing {len(pdf_files)} PDF files")
+             pdf_temp_dir = tempfile.mkdtemp()
+             temp_dirs.append(pdf_temp_dir)
+
+             for i, pdf_file in enumerate(pdf_files):
+                 pdf_path = os.path.join(pdf_temp_dir, f"input_pdf_{i}.pdf")
+                 temp_files.append(pdf_path)
+
+                 with open(pdf_path, 'wb') as f:
+                     f.write(pdf_file)
+                 urls.append(pdf_path)
+                 logger.debug(f"Saved PDF {i} to {pdf_path}")
+
+         # Handle image files
+         image_paths = []
+         if image_files is not None and len(image_files) > 0:
+             logger.info(f"Processing {len(image_files)} image files")
+             img_temp_dir = tempfile.mkdtemp()
+             temp_dirs.append(img_temp_dir)
+
+             for i, img_file in enumerate(image_files):
+                 # Get file extension from the original name in the file tuple
+                 original_name = img_file.orig_name if hasattr(img_file, 'orig_name') else f"image_{i}.jpg"
+                 extension = original_name.split('.')[-1]
+
+                 logger.debug(f"Processing image file {i}: {original_name}")
+                 img_path = os.path.join(img_temp_dir, f"input_image_{i}.{extension}")
+                 temp_files.append(img_path)
+
+                 try:
+                     # Write the bytes directly to the file
+                     with open(img_path, 'wb') as f:
+                         if isinstance(img_file, (tuple, list)):
+                             f.write(img_file[1])  # Write the bytes content
+                         else:
+                             f.write(img_file)  # Write the bytes directly
+                     image_paths.append(img_path)
+                     logger.debug(f"Saved image {i} to {img_path}")
+                 except Exception as e:
+                     logger.error(f"Error saving image {i}: {str(e)}")
+                     raise
+
+         # Prepare conversation config
+         logger.debug("Preparing conversation config")
+         conversation_config = {
+             "word_count": word_count,
+             "conversation_style": conversation_style.split(','),
+             "roles_person1": roles_person1,
+             "roles_person2": roles_person2,
+             "dialogue_structure": dialogue_structure.split(','),
+             "podcast_name": podcast_name,
+             "podcast_tagline": podcast_tagline,
+             "creativity": creativity_level,
+             "user_instructions": user_instructions
+         }
+
+         # Generate podcast
+         logger.info("Calling generate_podcast function")
+         logger.debug(f"URLs: {urls}")
+         logger.debug(f"Image paths: {image_paths}")
+         logger.debug(f"Text input present: {'Yes' if text_input else 'No'}")
+
+         audio_file = generate_podcast(
+             urls=urls if urls else None,
+             text=text_input if text_input else None,
+             image_paths=image_paths if image_paths else None,
+             tts_model=tts_model,
+             conversation_config=conversation_config
+         )
+
+         logger.info("Podcast generation completed")
+
+         # Cleanup
+         logger.debug("Cleaning up temporary files")
+         for file_path in temp_files:
+             if os.path.exists(file_path):
+                 os.unlink(file_path)
+                 logger.debug(f"Removed temp file: {file_path}")
+         for dir_path in temp_dirs:
+             if os.path.exists(dir_path):
+                 os.rmdir(dir_path)
+                 logger.debug(f"Removed temp directory: {dir_path}")
+
+         return audio_file
+
+     except Exception as e:
+         logger.error(f"Error in process_inputs: {str(e)}", exc_info=True)
+         # Cleanup on error
+         for file_path in temp_files:
+             if os.path.exists(file_path):
+                 os.unlink(file_path)
+         for dir_path in temp_dirs:
+             if os.path.exists(dir_path):
+                 os.rmdir(dir_path)
+         return str(e)
 
-     // Use a MutationObserver to watch for changes in the audio element
-     const observer = new MutationObserver((mutations) => {
-         mutations.forEach((mutation) => {
-             if (mutation.type === 'childList' && mutation.addedNodes.length > 0) {
-                 removeSplashScreen();
-                 positionGenerationNote();
-             }
-         });
-     });
+ # Create Gradio interface with updated theme
+ with gr.Blocks(
+     title="Podcastfy.ai",
+     theme=gr.themes.Base(
+         primary_hue="blue",
+         secondary_hue="slate",
+         neutral_hue="slate"
+     ),
+     css="""
+         /* Move toggle arrow to left side */
+         .gr-accordion {
+             --accordion-arrow-size: 1.5em;
+         }
+         .gr-accordion > .label-wrap {
+             flex-direction: row !important;
+             justify-content: flex-start !important;
+             gap: 1em;
+         }
+         .gr-accordion > .label-wrap > .icon {
+             order: -1;
+         }
+     """
+ ) as demo:
+     # Add theme toggle at the top
+     with gr.Row():
+         gr.Markdown("# 🎙️ Podcastfy.ai")
+         theme_btn = gr.Button("🌓", scale=0, min_width=0)
 
-     observer.observe(document.querySelector('.audio-wrap'), { childList: true, subtree: true });
+     gr.Markdown("An Open Source alternative to NotebookLM's podcast feature")
+     gr.Markdown("For full customization, please check Python package on github (www.podcastfy.ai).")
+
+     with gr.Tab("Content"):
+         # API Keys Section
+         gr.Markdown(
+             """
+             <h2 style='color: #2196F3; margin-bottom: 10px; padding: 10px 0;'>
+             🔑 API Keys
+             </h2>
+             """,
+             elem_classes=["section-header"]
+         )
+         with gr.Accordion("Configure API Keys", open=False):
+             gemini_key = gr.Textbox(
+                 label="Gemini API Key",
+                 type="password",
+                 value=os.getenv("GEMINI_API_KEY", ""),
+                 info="Required"
+             )
+             openai_key = gr.Textbox(
+                 label="OpenAI API Key",
+                 type="password",
+                 value=os.getenv("OPENAI_API_KEY", ""),
+                 info="Required only if using OpenAI TTS model"
+             )
+             elevenlabs_key = gr.Textbox(
+                 label="ElevenLabs API Key",
+                 type="password",
+                 value=os.getenv("ELEVENLABS_API_KEY", ""),
+                 info="Required only if using ElevenLabs TTS model [recommended]"
+             )
+
+         # Content Input Section
+         gr.Markdown(
+             """
+             <h2 style='color: #2196F3; margin-bottom: 10px; padding: 10px 0;'>
+             📝 Input Content
+             </h2>
+             """,
+             elem_classes=["section-header"]
+         )
+         with gr.Accordion("Configure Input Content", open=False):
+             with gr.Group():
+                 text_input = gr.Textbox(
+                     label="Text Input",
+                     placeholder="Enter or paste text here...",
+                     lines=3
+                 )
+                 urls_input = gr.Textbox(
+                     label="URLs",
+                     placeholder="Enter URLs (one per line) - supports websites and YouTube videos.",
+                     lines=3
+                 )
+
+                 # Place PDF and Image uploads side by side
+                 with gr.Row():
+                     with gr.Column():
+                         pdf_files = gr.Files(  # Changed from gr.File to gr.Files
+                             label="Upload PDFs",  # Updated label
+                             file_types=[".pdf"],
+                             type="binary"
+                         )
+                         gr.Markdown("*Upload one or more PDF files to generate podcast from*", elem_classes=["file-info"])
+
+                     with gr.Column():
+                         image_files = gr.Files(
+                             label="Upload Images",
+                             file_types=["image"],
+                             type="binary"
+                         )
+                         gr.Markdown("*Upload one or more images to generate podcast from*", elem_classes=["file-info"])
+
+         # Customization Section
+         gr.Markdown(
+             """
+             <h2 style='color: #2196F3; margin-bottom: 10px; padding: 10px 0;'>
+             ⚙️ Customization Options
+             </h2>
+             """,
+             elem_classes=["section-header"]
+         )
+         with gr.Accordion("Configure Podcast Settings", open=False):
+             # Basic Settings
+             gr.Markdown(
+                 """
+                 <h3 style='color: #1976D2; margin: 15px 0 10px 0;'>
+                 📊 Basic Settings
+                 </h3>
+                 """,
+             )
+             word_count = gr.Slider(
+                 minimum=500,
+                 maximum=5000,
+                 value=2000,
+                 step=100,
+                 label="Word Count",
+                 info="Target word count for the generated content"
+             )
+
+             conversation_style = gr.Textbox(
+                 label="Conversation Style",
+                 value="engaging,fast-paced,enthusiastic",
+                 info="Comma-separated list of styles to apply to the conversation"
+             )
+
+             # Roles and Structure
+             gr.Markdown(
+                 """
+                 <h3 style='color: #1976D2; margin: 15px 0 10px 0;'>
+                 👥 Roles and Structure
+                 </h3>
+                 """,
+             )
+             roles_person1 = gr.Textbox(
+                 label="Role of First Speaker",
+                 value="main summarizer",
+                 info="Role of the first speaker in the conversation"
+             )
+
+             roles_person2 = gr.Textbox(
+                 label="Role of Second Speaker",
+                 value="questioner/clarifier",
+                 info="Role of the second speaker in the conversation"
+             )
+
+             dialogue_structure = gr.Textbox(
+                 label="Dialogue Structure",
+                 value="Introduction,Main Content Summary,Conclusion",
+                 info="Comma-separated list of dialogue sections"
+             )
+
+             # Podcast Identity
+             gr.Markdown(
+                 """
+                 <h3 style='color: #1976D2; margin: 15px 0 10px 0;'>
+                 🎙️ Podcast Identity
+                 </h3>
+                 """,
+             )
+             podcast_name = gr.Textbox(
+                 label="Podcast Name",
+                 value="PODCASTFY",
+                 info="Name of the podcast"
+             )
+
+             podcast_tagline = gr.Textbox(
+                 label="Podcast Tagline",
+                 value="YOUR PERSONAL GenAI PODCAST",
+                 info="Tagline or subtitle for the podcast"
+             )
+
+             # Voice Settings
+             gr.Markdown(
+                 """
+                 <h3 style='color: #1976D2; margin: 15px 0 10px 0;'>
+                 🗣️ Voice Settings
+                 </h3>
+                 """,
+             )
+             tts_model = gr.Radio(
+                 choices=["openai", "elevenlabs", "edge"],
+                 value="openai",
+                 label="Text-to-Speech Model",
+                 info="Choose the voice generation model (edge is free but of low quality, others are superior but require API keys)"
+             )
+
+             # Advanced Settings
+             gr.Markdown(
+                 """
+                 <h3 style='color: #1976D2; margin: 15px 0 10px 0;'>
+                 🔧 Advanced Settings
+                 </h3>
+                 """,
+             )
+             creativity_level = gr.Slider(
+                 minimum=0,
+                 maximum=1,
+                 value=0.7,
+                 step=0.1,
+                 label="Creativity Level",
+                 info="Controls the creativity of the generated conversation (0 for focused/factual, 1 for more creative)"
+             )
+
+             user_instructions = gr.Textbox(
+                 label="Custom Instructions",
+                 value="",
+                 lines=2,
+                 placeholder="Add any specific instructions to guide the conversation...",
+                 info="Optional instructions to guide the conversation focus and topics"
+             )
+
+     # Output Section
+     gr.Markdown(
+         """
+         <h2 style='color: #2196F3; margin-bottom: 10px; padding: 10px 0;'>
+         🎵 Generated Output
+         </h2>
+         """,
+         elem_classes=["section-header"]
+     )
+     with gr.Group():
+         generate_btn = gr.Button("🎙️ Generate Podcast", variant="primary")
+         audio_output = gr.Audio(
+             type="filepath",
+             label="Generated Podcast"
+         )
+
+     # Footer
+     gr.Markdown("---")
+     gr.Markdown("Created with ❤️ using [Podcastfy](https://github.com/souzatharsis/podcastfy)")
+
+     # Handle generation
+     generate_btn.click(
+         process_inputs,
+         inputs=[
+             text_input, urls_input, pdf_files, image_files,
+             gemini_key, openai_key, elevenlabs_key,
+             word_count, conversation_style,
+             roles_person1, roles_person2,
+             dialogue_structure, podcast_name,
+             podcast_tagline, tts_model,
+             creativity_level, user_instructions
+         ],
+         outputs=audio_output
+     )
 
-     // Position the note on initial load
-     window.addEventListener('load', positionGenerationNote);
-     """)
+     # Add theme toggle functionality
+     theme_btn.click(
+         None,
+         None,
+         None,
+         js="""
+         function() {
+             document.querySelector('body').classList.toggle('dark');
+             return [];
+         }
+         """
+     )
 
  if __name__ == "__main__":
-     iface.launch(share=True)
+     demo.queue().launch(share=True)
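A note on the cleanup path in `process_inputs`: `temp_files` and `temp_dirs` are first assigned after the API-key checks, so a `ValueError` raised by one of those checks reaches the `except` block before either name exists. One possible way to avoid tracking the lists at all is the standard library's `tempfile.TemporaryDirectory`, sketched below. `save_uploads` and `run_with_uploads` are hypothetical names, not part of this app or of podcastfy, and the sketch assumes the generator writes its audio output somewhere other than the temporary directory.

```python
import os
import tempfile
from typing import Callable, List, Optional, Sequence


def save_uploads(
    blobs: Optional[Sequence[bytes]], directory: str, prefix: str, suffix: str
) -> List[str]:
    """Write each uploaded byte string into `directory` and return the paths."""
    paths = []
    for i, blob in enumerate(blobs or []):
        path = os.path.join(directory, f"{prefix}_{i}{suffix}")
        with open(path, "wb") as f:
            f.write(blob)
        paths.append(path)
    return paths


def run_with_uploads(
    pdf_blobs: Optional[Sequence[bytes]], generate: Callable[[List[str]], str]
) -> str:
    """Hypothetical wrapper: call `generate` with the saved PDF paths.

    The context manager deletes the directory and everything in it when the
    block exits, whether `generate` returns normally or raises.
    """
    with tempfile.TemporaryDirectory() as tmp:
        pdf_paths = save_uploads(pdf_blobs, tmp, "input_pdf", ".pdf")
        return generate(pdf_paths)
```

Here `generate` stands in for a call such as `lambda paths: generate_podcast(urls=paths, ...)`; image uploads could be saved into the same directory with a second `save_uploads` call.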
pyproject.toml CHANGED
@@ -1,16 +1,16 @@
  [tool.poetry]
- name = "podcastfy-app"
- version = "0.1.0"
- description = "Simple application for podcastfy.ai"
+ name = "podcastfy-demo"
+ version = "0.2.0"
+ description = "Demo for podcastfy"
  authors = ["Tharsis T. P. Souza"]
  readme = "README.md"
 
  [tool.poetry.dependencies]
- python = "^3.12"
- gradio-client = "^1.3.0"
- gradio = "^4.44.1"
+ python = "^3.11"
+ gradio = "^5.4.0"
+ podcastfy = "^0.2.15"
+ gradio-client = "^1.4.2"
  python-dotenv = "^1.0.1"
- podcastfy = "^0.1.12"
 
 
  [build-system]
requirements.txt CHANGED
@@ -1,4 +1,4 @@
- gradio-client==1.3.0
- gradio==4.44.1
- podcastfy==0.1.13
- python-dotenv==1.0.1
+ gradio-client==1.4.2
+ gradio==5.4.0
+ podcastfy==0.2.15
+ python-dotenv==1.0.1
tomlbackup.txt ADDED
@@ -0,0 +1,18 @@
+ [tool.poetry]
+ name = "podcastfy-app"
+ version = "0.1.0"
+ description = "Simple application for podcastfy.ai"
+ authors = ["Tharsis T. P. Souza"]
+ readme = "README.md"
+
+ [tool.poetry.dependencies]
+ python = "^3.11"
+ gradio-client = "^1.3.0"
+ gradio = "^4.44.1"
+ python-dotenv = "^1.0.1"
+ podcastfy = "^0.1.13"
+
+
+ [build-system]
+ requires = ["poetry-core"]
+ build-backend = "poetry.core.masonry.api"