We are developing a web application that generates speech audio from text using GPT and ElevenLabs, and automatically blends this speech with background music. The application will allow users to upload text or audio material, select specific schemas, and customize voice emotionality. The final output will be a blended audio file where the speech is clearly heard over the background music.
Responsibilities:
Front-End Development:
Implement a user-friendly interface based on a provided template (React.js or Vue.js preferred).
Integrate file upload functionality for text and audio materials.
Develop customizable options for users, such as selecting schemas, adjusting voice emotionality, and specifying the number of output variants.
Back-End Integration:
Set up API calls to OpenAI’s GPT for text generation and ElevenLabs for speech synthesis.
Implement a solution for automatic music-speech blending, possibly using a service such as Auphonic or a tool such as FFmpeg.
Manage the processing pipeline to ensure smooth and efficient generation of output audio files.
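As a rough illustration of the blending step, the sketch below builds an FFmpeg command that lowers the music level and mixes it under the speech track. It assumes FFmpeg is the chosen tool and is installed on the server. The function name build_blend_command, the file paths, and the default music_gain of 0.2 are hypothetical choices for illustration, not part of the project specification.

```python
from typing import List


def build_blend_command(
    speech_path: str,
    music_path: str,
    output_path: str,
    music_gain: float = 0.2,  # hypothetical default; tune by ear
) -> List[str]:
    """Return an FFmpeg argument list that mixes speech over quieter music."""
    if not 0.0 <= music_gain <= 1.0:
        raise ValueError("music_gain must be between 0.0 and 1.0")

    # Input 0 is the speech, input 1 is the music.
    # 1) Attenuate the music with the `volume` filter.
    # 2) Mix both streams with `amix`; duration=first stops the output
    #    when the speech (the first amix input) ends.
    filter_graph = (
        f"[1:a]volume={music_gain}[bg];"
        "[0:a][bg]amix=inputs=2:duration=first:dropout_transition=0[out]"
    )

    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", speech_path,
        "-i", music_path,
        "-filter_complex", filter_graph,
        "-map", "[out]",           # write only the mixed stream
        output_path,
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True). Note that amix rescales its inputs by default, so the overall loudness of the result may need adjusting afterwards. A production pipeline might also duck the music dynamically rather than apply a fixed gain.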
Deployment:
Deploy the web app on a cloud platform (e.g., Heroku, AWS, Vercel).
Ensure that the app is scalable and can handle multiple concurrent users.
Requirements:
Proven experience with front-end frameworks (React.js or Vue.js).
Familiarity with API integration, particularly with OpenAI, ElevenLabs, and audio processing tools.
Experience in building responsive and user-friendly web applications.
Ability to work with cloud platforms for deployment and scaling.
Strong communication skills and the ability to work collaboratively.
Deliverables:
A fully functional web application as described above.
Documentation on how the app works, including setup and deployment instructions.
Ongoing support for a specified period post-deployment for any bug fixes or minor adjustments.
Budget: $350
Posted On: August 16, 2024 20:21 UTC
Category: Full Stack Development
Skills: ChatGPT API Integration, React, ElevenLabs, Audio & Music Software
Country: France