Integrate GPT 4o without TTS/STT #210

Open
clemlesne opened this issue May 25, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@clemlesne
Collaborator

OpenAI's GPT-4o model supports text, image, and audio as both input and output. Its understanding is finer than with the usual STT > model > TTS approach because the model has direct access to user behavior, emotions, etc.

Is there a way to use Communication Services to receive the raw audio stream, bypassing the STT step?
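
To illustrate the difference between the two approaches, here is a minimal sketch. Everything in it is an assumption made for illustration: `AudioFrame`, `stt`, `text_model`, `tts`, and `audio_model` are hypothetical stubs, not OpenAI or Communication Services APIs, and the "audio" is just UTF-8 bytes with a tone label attached. It only shows where the non-verbal information would get lost.

```python
from dataclasses import dataclass


@dataclass
class AudioFrame:
    pcm: bytes  # stand-in for raw audio samples (here: UTF-8 text, for the demo)
    tone: str   # stand-in for non-verbal cues such as emotion or hesitation


# --- Hypothetical stubs, not real service calls ---

def stt(frame: AudioFrame) -> str:
    """Transcribe audio to text. The tone is dropped at this step."""
    return frame.pcm.decode("utf-8")


def text_model(prompt: str) -> str:
    """Text-only model: sees the words, nothing else."""
    return f"reply to {prompt!r}"


def tts(text: str) -> bytes:
    """Synthesize speech from text (here: just encode it)."""
    return text.encode("utf-8")


def audio_model(frame: AudioFrame) -> bytes:
    """Audio-in/audio-out model: sees the words and the tone."""
    words = frame.pcm.decode("utf-8")
    return f"reply to {words!r} (tone: {frame.tone})".encode("utf-8")


# --- The two pipelines being compared ---

def cascaded_pipeline(frame: AudioFrame) -> bytes:
    """Usual STT > model > TTS chain."""
    return tts(text_model(stt(frame)))


def direct_pipeline(frame: AudioFrame) -> bytes:
    """Raw audio goes straight to the model, bypassing STT and TTS."""
    return audio_model(frame)


if __name__ == "__main__":
    frame = AudioFrame(pcm=b"where is my order", tone="frustrated")
    print(cascaded_pipeline(frame))
    print(direct_pipeline(frame))
```

In the cascaded version the tone never reaches the model, which is the limitation described above; the direct version would need the raw audio stream from the telephony side.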

@clemlesne clemlesne added the enhancement New feature or request label May 25, 2024
@Qwatro55

I'm also interested in this question.

@agentverket

What about response time?
What about costs?
Can you stream data?

@clemlesne
Collaborator Author

I know, I know :) The OpenAI APIs are not yet available:

Plus, the Communication Services APIs do not yet support working with a raw audio stream.

If you have ideas, don't hesitate!

@JunJD

JunJD commented Jul 22, 2024

m

@clemlesne
Collaborator Author


3 participants