Speech-to-Text
Convert speech to text using AI
Configuration
OpenAI Whisper
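- Provider (required): OpenAI Whisper
- Audio/Video File (required): Upload an audio or video file
- Audio/Video File Reference (required): Reference audio/video from previous blocks
- Audio/Video URL: Or enter a publicly accessible audio/video URL
- Language (required): selected from a dropdown
- Timestamps (required): selected from a dropdown
- API Key (required): entered as a masked value
- Model (required): selected from a dropdown
- Translate to English: Disabled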
Output
| Parameter | Type | Description |
|---|---|---|
| transcript | string | Full transcribed text |
| segments | array | Timestamped segments with speaker labels |
| language | string | Detected or specified language |
| duration | number | Audio duration in seconds |
| confidence | number | Overall confidence score |
| sentiment | array | Sentiment analysis results |
| entities | array | Detected entities |
| summary | string | Auto-generated summary |
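
The output fields in the table can be modeled as a typed dictionary, which makes downstream handling easier to reason about. The sketch below is illustrative only: the top-level keys come from the table, but the keys inside each segment (`start`, `end`, `text`, `speaker`) and the helper `format_segments` are assumptions, since the table only says that segments are timestamped and carry speaker labels.

```python
from typing import List, TypedDict


class Segment(TypedDict, total=False):
    # Assumed field names; the documentation does not list them.
    start: float   # segment start, in seconds
    end: float     # segment end, in seconds
    text: str      # text spoken in this segment
    speaker: str   # speaker label


class SttOutput(TypedDict, total=False):
    # Top-level keys as listed in the Output table.
    transcript: str
    segments: List[Segment]
    language: str
    duration: float
    confidence: float
    sentiment: list
    entities: list
    summary: str


def format_segments(output: SttOutput) -> List[str]:
    """Render each segment as '[start-end] speaker: text'."""
    lines = []
    for seg in output.get("segments", []):
        start = seg.get("start", 0.0)
        end = seg.get("end", 0.0)
        speaker = seg.get("speaker", "unknown")
        text = seg.get("text", "")
        lines.append(f"[{start:.1f}s-{end:.1f}s] {speaker}: {text}")
    return lines


# Hand-written sample data, not real provider output.
example: SttOutput = {
    "transcript": "Hello there. Hi.",
    "segments": [
        {"start": 0.0, "end": 1.2, "text": "Hello there.", "speaker": "A"},
        {"start": 1.5, "end": 1.9, "text": "Hi.", "speaker": "B"},
    ],
    "language": "en",
    "duration": 1.9,
    "confidence": 0.93,
}

for line in format_segments(example):
    print(line)
```

Every key is declared optional (`total=False`) so the same shape can hold results that omit a field; whether a given provider actually omits fields is not stated in the table.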
Deepgram
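- Provider (required): Deepgram
- Audio/Video File (required): Upload an audio or video file
- Audio/Video File Reference (required): Reference audio/video from previous blocks
- Audio/Video URL: Or enter a publicly accessible audio/video URL
- Language (required): selected from a dropdown
- Timestamps (required): selected from a dropdown
- API Key (required): entered as a masked value
- Model (required): selected from a dropdown
- Speaker Diarization: Disabled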
Output
| Parameter | Type | Description |
|---|---|---|
| transcript | string | Full transcribed text |
| segments | array | Timestamped segments with speaker labels |
| language | string | Detected or specified language |
| duration | number | Audio duration in seconds |
| confidence | number | Overall confidence score |
| sentiment | array | Sentiment analysis results |
| entities | array | Detected entities |
| summary | string | Auto-generated summary |
ElevenLabs
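- Provider (required): ElevenLabs
- Audio/Video File (required): Upload an audio or video file
- Audio/Video File Reference (required): Reference audio/video from previous blocks
- Audio/Video URL: Or enter a publicly accessible audio/video URL
- Language (required): selected from a dropdown
- Timestamps (required): selected from a dropdown
- API Key (required): entered as a masked value
- Model (required): selected from a dropdown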
Output
| Parameter | Type | Description |
|---|---|---|
| transcript | string | Full transcribed text |
| segments | array | Timestamped segments with speaker labels |
| language | string | Detected or specified language |
| duration | number | Audio duration in seconds |
| confidence | number | Overall confidence score |
| sentiment | array | Sentiment analysis results |
| entities | array | Detected entities |
| summary | string | Auto-generated summary |
AssemblyAI
- Provider (required): AssemblyAI
- Audio/Video File (required): Upload an audio or video file
- Audio/Video File Reference (required): Reference audio/video from previous blocks
- Audio/Video URL: Or enter a publicly accessible audio/video URL
- Language (required): selected from a dropdown
- Timestamps (required): selected from a dropdown
- API Key (required): entered as a masked value
- Model (required): selected from a dropdown
- Speaker Diarization: Disabled
- Sentiment Analysis: Disabled
- Entity Detection: Disabled
- PII Redaction: Disabled
- Auto Summarization: Disabled
Output
| Parameter | Type | Description |
|---|---|---|
| transcript | string | Full transcribed text |
| segments | array | Timestamped segments with speaker labels |
| language | string | Detected or specified language |
| duration | number | Audio duration in seconds |
| confidence | number | Overall confidence score |
| sentiment | array | Sentiment analysis results |
| entities | array | Detected entities |
| summary | string | Auto-generated summary |
Google Gemini
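- Provider (required): Google Gemini
- Audio/Video File (required): Upload an audio or video file
- Audio/Video File Reference (required): Reference audio/video from previous blocks
- Audio/Video URL: Or enter a publicly accessible audio/video URL
- Language (required): selected from a dropdown
- Timestamps (required): selected from a dropdown
- API Key (required): entered as a masked value
- Model (required): selected from a dropdown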
Output
| Parameter | Type | Description |
|---|---|---|
| transcript | string | Full transcribed text |
| segments | array | Timestamped segments with speaker labels |
| language | string | Detected or specified language |
| duration | number | Audio duration in seconds |
| confidence | number | Overall confidence score |
| sentiment | array | Sentiment analysis results |
| entities | array | Detected entities |
| summary | string | Auto-generated summary |
Usage Instructions
Transcribe audio and video files to text using leading AI providers. Supports multiple languages, timestamps, and speaker diarization.
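
Not every provider exposes the same optional settings, so it can help to check a request against the provider before running it. The sketch below transcribes the optional toggles shown in each provider's configuration panel on this page into a lookup table. The names `PROVIDER_OPTIONS`, `unsupported_options`, and `providers_supporting` are hypothetical helpers for illustration, not part of the block's API, and the strings are the labels as displayed, not internal identifiers.

```python
from typing import Dict, FrozenSet, Iterable, List

# Optional toggles per provider, copied from the configuration panels.
PROVIDER_OPTIONS: Dict[str, FrozenSet[str]] = {
    "OpenAI Whisper": frozenset({"Translate to English"}),
    "Deepgram": frozenset({"Speaker Diarization"}),
    "ElevenLabs": frozenset(),
    "AssemblyAI": frozenset({
        "Speaker Diarization",
        "Sentiment Analysis",
        "Entity Detection",
        "PII Redaction",
        "Auto Summarization",
    }),
    "Google Gemini": frozenset(),
}


def unsupported_options(provider: str, requested: Iterable[str]) -> List[str]:
    """Return the requested toggles that the provider's panel does not show."""
    if provider not in PROVIDER_OPTIONS:
        raise ValueError(f"Unknown provider: {provider}")
    return sorted(set(requested) - PROVIDER_OPTIONS[provider])


def providers_supporting(option: str) -> List[str]:
    """Return the providers whose panel shows the given toggle."""
    return sorted(
        name for name, options in PROVIDER_OPTIONS.items() if option in options
    )


print(unsupported_options("Deepgram", ["Speaker Diarization", "PII Redaction"]))
print(providers_supporting("Speaker Diarization"))
```

A lookup like this only reflects the toggles listed on this page; a provider may offer further capabilities through its model selection that are not shown as separate settings.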
Notes
- Category: tools
- Type: stt