...
If Fuse cannot confidently identify the spoken language, it auto-transcribes the video in English by default.
Fuse requires at least 60 seconds of speech to auto-transcribe a video.
If the video contains a mixture of languages, the transcription may be less accurate.
If the audio is unclear or of poor quality, Fuse may be unable to auto-transcribe the video.