Audio transcription: full guide

Turn your audio into text

Kits AI's Transcription service converts your audio recordings into structured text using a high-precision AI engine. Meetings, interviews, voice notes, podcasts — everything becomes a downloadable text document in minutes.

Get started for free →

How it works

01. Select your audio file

Drag and drop your audio file into the designated area, or click to browse from your device. MP3, M4A, WAV and WMA formats are accepted, up to 100 MB.

Recordings with a clear voice and no background noise give the best results.
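The format and size constraints above can be checked locally before uploading. The following is a minimal sketch in Python and is not part of the Kits AI product: the helper name `check_audio_file` is hypothetical, the check looks only at the file extension (not the actual codec), and it assumes that "100 MB" means 100 × 1024 × 1024 bytes, which this guide does not specify.

```python
from pathlib import Path

# Formats and size limit as stated in this guide.
ACCEPTED_EXTENSIONS = {".mp3", ".m4a", ".wav", ".wma"}
# Assumption: "100 MB" is read as 100 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 100 * 1024 * 1024


def check_audio_file(path):
    """Return a list of problems; an empty list means the file looks uploadable."""
    file = Path(path)
    if not file.is_file():
        return [f"{file} does not exist or is not a regular file"]
    problems = []
    if file.suffix.lower() not in ACCEPTED_EXTENSIONS:
        problems.append(f"unsupported extension: {file.suffix or '(none)'}")
    if file.stat().st_size > MAX_SIZE_BYTES:
        problems.append(f"file is larger than {MAX_SIZE_BYTES} bytes")
    return problems
```

For example, `check_audio_file("interview.m4a")` would return an empty list for an existing M4A file under the limit, and a list of human-readable problems otherwise.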

02. Start transcription

Click "Send file". The audio is first uploaded to our servers, then passed to the AI transcription engine. Processing is entirely asynchronous.

You can close the tab once the upload is complete — transcription continues in the background.

03. Track progress

Find your task in the Transcription list. The card shows the real-time status: pending, processing, or completed. The status updates automatically.

A 10-minute audio recording typically takes 1 to 3 minutes to transcribe.

04. Download the result

Once processing is complete, the card switches to "Completed" status. Download the transcription as a text file from the card menu. The file contains the full timestamped text.

The text file is ready to copy, edit or integrate into a document.

Tips for better results

MP3

Universal format, compatible with all devices. Ideal for dictaphone recordings or podcasts.

M4A

Apple format, produced by iPhones and Macs during voice recordings.

WAV

Uncompressed, high-quality format. Larger files but no audio signal loss.

WMA

Windows Media Audio format, produced by Windows devices and some dictaphones.

🎙️

Recording quality

The clearer the audio, the more accurate the transcription. Use a dedicated microphone and record in a quiet room.

🗣️

Multiple speakers

The transcription faithfully captures all speech. For multi-speaker meetings, make sure voices do not overlap too much.

📁

File size

The limit is 100 MB. For long meetings, split the audio into 30 to 60-minute segments for faster processing.
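Splitting can also be scripted. The sketch below is one possible approach using only Python's standard `wave` module, so it handles uncompressed WAV files only; it is an illustration, not a tool provided by Kits AI. The function name `split_wav` and the `_partNN` output naming are made up for this example, and each segment is read fully into memory, which is acceptable for a sketch but not ideal for very large files.

```python
import wave
from pathlib import Path


def split_wav(source, segment_minutes=30):
    """Split a WAV file into consecutive segments and return the new file paths."""
    source = Path(source)
    outputs = []
    with wave.open(str(source), "rb") as reader:
        params = reader.getparams()
        # Number of audio frames that make up one segment.
        frames_per_segment = int(reader.getframerate() * 60 * segment_minutes)
        index = 0
        while True:
            frames = reader.readframes(frames_per_segment)
            if not frames:
                break  # end of the source file
            index += 1
            target = source.with_name(f"{source.stem}_part{index:02d}.wav")
            with wave.open(str(target), "wb") as writer:
                # Copy channel count, sample width and sample rate; the frame
                # count in the header is corrected when the data is written.
                writer.setparams(params)
                writer.writeframes(frames)
            outputs.append(target)
    return outputs
```

Note that uncompressed WAV grows quickly: CD-quality stereo (44,100 samples per second, 2 bytes per sample, 2 channels) is roughly 10 MB per minute, so a 30-minute WAV segment may still exceed the 100 MB limit. In that case, pass a smaller `segment_minutes` value or convert the audio to a compressed format first.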

📝

Proofreading recommended

AI transcription is very accurate but may stumble on proper nouns and highly technical terms. A proofreading pass is recommended before publishing.

Frequently asked questions

What audio formats are accepted?

MP3, M4A, WAV and WMA, up to 100 MB. If your file exceeds this limit, split it into segments using a tool like Audacity or GarageBand.

What language is transcribed?

The engine automatically detects the spoken language and transcribes accordingly. French is the primary optimized language, but English and other common languages also work well.

How many credits does a transcription cost?

The cost depends on the audio duration. It is shown before you start the transcription, and you are never charged without confirming.

How long does processing take?

As a general rule, expect about 10 to 30% of the audio duration. A 10-minute recording is processed in 1 to 3 minutes.
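Scaling the 10-minute example gives a rough estimate for other durations. The sketch below assumes that processing time grows proportionally with audio length, which this guide does not guarantee; the function name `estimate_processing_minutes` is hypothetical and the result is only an order of magnitude.

```python
def estimate_processing_minutes(audio_minutes):
    """Return a (low, high) estimate in minutes, scaled from the 10-minute example."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    # 1 to 3 minutes of processing per 10 minutes of audio (assumed linear).
    low_ratio, high_ratio = 1 / 10, 3 / 10
    return audio_minutes * low_ratio, audio_minutes * high_ratio


low, high = estimate_processing_minutes(45)
print(f"45 min of audio: about {low:.1f} to {high:.1f} min of processing")
```

Actual times may vary with server load and file format, so treat the range as indicative.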

Can I download the transcription in multiple formats?

Currently, the transcription is available as a text file (.txt). Other formats (Word, SRT for subtitles) are planned for future versions.

Are my audio files kept?

Uploaded audio files are stored securely on Google Cloud Storage. You can delete a transcription at any time from the list.


Ready to transcribe your first audio?

Welcome credits included — no credit card required.
