Step-by-step
Convert a YouTube video to text
Here’s how to convert a YouTube video to text in seconds: paste the link below, get the words with timestamps, then copy or export. It draws on the video’s captions, so it’s instant and free. No sign-in.
Works on any video with captions. Or add the Chrome extension for one-click transcripts on any captioned video.
The fastest way: paste the link
The quickest route needs no install. Copy the video’s URL from the address bar (or the Share button), paste it into the box above, and the words appear in seconds — every line time-coded and ready to read. There’s no account to create and no cap on how many videos you run.
This is the whole flow:
- Paste the YouTube link into the tool.
- Convert to text — the words appear with clickable timestamps.
- Copy or export the result, or translate it first.
That’s it. For the bigger picture on what this is and where the words come from, see the YouTube to text overview.
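For readers curious about what "paste the link" involves, here is a minimal sketch of pulling the video ID out of the two common link shapes: the address-bar form (`youtube.com/watch?v=...`) and the Share-button form (`youtu.be/...`). The function name `extract_video_id` and the sample ID are illustrative assumptions, not this tool's actual code.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube link, or None if none is found."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    if host == "youtu.be":
        # Share-button form: https://youtu.be/<id>
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None

    if host in ("youtube.com", "m.youtube.com") and parsed.path == "/watch":
        # Address-bar form: https://www.youtube.com/watch?v=<id>
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None

    return None


# "abc123XYZ_-" is a made-up placeholder ID, not a real video.
print(extract_video_id("https://www.youtube.com/watch?v=abc123XYZ_-&t=42s"))
print(extract_video_id("https://youtu.be/abc123XYZ_-"))
```

The sketch covers only these two shapes; other link forms (embeds, playlists, Shorts) would need their own branches.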
How the conversion works
It helps to know what’s happening behind the button. The tool reads the video’s caption track — the same lines you can switch on under the player — and lays them out as one readable block. Nothing is re-recorded, and your machine isn’t crunching audio; it’s a re-format of words the video already carries, which is why it’s instant.
Those captions come from one of two places: a creator track the uploader wrote, which is punctuated and clean, or YouTube’s auto-generated captions, which are good for clear speech but arrive without punctuation. When a video has both, the creator track gives the tidier text.
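The re-format step described above can be sketched in a few lines. This is an illustration under assumptions, not the tool's implementation: the `Cue` shape (a start time in seconds plus text) and the `[m:ss]` line layout are hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    start: float  # seconds from the start of the video (assumed shape)
    text: str


def format_timestamp(seconds: float) -> str:
    """Render seconds as m:ss, or h:mm:ss once the video passes an hour."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def cues_to_text(cues: List[Cue]) -> str:
    """Lay caption cues out as one readable, time-coded block."""
    lines = []
    for cue in cues:
        # Collapse line breaks and repeated spaces inside a cue.
        text = " ".join(cue.text.split())
        if text:
            lines.append(f"[{format_timestamp(cue.start)}] {text}")
    return "\n".join(lines)


sample = [
    Cue(0.0, "welcome back"),
    Cue(75.4, "today we cover\ncaptions"),
    Cue(3725.0, "thanks for watching"),
]
print(cues_to_text(sample))
```

No audio is touched anywhere in this sketch, which is the point: the work is string formatting over text that already exists.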
Convert inside YouTube with the extension
If you do this often, copying links gets old fast. The Chrome extension opens the words right next to the player on the watch page — one click on any video, no leaving YouTube. It’s the same free text, in context while you watch, and the quickest route when you’re working through a lot of videos in a row.
Copy and export the text
Once the words are on screen, take them with you. Copy the whole thing to the clipboard, or save it as a file:
- TXT — plain text for notes or pasting anywhere.
- Markdown — for docs and note apps like Notion or Obsidian.
- SRT and VTT — subtitle files, if you want the timed captions instead.
Each format can keep the timecodes or drop them — a clean read, or working captions. From there, the text makes a great starting point for study notes or a quick AI summary.
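To show how the two subtitle formats differ, here is a small sketch that writes the same cues as SRT and as VTT. SRT numbers each cue and uses a comma before the milliseconds; VTT starts with a `WEBVTT` header and uses a period. The `Cue` shape and the function names are assumptions for the example, not the tool's own export code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    start: float  # seconds (assumed shape)
    end: float    # seconds
    text: str


def _timecode(seconds: float, ms_separator: str) -> str:
    """Render seconds as HH:MM:SS<sep>mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{ms_separator}{ms:03d}"


def to_srt(cues: List[Cue]) -> str:
    blocks = []
    for index, cue in enumerate(cues, start=1):
        timing = f"{_timecode(cue.start, ',')} --> {_timecode(cue.end, ',')}"
        blocks.append(f"{index}\n{timing}\n{cue.text}")
    return "\n\n".join(blocks) + "\n"


def to_vtt(cues: List[Cue]) -> str:
    blocks = ["WEBVTT"]  # required first line of a VTT file
    for cue in cues:
        timing = f"{_timecode(cue.start, '.')} --> {_timecode(cue.end, '.')}"
        blocks.append(f"{timing}\n{cue.text}")
    return "\n\n".join(blocks) + "\n"


sample = [Cue(1.5, 4.0, "hello"), Cue(61.25, 63.0, "world")]
print(to_srt(sample))
print(to_vtt(sample))
```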
Convert into another language
Need the words in a language other than the video’s? Pick one from the translate menu and the whole thing switches in a click. Read a foreign-language video in your own language, or keep a copy in one you handle better in writing. It runs on the captions, so translating stays free, and the result reads as cleanly as the original.
What if the video has no captions?
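The conversion draws on captions, so a video needs a caption track to produce text. Most spoken-word videos have at least auto-generated captions, but a few come up empty. When that happens there is nothing to convert: turning the audio itself into text would need speech recognition, which this tool does not do.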
Converting long videos
Length is no obstacle. A three-minute clip and a three-hour podcast both convert right away: there’s no length limit and no queue to wait in. Long videos are where this saves the most. Instead of scrubbing through two hours to find one point, you search the words, click the line, and you’re there. The whole thing is on the page, so you skim the start, read the part you need, and leave the rest.
That turns a full lecture or a long interview from half an hour of hunting into a few seconds of searching. For a creator, a long upload becomes a first draft the moment it’s published — the words are already written, ready to reshape.
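Searching a long transcript for one point comes down to a text match that hands back the time of each hit. A minimal sketch, assuming cues are available as start time plus text (the `Cue` shape and `find_phrase` name are hypothetical):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    start: float  # seconds from the start of the video (assumed shape)
    text: str


def find_phrase(cues: List[Cue], phrase: str) -> List[Cue]:
    """Return every cue whose text contains the phrase, ignoring case."""
    needle = phrase.casefold()
    if not needle:
        return []
    return [cue for cue in cues if needle in cue.text.casefold()]


sample = [
    Cue(12.0, "Welcome back"),
    Cue(95.0, "the caption track is the source"),
    Cue(300.0, "Caption files export too"),
]
for hit in find_phrase(sample, "caption"):
    print(hit.start, hit.text)
```

One limitation of this simple version: a phrase split across two neighbouring cues will not match, so a fuller search would join adjacent lines first.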
Clean prose, no timestamps
By default each line carries the moment it was spoken, which is handy for jumping around the video. But sometimes you just want the words. Toggle the timecodes off, or export to TXT or Markdown without them, and you get plain prose — easy to paste into a document or hand to an AI tool. Switching is instant, so you can read with the times on the page, then grab a tidy, time-free copy for your notes in the same sitting.
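Dropping the timecodes amounts to joining the caption lines and re-wrapping them as ordinary prose. The sketch below uses Python's standard `textwrap` module; the function name `cues_to_prose` and the 80-character default width are illustrative assumptions.

```python
import textwrap
from typing import List


def cues_to_prose(cue_texts: List[str], width: int = 80) -> str:
    """Join caption lines into time-free prose wrapped at the given width."""
    # Splitting and re-joining collapses stray line breaks and double spaces.
    words = " ".join(cue_texts).split()
    return textwrap.fill(" ".join(words), width=width)


print(cues_to_prose(["welcome back", "today we  cover\ncaptions"]))
```

Auto-generated captions arrive without punctuation, so the output of a step like this reads as one long run of words until punctuation is added by hand or by another tool.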
What to do with the text
Once a video is text, it fits wherever you work. Read it instead of watching. Search for the one line you need and land on it. Quote it, with its timestamp, in an article or report. Paste it into an AI assistant for a summary or the key points — it’s clean text, so the model has the exact words. Drop it into your notes app as Markdown, headings and all.
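Quoting a line with its timestamp can be sketched as a Markdown blockquote that links back to the moment in the video, using YouTube's `t=<seconds>s` URL parameter. The function names and the `VIDEO_ID` placeholder are assumptions for illustration.

```python
def timestamp_link(video_id: str, seconds: float) -> str:
    """Build a watch URL that starts playback at the given second."""
    return f"https://www.youtube.com/watch?v={video_id}&t={int(seconds)}s"


def markdown_quote(video_id: str, seconds: float, text: str) -> str:
    """Format a transcript line as a Markdown quote with a timestamp link."""
    # The label is minutes:seconds; past one hour, minutes keep counting up.
    minutes, secs = divmod(int(seconds), 60)
    label = f"{minutes}:{secs:02d}"
    return f"> {text} ([{label}]({timestamp_link(video_id, seconds)}))"


# "VIDEO_ID" is a placeholder, not a real video.
print(markdown_quote("VIDEO_ID", 75.4, "today we cover captions"))
```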
Creators reshape the text from their own uploads into blog posts, show notes or social captions, which can add up to close to a week of writing from a single video. For the full background on the format, the YouTube transcript overview goes deeper.
Frequently asked questions
How do I convert a YouTube video to text for free?
Paste the link into the tool above. The video’s captions are read and laid out as text in seconds. It is free, with no sign-in and no limit on how many videos you run.
Where does the text come from?
From the captions the video already has — creator captions or auto-generated ones. It is the caption track reformatted for reading, not a fresh recording of the audio.
What if the video has no captions?
Then there is nothing to convert. Turning the audio itself into text would need speech recognition, which this tool does not do.
Can I do it on my phone?
Yes. The tool runs in any browser, so you can paste a link and read the text on a phone or tablet.