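To download an MP4 video and transcribe it to text using Python, you can combine the following libraries: youtube-dl (or pytube) to download the video, moviepy to extract the audio track, speech_recognition to transcribe the audio, and python-docx to save the result as a Word document.
First, install the required libraries: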
pip install youtube-dl moviepy SpeechRecognition python-docx
Then, create a Python script with the following code:
import os
import moviepy.editor as mp
import speech_recognition as sr
from docx import Document
import youtube_dl
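def download_video(url):
    # Prefer separate MP4 video and M4A audio streams (merging them requires ffmpeg);
    # otherwise fall back to a single pre-merged MP4 file.
    ydl_opts = {
        "format": "bestvideo[ext=mp4]+bestaudio[ext=m4a]/mp4",
        "outtmpl": "video.mp4",
    }
    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        ydl.download([url])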
def extract_audio(video_file, audio_file):
    video = mp.VideoFileClip(video_file)
    audio = video.audio
    audio.write_audiofile(audio_file)
def transcribe_audio(audio_file):
    recognizer = sr.Recognizer()
    with sr.AudioFile(audio_file) as source:
        audio_data = recognizer.record(source)
    try:
        text = recognizer.recognize_google(audio_data)
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
        text = ""
    except sr.RequestError as e:
        print(f"Could not request results from Google Speech Recognition service: {str(e)}")
        text = ""
    return text
def save_to_word(text, output_file):
    doc = Document()
    doc.add_paragraph(text)
    doc.save(output_file)
# Replace the URL with the video URL you want to download
video_url = "https://www.youtube.com/watch?v=your_video_id"
video_file = "video.mp4"
audio_file = "audio.wav"
output_file = "transcription.docx"
download_video(video_url)
extract_audio(video_file, audio_file)
transcription = transcribe_audio(audio_file)
save_to_word(transcription, output_file)
print("Transcription saved to", output_file)
Replace the video_url variable with the URL of the video you want to download. This script downloads the video, extracts the audio, transcribes the audio, and saves the transcription as a Word document.
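The script above leaves the downloaded video and the extracted audio on disk after the Word document is written. A minimal sketch of an optional cleanup step follows; the helper name remove_intermediate_files is hypothetical and not part of the original script, and it assumes you no longer need the intermediate files.

```python
import os


def remove_intermediate_files(paths):
    """Delete each existing file in `paths` and return the paths that were removed.

    Hypothetical helper for illustration; not part of the original script.
    """
    removed = []
    for path in paths:
        # Skip files that were never created (for example, if the download failed).
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed


# Example usage, after save_to_word(...) has finished:
# remove_intermediate_files(["video.mp4", "audio.wav"])
```

Returning the list of removed paths makes it easy to log what was actually deleted.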
Please note that the quality of the transcription heavily depends on the clarity of the audio and the speech recognition service used. In this example, we used Google's Speech Recognition service, which requires an internet connection.
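Sending a long recording to the service in a single request may fail or time out, so one common workaround is to transcribe the audio in fixed-length windows. The sketch below is an assumption-laden illustration, not the original author's method: the helper names and the 60-second default are hypothetical, and it relies on the duration and offset parameters of Recognizer.record and on the audio being a PCM WAV file.

```python
import wave


def chunk_windows(total_seconds, chunk_seconds=60):
    """Return (offset, duration) pairs that cover total_seconds in order."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    windows = []
    offset = 0
    while offset < total_seconds:
        # The last window is shortened so it does not run past the end.
        windows.append((offset, min(chunk_seconds, total_seconds - offset)))
        offset += chunk_seconds
    return windows


def wav_duration_seconds(audio_file):
    """Read the length of a PCM WAV file from its header."""
    with wave.open(audio_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def transcribe_in_chunks(audio_file, chunk_seconds=60):
    """Transcribe a WAV file window by window and join the recognized text."""
    # Imported here so the two helpers above need only the standard library.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    pieces = []
    total = wav_duration_seconds(audio_file)
    for offset, duration in chunk_windows(total, chunk_seconds):
        with sr.AudioFile(audio_file) as source:
            audio_data = recognizer.record(source, offset=offset, duration=duration)
        try:
            pieces.append(recognizer.recognize_google(audio_data))
        except sr.UnknownValueError:
            # Nothing intelligible in this window; move on to the next one.
            continue
    return " ".join(pieces)
```

Fixed windows can cut a word in half at a boundary, so treat the joined text as a draft. Network errors (sr.RequestError) are deliberately left to propagate to the caller in this sketch.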
The code above uses Google's Speech Recognition service via the speech_recognition library, which supports both Ukrainian and Russian. To transcribe audio in these languages, modify the transcribe_audio function to accept a language parameter and pass it on to the recognize_google method.
Here's the modified transcribe_audio function:
def transcribe_audio(audio_file, language):
    recognizer = sr.Recognizer()
    with sr.AudioFile(audio_file) as source:
        audio_data = recognizer.record(source)
    try:
        text = recognizer.recognize_google(audio_data, language=language)
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
        text = ""
    except sr.RequestError as e:
        print(f"Could not request results from Google Speech Recognition service: {str(e)}")
        text = ""
    return text
To transcribe audio in Ukrainian, pass the language code "uk" when calling the function:
transcription = transcribe_audio(audio_file, language="uk")
To transcribe audio in Russian, pass the language code "ru":
transcription = transcribe_audio(audio_file, language="ru")
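Rather than editing the script each time the language changes, the language code could be read from the command line. The following is a small sketch using the standard argparse module; the argument names url, --language, and --output are hypothetical choices for illustration, and the "en-US" default mirrors the default language of recognize_google.

```python
import argparse


def parse_args(argv=None):
    """Parse the video URL, language code, and output path from the command line."""
    parser = argparse.ArgumentParser(
        description="Download a video and transcribe its audio to a Word document."
    )
    parser.add_argument("url", help="URL of the video to download")
    parser.add_argument(
        "--language",
        default="en-US",
        help='language code passed to recognize_google, e.g. "uk" or "ru"',
    )
    parser.add_argument(
        "--output",
        default="transcription.docx",
        help="path of the Word document to write",
    )
    # argv=None makes argparse read sys.argv[1:]; a list can be passed for testing.
    return parser.parse_args(argv)


# Example usage inside the script:
# args = parse_args()
# transcription = transcribe_audio(audio_file, language=args.language)
# save_to_word(transcription, args.output)
```

With this in place, a hypothetical invocation would look like: python transcribe.py "https://www.youtube.com/watch?v=your_video_id" --language uk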
As with any language, transcription accuracy depends on the quality of the audio. Google's Speech Recognition service generally performs well for both Ukrainian and Russian, but the output can still contain errors and is worth proofreading.
Author: Рудюк С.А. 2023. K2 Cloud ERP.