Recognizing speech in video and converting it to text

To download an MP4 video and transcribe it to text using Python, you can follow these steps:

1. Download the video using a library like youtube-dl or pytube.
2. Extract the audio from the video using moviepy.
3. Transcribe the audio to text using a speech recognition library like speech_recognition.
4. Save the transcribed text as a Word document using python-docx.

First, install the required libraries:

pip install youtube-dl moviepy SpeechRecognition python-docx
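
Before running the script, it may help to confirm that the packages installed correctly. The sketch below is not part of the original script: the helper names missing_modules and ffmpeg_available are hypothetical, and the ffmpeg check rests on the assumption that youtube-dl needs an ffmpeg binary on PATH to merge separately downloaded video and audio streams.

```python
import importlib.util
import shutil

# Import names differ from the pip package names:
# SpeechRecognition -> speech_recognition, python-docx -> docx.
REQUIRED_MODULES = ["youtube_dl", "moviepy", "speech_recognition", "docx"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def ffmpeg_available():
    """Return True if an ffmpeg executable is on PATH (assumed necessary for merging streams)."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    if not ffmpeg_available():
        print("ffmpeg was not found on PATH")
```

If the script prints nothing, all four modules were found and ffmpeg is reachable.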

Then, create a Python script with the following code:

import os
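try:
    import moviepy.editor as mp  # moviepy 1.x
except ImportError:
    import moviepy as mp  # moviepy 2.x removed the editor submodule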
import speech_recognition as sr
from docx import Document
import youtube_dl

def download_video(url):
    # Merging separate video and audio streams requires ffmpeg on PATH
    ydl_opts = {
        "format": "bestvideo[ext=mp4]+bestaudio[ext=m4a]/mp4",
        "outtmpl": "video.mp4",
    }
    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        ydl.download([url])

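def extract_audio(video_file, audio_file):
    # Write the video's audio track to audio_file, then release the file handles
    video = mp.VideoFileClip(video_file)
    try:
        if video.audio is None:
            raise ValueError(f"{video_file} has no audio track")
        video.audio.write_audiofile(audio_file)
    finally:
        video.close()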

def transcribe_audio(audio_file):
    recognizer = sr.Recognizer()
    # Read the whole audio file into memory
    with sr.AudioFile(audio_file) as source:
        audio_data = recognizer.record(source)
    try:
        # Send the audio to Google's Web Speech API (default language: en-US)
        text = recognizer.recognize_google(audio_data)
    except sr.UnknownValueError:
        # The service returned no usable transcription
        print("Google Speech Recognition could not understand audio")
        text = ""
    except sr.RequestError as e:
        # Network failure or the service rejected the request
        print(f"Could not request results from Google Speech Recognition service: {e}")
        text = ""
    return text

def save_to_word(text, output_file):
    doc = Document()
    doc.add_paragraph(text)
    doc.save(output_file)

# Replace the URL with the video URL you want to download
video_url = "https://www.youtube.com/watch?v=your_video_id"
video_file = "video.mp4"
audio_file = "audio.wav"
output_file = "transcription.docx"

download_video(video_url)
extract_audio(video_file, audio_file)
transcription = transcribe_audio(audio_file)
save_to_word(transcription, output_file)
print("Transcription saved to", output_file)

Set video_url to the URL of the video you want to process. The script then downloads the video to video.mp4, extracts the audio track to audio.wav, transcribes it, and saves the transcription to transcription.docx.

Transcription quality depends heavily on the clarity of the audio and on the speech recognition service used. This example uses Google's Speech Recognition service through recognize_google, which requires an internet connection and, when no language argument is given, assumes US English.
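
Long recordings sent as a single request may fail or come back incomplete with a free recognition service. That is an assumption about the service, not something the script above checks. One possible workaround is to transcribe the file in fixed-length pieces. The sketch below is hypothetical: the names wav_duration_seconds, chunk_count and transcribe_in_chunks are invented for illustration, and the 30-second chunk length is an arbitrary choice. It relies on record() continuing from the current position of the audio source when called repeatedly inside one with block.

```python
import math
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds, using only the standard library."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def chunk_count(total_seconds, chunk_seconds):
    """Return how many chunks of chunk_seconds are needed to cover total_seconds."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    return math.ceil(total_seconds / chunk_seconds)


def transcribe_in_chunks(audio_file, chunk_seconds=30, language="en-US"):
    """Transcribe audio_file piece by piece and join the recognized text."""
    import speech_recognition as sr  # imported here so the helpers above work without it

    recognizer = sr.Recognizer()
    pieces = []
    chunks = chunk_count(wav_duration_seconds(audio_file), chunk_seconds)
    with sr.AudioFile(audio_file) as source:
        for _ in range(chunks):
            # Each record() call continues from where the previous one stopped.
            audio_data = recognizer.record(source, duration=chunk_seconds)
            try:
                pieces.append(recognizer.recognize_google(audio_data, language=language))
            except sr.UnknownValueError:
                continue  # nothing intelligible in this chunk; skip it
    return " ".join(pieces)
```

Fixed-length boundaries can cut a word in half, so the joined text may contain errors at the seams. Network errors (sr.RequestError) are deliberately left to propagate to the caller.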

Ukrainian and Russian language support

The code provided above uses Google's Speech Recognition service via the speech_recognition library, which supports both Ukrainian and Russian languages. To transcribe audio in these languages, you need to modify the transcribe_audio function by adding a language parameter to the recognize_google method.

Here's the modified transcribe_audio function:

def transcribe_audio(audio_file, language="en-US"):
    # language defaults to en-US so existing calls without the argument keep working
    recognizer = sr.Recognizer()
    with sr.AudioFile(audio_file) as source:
        audio_data = recognizer.record(source)
    try:
        # Pass the language code through to Google's service
        text = recognizer.recognize_google(audio_data, language=language)
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
        text = ""
    except sr.RequestError as e:
        print(f"Could not request results from Google Speech Recognition service: {e}")
        text = ""
    return text

To transcribe audio in Ukrainian, pass the language code "uk" when calling the function:

transcription = transcribe_audio(audio_file, language="uk")

To transcribe audio in Russian, pass the language code "ru":

transcription = transcribe_audio(audio_file, language="ru")
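
To avoid editing the script each time the language changes, the choice could be exposed as a command-line option. This is a minimal sketch, not part of the original script: the parse_args helper, the --language option name, and the SUPPORTED_LANGUAGES tuple are all hypothetical.

```python
import argparse

# "uk" and "ru" come from the examples above; "en-US" is the recognize_google default.
SUPPORTED_LANGUAGES = ("en-US", "uk", "ru")


def parse_args(argv=None):
    """Parse the video URL and the transcription language from the command line."""
    parser = argparse.ArgumentParser(description="Download a video and transcribe its speech.")
    parser.add_argument("url", help="URL of the video to download")
    parser.add_argument(
        "--language",
        choices=SUPPORTED_LANGUAGES,
        default="en-US",
        help="language code passed to transcribe_audio",
    )
    return parser.parse_args(argv)


# Example with an explicit argument list instead of sys.argv
args = parse_args(["--language", "uk", "https://www.youtube.com/watch?v=your_video_id"])
print(args.language)  # uk
```

The parsed values could then replace the hard-coded ones, for example transcribe_audio(audio_file, language=args.language).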

As with English, accuracy depends on the quality of the audio. Google's Speech Recognition service generally performs well for both Ukrainian and Russian, but expect some recognition errors and proofread the result.

Author: Рудюк С.А. 2023. K2 Cloud ERP.