Extract Video Metadata in Python

Have you ever downloaded a pile of audio or video files from different sources and wondered about their resolution, length, or codec? Perhaps you’re building a media library app, or you simply want to sort videos by their characteristics. Learning to extract audio and video metadata in Python saves hours of manual inspection and lets you automate the process.

Python makes this job easy with the right tools. You can extract key information such as video resolution, frame rate, duration, codec type, and bitrate within seconds. This information helps you organize, classify, and process your media files.

Installing FFmpeg-Python Library

Before extracting metadata, you’ll need to set up the required tools. We’re going to use the ffmpeg-python library, a wrapper around FFmpeg, which is a comprehensive multimedia framework.

First, install FFmpeg on your system. Windows users can download it from the official website and add it to the system PATH. On Debian or Ubuntu Linux, install it with apt (other distributions offer it through their own package manager):

sudo apt install ffmpeg

For macOS users with Homebrew:

brew install ffmpeg

Once FFmpeg is installed, add the Python wrapper library using pip:

pip install ffmpeg-python

The ffmpeg-python library, imported as ffmpeg, lets us work with media files such as video and audio. It handles the command-line operations behind the scenes.
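Because the wrapper runs the FFmpeg command-line programs for you, it can help to confirm that they are reachable before going further. The following is a minimal sketch using only the standard library; the helper name find_missing_tools is hypothetical and not part of ffmpeg-python.

```python
import shutil


def find_missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the names of the tools that cannot be found on PATH."""
    # shutil.which returns None when no matching executable is found
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("ffmpeg and ffprobe are available.")
```

If either name is reported as missing, revisit the FFmpeg installation step and your PATH setting before running the examples that follow.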

Extracting Basic Video Metadata

Let’s start with a simple example that gets video metadata in Python. You will create a script that fetches some useful information from any video file.

import ffmpeg

video_path = 'sample_video.mp4'

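# Run ffprobe on the file and get its metadata as a dictionary
probe = ffmpeg.probe(video_path)
# Pick the first stream whose type is 'video'
video_info = next(s for s in probe['streams'] if s['codec_type'] == 'video')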

print(f"Duration: {float(probe['format']['duration'])} seconds")
print(f"Resolution: {video_info['width']}x{video_info['height']}")
print(f"Codec: {video_info['codec_name']}")
print(f"Frame Rate: {video_info['r_frame_rate']}")

The ffmpeg.probe() function does the heavy lifting here. It scans your video file and returns a dictionary containing all metadata. We filter the streams to find video-specific information, separating it from audio or subtitle tracks.

Output:

Duration: 125.5 seconds
Resolution: 1920x1080
Codec: h264
Frame Rate: 30/1
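The frame rate comes back as a fraction string such as 30/1 rather than a number. One possible way to turn it into a float is sketched below with the standard fractions module; the helper name parse_frame_rate is hypothetical, and returning None for a missing or zero-denominator rate is a design choice assumed here, not something the library prescribes.

```python
from fractions import Fraction


def parse_frame_rate(rate):
    """Convert a rate string such as '30000/1001' to a float.

    Returns None when the rate is empty, malformed, or has a zero denominator.
    """
    if not rate:
        return None
    try:
        return float(Fraction(rate))
    except (ValueError, ZeroDivisionError):
        return None


print(parse_frame_rate("30/1"))                  # 30.0
print(round(parse_frame_rate("30000/1001"), 2))  # 29.97
```

Parsing the string this way avoids evaluating text from a media file as Python code.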

Retrieving Detailed Media Information In Python

Sometimes you need more comprehensive details. Let’s expand our example to extract media details in Python, including audio properties and file information.

import ffmpeg
import json
from fractions import Fraction

def get_detailed_metadata(video_path):
    probe = ffmpeg.probe(video_path)

    # Take the first video and audio streams, or None if the file has none
    video_stream = next((s for s in probe['streams'] if s['codec_type'] == 'video'), None)
    audio_stream = next((s for s in probe['streams'] if s['codec_type'] == 'audio'), None)

    # Container-level details
    metadata = {
        'filename': probe['format']['filename'],
        'format': probe['format']['format_long_name'],
        'duration': float(probe['format']['duration']),
        'size': int(probe['format']['size']),
        'bitrate': int(probe['format']['bit_rate'])
    }

    if video_stream:
        metadata['video'] = {
            'codec': video_stream['codec_long_name'],
            'width': video_stream['width'],
            'height': video_stream['height'],
            # r_frame_rate is a fraction string such as '30/1'; parse it without eval()
            'fps': float(Fraction(video_stream['r_frame_rate'])),
            'bitrate': int(video_stream.get('bit_rate', 0))
        }
    
    if audio_stream:
        metadata['audio'] = {
            'codec': audio_stream['codec_long_name'],
            'sample_rate': audio_stream['sample_rate'],
            'channels': audio_stream['channels'],
            'bitrate': int(audio_stream.get('bit_rate', 0))
        }
    
    return metadata

video_metadata = get_detailed_metadata('sample_video.mp4')
print(json.dumps(video_metadata, indent=4))

The function creates a structured dictionary with all the important details. You can easily convert it to JSON for storage or return it in an API response.

{
    "filename": "sample_video.mp4",
    "format": "QuickTime / MOV",
    "duration": 125.5,
    "size": 52428800,
    "bitrate": 3341107,
    "video": {
        "codec": "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10",
        "width": 1920,
        "height": 1080,
        "fps": 30.0,
        "bitrate": 3145728
    },
    "audio": {
        "codec": "AAC (Advanced Audio Coding)",
        "sample_rate": "48000",
        "channels": 2,
        "bitrate": 192000
    }
}

This is useful in systems where users upload media files and you, as the hosting platform, need to validate the format, quality, and other details before accepting the video or audio into your application.
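As a rough illustration of such a check, the sketch below validates a dictionary shaped like the one returned by get_detailed_metadata. The limits and the function name validate_upload are hypothetical; a real platform would pick its own rules.

```python
# Hypothetical limits; adjust them to your platform's own policy.
MAX_DURATION_SECONDS = 600
MAX_SIZE_BYTES = 100 * 1024 * 1024  # 100 MiB
MIN_HEIGHT = 480


def validate_upload(metadata):
    """Return a list of problems found; an empty list means the file passes."""
    problems = []
    if metadata.get("duration", 0) > MAX_DURATION_SECONDS:
        problems.append("video is too long")
    if metadata.get("size", 0) > MAX_SIZE_BYTES:
        problems.append("file is too large")
    video = metadata.get("video")
    if video is None:
        problems.append("no video stream found")
    elif video["height"] < MIN_HEIGHT:
        problems.append("resolution is too low")
    return problems


# Sample values matching the JSON output shown earlier
sample = {
    "duration": 125.5,
    "size": 52428800,
    "video": {"width": 1920, "height": 1080},
}
print(validate_upload(sample))  # []
```

Returning a list of problems rather than a single boolean lets the application tell the uploader everything that needs fixing in one response.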

Extracting Creation Date and Tags

Some videos contain additional metadata like creation dates, author information, or custom tags. Let’s look at an example that fetches the creation date and tags in Python:

import ffmpeg

def get_extended_info(video_path):
    probe = ffmpeg.probe(video_path)
    format_info = probe['format']
    
    tags = format_info.get('tags', {})
    
    info = {
        'creation_time': tags.get('creation_time', 'Not available'),
        'title': tags.get('title', 'No title'),
        'artist': tags.get('artist', 'Unknown'),
        'comment': tags.get('comment', 'No comment')
    }
    
    return info

extended_data = get_extended_info('sample_video.mp4')

for key, value in extended_data.items():
    print(f"{key.title()}: {value}")

This reads the creation time, title, artist, and comment tags from the video file, falling back to a default value when a tag is missing.
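The creation time is returned as plain text. Assuming it is an ISO 8601 timestamp ending in Z, which is how it commonly appears, it could be converted to a datetime object along these lines. The helper name parse_creation_time and the sample timestamp are made up for illustration.

```python
from datetime import datetime


def parse_creation_time(value):
    """Parse a timestamp such as '2023-05-14T10:22:31.000000Z'.

    Returns None if the value is not an ISO 8601 string.
    """
    try:
        # Older Python versions reject a trailing 'Z' in fromisoformat,
        # so replace it with an explicit UTC offset first.
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    except (AttributeError, ValueError):
        return None


created = parse_creation_time("2023-05-14T10:22:31.000000Z")
print(created.year)                          # 2023
print(parse_creation_time("Not available"))  # None
```

With a real datetime you can sort files by creation date or filter them by date range, and the None result makes the 'Not available' fallback easy to detect.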

Conclusion

Video metadata extraction doesn’t have to be hard. Python’s ffmpeg-python library has all you need to obtain video metadata in Python quickly and accurately. This guide covered basic extraction, detailed video and audio properties, and creation dates and tags.

Once you know how to extract video metadata, you can build powerful automation workflows. If you’re working with online videos, check out our guide on Download YouTube Videos Using Python.