Parsing YouTube URLs in PHP: Extracting Video IDs from Any Format

Introduction

When building applications that integrate with YouTube, you'll inevitably encounter the challenge of handling various URL formats. Users might paste a standard watch URL, a shortened youtu.be link, an embed URL, or even a Shorts link. Your application needs to handle all of these gracefully.

This guide demonstrates how to build a UrlUtil class that parses the common YouTube URL formats and returns a standardized embed URL, ready for use in iframes or video players.

Prerequisites

  • PHP 8.0 or higher
  • Basic understanding of regular expressions
  • Familiarity with PHP's preg_match function

Understanding YouTube URL Formats

YouTube uses several URL formats, all pointing to the same video but structured differently:

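| Format         | Example URL                     | Use Case            |
|----------------|---------------------------------|---------------------|
| Standard watch | youtube.com/watch?v=dQw4w9WgXcQ | Main website        |
| Short URL      | youtu.be/dQw4w9WgXcQ            | Share buttons       |
| Embed URL      | youtube.com/embed/dQw4w9WgXcQ   | Iframe embedding    |
| Shorts         | youtube.com/shorts/dQw4w9WgXcQ  | Mobile-first shorts |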

All these formats share one common element: the 11-character video ID (dQw4w9WgXcQ in these examples).

Video ID Structure

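YouTube video IDs are 11 characters long and consist of letters (a-z, A-Z), digits (0-9), underscores (_), and hyphens (-). This format is a long-standing convention rather than a documented guarantee, so treat it as an assumption that the patterns in this guide rely on.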
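A minimal sketch of a standalone check for this character set, assuming a hypothetical helper named isValidVideoId (it is not part of the UrlUtil class built in this guide). It validates only the shape of the ID, not whether the video exists:

```php
<?php

/**
 * Hypothetical helper: checks that a string has the shape of a video ID.
 * It does not confirm that the video exists on YouTube.
 */
function isValidVideoId(string $id): bool
{
    // \A and \z anchor the whole string, so longer inputs are rejected
    return preg_match('/\A[a-zA-Z0-9_-]{11}\z/', $id) === 1;
}

var_dump(isValidVideoId('dQw4w9WgXcQ'));   // bool(true)
var_dump(isValidVideoId('dQw4w9WgXcQ1'));  // bool(false): 12 characters
var_dump(isValidVideoId('dQw4w9WgXc!'));   // bool(false): "!" is not allowed
```

Using \z rather than $ matters here, because $ would also accept an ID followed by a trailing newline.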

Building the URL Parser

Let's create a utility class that handles all these formats:

php
<?php

namespace App\Utils;

class UrlUtil
{
    /**
     * Parse YouTube URL and convert to embed format
     * Supports: youtube.com/watch?v=ID, youtu.be/ID,
     *           youtube.com/embed/ID, youtube.com/shorts/ID
     *
     * @return string|null Embed URL or null if invalid
     */
    public static function parseYouTubeUrl(string $url): ?string
    {
        if (empty($url)) {
            return null;
        }

        // Define patterns for each URL format
        $patterns = [
            // youtube.com/watch?v=VIDEO_ID or with additional params
            '/(?:youtube\.com\/watch\?.*v=)([a-zA-Z0-9_-]{11})/',
            // youtu.be/VIDEO_ID
            '/(?:youtu\.be\/)([a-zA-Z0-9_-]{11})/',
            // youtube.com/embed/VIDEO_ID
            '/(?:youtube\.com\/embed\/)([a-zA-Z0-9_-]{11})/',
            // youtube.com/shorts/VIDEO_ID
            '/(?:youtube\.com\/shorts\/)([a-zA-Z0-9_-]{11})/',
        ];

        foreach ($patterns as $pattern) {
            if (preg_match($pattern, $url, $matches)) {
                return 'https://www.youtube.com/embed/' . $matches[1];
            }
        }

        return null;
    }
}

How the Patterns Work

Each regex pattern targets a specific URL format:

Standard Watch URL Pattern:

regex
/(?:youtube\.com\/watch\?.*v=)([a-zA-Z0-9_-]{11})/
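  • (?:youtube\.com\/watch\?.*v=) - Non-capturing group matching the domain, the /watch path, and the query string up to v=
  • .*v= - Allows other query parameters to appear before v. Because .* matches any characters, it also accepts parameter names that merely end in v (for example rev=)
  • ([a-zA-Z0-9_-]{11}) - Captures the first 11 valid characters that follow; the pattern does not check what comes after them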

Short URL Pattern:

regex
/(?:youtu\.be\/)([a-zA-Z0-9_-]{11})/
  • Simple pattern for the shortened youtu.be domain
  • Video ID appears directly after the slash

Query Parameter Order

The watch URL pattern uses .*v= to handle cases where v isn't the first parameter, like youtube.com/watch?list=PLxxx&v=VIDEO_ID.
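If parameter order and exact parameter names matter, the query string can be parsed instead of pattern-matched. The following is a minimal sketch, assuming a hypothetical function named getWatchVideoId that is not part of the class above. It uses PHP's parse_url and parse_str, and it deliberately ignores the host, so it complements the regex approach rather than replacing it:

```php
<?php

/**
 * Hypothetical alternative to the regex: read the v parameter from the
 * parsed query string, so parameter order and names are handled exactly.
 */
function getWatchVideoId(string $url): ?string
{
    // parse_url() returns null when there is no query, false on malformed input
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;
    }

    parse_str($query, $params);
    $id = $params['v'] ?? null;

    // v can be an array (for example ?v[]=x), so check the type first
    if (!is_string($id) || preg_match('/\A[a-zA-Z0-9_-]{11}\z/', $id) !== 1) {
        return null;
    }

    return $id;
}

var_dump(getWatchVideoId('https://youtube.com/watch?list=PLxxx&v=dQw4w9WgXcQ')); // string(11) "dQw4w9WgXcQ"
var_dump(getWatchVideoId('https://youtube.com/watch?rev=dQw4w9WgXcQ'));          // NULL
```

The second call returns null because rev is a different parameter, a case the .*v= pattern would accept.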

Adding Helper Methods

You might want additional utility methods for working with YouTube URLs:

php
/**
 * Check whether a URL is a recognized YouTube video URL
 */
public static function isYouTubeUrl(string $url): bool
{
    return self::parseYouTubeUrl($url) !== null;
}

/**
 * Extract just the video ID without converting to embed URL
 */
public static function getYouTubeVideoId(string $url): ?string
{
    $embedUrl = self::parseYouTubeUrl($url);

    if ($embedUrl === null) {
        return null;
    }

    // Extract ID from embed URL
    return substr($embedUrl, -11);
}
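An alternative arrangement, sketched under the assumption that you are free to restructure the class: extract the ID first and derive the embed URL from it, which avoids slicing the ID back out of a string. The function names extractYouTubeId and toEmbedUrl are hypothetical, and the single combined pattern is a condensed form of the four patterns shown earlier:

```php
<?php

/**
 * Hypothetical: return the 11-character ID directly, or null if none is found.
 */
function extractYouTubeId(string $url): ?string
{
    // One alternation covering watch, youtu.be, embed, and shorts URLs.
    // The ~ delimiter avoids having to escape every forward slash.
    $pattern = '~(?:youtube\.com/(?:watch\?.*v=|embed/|shorts/)|youtu\.be/)([a-zA-Z0-9_-]{11})~';

    return preg_match($pattern, $url, $matches) === 1 ? $matches[1] : null;
}

/**
 * Hypothetical: build the embed URL from the extracted ID.
 */
function toEmbedUrl(string $url): ?string
{
    $id = extractYouTubeId($url);

    return $id === null ? null : 'https://www.youtube.com/embed/' . $id;
}

var_dump(extractYouTubeId('https://youtu.be/dQw4w9WgXcQ')); // string(11) "dQw4w9WgXcQ"
var_dump(toEmbedUrl('invalid-url'));                        // NULL
```

With this layout the ID is the primary result, and other URL forms can be built from it without further parsing.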

Testing the Implementation

Here's how to verify your parser handles all formats correctly:

php
// Test cases (assumes UrlUtil is autoloaded, for example via Composer)
use App\Utils\UrlUtil;

$urls = [
    'https://www.youtube.com/watch?v=dQw4w9WgXcQ',
    'https://youtube.com/watch?v=dQw4w9WgXcQ&t=30',
    'https://youtu.be/dQw4w9WgXcQ',
    'https://www.youtube.com/embed/dQw4w9WgXcQ',
    'https://youtube.com/shorts/dQw4w9WgXcQ',
    'https://youtube.com/watch?list=PLxxx&v=dQw4w9WgXcQ',
    'invalid-url',
    '',
];

foreach ($urls as $url) {
    $result = UrlUtil::parseYouTubeUrl($url);
    echo $url . ' => ' . ($result ?? 'null') . "\n";
}

Expected output:

https://www.youtube.com/watch?v=dQw4w9WgXcQ => https://www.youtube.com/embed/dQw4w9WgXcQ
https://youtube.com/watch?v=dQw4w9WgXcQ&t=30 => https://www.youtube.com/embed/dQw4w9WgXcQ
https://youtu.be/dQw4w9WgXcQ => https://www.youtube.com/embed/dQw4w9WgXcQ
https://www.youtube.com/embed/dQw4w9WgXcQ => https://www.youtube.com/embed/dQw4w9WgXcQ
https://youtube.com/shorts/dQw4w9WgXcQ => https://www.youtube.com/embed/dQw4w9WgXcQ
https://youtube.com/watch?list=PLxxx&v=dQw4w9WgXcQ => https://www.youtube.com/embed/dQw4w9WgXcQ
invalid-url => null
 => null

Practical Usage: Embedding Videos

Once you have the embed URL, you can use it in your HTML:

php
$userUrl = $_POST['youtube_url'] ?? '';
$embedUrl = UrlUtil::parseYouTubeUrl($userUrl);

if ($embedUrl !== null) {
    echo '<iframe
        width="560"
        height="315"
        src="' . htmlspecialchars($embedUrl, ENT_QUOTES, 'UTF-8') . '"
        frameborder="0"
        allowfullscreen>
    </iframe>';
} else {
    echo '<p>Invalid YouTube URL provided.</p>';
}

Security Note

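Always escape URLs with htmlspecialchars() when writing them into HTML attributes to prevent XSS attacks. Although parseYouTubeUrl() builds the embed URL from a fixed prefix and a validated ID, escaping at the point of output keeps the template safe if the parser changes later.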
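The escaping step can be kept in one place by wrapping the markup in a small helper. This is a sketch only: renderYouTubeEmbed is a hypothetical function name, and the default dimensions are taken from the example above.

```php
<?php

/**
 * Hypothetical helper: build the iframe markup for an already-validated embed URL.
 */
function renderYouTubeEmbed(string $embedUrl, int $width = 560, int $height = 315): string
{
    // ENT_QUOTES escapes both quote styles; it is the default only from PHP 8.1
    $src = htmlspecialchars($embedUrl, ENT_QUOTES, 'UTF-8');

    return sprintf(
        '<iframe width="%d" height="%d" src="%s" frameborder="0" allowfullscreen></iframe>',
        $width,
        $height,
        $src
    );
}

echo renderYouTubeEmbed('https://www.youtube.com/embed/dQw4w9WgXcQ'), "\n";
```

Passing the flags explicitly gives the same escaping on PHP 8.0 and on later versions.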

Edge Cases to Consider

When working with YouTube URLs in production, keep these scenarios in mind:

  1. Protocol variations: URLs might start with http://, https://, or no protocol at all
  2. Mobile URLs: Some mobile apps generate m.youtube.com URLs
  3. Timestamp parameters: Users might include &t=123 for start times
  4. Private/unlisted videos: The URL is valid but the video might not be accessible

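The patterns above are unanchored, so they already match protocol-less and m.youtube.com URLs as substrings. The same looseness means they also accept look-alike hosts such as notyoutube.com. For a stricter solution, anchor the pattern to the start of the URL and list the accepted subdomains explicitly:

php
// Anchored pattern accepting an optional protocol and the www. or m. subdomains
'/^(?:(?:https?:)?\/\/)?(?:www\.|m\.)?youtube\.com\/watch\?.*v=([a-zA-Z0-9_-]{11})/'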
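Pattern tweaks aside, the host can also be checked separately. Below is a minimal sketch, assuming a hypothetical hasYouTubeHost helper and an allow-list of hosts chosen for illustration. It relies on PHP's parse_url to isolate the host, so look-alike domains such as notyoutube.com are rejected:

```php
<?php

/**
 * Hypothetical check: is the URL's host one of the YouTube hosts we accept?
 * The allow-list below is an assumption; adjust it to your needs.
 */
function hasYouTubeHost(string $url): bool
{
    $allowedHosts = ['youtube.com', 'www.youtube.com', 'm.youtube.com', 'youtu.be'];

    // parse_url() needs a scheme (or a leading //) to recognise the host,
    // so add one for inputs such as "youtube.com/watch?v=..."
    if (preg_match('~^(?:https?:)?//~i', $url) !== 1) {
        $url = 'https://' . $url;
    }

    $host = parse_url($url, PHP_URL_HOST);

    return is_string($host) && in_array(strtolower($host), $allowedHosts, true);
}

var_dump(hasYouTubeHost('m.youtube.com/watch?v=dQw4w9WgXcQ'));          // bool(true)
var_dump(hasYouTubeHost('https://notyoutube.com/watch?v=dQw4w9WgXcQ')); // bool(false)
```

A caller could require hasYouTubeHost($url) to return true before passing the URL to parseYouTubeUrl(), which keeps the ID patterns simple while making the host rule explicit.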

Conclusion

Parsing YouTube URLs is a common requirement when building applications that handle user-submitted video content. By using a pattern-based approach, you can reliably extract video IDs from any URL format and convert them to a standardized embed format.

Key takeaways:

  • YouTube video IDs are 11 characters long by convention
  • Multiple URL formats exist, each requiring its own regex pattern
  • Return null for invalid URLs to allow graceful error handling
  • Always escape output when embedding in HTML

This utility can be extended to support additional video platforms like Vimeo or Dailymotion by adding new patterns and handling their specific URL structures.