
Splitting Long Messages for Social Media: A PHP Utility for Character Limits

Introduction

When building chatbots, social media tools, or AI-powered applications, you'll inevitably face a common challenge: the text you want to send is longer than the platform allows.

Instagram DMs cap at 1,000 characters. Twitter/X limits standard posts to 280 characters. A single SMS segment holds 160 characters (only 70 when the text contains emoji or other characters outside the GSM-7 alphabet). And AI-generated responses rarely respect these boundaries.

The naive approach—cutting text at exactly the character limit—results in broken words and unreadable messages. What you need is intelligent splitting that respects word boundaries and produces clean, readable chunks.

The Problem

Consider this scenario: your AI assistant generates a 3,000-character response to a customer inquiry. You need to send it via Instagram DM (1,000-character limit). Simply cutting at position 1000 might produce:

...and the best approach would be to reconfi

Instead of:

...and the best approach would be to

Your users deserve better.
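
To make the difference concrete, here is a minimal sketch that compares a hard cut with a cut that backs up to the last space. The sample sentence and the limit of 41 are made up for illustration; they are not taken from a real conversation.

```php
<?php
// Illustrative sample text and limit (both hypothetical).
$text  = 'and the best approach would be to reconfigure the server';
$limit = 41;

// Naive: cut at exactly $limit characters, even in the middle of a word.
$naive = mb_substr($text, 0, $limit, 'UTF-8');

// Word-aware: back up to the last space inside the naive cut.
$lastSpace = mb_strrpos($naive, ' ', 0, 'UTF-8');
$clean = $lastSpace === false
    ? $naive
    : mb_substr($naive, 0, $lastSpace, 'UTF-8');

echo $naive, "\n"; // and the best approach would be to reconfi
echo $clean, "\n"; // and the best approach would be to
```

The second variant gives up a few characters of capacity in exchange for a chunk that ends on a whole word.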

The Solution

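Here's a PHP utility method that splits long messages at natural boundaries (spaces and line breaks) while respecting your character limit. It is written as a static method; the usage examples later in this article call it on a TextUtil class: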

php
public static function splitMessageByLength(string $message, int $length = 4096): array
{
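    // A limit below 1 cannot produce meaningful chunks
    if ($length < 1) {
        throw new \InvalidArgumentException('Length must be at least 1.');
    }

    // Normalize line breaks to \n (Windows "\r\n" first, then lone "\r")
    $message = str_replace(["\r\n", "\r"], "\n", $message);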

    $messages = [];
    $current_message = '';

    // Traverse message by each character
    for ($i = 0, $count = mb_strlen($message, 'UTF-8'); $i < $count; $i++) {
        $char = mb_substr($message, $i, 1, 'UTF-8');

        // Append the character
        $current_message .= $char;

        // Check if current message is longer than the limit.
        // Using > (not >=) lets a message of exactly $length characters stay whole.
        if (mb_strlen($current_message, 'UTF-8') > $length) {
            // Find the last occurrence of a space or line break
            $lastSpace = mb_strrpos($current_message, ' ', 0, 'UTF-8');
            $lastBreak = mb_strrpos($current_message, "\n", 0, 'UTF-8');

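            // Determine the split position (prefer line break over space).
            // A line break at position 0 is the delimiter left over from the
            // previous split, so ignore it and fall back to the last space.
            $splitPos = ($lastBreak !== false && $lastBreak > 0) ? $lastBreak : $lastSpace;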

            // If there's no space or line break, force split at max length
            if ($splitPos === false || $splitPos === 0) {
                $splitPos = $length;
            }

            // Add the current chunk to the messages array
            $messages[] = mb_substr($current_message, 0, $splitPos, 'UTF-8');

            // Start the next chunk with remaining content
            $current_message = mb_substr($current_message, $splitPos, null, 'UTF-8');
        }
    }

    // Add remainder if there is any.
    // Compare against '' because empty() would also discard the string "0".
    if ($current_message !== '') {
        $messages[] = $current_message;
    }

    // Clean up: trim whitespace and filter empty strings
    $messages = array_map('trim', $messages);
    $filtered_messages = array_filter($messages, function ($message) {
        return $message !== '';
    });

    return array_values($filtered_messages);
}

How It Works

The algorithm follows these steps:

  1. Normalize line breaks - Converts Windows (\r\n) and old Mac (\r) line breaks to Unix style (\n) for consistent handling.

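  2. Character-by-character traversal - Uses multibyte-safe functions (mb_strlen, mb_substr) so that multi-byte UTF-8 characters such as accented letters and emoji are never cut in the middle of a byte sequence. These functions count code points, so an emoji made of several code points (skin tones, flags, ZWJ sequences) can still be split between them.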

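  3. Smart boundary detection - When the limit is reached, it looks backward for the last space or line break. Line breaks are preferred since they represent more natural split points. Note that the last line break wins even when it sits early in the chunk, which can produce a chunk much shorter than the limit.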

  4. Fallback for edge cases - If no suitable split point exists (e.g., a single very long word), it forces a split at the exact limit.

  5. Cleanup - Trims whitespace from each chunk and removes any empty strings that might result from the splitting.
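
The five steps can also be sketched with a window-based loop instead of a per-character loop. This is a variant for illustration, not the method shown above: the function name splitByWindow is hypothetical, and it inspects one window of $length + 1 characters per chunk rather than re-measuring the growing chunk after every character.

```php
<?php
/**
 * Window-based splitter sketch (hypothetical name, illustrative only).
 */
function splitByWindow(string $message, int $length): array
{
    if ($length < 1) {
        throw new InvalidArgumentException('Length must be at least 1.');
    }

    // Step 1: normalize line breaks, and drop outer whitespace up front.
    $rest = trim(str_replace(["\r\n", "\r"], "\n", $message));
    $chunks = [];

    // Step 2: consume the text one window at a time.
    while (mb_strlen($rest, 'UTF-8') > $length) {
        // Look one character past the limit so a delimiter sitting right
        // after a full-length chunk still counts as a split point.
        $window = mb_substr($rest, 0, $length + 1, 'UTF-8');

        // Step 3: prefer the last line break, then the last space.
        $pos = mb_strrpos($window, "\n", 0, 'UTF-8');
        if ($pos === false || $pos === 0) {
            $pos = mb_strrpos($window, ' ', 0, 'UTF-8');
        }

        // Step 4: no usable delimiter, so hard-cut at the limit.
        if ($pos === false || $pos === 0) {
            $pos = $length;
        }

        // Step 5: trim each chunk and skip empty ones.
        $chunk = trim(mb_substr($rest, 0, $pos, 'UTF-8'));
        if ($chunk !== '') {
            $chunks[] = $chunk;
        }
        $rest = ltrim(mb_substr($rest, $pos, null, 'UTF-8'));
    }

    if ($rest !== '') {
        $chunks[] = $rest;
    }

    return $chunks;
}

print_r(splitByWindow("line one\nline two is here", 12));
// Chunks: "line one", "line two is", "here"
```

Like the original, this sketch always prefers a line break over a space, so an early line break still produces a short chunk.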

Usage Examples

Basic Usage

php
$longMessage = "This is a very long message that needs to be split...";
$chunks = TextUtil::splitMessageByLength($longMessage, 100);

foreach ($chunks as $index => $chunk) {
    echo "Part " . ($index + 1) . ": " . $chunk . "\n";
}

Platform-Specific Limits

php
class MessageSplitter
{
    const INSTAGRAM_DM = 1000;
    const TWITTER = 280;
    const SMS = 160;
    const WHATSAPP = 4096;
    const TELEGRAM = 4096;

    public static function forInstagram(string $message): array
    {
        return TextUtil::splitMessageByLength($message, self::INSTAGRAM_DM);
    }

    public static function forTwitter(string $message): array
    {
        return TextUtil::splitMessageByLength($message, self::TWITTER);
    }

    public static function forSMS(string $message): array
    {
        return TextUtil::splitMessageByLength($message, self::SMS);
    }
}

With AI Responses

php
// Get response from AI (could be very long)
$aiResponse = $anthropicService->chat($prompt);

// Split for Instagram DM delivery
$parts = TextUtil::splitMessageByLength($aiResponse, 1000);

foreach ($parts as $part) {
    $instagramApi->sendDirectMessage($userId, $part);
    // A short delay helps parts arrive in order, but does not guarantee it
    usleep(500000); // 500ms
}
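
If a part fails to send, continuing with the remaining parts leaves the recipient with a gap. The sketch below stops at the first failure. The function name sendInOrder and the $send callable are hypothetical: $send stands in for whatever client you use and is assumed to return true on success and false on failure.

```php
<?php
/**
 * Sends parts one at a time and stops at the first failure.
 * $send is a hypothetical callable wrapping your platform client; it is
 * assumed to return true on success and false on failure.
 *
 * @return int number of parts sent successfully
 */
function sendInOrder(array $parts, callable $send, int $delayMicros = 500000): int
{
    $sent = 0;

    foreach (array_values($parts) as $i => $part) {
        // Pause between parts, but not before the first one.
        if ($i > 0 && $delayMicros > 0) {
            usleep($delayMicros);
        }

        // Stop here so later parts are not delivered after a gap.
        if (!$send($part)) {
            break;
        }

        $sent++;
    }

    return $sent;
}
```

The return value tells the caller how many parts went out, so a retry can resume from that index.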

Platform Character Limits Reference

Platform | Limit | Notes
Instagram DM | 1,000 | Per message
Instagram Caption | 2,200 | Truncated at 125 in feed
Twitter/X | 280 | Premium users get more
SMS | 160 | Longer = multiple segments
WhatsApp | 4,096 | Per message
Telegram | 4,096 | Per message
Facebook Post | 63,206 | But 40-80 chars optimal
LinkedIn | 3,000 | Truncated at 140 in feed
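
The SMS row hides some arithmetic. A single GSM-7 segment holds 160 characters, but once a message is concatenated each segment carries 153, and text that needs UCS-2 encoding drops to 70 and 67. The sketch below estimates the segment count. The helper name estimateSmsSegments is hypothetical, and it makes one simplifying assumption: ASCII-only text is treated as GSM-7, which is only approximately true.

```php
<?php
/**
 * Rough SMS segment estimate (hypothetical helper).
 * Assumption: ASCII-only text is treated as GSM-7. This is an approximation:
 * a few ASCII characters (for example ^ { } [ ] ~ | \) take two GSM-7 slots,
 * and some non-ASCII characters are part of the GSM-7 alphabet.
 */
function estimateSmsSegments(string $text): int
{
    if ($text === '') {
        return 0;
    }

    if (preg_match('/[^\x00-\x7F]/', $text) === 0) {
        // Treated as GSM-7: one unit per character.
        $units  = strlen($text);
        $single = 160;
        $multi  = 153;
    } else {
        // UCS-2/UTF-16: count 16-bit code units (most emoji take two).
        $utf16  = mb_convert_encoding($text, 'UTF-16BE', 'UTF-8');
        $units  = intdiv(strlen($utf16), 2);
        $single = 70;
        $multi  = 67;
    }

    return $units <= $single ? 1 : (int) ceil($units / $multi);
}

echo estimateSmsSegments(str_repeat('a', 160)), "\n"; // 1
echo estimateSmsSegments(str_repeat('a', 161)), "\n"; // 2
```

When splitting for SMS, passing 153 (or 67 for emoji-heavy text) as the limit keeps each chunk inside one concatenated segment under these assumptions.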

Optimal Engagement

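Shorter messages are often reported to get better engagement. Commonly cited marketing figures claim that Twitter posts under 100 characters see about 17% higher engagement and that Instagram captions of 138 to 150 characters perform best. Treat these numbers as rough guidance and measure against your own audience.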

Enhancements to Consider

Adding Part Numbers

php
public static function splitWithNumbers(string $message, int $length): array
{
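    // Reserve 10 characters for the "\n\n(i/n)" suffix: 5 fixed characters plus
    // the digits of both numbers, which is enough for up to 99 parts
    $chunks = self::splitMessageByLength($message, $length - 10);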
    $total = count($chunks);

    if ($total === 1) {
        return $chunks;
    }

    return array_map(function ($chunk, $index) use ($total) {
        return $chunk . "\n\n(" . ($index + 1) . "/" . $total . ")";
    }, $chunks, array_keys($chunks));
}

Preserving Paragraph Structure

php
public static function splitByParagraphs(string $message, int $length): array
{
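    // Normalize line breaks first so "\r\n\r\n" also counts as a paragraph break
    $message = str_replace(["\r\n", "\r"], "\n", $message);
    $paragraphs = explode("\n\n", $message);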
    $chunks = [];
    $current = '';

    foreach ($paragraphs as $para) {
        // Compare against '' rather than relying on truthiness,
        // because the string "0" is falsy in PHP
        $test = $current !== '' ? $current . "\n\n" . $para : $para;

        if (mb_strlen($test, 'UTF-8') <= $length) {
            $current = $test;
        } else {
            if ($current !== '') {
                $chunks[] = $current;
            }
            // If single paragraph exceeds limit, use character splitting
            if (mb_strlen($para, 'UTF-8') > $length) {
                $chunks = array_merge($chunks,
                    self::splitMessageByLength($para, $length));
                $current = '';
            } else {
                $current = $para;
            }
        }
    }

    if ($current !== '') {
        $chunks[] = $current;
    }

    return $chunks;
}

Emoji Considerations

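A single emoji code point takes up to 4 bytes in UTF-8, and many emoji are sequences of several code points (skin tones, flags, ZWJ families). mb_strlen counts code points, while platforms may count UTF-16 code units or apply their own weighting, so the same emoji can count as 1, 2, or more characters depending on where it is sent. Test thoroughly with emoji-heavy content to ensure accurate splitting.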
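
To see how the counts diverge, the sketch below measures the same string three ways: UTF-8 bytes, code points (what mb_strlen reports), and UTF-16 code units (what some platforms and SMS encodings count). The helper name measureText is hypothetical.

```php
<?php
/**
 * Reports three common ways of measuring the same UTF-8 string.
 * (Hypothetical helper for illustration.)
 */
function measureText(string $text): array
{
    $utf16 = mb_convert_encoding($text, 'UTF-16BE', 'UTF-8');

    return [
        'bytes'       => strlen($text),
        'code_points' => mb_strlen($text, 'UTF-8'),
        'utf16_units' => intdiv(strlen($utf16), 2),
    ];
}

$samples = [
    'e-acute'             => "\u{00E9}",
    'grinning face'       => "\u{1F600}",
    'thumbs up + skin tone' => "\u{1F44D}\u{1F3FD}",
];

foreach ($samples as $label => $text) {
    $m = measureText($text);
    printf(
        "%-22s bytes=%d code_points=%d utf16_units=%d\n",
        $label,
        $m['bytes'],
        $m['code_points'],
        $m['utf16_units']
    );
}
// e-acute:               bytes=2 code_points=1 utf16_units=1
// grinning face:         bytes=4 code_points=1 utf16_units=2
// thumbs up + skin tone: bytes=8 code_points=2 utf16_units=4
```

The last sample renders as one visible emoji but is two code points, so a splitter that works on code points could place a cut between the thumb and its skin tone.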

Conclusion

Splitting long messages might seem trivial, but doing it right—respecting word boundaries, handling Unicode properly, and cleaning up the results—makes the difference between a professional application and a frustrating user experience.

The key takeaways:

  • Always use multibyte string functions for Unicode safety
  • Prefer natural split points (line breaks > spaces > hard cuts)
  • Clean up results by trimming and filtering empty chunks
  • Consider platform-specific optimizations for your use case

This utility becomes especially valuable when integrating AI services into messaging platforms, where response length is unpredictable and character limits are strict.