Don't Blog Me. I'll Blog You.: Streaming JPEG images over RTP using live555 streaming media

Live555 Streaming Media is a software library that mainly provides RTP/RTSP streaming features. It supports many popular video and audio codecs. At this moment, JPEG is still the most popular lossy image compression format. Live555 supports JPEG as well, but not directly: it implements the RTP payload format for JPEG, yet it doesn't come with any JPEG parser. Following the common Live555 programming flow, a user who wants to stream JPEG images through it needs:

(i) "JPEGVideoRTPSink", which will be fed by

(ii) a *subclass* of "JPEGVideoSource".

If you take a peek at "JPEGVideoSource", you will find that it is an abstract base class. You must implement your own subclass of "JPEGVideoSource" that provides implementations of "type()", "qFactor()", "width()", "height()" and even "quantizationTables()". In brief, a JPEG parser has to be implemented first to make it work. Take a look at RFC 2435:

3.1.  JPEG header
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type-specific |              Fragment Offset                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Type     |       Q       |     Width     |     Height    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
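
To make the layout above concrete, here is a minimal sketch of how the eight bytes of the main JPEG header could be packed. It is not code from Live555; `MainJpegHeader` and `packMainJpegHeader` are hypothetical names chosen for illustration, and the width and height are assumed to be already expressed in 8-pixel units, as RFC 2435 requires.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper type: the fields of the RFC 2435 main JPEG header.
struct MainJpegHeader {
    uint8_t  typeSpecific;    // interpretation depends on Type
    uint32_t fragmentOffset;  // offset of this packet's data in the frame (24 bits used)
    uint8_t  type;            // 64..127 means restart markers are in use
    uint8_t  q;               // 128..255 means a quantization table header follows
    uint8_t  width;           // image width in 8-pixel units
    uint8_t  height;          // image height in 8-pixel units
};

// Serialize the header in network byte order (big-endian).
std::array<uint8_t, 8> packMainJpegHeader(const MainJpegHeader& h)
{
    return {{
        h.typeSpecific,
        static_cast<uint8_t>((h.fragmentOffset >> 16) & 0xFF),
        static_cast<uint8_t>((h.fragmentOffset >> 8) & 0xFF),
        static_cast<uint8_t>(h.fragmentOffset & 0xFF),
        h.type,
        h.q,
        h.width,
        h.height
    }};
}
```

For a 640x480 image, the width and height fields would hold 80 and 60. Because each field is a single byte, the largest dimension that can be expressed this way is 255 * 8 = 2040 pixels.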

3.1.7.  Restart Marker header
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Restart Interval        |F|L|       Restart Count       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
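
The restart marker header can be sketched in the same way. The function below is only an illustration with a hypothetical name, `packRestartMarkerHeader`; it packs the 16-bit restart interval, the F and L bits, and the 14-bit restart count in network byte order.

```cpp
#include <array>
#include <cstdint>

// Pack the 4-byte restart marker header: 16-bit Restart Interval, then the
// F (first) and L (last) bits, then a 14-bit Restart Count.
std::array<uint8_t, 4> packRestartMarkerHeader(uint16_t restartInterval,
                                               bool first, bool last,
                                               uint16_t restartCount)
{
    const uint16_t flagsAndCount = static_cast<uint16_t>(
        (first ? 0x8000 : 0) | (last ? 0x4000 : 0) | (restartCount & 0x3FFF));

    return {{
        static_cast<uint8_t>(restartInterval >> 8),
        static_cast<uint8_t>(restartInterval & 0xFF),
        static_cast<uint8_t>(flagsAndCount >> 8),
        static_cast<uint8_t>(flagsAndCount & 0xFF)
    }};
}
```

When restart intervals are not guaranteed to be aligned with packet boundaries, RFC 2435 requires F and L to be set to 1 and the restart count to be 0x3FFF, so the last two bytes become 0xFF 0xFF.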

3.1.8.  Quantization Table header
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      MBZ      |   Precision   |             Length            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Quantization Table Data                    |
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
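
A quantization table header followed by its table data could be assembled as in the sketch below. The name `packQuantizationTableHeader` is hypothetical, and the caller is assumed to pass the tables as one contiguous buffer, for example two 64-byte tables with 8-bit coefficients (length 128, precision 0).

```cpp
#include <cstdint>
#include <vector>

// Build the 4-byte quantization table header followed by the table data.
// "precision" holds one bit per table (0 = 8-bit coefficients) and
// "length" is the number of table bytes that follow the header.
std::vector<uint8_t> packQuantizationTableHeader(uint8_t precision,
                                                 const uint8_t* tables,
                                                 uint16_t length)
{
    std::vector<uint8_t> out;
    out.reserve(4u + length);

    out.push_back(0);                                    // MBZ (must be zero)
    out.push_back(precision);                            // Precision
    out.push_back(static_cast<uint8_t>(length >> 8));    // Length, high byte
    out.push_back(static_cast<uint8_t>(length & 0xFF));  // Length, low byte
    out.insert(out.end(), tables, tables + length);      // Quantization Table Data

    return out;
}
```

This header is only present when the Q field of the main JPEG header is in the range 128 to 255.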

The fields of the JPEG/RTP payload header are mapped by "JPEGVideoSource". After studying RFC 2435 and other implementations from the open source community, I wrote a JPEG image parser that is able to work with "JPEGVideoSource" and its subclasses. This parser is meant to be used for parsing JPEG image files.

JpegFrameParser.hh

/*
    Copyright (C) 2011, W.L. Chuang <ponponli2000 at gmail.com>

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.
*/

#ifndef _JPEG_FRAME_PARSER_HH_INCLUDED
#define _JPEG_FRAME_PARSER_HH_INCLUDED


class JpegFrameParser
{
public:
    JpegFrameParser();
    virtual ~JpegFrameParser();

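    unsigned char width()     { return _width; }     /* image width in 8-pixel units */
    unsigned char height()    { return _height; }    /* image height in 8-pixel units */
    unsigned char type()      { return _type; }      /* RFC 2435 type; 64 is added when a DRI segment is found */
    unsigned char precision() { return _precision; }
    unsigned char qFactor()   { return _qFactor; }   /* stays at its initial value of 255 */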

    unsigned short restartInterval() { return _restartInterval; }

    unsigned char const* quantizationTables(unsigned short& length)
    {
        length = _qTablesLength;
        return _qTables;
    }

    int parse(unsigned char* data, unsigned int size);

    unsigned char const* scandata(unsigned int& length)
    {
        length = _scandataLength;

        return _scandata;
    }

private:
    unsigned int scanJpegMarker(const unsigned char* data,
                                unsigned int size,
                                unsigned int* offset);
    int readSOF(const unsigned char* data,
                unsigned int size, unsigned int* offset);
    unsigned int readDQT(const unsigned char* data,
                         unsigned int size, unsigned int offset);
    int readDRI(const unsigned char* data,
                unsigned int size, unsigned int* offset);

private:
    unsigned char _width;
    unsigned char _height;
    unsigned char _type;
    unsigned char _precision;
    unsigned char _qFactor;

    unsigned char* _qTables;
    unsigned short _qTablesLength;

    unsigned short _restartInterval;

    unsigned char* _scandata;
    unsigned int   _scandataLength;
};


#endif /* _JPEG_FRAME_PARSER_HH_INCLUDED */

JpegFrameParser.cpp

/*
    Copyright (C) 2011, W.L. Chuang <ponponli2000 at gmail.com>

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.
*/

#include <string.h>

#include "JpegFrameParser.hh"

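#ifndef NDEBUG
    #include <stdio.h>
    #define LOGGY(format, ...) fprintf (stderr, format, ##__VA_ARGS__)
#else
    /* compile logging out of release builds so that LOGGY() calls still build */
    #define LOGGY(format, ...) ((void)0)
#endif /* NDEBUG */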


enum
{
    START_MARKER = 0xFF,
    SOI_MARKER   = 0xD8,
    JFIF_MARKER  = 0xE0,
    CMT_MARKER   = 0xFE,
    DQT_MARKER   = 0xDB,
    SOF_MARKER   = 0xC0,
    DHT_MARKER   = 0xC4,
    SOS_MARKER   = 0xDA,
    EOI_MARKER   = 0xD9,
    DRI_MARKER   = 0xDD
};

typedef struct
{
    unsigned char id;
    unsigned char samp;
    unsigned char qt;
} CompInfo;


JpegFrameParser::JpegFrameParser() :
    _width(0), _height(0), _type(0),
    _precision(0), _qFactor(255),
    _qTables(NULL), _qTablesLength(0),
    _restartInterval(0),
    _scandata(NULL), _scandataLength(0)
{
    _qTables = new unsigned char[128 * 2];
    memset(_qTables, 8, 128 * 2);
}

JpegFrameParser::~JpegFrameParser()
{
    if (_qTables != NULL) delete[] _qTables;
}

unsigned int JpegFrameParser::scanJpegMarker(const unsigned char* data,
                                             unsigned int size,
                                             unsigned int* offset)
{
    while ((data[(*offset)++] != START_MARKER) && ((*offset) < size));

    if ((*offset) >= size) {
        return EOI_MARKER;
    } else {
        unsigned int marker;

        marker = data[*offset];
        (*offset)++;

        return marker;
    }
}

static unsigned int _jpegHeaderSize(const unsigned char* data, unsigned int offset)
{
    return data[offset] << 8 | data[offset + 1];
}

int JpegFrameParser::readSOF(const unsigned char* data, unsigned int size,
                             unsigned int* offset)
{
    int i, j;
    CompInfo elem;
    CompInfo info[3] = { {0,}, };
    unsigned int sof_size, off;
    unsigned int width, height, infolen;

    off = *offset;

    /* we need at least 17 bytes for the SOF */
    if (off + 17 > size) goto wrong_size;

    sof_size = _jpegHeaderSize(data, off);
    if (sof_size < 17) goto wrong_length;

    *offset += sof_size;

    /* skip size */
    off += 2;

    /* precision should be 8 */
    if (data[off++] != 8) goto bad_precision;

    /* read dimensions */
    height = data[off] << 8 | data[off + 1];
    width = data[off + 2] << 8 | data[off + 3];
    off += 4;

    if (height == 0 || height > 2040) goto invalid_dimension;
    if (width == 0 || width > 2040) goto invalid_dimension;

    _width = width / 8;
    _height = height / 8;

    /* we only support 3 components */
    if (data[off++] != 3) goto bad_components;

    infolen = 0;
    for (i = 0; i < 3; i++) {
        elem.id = data[off++];
        elem.samp = data[off++];
        elem.qt = data[off++];

        /* insertion sort from the last element to the first;
           stop at index 0 so that the first component is compared as well */
        for (j = infolen; j > 0; j--) {
            if (info[j - 1].id < elem.id) break;
            info[j] = info[j - 1];
        }
        info[j] = elem;
        infolen++;
    }

    /* see that the components are supported */
    if (info[0].samp == 0x21) {
        _type = 0;
    } else if (info[0].samp == 0x22) {
        _type = 1;
    } else {
        goto invalid_comp;
    }

    if (!(info[1].samp == 0x11)) goto invalid_comp;
    if (!(info[2].samp == 0x11)) goto invalid_comp;
    if (info[1].qt != info[2].qt) goto invalid_comp;

    return 0;

    /* ERRORS */
wrong_size:
    LOGGY("Wrong SOF size\n");
    return -1;

wrong_length:
    LOGGY("Wrong SOF length\n");
    return -1;

bad_precision:
    LOGGY("Bad precision\n");
    return -1;

invalid_dimension:
    LOGGY("Invalid dimension\n");
    return -1;

bad_components:
    LOGGY("Bad component\n");
    return -1;

invalid_comp:
    LOGGY("Invalid component\n");
    return -1;
}

unsigned int JpegFrameParser::readDQT(const unsigned char* data,
                                      unsigned int size,
                                      unsigned int offset)
{
    unsigned int quant_size, tab_size;
    unsigned char prec;
    unsigned char id;

    if (offset + 2 > size) goto too_small;

    quant_size = _jpegHeaderSize(data, offset);
    if (quant_size < 2) goto small_quant_size;

    /* clamp to available data */
    if (offset + quant_size > size) {
        quant_size = size - offset;
    }

    offset += 2;
    quant_size -= 2;

    while (quant_size > 0) {
        /* not enough to read the id */
        if (offset + 1 > size) break;

        id = data[offset] & 0x0f;
        if (id == 15) goto invalid_id;

        prec = (data[offset] & 0xf0) >> 4;
        if (prec) {
            tab_size = 128;
            _qTablesLength = 128 * 2;
        } else {
            tab_size = 64;
            _qTablesLength = 64 * 2;
        }

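        /* there is not enough for the table */
        if (quant_size < tab_size + 1) goto no_table;

        /* the table must fit into the 256-byte _qTables buffer */
        if ((id + 1) * tab_size > 128 * 2) goto invalid_id;

        //LOGGY("Copy quantization table: %u\n", id);
        memcpy(&_qTables[id * tab_size], &data[offset + 1], tab_size);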

        tab_size += 1;
        quant_size -= tab_size;
        offset += tab_size;
    }

done:
    return offset + quant_size;

    /* ERRORS */
too_small:
    LOGGY("DQT is too small\n");
    return size;

small_quant_size:
    LOGGY("Quantization table is too small\n");
    return size;

invalid_id:
    LOGGY("Invalid table ID\n");
    goto done;

no_table:
    LOGGY("table doesn't exist\n");
    goto done;
}

int JpegFrameParser::readDRI(const unsigned char* data,
                             unsigned int size, unsigned int* offset)
{
    unsigned int dri_size, off;

    off = *offset;

    /* we need at least 4 bytes for the DRI */
    if (off + 4 > size) goto wrong_size;

    dri_size = _jpegHeaderSize(data, off);
    if (dri_size < 4) goto wrong_length;

    *offset += dri_size;
    off += 2;

    _restartInterval = (data[off] << 8) | data[off + 1];

    return 0;

wrong_size:
    return -1;

wrong_length:
    *offset += dri_size;
    return -1;
}

int JpegFrameParser::parse(unsigned char* data, unsigned int size)
{
    _width  = 0;
    _height = 0;
    _type = 0;
    _precision = 0;
    //_qFactor = 0;
    _restartInterval = 0;

    _scandata = NULL;
    _scandataLength = 0;

    unsigned int offset = 0;
    unsigned int dqtFound = 0;
    unsigned int sosFound = 0;
    unsigned int sofFound = 0;
    unsigned int driFound = 0;
    unsigned int jpeg_header_size = 0;

    while ((sosFound == 0) && (offset < size)) {
        switch (scanJpegMarker(data, size, &offset)) {
        case JFIF_MARKER:
        case CMT_MARKER:
        case DHT_MARKER:
            offset += _jpegHeaderSize(data, offset);
            break;
        case SOF_MARKER:
            if (readSOF(data, size, &offset) != 0) {
                goto invalid_format;
            }
            sofFound = 1;
            break;
        case DQT_MARKER:
            offset = readDQT(data, size, offset);
            dqtFound = 1;
            break;
        case SOS_MARKER:
            sosFound = 1;
            jpeg_header_size = offset + _jpegHeaderSize(data, offset);
            break;
        case EOI_MARKER:
            /* EOI reached before SOS!? */
            LOGGY("EOI reached before SOS!?\n");
            break;
        case SOI_MARKER:
            //LOGGY("SOI found\n");
            break;
        case DRI_MARKER:
            LOGGY("DRI found\n");
            if (readDRI(data, size, &offset) == 0) {
                driFound = 1;
            }
            break;
        default:
            break;
        }
    }
    /* without an SOS marker there is no scan data to return */
    if ((dqtFound == 0) || (sofFound == 0) || (sosFound == 0)) {
        goto unsupported_jpeg;
    }

    if (_width == 0 || _height == 0) {
        goto no_dimension;
    }

    _scandata = data + jpeg_header_size;
    _scandataLength = size - jpeg_header_size;

    if (driFound == 1) {
        _type += 64;
    }

    return 0;

    /* ERRORS */
unsupported_jpeg:
    return -1;

no_dimension:
    return -1;

invalid_format:
    return -1;
}
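
How the parser might be combined with a "JPEGVideoSource" subclass can be sketched as follows. This is only an illustration under stated assumptions, not the code used for this post: `JpegSourceAccessors` is a stand-in that mirrors the accessor functions of live555's "JPEGVideoSource" so that the example compiles without live555, `ParsedJpeg` is a hypothetical plain copy of the values the parser exposes after parse(), and `ParsedJpegSource` is a hypothetical name.

```cpp
#include <cstdint>

// ASSUMPTION: stand-in for the accessor part of live555's JPEGVideoSource.
// The real class also derives from FramedSource and is constructed with a
// UsageEnvironment&; both are left out so the sketch is self-contained.
class JpegSourceAccessors {
public:
    virtual ~JpegSourceAccessors() {}
    virtual uint8_t type() = 0;
    virtual uint8_t qFactor() = 0;
    virtual uint8_t width() = 0;    // pixels / 8
    virtual uint8_t height() = 0;   // pixels / 8
    virtual uint8_t const* quantizationTables(uint8_t& precision,
                                              uint16_t& length) = 0;
    virtual uint16_t restartInterval() = 0;
};

// Hypothetical plain-data copy of what the parser reports for one frame.
struct ParsedJpeg {
    uint8_t type, qFactor, width, height, precision;
    uint16_t restartInterval;
    const uint8_t* qTables;
    uint16_t qTablesLength;
};

// Hypothetical source that simply forwards the parsed values.
class ParsedJpegSource : public JpegSourceAccessors {
public:
    explicit ParsedJpegSource(const ParsedJpeg& frame) : _frame(frame) {}

    uint8_t type() override    { return _frame.type; }
    uint8_t qFactor() override { return _frame.qFactor; }
    uint8_t width() override   { return _frame.width; }
    uint8_t height() override  { return _frame.height; }

    uint8_t const* quantizationTables(uint8_t& precision,
                                      uint16_t& length) override
    {
        precision = _frame.precision;
        length = _frame.qTablesLength;
        return _frame.qTables;
    }

    uint16_t restartInterval() override { return _frame.restartInterval; }

private:
    ParsedJpeg _frame;
};
```

In a real subclass, the frame delivery function inherited from FramedSource (doGetNextFrame()) would also have to copy the bytes returned by scandata() into the output buffer; that part is omitted here. Since qFactor() reports 255, which is at least 128, the quantization tables are expected to be supplied through quantizationTables().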


21 comments:

Anonymous, January 27, 2012 at 2:37 PM
What about the code of the
"*subclass* of "JPEGVideoSource""?
William Chuang, January 28, 2012 at 4:55 AM
The original liveMedia doesn't provide a "real" JPEG source. Instead, live555 declares an abstract C++ class named "JPEGVideoSource". Anyone who wants to stream JPEG frames using liveMedia should derive a JPEG frame source from "JPEGVideoSource". A JPEG frame source would contain a JPEG frame parser, just like the one above. :)
Anonymous, February 1, 2012 at 5:02 PM
What about _qFactor? It's always 255, which is not working for me.
Anonymous, August 31, 2012 at 5:49 PM
Hi

Could you please tell me how you used that parser with live555? For example, the parser returns some values: which class are they assigned to, and what type of input should be given to the parser class mentioned above?

Thanks
Sahugani, September 10, 2012 at 3:50 PM
Hello
I know that you said that compositing the parser into the subclassed JPEG video source is no magic, but it is for me :). I'm not sure how to write this class and how to use your parser on a single file. Could you give me an example or some explanation?
Thanks

6/18/2011
