
Website URL: http://www.obe.tv/

Open Broadcast Systems OBE C-100 encoders and decoders running as software apps on commodity IT hardware delivered a number of important broadcasts over IP during the EU Referendum for Sky News.

A low-depth, low-power IT server located in street-furniture outside 10 Downing Street encoded a number of referendum events including the Prime Minister’s final speech before the referendum, coverage from Sky’s Political Editor throughout results night and the Prime Minister’s historic resignation. Sky provided the Prime Minister’s final speech and subsequent resignation as the UK pool feed, meaning the country saw these historic events delivered end-to-end using IT infrastructure. Feeds were decoded at the Sky News centre using IT hardware running the OBE C-100 decoder app.

Ten Gigabit PCIe card

(Stand H16, BVE, London) Open Broadcast Systems is pleased to announce an uncompressed-over-IP software update for all its existing decoder (IRD) products, adding a missing piece of the puzzle for customers moving to IP-based production. This will be the first IRD on the market to have a native uncompressed output, without the use of a converter SFP. Visitors to Stand H16 at BVE will be able to see it in action - an "IRD" that consists entirely of standard server components.

Initially this will use the SMPTE 2022-6 standard but, as a software product, it is easily upgradable to support whatever flavour of uncompressed-over-IP a customer decides to use. Precision Time Protocol (PTP) is also supported as standard, using the capabilities of the Linux® kernel. Coupled with newly added ASI support in the OBE range of products, customers can now have equipment capable of handling everything from existing ASI all the way to future uncompressed-over-IP.

“This is usually the part where a hardware manufacturer has a picture of their custom-built circuit board and chassis”, said Kieran Kunhya, Managing Director of Open Broadcast Systems. “We’re not like that - we use off the shelf components that you can buy on Amazon to transport uncompressed video over IP. Our customers will tell you that using off-the-shelf components provides amazing levels of flexibility, letting them roll out deployments in record time and have pieces of kit capable of doing more than one thing. It doesn't matter whether you're a small operation or delivering the highest profile content in the world, IP will revolutionise the way you do things.”

 

FFmpeg

(Stand H16, BVE, London) Open Broadcast Systems is pleased to announce its development and open-sourcing of a VC-2 HQ encoder and decoder. VC-2, formerly known as Dirac Pro, is ideally suited for mezzanine compression of video, allowing HD to be transported over Gigabit Ethernet or 4K over Ten-Gigabit Ethernet at ultra-low latency. As a core component in all of our products, FFmpeg 3.0 is the destination for this code, allowing any manufacturer to use and improve VC-2 encoding and decoding.

At the Broadcast Video Expo in London, we will be showing a low-latency encoder and decoder running on a wide range of hardware, from a software encoder you can fit in the palm of your hand, to high density blades, ideal for remote production.

Apart from being patent-free, VC-2 has another trick up its sleeve compared to codecs not designed for video, like JPEG2000. It doesn’t just have low generation loss, it has zero generation loss (thanks to its symmetric quantisation). Provided the downstream encoders are configured correctly (same bitrate, same slice-size etc.), the loss never exceeds that of a single encode generation however many times the video is re-encoded, a unique feature in the industry.
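The idea that re-encoding with identical settings adds no further loss can be illustrated with a toy quantiser. This is only a minimal sketch under loud assumptions: the dead-zone quantiser, the step sizes and the coefficient values below are hypothetical and are not VC-2's actual quantiser or transform. It shows that once values have been through one quantise/reconstruct cycle, a second cycle with the same step returns them unchanged, whereas a different step does not.

```python
def quantise(coeff: int, step: int) -> int:
    """Toy dead-zone quantiser: truncate the magnitude towards zero."""
    sign = -1 if coeff < 0 else 1
    return sign * (abs(coeff) // step)


def dequantise(index: int, step: int) -> int:
    """Reconstruct to the middle of the quantisation bin (zero stays zero)."""
    if index == 0:
        return 0
    sign = -1 if index < 0 else 1
    return sign * (abs(index) * step + step // 2)


def generation(coeffs, step):
    """One encode/decode generation applied to a list of coefficients."""
    return [dequantise(quantise(c, step), step) for c in coeffs]


original = [-37, -3, 0, 5, 12, 100]     # hypothetical coefficient values
first = generation(original, 8)         # first generation loses information
second = generation(first, 8)           # same settings: nothing further is lost
mismatched = generation(first, 5)       # different settings: more loss

print("first generation :", first)
print("second generation:", second)
print("mismatched step  :", mismatched)
```

The reconstruction offset is always smaller than the step, so a reconstructed value always falls back into the bin it came from. That is why the configuration has to match between generations.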

VC-2 is also much more lightweight, allowing reduced power consumption and encoding of higher resolutions without tiling, at the cost of only a modest bitrate penalty. However, in our implementation we’re going much further than many hardware manufacturers who are implementing only the simplest variant of VC-2.

Credit must go to BBC R&D for producing high-quality, freely available reference implementations and writing a clear specification.

Our implementation can be found in FFmpeg 3.0 which is available at http://www.ffmpeg.org

One of the great things about having rack-space in our new office is that we can now use our equipment to support open source projects such as FFmpeg and Libav. They are critical parts of our software and underpin much of the multimedia processing in the world today.

Fuzzing is one of the ways in which we can improve the quality of the decoders when exposed to corrupted input. It involves randomly or systematically corrupting the input of a program in order to make it crash. The Heartbleed vulnerability was one of the most famous bugs found via fuzzing [1].
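As a rough illustration of the idea, here is a minimal sketch of mutation fuzzing. Everything in it is hypothetical: `toy_parser` is a made-up length-prefixed parser with a deliberate bounds bug, standing in for a real decoder, and the single bit-flip mutation is far cruder than what zzuf or american fuzzy lop do.

```python
import random


def toy_parser(data: bytes) -> int:
    """Hypothetical length-prefixed parser with a deliberate bounds bug."""
    length = data[0]                  # first byte claims the payload size
    payload = data[1:1 + length]      # slicing silently truncates short input
    return payload[length - 1]        # so this index can run off the end


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Corrupt the input by flipping one randomly chosen bit."""
    corrupted = bytearray(data)
    position = rng.randrange(len(corrupted))
    corrupted[position] ^= 1 << rng.randrange(8)
    return bytes(corrupted)


def fuzz(seed: bytes, iterations: int, rng: random.Random) -> list:
    """Return every mutated input that made the parser raise an exception."""
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            toy_parser(candidate)
        except Exception:
            crashes.append(candidate)
    return crashes


seed = bytes([4, 1, 2, 3, 4])         # a valid input: length 4, then 4 bytes
crashes = fuzz(seed, 200, random.Random(0))
print(f"{len(crashes)} crashing inputs out of 200")
```

Each crashing input is kept so that it can be replayed later, which is what makes a fuzzing result easy to turn into a bug report.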

Google notably fuzzed FFmpeg and Libav at a relatively large scale, leading to a thousand fixes. But after seeing crashes in the H264 decoder earlier in the year, triggered by real-world events such as packet loss and video splices, it was clear that something was wrong. One possibility is that Google only fuzzed progressive H264 content using frame threads, and neither included interlaced content nor tried decoding in the lower-latency sliced-threads mode. Another is that the codebase has since changed enough to introduce new bugs.

Using basic tools like zzuf and, later on, the more advanced american fuzzy lop, all on a single quad-core server (in contrast to Google’s 2000 cores), the following unique bugs were found, a few of which caused easily-triggerable, real-world crashes.

H264 Frame Threads

https://trac.ffmpeg.org/ticket/4428

H264 Sliced Threads

https://trac.ffmpeg.org/ticket/4440

https://trac.ffmpeg.org/ticket/4438

https://trac.ffmpeg.org/ticket/4431

https://trac.ffmpeg.org/ticket/4408

https://trac.ffmpeg.org/ticket/4977

FFv1

https://trac.ffmpeg.org/ticket/4931

https://trac.ffmpeg.org/ticket/4932

https://trac.ffmpeg.org/ticket/4939

VP9

https://trac.ffmpeg.org/ticket/4935

Opus

https://bugzilla.libav.org/show_bug.cgi?id=876

https://bugzilla.libav.org/show_bug.cgi?id=909

Thanks to @rilian for providing fuzzing scripts and thanks to those who investigated and fixed the bugs, Michael Niedermayer in particular.

[1] http://www.codenomicon.com/files/pdf/Heartbleed-Story.pdf

 

Sky News

(IBC, Amsterdam - 4.A61g) Sky News has selected the OBE C-100 encoder platform for a number of encoder upgrades. This includes encoders in street-furniture and bureaus as well as expansion of their internal IPTV network for the election.

Sky News were able to achieve a low latency, crucial for interviews with guests and for quickly reporting election results. The OBE C-100 platform allows customers to choose from pre-defined hardware or use their own. This allowed Sky News to customise the chassis of the C-100, choosing suitable off-the-shelf components (e.g. low power, high density) for their needs and reducing space and energy consumption. The encoders for the expansion of their internal IPTV network had an exceptional density of 40 channels in 5U.

“Using an Open Broadcast Encoder on commodity hardware has enabled Sky News to design and implement cost effective, flexible solutions, ranging from low latency contribution video to internal IPTV channels,” said Chris Smith, Technology Development Executive at Sky News.

“The C-100 has let Sky News push the boundaries with encoding - they have adapted it for many new use-cases,” said Kieran Kunhya, Managing Director of Open Broadcast Systems.


Pelmorex

(IBC, Amsterdam - 4.A61g) Pelmorex Media Inc., the parent company of The Weather Network and MeteoMedia, TV networks in Canada, has selected Open Broadcast Encoder (OBE) for its HD localisation upgrade. This upgrade allows Pelmorex to deliver an HD feed with local weather graphics to viewers across Canada. 

Running entirely on off-the-shelf Intel hardware, OBE allows for significant cost savings and flexibility over traditional hardware encoders. Once complete, the deployment will be one of the largest uses of software encoders for broadcast contribution in North America.

Six OBE C-100 encoders were used at the Glastonbury Festival to transport high quality HD broadcast feeds over IP from six stages to London, marking the first on-site IP delivery at Glastonbury. These feeds were then used on the web and Connected Red Button services. Using BBC Newsgathering’s expertise in using IP and commodity equipment in the field, the C-100 encoders ran as software on existing Intel-based hardware with the feeds transported over an on-site IP connection.

Dishes

London, UK: Open Broadcast Systems is pleased to announce that it is opening a new headquarters in Vauxhall, London. With easy access from Central London and the rest of Europe, the new HQ will allow Open Broadcast Systems to be closer to its customers. Based in a purpose-built teleport, the new HQ provides space for corporate expansion as well as providing a showcase for software video transport over IP in an actual broadcast environment.

Note: This is a more technical post than usual, and about 5 months late.

The decoding in the OBE C-100 decoder was optimised to make use of instructions in modern CPUs and this blog post explains how we did it:

HD-SDI video uses 10-bit pixels but computers operate in bytes (8 bits), and 10-bit samples don’t fit nicely into bytes. Instead, 10-bit video on a computer is stored with each sample in its own 16-bit word, like this:



The X represents an unused bit - note how in total 12 out of 32 of the bits are unused (that’s 37.5%). It’s very wasteful if the data needs to be transferred to a piece of hardware like a Blackmagic SDI card. Virtually all professional SDI cards use the ‘v210’ format that was first introduced by Apple in the 90s [1]. v210 improves the efficiency of 10-bit storage by packing the 10-bit video samples as follows:

(adapted from [1])

Now only 2 out of the 32 bits are unused, a major improvement. Using the old v210 encoder in FFmpeg, each sample is loaded from memory, shifted to the correct position and “inserted” using the OR operation. When doing this on 1920x1080 material, this involves about 250 million of these operations every second. More CPU time is spent packing the pixels for display than actually decompressing them from the encoded video!

Clearly, we’ve got to do something about this. Thanks to the magic of SIMD instructions (in this case SSSE3 and AVX) we can instead process 12 pixels in one go [2]:

  1. Load luma pixels from memory
  2. Make sure they are within the v210 range
  3. Shift each pixel (if necessary) to appropriate position
  4. Shuffle pixels to rearrange them to v210 order
  5. Repeat 1-4 for chroma
  6. OR the luma and chroma registers together
  7. Store in memory

This can be (unscientifically) benchmarked with the command:

ffmpeg -pix_fmt yuv422p10 -s 1920x1080 -f rawvideo -i /dev/zero -f rawvideo -vcodec v210 -y /dev/null

Before: 168fps

After: 480fps

Almost a 3x speed boost.
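For readers who want to see the packing steps listed earlier in concrete form, here is a minimal scalar sketch in Python. It is not the SIMD code from [2]: it handles one group of six pixels at a time with ordinary integers, the function names are made up for illustration, and the legal sample range of 4 to 1019 is an assumption about what "within the v210 range" means. The word layout follows the v210 description in [1].

```python
import struct

V210_MIN, V210_MAX = 4, 1019    # assumed legal range for 10-bit samples


def clamp(sample: int) -> int:
    """Step 2: keep a sample inside the assumed v210 range."""
    return max(V210_MIN, min(V210_MAX, sample))


def pack_six_pixels(y, cb, cr) -> bytes:
    """Pack 6 luma, 3 Cb and 3 Cr ten-bit samples into four 32-bit words."""
    y = [clamp(v) for v in y]
    cb = [clamp(v) for v in cb]
    cr = [clamp(v) for v in cr]

    # Steps 3-4: shift every luma sample into its slot, leaving chroma slots 0.
    luma = [
        y[0] << 10,
        y[1] | (y[2] << 20),
        y[3] << 10,
        y[4] | (y[5] << 20),
    ]
    # Step 5: the same for chroma, leaving the luma slots 0.
    chroma = [
        cb[0] | (cr[0] << 20),
        cb[1] << 10,
        cr[1] | (cb[2] << 20),
        cr[2] << 10,
    ]
    # Step 6: OR the two together. Step 7: store as little-endian words.
    words = [l | c for l, c in zip(luma, chroma)]
    return struct.pack("<4I", *words)


packed = pack_six_pixels([100, 200, 300, 400, 500, 600], [10, 20, 30], [40, 50, 60])
print(len(packed), "bytes for 6 pixels:", packed.hex())
```

Building the luma and chroma words separately mirrors the order of the steps: the two sets of bits never overlap, so a single OR merges them.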

But a lot of the content that the decoder receives is 8-bit, which has this packing format:



In existing software decoders, this needs to be converted to the 10-bit samples in the first picture and then packed into v210, a two-step process. We can now do this in a single step.

ffmpeg -pix_fmt yuv422p -s 1920x1080 -f rawvideo -i /dev/zero -f rawvideo -vcodec v210 -y /dev/null

Before: 95fps

After: 620fps

Now 6.5x faster!
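The single-step 8-bit conversion can be sketched in scalar Python as follows. This is a hedged illustration and not the optimised code: the helper names are hypothetical, and clamping 8-bit samples to 1..254 before shifting left by two is an assumption chosen so that the results stay clear of the lowest and highest 10-bit codes.

```python
import struct


def widen(sample8: int) -> int:
    """Clamp an 8-bit sample (assumed range 1..254) and scale it to 10 bits."""
    return max(1, min(254, sample8)) << 2


def pack_word(a: int, b: int, c: int) -> int:
    """Place three 10-bit samples in one 32-bit word; the top 2 bits stay 0."""
    return a | (b << 10) | (c << 20)


def pack_six_pixels_8bit(y, cb, cr) -> bytes:
    """Go straight from 8-bit planar samples to four v210 words."""
    words = [
        pack_word(widen(cb[0]), widen(y[0]), widen(cr[0])),
        pack_word(widen(y[1]), widen(cb[1]), widen(y[2])),
        pack_word(widen(cr[1]), widen(y[3]), widen(cb[2])),
        pack_word(widen(y[4]), widen(cr[2]), widen(y[5])),
    ]
    return struct.pack("<4I", *words)


# Six pixels of 8-bit video black (Y=16, Cb=Cr=128).
packed_black = pack_six_pixels_8bit([16] * 6, [128] * 3, [128] * 3)
print(packed_black.hex())
```

The widening happens while each sample is being placed, so no intermediate 10-bit planar picture is ever written to memory.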

What more could be done: 

  • Allow the decoder to decode straight to v210 using FFmpeg's draw_horiz_band capability. 
  • Try using AVX2 on newer Haswell CPUs - should provide a small speed increase but with an increased complexity.
  • Use multiple CPU cores on the conversion - this isn’t really useful for OBE but people creating v210 files may find it useful (especially UHD content).

Thanks must go to those who helped review this code.

[1] https://developer.apple.com/library/mac/technotes/tn2162/_index.html#//apple_ref/doc/uid/DTS40013070-CH1-TNTAG8-V210__4_2_2_COMPRESSION_TYPE

(This is from Apple’s venerable Letters from the Ice Floe)

[2] http://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavcodec/x86/v210enc.asm

This post follows on from an old blog post about OSS DPP Creation, which many people have used to deliver DPP MXF files. It’s fair to say that this entirely vendor-neutral method of creating AVC-Intra based MXF files raised important questions about interoperability. Many manufacturers were only capable of decoding files from a single vendor. To this day there is ongoing debate about whether certain manufacturers are capable of delivering advertised features when their equipment fails to decode legal, but difficult-to-decode, test files (notably CABAC AVC-I).

A lot of these issues have subsequently been followed up in the groundbreaking interoperability programme from the DPP, something which should be applauded. At the same time it is rather sad that after over a decade of file-based workflows in broadcast, manufacturers need to be schooled by their customers on how to interpret specifications which should be unambiguous in the first place, or in some cases how to follow the prescribed document instead of a secret, proprietary document.

Recently, the Institut für Rundfunktechnik (the broadcast technology institute for German-speaking broadcasters) has published its set of incredibly precise delivery requirements. Using OSS software, an IRT-compliant file can now be delivered to German broadcasters in the ARD_ZDF_HDF format. Files created with this method have also been tested at the IRT plugfest (see http://sourceforge.net/p/bmxlib/discussion/general/thread/68352f5a/?page=1 for more information).

1: x264

x264 is a best-in-class MPEG-4/AVC encoder that's used for a variety of uses such as web video, Blu-ray disc and broadcast television encoding. It supports 10-bit 4:2:2 as required by IRT - a 10-bit build of x264 is required to make AVC-Intra files. x264 will warn you if you encode AVC-Intra using an 8-bit build. x264 can be downloaded from: http://download.videolan.org/pub/x264/binaries/  (choose the latest and remember to get a 10-bit build) or better still, compiled from scratch.

1080i

x264.exe input.file --colorprim "bt709" --transfer "bt709" --colormatrix "bt709" --tune psnr --fps 25/1 --interlaced --force-cfr --avcintra-class 100 --output-csp i422 -o out.h264

720p

x264.exe input.file --colorprim "bt709" --transfer "bt709" --colormatrix "bt709" --tune psnr --fps 50/1 --force-cfr --avcintra-class 100 --output-csp i422 -o out.h264

(If you get errors about avcintra-class it means your x264 is too old)

2: BMXlib

BMXlib is a library from BBC R&D that is designed to manipulate MXF files. Recent versions of bmxlib have been updated to support the IRT delivery requirements. http://sourceforge.net/projects/bmxlib/

Note that your wav files must be 24-bit encoded and silence tracks used where required. The AFD value should be altered as required.

1080i

raw2bmx.exe -y 09:58:00:00 -t op1a --afd 8 --ard-zdf-hdf -o out.mxf --avci100_1080i out.h264 --wave in.wav --wave in.wav --wave in.wav --wave in.wav

720p

raw2bmx.exe -y 09:58:00:00 -t op1a --afd 8 --ard-zdf-hdf -o out.mxf --avci100_720p  out.h264 --wave in.wav --wave in.wav --wave in.wav --wave in.wav

(note that the IRT does not specify a timecode start so this needs to be changed as advised)
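If you are producing many files, the two steps can be scripted. The sketch below only builds the argument lists and prints them. It uses just the options shown in the commands above, but the function names are hypothetical, the executables are assumed to be on the PATH as `x264` and `raw2bmx`, and it assumes that 720p, being progressive, is encoded without `--interlaced`. The start timecode and AFD defaults are the example values from this post and should be changed as advised.

```python
import shlex


def x264_command(source: str, output: str, mode: str) -> list:
    """Build the x264 AVC-Intra 100 command for '1080i' or '720p'."""
    fps = {"1080i": "25/1", "720p": "50/1"}[mode]
    cmd = ["x264", source,
           "--colorprim", "bt709", "--transfer", "bt709",
           "--colormatrix", "bt709", "--tune", "psnr", "--fps", fps]
    if mode == "1080i":
        cmd.append("--interlaced")      # assumption: only 1080i is interlaced
    cmd += ["--force-cfr", "--avcintra-class", "100",
            "--output-csp", "i422", "-o", output]
    return cmd


def raw2bmx_command(essence: str, wavs: list, output: str, mode: str,
                    start_timecode: str = "09:58:00:00", afd: int = 8) -> list:
    """Build the raw2bmx command that wraps the essence and audio into MXF."""
    video_flag = {"1080i": "--avci100_1080i", "720p": "--avci100_720p"}[mode]
    cmd = ["raw2bmx", "-y", start_timecode, "-t", "op1a",
           "--afd", str(afd), "--ard-zdf-hdf", "-o", output,
           video_flag, essence]
    for wav in wavs:                    # one --wave option per audio file
        cmd += ["--wave", wav]
    return cmd


for mode in ("1080i", "720p"):
    print(shlex.join(x264_command("input.file", "out.h264", mode)))
    print(shlex.join(raw2bmx_command("out.h264", ["in.wav"] * 4, "out.mxf", mode)))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the printed commands look right for your material.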

Please let us know if you have any issues. Thanks to the people and organisations who tested this.

 
