Super User


Website URL: http://www.obe.tv/

Dishes

London, UK: Open Broadcast Systems is pleased to announce that it is opening a new headquarters in Vauxhall, London. With easy access from Central London and the rest of Europe, the new HQ will allow Open Broadcast Systems to be closer to its customers. Based in a purpose-built teleport, the new HQ provides space for corporate expansion as well as providing a showcase for software video transport over IP in an actual broadcast environment.

Note: This is a more technical post than usual, and about 5 months late.

The decoding in the OBE C-100 decoder was optimised to make use of instructions in modern CPUs and this blog post explains how we did it:

HD-SDI video uses 10-bit pixels, but computers operate in bytes (8 bits), and 10-bit samples don’t fit neatly into bytes. Instead, each 10-bit sample is stored in its own 16-bit word in memory, like this:



The X represents an unused bit. Note that 12 of every 32 bits are unused (that’s 37.5%), which is very wasteful if the data needs to be transferred to a piece of hardware like a Blackmagic SDI card. Virtually all professional SDI cards use the ‘v210’ format, first introduced by Apple in the 90s [1]. v210 improves the efficiency of 10-bit storage by packing the 10-bit video samples as follows:

(adapted from [1])

Now only 2 out of every 32 bits are unused, a major improvement. With the old v210 encoder in FFmpeg, each pixel is loaded from memory, shifted to the correct position and “inserted” using the OR operation. On 1920x1080 material, this involves about 250 million of these operations every second. More CPU time is spent packing the pixels for display than actually decompressing them from the encoded video!
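The load, shift and OR approach described above can be sketched in a few lines of Python. This is only an illustration, not FFmpeg's code: the function name `pack_v210_word` is hypothetical, and which luma or chroma sample goes in which of the three slots is deliberately left out.

```python
import struct

def pack_v210_word(a: int, b: int, c: int) -> bytes:
    """Pack three 10-bit samples into one little-endian 32-bit word.

    Bits 0-9 hold a, bits 10-19 hold b, bits 20-29 hold c;
    the top two bits stay unused.
    """
    for sample in (a, b, c):
        if not 0 <= sample <= 0x3FF:
            raise ValueError("sample does not fit in 10 bits")
    word = a | (b << 10) | (c << 20)  # shift into place, then OR together
    return struct.pack("<I", word)

# Fraction of storage wasted by each layout
unpacked_waste = (16 - 10) / 16  # one 10-bit sample per 16-bit word
v210_waste = (32 - 30) / 32      # three 10-bit samples per 32-bit word
```

Here `unpacked_waste` works out to 37.5% and `v210_waste` to 6.25%, matching the figures quoted in the text.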

Clearly, we’ve got to do something about this. Thanks to the magic of SIMD instructions (in this case SSSE3 and AVX), we can instead process 12 pixels in one go [2]:

  1. Load luma pixels from memory
  2. Make sure they are within the v210 range
  3. Shift each pixel (if necessary) to appropriate position
  4. Shuffle pixels to rearrange them to v210 order
  5. Repeat 1-4 for chroma
  6. OR the luma and chroma registers together
  7. Store in memory
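The steps above can be mirrored in plain Python for a single group of 6 pixels (four 32-bit words). This is a scalar sketch of the data flow, not the SIMD assembly in [2]: the names `pack_group` and `LAYOUT` are hypothetical, and the clipping range of 4 to 1019 is an assumption about the legal v210 range.

```python
import struct

V210_MIN, V210_MAX = 4, 1019  # assumed legal 10-bit range

# For each of the four 32-bit words: (plane, sample index, bit shift)
LAYOUT = (
    (("u", 0, 0), ("y", 0, 10), ("v", 0, 20)),
    (("y", 1, 0), ("u", 1, 10), ("y", 2, 20)),
    (("v", 1, 0), ("y", 3, 10), ("u", 2, 20)),
    (("y", 4, 0), ("v", 2, 10), ("y", 5, 20)),
)

def clip(sample):
    """Step 2: keep the sample within the assumed v210 range."""
    return max(V210_MIN, min(V210_MAX, sample))

def pack_group(y, u, v):
    """Pack 6 luma, 3 Cb and 3 Cr samples into 16 bytes of v210."""
    if (len(y), len(u), len(v)) != (6, 3, 3):
        raise ValueError("expected 6 luma and 3 + 3 chroma samples")
    planes = {
        "y": [clip(s) for s in y],
        "u": [clip(s) for s in u],
        "v": [clip(s) for s in v],
    }
    luma = [0, 0, 0, 0]    # plays the role of the luma register
    chroma = [0, 0, 0, 0]  # plays the role of the chroma register
    for word, slots in enumerate(LAYOUT):
        for plane, index, shift in slots:
            target = luma if plane == "y" else chroma
            # Steps 3 and 4: shift the sample and place it in v210 order
            target[word] |= planes[plane][index] << shift
    # Step 6: OR luma and chroma together; step 7: store as little-endian
    return struct.pack("<4I", *(l | c for l, c in zip(luma, chroma)))
```

Keeping luma and chroma in separate lists until the final OR follows the structure of the numbered steps; a real SIMD version does the same work on whole registers at once.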

This can be (unscientifically) benchmarked with the command:

ffmpeg -pix_fmt yuv422p10 -s 1920x1080 -f rawvideo -i /dev/zero -f rawvideo -vcodec v210 -y /dev/null

Before: 168fps

After: 480fps

Roughly a 3x speed boost.

However, a lot of the content that the decoder receives is 8-bit, which has this packing format:



In existing software decoders, this needs to be converted to the 10-bit samples in the first picture and then packed into v210, a two-step process. We can now do this in a single step:

ffmpeg -pix_fmt yuv422p -s 1920x1080 -f rawvideo -i /dev/zero -f rawvideo -vcodec v210 -y /dev/null

Before: 95fps

After: 620fps

Now 6.5x faster!
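Why the 8-bit path can skip the intermediate 10-bit buffer is easy to see in a small sketch: widening a sample by 2 bits and then shifting it into its slot is the same as one larger shift. The function names below are hypothetical, and range clipping is left out to keep the comparison simple.

```python
def pack_8bit_two_step(a: int, b: int, c: int) -> int:
    """Convert three 8-bit samples to 10-bit first, then pack them."""
    a10, b10, c10 = a << 2, b << 2, c << 2   # step 1: widen to 10 bits
    return a10 | (b10 << 10) | (c10 << 20)   # step 2: pack into the word

def pack_8bit_direct(a: int, b: int, c: int) -> int:
    """Pack three 8-bit samples straight into a v210-style word."""
    for sample in (a, b, c):
        if not 0 <= sample <= 0xFF:
            raise ValueError("sample does not fit in 8 bits")
    # The widening shift (2) is folded into the slot shifts (0, 10, 20)
    return (a << 2) | (b << 12) | (c << 22)
```

Both functions return the same word for any 8-bit inputs; the direct version simply avoids writing and re-reading the intermediate samples.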

What more could be done: 

  • Allow the decoder to decode straight to v210 using FFmpeg's draw_horiz_band capability. 
  • Try using AVX2 on newer Haswell CPUs - should provide a small speed increase but with an increased complexity.
  • Use multiple CPU cores on the conversion - this isn’t really useful for OBE but people creating v210 files may find it useful (especially UHD content).

Thanks must go to those who helped review this code.

[1] https://developer.apple.com/library/mac/technotes/tn2162/_index.html#//apple_ref/doc/uid/DTS40013070-CH1-TNTAG8-V210__4_2_2_COMPRESSION_TYPE

(This is from Apple’s venerable Letters from the Ice Floe)

[2] http://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavcodec/x86/v210enc.asm

This post follows on from an old blog post about OSS DPP Creation, which many people have used to deliver DPP MXF files. It’s fair to say that this entirely vendor-neutral method of creating AVC-Intra based MXF files raised a number of important questions about interoperability. Many manufacturers were only capable of decoding files from a single vendor. To this day there is ongoing debate about whether certain manufacturers are capable of delivering advertised features when their equipment fails to decode legal, but difficult to decode, test files (notably CABAC AVC-I).

A lot of these issues have subsequently been followed up in the groundbreaking interoperability programme from the DPP, something which should be applauded. At the same time it is rather sad that after over a decade of file-based workflows in broadcast, manufacturers need to be schooled by their customers on how to interpret specifications which should be unambiguous in the first place, or in some cases how to follow the prescribed document instead of a secret, proprietary document.

Recently, the Institut für Rundfunktechnik (IRT, the broadcast technology institute for German-speaking broadcasters) published its set of incredibly precise delivery requirements. Using open source software, an IRT-compliant file can now be delivered to German broadcasters in the ARD_ZDF_HDF format. Files created with this method have also been tested at the IRT plugfest (see http://sourceforge.net/p/bmxlib/discussion/general/thread/68352f5a/?page=1 for more information).

1: x264

x264 is a best-in-class MPEG-4/AVC encoder that's used for a variety of purposes such as web video, Blu-ray disc and broadcast television encoding. It supports 10-bit 4:2:2 as required by IRT, and a 10-bit build of x264 is required to make AVC-Intra files. x264 will warn you if you encode AVC-Intra using an 8-bit build. x264 can be downloaded from: http://download.videolan.org/pub/x264/binaries/ (choose the latest and remember to get a 10-bit build) or, better still, compiled from scratch.

1080i

x264.exe input.file --colorprim "bt709" --transfer "bt709" --colormatrix "bt709" --tune psnr --fps 25/1 --interlaced --force-cfr --avcintra-class 100 --output-csp i422 -o out.h264

720p

x264.exe input.file --colorprim "bt709" --transfer "bt709" --colormatrix "bt709" --tune psnr --fps 50/1 --force-cfr --avcintra-class 100 --output-csp i422 -o out.h264

(If you get errors about avcintra-class it means your x264 is too old)

2: BMXlib

BMXlib is a library from BBC R&D that is designed to manipulate MXF files. Recent versions of bmxlib have been updated to support the IRT delivery requirements. http://sourceforge.net/projects/bmxlib/

Note that your wav files must be 24-bit encoded and silence tracks used where required. The AFD value should be altered as required.
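A quick way to check the 24-bit requirement, and to generate a silence track where one is needed, is Python's standard `wave` module. This is a minimal sketch: the helper names are hypothetical, and the 48 kHz mono defaults are assumptions to be adjusted to the delivery specification.

```python
import wave

def is_24bit_wav(path: str) -> bool:
    """Return True if the WAV file stores 3 bytes (24 bits) per sample."""
    with wave.open(path, "rb") as wav:
        return wav.getsampwidth() == 3

def write_silence(path: str, seconds: float,
                  sample_rate: int = 48000, channels: int = 1) -> None:
    """Write a 24-bit PCM WAV file containing only digital silence."""
    frames = int(seconds * sample_rate)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(3)  # 24-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(3 * channels * frames))  # all-zero samples
```

Running `is_24bit_wav` over each input before calling raw2bmx catches 16-bit files early, rather than at the broadcaster's QC stage.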

1080i

raw2bmx.exe -y 09:58:00:00 -t op1a --afd 8 --ard-zdf-hdf -o out.mxf --avci100_1080i out.h264 --wave in.wav --wave in.wav --wave in.wav --wave in.wav

720p

raw2bmx.exe -y 09:58:00:00 -t op1a --afd 8 --ard-zdf-hdf -o out.mxf --avci100_720p  out.h264 --wave in.wav --wave in.wav --wave in.wav --wave in.wav

(note that the IRT does not specify a timecode start so this needs to be changed as advised)
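For batch work, the two command lines can be assembled by a small script. The sketch below only uses options that appear in the commands above; the function name `build_commands` and the fixed intermediate file names are hypothetical. It also assumes that 720p material is progressive and therefore leaves out `--interlaced` for that format.

```python
def build_commands(fmt, source, wavs, timecode="09:58:00:00", afd=8):
    """Return (x264 args, raw2bmx args) for fmt '1080i' or '720p'."""
    if fmt not in ("1080i", "720p"):
        raise ValueError("fmt must be '1080i' or '720p'")
    fps = "25/1" if fmt == "1080i" else "50/1"
    x264 = ["x264.exe", source,
            "--colorprim", "bt709", "--transfer", "bt709",
            "--colormatrix", "bt709", "--tune", "psnr", "--fps", fps]
    if fmt == "1080i":
        x264.append("--interlaced")  # assumption: only 1080i is interlaced
    x264 += ["--force-cfr", "--avcintra-class", "100",
             "--output-csp", "i422", "-o", "out.h264"]
    raw2bmx = ["raw2bmx.exe", "-y", timecode, "-t", "op1a",
               "--afd", str(afd), "--ard-zdf-hdf", "-o", "out.mxf",
               f"--avci100_{fmt}", "out.h264"]
    for wav in wavs:
        raw2bmx += ["--wave", wav]  # one --wave option per audio track
    return x264, raw2bmx
```

Each list can be passed to `subprocess.run(args, check=True)`, running the x264 command first so that `out.h264` exists before raw2bmx wraps it.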

Please let us know if you have any issues. Thanks to the people and organisations who tested this.

 

FOSDEM is the largest Open Source conference in Europe (and in the world?), with over 5,000 people attending to hear more about Open Source. This year, in association with the EBU and FOMS, we are organising an "Open media devroom" allowing people to present and discuss various Open Source projects relating to multimedia and, for our part of the session, relating especially to broadcast.

There's a lot going on and we urge you to talk about anything related using the information below: 

Dear open source in broadcasting community,

We have submitted an application to get a developer room for presentations dedicated to media at FOSDEM 2015 in Brussels (https://fosdem.org/2015/ ).

It has been accepted for the first day, Saturday 31st January, as the "Open media devroom", and we are co-organising it with the FOMS community.

So the call for participation is now open.

If you are interested in presenting, please register your submission directly at this link: https://penta.fosdem.org/submission/FOSDEM15
You need to create an account and then go to "create event" to give the details of your presentation.
IT IS VERY IMPORTANT TO SELECT THE "OPEN MEDIA DEVROOM" TRACK IN YOUR SUBMISSION. Otherwise, we won't see it and it will appear in other tracks that we don't control.

The deadline for submission is: 1st December 2014.
Presentations are recorded and will be made available under a CC-BY licence by FOSDEM.

The timeslot for each presentation is 20 minutes, and panel discussions (40 minutes) are also foreseen.

We will then select submissions together with the FOMS community, and the final schedule is expected to be published by 30th December.

Telegeography Cable Map

Recent broadcasts from the Shetland Islands during the Scottish Referendum are an excellent example of how IP can be used to deliver broadcast-quality live or recorded material.


IBC, Amsterdam (4.A61h): “Low bitrate, low latency, high quality – pick two” sums up the current state of audio codecs for broadcast contribution. A demo of Opus at IBC aims to show that you can now have all three. Opus is a low-delay, royalty-free audio codec designed for a wide range of audio applications and is used in Skype, WebRTC and on the PlayStation 4. A specification for using Opus in a standard MPEG Transport Stream was recently written, meaning it can be used in broadcast contribution for the first time.

The IBC demo will show a live 1080i feed from a camera going into a 1/2U OBE C-100 encoder, encoded as high bitrate H.264 video and low bitrate Opus audio in an MPEG Transport Stream and into a 1U OBE C-100 decoder with an end to end latency of around 300ms.

The Opus in MPEG-TS standard is published freely online and is available for all manufacturers to implement without royalties.

More detailed technical information can be found on our blog: http://obe.tv/about-us/obe-blog/item/14-using-opus-audio-in-broadcasting

 

IBC 2014 (4.A61h) – Saeta TV Channel 10 Uruguay has selected Open Broadcast Encoder (OBE) for its national ISDB-T platform. OBE is used to encode HD MPEG-4/AVC services compliant with the ISDB-T standard as used in Latin America. A further 15 channels throughout Uruguay will also be using OBE as part of systems integration work by the local team who delivered the encoding platform for Saeta TV.

IBC 2014 (4.A61h) – B1 TV, a privately held news and current affairs channel airing across Romania has selected Open Broadcast Encoder (OBE) for its upgrade from MPEG-2 to MPEG-4/AVC. Having run OBE as a backup encoder from January 2014, B1 decided in March 2014 to reverse the roles and use OBE for its main encoding solution.

“There were two important reasons for our decision,” said Dan Lita, technical consultant for B1. “The first was the picture quality which was better than our previous encoder and secondly OBE resolved compatibility issues with set-top-boxes of a DTH provider. Prior to our main deployment, we had been using OBE for various contribution feeds since 2012.”

The OBE C-100 platform is the first broadcast encoder/decoder to support Opus audio. But why are we doing this? This post explains some of the background behind implementing Opus for Broadcast Contribution.

Disclaimer: This is merely an objective analysis of the coding features the encoder uses, not an analysis of the subjective or objective picture quality of the encoder. It’s also worth saying that this information is from a small clip, but in the main, short clips can provide a good indication of the coding decisions an encoder is making.

Early-stage encoders like the one used in the BBC World Cup UHD trial are interesting in that they provide an insight into the development process of an encoder and which coding tools encoder manufacturers have decided to use first (often with limited processing power). This information usually remains under NDA, but public use of the encoder means anyone can perform analysis on it.

A very good introduction to HEVC coding tools can be found here: http://forum.doom9.org/showthread.php?t=167081

Thanks to Parabola Research for their help in producing this post. You can download a bitstream analysis report from Parabola Explorer Pro 3.0 below; the analysis that follows is based on that report.

In no particular order:

  • The GOP structure is pretty standard IBBPBBPBBP. It appears not to adapt. It’s quite similar to MPEG-2 in that it only keeps a maximum of one frame in L0 and L1.
  • CTU size is 64x64; 9 slices in total.
  • Intra prediction modes are very limited. Almost always using DC, horizontal or vertical and very rarely using the other 32 modes.
  • No use of Asymmetric Prediction Units; quite similar to MPEG-4/AVC.
  • Constant Quantiser – no use of (variance based) adaptive quantisation yet to improve visual quality
  • There’s no use of Sample Adaptive Offset and no use of Weighted Prediction
  • There’s no use of bi-prediction at all. It’s either L0 or L1.
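To make the first bullet concrete, the sketch below shows what a fixed IBBPBBP pattern implies for coding order and reference lists. It is an illustration under simple assumptions (the GOP starts with an I-frame and each B-frame references only its nearest anchor on each side), not something extracted from the bitstream report; both function names are hypothetical.

```python
def coding_order(gop: str) -> list:
    """Return display indices in the order the frames must be coded.

    Each run of B-frames is coded after the anchor (I or P) that
    follows it in display order.
    """
    order, pending_b = [], []
    for index, frame_type in enumerate(gop):
        if frame_type == "B":
            pending_b.append(index)
        else:
            order.append(index)      # code the anchor first
            order.extend(pending_b)  # then the B-frames before it
            pending_b = []
    order.extend(pending_b)  # trailing B-frames wait for the next anchor
    return order

def b_frame_references(gop: str) -> dict:
    """Map each B-frame to its (L0, L1) anchors: one frame per list."""
    anchors = [i for i, t in enumerate(gop) if t != "B"]
    refs = {}
    for index, frame_type in enumerate(gop):
        if frame_type == "B":
            l0 = max(a for a in anchors if a < index)
            later = [a for a in anchors if a > index]
            refs[index] = (l0, min(later) if later else None)
    return refs
```

For the pattern `"IBBPBBPBBP"`, the coding order is 0, 3, 1, 2, 6, 4, 5, 9, 7, 8, and every B-frame has exactly one past and one future reference, which is the MPEG-2-like behaviour noted in the list.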

All in all, not really a surprise. At this early stage it's understandable that people like Netflix are saying “We're not seeing efficiency gains being claimed by HEVC encoding vendors” (see note 1), and such limited use of the toolkit is the main reason why.

The report from Parabola Explorer Pro can be found here: http://downloads.obe.tv/Parabola-Explorer-Pro-analysis-of-Rio-Stream.pdf 

 1. http://www.streamingmedia.com/Articles/Editorial/Featured-Articles/Streaming-Media-East-Netflix-Making-the-Move-to-HEVC-but-Efficiency-Gains-Lag-96981.aspx 
