Thursday, March 15, 2012

1080p, iTunes and sandblasting perfection

WARNING: Windows users may need to adjust gamma up by 50% for these images ...

This week has to be one of the best in a long while for warming a pixelnazi's heart ... not because of the big new Retina displays, or the launch of 1080p iTunes and AppleTV, but because of the rare surge of pixel gazing happening around the internet.

You may have heard of the ArsTechnica "smackdown" article already:

iTunes left, BluRay right — the perennial red-to-black sensitivity curve strikes again:
in the left image, see the early quantization in the top-right edges of the red circle
But soon after, MacObserver posted a followup, not to dispute that BluRay was better, but taking issue with ArsTechnica's assertion that the difference was marginal and not very noticeable:


In this case, the saturation is noticeably different, which immediately makes the comparison less reliable both perceptually and technically — a pure codec+parameter comparison becomes invalid, because any codec will set different priorities in different parts of the colourspace. A saturation shift should also ring alarm bells about the entire colourspace having changed, so be wary of irregularities in the hue, the brightness and the whole transfer function as well.

One thing we should mention now is that articles like these are going to be very tricky to judge on Windows computers. The second image comparison on the MacObserver article looks totally black:

This will be even more crazy-dark if you're reading this on Windows

ArsTechnica has another good article on what is arguably the more important MPEG upgrade in the new AppleTV and in the latest generation of the other iOS devices:


In the same way that MPEG-2 digital TV decoders are classified as "MP@ML" (Main Profile at Main Level) for standard definition and "MP@HL" (Main Profile at High Level) for high definition, there is a step up in capability as the MPEG-4 decoders in the iOS devices move from the 720p to the 1080p standard.

Compare "MP" Main Profile with "HiP" High Profile (courtesy Wikipedia)
So the new devices support 8×8 transforms, quantization scaling matrices, and separate Cb and Cr QP control. While the iPhone 4S and iPad 3rd gen support "HiP@4.1L", the 1080p AppleTV only supports "HiP@4.0L" — this means a 25Mbps limit on the bitstream you can feed an AppleTV, but a theoretical 62.5Mbps limit to the iPhone 4S and new^3 iPad.
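Those two figures come straight from the H.264 level tables: each level sets a maximum video bitrate (20,000 kbit/s at Level 4.0 and 50,000 kbit/s at Level 4.1 for Baseline/Main), and High Profile streams are allowed 1.25 times that. A quick sketch of the arithmetic in Python (the level figures are from the spec's Table A-1; the function and variable names are just mine):

    # Maximum video bitrate per H.264 level (Table A-1, kbit/s for Baseline/Main);
    # High Profile streams are permitted 1.25x these figures.
    MAX_BR_KBPS = {"3.1": 14000, "4.0": 20000, "4.1": 50000}
    HIGH_PROFILE_FACTOR = 1.25

    def max_bitrate_mbps(level, high_profile=True):
        """Return the maximum video bitrate in Mbit/s for a given level."""
        factor = HIGH_PROFILE_FACTOR if high_profile else 1.0
        return MAX_BR_KBPS[level] * factor / 1000.0

    print(max_bitrate_mbps("4.0"))   # 25.0  -> the 1080p AppleTV (HiP@4.0L)
    print(max_bitrate_mbps("4.1"))   # 62.5  -> iPhone 4S and 3rd-gen iPad (HiP@4.1L)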

With the whole universe now 1080p compatible, we'll move on to bigger things next time.

UPDATE: Some more microscopic pixel gazing here:

Tuesday, February 7, 2012

The nexus between upscaling and decompression

Now that everybody has a Full HD display, you’ve probably formed your own opinions on the quality of SDTV content.  In the early days, it was a bit of a novelty to have your own broadcast-monitor experience, like being able to judge whether a news studio had upgraded to a digital tape format yet.

But SDTV will be with us for a long time;  quite apart from the diversion that YouTube, cellphones and the scene have taken us on — something of a long-run historical blip — it’s inevitable that the standard “standard” still has another 20 years left in it.  A large body of work is still being created in this format, so it’s not about to lose its replay value the way black & white did in the ’80s.

So we’ll be dealing with non-square pixels for some time to come.  But I digress.

More than you might realise, our experience (and opinion) of SDTV is hugely influenced by the scaling process.  The quality of the upscaling has often left a lot to be desired, even from the broadcasters:  all but one of the five national networks in Australia used only field-based rendering until 2007 … this meant that each field (1080i) or frame (720p) of HDTV broadcast carried the detail of only 288 lines, or “288p”, half of SDTV’s potential 576 lines in a static image.

The tell-tale signs of field-based scaling
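To make the 288-versus-576-line point concrete, here is a minimal simulation in Python with Pillow (the file names are placeholders): it throws away one field of a deinterlaced SD frame before upscaling, the way a field-based renderer effectively does, and compares that with scaling the full frame.

    from PIL import Image

    frame = Image.open("sd_frame.png")      # a deinterlaced 720x576 frame (placeholder)
    w, h = frame.size

    # Frame-based path: all 576 lines contribute to the upscale.
    frame_based = frame.resize((1920, 1080), Image.BICUBIC)

    # Field-based path: only ~288 lines survive before the stretch to HD.
    one_field = frame.resize((w, h // 2), Image.NEAREST)   # roughly every second line
    field_based = one_field.resize((1920, 1080), Image.BICUBIC)

    frame_based.save("frame_based_1080.png")
    field_based.save("field_based_1080.png")  # visibly softer vertically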
For several years, it was more likely that a $500 decoder box with HD outputs could do a better job than the networks.  And by 2005, the last generation of 1376x768 “non-full” HDTVs could finally scale 480i or 576i properly, deriving the full detail from a static SD image.  Full HD panels soon became a baseline feature, and the momentum was such that in 2006, it became de rigueur for HDTV displays to take a 1080i signal and derive a real 1080p picture from a static image.  When you think about it, this was quite an achievement, with 3 gigabits per second being processed by the internal chipsets.

24 bit colour × 1920 × 1080 × 60fps == 2,985,984,000 bps
double again for "100Hz" / "120Hz" interpolation

But by 2010, after a lot of network upgrades, the situation finally turned.  The networks were finally creating a nice image on their HD channels from their studio-quality SD sources (as the layman had always presumed);  now the only thing unnecessarily limiting quality on all those Full HD displays was the SD channels — or, to be accurate, the way that SD was handled.

A few things have helped mislead viewers into believing that everything on the HD channels was “in HD”, or HD-native, when really the end product of the SD channels was just very far from SDTV best practice.  One was “small” screen sizes — anything under 60 inches makes it hard for an untrained eye to tell the provenance of an HD end product.  Another was that the SD channels were more aggressively starved of bitrate;  yes, that’s a factor, but a more important one has been the state of the art of consumer-side scaling.

So, what's “best practice” with SDTV, even if you can find channels with DVD-like bitrates?

True, field-based processing has long been history, but when you’re done deinterlacing at very high quality and are faced with a 720x480p60 or 720x576p50 sequence that requires display on a panel that has anything but those pixel dimensions, there are many ways to skin a cat.  As people soon found, an “upscaling DVD player” with HDMI output often did a much better job at exactly the same thing that the TV itself was supposed to do.  The SDTV channels as viewed on TiVo, Sky+HD, IQ2 looked better than the TV’s own tuner, or indeed other decoder boxes.

Both are examples of 576i upscaled to 1080i.  Both are deriving frames via 576p.
On the left, TiVo.  On the right, ‘another’ chipset, not utilising Faroudja technology :-)
The example on the left is definitely what you want to live with day-to-day, unless you're lucky enough to live in Japan.  You will have a very pleasant viewing experience on your Full HD panel if you apply this to studio-grade SD, and to SD sources derived from HD material — this is getting very common as cameras get upgraded more quickly than delivery mechanisms.

But we’re still left with a real-life problem:  What about the macroblocks?  In broadcast material, you’re often lucky to get as high as 720x480 or 720x576;  in streaming technologies, the problem is even more common.  You’re most likely to see this unavoidable drop in resolution on broadcast during a “quiet” scene, when all hell is breaking loose on the other channels of the same transponder and the broadcaster’s aggressive statmux’ing regime rips the megabits out of the channel you’re watching:  the I-frame at the start of every MPEG GOP becomes very visible and blocky, and the video starts “pulsating” with every group, generally once per second.

Here is one example of just such an occurrence, with the anamorphic 720x576 image @ 100% on the left.  On the right, I've converted the image to 90x72, and blown it up to view @ 800%.

Spot the difference  :-(
So, yes, for one frame we are watching the equivalent of a 72x72 postage-stamp sized video.
Hello, 1991 called — they want their QuickTime v1.0 video back.

Of course, with twenty years’ strides in technology, we should be able to do a better job than this.
The decoder should be signalling an alert to that fancy 1080i scaler connected to the HDMI port.
Or, if we’re not going to interfere with the 1920x1080 frame buffer, let’s at least do what we can with the 720x576 frame buffer — presumably we still have to manufacture not-very-integrated chipsets that process video sequentially through intermediate frame buffers like this?

Left:  Bilinear upscaling.  Right:  Bicubic upscaling.
The challenge here — the only challenge — is to identify how low to go before applying a sane upscale, rather than some ridiculous nearest-neighbour copyblt born of lazy engineering.  Of course, this information is already available during the MPEG decoding process, and, of course, it’s only really needed for the lightweight MPEG-2 and SDTV decoding that was conquered long ago — but which shall be with us for decades yet.  (In heavier scenarios, with MPEG-4 AVC deblocking and/or HDTV frames, this technique isn’t necessary.)
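As a rough illustration of the idea — just a Pillow simulation with a hand-picked effective resolution, not the decoder-integrated version being argued for here, and with placeholder file names:

    from PIL import Image

    blocky = Image.open("iframe_720x576.png")   # a bitrate-starved I-frame (placeholder)

    # Pretend the decoder has told us this frame only holds ~90x72 of real detail,
    # e.g. derived from the quantiser scale it has just been forced to use.
    effective = blocky.resize((90, 72), Image.BICUBIC)

    # Lazy engineering: nearest-neighbour blow-up, so the blocks stay razor sharp.
    lazy = effective.resize((720, 576), Image.NEAREST)

    # The sane upscale: bicubic from the effective resolution, so the blocks melt
    # into a soft but watchable picture.
    sane = effective.resize((720, 576), Image.BICUBIC)

    lazy.save("nearest_720.png")
    sane.save("bicubic_720.png")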

Left:  Different frame, far less severe bitrate starvation.  Right:  Yet another frame, with ‘normal’ bitrate.
(note the quality recovery comes largely as a result of p-frames coping a lot better under harsh conditions)
Compare these with the more “normal” frames above to see how successful integrating the scaling and decompression stages can be.

By the way, the video sample here has been through a fairly typical process for older American TV shows in PAL/DVB countries:

film 24fps → 480i 23.976pSF → 576i 25pSF → 540x576 → padded to 720x576 for 16:9
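The 23.976pSF → 25pSF step is the familiar PAL speedup: every frame is simply played a little faster.  A quick check of how much, in Python:

    from fractions import Fraction

    ntsc_film = Fraction(24000, 1001)      # 23.976... fps
    pal = Fraction(25, 1)

    speedup = pal / ntsc_film
    print(float(speedup))                  # ~1.0427 -> ~4.3% faster, audio pitched up
    print(float(1 - 1 / speedup))          # ~0.041  -> running time ~4.1% shorter
    print(45 / float(speedup))             # a nominal 45-minute episode -> ~43.2 min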

And following on from that processing chain, you get all of the 1080i scaling mentioned further above.

Sunday, January 29, 2012

Squeezing the most out of H.264 QuickTime


Presumably you already know about keeping the native resolution, aspect ratio,
frame rate, interlacing, etc., taking into account the final delivery format.

At this point it's worth repeating that Vimeo allows "Plus" users to download
each other's raw upload file ("raw" as in what got uploaded to the site). This can
be any .MOV or .MP4 you wish, getting around the 30fps limits of "online video".

How to get the most quality out of your QuickTime encoding:
   
  • Most importantly, install an encoder component built on the x264 implementation.
    The output still decodes with Apple / VLC / MPlayer / etc.

    http://www.macupdate.com/app/mac/20273/x264-quicktime-codec

    This is Henry Mason's x264 implementation.  After extensive trial and error,
    I've found it more effective than the 'lavc' implementation, with far fewer buttons.

    Here's what I mean:
The 'lavc' x264 options panel.  Thankfully, we can avoid this.


  • Advanced Settings:

    This is simply a question of DB or not DB.  Deblocking is the pre-filtering
    craze that makes people with a semi-trained eye believe they have conquered
    the MPEG quality issue.  "No More Blocky Pixelization!"

    Unfortunately, it's also a killer of HD.  With time, you'll come to appreciate
    that deblocking kills the "oversampled" pixels which make the best HD film
    transfers, or classy 4K to HD content.  Indeed it can take away the full
    potential of SDTV (the best SD is always produced in HD).

    If you come across a high bitrate MPEG-2 source, you'll notice the lovely detail
    in the noise which really isn't noise at all.

    If your video feature contains grains, rocks, forests, or textures of any kind,
    then turn off DB.  This is the key to ensuring that the feel of the video does not
    change … it is the only setting that has a real impact on changing the nature of
    the picture.  This sets it apart from the amateurish look of archive-unfriendly
    codecs like Cinepak, DivX and Xvid, historically abused by "the scene".
    x264 Advanced Settings:  No DB

    Turn on CABAC (the real grunt behind H.264) and everything else, except B-frame
    Pyramids:  all the other options put a burden on the encoder (you) and not the
    decoder;  the random forward-lookup nature of B-frame pyramids means you would
    be placing an undue burden on the viewer, whose clunky Pentium III laptop
    probably can't cope with the decoding.  (The command-line equivalents of these
    settings are sketched at the end of this post.)


    If your video feature contains mostly glossy surfaces, blue sky, cloud or water,
    then use DB, and set it to maximum.  These are examples of where DCT technologies
    come up short;  the MPEG family will try to be "faithful" to the noise, but in our
    minds we want to see the smoothness of these surfaces.
    x264 Advanced Settings:  DB Max
      
  • Aiming for a bitrate?  Use 2-pass and turn off frame resampling by leaving the
    frame rate set to "Current".  (Naturally, if you're aiming for a bitrate, you'd
    use "Restrict to" and enter the kbps!)

    There's a big fat bug here:  if you don't select "Current", the 2-pass encoding
    won't hit your target;  instead it will just give you a "medium" quality encoding,
    regardless of the target bit rate, spatial quality percentage and temporal
    quality percentage.
     
  • Generally, though, you're not aiming for a bitrate.  You would be:

    1.  Uploading a short YouTube feature that doesn't go near the 20GB "pro" limit
    2.  Uploading a very short Vimeo feature that doesn't go near their 5GB limit
    3.  Uploading a low-resolution or sub-15-min YouTube clip;  you're going to be
         nowhere near the 2GB "amateur" limit, just wanting a quicker upload than
         you'd get if you uploaded the original source — without losing too much quality

    In this case you should set the Data Rate to Automatic, and select one of the following
    for Spatial Quality (temporal quality is ignored):
     
    • 50% Medium for "looks good to most people"
      (equivalent to bit-starved broadcast quality)
    • Halfway between 50% Medium and 75% High "looks good to pros"
      (equivalent to a typical DVD or BluRay)
    • 75% High only if you really need to maintain existing artefacts
      (e.g. for another generation of editing)
    • Halfway between 25% Low and 50% Medium for just scraping by
      (e.g. if you had a really bad source anyway,
      and just need some further bitrate reduction)
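For anyone who would rather script these decisions than click through QuickTime dialogs, the same choices map roughly onto the standalone x264 command line.  A sketch in Python, under a few assumptions:  your x264 build accepts .mov input (i.e. it was compiled with lavf/ffms support), the flag names are as listed by x264 --fullhelp of this era, and the file names and bitrates are placeholders.  Note that CRF is x264's own constant-quality scale, not the same thing as QuickTime's quality slider.

    import subprocess

    SRC, OUT = "master.mov", "upload.mp4"            # placeholder file names

    # "Textured" material (grain, rock, forest): deblocking off, CABAC left on
    # (it is x264's default), no B-frame pyramid, quality-targeted via CRF.
    textured = ["x264", "--crf", "18", "--no-deblock", "--b-pyramid", "none",
                "--output", OUT, SRC]

    # "Glossy" material (sky, cloud, water): deblocking at its maximum offsets instead.
    glossy = ["x264", "--crf", "18", "--deblock", "6:6", "--b-pyramid", "none",
              "--output", OUT, SRC]

    # Aiming for a bitrate instead: the classic two-pass, e.g. 5000 kbps.
    two_pass = [
        ["x264", "--pass", "1", "--bitrate", "5000", "--output", "/dev/null", SRC],
        ["x264", "--pass", "2", "--bitrate", "5000", "--output", OUT, SRC],
    ]

    # Run whichever recipe suits the material, e.g.:
    subprocess.check_call(textured)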