The following words are by GoPro Technical Fellow David Newman and Spherical Software Manager Daryl Stimm.

With the release of GoPro MAX, we've introduced some unique, new 360 modes and features as well as a .360 file format that set MAX apart from existing 360 cameras, including GoPro Fusion.

And now that MAX is out in the wild, we want to share the backstory of some of MAX's inner workings that tech specs taken at face value, especially in 360 capture, just can't convey properly. Take resolution, for example, specifically the difference between capture and stitched resolutions. With MAX, spherical videos are captured at 6K resolution and stitched to 5.6K, and spherical photos are captured at 18MP resolution and stitched to 16.6MP. Two different resolutions, one incredible output. Read on to find out how.

To get started, it’s best to look at the key goals we identified when we started developing MAX:

  • Significantly decrease the time users spend editing 360 in post-production
  • Simplify media management
  • Increase final stitched resolution output
  • Maintain playback compatibility with mobile devices
  • Provide fine-grained control of stabilization options
  • Improve the spherical stitching and blending quality established with Fusion

In order to achieve these goals, we focused on the following key technical areas:

  • In-camera stitching
  • Encoding to a single MP4 file
  • Creative use of a new 360° projection
  • Improving processing of sensor data stored as metadata
  • Perfecting a multi-lens blend

Thanks in large part to the power of the GP1 Chip that is the heart of MAX, we are proud to report that all of these goals are achieved with MAX. However, when faced with a camera that captures source resolutions of 6K video and 18MP photos, even with the power of GP1 on our side, some tricky challenges arose.

For example, a 5.6K stitched equirectangular video has a resolution of 5376x2688, which is nearly impossible to encode with a modern encoder. To make things more challenging, most high-end devices (PCs and smartphones alike) struggle to decode such a large file. Most decoders are only optimized around UHD (4K), which is 3840x2160. That is a difference of 6,156,288 pixels! So, we knew that encoding to the 5.6K resolution wouldn’t be feasible, but if we changed the projection, we could fit the same amount of resolution into a much smaller container size.
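The pixel counts behind that comparison are easy to check. The short Python sketch below only reproduces the arithmetic from the two frame sizes quoted above; the constant and function names are illustrative.

```python
# Frame sizes quoted in the article, as (width, height).
EQUIRECT_5_6K = (5376, 2688)   # stitched 5.6K equirectangular
UHD_4K = (3840, 2160)          # UHD, the size most decoders are tuned for


def pixel_count(size):
    """Return width * height for a (width, height) pair."""
    width, height = size
    return width * height


equirect_pixels = pixel_count(EQUIRECT_5_6K)
uhd_pixels = pixel_count(UHD_4K)
extra_pixels = equirect_pixels - uhd_pixels

print(f"5.6K equirectangular: {equirect_pixels:,} pixels")
print(f"UHD (4K):             {uhd_pixels:,} pixels")
print(f"Difference:           {extra_pixels:,} pixels")
```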

The best projection for this is Google’s Equi-Angular Cubemap, or EAC. EAC needs 25 percent fewer pixels than equirectangular for the same resolution. This can easily be seen in the image below: equirectangular has redundant pixels at the poles, while EAC spreads its pixels almost evenly across each face of the cube map. This, in turn, allows us to pack 5.6K into a container of 4032x2688, which is 3,612,672 fewer pixels! Sadly, this file is also too large for most decoders to play back smoothly, so we had to employ one more trick. And it happens to be one of our favorites.
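To see where 4032x2688 and the 25 percent figure come from, here is a minimal sketch. It assumes the six cube faces are packed in a 3x2 grid and that each face covers 90° at the same pixels-per-degree as the equirectangular equator. That layout is inferred from the container size, not spelled out in the article, and the function name is illustrative.

```python
def eac_container(equirect_width):
    """Size of a 3x2 cube map matched to an equirectangular width.

    Assumption: each face spans 90 of the 360 degrees, so a face is
    one quarter of the equirectangular width on each side.
    """
    face = equirect_width // 4
    return 3 * face, 2 * face


equirect_w, equirect_h = 5376, 2688
eac_w, eac_h = eac_container(equirect_w)

equirect_pixels = equirect_w * equirect_h
eac_pixels = eac_w * eac_h
saved_pixels = equirect_pixels - eac_pixels
saving_ratio = saved_pixels / equirect_pixels

print(f"EAC container: {eac_w}x{eac_h}")
print(f"Pixels saved:  {saved_pixels:,} ({saving_ratio:.0%})")
```

The saving is the same at any resolution under these assumptions: six faces of side w/4 hold 6/8 of the pixels in a w by w/2 equirectangular frame.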

By splitting the top and bottom halves of the cube map into two separate video tracks, we create two videos, each at a resolution of 4032x1344. 4032x1344 is very easy for all UHD decoders to play back. And if a decoder can decode one 4032x1344 file, it should have no problem decoding two. That did the trick! So, we built a special player that can decode both streams in real time and re-project them into a sphere. You can see this in action with the GoPro App and the newly released GoPro Player for Desktop.
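The two-track idea can be sketched in a few lines of Python. This is an illustration only, not the .360 muxing code: the frame is a tiny stand-in grid, and `split_tracks` and `merge_tracks` are hypothetical helper names.

```python
UHD_PIXELS = 3840 * 2160
EAC_WIDTH, EAC_HEIGHT = 4032, 2688


def split_tracks(frame):
    """Split a frame (a list of pixel rows) into top and bottom halves."""
    half = len(frame) // 2
    return frame[:half], frame[half:]


def merge_tracks(top, bottom):
    """Stack the two halves back into one frame, as a player would."""
    return top + bottom


# Per-track dimensions for the real container size.
track_height = EAC_HEIGHT // 2
track_pixels = EAC_WIDTH * track_height
print(f"Each track: {EAC_WIDTH}x{track_height} = {track_pixels:,} pixels")
print(f"Fits within a UHD pixel budget: {track_pixels < UHD_PIXELS}")

# Tiny 6x4 stand-in frame to show the split is lossless.
frame = [[(x, y) for x in range(6)] for y in range(4)]
top, bottom = split_tracks(frame)
restored = merge_tracks(top, bottom)
print(f"Round trip intact: {restored == frame}")
```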

Now, on to the next challenge. We still needed to provide users a way to convert these .360 files into standard videos or convert them into the larger equirectangular projection for upload to YouTube or 360 editors, such as Adobe Premiere Pro and Final Cut Pro. This conversion is now much faster because the footage is already packaged in a stitched EAC format.

Because of this success, and knowing that Google is also using EAC to power YouTube 360 playback, we really hope to help propel EAC as the future standard. Once more tools start to support EAC, we will be able to skip this conversion step altogether and save even more time since no reprojection is needed.

Another huge advantage of EAC over the previous standard (dual fisheye) is that there is no need to encode unused pixels. Previously, we had to encode a round image into a square container, leaving the encoded edges completely unused. With EAC encoding 25 percent fewer pixels than Fusion, we can achieve a net increase in effective bitrate even though the overall bitrate is lower, meaning much smaller file sizes. File sizes get even smaller with one final MAX advantage. With MAX, we are also encoding in HEVC, and 60Mbps in HEVC is equivalent to 90Mbps with H.264.
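Two of the numbers in this paragraph can be put in rough perspective with a little arithmetic. The sketch below assumes an idealized fisheye image circle that exactly touches the edges of a square frame. That is a simplification not stated in the article, so the unused fraction is only indicative. The HEVC ratio is taken directly from the 60Mbps and 90Mbps figures above, and the function names are illustrative.

```python
import math


def unused_fraction_circle_in_square():
    """Fraction of a square frame left outside an inscribed circle.

    Assumption: the image circle is perfectly inscribed in the square,
    which real fisheye frames only approximate.
    """
    return 1.0 - math.pi / 4.0


def h264_equivalent_mbps(hevc_mbps, hevc_ref=60.0, h264_ref=90.0):
    """Scale an HEVC bitrate by the article's 60:90 equivalence."""
    return hevc_mbps * (h264_ref / hevc_ref)


unused = unused_fraction_circle_in_square()
equivalent = h264_equivalent_mbps(60.0)

print(f"Unused area, circle in square: {unused:.1%}")
print(f"60 Mbps HEVC is treated as {equivalent:.0f} Mbps H.264")
```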

Here is an example of Equirectangular:

Here is an example of EAC. Notice the savings in pixels: there is no redundancy and pixel density is close to uniform, so we can pack the same resolution into a much smaller container.
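For readers curious about what makes the cube faces so evenly sampled, Google's published description of EAC (the Equi-Angular Cubemap) replaces the usual linear position on a cube face with one proportional to the viewing angle. The sketch below shows that per-axis mapping. It is not taken from MAX firmware or the .360 specification, and the function names are illustrative.

```python
import math


def cube_to_eac(x):
    """Map a standard cube-face coordinate in [-1, 1] to EAC space.

    The result is proportional to the viewing angle atan(x), scaled so
    that the face edges (x = -1 and x = 1) stay at -1 and 1.
    """
    return (4.0 / math.pi) * math.atan(x)


def eac_to_cube(u):
    """Inverse mapping: EAC coordinate in [-1, 1] back to the cube face."""
    return math.tan(u * math.pi / 4.0)


# Equal steps in EAC space correspond to equal steps in viewing angle.
for u in (0.0, 0.25, 0.5, 0.75, 1.0):
    angle = math.degrees(math.atan(eac_to_cube(u)))
    print(f"u = {u:.2f} -> viewing angle {angle:.2f} degrees")
```

With a plain cube map, pixels near the face edges cover a smaller angle than pixels at the center. The arctangent mapping evens that out, which is why the faces can carry the same angular resolution everywhere.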

Now let’s talk stitching and blending. While the stitch aligns the front and back images, automatically correcting for the parallax of near and far objects, the blend balances the exposure differences between the front and back lenses. When we developed Fusion, we set the industry bar very high for stitching and blending, and we did so by combining the overlap on the 194° dual fisheyes.

With MAX, we are proud to say we are again setting the bar by retaining the small overlapped areas inside the EAC projection. While the stitching is completed in camera on MAX, a small overlap is retained so that the blend can be refined in post, improving quality even beyond that of Fusion. We did this by adding overlap areas on the stitch line and increasing the width of the EAC projection by an additional 64 pixels.
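As a quick check of what the extra 64 pixels do to the frame size, the sketch below adds them to the 4032-pixel EAC width quoted earlier. It assumes the padding applies to the width of each 4032x1344 track and leaves the height unchanged. How the 64 pixels are divided between stitch lines is not stated in the article, so it is not modeled here.

```python
EAC_WIDTH = 4032        # EAC container width quoted in the article
TRACK_HEIGHT = 1344     # height of each of the two video tracks
OVERLAP_PIXELS = 64     # extra width retained for blending

# Assumption: the overlap only widens the frame; the height is unchanged.
padded_width = EAC_WIDTH + OVERLAP_PIXELS
extra_fraction = OVERLAP_PIXELS / EAC_WIDTH

print(f"Padded track size: {padded_width}x{TRACK_HEIGHT}")
print(f"Width increase:    {extra_fraction:.2%}")
```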

The image below shows overlap areas outlined in red.

The following image shows these areas blended correctly. Notice that, when blended correctly, you see those highlighted areas disappear and the colors and lighting on each side of the lens are corrected.  
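To give a feel for what a blend does with an overlap strip, here is a deliberately simple linear cross-fade in Python. It is a generic textbook technique used for illustration, not GoPro's actual blending algorithm. The function name and the sample brightness values are made up.

```python
def crossfade(front_strip, back_strip):
    """Blend two equal-length overlap strips with a linear ramp.

    The result starts as pure front-lens values and ends as pure
    back-lens values, hiding a step in exposure between them.
    """
    if len(front_strip) != len(back_strip):
        raise ValueError("overlap strips must have the same length")
    n = len(front_strip)
    if n == 0:
        return []
    if n == 1:
        return [(front_strip[0] + back_strip[0]) / 2]
    blended = []
    for i, (front, back) in enumerate(zip(front_strip, back_strip)):
        weight = i / (n - 1)          # 0.0 at the front edge, 1.0 at the back
        blended.append((1 - weight) * front + weight * back)
    return blended


# Hypothetical brightness values: the back lens exposed slightly brighter.
front = [100, 100, 100, 100, 100]
back = [120, 120, 120, 120, 120]
result = crossfade(front, back)
print(result)
```

Without the retained overlap there would be only a single seam column to work with, so the exposure step could not be spread across several pixels in this way.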

Finally, another big bitrate advantage is that we are now doing Electronic Rolling Shutter (ERS) compensation on the camera before the video hits the encoder, unlike Fusion, which did all of this in a post-processing step. This ERS compensation is a key factor in stabilization: it is the anti-vibration stabilization applied in camera. This means even better compression for the encoder since like pixels move less between frames, and this alone has proven to significantly reduce compression artifacts.

HEVC, EAC projection and ERS stabilization done on camera all give MAX a huge bitrate advantage over our competition, and we are confident you will be very pleased with the quality of the resulting video thanks to this completely new package.

With the combination of optimizing around projection and splitting up the video into dual tracks, the resulting 5.6K video from MAX will give 360 content a huge boost in playability and quality over anything we have previously seen. We are excited to see this come to life as MAX gets into the hands of 360 enthusiasts.

For more background on defining spherical resolutions, check out this article also penned by David Newman.