AV1 Image File Format (AVIF)

Commit Snapshot,

This version:
https://AOMediaCodec.github.io/av1-avif
Issue Tracking:
GitHub
Editor:
(Netflix)
Warning

This specification is still at draft stage and should not be referenced other than as a working draft.

Copyright 2018, The Alliance for Open Media

Licensing information is available at http://aomedia.org/license/

The MATERIALS ARE PROVIDED “AS IS.” The Alliance for Open Media, its members, and its contributors expressly disclaim any warranties (express, implied, or otherwise), including implied warranties of merchantability, non-infringement, fitness for a particular purpose, or title, related to the materials. The entire risk as to implementing or otherwise using the materials is assumed by the implementer and user. IN NO EVENT WILL THE ALLIANCE FOR OPEN MEDIA, ITS MEMBERS, OR CONTRIBUTORS BE LIABLE TO ANY OTHER PARTY FOR LOST PROFITS OR ANY FORM OF INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES OF ANY CHARACTER FROM ANY CAUSES OF ACTION OF ANY KIND WITH RESPECT TO THIS DELIVERABLE OR ITS GOVERNING AGREEMENT, WHETHER BASED ON BREACH OF CONTRACT, TORT (INCLUDING NEGLIGENCE), OR OTHERWISE, AND WHETHER OR NOT THE OTHER MEMBER HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


Abstract

This document contains a set of requirements that can be used to incorporate one or more [AV1] still images in a file using structures and procedures defined by the [HEIF] specification. This specification also defines a set of baseline requirements to allow the interchange of files between AVIF writers and readers.

1. General

AVIF is a file format wrapping compressed still images based on the Alliance for Open Media AV1 intra-frame encoding toolkit. AVIF supports High Dynamic Range (HDR) and wide color gamut (WCG) images as well as standard dynamic range (SDR). An AVIF file shall be conformant to [HEIF] for both Image Collections and Image Sequences. Specifically, an AVIF file shall be compliant with the requirements of Clause 4 of the [ISOBMFF] specification and, where applicable, the recommendations in Annex I: Guidelines On Defining New Image Formats and Brands in [HEIF].

2. Terms and Definitions

For the purposes of this document, the terms, definitions, and abbreviated terms specified in [ISOBMFF] and [HEIF] apply.

Some important definitions used by this document are paraphrased informatively here.

2.1. Alpha Image

A specific type of Auxiliary Image that may be used to convey information representing the opacity of associated Master Images.

2.2. Auxiliary Image

An image that is not intended to be displayed but provides supplemental information for associated Master Images.

2.3. Cover Image

A Master Image that may be used to represent the file contents. An example of this is a single image used to represent an animation before the animation sequence is activated.

2.4. Image Collection

One or more Master Images stored as items in a single file with no defined order or timing information. Within a collection, image samples may share properties and metadata.

2.5. Image Properties

This is a class of non-media data. The property items may be a descriptive image attribute or decoder configuration data. The properties are primarily for consumption by the decoding agent. This information may include:

2.6. Image Sequence

A sequence of Master Images stored as a track for which information is provided that defines a sequential ordering and temporal information indicating suggested playback timing. An agent decoding and presenting an AVIF file may choose to render an Image Sequence as an animation.

2.7. Master Image

An image that is not a thumbnail or auxiliary image. For the purpose of this specification, such an image is encoded using AV1 intra-frame tools. This type of image is the primary displayable payload of an AVIF file. A Master Image may be included in both an Image Collection and an Image Sequence within the same file, along with being referenced as the Cover Image.

2.8. Metadata

Metadata conveys image attributes that are not used to decode or reconstruct an image. This data is considered to be non-essential and non-normative. Examples of this include EXIF, XMP, and MPEG-7.

2.9. Thumbnail Image

This is a non-master image that may be used to represent one or more Master Images found in an AVIF file. It is typically of a smaller scale than the Master Images. Its compression format may be different than the one used by the Master Images.

3. Object Model and Structure

An AVIF file shall be a conformant version of an [HEIF] file. This is to allow for the deployment of general libraries that may be used to create and parse HEIF-based image files wrapping different coding methods for the actual image content. This should be similar to ISO-BMFF usage in the video domain.

The AVIF file format is built on the box-structured media interchange format introduced by the ISO Base Media File Format ([ISOBMFF]). The format specified by AVIF defines the use of a subset of box structures introduced in ISOBMFF. Where the necessary structures do not exist in ISOBMFF, structures defined as part of the High Efficiency Image File Format ([HEIF]: ISO/IEC 23008-12) that are codec neutral and can be applied in a generic manner are used. An AVIF version 1.0 file shall be compliant with the requirements of Clause 4 of the [ISOBMFF] specification and, where applicable, the recommendations in Annex I: Guidelines On Defining New Image Formats and Brands in the MPEG HEIF specification shall be followed for AVIF 1.0.
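
As an illustration of the box structure that AVIF inherits from [ISOBMFF], the following Python sketch walks the boxes found at one nesting level of a byte buffer. The function name iter_boxes and the sample bytes are assumptions made for this example and are not defined by this specification. The sketch parses only the generic box header (32-bit size, four-character type, the 64-bit largesize form, and the size-zero "extends to the end" form) and does not interpret any payload.

```python
import struct

def iter_boxes(data, start=0, end=None):
    """Yield (box_type, payload_offset, payload_size) for each box in data[start:end].

    Hypothetical helper: it parses only the generic ISOBMFF box header.
    """
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A 64-bit largesize field follows the type field.
            if pos + 16 > end:
                raise ValueError("truncated largesize at offset %d" % pos)
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing container.
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError("invalid box size at offset %d" % pos)
        yield box_type.decode("ascii", "replace"), pos + header, size - header
        pos += size

# Illustrative input: an 'ftyp' box followed by an empty 'meta' FullBox.
sample = (
    struct.pack(">I4s4sI4s4s", 24, b"ftyp", b"av1i", 0, b"av1i", b"mif1")
    + struct.pack(">I4sI", 12, b"meta", 0)
)
for box_type, offset, size in iter_boxes(sample):
    print(box_type, offset, size)
```

Nested boxes can be visited by calling the same function again with the payload range of a container box, for example the payload of "moov".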

4. Image Data

Image data of type "av1i" shall be limited to AV1 intra frames. This applies to both Image Collections and Image Sequences. Each image shall conform to the requirements of an Intra Frame as defined by the AV1 Bitstream & Decoding Process Specification [AV1]. No inter-frame encoding shall be permitted between images.

5. AVIF Image Collection

The image data of type "av1i" shall be used for an image collection item coded with AV1.

The image item data shall be structured as defined in the AV1 Sample Format section of the AV1 Codec ISO Media File Format Binding [AV1-ISOBMFF] specification.

An AVIF file containing an Image Collection shall list the "mif1" structural brand as one of the entries in the compatible_brands array and conform to clauses 6 and 10.2 of [HEIF].

AVIF does not support coding dependencies beyond shared image properties; in particular, there are no inter-frame decoding dependencies between image samples.

5.1. Image Item Properties

In addition to the Image Properties defined in [HEIF], AVIF image collections may also use the Content Light Level and Mastering Display Colour Volume image properties introduced in [MIAF].

5.1.1. AV1 Configuration Item Property

Each image item of type "av1i" shall have an associated image property of type "av1C" that is identical to the AV1CodecConfigurationBox as defined in [AV1-ISOBMFF]. Such a property shall be marked as essential.

Each "av1i" image item shall be associated with an AV1 Configuration Item Property.

6. AVIF Image Sequence

6.1. AVIF Image Tracks

The sample entry of type "av1i" shall be used for an image sequence track coded with AV1.

Image samples of type "av1i" (AV1 image) are based on AV1 intra frames, specifically the AV1 intra-frame encoding tools as defined by the AV1 Bitstream & Decoding Process Specification [AV1]; no inter-frame encoding shall be permitted between images. The image track is structured as defined by the AV1 Codec ISO Media File Format Binding [AV1-ISOBMFF] specification, constrained to the specific use case where all frames are key frames.

An AVIF file containing an Image Sequence shall list the "msf1" brand as one of the entries in the compatible_brands array and conform to sections 7 and 10.3 of [HEIF].

6.2. AV1 Sample Entry

All SampleDescriptionBoxes associated with AV1 image tracks shall use the format and procedures as defined in [AV1-ISOBMFF]. The AV1SampleEntry extends the VisualSampleEntry and includes a mandatory AV1CodecConfigurationBox. In this case, the box is identified by the "av1i" type.

Sample Entry Type: av1i
Container:         Sample Description Box ('stsd')
Mandatory:         Yes
Quantity:          One or more

class AV1SampleEntry extends VisualSampleEntry('av1i') {
  AV1CodecConfigurationBox config;
}

7. Alpha Images

An Alpha Image is a specific type of auxiliary image that is used to carry per-pixel opacity information for one or more Master Images.

A URN will be defined to identify AVIF alpha auxiliary images in both collections and sequences. For the purposes of this draft, the placeholder urn:aom:avif:alpha will be used whenever the "auxC" image item property is required.

8. Brands

If the major_brand field is set to "av1i" then the minor_version shall be set to 0.

If the major_brand field is not set to "av1i", then the brand "av1i" shall appear in the compatible_brands array.

The compatible_brands array shall contain "mif1" if the file contains an Image Collection.

The compatible_brands array shall contain "msf1" if the file contains an Image Sequence.
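
The four brand rules above can be checked mechanically. The following Python sketch is one possible way to do so; the function names parse_ftyp and check_avif_brands, and the has_collection and has_sequence flags, are assumptions made for this example. The input is assumed to be the payload of the "ftyp" box (major_brand, minor_version, then the compatible_brands array), as laid out in [ISOBMFF].

```python
import struct

def parse_ftyp(payload):
    """Split an 'ftyp' payload into (major_brand, minor_version, compatible_brands)."""
    if len(payload) < 8 or (len(payload) - 8) % 4 != 0:
        raise ValueError("malformed ftyp payload")
    major, minor = struct.unpack_from(">4sI", payload, 0)
    brands = [payload[i:i + 4] for i in range(8, len(payload), 4)]
    return major, minor, brands

def check_avif_brands(payload, has_collection, has_sequence):
    """Return a list of violated brand rules (empty when all rules hold).

    has_collection / has_sequence are hypothetical flags that the caller
    derives from the rest of the file.
    """
    major, minor, brands = parse_ftyp(payload)
    problems = []
    if major == b"av1i":
        if minor != 0:
            problems.append("minor_version must be 0 when major_brand is av1i")
    elif b"av1i" not in brands:
        problems.append("av1i missing from compatible_brands")
    if has_collection and b"mif1" not in brands:
        problems.append("mif1 missing from compatible_brands")
    if has_sequence and b"msf1" not in brands:
        problems.append("msf1 missing from compatible_brands")
    return problems

# Illustrative payload: major_brand av1i, minor_version 0, brands av1i and mif1.
payload = b"av1i" + struct.pack(">I", 0) + b"av1i" + b"mif1"
print(check_avif_brands(payload, has_collection=True, has_sequence=False))
```

Returning a list of problems rather than a boolean lets a writer or validator report every violated rule at once.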

9. AVIF Baseline Profile

An AVIF Baseline file should be a conformant, simplified version of an [HEIF] file. This is to allow for the deployment of general libraries that may be used to create and parse HEIF-based image files wrapping different coding methods for the actual image content. This should be similar to ISO-BMFF usage in the video domain. The following subsections define limitations on the general format of an HEIF file that together set the minimum requirements for an AVIF file reader. An implementation of an AVIF file reader may optionally choose to recognise any structure as defined in HEIF.

9.1. Image Storage

All of the constituent elements, including image samples, shall be contained in a single file. All media data locations, regardless of construction method, shall resolve to an offset within an AVIF file.
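
A reader could enforce the second requirement with a simple bounds check. The Python sketch below assumes that the extents have already been resolved to absolute (offset, length) pairs, whatever construction method was used; the function name extents_within_file and the tuple representation are hypothetical and not part of this specification.

```python
def extents_within_file(extents, file_size):
    """Return True if every (offset, length) extent lies inside the file.

    extents: iterable of absolute (offset, length) pairs (assumed input).
    file_size: total size of the AVIF file in bytes.
    """
    for offset, length in extents:
        if offset < 0 or length < 0 or offset + length > file_size:
            return False
    return True

# Illustrative values for a 100-byte file.
print(extents_within_file([(0, 10), (90, 10)], 100))
print(extents_within_file([(95, 10)], 100))
```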

9.2. Thumbnails

Only an image type of "av1i" or "jpeg" is permitted for thumbnails.

9.3. Cover Image

A PrimaryItemBox is optional. If a Cover Image is not explicitly indicated by a PrimaryItemBox, the Cover Image shall be assumed to be the first master image entry in the ItemLocationBox or the first entry in the master track.
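
For the Image Collection case, the fallback rule can be sketched as follows. The function name pick_cover_item and its two parameters are assumptions made for this example; the caller is assumed to have already filtered the ItemLocationBox entries down to Master Images while preserving their order.

```python
def pick_cover_item(primary_item_id, master_item_ids):
    """Return the item ID to treat as the Cover Image, or None if there is none.

    primary_item_id: item ID read from the PrimaryItemBox, or None when the
        box is absent.
    master_item_ids: Master Image item IDs in ItemLocationBox order
        (assumed input, already filtered by the caller).
    """
    if primary_item_id is not None:
        return primary_item_id
    return master_item_ids[0] if master_item_ids else None

# Illustrative values: no PrimaryItemBox, so the first master item is used.
print(pick_cover_item(None, [3, 1, 2]))
```

The Image Sequence case (first entry in the master track) is not covered by this sketch.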

9.4. Auxiliary Images

AVIF allows only one type of Auxiliary Image: an Alpha Image.

9.5. Alpha Image

After applying any transformational Image Properties, the alpha image shall have the same dimensional attributes as the largest component plane in the Master Image: width, height, and pixel aspect ratio. Furthermore, the pixels of the Alpha Image shall overlay the pixels of the largest component plane of any linked Master Image exactly. For example, for YUV 4:2:x, this would be the Y component plane. The decoded value of an alpha pixel shall be a normalized unsigned integer of at least 8 bits representing a range between 0.0 and 1.0. An alpha value of zero shall map to 0.0 and the maximum value representable by the alpha channel shall map to 1.0.
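
The mapping from a decoded alpha sample to the range 0.0 to 1.0 can be illustrated with a short Python sketch. The function name normalize_alpha and the bit_depth parameter are assumptions made for this example. A linear mapping between the two endpoints is also an assumption: this section fixes only the values at zero and at the channel maximum.

```python
def normalize_alpha(value, bit_depth=8):
    """Map an unsigned alpha sample to a float in [0.0, 1.0].

    Assumes a linear mapping: 0 -> 0.0 and (2**bit_depth - 1) -> 1.0.
    """
    if bit_depth < 8:
        raise ValueError("alpha samples use at least 8 bits")
    max_value = (1 << bit_depth) - 1
    if not 0 <= value <= max_value:
        raise ValueError("alpha sample out of range for the given bit depth")
    return value / max_value

# Illustrative values for 8-bit and 10-bit alpha channels.
print(normalize_alpha(0), normalize_alpha(255), normalize_alpha(1023, bit_depth=10))
```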

9.6. Sample Groups

Sample groups may be ignored by readers.

9.7. Hidden Images

Hidden images may be ignored by readers.

9.8. Pre-Derived Coded Images

Pre-derived coded images may be ignored by readers.

9.9. Multi-Layer Images

Multi-layer images may be ignored by readers. This does not limit the underlying image format from encoding multiple layers but this will be internal to the encoded image and managed directly by the decoder and not visible to the AVIF reader.

9.10. Derived Images

The derivation of tiled images may be supported by Baseline readers that also support rendering images larger than permitted by AV1 profiles. No other derived image format need be supported.

9.11. Metadata

Metadata conveys image attributes that are not used to decode or reconstruct an image. This data is considered to be non-essential and non-normative. Examples of this include EXIF, XMP, and MPEG-7. An AVIF reader will not be required to extract metadata from Informational Metadata boxes. This includes the case of image editing instructions conveyed in an XMP record. Essential information shall be carried in the image media directly or be conveyed as Image Properties.

9.12. Collection Elements

An AVIF file reader must be able to recognize the following boxes. Any box field or feature not explicitly limited by this specification should be handled as defined in ISO 14496‑12 and ISO 23008-12.

Box hierarchy       Version   Box description
ftyp                -         file type
meta                0         metadata container box
  hdlr              0         handler type definition
  pitm              0,1       primary item reference
  iloc              0,1,2     item location table
  iinf              0,1       item information table
    infe            2,3       item information table entry
  iprp              -         item properties container box
    ipco            0         item property definitions
    ipma            0,1       item property associations
  idat              -         item data box
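
A reader might encode the table above as a lookup so that unsupported box versions can be detected early. The Python sketch below is one way to do this; the names COLLECTION_BOX_VERSIONS and in_baseline_collection_set are hypothetical, and boxes listed with "-" in the table are represented with None because no version is given for them.

```python
# Box type -> set of versions listed in the table, or None where the table
# gives "-" (no version listed for that box).
COLLECTION_BOX_VERSIONS = {
    "ftyp": None,
    "meta": {0},
    "hdlr": {0},
    "pitm": {0, 1},
    "iloc": {0, 1, 2},
    "iinf": {0, 1},
    "infe": {2, 3},
    "iprp": None,
    "ipco": {0},
    "ipma": {0, 1},
    "idat": None,
}

def in_baseline_collection_set(box_type, version=None):
    """Return True if the box type (and version, when listed) is in the table."""
    if box_type not in COLLECTION_BOX_VERSIONS:
        return False
    allowed = COLLECTION_BOX_VERSIONS[box_type]
    return allowed is None or version in allowed

# Illustrative queries.
print(in_baseline_collection_set("iloc", 2))
print(in_baseline_collection_set("infe", 1))
```

A False result only means the box or version is outside the minimum set listed here; a reader may still choose to handle it as defined in HEIF.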

9.13. Sequence Elements

An AVIF file reader must be able to recognize the following boxes. Any box field or feature not explicitly limited by this specification should be handled as defined in ISO 14496‑12 and ISO 23008-12.

Box hierarchy       Version   Box description
ftyp                -         file type
moov                -         movie container box
  trak              -         track container box
    tkhd            0,1       track header
    tref            -         track references
    mdia            -         media information container
      mdhd          0,1       media information header
      hdlr          0         media handler type
      minf          -         media information box
        vmhd                  video media header
        dinf        -         data information container
          dref      0         data references for track sources
        stbl        -         sample table mapping container
          stts      0         sample to decode time table
          stsd                sample description (visual sample entry box subclass)
          stsz      0         sample size table
          stsc      0         sample to chunk table
          stco      0         chunk offset table
mdat                -         media data

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[AV1]
AV1 Bitstream & Decoding Process Specification. LS. URL: http://av1-spec.argondesign.com/av1-spec/av1-spec.html
[AV1-ISOBMFF]
AV1 Codec ISO Media File Format Binding. LS. URL: https://aomediacodec.github.io/av1-isobmff/
[HEIF]
Information technology — High efficiency coding and media delivery in heterogeneous environments — Part 12: Image File Format. International Standard. URL: https://www.iso.org/standard/66067.html
[ISOBMFF]
Information technology — Coding of audio-visual objects — Part 12: ISO Base Media File Format. December 2015. International Standard. URL: http://standards.iso.org/ittf/PubliclyAvailableStandards/c068960_ISO_IEC_14496-12_2015.zip
[MIAF]
Information technology -- Multimedia application format (MPEG-A) -- Part 22: Multi-Image Application Format (MiAF). Enquiry. URL: https://www.iso.org/standard/74417.html
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119