1. General
AVIF is a file format wrapping compressed still images based on the Alliance for Open Media AV1 intra-frame encoding toolkit. AVIF supports High Dynamic Range (HDR) and wide color gamut (WCG) images as well as standard dynamic range (SDR). An AVIF file shall conform to [HEIF] for both Image Collections and Image Sequences. Specifically, an AVIF file shall comply with the requirements of Clause 4 of the [ISOBMFF] specification and, where applicable, the recommendations in Annex I: Guidelines On Defining New Image Formats and Brands in [HEIF].
2. Terms and Definitions
For the purposes of this document, the terms, definitions, and abbreviated terms specified in [ISOBMFF] and [HEIF] apply.
Some important definitions used by this document are paraphrased here for information.
2.1. Alpha Image
A specific type of Auxiliary Image that may be used to convey information representing the opacity of associated Master Images.
2.2. Auxiliary Image
An image that is not intended to be displayed but provides supplemental information for associated Master Images.
2.3. Cover Image
A Master Image that may be used to represent the file contents. An example of this is a single image used to represent an animation before the animation sequence is activated.
2.4. Image Collection
One or more Master Images stored as items in a single file with no defined order or timing information. Within a collection, image samples may share properties and metadata.
2.5. Image Properties
This is a class of non-media data. A property may be a descriptive image attribute or decoder configuration data. Properties are primarily for consumption by the decoding agent. This information may include:
- decoder specific configuration and initialization values
- image width and height
- pixel attributes
- color space
- content light level
- mastering display color volume
2.6. Image Sequence
A sequence of Master Images stored as a track for which information is provided that defines a sequential ordering and temporal information indicating suggested playback timing. An agent decoding and presenting an AVIF file may choose to render an Image Sequence as an animation.
2.7. Master Image
An image that is not a thumbnail or auxiliary image. For the purpose of this specification, such an image is encoded using AV1 intra-frame tools. This type of image is the primary displayable payload of an AVIF file. A Master Image may be included in both an Image Collection and an Image Sequence within the same file, along with being referenced as the Cover Image.
2.8. Metadata
Metadata conveys image attributes that are not used to decode or reconstruct an image. This data is considered to be non-essential and non-normative. Examples of this include EXIF, XMP, and MPEG-7.
2.9. Thumbnail Image
This is a non-master image that may be used to represent one or more Master Images found in an AVIF file. It is typically of a smaller scale than the Master Images. Its compression format may be different than the one used by the Master Images.
3. Object Model and Structure
An AVIF file shall be a conformant version of an [HEIF] file. This is to allow for the deployment of general libraries that may be used to create and parse HEIF-based image files wrapping different coding methods for the actual image content. This should be similar to ISO-BMFF usage in the video domain.
The AVIF file format is built on the box-structured media interchange format introduced by the ISO Base Media File Format ([ISOBMFF]). The format specified by AVIF defines the use of a subset of box structures introduced in ISOBMFF. Where the necessary structures do not exist in ISOBMFF, structures defined as part of the High Efficiency Image File Format ([HEIF]: ISO/IEC 23008-12) that are codec neutral and can be applied in a generic manner are used. An AVIF version 1.0 file shall comply with the requirements of Clause 4 of the [ISOBMFF] specification and, where applicable, shall follow the recommendations in Annex I: Guidelines On Defining New Image Formats and Brands in the MPEG HEIF specification.
4. Image Data
Image data of type "av1i" shall be limited to AV1 intra frames. This applies to both Image Collections and Image Sequences. Each image shall conform to the requirements of an Intra Frame as defined by the AV1 Bitstream & Decoding Process Specification [AV1]. No inter-frame encoding shall be permitted between images.
5. AVIF Image Collection
The image data of type "av1i" shall be used for an image collection item coded with AV1.
The image item data shall be structured as defined in the AV1 Sample Format section of the AV1 Codec ISO Media File Format Binding [AV1-ISOBMFF] specification.
An AVIF file containing an Image Collection shall list the "mif1" structural brand as one of the entries in the compatible_brands array and conform to clauses 6 and 10.2 of [HEIF].
AVIF does not support coding dependencies between images beyond shared image properties: there are no inter-frame decoding dependencies between image samples.
5.1. Image Item Properties
In addition to the Image Properties defined in [HEIF], AVIF image collections may also use the Content Light Level and Mastering Display Colour Volume image properties introduced in [MIAF].
5.1.1. AV1 Configuration Item Property
Each image item of type "av1i" shall have an associated image property of type "av1C" that is identical to the AV1CodecConfigurationBox as defined in [AV1-ISOBMFF]. Such a property shall be marked as essential.
Each "av1i" image item shall be associated with an AV1 Configuration Item Property.
6. AVIF Image Sequence
6.1. AVIF Image Tracks
The sample entry of type "av1i" shall be used for an image sequence track coded with AV1.
Image samples of type "av1i" (AV1 image) are based on AV1 intra frames, specifically the AV1 intra-frame encoding tools as defined by the AV1 Bitstream & Decoding Process Specification [AV1]; no inter-frame encoding shall be permitted between images. The image track is structured as defined by the AV1 Codec ISO Media File Format Binding [AV1-ISOBMFF] specification, constrained to the specific use case where all frames are key frames.
An AVIF file containing an Image Sequence shall list the "msf1" brand as one of the entries in the compatible_brands array and conform to clauses 7 and 10.3 of [HEIF].
6.2. AV1 Sample Entry
All SampleDescriptionBoxes associated with AV1 image tracks shall use the format and procedures as defined in [AV1-ISOBMFF]. The AV1SampleEntry extends the VisualSampleEntry and includes a mandatory AV1CodecConfigurationBox. In this case, the box is identified by the "av1i" type.
Sample Entry Type: av1i
Container: Sample Description Box ('stsd')
Mandatory: Yes
Quantity: One or more

class AV1SampleEntry extends VisualSampleEntry('av1i') {
  AV1CodecConfigurationBox config;
}
7. Alpha Images
An Alpha Image is a specific type of auxiliary image that is used to carry per-pixel opacity information for one or more Master Images.
A URN will be defined to identify AVIF alpha auxiliary images in both collections and sequences. For the purposes of this draft, the placeholder urn:aom:avif:alpha will be used whenever the "auxC" image item property is required.
8. Brands
If the major_brand field is set to "av1i" then the minor_version shall be set to 0.
If the major_brand field is not set to "av1i", then the brand "av1i" shall appear in the compatible_brands array.
The compatible_brands array shall contain "mif1" if the file contains an Image Collection.
The compatible_brands array shall contain "msf1" if the file contains an Image Sequence.
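The brand rules above can be sketched as a small check. This is only an illustration: it assumes the payload of the ftyp box (major_brand, minor_version, compatible_brands) has already been extracted, and the function names parse_ftyp and check_avif_brands are hypothetical, not defined by this specification.

```python
import struct

def parse_ftyp(payload):
    """Split an ftyp payload into (major_brand, minor_version, compatible_brands)."""
    if len(payload) < 8 or (len(payload) - 8) % 4 != 0:
        raise ValueError("malformed ftyp payload")
    major = payload[0:4].decode("ascii")
    (minor,) = struct.unpack(">I", payload[4:8])  # 32-bit big-endian
    compat = [payload[i:i + 4].decode("ascii") for i in range(8, len(payload), 4)]
    return major, minor, compat

def check_avif_brands(major, minor, compat, has_collection, has_sequence):
    """Return a list of violated brand rules (empty when all rules hold)."""
    problems = []
    if major == "av1i":
        if minor != 0:
            problems.append("minor_version must be 0 when major_brand is av1i")
    elif "av1i" not in compat:
        problems.append("av1i must appear in compatible_brands")
    if has_collection and "mif1" not in compat:
        problems.append("mif1 required for an Image Collection")
    if has_sequence and "msf1" not in compat:
        problems.append("msf1 required for an Image Sequence")
    return problems
```

A writer could run such a check before emitting the file; a reader could use it to decide whether to treat a file as AVIF at all.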
9. AVIF Baseline Profile
An AVIF Baseline file should be a conformant, simplified version of an [HEIF] file. This is to allow for the deployment of general libraries that may be used to create and parse HEIF-based image files wrapping different coding methods for the actual image content. This should be similar to ISO-BMFF usage in the video domain. The following clauses define limitations on the general format of an HEIF file; together they define the minimum requirements for an AVIF file reader. An implementation of an AVIF file reader may optionally choose to recognise any structure as defined in HEIF.
9.1. Image Storage
All of the constituent elements, including image samples, shall be contained in a single file. All media data locations, regardless of construction method, shall resolve to an offset within an AVIF file.
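The single-file requirement can be checked once media data locations are known. The sketch below assumes the extents have already been resolved (for example from the ItemLocationBox or the chunk offset table) into absolute (offset, length) pairs; the function name extents_within_file is hypothetical.

```python
def extents_within_file(extents, file_size):
    """Return True if every (offset, length) extent lies inside the file.

    extents: iterable of (offset, length) pairs, offsets measured from the
    start of the file (an assumption of this sketch).
    """
    for offset, length in extents:
        if offset < 0 or length < 0 or offset + length > file_size:
            return False
    return True
```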
9.2. Thumbnails
Only image types "av1i" and "jpeg" are permitted for thumbnails.
9.3. Cover Image
A PrimaryItemBox is optional. If a Cover Image is not explicitly indicated by a PrimaryItemBox, the Cover Image shall be assumed to be the first master image entry in the ItemLocationBox or the first entry in the master track.
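For the item-based case, the fallback rule can be sketched as follows. The function name select_cover_item and its inputs are hypothetical: primary_item_id stands for the item ID read from a PrimaryItemBox (None when the box is absent), and master_item_ids for the master image item IDs in ItemLocationBox order.

```python
def select_cover_item(primary_item_id, master_item_ids):
    """Pick the Cover Image item ID, or None if the file has no master items."""
    if primary_item_id is not None:
        return primary_item_id      # explicitly indicated by PrimaryItemBox
    if master_item_ids:
        return master_item_ids[0]   # first master image entry in ItemLocationBox
    return None
```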
9.4. Auxiliary Images
AVIF allows only one type of Auxiliary Image: an Alpha Image.
9.5. Alpha Image
After applying any transformational Image Properties, the alpha image shall have the same dimensional attributes as the largest component plane in the Master Image: width, height, and pixel aspect ratio. Furthermore, the pixels of the Alpha Image shall overlay the pixels of the largest component plane of any linked Master Image exactly. For example, for YUV 4:2:x, this would be the Y component plane. The decoded value of an alpha pixel shall be a normalized unsigned integer of at least 8 bits representing a range between 0.0 and 1.0. An alpha value of zero shall map to 0.0 and the maximum value representable by the alpha channel shall map to 1.0.
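The mapping of decoded alpha values can be illustrated with a short sketch. The text above fixes only the two endpoints (0 maps to 0.0, the maximum code value maps to 1.0); the linear scaling in between is an assumption of this example, and the function name normalize_alpha is hypothetical.

```python
def normalize_alpha(value, bit_depth=8):
    """Map an unsigned alpha code value to the range 0.0 to 1.0."""
    if bit_depth < 8:
        raise ValueError("alpha must be at least 8 bits")
    max_value = (1 << bit_depth) - 1   # e.g. 255 for 8 bits, 1023 for 10 bits
    if not 0 <= value <= max_value:
        raise ValueError("alpha value out of range for bit depth")
    return value / max_value           # linear scaling (assumed)
```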
9.6. Sample Groups
Sample groups may be ignored by readers.
9.7. Hidden Images
Hidden images may be ignored by readers.
9.8. Pre-Derived Coded Images
Pre-derived coded images may be ignored by readers.
9.9. Multi-Layer Images
Multi-layer images may be ignored by readers. This does not prevent the underlying image format from encoding multiple layers, but such layers are internal to the encoded image, managed directly by the decoder, and not visible to the AVIF reader.
9.10. Derived Images
The derivation of tiled images may be supported by Baseline readers which also support rendering images larger than permitted by AV1 profiles. No other derived image format need be supported.
9.11. Metadata
Metadata conveys image attributes that are not used to decode or reconstruct an image. This data is considered to be non-essential and non-normative. Examples of this include EXIF, XMP, and MPEG-7. An AVIF reader is not required to extract metadata from Informational Metadata boxes. This includes the case of image editing instructions conveyed in an XMP record. Essential information shall be carried in the image media directly or be conveyed as Image Properties.
9.12. Collection Elements
An AVIF file reader must be able to recognize the following boxes. Any box field or feature not explicitly limited by this specification should be handled as defined in ISO 14496‑12 and ISO 23008-12.
box hierarchy | version | box description
ftyp          | -       | file type
meta          | 0       | metadata container box
  hdlr        | 0       | handler type definition
  pitm        | 0,1     | primary item reference
  iloc        | 0,1,2   | item location table
  iinf        | 0,1     | item information table
    infe      | 2,3     | item information table entry
  iprp        | -       | item properties container box
    ipco      | 0       | item property definitions
    ipma      | 0,1     | item property associations
  idat        | -       | item data box
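Recognizing these boxes starts with walking the box structure. The following is a minimal sketch of an ISOBMFF box iterator, assuming the whole file is held in memory as bytes; the names iter_boxes and meta_children are hypothetical. It relies on the general box header layout of [ISOBMFF]: a 32-bit big-endian size, a four-character type, an optional 64-bit size when the 32-bit size equals 1, and a size of 0 meaning the box extends to the end of the enclosing range.

```python
import struct

def iter_boxes(data, start=0, end=None):
    """Yield (box_type, payload_start, payload_end) for boxes in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        (size,) = struct.unpack(">I", data[pos:pos + 4])
        box_type = data[pos + 4:pos + 8].decode("latin-1")
        header = 8
        if size == 1:                      # 64-bit largesize follows the type
            if pos + 16 > end:
                raise ValueError("truncated largesize")
            (size,) = struct.unpack(">Q", data[pos + 8:pos + 16])
            header = 16
        elif size == 0:                    # box runs to the end of the range
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError("box size out of range")
        yield box_type, pos + header, pos + size
        pos += size

def meta_children(data):
    """Return (version, [child box types]) of the first top-level meta box."""
    for box_type, s, e in iter_boxes(data):
        if box_type == "meta":
            version = data[s]              # meta is a FullBox: version + 24-bit flags
            return version, [t for t, _, _ in iter_boxes(data, s + 4, e)]
    return None
```

A reader would typically descend further in the same way, for example into iprp to reach ipco and ipma, and skip any box type it does not recognize.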
9.13. Sequence Elements
An AVIF file reader must be able to recognize the following boxes. Any box field or feature not explicitly limited by this specification should be handled as defined in ISO 14496‑12 and ISO 23008-12.
box hierarchy   | version | box description
ftyp            | -       | file type
moov            | -       | movie container box
  trak          | -       | track container box
    tkhd        | 0,1     | track header
    tref        | -       | track references
    mdia        | -       | media information container
      mdhd      | 0,1     | media information header
      hdlr      | 0       | media handler type
      minf      | -       | media information box
        vmhd    |         | video media header
        dinf    | -       | data information container
          dref  | 0       | data references for track sources
        stbl    | -       | sample table mapping container
          stts  | 0       | sample to decode time table
          stsd  |         | sample description (visual sample entry box subclass)
          stsz  | 0       | sample size table
          stsc  | 0       | sample to chunk table
          stco  | 0       | chunk offset table
mdat            | -       | media data
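A reader might encode the table above as a lookup when deciding whether a parsed box falls within the Baseline requirements. This is a sketch only: the names SEQUENCE_BOXES and is_listed are hypothetical, and because the table gives no version for vmhd and stsd, the example leaves those two unconstrained (an assumption, not a statement of this specification).

```python
# Box types from the Sequence Elements table. A set holds the listed
# versions; None means no version is listed ("-" rows, plus vmhd and stsd,
# whose version cells are empty in the table).
SEQUENCE_BOXES = {
    "ftyp": None, "moov": None, "trak": None, "tkhd": {0, 1}, "tref": None,
    "mdia": None, "mdhd": {0, 1}, "hdlr": {0}, "minf": None, "vmhd": None,
    "dinf": None, "dref": {0}, "stbl": None, "stts": {0}, "stsd": None,
    "stsz": {0}, "stsc": {0}, "stco": {0}, "mdat": None,
}

def is_listed(box_type, version=None):
    """True if the table lists box_type and, where versions are given, this version."""
    if box_type not in SEQUENCE_BOXES:
        return False
    allowed = SEQUENCE_BOXES[box_type]
    return allowed is None or version in allowed
```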