This is an unofficial draft spec, not formally endorsed by the WHATWG. It is suitable only for reviewing the details of the proposed element.
- Introduction
- Implementation Examples
  - Sample Markup Pattern
  - Functional Polyfill
- Example Use Cases
  - Flexible Layouts
  - High-Resolution Displays
- Requirements
- Prior Discussion
Our goal is a markup-based means of delivering alternate image sources based on device capabilities, to prevent wasted bandwidth and optimize display for both screen and print.
The idea is to use the <video> tag’s markup pattern as inspiration: it is specced to allow media queries in attributes on its <source> elements, and it reliably displays the markup inside the tag in any browser that doesn’t recognize it. Through use of media attributes we would be able not only to reduce wasteful image requests for users with smaller displays, but also to tailor our images’ resolutions for users with high-res displays or for print.
Much of the surrounding discussion has taken place publicly, in the W3C’s Responsive Images Community Group.
Any combination of existing media queries can be used to determine the appropriate picture source, through a media attribute on <source> elements. This is identical to the specced behavior of media attributes on the <video> tag’s <source> elements, as outlined in the HTML specification. Any implementation of the <picture> tag should allow for the inclusion of fallback markup that is completely ignored by any UA that supports <picture>, and is only displayed in browsers that do not recognize the new tag. Note that older browsers can be polyfilled (see section 3.2) with behavior similar to a native implementation.
<picture alt="The alt attribute’s content should accurately describe the image represented by all sources, though cropping and zooming of sources may differ.">
  <!-- Matches by default: -->
  <source src="mobile.jpg" />
  <!-- Overrides the previous source for windows 600px wide or wider: -->
  <source src="medium.jpg" media="(min-width: 600px)" />
  <!-- Overrides the previous source for windows 900px wide or wider: -->
  <source src="fullsize.jpg" media="(min-width: 900px)" />
  <!-- Fallback content, only displayed in the event the <picture> tag is unsupported by the browser: -->
  <img src="mobile.jpg" />
</picture>
Scott Jehl has put together a JavaScript polyfill that could be used to bring similar behavior to older browsers should <picture> see widespread adoption. As the polyfill is fully dependent on JavaScript, it differs from the behavior of a native implementation in that fallback content is also displayed in the event that JavaScript is unavailable. This ensures a predictable fallback regardless of the presence of JavaScript in older browsers, though a native implementation would have no such dependency on scripting.
Serving full-bleed images within a flexible layout, or a layout dictated by media queries, requires delivering a source image at the largest necessary intrinsic size and scaling it down through CSS. On smaller displays such as phones and tablets, where bandwidth can be at a premium, this means an exceptionally wasteful request.
While there are currently “responsive images” solutions that deliver smaller images by default and conditionally load a larger image above a certain window size, all of these involve a redundant request and rely fully on JavaScript. In many modern browsers, an image’s src will be prefetched before any logic that swaps the image can run, so large displays end up requesting both images. This is further detailed in this post.
<picture alt="Image of a polar bear blinking during a snowstorm.">
  <!-- Matches by default: -->
  <source src="mobile.jpg" />
  <source src="medium.jpg" media="(min-width: 600px)" />
  <source src="fullsize.jpg" media="(min-width: 900px)" />
  <img src="mobile.jpg" />
</picture>
Assuming a 960px wide window at the time the page is requested and the sample markup pattern in section 3.1, the UA should make a single request for “fullsize.jpg”. Any window/screen smaller than 600px is served “mobile.jpg”, which, as a completely alternate source, could be cropped as well as resized in order to preserve the focus of the image at smaller sizes.
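The selection described above can be sketched in JavaScript. This is only an illustration under stated assumptions, not the proposal's normative algorithm: it assumes sources are evaluated in document order with the last matching source winning (as the "overrides the previous source" comments in section 3.1 suggest), it writes the queries with standard parenthesized syntax, and it understands only `(min-width: Npx)` queries. The names `selectSource` and `matchesMinWidth` are hypothetical.

```javascript
// Hypothetical sketch: pick a <picture> source for a given viewport width.
// Assumes document order, last matching source wins, and only
// "(min-width: Npx)" queries (a real UA would use its full media query engine).
const sources = [
  { src: "mobile.jpg" }, // no media attribute: matches by default
  { src: "medium.jpg", media: "(min-width: 600px)" },
  { src: "fullsize.jpg", media: "(min-width: 900px)" },
];

function matchesMinWidth(media, viewportWidth) {
  if (media === undefined) return true; // a source without media always matches
  const m = /^\(\s*min-width:\s*(\d+)px\s*\)$/.exec(media);
  if (m === null) return false; // unsupported query: treat as not matching
  return viewportWidth >= Number(m[1]);
}

function selectSource(sourceList, viewportWidth) {
  let chosen = null;
  for (const source of sourceList) {
    if (matchesMinWidth(source.media, viewportWidth)) {
      chosen = source.src; // later matches override earlier ones
    }
  }
  return chosen;
}

console.log(selectSource(sources, 960)); // fullsize.jpg
console.log(selectSource(sources, 480)); // mobile.jpg
```

Only the chosen file would be requested, which is the point of the pattern: the 960px window never asks for “mobile.jpg” or “medium.jpg”.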
High-resolution screens such as Apple’s Retina display will require high-resolution images, leaving us with a situation similar to the above: either serving larger, high-resolution images to displays that can’t take advantage of them, or forcing high-density displays to first download a low-resolution image and then replace it with a high-resolution image. The latter (the approach currently used on Apple’s website) is far from ideal, and the former is so fraught with concerns that it has recently been addressed in such mainstream publications as the New York Times.
<picture alt="Hero image for new high-resolution device, containing a cringe-worthy portmanteau.">
  <!-- Matches by default: -->
  <source src="standard-res.jpg" />
  <!-- "[-webkit-]" denotes an optional vendor prefix; the brackets are not literal syntax. -->
  <source src="high-res.jpg" media="([-webkit-]min-device-pixel-ratio: 2)" />
  <img src="standard-res.jpg" />
</picture>
In this instance, the standard-resolution image is served by default and as a fallback in cases where <picture> is unsupported. The high-resolution image is served instead only in cases where the pixel-ratio media query matches.
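A script that approximates this behavior could hand the matching over to the browser's own media query engine rather than parsing queries itself. The sketch below assumes that approach and is not taken from any existing polyfill: `chooseSource` and `fakeMatchMedia` are hypothetical names, the "last match wins" ordering is an assumption, and the vendor-prefixed query is just one concrete spelling of the bracketed notation used above. In a browser, the real `window.matchMedia` would take the place of the stub.

```javascript
// Hypothetical sketch: choose a source by delegating media query matching
// to a matchMedia-style function. In a browser, pass
// (query) => window.matchMedia(query); here a stub simulates the display.
function chooseSource(sourceList, matchMedia) {
  let chosen = null;
  for (const source of sourceList) {
    // A source without a media attribute matches unconditionally.
    if (!source.media || matchMedia(source.media).matches) {
      chosen = source.src; // last matching source wins (assumed ordering rule)
    }
  }
  return chosen;
}

const densitySources = [
  { src: "standard-res.jpg" },
  { src: "high-res.jpg", media: "(-webkit-min-device-pixel-ratio: 2)" },
];

// Stub standing in for window.matchMedia on a display with the given ratio.
// It only understands min-device-pixel-ratio queries.
function fakeMatchMedia(devicePixelRatio) {
  return (query) => {
    const m = /min-device-pixel-ratio:\s*([\d.]+)/.exec(query);
    return { matches: m !== null && devicePixelRatio >= Number(m[1]) };
  };
}

console.log(chooseSource(densitySources, fakeMatchMedia(2))); // high-res.jpg
console.log(chooseSource(densitySources, fakeMatchMedia(1))); // standard-res.jpg
```

Injecting the matcher keeps the selection logic independent of any particular environment, which also makes it straightforward to exercise outside a browser.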
A conforming user agent must meet the following requirements:
- The appropriate asset MUST be fetched by way of a single request. A change in window size causing the media attribute to match an alternate source SHOULD trigger a request for said source (to be retrieved from the browser cache, if possible).
- As with the <video> and <audio> tags, this solution MUST NOT require any client-side scripting, server-side technologies, or headers to reliably deliver content tailored for the end user’s context.
- Similar to the <video> tag, fallback markup MUST be rendered in any browser that does not recognize the <picture> element. The example in 3.1 uses the “mobile”-sized image as the fallback content, which is the recommended approach: barring the use of a polyfill, the smaller/low-res image should be provided as a fallback to prevent incurring a costly download in contexts that may see no benefit.
- The specification MUST provide at least the same level of accessibility as <img>, with an alt attribute readily accessible to assistive technology.
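The first requirement (a single request up front, and a new request only when a resize changes which source matches) can be sketched as a small piece of state. This is an illustration under assumptions rather than a description of any implementation: `createPicture` and `requestFn` are hypothetical names, `minWidth` is a simplified stand-in for a parsed `(min-width: Npx)` query, and the last matching source is assumed to win.

```javascript
// Hypothetical sketch of the first requirement: re-evaluate sources when the
// window width changes, and request a source only when the match changes.
// `minWidth` is a simplified stand-in for a parsed (min-width: Npx) query.
function createPicture(sourceList, requestFn) {
  let current = null;
  return {
    // Call on initial load and again on every resize.
    update(viewportWidth) {
      let next = null;
      for (const source of sourceList) {
        if (viewportWidth >= (source.minWidth || 0)) next = source.src;
      }
      if (next !== current) {
        current = next;
        requestFn(next); // a real UA may satisfy this from its cache
      }
      return current;
    },
  };
}

const requested = [];
const picture = createPicture(
  [
    { src: "mobile.jpg" },
    { src: "medium.jpg", minWidth: 600 },
    { src: "fullsize.jpg", minWidth: 900 },
  ],
  (src) => requested.push(src)
);

picture.update(960); // initial load: one request, for fullsize.jpg
picture.update(1000); // still matches fullsize.jpg: no new request
picture.update(640); // match changes: medium.jpg is requested
console.log(requested.join(", ")); // fullsize.jpg, medium.jpg
```

Note that “mobile.jpg” is never requested in this run, even though it matches by default.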
How we arrived at <picture> most recently: https://etherpad.mozilla.org/responsive-assets
Common questions and concerns: http://www.w3.org/community/respimg/common-questions-and-concerns/
Prior discussion on W3 mailing lists:
- http://www.w3.org/Search/Mail/Public/search?type-index=public-html&index-type=t&keywords=picture+element
- http://lists.w3.org/Archives/Public/public-html/2007Jun/1057.html
- http://lists.w3.org/Archives/Public/public-html/2011May/0386.html
- http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2012-February/ (as “Responsive Images”)
While the above are clever solutions to the resolution problem for sure, I wonder if they aren’t a bit too specific. This would introduce a redundant means of determining information about a user’s display, and by side-stepping media queries we’ve now introduced a “fork” of sorts. While the markup for <picture> may be a little verbose (albeit in a very familiar way), one of the major considerations was “future friendliness”: as media queries are expanded over time, we’d find ourselves with an increasingly useful element. We could use it now to determine image size and density, and later for more: should the furtive whispers of one day seeing bandwidth as a media query come true, we could selectively serve high-resolution images to high-density displays only when on a 4G-or-better connection. We could mix and match media queries, whereas this entirely new system would leave us bound to DPI(/DPX) unless it were expanded in parallel. As we already have a flexible, reliable system in place to solve the problem of resolution (and more) on the client side, it stands to reason that we should make use of it.
It also bears repeating that we arrived at this markup pattern not in a brainstorming vacuum, but after months of discussion involving standards bodies and browser vendors. We have been told several times that modifying <img> and bypassing prefetching may not be an option for several vendors. By not bypassing prefetching, of course, we would have gone to all this effort to introduce an element as broken as responsive images scripts are today. Most of this discussion is available throughout the Community Group and the original Etherpad, and has been mentioned enough to warrant a “frequently asked questions” page. I urge you to read through the history and ask questions, and believe me when I say we’re not spitballing here. This is the result of nearly a year of research and conversation. I absolutely welcome new feedback and ideas, and yours is interesting for certain, but I wouldn’t want to derail discussions for the sake of a subject we’ve already established as a non-starter.
With regard to bandwidth not being an issue in ten years: well, think of the difference in typical internet connections ten years ago versus today. It’s night and day, for certain: from a screeching 28.8kbps to blazing fast internet in the air around us at all times. And yet here we are, concerned about pictures, because our bandwidth need has always expanded to use the bandwidth we have available.
Though, you could make a case that we have plenty of bandwidth to go around, even today, even factoring in the advent of “Retina” images. But that’s here, on 4G connections, using expensive and feature-laden devices. This is to say absolutely nothing of the millions of people worldwide accessing the internet on mobile connections alone, paying for each kilobyte consumed, with devices only slightly better than feature phones. Who are we to say “we would like sharper looking images right now, so we’re having them. Your needs might sort themselves out eventually”? I am not willing to saddle users in underdeveloped countries with additional costs (not bandwidth, but actual economic cost) so that more privileged users can have a slightly better experience. I am not content to put “best viewed eventually” in the footer of the websites I build, and I am loath to so much as think “best viewed in the first world.”
What we have is an issue of tremendous cost to users, and that is why we do this for a living: to solve problems for our users.
Is <picture> ideal? Absolutely not, no. And it is verbose, yes, but it’s also the result of countless discussions and consensus. There will always be some aspect of any proposal that can be improved upon, and I urge everyone to look at our proposal as a “first step.” Otherwise, it’s easy for us to riff on each other’s ideas in circles forever: we’re developers, and searching for better solutions is what we do. But at a certain point we’re doing so at the cost of failing users today.
I’m happy to answer any questions or point anyone on this thread in the direction of more information, but I’ll spare you all the tirades from here on. I may archive these comments on the CG and remove them here, unless anyone protests.