
Saturday 11 December 2010

Large, readable Newspaper & Magazine images with small file-sizes

A few months back we built a "prototype" iPhone app using only web technologies. The prototype has since gone on to be installed for several customers, hence the slightly sarcastic quotes. I think this is cool really - it's not a huge app, and even though it was built on a short time-scale with some new technologies, I think we did a reasonable job under the covers.

There are several interesting aspects of this app that I should write about separately, including things like making a web-app feel like a native app with a nice app icon on the home screen. This post, however, is about some ideas we had for minimizing the download requirements for newspaper page images, whilst still allowing them to be zoomed in enough to be very readable.

The page images start life as pdfs. For various reasons that are not important to this post we had to convert them to images to use in our app, with the following constraints:
  • the photos, cartoons, etc., (full colour) must be retained in sufficient quality to be pleasing to the eye.
  • the text (black on white) on the pages must be retained in sufficient quality to be clearly readable when zoomed on an iPhone.
  • loading and rendering the image should be as speedy as possible.

Here's a screenshot of one page of a pdf I made for the purposes of this post, using the excellent fivefilters.org service (free). The pdf for this page is 813kb.


To get some ballpark numbers for what might be achievable, we started out simply converting the pdf to a jpeg with imagemagick. Imagemagick is a great tool for all kinds of image manipulation, definitely worth checking out. Converting a pdf to jpeg is extremely easy - just a matter of invoking "convert":

convert -density 300 -quality 7 test.pdf test.jpg

That produces a relatively small filesize (215kb for the example page), with photos that look ... mm, ok-ish, to the not-overly critical eye, but of course jpeg is less than ideal for images of text. You end up with an awful lot of artifacts (jpeg'ing) in and around the text - an effect that is anything but easy on the eye.


Since it's easy to experiment with, we tried a variety of different jpeg quality settings, largely to satisfy ourselves that a single jpeg couldn't really meet all of our requirements - either you get a large file-size with readable text, or a small file-size with horrible artifacts all over the textual parts of the image.
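If you want to repeat that experiment, a loop along these lines would do it. This is only a sketch: sweep_quality is a name I've made up for illustration, the list of quality values is arbitrary, and it assumes imagemagick's convert and a test.pdf in the current directory.

```shell
#!/bin/sh
# sweep_quality PDF: render the first page of PDF at several JPEG quality
# settings and print the resulting file sizes. Hypothetical helper; the list
# of quality values is an arbitrary choice for illustration.
sweep_quality() {
    pdf=$1
    [ -f "$pdf" ] || { echo "no such file: $pdf" >&2; return 1; }
    for q in 7 20 40 60 80; do
        out="test-q$q.jpg"
        # [0] selects the first page so a single output file is written.
        convert -density 300 -quality "$q" "${pdf}[0]" "$out" || return 1
        printf 'quality %2d: %s bytes\n' "$q" "$(wc -c < "$out")"
    done
}

# Only run when ImageMagick and the sample PDF are actually present.
if command -v convert >/dev/null 2>&1 && [ -f test.pdf ]; then
    sweep_quality test.pdf
fi
```

Opening the resulting files side by side and zooming in on the text makes the trade-off between size and readability easy to see.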

We also wondered if we could just do some tricks like using progressive jpegs to make the image appear to load quickly, and resolve to higher resolution versions as the download progressed. Sadly the iPhone at the time didn't support progressive jpegs (not sure if it does by now).

Next we looked at other image formats. Portable Network Graphics (png) images work nicely for photos and text - you don't get any nasty jpeg'ing, but inevitably the file-size is significantly larger than jpeg. On my sample page it's a whopping 9.1MB (but it does look very nice).

Clearly what we really want is an image-format that combines the best of both using some kind of bi-level compression: jpeg-like compression for the colour-rich areas like photos, and png-like quality for the textual areas.

We did a bit of noodling around the web, looking for other image formats that might help out. The most likely candidate we found was JBIG2, which sounds like a really excellent format (check out that link to Wikipedia), but sadly there's very little support for it - none of the web-browsers supported it at the time (and probably still don't, sorry I'm too lazy to check again now).

OK, satisfied that we'd done our due diligence and that there wasn't a more easily available solution, we started pondering options for solving the problem ourselves (hey, we're programmers, we gotta have some fun, right?).

We had lots of nice ideas, for example: dissecting the image into regions containing only photos and only text, then stitching them back together at the client - like a google maps tile-based solution. This sounds like a lot of fun - an excellent excuse to play with image manipulation, edge detection and what-have-you, and if I get some time I intend to have a poke around and see how much better (or worse) it is than the solution we eventually came up with.

Somehow - I wish I'd written this back then, because I can't recall how the idea came about - we struck upon another, simpler, approach. A bit more experimenting with imagemagick and we soon had a working solution that used only images and minimal coding, and resulted in readable text and ok-quality photos at approximately the same file-size as the original pdf. We were pretty pleased with that I can tell you, given that the smallest barely-readable jpeg-only image was about 3 or 4 times larger than the original pdf.

Here's what we did: we realized early on that there were two completely separable components to our images, the colour photos and the black & white text, and we had been trying to separate them on the x-y plane of the image (if you know what I mean). Instead we looked at separating the components on the z-axis - i.e. into two layers, one containing the coloured components, and the other containing the black & white.

This turns out to be a doddle with a few imagemagick incantations (I didn't make that up to sound funny, it's what they are called!). First we extracted an image of the black & white parts, at the same resolution as the original. This will be our text component, so we want to save it as a png to keep the image-quality high and artifact free. Another reason for using png here is that we can set the white (background) part of the image to be transparent:

convert -density 300 test.pdf -threshold 5% -depth 8 -colors 16 -transparent white test.png

The 5% threshold cuts out everything but the very darkest of colours (very near black), and we're reducing the colour depth to make the file-size as small as possible. This gives me a 117kb png for the example page. Next we did the opposite - set the black to transparent to create an image that only contains the coloured components of the image. Again we're using png in order to take advantage of its transparency capabilities:

convert -density 300 test.pdf -transparent black back.png

Finally we flatten the image from png down to a relatively low quality jpg. Note that we can make the quality quite a bit lower than we did earlier, because we don't have to keep the text readable (it isn't even present in this image, we cut it out by filtering black out in the previous step).

convert back.png -background white -flatten -quality 9 back.jpg

Great, now we have two images: a "background" image which contains all of the coloured parts of the page, and a "foreground" image which contains only parts of the original that were black (surprisingly this almost always seems to be only text and borders of images, there's very little black in any of the photos). My example jpg is 81.9kb.
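To sanity-check the two layers before going anywhere near a browser, they can be composited back together locally. A rough sketch, assuming imagemagick's convert and the back.jpg and test.png files from the commands above; preview_layers and preview.png are names made up for this example.

```shell
#!/bin/sh
# preview_layers BACK FRONT OUT: composite the transparent text layer over
# the colour layer so the combined page can be inspected locally.
# Hypothetical helper; preview.png is an assumed output name.
preview_layers() {
    back=$1
    front=$2
    out=$3
    for f in "$back" "$front"; do
        [ -f "$f" ] || { echo "missing layer: $f" >&2; return 1; }
    done
    # Images are composited in order: the second is drawn over the first.
    convert "$back" "$front" -composite "$out"
}

# Only run when ImageMagick and both layers are actually present.
if command -v convert >/dev/null 2>&1 && [ -f back.jpg ] && [ -f test.png ]; then
    preview_layers back.jpg test.png preview.png
fi
```

Because both layers were rendered from the same pdf at the same density, they should have identical pixel dimensions and line up without any offsets.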

Further optimisation is possible: run image optimisers (e.g. pngcrush) on the resulting images. Pngcrush squeezed 3% out of my foreground png image, bringing it down to 113.7kb. I'm too tired to squash the jpeg now (it's past midnight, sorry).
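For anyone wanting to try this, here is roughly what that step might look like. It's a sketch under a few assumptions: pngcrush's -brute mode for the png, jpegtran as the jpeg squasher (my pick for illustration, not something we actually ran), and a made-up percent_saved helper for reporting the difference.

```shell
#!/bin/sh
# percent_saved BEFORE AFTER: print the size reduction as a percentage,
# to one decimal place. Hypothetical helper, not part of any tool.
percent_saved() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# Losslessly recompress the text layer, if pngcrush is installed.
if command -v pngcrush >/dev/null 2>&1 && [ -f test.png ]; then
    pngcrush -brute test.png test-crushed.png
    echo "png saved: $(percent_saved "$(wc -c < test.png)" "$(wc -c < test-crushed.png)")%"
fi

# Losslessly optimise the JPEG's entropy coding and drop metadata
# (jpegtran is an assumed choice of tool here).
if command -v jpegtran >/dev/null 2>&1 && [ -f back.jpg ]; then
    jpegtran -optimize -copy none -outfile back-opt.jpg back.jpg
    echo "jpg saved: $(percent_saved "$(wc -c < back.jpg)" "$(wc -c < back-opt.jpg)")%"
fi
```

Both tools are lossless, so the images look exactly the same afterwards; only the file-size changes.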

When we layer those two images on top of each other we get back something that has decent quality photos, high quality text, and comes in at a fraction under 200kb. Layering the two images is very straightforward with some simple html and css.
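I don't have the original markup to hand, so the following is only a sketch of the layering, wrapped in a shell heredoc to stay consistent with the commands above. The layered.html file name and the class names are invented for this example; back.jpg and test.png are the two images produced earlier.

```shell
#!/bin/sh
# Sketch: write a minimal page that stacks the transparent text PNG over the
# low-quality colour JPEG. The output name layered.html and the class names
# are illustrative assumptions; back.jpg and test.png come from earlier steps.
cat > layered.html <<'EOF'
<!DOCTYPE html>
<html>
<head>
<meta name="viewport" content="width=device-width, initial-scale=1">
<style>
  /* The container is the positioning context for the overlay. */
  .page { position: relative; width: 100%; }
  /* Background layer stays in normal flow and sets the container height. */
  .page img.back { display: block; width: 100%; }
  /* Text layer is stretched over the same box; its transparent pixels
     let the colour layer show through. */
  .page img.text { position: absolute; top: 0; left: 0; width: 100%; height: 100%; }
</style>
</head>
<body>
<div class="page">
  <img class="back" src="back.jpg" alt="">
  <img class="text" src="test.png" alt="Newspaper page">
</div>
</body>
</html>
EOF
```

Keeping the background image in normal flow means the container takes its height from the page image, and the absolutely positioned text layer simply covers the same box.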

That was the best solution we came up with in the (very short) time we had to think about this. As it happens we didn't need to use it in the end, but it was a fun challenge to play with, and I had in mind to write something about it and to play with some of the other ideas at some point.

In the interests of full disclosure I have to say that I think we had much better results (in terms of quality) with the real newspaper pdfs than I'm getting with my example. I'm not really sure why that is, although it could be related to how my example pdf was created. Another major difference is that our real pdfs contained Arabic text, which has a lot more fiddly bits in and around the characters, which increases the damaging effect of the jpeg'ing of the text.
