I wish to set aside books with much in the way of text. If you're writing any sort of coherent "story", whether it be a literal laying out of a sequence of events or something else, the words will carry things along.
As usual, I want to construe "story" as generally as possible, to cover Cinderella through an impressionistic take on Monet's garden, and probably more besides.
And so, under consideration is a book of, essentially, just a bunch of pictures, most of them photographs, a book aimed at conveying something.
In my previous remarks I argued, with what I imagine to be a fair degree of success, that you're not going to get much traction if the goal is to convey a sequence of factual (or fact-like) events, such as the story of Cinderella or the Roman Empire. What you're going to be able to accomplish is something a lot more like Keith Smith's composite picture, a whole or a gestalt that is psychologically similar to a photograph in that it contains a collection of visual facts and ideas and relationships, but is ultimately a singular object to which you, the reader, may react in some way.
A visual book does not, in general, relate a sequence of events, or a sequence of logical statements forming an argument. It does not convey names, dates, locations, and similar details. It is nothing more than a complex arrangement of visual details that may add up to... something.
The basic unit of the western codex is the two-page spread. You may elect to put one photo on there, or two, or more. The traditional approach places one photo per spread, and so the basic unit of that book is a single photo.
With a little work, some cueing, you might be able to persuade the reader that the unit is, say, 3 spreads in a row. Perhaps you alternate three color spreads with three black and white ones, or change the page color every three spreads. In this case the reader might be persuaded to flip randomly to a spot, and then find the beginning of the unit from there, more or less consistently.
The unit, therefore, is what I am considering to be the basic lump of material. I divide the book notionally into units, each unit being consumed together, as a whole, perhaps even in-order. Units, however, tend to be consumed more or less randomly. Earlier units will tend to be examined sooner than later ones, because we do tend to leaf through books roughly front-to-back, perhaps with some backing and filling.
You might envision the course through a visual book as, roughly, a series of units each consumed in-order, the units themselves consumed in a zig-zag path that tends front-to-back, but contains gaps and backtracking to one degree or another.
Probably a strict two-level hierarchy of "units" and "book of units" is simplistic, but let us see if it offers any guidance.
All this suggests that, far from the complex structures we associate with the film and the novel, there is in fact very little wiggle room in the visual book.
Your choices seem to be one of these two: either have no particular progressive goal, but merely make your point through a pile of units; or make your point within the context of this somewhat labored path.
This chart suggests how I see these things.
We start out with a sequential reading (matching the dashed blue line) and then a short jump forward, and then back a little. One unit gets skipped, another gets looked at twice. A little later on a larger forward jump happens, with more backtracking. At some point there is one dip backwards into previously skipped material, and then a large jump back forward. This is, of course, just an example, but it illustrates the general shape of the thing.
So you have a few units at the beginning to set the stage, and then people start jumping around, in a more or less forward-moving fashion, with potentially larger and larger jumps.
After that, you can say things later and earlier, and although people may not encounter these statements strictly in-order, they will tend to encounter them mostly in the right order, that is, later things after earlier things. Even if they backtrack and come across something near the end of their reading, they may well note that this is happening near the beginning of the book.
This suggests that your book should progress in fairly large strokes, with a lot of repetition. If you want someone to reliably notice something in the latter half of your book, you better give them several chances at it, because they're just jumping around at that point. The farther along in the book your material lies, the more repetition you'll need (or, the more you'll have to accept people simply missing it).
Opposing this notion, you don't want to simply make the last half of your book just a bunch of repeats of essentially the same point, so as to get through to people who are just casually flipping by that point. You'll put off the people who are reading more closely.
Some sort of elaboration seems right. You will want some way to communicate the bigger ideas in broad, repetitious strokes while offering rewarding detail to the closer reader.
It will come as no surprise to long-time readers to discover that I think this supports a music-like view of the visual book.
Sonata-like, you can state a couple of themes up front, in that first sequential read.
Following that, you can repeat and elaborate on those themes over longer stretches of material. The elaborations are enough to reward the closer readers, but the large themes are repeated over lengthier stretches of the book so that even people skimming will likely stumble across each of the important themes, in roughly the right order.
If your book is engaging, your readers will occasionally return to it, taking each time a similar but different path through it. They will, one hopes, discover the same large set of themes, the same overall structure, but with new details. Elaborations previously unnoticed may reveal themselves. Relationships between this picture and that, this unit and that, will pop up over time.
The composite image, formed at least hazily on the first read-through, ideally becomes clearer and at the same time evolves, upon each new reading.
It's not perfect, but maybe it's a model in which you can find something to use.
Most people just go for a pile of pictures, anyways, and that's OK too.