[2012 Summer Vacation Reading Report] Wu Xiaoli, Party Branch of the 2010 Master’s Students

Reading Images: The Grammar of Visual Design
Wu Xiaoli
During this summer holiday, I read several books in preparation for my thesis. Here I would like to share one of them, Reading Images: The Grammar of Visual Design.
Kress and van Leeuwen’s visual grammar (1996/2006) is developed from Halliday’s social semiotic view of language. Halliday (1985) proposed three meta-functions of language in systemic-functional grammar: the ideational function, the interpersonal function and the textual function. Following Halliday, Kress and van Leeuwen regard these meta-functions as necessary qualities of any human communication system. According to Kress and van Leeuwen (2006: 41-44), visual images, like all semiotic modes, must be able to represent the experiential world, to enact the communication between producers and receivers as well as the interaction between the participants represented in a visual design and its viewers, and to arrange visual resources compositionally. Therefore, by adopting the theoretical notion of ‘meta-functions’ from Halliday’s work, they develop a descriptive and analytical framework for the analysis of visual images, namely visual grammar. Like the grammar of language defined by Halliday, visual grammar has three aspects of meaning: representational meaning, interactive meaning and compositional meaning.

Representational meaning
Representational meaning in visual grammar corresponds to ideational meaning in Halliday’s Systemic Functional Grammar. However, given the features of visual images, Kress and van Leeuwen distinguish only two general kinds of representational processes: narrative processes and conceptual processes. Narrative processes involve ongoing actions or events, in which participants are represented, by means of vectors, as doing something to or for each other. Conceptual processes, on the other hand, involve a classification or analysis of participants in terms of their stable and timeless essence.

Narrative processes
When the represented participants are connected by a vector, they are represented as doing something to or for each other (Kress & van Leeuwen 2006: 59). According to Kress and van Leeuwen (2006: 59), in pictures these vectors are formed by depicted elements that form an oblique line, often a quite strong, diagonal line. On the basis of the kinds of vector and the kinds of participants involved in visual images, four subcategories of narrative processes are distinguished: action processes, reactional processes, speech and mental processes, and conversion processes.
In action processes, there are usually two kinds of participants involved: the “Actor” and the “Goal”. The Actor is the participant from which the vector emanates, or which itself, in whole or in part, forms the vector (Kress & van Leeuwen 2006: 59). Actors are often the most salient participants, through size, place in the composition, contrast against background, color saturation or conspicuousness, sharpness of focus, and also psychological salience. The Goal is the participant at whom the vector is directed; hence it is also the participant to whom the action is done, or at whom the action is aimed (Kress & van Leeuwen 2006: 63). For action processes, the Actor is indispensable while the Goal is optional. When both the Actor and the Goal are included in an action process, it is called a transactive action process. When only the Actor is included, it is a non-transactive action process.
In reactional processes, the vector is formed by an eyeline, that is, by the direction of the glance of one or more of the represented participants. Participants in this kind of process are defined as the “Reactor” and the “Phenomenon”. The Reactor is the participant from whom the glance emanates. It must be a human, a human-like animal or a personified object capable of directing a glance at others, while the participant at which or whom the Reactor is looking is the Phenomenon. Like action processes, reactional processes can be either transactive or non-transactive (Kress & van Leeuwen 2006: 67). In transactive reactional processes there is a Phenomenon, while in non-transactive ones there is none.
In speech and mental processes, the vectors are formed by dialogue balloons or thought balloons like those in comic strips. The oblique protrusions of the thought balloons and dialogue balloons connect drawings of speakers or thinkers to their speech or thought (Kress & van Leeuwen 2006: 68).
The last kind of process is the conversion process. In conversion processes, there is a special kind of participant, the “Relay”, which is the Goal with respect to one participant and the Actor with respect to another. Relays transform what they receive; therefore, the state of the represented participants is changing. Conversion processes are rare in the English textbooks.
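The taxonomy of narrative processes described above can be summarized as a small decision sketch. The Python below is only an illustration: the function name `classify_narrative_process`, its parameters and the string labels for vector types are hypothetical and are not taken from Kress and van Leeuwen.

```python
def classify_narrative_process(vector: str, has_target: bool = True,
                               has_relay: bool = False) -> str:
    """Return the narrative process type for a hypothetical image description.

    vector     -- "depicted_line" (oblique line formed by depicted elements),
                  "eyeline" (direction of a glance) or "balloon"
                  (dialogue or thought balloon); these labels are illustrative
    has_target -- whether a Goal (action) or Phenomenon (reaction) is depicted
    has_relay  -- whether some participant is both Goal and Actor (a Relay)
    """
    if has_relay:
        return "conversion process"
    if vector == "balloon":
        return "speech or mental process"
    if vector == "depicted_line":
        kind = "action process"
    elif vector == "eyeline":
        kind = "reactional process"
    else:
        raise ValueError(f"unknown vector type: {vector!r}")
    # Action and reactional processes are transactive only when the
    # second participant (Goal or Phenomenon) is present.
    prefix = "transactive" if has_target else "non-transactive"
    return f"{prefix} {kind}"


print(classify_narrative_process("eyeline", has_target=False))
# prints: non-transactive reactional process
```

Real images often combine several processes, so a classifier of this kind would at best label one vector at a time.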

Conceptual processes
Narrative processes present unfolding actions and events, processes of change and transitory spatial arrangements, while conceptual processes represent participants in terms of their more generalized and more or less stable and timeless essence, that is, in terms of class, structure or meaning. Accordingly, three kinds of conceptual processes are identified: classificational, analytical and symbolic processes.
In classificational processes, participants are related to each other in terms of taxonomic relations. At least one set of participants will play the role of “Subordinates”. Classificational processes can be covert or overt, depending on whether the larger, general class, the “Superordinate”, appears in the image. In a covert visual structure, the Superordinate must be inferred from the visual image, and the Subordinates are realized in an equal and symmetrical composition; otherwise, the structure is overt (Kress & van Leeuwen 2006: 79).
In analytical processes, participants are related in terms of a part-whole structure. Two kinds of participants are involved: the “Carrier” (the whole) and the “Possessive Attributes” (the parts) (Kress & van Leeuwen 2006: 87). For example, in a visual image which depicts a bird, the bird is the Carrier and its wings are Possessive Attributes.
Symbolic processes are about what a participant means or is. There are two sub-processes: Symbolic Attributive and Symbolic Suggestive. In Symbolic Attributive processes, there are two participants: the participant whose meaning or identity is established in the relation, the Carrier, and the participant which represents the meaning or identity itself, the Symbolic Attribute. In Symbolic Suggestive processes, there is only one participant, the Carrier, and the symbolic meaning is established in another way.

Interactive meaning
As for the interactive meaning, Kress and van Leeuwen (2006: 42) state, “Any semiotic mode has to be able to project the relations between the producer of a (complex) sign, and the receiver/reproducer of that sign. That is, any mode has to be able to represent a particular social relation between the producer, the viewer and the object represented.” There are four key factors in that interaction: visual contact, social distance, perspective and modality.

Visual Contact
Kress and van Leeuwen (1996/2006) analyze images and identify two kinds of image act: the demand image and the offer image. In a demand image, the participant looks directly at the viewer. The participant’s gaze demands something from the viewer, namely that the viewer enter into some kind of imaginary relation with him or her (Kress & van Leeuwen 2006: 118). An offer image, by contrast, offers the represented participants to the viewer as items of information, objects of contemplation, impersonally, as though they were specimens in a display case (Kress & van Leeuwen 2006: 119). That is to say, in an offer image the represented participants do not look directly at the viewer but gaze away.

Social Distance
Kress and van Leeuwen (1996/2006) proposed that when image producers depict human participants or objects, they choose to depict them as close to or far away from the viewer. As the represented participants are depicted farther and farther from the viewers, the relation between the image and the viewers shifts from an intimate style towards a ‘frozen’ one. Kress and van Leeuwen (1996/2006: 124-5) identified six distances according to the size of frame: (1) intimate distance: only the face or head is shown; (2) close personal distance: the head and shoulders are shown; (3) far personal distance: the figure from the waist up is shown; (4) close social distance: the whole figure is shown; (5) far social distance: the whole figure with space around it is shown; (6) public distance: at least four or five people are depicted.
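The six distances listed above form an ordered scale, which can be sketched as an ordered lookup table. This is a minimal illustration in Python; the key strings (such as `waist_up`) and the helper names `social_distance` and `is_closer` are hypothetical labels introduced here, not terms from the book.

```python
# Frame sizes ordered from closest to farthest, each paired with the
# distance it realizes. The key strings are hypothetical shorthand.
SOCIAL_DISTANCES = [
    ("face_or_head", "intimate distance"),
    ("head_and_shoulders", "close personal distance"),
    ("waist_up", "far personal distance"),
    ("whole_figure", "close social distance"),
    ("whole_figure_with_space", "far social distance"),
    ("four_or_more_people", "public distance"),
]


def social_distance(frame: str) -> str:
    """Look up the distance realized by a given size of frame."""
    for key, distance in SOCIAL_DISTANCES:
        if key == frame:
            return distance
    raise ValueError(f"unknown frame size: {frame!r}")


def is_closer(frame_a: str, frame_b: str) -> bool:
    """True if frame_a places the participant nearer to the viewer."""
    keys = [key for key, _ in SOCIAL_DISTANCES]
    return keys.index(frame_a) < keys.index(frame_b)


print(social_distance("waist_up"))
# prints: far personal distance
```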

Perspective
When producing images, image producers select a certain perspective from which to display the represented participants. Through this selection, the image producer imposes an attitude not only on the represented participant but also on the viewers. Kress and van Leeuwen (1996/2006) distinguish horizontal and vertical angles. As for the horizontal angle, if the image has a frontal angle, what is depicted is a world involving both the represented participant and the viewers; if the image has an oblique angle, it encodes detachment. The vertical angle conveys power relations. If a represented participant is seen from a high angle, the interactive participant (the producer of the image, and hence also the viewer) has power over the represented participant. If the represented participant is seen from a low angle, the represented participant has power over the interactive participant. If the picture is at eye level, the point of view is one of equality and no power difference is involved (Kress & van Leeuwen 2006: 140).
However, scientific and technical pictures, such as maps, diagrams and charts, usually encode an objective attitude. In scientific images, angles do suggest viewer positions, but special and privileged ones, which neutralize the distortions that usually come with perspective, because they neutralize perspective itself (Kress & van Leeuwen 2006: 143-144).
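For images that do use ordinary perspective, the readings of the horizontal and vertical angles described above can be collected in two lookup tables. The sketch below is illustrative only; the dictionary names, the key strings and the function `read_perspective` are assumptions made for this example.

```python
# Meanings attached to each angle, following the summary above.
# The key strings are hypothetical labels chosen for this sketch.
HORIZONTAL_ANGLE = {
    "frontal": "involvement",
    "oblique": "detachment",
}

VERTICAL_ANGLE = {
    "high": "viewer has power over the represented participant",
    "eye_level": "equality, no power difference",
    "low": "represented participant has power over the viewer",
}


def read_perspective(horizontal: str, vertical: str) -> tuple:
    """Return the (horizontal, vertical) readings for a pair of angles.

    Raises KeyError for an angle label that is not in the tables.
    """
    return HORIZONTAL_ANGLE[horizontal], VERTICAL_ANGLE[vertical]


print(read_perspective("oblique", "eye_level"))
# prints: ('detachment', 'equality, no power difference')
```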

Modality
The term “modality” comes from linguistics and refers to the truth value or credibility of statements about the world. One of the crucial issues in communication is the question of the reliability of messages. As members of a society, we have to be able to make decisions on the basis of the information we receive, produce and exchange. The grammar of linguistic modality focuses on modality markers such as auxiliary verbs which accord specific degrees of modality to statements, verbs like may, will and must and the related adjectives (e.g. possible, probable, certain) and adverbs.
The concept of modality is equally essential in accounts of visual communication. Visuals can represent people, places and things as though they are real, as though they actually exist in this way, or as though they do not, as though they are imaginings, fantasies, caricatures, etc. Modality judgements are social, dependent on what is considered real (or true, or sacred) in the social group for which the representation is primarily intended.
In this section, the author will introduce the modality markers, such as the various parameters and coding orientations, which will provide criteria for modality judgement in different contexts.

Modality markers
According to Kress and van Leeuwen (1996/2006), the following means of visual expression are involved in judgements of visual modality (van Leeuwen 2005: 167):
1. Degrees of the articulation of detail form a scale which runs from the simplest line drawing to the sharpest and most finely grained photograph.
2. Degrees of the articulation of the background range from zero articulation, as when something is shown against a white or black background, via lightly sketched in or out-of-focus backgrounds, to maximally sharp and detailed backgrounds.
3. Degrees of colour saturation range from the absence of saturation – black and white – to the use of maximally saturated colours, with, in between, colours that are mixed with grey to various degrees.
4. Degrees of colour modulation range from the use of flat, unmodulated colour to the representation of all the fine nuances and colour modulations of a given colour – for example, skin colour or the colour of grass.
5. Degrees of colour differentiation range from monochrome to the use of a full palette of diverse colour.
6. Degrees of depth articulation range from the absence of any representation of depth to maximally deep perspective, with various other possibilities in between – for example, simple overlapping without perspectival foreshortening.
7. Degrees of the articulation of light and shadow range from zero to the articulation of the maximum number of degrees of ‘depth’ of shade, with options such as simple hatching in between.
8. Degrees of the articulation of tone range from just two shades of tonal gradation, black and white – or a light and dark version of another colour – to maximal tonal gradation.
All these means of visual expression are gradable. They allow the relevant dimension of articulation to be increased or reduced. What is more, the different parameters may be amplified or reduced to different degrees, resulting in many possible modality configurations. These configurations cue viewers’ judgements of modality, of ‘as how real’ images (or parts of images) are to be taken (van Leeuwen 2005: 167). Just as modality in SFG can be divided into three levels (high, medium and low), visual modality can also be described at three levels: high, medium and low.

Coding orientation
Coding orientations are sets of abstract principles which inform the way in which texts are coded by specific social groups or within specific institutional contexts. Kress and van Leeuwen (2006: 165-166) have distinguished the coding orientations as follows:
(1) Technological coding orientations, which have, as their dominant principle, the “effectiveness” of the visual representation as a “blueprint”. Whenever colour, for example, is useless for the scientific or technological purpose of the image, it has, in this context, low modality (Kress and van Leeuwen 2006: 165).
(2) Sensory coding orientations, which are used in contexts in which the pleasure principle is allowed to be dominant: certain kinds of art, advertising, fashion, food photography, interior decoration, and so on. Here colour is a source of pleasure and affective meanings, and consequently it conveys high modality: vibrant reds, soothing blues, and so on. A whole psychology of colour has evolved to support this (Kress and van Leeuwen 2006: 165).
(3) Abstract coding orientations, which are used by sociocultural elites – in academic and scientific contexts, and so on. In such contexts modality is higher the more an image reduces the individual to the general, and the concrete to its essential qualities. The ability to produce and/or read texts grounded in this coding orientation is a mark of social distinction, of being an “educated person” or a “serious artist”.
(4) The common sense naturalistic coding orientation, which remains, for the time being, the dominant one in our society. It is the one coding orientation all members of the culture share when they are being addressed as “members of our culture”, regardless of how much education or scientific-technological training they have received (Kress and van Leeuwen 2006: 165-166).

Compositional meaning
Compositional meaning in visual grammar corresponds to textual meaning in Halliday’s systemic-functional grammar. It refers to “the way in which the representational and interactive elements are made to relate to each other and the way they are integrated into a meaningful whole” (Kress and van Leeuwen 1996/2006: 176). Given the features of visual images, it includes three factors: information value, salience and framing.

Information value
The placement of elements in visual images endows these elements with specific information values, which are realized through the placement of elements in different zones of the image: left and right, top and bottom, center and margin (Kress & van Leeuwen 2006: 177).
Visual grammar regards elements placed on the left as Given, and those on the right as New. “The meaning of the New is problematic, contestable and the information at issue, and regarded as objects for contemplation, while the meaning of the Given is self-evident and commonsensical” (Kress and van Leeuwen 1996/2006: 181).
The elements placed in the upper parts of the image are represented as Ideal, while the elements placed in the lower parts of the image are represented as Real. “For something to be Ideal means that it is presented as the idealized or generalized essence of the information, and therefore also as having ideologically one kind of salience. The Real is then opposed to this in that it presents more specific information or more down-to-earth information or more practical information” (Kress and van Leeuwen 1996/2006: 186-187).
When elements are placed in the middle of the image, they are referred to as the Center, and the other elements around them as Margins. “For something to be presented as Center means that it is the most salient information to which the image producer wants the viewers to pay special attention, whereas for something to be presented as Margins means that other elements are to some extent subservient information to which less importance is attached with respect to the Center” (Kress and van Leeuwen 1996/2006: 197).
The dimensions of visual space are shown in the following figure:

Figure 3.1 Dimensions of visual space in the information value
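The three dimensions of information value can also be sketched as a function of an element’s position. The Python below is a rough illustration under stated assumptions: coordinates are normalized to the range 0 to 1 with the origin at the top-left corner, the image is split at its midlines, and the Center zone is taken to be the middle half of each axis. These thresholds and the function name `information_values` are assumptions made for this sketch; the book gives no numerical boundaries.

```python
def information_values(x: float, y: float) -> dict:
    """Label a position with Given/New, Ideal/Real and Center/Margin.

    x, y -- normalized coordinates in [0, 1]; (0, 0) is the top-left
            corner (an assumption of this sketch)
    """
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("coordinates must lie between 0 and 1")
    # Assumed thresholds: midlines split left/right and top/bottom.
    horizontal = "Given" if x < 0.5 else "New"
    vertical = "Ideal" if y < 0.5 else "Real"
    # Assumed Center zone: within 0.25 of the middle on both axes.
    in_center = abs(x - 0.5) <= 0.25 and abs(y - 0.5) <= 0.25
    centrality = "Center" if in_center else "Margin"
    return {
        "horizontal": horizontal,
        "vertical": vertical,
        "centrality": centrality,
    }


print(information_values(0.1, 0.1))
# prints: {'horizontal': 'Given', 'vertical': 'Ideal', 'centrality': 'Margin'}
```

In practice a composition usually foregrounds only one or two of these dimensions, so the three labels should be read as candidate values, not as a claim that every image uses all of them.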

Salience
Placement not only endows different elements with specific information values, but also assigns different degrees of salience to these elements. Salience refers to the degree to which an element draws viewers’ attention to itself. “Salience can create a hierarchy of importance among the elements, selecting some as more important, more worthy of attention than others” (Kress and van Leeuwen 1996/2006: 201).
Readers are able to judge the “weight” of the various elements of a composition, and the greater the weight of an element, the greater its salience. The visual weight of an element can be judged according to several factors: size, sharpness of focus, tone contrast (areas of high tone contrast, for instance the border between black and white, have high salience), color contrast (for instance, the contrast between highly saturated and unsaturated colors, or the contrast between red and blue, has high salience), placement in the visual field (elements become more salient as they are moved towards the left, owing to an asymmetry in the visual field), and perspective (foreground participants are more salient than background participants, and elements which overlap other elements are more salient) (Kress & van Leeuwen 2006: 202).

Framing
In visual grammar, framing refers to the connection and disconnection of the elements of a visual composition with different semiotic potentials. The elements of an image may be disconnected, marked off from each other, or connected, joined together (Kress & van Leeuwen 2006: 203). Like salience, however, connection and disconnection are a matter of degree. Elements in a visual image may be strongly or weakly disconnected, or strongly or weakly connected. The weaker the connection, the more the elements in different frames are presented as separate units of information, while the stronger the connection, the more they are presented as one unit of information, as belonging to each other.
Disconnection can be realized in many different ways, for example, by the frameline, by the discontinuity of color or shape, or simply by empty space between the elements. Connection can be realized by the logically connected elements or items represented in the image, by the repetition of formal features of the connected elements, shapes, colors or vectors formed by features of depicted objects (Kress & van Leeuwen 2006: 204).
In this report, I have given a comprehensive introduction to Kress and van Leeuwen’s theory of visual grammar, which forms the theoretical foundation of my thesis.