Reading without saccades using Rapid Serial Visual Presentation

Academic year: 2021

Thesis dissertation in fulfilment of requirements for the XXXII PhD Program in Psychology and Cognitive Science

Reading without eye movements using Rapid Serial Visual Presentation

PhD Student

Orlando Ricciardi

Sapienza University of Rome, Italy

Advisor

Co-Advisor

Abstract

The complex behaviour that allows a reader to process the information in a text in order to extract its meaning is what we usually call reading (Rayner, Schotter, Masson, Potter & Treiman, 2016). The objective of reading may differ depending on the type of text: understanding the instructions in a manual requires a different kind of processing than understanding the content of an article, a magazine story or a novel. In any case, it is implausible that reading occurs without the need to understand the meaning of a text, which implies understanding the relationships between words and sentences and inferring their meaning. Like any behaviour, reading can also be analysed at a molecular level. Information is acquired by the eye, the sensory organ responsible for sight, through a sequence of fixations and saccades, and is conveyed to the occipital region of the brain. Further ballistic adjustment movements place the necessary information in the fovea, the region of the retina with the greatest visual acuity, which covers only about 2° of the field of view. Through continuous repositioning, necessary to bring the target letters into focus, the meaning of a text is processed and stored. This process is time-consuming and is conditioned by various factors. The anatomy of the visual system is one of them: deficits in the visual system increase reading time because they impose a greater number of refixations on each word. The morphological and syntactic properties of the text can also make it easier or harder to deduce its meaning. Other relevant characteristics include the type and size of the font, the language of the text, the length and frequency of use of its words, and the distance at which it is read. These characteristics have been studied extensively in cognitive psychology, and it is not the purpose of this work to reconsider those results. Instead, the aim is to understand how technology has influenced the study of reading.
This influence has a methodological aspect and an applied aspect. The manuscript consists of four parts. In the first, the main characteristics of the visual system are briefly described, with particular reference to the anatomy of the eye, eye movements and the muscles that make them possible. There is also a brief presentation of the relationship between eye movements and reading, covering the main results obtained in cognitive psychology and the interpretations offered to explain the role of eye movements in understanding a text. The second part concerns how technological developments have changed the way reading is studied. It begins with a description of the developments that have created new possibilities for studying the eye movements involved in reading, with particular reference to eye-tracking techniques. It then analyses new reading possibilities. Display size, for example, now ranges from a regular monitor to a smartwatch, which makes it possible to read in many more places. Similarly, hyper-connectivity and the spread of wearable devices have created the opportunity to read on many more occasions. Very often, short messages, notifications, emails or short newspaper articles are read in a brief interval, such as the wait for a bus or a trip on the metro.

The third part of the manuscript focuses on alternatives to the classical way of presenting text, defining their aims and main characteristics. Particular emphasis is given to Rapid Serial Visual Presentation (RSVP), which is the subject of the following part. The fourth part describes the studies conducted on RSVP to compare it with the traditional mode of presentation. It addresses several research questions, each accompanied by the results of the corresponding experimental analyses. Answering all of these questions conclusively would be too ambitious and goes beyond the scope of this part. The discussion is therefore mainly oriented towards presenting the problems and proposing possible experimental analyses. In many cases a single experiment is described, which cannot be exhaustive, even when accompanied by an analysis of the literature, and so cannot give a decisive answer to the question. It is important to emphasise from the beginning that the studies described had a particular objective. Participants were not required to read instructions, scientific prose, a manual or a book. The results refer to comprehension of magazine articles read with the aim of keeping reading time short. This choice is consistent with the opportunity to read anywhere, even in short time segments, on small (smartphones, smartwatches), portable and always-connected devices. Finally, the conclusions drawn from all the studies are presented, with suggestions for the use of RSVP in relation to specific reading objectives and texts.

Publications

Some of the materials, ideas and figures in this dissertation have previously appeared in the following peer-reviewed publications. After each paper, the chapter in which the material is used is noted.

Ricciardi, O. & Di Nocera, F. (submitted). Do eye movements matter in Rapid Serial Visual Presentation efficiency? A test on small displays

Ricciardi, O., Juola, J. F., Di Nocera, F. (submitted). Speed reading using Spritz has a cost: limit when reading a short text.

Di Nocera, F., Ricciardi, O., & Juola, J. F. (2018). Rapid serial visual presentation: degradation of inferential reading comprehension as a function of speed. International Journal of Human Factors and Ergonomics, 5(4), 293.

Ricciardi, O. & Di Nocera, F. (2017). Not so fast: A reply to Benedetto et al. (2015). Computers in Human Behavior, 69, 381-385.

Table of Contents

List of tables

List of figures

Section one

1.1 Visual system overview

Anatomy of the eye

Eye movements

Muscles implicated in the visual system

1.2 Eye movements in reading

Regressions

Skipping

Parafoveal preview

Section two

2.1 Methodological innovation in the study of eye movements

2.2 Eye movement recording techniques

2.3 Eye tracking

2.4 Eye tracking data analysis

Section three

3.1 Reading study paradigms

3.2 Rapid Serial Visual Presentation

3.3 Experimental results about the RSVP

Section four

4.1 Study one

4.2 Study two

4.3 Study three

4.4 Study four

4.5 Study five

Concluding remarks

References

List of tables

Table 4.4.1. Differences between RSVP nominal speed and Spritz actual rate. Text length was 1337 words. Note: The average time in the control-reading condition was 7 minutes and 50 seconds (about 167 wpm).

Table 4.4.2. Pairwise comparisons among NASA-TLX sub-scales.

Table 4.5.1. Number of sentences, number of words and Flesch-Kincaid readability score for each text included in the experiment.

Table 4.5.2. Actual presentation rate (wpm) for each text used in the experiment in the RSVP conditions, and the corresponding rate when reading at the participants' own pace.

Table 4.5.3. Number of sentences, number of words and Flesch-Kincaid readability score for each text included in the experiment.

Table 4.5.4. Actual presentation rate for each text used in the experiment in the RSVP conditions, and the corresponding rate when reading at the participants' own pace.

List of figures

Figure 1.1.1. Representation of the foveal and parafoveal regions.

Figure 1.1.2. Muscles and nerves implicated in the visual system.

Figure 1.2.1. Example of eye movements made during reading with the objective to understand a text.

Figure 2.1.1. Example of a scanpath, consisting of a series of saccades (green lines) and fixations (green circles).

Figure 3.1.1. Representation of the moving-window paradigm. From top to bottom, as reading proceeds, the x characters are replaced by the corresponding letters. Only one portion of the text is visible at any time.

Figure 3.1.2. Representation of the trailing-mask paradigm. From top to bottom, the lines show the subsequent views of a sentence chosen as an example. Asterisks represent the positions of fixations.

Figure 3.2.1. Rapid Serial Visual Presentation (on the left) and Spritz implementing the Optimal Recognition Point (on the right).

Figure 4.1.1. Percent of correct comprehension scores in the normal condition and at different RSVP speed conditions. Vertical bars denote 0.95 confidence intervals.

Figure 4.1.2. Perceived mental workload reported by participants for each condition. Vertical bars denote .95 confidence intervals.

Figure 4.1.3. Percent of correct comprehension scores in the normal condition and at different RSVP speed conditions. Vertical bars denote 0.95 confidence intervals.

Figure 4.2.1. Percent of correct comprehension scores in the two pause conditions (available vs. not available) for each RSVP speed condition (250 wpm vs. 450 wpm). Vertical bars denote 0.95 confidence intervals.

Figure 4.2.2. Percent of correct comprehension scores in the two RSVP speed conditions (250 wpm vs. 450 wpm). Vertical bars denote 0.95 confidence intervals.

Figure 4.3.1. Percent correct comprehension scores in the control condition (traditional reading) and at different RSVP speed conditions. Vertical bars denote 0.95 confidence intervals. The straight horizontal line indicates the percent correct scores for participants who did not read the passage.

Figure 4.4.1. Mean percentage of comprehension scores obtained across conditions on the inferential comprehension questionnaire (chance performance = 50%). Vertical bars denote 95% confidence intervals.

Figure 4.4.2. Reported mental workload using NASA-TLX. Vertical bars denote 0.95 confidence intervals.

Figure 4.5.1. The mean proportions of scores obtained in different conditions on comprehension questionnaires. Vertical bars denote 0.95 confidence intervals.

Figure 4.5.2. The mean proportions of correct inferential comprehension scores obtained in different conditions. Vertical bars denote 0.95 confidence intervals.

Figure 4.5.3. Reported mental workload using NASA-TLX. Vertical bars denote 0.95 confidence intervals.

Figure 4.5.4. The mean proportions of scores obtained in different conditions on comprehension questionnaires. Vertical bars denote 0.95 confidence intervals.

Figure 4.5.5. The mean proportions of inferential comprehension scores obtained in different conditions. Vertical bars denote 0.95 confidence intervals.

Figure 4.5.6. Reported mental workload using NASA-TLX. Vertical bars denote 0.95 confidence intervals.

Section one

This first part of the manuscript comprises two sections. The first describes the characteristics of the visual system: the anatomy of the eye, the movements it can make and the muscles that make these movements possible. The second explores the role of eye movements in reading.

1.1 Visual system overview

This section does not aim to deal exhaustively with the mechanisms that regulate the muscles involved in eye movements. It is intended to give the expert reader a starting point to be developed through the appropriate references and, at the same time, to provide the naive reader with the essential information needed to understand the basic anatomical features of the visual system.

Anatomy of the eye

The eye is the organ that acquires images and transmits them to the brain for further processing. The outer coat of the eye consists of two curved, but not exactly spherical, parts. The cornea, at the front, is transparent and has a radius of just under a centimetre. The sclera is opaque and has a radius of about 12 millimetres. Beneath this outer coat lie the choroid, the iris and the ciliary body, while the retina forms the innermost layer. The crystalline lens, formed of transparent fibres, sits behind the cornea and the iris and adjusts the focal length so that the image is as sharp as possible. The pupil is an opening in the centre of the iris; its size is variable and regulates the amount of light entering the eye. Visual processing starts from the light received through the pupil: the image is inverted by the lens and projected onto the retina, in the posterior part of the eyeball, which contains the photosensitive cells, the cones and the rods. These cells have specific functions and transduce light into electrical signals that are sent through the optic nerve towards the visual cortex for processing. The cones provide a detailed view of the visual scene and distinguish colours, while the rods are more sensitive to variations in light, movement and depth than to the details of the stimuli, and allow us to see in low-light conditions. One aspect of notable importance is the distribution of the cones, which are concentrated mainly in a small central area of the retina called the fovea, covering only about 2° of the field of view.

For this reason, visual acuity, defined as "the ability of the eye to resolve and perceive fine details of an object" (Cline, Hofstetter & Griffin, 1997), is at its highest in the centre of the retina and decreases rapidly towards the periphery. This anatomical property has an essential functional consequence: the eye must move continuously to bring each stimulus of interest onto the centre of the visual field. Beyond the fovea, visual acuity diminishes in the immediately surrounding region, called the parafovea. Stimuli in the parafovea can still influence the processing of a scene. In reading, for example, characters up to 6° of visual angle benefit from parafoveal preview. Beyond the parafovea, the perifovea is the outermost region included in the macula, which represents the central area of the retina (figure 1.1.1).

Figure 1.1.1. Representation of the foveal and parafoveal regions.

Eye movements

The movements of the human eye depend on anatomical factors but also on the observer's objectives. In general, eye movements are classified as rapid movements called saccades, smooth pursuit movements, vergence movements and vestibulo-ocular movements. A saccade is an abrupt change in the point of fixation. The change can be small, as in reading, or large, as in the exploration of a visual scene. Saccades are mostly reflexive, but they can also be made voluntarily. Because they are ballistic, their trajectory cannot be corrected once they have started; if the first saccade misses the target stimulus, a second, corrective saccade is made. To keep a moving stimulus on the fovea, the visual system uses a much slower movement, called smooth pursuit. These movements are voluntary, in that the observer can choose whether to follow a moving stimulus. However, it is the moving stimulus that guides smooth pursuit and, in its absence, only highly trained observers can make this kind of movement: the attempt to move the eyes smoothly without a target usually results in a saccade. Vergence movements are disjunctive, and their function is to align the fovea of each eye with stimuli located at different distances from the observer. Nearer or farther objects are brought into focus through a convergence or divergence of the lines of sight of the two eyes. Vestibulo-ocular movements stabilise the eyes relative to the external world. These reflexive movements compensate for head movements, preventing the image from slipping across the surface of the retina when the head position varies. Their role can be appreciated by turning the head from side to side while fixating an object: the image remains at more or less the same place on the retina. This is possible thanks to the detection of brief, transient changes in head position and the production of rapid corrective eye movements.

Muscles implicated in the visual system

Normal visual perception requires the oculomotor system to control both the position and the movement of the eyes so that the target image falls on corresponding areas of the two retinas. In this task, the accommodation function has to adjust the size of the pupil and the refraction of the lens, and to converge the eyes so that the images of nearby objects are directed onto the retinas. When the two eyes follow a moving image, instead, their movements must be coordinated to keep the image at the same point in the binocular field of view. When this coordination fails, the result is diplopia, or double vision. The muscles controlling eye movements are called extraocular muscles (also known as the extrinsic muscles of the eyeball). Seven extraocular muscles are present in each orbit (figure 1.1.2). Six of them are responsible for eye movements: four rectus muscles (superior, inferior, medial and lateral rectus) and two oblique muscles (superior and inferior oblique). The seventh is the levator palpebrae superioris, responsible for elevating the upper eyelid. Three cranial nerves innervate the extraocular muscles: the oculomotor (CN III), the trochlear (CN IV) and the abducens (CN VI). Specifically, the oculomotor nerve supplies three of the four rectus muscles (superior, inferior and medial), the inferior oblique and the levator palpebrae superioris; the trochlear nerve innervates the superior oblique; and the abducens nerve supplies the lateral rectus.

Figure 1.1.2. Muscles and nerves implicated in the visual system.

1.2 Eye movements in reading

The recognition and processing of individual words arranged sequentially to form a text is only one of the components of reading. Reading with the aim of understanding a text mainly requires integrating the meaning of the words with previous knowledge. This calls for a more complex and highly coordinated process that guides the eyes to the target word, back to a previous word (regression) or ahead to the next word before it is fixated (preview benefit).

The eye movements that make it possible to read a text presented in the traditional way (reading the words from left to right and moving through the text from top to bottom) are called saccades. Saccades are ballistic and are separated by fixations, very short periods in which the eyes are relatively motionless. The role of saccades is to bring the area of maximum visual acuity (the fovea) onto the word to be read, while fixations are used to acquire information. During the movement itself, information cannot be acquired because of saccadic suppression (Matin, 1974). The duration of saccades and fixations varies, but many studies report that saccades typically last 20-50 ms and fixations about 200-250 ms (Rayner, 1978, 1998, 2009).
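The durations quoted above imply a simple time budget for one fixation-plus-saccade cycle. The sketch below is only illustrative arithmetic: the function name `fixation_cycle` is hypothetical, and the input values are the midpoints of the reported ranges, chosen here as an assumption rather than taken from any specific study.

```python
def fixation_cycle(fixation_ms: float, saccade_ms: float) -> dict:
    """Summarise one fixation-plus-saccade cycle.

    Information is acquired only during the fixation, so the share of
    time spent fixating approximates the share of time usable for intake.
    """
    cycle_ms = fixation_ms + saccade_ms
    return {
        "cycle_ms": cycle_ms,
        "fixations_per_second": 1000.0 / cycle_ms,
        "share_of_time_fixating": fixation_ms / cycle_ms,
    }


# Assumed midpoints of the ranges in the text: 200-250 ms and 20-50 ms.
typical = fixation_cycle(fixation_ms=225.0, saccade_ms=35.0)

for name, value in typical.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed values the reader makes slightly fewer than four fixations per second and spends most of the cycle fixating rather than moving.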

The challenge for research is to understand how eye movements reflect the cognitive processes underlying reading, and how external features, such as the physical and graphical aspects of a text, can influence them. Studies on this topic date back to 1879, but only since the 1970s have scholars tried to use this information to make inferences about the underlying cognitive processes (Starr & Rayner, 2001). This change is mainly due to the development of new eye-tracking technologies and of theories of language processing, which allowed greater accuracy in data recording and data analysis. Various studies (see Rayner, 1998; Reichle, Rayner & Pollatsek, 2003; Liversedge, Gilchrist & Everling, 2011 for reviews) have focused on eye behaviour in reading, studying both the features of the word (frequency, length, predictability) and the features of the visual system (visual acuity, parafoveal vision) involved in reading.

One of the most important questions for eye movement research in reading has been what determines the moment when the reader decides to shift gaze from one word to another. Technically speaking, the constrictor and dilator muscles of the iris provide the intrinsic, involuntary motor control of the part of the eyeball responsible for focusing images (and therefore the graphic form of words) on the retina. The activity of these muscles is influenced by light, colour and the distance of the fixated objects. The decision about where to move the eyes next seems to depend on low-level processing of text features, such as word length, the spacing between letters and the shape of the letters; the moment at which a saccade is launched, instead, seems to be influenced by the ease or difficulty of processing a word. As we will see, the linguistic properties of a word influence fixation time on it: long words, and words that are less frequent or less predictable from the context, are fixated for longer.

From the earliest research on eye movements in reading, it was clear that the eyes do not move continuously along the text. Instead, saccades and fixations alternate to move the point of gaze through the text and to process the meaning of the words, skipping some words and refixating others (Rayner, 1998). Saccades take between 20 and 50 milliseconds, depending on the length of the movement, and advance the eyes through the text by about 6-9 characters. Because of saccadic suppression, no information is extracted during a saccade (Ishida & Ikeda, 1989; Wolverton & Zola, 1983). Information is extracted during fixations, which last on average between 200 and 250 milliseconds. It is worth noting that there is considerable variability in both saccade length and fixation duration. Much of this variability is related to the ease or difficulty of processing the fixated text (Reichle, Rayner, & Pollatsek, 2003). Fixation duration depends critically on factors such as word frequency, predictability, length, similarity to other words and the age at which the reader acquired the word (Rayner, 1998, 2009). Furthermore, fixation duration does not fully reflect the processing of the fixated word. Part of the processing of each word starts in parafoveal vision, while the eyes are still fixating the previous word (preview benefit effect), which raises the problem of the dissociation between attention and gaze (Inhoff, Starr, & Shindler, 2000; Schroyens, Vitu, Brysbaert, & d'Ydewalle, 1999). The duration of a fixation on a word may also reflect part of the processing of the previous word (spillover effect) (Reichle, Rayner, & Pollatsek, 2003).

Although most saccades are progressive, i.e. they proceed from left to right, regressive saccades, i.e. movements from right to left, also occur with some regularity: about 10-15% of saccades are directed towards parts of the text already examined. Such backward movements are called "regressions" and may be due to problems of linguistic processing (e.g. difficulty in understanding) or to oculomotor errors.

Usually, the eyes tend to land between the beginning and the centre of a word, in the so-called "Optimal Viewing Position". The left and right eyes start a saccade at the same time but "land" about 1.5 characters apart. During the fixation, the two eyes converge again; the fixation disparity is compensated by the brain's synchronisation activity. The tendency to fixate slightly to the left of the centre of the word has been attributed to the vertical split of the fovea. This apparent imbalance gives each hemisphere an equal chance of recognising the word, considering that each half of the visual field projects to the contralateral hemisphere and that the left hemisphere has an advantage for linguistic tasks (Shillcock, 2007). When the eyes land near the end of a word, they often make a regression of a few characters (O'Regan, 1990). This observation supports the idea that some regressions are due to oculomotor errors. The time needed to plan and execute a saccade is called "saccadic latency"; various experiments indicate that it is about 180-250 milliseconds (Becker & Jurgens, 1979; Rayner, Carlson & Frazier, 1983). With that in mind, scholars suggested that the first 100 milliseconds of a fixation can determine the next saccade.
However, most studies (Ishida & Ikeda, 1989; Rayner, Inhoff, Morrison, Slowiaczek & Bertera, 1981) indicate that, although the reader has information useful for processing a word after the first 50-60 milliseconds of a fixation, lexical recognition requires from 100 to 300 milliseconds. We can therefore affirm that word identification is too slow to be considered the guide of eye movements (Reichle, Rayner, & Pollatsek, 2003).

As already mentioned, a peculiarity of the visual system is that detailed vision of a stimulus is possible only when it falls on the fovea. The relationship between the foveal area and the boundaries of a word is fundamental for reading studies that use eye tracking. Visual acuity decreases with distance from the fovea, so the reader must fixate most of the words in the text in order to identify them. The foveal region covers about 2° of visual angle, which corresponds to 3-4 characters to the right and left of fixation, or 6-8 characters in all (Rayner & Sereno, 1994). However, the reader can also use information extracted in parafoveal vision. The parafoveal region covers about 5° of visual angle and, according to some authors, does not allow linguistically relevant information to be extracted, but only indications of the length and shape of the text (Staub & Rayner, 2007). We will see later how crucial parafoveal information is for understanding the eye movements involved in reading.

The portion of text that can be "captured" in a single fixation is known as the "perceptual span". Its extent depends on the language and includes both foveal and parafoveal information. This region does not extend beyond the line being read and is constant for readers of similar alphabets; it often coincides with a single word, or may include two. In languages written in the Latin alphabet, such as English or Italian, it extends 14-15 characters to the right and 3-4 characters to the left of the fixation point. However, word encoding probably does not extend beyond 7-8 characters to the right of the fixation point, with a language-specific left-right asymmetry. During reading, several phenomena show that reading a text is not a strictly sequential and ordered activity; the most common are skipping, parafoveal preview, and the effects of the length, frequency and predictability of a word.
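The correspondence between 2° of visual angle and 6-8 characters depends on viewing distance and character size, neither of which is stated in the text. The following sketch converts a visual angle into an on-screen extent using the standard relation size = 2 · d · tan(θ/2). The viewing distance (60 cm) and the character width (0.3 cm) are assumptions chosen only for illustration, and the function names are hypothetical.

```python
import math


def angle_to_size_cm(angle_deg: float, distance_cm: float) -> float:
    """Extent on the display subtended by a visual angle at a given distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def characters_in_angle(angle_deg: float, distance_cm: float,
                        char_width_cm: float) -> float:
    """Number of characters that fit within the given visual angle."""
    return angle_to_size_cm(angle_deg, distance_cm) / char_width_cm


VIEWING_DISTANCE_CM = 60.0  # assumption: a common desktop viewing distance
CHAR_WIDTH_CM = 0.3         # assumption: average character width on screen

foveal_extent_cm = angle_to_size_cm(2.0, VIEWING_DISTANCE_CM)
foveal_chars = characters_in_angle(2.0, VIEWING_DISTANCE_CM, CHAR_WIDTH_CM)

print(f"2 degrees covers {foveal_extent_cm:.2f} cm, "
      f"about {foveal_chars:.1f} characters")
```

With these assumed values the foveal region spans roughly seven characters, which falls inside the 6-8 character range reported above; a smaller display held closer to the eyes would change both inputs.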

Regressions

Regressions in reading are movements of the eyes towards words to the left of the currently fixated word, that is, words already encountered (Rayner, 1998, 2009). They can span a few words or many (figure 1.2.1). In the former case, a possible explanation is the correction of oculomotor errors (see Bicknell & Levy, 2011; O'Regan, 1990), or the search for the "optimal viewing position" (O'Regan, 1990; Brysbaert & Nazir, 2005). Difficulties in linguistic processing (Reichle, Rayner, & Pollatsek, 2003) can produce longer regressions, whose function is to re-read a word in order to integrate its meaning into the whole context.

Regressions represent about 10-15% of the saccades made during reading (e.g. Reichle, Pollatsek, Fisher & Rayner, 1989). Differences may be due to the different reading strategies used to understand a text (Frazier and Rayner, 1982; Kennedy and Murray, 1987a, 1987b; Murray and Kennedy, 1988). Despite numerous studies on the role of eye movements in reading (for an overview see Rayner, 1998, 2009), two explanations of regressions remain open. One is that re-reading serves to retrieve missing information, the so-called "re-reading hypothesis". Consistent with this hypothesis, Reichle and colleagues (1989) reported that readers made more regressions when reading a more complex text than a simpler one. The number of regressions could also vary with dynamic characteristics of the text, such as an increase in the number of grammatical errors or ambiguities (Blanchard & Iran-Nejad, 1987) or a change of topic (Hyönä, 1995). The second possibility is that regressions serve to integrate information from working memory, reconsidering the spatial properties of the text (Ballard, Hayhoe, Pook, & Rao, 1997; Ferreira, Apel, & Henderson, 2008; O'Regan, 1992; Spivey, Richardson, & Fitneva, 2004).
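The proportion of regressive saccades can be computed from the sequence of fixated words. The sketch below is a simplified illustration, not the procedure used in the cited studies: it assumes a single line of text, represents each fixation by the index of the fixated word, and counts a saccade as a regression only when it lands on an earlier word. Within-word regressions and return sweeps are ignored. The scanpath is invented for the example.

```python
from typing import Sequence


def regression_rate(fixated_word_indices: Sequence[int]) -> float:
    """Share of saccades that land on an earlier word than the one just fixated."""
    # Each pair of consecutive fixations defines one saccade.
    saccades = list(zip(fixated_word_indices, fixated_word_indices[1:]))
    if not saccades:
        return 0.0
    regressions = sum(1 for origin, landing in saccades if landing < origin)
    return regressions / len(saccades)


# Hypothetical scanpath: word indices in the order in which they were fixated.
scanpath = [0, 1, 3, 4, 2, 5, 6, 8, 9]
rate = regression_rate(scanpath)

print(f"{rate:.1%} of saccades are regressions")
```

In this invented scanpath one saccade out of eight moves backwards, a rate inside the 10-15% range reported above.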

Figure 1.2.1. Example of eye movements made during reading with the objective to understand a text.

Skipping

When a saccade brings the target stimulus (a word) into the foveal region, the following fixation allows its meaning to be processed. Skipping consists of processing a word while it is still in the parafoveal region, so that it never receives a direct fixation, and implies that the word in the fovea is not being processed at the same time (Rayner, White, Kambe, Miller, & Liversedge, 2003). During reading, only about two-thirds of the words in a text are fixated (Rayner, 1998, 2009). Skipping is related to characteristics of words such as their length and the contextual constraints on them. Shorter words are more likely to be skipped than longer ones (Brysbaert & Nazir, 2005; Drieghe, Brysbaert, Desmet, & De Baecke, 2004; Drieghe, Desmet, & Brysbaert, 2007; Rayner, 1998). When two or three short words occur in succession, there is a good chance that two of them will be skipped, and short words that precede content words are also likely to be skipped (Drieghe, Pollatsek, Staub, & Rayner, 2008; Gautier, O'Regan, & Le Gargasson, 2000). Moreover, words that are highly constrained by the previous context are more likely to be skipped than unpredictable words (Balota, Pollatsek & Rayner, 1985; Binder, Pollatsek & Rayner, 1999; Ehrlich & Rayner, 1981; Rayner & Well, 1996; Schustack, Ehrlich, & Rayner, 1987). Word frequency also affects skipping, but its effect is smaller than that of contextual constraint (Rayner, Sereno, & Raney, 1996). While predictability affects whether a word is skipped, it does not affect where the eyes land on the word when it is fixated (Rayner, Binder, Ashby, & Pollatsek, 2001; Vainio, Hyönä, & Pajunen, 2009). Another cause that should not be neglected, especially when short words are skipped, is the so-called "oculomotor error" that occurs when a saccade misses its target (Nuthmann, Engbert, & Kliegl, 2005).
It is interesting to note that uncommon or irregular orthographic combinations in a word in parafoveal vision can reduce skipping when the irregular word immediately follows the fixated word, or increase it when another word lies between the irregular word and the one currently fixated (Hyönä, 1995). Finally, Kliegl and Engbert (2005) noted that fixations preceding the skipping of frequent words are generally shorter than fixations preceding the skipping of less frequent words; these data support the hypothesis that the processing of the word after the fixated one starts from parafoveal information. At the same time, they mean that a skipped word can still be processed in parafoveal vision (Fisher & Shebilske, 1985).

Parafoveal preview

The processing of words located in the parafoveal area also underlies another phenomenon, called parafoveal preview. It consists of obtaining advance information about the word following the fixated one, which implies the ability to process information extracted in parafoveal vision about the shape of its letters, its length, and its orthographic and phonological characteristics (Briihl & Inhoff, 1995; Inhoff, 1989; Rayner, Well, Pollatsek & Bertera, 1982). In addition to allowing skipping, parafoveal preview facilitates the processing of the next word. This benefit can be estimated as the difference in fixation time on a word between conditions in which parafoveal information was or was not available. Another important aspect is how parafoveal information affects the control of eye movements (O'Regan, 1990; Rayner, Sereno, & Raney, 1996). For example, saccades to the parafoveal word are shorter and directed towards its beginning, instead of its centre. Schotter, Angele and Rayner (2012) suggest representing the interaction between foveal and parafoveal information on a continuum: when there is no foveal information to be processed, individuals can extract a considerable amount of information from the parafovea. Conversely, when there is much foveal information to process, less information is obtained from the parafovea. A paradigm used to study the preview benefit is the boundary paradigm (Rayner, 1975). In this paradigm, an invisible boundary is placed just to the left of the target word; until the reader's eyes cross the boundary, a preview that differs from the target word is displayed in its place. When the eyes cross the boundary, the target word replaces the preview. Readers are generally unaware of the preview's identity and of the display change. Research using this paradigm has revealed that readers spend less time fixating a word when they had a valid preview of it to the right of fixation than when the preview was invalid (e.g., a non-word or a string of random letters). 
The size of this preview benefit is typically 30-50 ms. Research using this technique has also revealed that readers do not integrate a literal representation of the visual information across saccades: the preview supports the integration of orthographic and phonological information, but not of semantic information (Altarriba, Kambe, Pollatsek & Rayner, 2001; Rayner, McConkie & Zola, 1980). Thus, words that generally produce a priming effect in a standard naming or lexical decision task (for example, the prime "melody" facilitating the target "song") do not do so in parafoveal vision. This may be due to the limited information available in parafoveal vision. However, it does not mean that words in parafoveal vision cannot be identified. When words are short enough or constrained by context, readers can identify them before skipping them. The amount of preview benefit that readers obtain varies depending on the difficulty of the fixated word. If the fixated word is difficult to process, readers get little or no benefit from the preview of the word to the right of fixation (Henderson & Ferreira, 1990; Kennison & Clifton, 1995; White, Rayner, & Liversedge, 2005); if the fixated word is easy to process, they obtain a larger preview benefit (Balota, Pollatsek & Rayner, 1985; Drieghe, Rayner & Pollatsek, 2005). Another relevant aspect is the spatial extent of the preview benefit, and in particular whether readers obtain a benefit from the second word to the right of the currently fixated word. While it is clear that readers generally obtain a benefit from the first word to the right of the fixated word, the same cannot be claimed with confidence for the second word (Angele, Slattery, Yang, Kliegl, & Rayner, 2008; Kliegl, Risse, & Laubrock, 2007; McDonald, 2005, 2006). It is possible that a preview benefit is obtained from the second word when the first word to the right of the fixated word is short (2-3 letters) and of high frequency.

Section two

This section addresses technological development in reading from two perspectives. The first is methodological and refers to the techniques that have improved the study of eye movements in reading and deepened our understanding of their role in the underlying cognitive processes. The second concerns readers' habits, which have adapted to the possibility of reading on many more occasions and on many more devices.

2.1 Methodological innovation in the study of eye movements

Eye-tracking refers to the use of dedicated techniques and tools to track an individual's eye movements. It makes it possible to record and analyse what a subject looks at, in order to draw inferences about the underlying cognitive processes. Yarbus (1967) was the first to analyse the eye traces obtained through eye-tracking techniques. His famous experiment on the visual exploration of Repin's painting "The Unexpected Visitor" can be considered a milestone in eye movement studies. He noted that subjects attended to different parts of the painting depending on the task assigned to them, and that these different strategies were reflected in the scanpaths they generated. The development of eye-tracking technologies has facilitated the study of eye movements and the description of the cognitive processes involved, contributing both to the explanation of reading (Just & Carpenter, 1980, 1984; Schilling, Rayner & Chumbley, 1998; Reichle, Rayner & Pollatsek, 2003) and to the study of linguistic and arithmetic tasks (Hegarty, Mayer & Green, 1992; Suppes, 1990).

2.2 Eye movement recording techniques

Eye movement recording techniques can be grouped into four categories: electro-oculography (EOG); photo-oculography (POG) or video-oculography (VOG); the galvanometric or "scleral coil" technique (scleral search coil contact lenses); and infrared oculography (combined pupil-corneal reflection; Duchowski, 2007). Electro-oculography is performed by placing electrodes around the eye (above and below, to the left and right). This apparatus measures the variations in electric potential caused by the movement of the eyeball. This method, in use since the 1960s, does not allow high-precision measurements because it can only measure the position of the eye relative to the head; however, it has the advantage of being very inexpensive, and it is the only technique applicable to the study of ocular behaviour during sleep.

Another strategy for tracking eye movements is the galvanometric or "scleral coil" technique. It uses a contact lens that covers the cornea and the sclera. A stalk attached to this lens transmits ocular activity data to a mechanical or optical device, such as a coil that measures variations in an electromagnetic field. Although it is a very precise method, it has the obvious disadvantage of being highly intrusive.

Photo/video-oculography uses sequences of photographs or video footage to measure specific features of the eyes during their movement (such as the shape of the pupil, the edge that separates the sclera and iris, and the corneal reflections produced by one or more light sources, usually infrared). By controlling the stimuli presented to a subject at a given time and recording the direction of gaze, the technique makes it possible to draw inferences about visual behaviour. However, this technique too can only measure eye movement relative to the position of the head, which must be held fixed by means of a chin rest.

Infrared oculography exploits the cornea's ability to reflect infrared light. A camera with a CCD (Charge-Coupled Device) sensor records the corneal and pupil reflections generated by an infrared light source. In general, data are sampled at a rate between 30 and 2000 Hz, depending on the device used, with an accuracy between 0.5° and 2° of visual angle. We can distinguish table-mounted devices (very similar to ordinary LCD monitors), in which the eye-tracker is integrated into the monitor, from wearable devices (helmets or glasses). Calibration is the first step in recording eye movements. It consists of matching a specific number of points presented to the subject with the corresponding corneal reflections (also known as Purkinje reflections). The corneal reflections then make it possible to identify the position of the pupil and to derive the direction of gaze. Infrared oculography has the advantage of being minimally intrusive and of providing a precise estimate of gaze direction, while also compensating for head movements.

2.3 Eye tracking

Modern video-based eye trackers consist of two basic components: one or more cameras and a light source. The cameras track the reflection of the light source, usually infrared light directed towards the eye. The resulting data allow gaze to be tracked, providing information such as gaze points, saccades, pupil diameter and its variations, and blink frequency. Currently, there are three main types of eye-tracker. The most common is the monitor-based or remote type: a 17" (or larger) LCD monitor with an integrated eye movement detection device. This type of eye-tracker records the direction of the individual's gaze relative to the screen, and for this reason it is the most suitable for the study of user interfaces and for direct control of the computer in assistive applications. Other eye-trackers have no monitor of their own, only infrared LED emitters and a video camera that captures the eye signal. An eye-tracker of this second type has several advantages: for example, it can be used with any screen and can measure the direction of gaze even outside a laboratory context. Finally, there are wearable, or head-mounted, eye trackers. In the past, these instruments were rather large and invasive and generally required long and complicated calibration procedures. With technological evolution, wearable systems are becoming easier to use in both applied and laboratory research. They take the form of caps or glasses on which the infrared emitters and the camera are mounted, which reduces the intrusiveness of the recordings. At the same time, calibration has become much faster.

2.3.1 Eye tracking data analysis

Once eye movements have been recorded, the researcher's aim is to extract the data and analyse them. Both steps deserve attention, as they determine the final result.

2.3.2 Data extraction

The information provided by the eye tracker is usually transformed into pixel coordinates, with (x, y) points representing the position of the gaze on a Cartesian plane corresponding to the observer's field of view. These raw data are usually processed by algorithms that identify fixations. As described by Salvucci and Goldberg (2000), this process aggregates gaze points and absorbs the miniature movements that occur during fixations, such as tremors, drifts, and flicks, reducing the complexity of the data. In this way, tiny gaze movements that have no real functional value are filtered out. Fixation identification is, therefore, a necessary step in data analysis, and a critical one: as noted by Karsh and Breitenbach (1983), different identification algorithms can lead to very different interpretations of the same data set (see also Salvucci & Goldberg, 2000). The resulting data consist of fixations and saccades, to which other measures provided directly by the tool are added, such as fixation durations, saccadic velocities, saccadic amplitudes, and various transition-based parameters between fixations and regions of interest.
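The aggregation of raw gaze points into fixations can be illustrated with a minimal sketch of a dispersion-threshold procedure, one of the families of algorithms discussed by Salvucci and Goldberg (2000). The function names, the sample format and the threshold values (25 pixels, 100 ms) are assumptions chosen for illustration only and are not taken from any specific eye-tracking software.

```python
from typing import List, Tuple

# A raw sample: (time in ms, x in pixels, y in pixels). Format is an assumption.
Sample = Tuple[float, float, float]


def dispersion(window: List[Sample]) -> float:
    """Dispersion of a window: (max x - min x) + (max y - min y)."""
    xs = [s[1] for s in window]
    ys = [s[2] for s in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def identify_fixations(samples: List[Sample],
                       max_dispersion: float = 25.0,   # illustrative threshold
                       min_duration: float = 100.0):   # illustrative threshold
    """Return fixations as (start_ms, end_ms, centroid_x, centroid_y)."""
    fixations = []
    i, n = 0, len(samples)
    while i < n:
        # Grow an initial window that spans at least min_duration.
        j = i
        while j < n and samples[j][0] - samples[i][0] < min_duration:
            j += 1
        if j >= n:
            break  # not enough samples left to cover the minimum duration
        if dispersion(samples[i:j + 1]) <= max_dispersion:
            # Extend the window while the dispersion stays under threshold.
            while j + 1 < n and dispersion(samples[i:j + 2]) <= max_dispersion:
                j += 1
            window = samples[i:j + 1]
            cx = sum(s[1] for s in window) / len(window)
            cy = sum(s[2] for s in window) / len(window)
            fixations.append((window[0][0], window[-1][0], cx, cy))
            i = j + 1
        else:
            i += 1  # drop the first sample and try again
    return fixations
```

Changing either threshold changes which samples are merged into a fixation, which is one concrete way in which different identification choices can lead to different interpretations of the same recording.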

2.3.4 Metrics

Fixations are the only moments in which the stimuli in the individual's field of vision fall on the foveal part of the retina, allowing detailed vision. On average, fixations last between 60 and 300 milliseconds. The primary metrics derived from fixations are their number, duration and frequency. From the pioneering work of Fitts (Fitts, Jones, & Milton, 1950) to more recent applications (Van Orden, Limbert, Makeig & Jung, 2001), the duration of fixations has been considered indicative of processing difficulty (see also Findlay & Kapoula, 1992; Moffitt, 1980; Rayner, 1998), while the frequency of fixations is taken to indicate the importance attributed to the area inspected. Saccades are displacements between two points of fixation. Their function is to bring stimuli into the foveal area of the retina. They last on average between 30 and 80 milliseconds, and their speed increases with the amplitude of the eye movement, typically ranging between 300 and 500 degrees per second. Saccades and visual attention are closely related (Pashler & Sutherland, 1998; Hoffman, 1998). Several studies report that information processing is suppressed during saccades (Hansen & Sanders, 1988; Irwin & Carlson-Radvansky, 1996; Irwin, Carlson-Radvansky, & Andrews, 1995; Matin, Shao, & Boff, 1993; Sanders & Houtmans, 1985; Sanders & Rath, 1991; Van Duren, 1993; Van Duren & Sanders, 1992, 1995), although this view is not universally shared (e.g., Irwin, 1998). The sequence of saccades and fixations is called a scanpath. The shape of the scanpath can provide information about the visual strategy used to explore a scene. The most common analyses take into account the area, extent or shape of the pattern of fixations and saccades (figure 2.1.1). Its interpretation is very often based on the transitions that occur between different areas of interest in a subject's field of view (Fitts, Jones & Milton, 1950; Tole, Stephens, Vivaudou, Ephrath & Young, 1983), but more recently techniques have been proposed that disregard areas of interest and rely on a geostatistical analysis of the subject's entire field of view (Di Nocera, Camilli & Terenzi, 2007).

Figure 2.1.1. Example of a scanpath, consisting of a series of saccades (green lines) and fixations (green circles).
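A transition-based reading of a scanpath can be sketched as follows. The example assigns each fixation to a rectangular area of interest (AOI) and counts the transitions between different AOIs. The AOI names, the rectangle format and the decision to ignore fixations falling outside every AOI are assumptions made for this illustration.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# An AOI is a rectangle: (left, top, right, bottom) in pixels (assumed format).
AOI = Tuple[float, float, float, float]


def aoi_of(x: float, y: float, aois: Dict[str, AOI]) -> Optional[str]:
    """Return the name of the first AOI containing the point, or None."""
    for name, (left, top, right, bottom) in aois.items():
        if left <= x <= right and top <= y <= bottom:
            return name
    return None


def transition_counts(fixations: List[Tuple[float, float]],
                      aois: Dict[str, AOI]) -> Counter:
    """Count transitions between different AOIs along a scanpath.

    Fixations outside every AOI are dropped before counting, so a
    transition may span an intermediate fixation on an undefined area.
    """
    labels = [aoi_of(x, y, aois) for x, y in fixations]
    labels = [label for label in labels if label is not None]
    counts: Counter = Counter()
    for source, destination in zip(labels, labels[1:]):
        if source != destination:  # refixations within an AOI are not transitions
            counts[(source, destination)] += 1
    return counts
```

The resulting counts can be arranged as a transition matrix, which is the usual starting point for AOI-based scanpath analyses.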

Concerning reading, these metrics have been analysed taking as reference either large areas of the text (global averages) or the single word (word-based measures). Global averages refer to eye movements recorded while reading large parts of a text, such as two or more sentences or one or more paragraphs. These measures include the average fixation duration, the length of progressive saccades, and the probability of a reader making a regression; they generally reflect the difficulty of the reading process. More complex texts yield longer average fixations, shorter saccades and a higher number of regressions (Rayner & Pollatsek, 1989). Although such global measures have some value, using eye movements as an on-line measure of cognitive processes requires measures tied to more precisely located units of text (Blanchard, 1985). As a result, other measures have been developed that take a smaller region of text, a word or a phrase, as the unit of analysis. Although metrics exist for both the single word and the single sentence (Rayner & Sereno, 1994; Rayner, Sereno, Morris, Schmauder & Clifton, 1989), the most commonly used refer to the single word. These metrics provide information about how long a reader's gaze remains on a given point. Among the most commonly used is gaze duration (Just & Carpenter, 1980), defined as the sum of the durations of all fixations on a word when it is fixated for the first time, thus excluding regressions to it. For word "n", gaze duration is obtained by adding all the fixations that the word receives before the gaze moves to word "n+1". This metric only takes into account cases in which a word is fixated while reading follows its regular course. For this reason, fixations on words that were skipped during the first pass and later reached by a regression are not taken into account. 
Other fixation-based metrics are the first fixation duration (Inhoff, 1984) and the single fixation duration, i.e. the fixation duration for words that receive exactly one first-pass fixation (Rayner, Sereno, & Raney, 1996). It is possible to obtain metrics relating to the second or third fixation on a word; however, since words are usually fixated only once, these measures are used less often than first fixation duration. Summing the durations of all fixations on a word, including those resulting from regressions, gives the total fixation time. These measures indicate whether and for how long a reader fixates a specific word. However, for a complete picture of eye movements it is also necessary to know the probability that a word is fixated during the first reading of the text (first pass), the probability that it is skipped, and the probability that it is first skipped and then read following a regression. The measures that capture the passes following the first one (number and duration of fixations, number of regressions, length of saccades), together with the total reading time, are called late or "deferred" measures. While immediate (early) measures account for the reader's initial analysis of the sentence, deferred measures relate to the process of global re-analysis of the sentence. More recent research suggests that syntactic and semantic re-analysis processes only cause regressions in the region of the anomaly and do not influence total reading times (Clifton, Staub & Rayner, 2007). Re-analysis is necessary when an individual realizes that the first interpretation given to a sentence is incorrect. While immediate measures refer to the words or sentences currently being processed, deferred measures can also capture effects on the processing of words that were previously fixated, recognized and stored in working memory, and that at some point are reconsidered, integrated and reinterpreted in relation to the overall meaning of the text (Pickering & Frisson, 2001). However, research on eye movements and reading does not end with the identification of such metrics. The data obtained from the recordings acquire meaning only within a model capable of explaining the reader's cognitive and physiological processes. The next paragraph will be devoted to the description of the leading models elaborated in an attempt to explain the ocular behaviour typical of reading.
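The word-based measures described above can be made concrete with a short sketch. It assumes that fixations have already been assigned to words and are given in temporal order as (word index, duration in ms) pairs; the function name and the returned keys are hypothetical. A word is treated as skipped in first pass when a word further to the right was fixated before it.

```python
from typing import Dict, List, Optional, Tuple


def word_measures(fixations: List[Tuple[int, float]], word: int) -> Dict[str, Optional[float]]:
    """Compute first fixation duration, gaze duration and total time for one word."""
    first_fixation: Optional[float] = None
    gaze = 0.0
    total = 0.0
    furthest = -1          # rightmost word index fixated so far
    in_first_pass = False  # True while the eyes stay on the word during first pass
    for index, duration in fixations:
        if index == word:
            total += duration
            # First pass starts only if no word to the right was fixated earlier.
            if not in_first_pass and furthest < word:
                in_first_pass = True
                first_fixation = duration
            if in_first_pass:
                gaze += duration
        else:
            in_first_pass = False  # leaving the word ends its first pass
        furthest = max(furthest, index)
    skipped = first_fixation is None
    return {
        "first_fixation_duration": first_fixation,
        "gaze_duration": None if skipped else gaze,
        "total_time": total,
        "skipped_in_first_pass": skipped,
    }
```

For example, with the sequence `[(0, 200), (1, 180), (1, 120), (3, 250), (2, 210), (3, 190)]`, word 1 has a gaze duration equal to the sum of its two consecutive fixations, whereas word 2 is counted as skipped in first pass and contributes only to total time.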

2.3.5 Metrics and text features

In reading, the typographical and textual characteristics of words strongly influence saccades and fixations. For example, common nouns and verbs are fixated about 85% of the time, while function words (prepositions and conjunctions) are fixated about 35% of the time (Carpenter & Just, 1983; Rayner & Duffy, 1988). Furthermore, words of 2-3 letters are fixated only about 25% of the time, whereas longer words are almost always fixated. More generally, as the text becomes more conceptually complex, fixation durations increase, saccade length decreases and regressions become more frequent (Rayner & Pollatsek, 1989). Print quality (font variation), line length and letter spacing can also influence eye movements (Morrison & Inhoff, 1981). A crucial aspect is that eye movement measures can be used to infer the moment-by-moment cognitive processes of reading (Just & Carpenter, 1980; McConkie, Hogaboam, Wolverton, Zola, & Lucas, 1979; Rayner, 1978; Rayner, Sereno, Morris, Schmauder, & Clifton, 1989). Modern eye-trackers make it possible to observe eye movements without compromising the ecological validity of the recordings, and thus to study ocular behaviour in the most varied situations of daily life.

2.4 Advantages and limitations of eye-tracking

In summary, the accuracy of the recordings, the low intrusiveness and the ecological validity are the main advantages of eye-tracking. The eye-trackers currently available can provide data on the position of the eyes, pupil diameter, the presence of the subject within the operating range of the device and the subject's distance from it. It is reasonable to envisage the integration of this technology into commonly used devices such as personal computers, tablets or smartphones, in order to obtain qualitative and quantitative data without interfering with users' habits. Despite the progress achieved so far, there are still some limitations related to the accuracy of the recordings and the commercial availability of eye-tracking devices. A significant limitation is the noise generated by head movements, blinks or other factors that produce reflections the device does not recognize; indeed, although only in a few cases, devices that use infrared light have problems with users who wear glasses, or with users whose eyes have a particular anatomical shape (e.g., almond-shaped eyes). During a task, eye movement recordings can also be subject to two types of error: one due to poor precision of the instrument, which produces a dispersion of the gaze points around the actual fixation maintained by the subject (variability error), and one due to poor accuracy, which shifts the average position of the gaze points away from the actual fixation maintained by the subject (systematic error). Another limitation is the price of the devices on the market (higher than 20,000 euros), which puts them out of reach of universities or research institutions that cannot obtain substantial public or private funding. The high price of eye-tracking devices stems from the cost of components (high-resolution cameras and high-quality lenses) and especially from the small market that these devices have had so far.
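The distinction between the two types of error can be illustrated with a small numerical sketch. Given the gaze points recorded while a subject fixates a known target, the systematic error is estimated here as the distance between the mean gaze position and the target, and the variability error as the root-mean-square distance of the gaze points from their own mean. These particular estimators and the function name are assumptions chosen for illustration; other definitions are in use.

```python
import math
from typing import List, Tuple


def systematic_and_variability_error(gaze_points: List[Tuple[float, float]],
                                     target: Tuple[float, float]) -> Tuple[float, float]:
    """Return (systematic error, variability error) in the units of the input."""
    n = len(gaze_points)
    mean_x = sum(x for x, _ in gaze_points) / n
    mean_y = sum(y for _, y in gaze_points) / n
    # Systematic error: offset of the average gaze position from the target.
    systematic = math.hypot(mean_x - target[0], mean_y - target[1])
    # Variability error: RMS distance of the samples from their own mean.
    variability = math.sqrt(
        sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in gaze_points) / n
    )
    return systematic, variability
```

A recording can therefore be tightly clustered (low variability error) and still be displaced from the true fixation point (high systematic error), or the reverse.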

Section three

As a result of the spread of digital devices, new methods and tools for analysing the cognitive processes that allow the reader to understand the meaning of a text have enriched the study of reading. Since the 1980s, many studies have investigated the legibility of text on digital displays, identifying pros and cons and suggesting new ways to improve text comprehension and reading speed (among those already mentioned: Juola, 1988; Kruk & Muter, 1984; Muter, 1996; Muter, Latremouille, Treurniet & Beam, 1982). Below, some paradigms used to study reading are presented, with particular attention to the moving-window, the trailing-mask, cumulative reading and Rapid Serial Visual Presentation (RSVP).

3.1 Reading study paradigms

In the moving-window paradigm (figure 3.1.1), most of the text is masked, except for a moving area, or window, surrounding the reader's fixation point. With each eye movement, a different part of the text is masked, so that normal text is available only within the region of the window. The logic of experiments using this paradigm is to vary the amount of information available to the reader and to determine how large the window must be for readers to read normally. Conversely, it is also possible to determine how small the window can be before reading is disrupted. In these experiments, the text typically appears unaltered inside the window, whereas letters outside the window are replaced. Research using this paradigm has shown that skilled readers of English and other alphabetic writing systems process information within a region (the perceptual span) extending 3-4 letters to the left of fixation (McConkie & Rayner, 1976; Rayner, Well, & Pollatsek, 1980; Underwood & McConkie, 1985) and about 15 letters to the right of fixation (McConkie & Rayner, 1975; Rayner & Bertera, 1979; Rayner, Well, Pollatsek, & Bertera, 1982; Underwood & McConkie, 1985; Underwood & Zola, 1986).

Figure 3.1.1. Representation of the moving-window. From top to bottom, reading proceeds as the x's are replaced by the corresponding letters. Only one part of the text is visible at any time.
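The logic of the moving-window display can be sketched in a few lines. Given the character position of the current fixation, the example leaves the text intact inside an asymmetric window and replaces every letter outside it with an x. The default window size (4 characters to the left, 15 to the right) mirrors the perceptual span reported above; preserving the spaces between words is an assumption of this sketch, since studies differ on this point.

```python
def moving_window(text: str, fixation: int, left: int = 4, right: int = 15,
                  mask: str = "x") -> str:
    """Return the display for a fixation at character index `fixation`.

    Characters from `fixation - left` to `fixation + right` (inclusive)
    are shown; all other non-space characters are replaced by `mask`.
    """
    start = max(0, fixation - left)
    end = min(len(text), fixation + right + 1)
    display = []
    for i, ch in enumerate(text):
        if start <= i < end or ch == " ":
            display.append(ch)      # inside the window, or a preserved space
        else:
            display.append(mask)    # outside the window: letter is masked
    return "".join(display)
```

In a real experiment this function would be called on every gaze sample, so that the window follows the eyes; shrinking `left` and `right` until reading slows down is the manipulation used to estimate the perceptual span.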

"Parafoveal magnification" was introduced as a variant of the moving-window paradigm. To compensate for the difference between foveal and parafoveal visual acuity, it increases the size of the text presented in the parafovea. Finally, in moving-mask experiments (Fine & Rubin, 1999a, 1999b, 1999c; Rayner & Bertera, 1979; Rayner, Inhoff, Morrison, Slowiaczek, & Bertera, 1981) a visual mask moves in synchrony with the eyes, covering the letters at the centre of vision on each fixation. The moving-mask paradigm thus creates an artificial central scotoma that eliminates the use of foveal vision, making reading difficult, if not impossible. The trailing-mask technique (figure 3.1.2) is a modification of the more classical moving-window. It differs from the moving window in two main respects. The first is that subjects, before testing, can read the whole sentence. The second concerns the displayed text segments: the displayed words are not hidden and therefore remain available until the last word is displayed.

Figure 3.1.2. Representation of the trailing-mask paradigm. From top to bottom, the lines show the subsequent views of a sentence chosen as an example. Asterisks represent the positions of fixations.

This technique has also been used to study the role of regressions in reading. Schotter, Tran and Rayner (2014) tried to eliminate regressions during reading by means of the trailing-mask paradigm, a modification of the more classical moving-window technique (Just, Carpenter & Woolley, 1982; Rayner & Bertera, 1979). The authors referred to the studies of Masson (1983) and Booth and Weger (2013), which describe regressions as a natural part of reading, fundamental for obtaining additional visual input and an overview of the text. Their aim was to understand how regressions could be useful in reading tasks. The sentences were constructed to be ambiguous and, at the end of each presentation, subjects had to answer true/false comprehension questions. Reading with the trailing mask did not prove beneficial: the impossibility of making regressions negatively affected comprehension. Regressions thus appear to be an essential part of reading, as they allow the reader to access further information about the text.

In cumulative reading, subjects read the text in the usual way, from left to right. By pressing a button, subjects make the words of each line appear, one after the other, on an initially empty display. After reading the first word, the subject presses the button again to make the second one appear, while the first one remains visible in its original position; this process continues for the whole text. In a 1982 study, Just, Carpenter and Woolley compared three different text presentation techniques: cumulative reading, moving-window and stationary-window; the condition that received the most attention was the moving-window. In all three conditions, subjects read a text of about 130 characters presented on screen. In the cumulative reading condition, subjects read the text from left to right, pressing a button to make the words of each line appear one after the other. When the subject had read the last word, the whole text disappeared, and the subject then answered the comprehension questionnaire. In the second condition, subjects read the text through the moving-window technique. Participants decided, by pressing a button, when to start the presentation of each sentence. In the third condition, called stationary-window, all words were presented in the middle of the screen, with each new word replacing the previous one. This technique is also called Rapid Serial Visual Presentation.

In all conditions, subjects were asked to read the texts naturally, without memorizing them, and then to answer oral comprehension questions about each paragraph read. Overall, the results were similar across conditions. Readers dwelt longer on longer words, on less frequent words, and on words that introduced a new topic or ended a sentence. Participants in the moving-window condition could not move freely through the text as in the cumulative reading condition, and they showed longer reading times than in the other two conditions. This result was unexpected since the moving window allowed no regressions or re-readings. Another significant result of the study was that the moving-window condition produced reading times and eye movements similar to those observed in the traditional reading of a text. Finally, in the moving-window condition, recall improved for content such as definitions, causes and consequences. In contrast, memory and comprehension scores for detailed information were lower than in the cumulative reading and stationary-window conditions. To summarise, in the stationary-window condition the text is displayed one word at a time in the centre of the screen, with each new word replacing the previous one. This technique of text presentation is also known as Rapid Serial Visual Presentation (RSVP).
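The difference between the three presentation conditions can be summarised with a sketch that generates, for a list of words, the successive displays a reader would see after each button press. The condition labels follow the text; the use of dashes for words not currently visible in the moving-window condition is an assumption of this illustration.

```python
from typing import List


def presentation_frames(words: List[str], condition: str) -> List[str]:
    """Return the sequence of displays, one per button press."""
    frames = []
    for i, word in enumerate(words):
        if condition == "cumulative":
            # Each new word is added; earlier words stay visible.
            frames.append(" ".join(words[: i + 1]))
        elif condition == "moving-window":
            # Only the current word is visible, in its original position.
            masked = ["-" * len(w) for w in words]
            masked[i] = word
            frames.append(" ".join(masked))
        elif condition == "stationary-window":
            # Each word replaces the previous one at the same location (RSVP).
            frames.append(word)
        else:
            raise ValueError(f"unknown condition: {condition}")
    return frames
```

For the words "The", "cat", "sat", the cumulative condition ends with the whole sentence on screen, the moving-window condition shows one word at a time in place, and the stationary-window condition shows the single words in succession at a fixed location.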

3.2 Rapid Serial Visual Presentation

The RSVP format consists of showing text as a sequential stream of words and was initially used as a controlled method for studying reading-comprehension processes (e.g., Forster, 1970; Forster & Ryder, 1971; Holmes & Forster, 1972; Potter, 1984).

RSVP has been studied for its promise of reducing the cognitive load of oculomotor control by decreasing the number of saccades and eliminating regressions to previous words or sentences. This point is quite controversial, however, because other scholars stress the importance of regressions for effective reading comprehension (for example, Schotter, Tran and Rayner, 2014). Another benefit may be a reduction in the negative influence of visual crowding, the deleterious effect of clutter on object recognition (Whitney & Levi, 2011), which is known to affect reading speed (Pelli & Tillman, 2008). Other scholars have also pointed out that RSVP-based techniques could be helpful for specific populations of readers, such as novices, visually impaired readers, older adults, and people with dyslexia (Castelhano & Muter, 2001; Chen, 1986; Lemarié, Eyrolle & Cellier, 2008; Potter, 1984; Williamson, Muter, & Kruk, 1986). A recent development of RSVP, Spritz, has been proposed to increase reading speed by further reducing eye movements. Whereas conventional RSVP displays words either left-aligned or centered, Spritz aligns each word on its Optimal Recognition Point (ORP), in an attempt to speed word recognition while reducing the need for eye movements (figure 3.2.1). The ORP letter is highlighted in red and serves as a cue for the reader to keep the eyes fixed on it, since its position on the display is the same for all words. The application thus aims at eliminating any voluntary eye movement, unlike the traditional implementation of RSVP. One last feature of Spritz is that it slows down the presentation for certain words and punctuation marks: the interval between words is not constant but varies according to the length of the word and, for the last word of a sentence, the length of the sentence.

Figure 3.2.1. Rapid Serial Visual Presentation (on the left) and Spritz implementing the Optimal Recognition Point (on the right).
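The timing and alignment principles described above can be sketched as follows. The base duration of each word follows directly from the presentation rate (60,000 ms divided by the words per minute). Everything else in the example is a hypothetical stand-in: the actual rules used by Spritz to choose the recognition point and to lengthen the display of certain words are not specified in this text, so the pivot heuristic, the length threshold and the multipliers below are illustrative assumptions only.

```python
from typing import List, Tuple


def rsvp_schedule(text: str, wpm: int = 250, long_word: int = 8,
                  long_factor: float = 1.5,
                  sentence_end_factor: float = 2.0) -> List[Tuple[str, float]]:
    """Return (word, display duration in ms) pairs for an RSVP stream."""
    base_ms = 60000.0 / wpm  # one word per 60000 / wpm milliseconds
    schedule = []
    for token in text.split():
        duration = base_ms
        if len(token.strip(".,;:!?")) >= long_word:
            duration *= long_factor          # hypothetical slowdown for long words
        if token[-1] in ".!?":
            duration *= sentence_end_factor  # hypothetical pause at sentence end
        schedule.append((token, duration))
    return schedule


def pivot_index(word: str) -> int:
    """Hypothetical recognition point: a letter left of the word's centre."""
    return max(0, (len(word) - 1) // 3)


def align_on_pivot(word: str, pivot_column: int = 5) -> str:
    """Pad the word so that its pivot letter falls on a fixed screen column."""
    padding = max(0, pivot_column - pivot_index(word))
    return " " * padding + word
```

At 250 wpm the base duration is 240 ms per word, and at 500 wpm it is 120 ms, which makes explicit how quickly the time available for encoding each word shrinks as the presentation rate increases.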

3.3 Experimental results about the RSVP

Studies that have directly compared reading comprehension across display methods have shown that RSVP is superior to the leading format (i.e., text moving from right to left in discrete jumps of one or more characters at a time; Granaas, McKay, Laham, Hurt & Juola, 1984), while Juola, Ward, and McNamara (1982) found that the comprehension of short paragraphs was about the same with RSVP and with a standard page format at comparable reading rates. However, when reading rates are increased by shortening the duration of each display in the RSVP format, reading comprehension decreases steadily once the presentation rate exceeds 250 wpm (e.g., Juola, Ward, & McNamara, 1982; Just & Carpenter, 1980; Potter, 1984). A possible explanation for these results is that eye movements are not a resource-consuming activity (cf. Schneps, Thomson, Sonnert, et al., 2013; Schneps, Thomson, Chen, Sonnert, & Pomplun, 2013), but rather a necessary component of reading comprehension. Indeed, eye movements may even reduce workload rather than increase it (Rayner, Schotter, Masson, Potter & Treiman, 2016), thanks to the possibility of rereading parts of the text (regressions) or of obtaining advance information about upcoming words (parafoveal preview; see Schotter, 2018; Schotter, Angele & Rayner, 2012). If so, RSVP would make reading more tiring, increasing workload. This assumption is consistent with the findings of Ricciardi and Di Nocera (2017): in their study, readers reported a higher perceived mental workload when reading with RSVP than when reading with a traditional layout. Similar results have also been found using Spritz instead of classic RSVP. Boo and Conklin (2015) tested the efficacy of Spritz by comparing detailed and inferential comprehension after reading texts on paper vs. on a computer display using RSVP. The texts were no longer than 500 words and were presented in English at imposed reading rates of 500 and 1000 wpm to native and nonnative speakers. 
Results showed that presenting text one word at a time did not impair inferential comprehension for either native or nonnative speakers when subjects read at 500 wpm, supporting the efficacy of Spritz. However, it is worth noting that the texts they used were no longer than 500 words, so the reader was engaged for one minute or less, and only five questions were used to assess comprehension. With a different approach, Benedetto, Carbone, Pedrotti, Le Fevre, Bey, and Baccino (2015) used a chapter of the novel “1984” by George Orwell and tested the effects of the two reading modalities on comprehension, visual fatigue, performance, task load, and ocular behavior. Results showed no differences between the two reading modalities in inferential comprehension, whereas literal comprehension was lower for Spritz than for traditional reading at a presentation rate of 250 wpm. The authors interpreted these results as suggesting that inferential comprehension can compensate for the loss of detailed information at normal reading rates in the RSVP format. The questionnaire they used included fifteen literal and fifteen inferential questions, so it was more detailed than that used by Boo and Conklin. Benedetto et al. therefore provided important evidence about the efficacy of Spritz, although their results are limited to a relatively slow reading rate, no faster than the average speed of a reader who reads a text with the aim of learning from it (Baccino, 2004). Attempting to extend the findings of Benedetto and colleagues (2015), Ricciardi and Di Nocera (2017) designed a study in which reading an online magazine article was tested at different RSVP rates, and the results were compared with those of traditional reading. The results showed no differences in (inferential) comprehension between the traditional reading format and RSVP at a rate of 250 wpm, whereas reading with RSVP at a rate of 450 wpm resulted in poorer text comprehension. This result partially confirms the study of Benedetto et al., because no differences were found at a slow reading rate, and differs from that of Boo and Conklin, who found no differences in inferential comprehension (what they named “gist”) between Spritz and traditional reading. This discrepancy could be due to differences in the length of the text used and in the number and difficulty of the comprehension questions asked. In their study, Ricciardi and Di Nocera (2017) also used a subjective measure and the dual-task paradigm to assess readers' mental workload. They found that: “presentation speed does not affect performance in a dual task condition, thus suggesting that the effect on comprehension is not due to the attention devoted to the reading task, but most probably to difficulties in encoding (Potter, Kroll & Harris, 1980)” (Ricciardi & Di Nocera, 2017, page 14). In a subsequent study, Di Nocera, Ricciardi and Juola (2018) tested additional reading speeds via RSVP and found a difference in reading comprehension when the presentation rate was increased above 350 wpm.
In this study, the authors used the same text (about five thousand words) and the same inferential questionnaire to assess reading comprehension. One aspect that characterizes these results and distinguishes them from the study of Boo and Conklin (2015) is precisely the length of the experimental text. Thus, a similar design was replicated using a shorter text (about 1,300 words) and RSVP conditions of 250, 350 and 450 wpm (Ricciardi, Juola & Di Nocera, submitted). The results replicated the comprehension threshold at 350 wpm, suggesting a perceptual explanation of the comprehension decrement. Indeed, the authors suggested that, with RSVP at high presentation rates, the reader does not have the time to integrate the information and organize a general understanding of the text.
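
The presentation rates discussed above translate directly into exposure times per word and into total reading times. The following Python sketch makes that arithmetic explicit. It assumes that every word is shown for the same duration, which is a simplification and not necessarily how the cited studies or Spritz timed their displays; the function names are illustrative only.

```python
def word_duration_ms(wpm: float) -> float:
    """Time each word stays on screen at a fixed RSVP rate, in milliseconds.

    Assumption: all words get the same exposure time.
    """
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return 60_000.0 / wpm


def reading_time_s(n_words: int, wpm: float) -> float:
    """Total presentation time in seconds for n_words shown one at a time."""
    return n_words * word_duration_ms(wpm) / 1000.0


if __name__ == "__main__":
    # Rates mentioned in the studies reviewed in this section.
    for rate in (250, 350, 450, 500):
        print(f"{rate} wpm -> {word_duration_ms(rate):.0f} ms per word")

    # A 500-word text at 500 wpm lasts one minute.
    print(reading_time_s(500, 500), "s")
    # A text of about 5,000 words at 250 wpm lasts about twenty minutes.
    print(reading_time_s(5000, 250) / 60, "min")
```

Under this assumption, moving from 250 to 450 wpm cuts the exposure of each word from 240 ms to roughly 133 ms, which gives a concrete sense of how little time remains for encoding at the faster rates.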

These results seem to discourage the use of RSVP as a way to read faster. However, the ability of this technique to present text effectively at an average reading speed is not without interest. The readability of a text is essential when information must be displayed to an operator in a small amount of space, as may happen with head-up displays, for example. In these cases, the presented text competes with the rest of the field of view for visibility, and finding an alternative way of presenting the text may be essential. Another question to consider is the possibility of presenting a text in a tiny region of the display. An example is a complex operating context where the amount of information to be displayed on the screen is large and the space available for text messages is limited. In such cases, RSVP may be useful to maintain an adequate letter size in a small portion of space. An alternative technique to RSVP is called leading (Granaas et al., 1984), in which text shifts "from right to left across the display so that new information comes in more-or-less continuously from the right side, while older information disappears off the left side" (Juola, Tiritoglu & Pleunis, 1995, p. 1). In a 1995 study, Juola, Tiritoglu and Pleunis found that subjects reading experimental sentences were more accurate with RSVP than with the leading technique. This result confirmed what had previously been found by Granaas and colleagues (1984).
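
The difference between the two display techniques can be illustrated with a minimal Python sketch that generates the sequence of frames shown in a small display region. This is only an illustration under stated assumptions: the window width, the step size, and the function names are hypothetical and do not reproduce the software used in the cited studies.

```python
def rsvp_frames(text: str) -> list[str]:
    """One frame per word, all shown in the same display position."""
    return text.split()


def leading_frames(text: str, window: int, step: int = 1) -> list[str]:
    """Frames of a fixed-width window scrolling over the text.

    The text enters from the right edge and leaves on the left,
    advancing `step` characters per frame (discrete jumps).
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # Blank padding lets the text scroll in from the right and out on the left.
    padded = " " * window + text + " " * window
    return [padded[i:i + window] for i in range(0, len(text) + window + 1, step)]


if __name__ == "__main__":
    sentence = "reading without eye movements"
    print(rsvp_frames(sentence))
    for frame in leading_frames(sentence, window=12, step=3):
        print(f"[{frame}]")
```

In both cases the text occupies a fixed, small region of the screen; what changes is whether the reader sees whole words in place (RSVP) or a moving window of characters (leading).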

