Submitted for the proceedings of the First International Conference on Document Design, Tilburg, December 17 and 18, 1998
Published in a slightly edited version in: Document Design, vol. 2:1, pp. 76-88, 2001.
In an electronic publication environment, a scientific article is structured more effectively and efficiently if it is presented as a coherent collection of well-characterised and explicitly linked modules, rather than as a traditional linear essay. In a linear printed article, the abstract primarily fulfils a selection and substitution function. In a network of modular electronic articles, the abstract is primarily an orientation tool providing insight into the flow of the discourse. In order to fulfil this function, the abstract must provide a balanced representation that explicitly refers, in the informative mode, to the main stages in the problem-solving process. Furthermore, it should contain hypertext links that connect phrases of the abstract to the related modules, enabling the reader to switch smoothly between the abstract and its source text. Each link should carry a label that informs the reader about the specific relationship between the phrase at hand and the module referred to.
Electronic publishing has gradually gained a significant position in the scientific communication process. By now, the leading publishing houses have supplemented most of their printed journals with on-line versions on the World Wide Web. There is also a growing number of 'autonomous' electronic journals that are not based on a printed version. The emergence of electronic journals derives from several attractive opportunities of on-line publishing, such as publication speed, interactivity and the possibility to include 'unprintable' information. The current literature on electronic publishing is mainly concerned with issues relating to logistics and storage of electronic documents. As yet, little attention has been paid to the question of the extent to which an electronic publication environment calls for a reconsideration of the conventional design of the 'genre' of the scientific article. The standard linear structure of a scientific article has been shaped in the tradition of print publishing and seems less suitable for the entirely different features of an electronic publication environment. It is therefore worth investigating how these features can be utilised more effectively.
In this paper, I shall focus on the implications of an electronic publication environment for one component of the scientific article: the abstract. Because scientists are confronted with a rapidly growing amount of information, the abstract plays an increasingly important role in the scientific communication process. This trend is reflected in several recent initiatives intended to regulate the presentation of abstracts in printed journals, in order to prevent them from remaining, in the words of Rowley (1988), "an ill-planned afterthought" (p. 127).
The object of this paper is to point out in what respects the composition of 'traditional' printed journal abstracts is to be reconsidered when dealing with abstracts in an electronic network of non-linear scientific articles. First, I shall briefly describe the modular structure we propose for electronic articles in the field of experimental physics. Then, I shall discuss the main functions of the printed journal abstract and argue that there is a different order of priority of these functions for abstracts in a modular electronic environment. Finally, I shall consider some important composition requirements for these abstracts, taking both their primary function and the specific features of an electronic environment into account.
The printed article is characterised by a linear structure: an essay that leads the reader from an introductory section to a concluding section. Elsewhere, we have argued that for electronic articles, in contrast, a modular structure is more appropriate (Harmsze, Kircz & Van der Tol, 1996; Kircz, 1998; Harmsze & Kircz, 1998). A modular article is structured as a coherent set of interconnected 'modules' within a broader network of modules of related articles. Each module is a self-contained, uniquely characterised unit that focuses on a specific subject. A module can be retrieved and consulted separately, as well as consulted in conjunction with related modules in any desired order.
While a linear structure is most appropriate for sequential reading, a modular structure lends itself better to selective reading. This is useful because scientists mostly restrict themselves to searching and reading particular parts of an article, rather than reading the complete story (Bazerman, 1985; Dillon, Richardson & McKnight, 1989; Berkenkotter & Huckin, 1995). Therefore, a modular structure is probably more in accordance with their reading behaviour. For those who wish to read a modular article in its entirety, a sequential 'reading path' can be specified between all its modules.
The object of our research project 'Communication in Physics' is to specify a modular structure for electronic articles in the field of experimental physics. We aim to determine what types of modules are useful and what types of relations are a useful basis for hypertext links expressing the coherence between the modules. On the basis of this structure, we shall formulate concrete instructions to authors. We have taken the traditional structure of the empirical research article as a starting point. The standard article is divided into the sections 'Introduction', 'Methods', 'Results', 'Discussion' and 'Conclusions' ('IMRDC'). There are several reasons why 'modularising' the linear research article does not simply amount to breaking it down into these standard sections. First, the composition of these sections is subservient to the linear ordering of the information. Second, although the division into sections may seem straightforward at first glance, on closer inspection it is not always predictable what information belongs to which section. For a modular structure to be adequate, the modules should be both more predictable and more transparent.
To determine how the standard sections of a scientific article can be replaced by self-contained modules, a corpus of printed articles has been analysed. The analysis was aimed at reshaping the articles into a modular structure that can be a basis for modelling future electronic articles. Figure 1 gives an overview of the main division into modules we have arrived at.
[FIGURE 1]
Figure 1. The main division into modules for a modular electronic article. The dashed line indicates the complete sequential path and the dotted line a possible essay-type sequential path. As an example, the module 'Theoretical methods' is divided into optional lower-level modules by using a domain-oriented characterisation.
The main division is based on the problem-solving pattern of the research. Strongly related modules are combined into a higher level, 'complex' module; the modules 'Experimental Methods', 'Theoretical Methods' and 'Numerical Methods', for instance, constitute the complex module 'Methods'. The main division is further specified into lower-level modules by using a domain-oriented characterisation. As an example, the module 'Theoretical Methods' is depicted in figure 1 as a complex module consisting of two lower-level modules, each focussing on a different theoretical model. All modules are also characterised by the author names and other bibliographical data.
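The nesting of elementary modules inside complex modules can be illustrated with a small data structure. The following is only a sketch under stated assumptions: the class name `Module`, its field names, the identification codes and the two model titles are hypothetical and are not taken from the project itself.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """One self-contained unit of a modular article (all names are illustrative)."""
    identifier: str                 # hypothetical unique identification code
    title: str
    authors: tuple = ()             # bibliographical characterisation
    constituents: list = field(default_factory=list)  # non-empty for a 'complex' module

    def is_complex(self) -> bool:
        return bool(self.constituents)

    def elementary_modules(self):
        """Yield the modules that have no constituents, depth first."""
        if not self.constituents:
            yield self
        for child in self.constituents:
            yield from child.elementary_modules()


# 'Theoretical Methods' as a complex module with two lower-level modules,
# each focussing on a different (here unnamed) theoretical model.
theoretical = Module("m2b", "Theoretical Methods", constituents=[
    Module("m2b-i", "Theoretical model A"),
    Module("m2b-ii", "Theoretical model B"),
])
methods = Module("m2", "Methods", constituents=[
    Module("m2a", "Experimental Methods"),
    theoretical,
    Module("m2c", "Numerical Methods"),
])

print([m.title for m in methods.elementary_modules()])
# ['Experimental Methods', 'Theoretical model A', 'Theoretical model B', 'Numerical Methods']
```

Because every module carries its own identifier and characterisation, it can be retrieved on its own or as part of the complex module that contains it.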
[FIGURE 2]
Figure 2. A labelled hypertext link between a source module S and a target module T. In this example, the two italicised characterisations refer to organisational relations; the third characterisation refers to a discourse relation. The target module T is a constituent module of the (complex) source module S, it is the next step in the essay-type path and it provides more details.
Although modules are self-contained units, they are obviously not independent of their context. The interdependence of the modules is expressed in hypertext links representing relations between these modules, both inside and outside the article. These relations are classified into two types: organisational relations and scientific discourse relations. As exemplified in figure 2, each hypertext link carries a label with a characterisation of the relations it expresses. This enables the reader to make well-considered decisions when determining his own route through the information network.
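A labelled link of the kind shown in figure 2 could be represented roughly as follows. This is a minimal sketch, not the project's actual format: the class `LabelledLink`, its fields and the rendering of the label are assumptions made for illustration. The three relations are those named in the caption of figure 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelledLink:
    """A hypertext link whose label characterises the relations it expresses."""
    source: str                  # identifier of the source module S
    target: str                  # identifier of the target module T
    organisational: tuple = ()   # organisational relations
    discourse: tuple = ()        # scientific discourse relations

    def label(self) -> str:
        """Render the label shown to the reader (the layout is hypothetical)."""
        parts = [f"{self.source} -> {self.target}"]
        if self.organisational:
            parts.append("organisational: " + ", ".join(self.organisational))
        if self.discourse:
            parts.append("discourse: " + ", ".join(self.discourse))
        return " | ".join(parts)


link = LabelledLink(
    source="S",
    target="T",
    organisational=("T is a constituent of S", "next step in essay-type path"),
    discourse=("T provides more details",),
)
print(link.label())
```

Keeping the two relation types in separate fields reflects the classification above and lets a reader or interface filter links by either type.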
For readers who prefer sequential reading, two kinds of reading paths are made explicit. The complete reading path leads through all components of the article in a standard sequence. To maximise readability, authors should also specify an essay-type reading path, leading through the components in their smoothest sequence. This path bypasses the components that are usually not relevant for sequential reading, such as the raw data.
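The difference between the two reading paths can be sketched in a few lines. The example assumes, for simplicity, that the essay-type path is the complete path with some components left out; in practice the author may also choose a different order. The component list, the position of 'Raw data' and the flag `in_essay_path` are illustrative assumptions.

```python
from typing import NamedTuple


class Component(NamedTuple):
    title: str
    in_essay_path: bool   # set by the author (hypothetical flag)


# Complete reading path: all components in a standard sequence.
COMPLETE_PATH = [
    Component("Positioning", True),
    Component("Methods", True),
    Component("Raw data", False),      # usually not relevant for sequential reading
    Component("Results", True),
    Component("Interpretation", True),
    Component("Outcome", True),
]


def essay_path(components):
    """Return the titles on the essay-type path, bypassing flagged components."""
    return [c.title for c in components if c.in_essay_path]


print(essay_path(COMPLETE_PATH))
# ['Positioning', 'Methods', 'Results', 'Interpretation', 'Outcome']
```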
The overall coherence of the article is expressed in two elementary modules: the organisational coherence is visualised in a map of contents; the coherence of the scientific discourse is expressed in an abstract. Both elementary modules are part of the complex module 'Meta-Information' (see figure 1). This module serves as a 'linchpin' which encloses all information about the article. It further comprises all standard meta-data like the author's name and address, the publication date and an overview of all keywords.
An abstract is a condensed account of an article that enables a reader to digest the contents of the article more efficiently. Abstracts accompany their source text in primary publications; they are also separately included in secondary sources, such as abstract journals and bibliographical databases. Following the literature on abstracting, four main functions of an abstract can be distinguished:
1) selection function
Because a scientist cannot read all articles published in his field of interest, he can use the abstract for deciding which articles he can discard and which articles warrant further inspection.
2) substitution function
In this case, the abstract is aimed at presenting all information that is considered relevant for (some of) its intended readers, exempting them from reading the full document. This can be useful for readers who only want to take note of the main conclusions.
3) orientation function
In this case, the abstract is aimed at supporting those who read (parts of) the source text. It can support them in several stages. Before they start reading the source text, readers can consult the abstract to make themselves familiar with its global structure and contents. During the reading process, they can return to the abstract to regain insight into the broader coherence of the discourse. And afterwards, they can use the abstract as an orientation tool while reflecting upon what has been read.
4) retrieval function
First, an abstract can fulfil a retrieval function by providing a source for indexers. As Ashworth (1973) puts it: "In fact, the ideal abstract from an indexer's point of view is a string of keywords linked into an easily read sentence" (p. 50). An abstract that is stored in an electronic full-text database fulfils a retrieval function in a second, more direct way: in full-text databases, abstracts can be searched by using phrases from the full text. Therefore, the retrieval function requires that the phrases used in the abstract are as predictable as possible for searchers in the database for whom this abstract might be relevant.
To meet the variety of purposes of a heterogeneous readership, an abstract usually has to fulfil all four functions. However, one abstract cannot simultaneously fulfil each of these functions optimally. In particular, there is a tension between the searcher-oriented retrieval function and the three other, reader-oriented functions. Fidel (1986) has pointed out that the specific requirements stemming from the retrieval function are often incompatible with the requirements for a brief, clear and acceptable writing style. In addition, there is also a certain tension between the three reader-oriented functions. An orientation function directed to readers with a general interest in the topic, for instance, requires a different and more comprehensive account of the article than a substitution function, directed to readers who are only interested in the main conclusions. As an abstract cannot simultaneously fulfil all functions optimally, it should be clear which of these functions has priority.
What is the usual order of priority in the case of printed journal abstracts? To start with, the retrieval function clearly has lower priority than in the case of abstracts stored in databases. As for the three remaining, reader-oriented functions of printed journal abstracts, the selection and substitution functions have priority over the orientation function. Orientation is already the primary function of the 'Introduction' section, and the author can also include a summary at the end of the article "to complete the orientation of a reader who has studied the preceding text" (National Information Standards Organization, 1997). Explorative studies into the reading behaviour of scientists indicate that they indeed use the abstract primarily as a selection tool or as a substitute for the article (Bazerman, 1985; Berkenkotter & Huckin, 1995).
When dealing with abstracts of modular electronic articles, the order of priority is different: for these abstracts, the orientation function is pivotal. A potential drawback of a modular environment is that a reader might easily 'get lost' if he has no broader overview of the coherence of the information distributed over the modules. He can reach a module from many directions, so he might not have immediate insight into the status of that particular module within the broader context of the article. Therefore, the module 'Meta-Information' contains both orientation tools mentioned above. When the reader has questions concerning the organisational coherence, for instance concerning the types of modules the article contains or the route of the essay-type path, he can consult the (visual) map of contents; when the reader has questions concerning the overall coherence of the discourse, he can consult the (textual) abstract. Each module gives immediate access to both orientation tools.
Conversely, the selection function of abstracts of modular electronic articles is less significant. To evaluate the relevance of a particular modular article, the reader can also directly consult one of its constituent modules. If, for instance, the relevance of a particular article depends on the methods used in the experiment, the reader can consult the module concerned for a quick relevance judgement. Similarly, if its relevance depends on the main findings, he can consult the module 'Findings'. In other words, in a modular environment the selection function can be fulfilled better through components of the source text itself than in the case of a linear printed document. On the other hand, in a modular article there is, in the absence of an 'Introduction' section, no alternative to the abstract as an orientation tool to gain insight into the overall coherence of the discourse.
At first sight, the retrieval function of the abstract may be considered more important in the case of modular electronic articles than in the case of linear printed articles, because the former are stored in an electronic database, which allows for full-text searching. However, not only the abstract, but all modules of the article are searchable by both their full text and their specific characterisations. The retrieval function of the abstract is therefore not pivotal. For that reason, as soon as observing the retrieval function would affect the readability of the abstract, priority should be given to the reader-oriented functions.
Generally speaking, there are three ways in which abstracts can represent the contents of their source text: they can provide a brief characterisation, a slanted representation or a balanced representation. What type of representation is most adequate depends mainly on the primary function of the abstract and its intended readership. If an abstract is primarily meant to be a selection or substitution tool for a readership consisting of specialists in a confined research area, a brief characterisation may do. Such annotation-like abstracts, also known as 'mini-abstracts' or 'micro-abstracts', often amount to a clarification of the title or a statement of the main conclusions.
If an abstract is directed at readers with a specific background and a particular interest, a slanted representation is most appropriate. A slanted abstract, also known as an 'oriented abstract' or 'special purpose abstract', represents only the information that is considered relevant for a particular interest group, which does not necessarily reflect the main issues and main lines of the source text. These abstracts mainly appear in dedicated secondary sources.
For an abstract that has primarily an orientation function, such as an abstract of a modular article, a balanced representation is most appropriate: the abstract can only fulfil an orientation function if it adequately reflects all the main issues and main lines of its source text. In the case of modular articles, this implies that the abstract should represent each of the main modules 'Positioning', 'Methods', 'Results', 'Interpretation' and 'Outcome'. This does not necessarily mean, however, that they should all have an equal share: if most new information is presented in the module 'Interpretation' and the methodological information is generally known and mainly presented in an external module, then the interpretation can likewise be emphasised in the abstract.
The requirement that abstracts of modular electronic articles be balanced representations will in practice lead to longer abstracts than is usually the case in printed journals. This difference, however, is not as fundamental as it may seem: studies into the quality of printed journal abstracts point out that they are often incomplete (Buxton & Meadows, 1978; Hartley & Benjamin, 1998). Consequently, recent initiatives to regulate the format of printed abstracts also lead to longer texts. A possible obstacle to including longer abstracts in printed journals is space constraint, but in an electronic environment, this is no longer relevant. A valid reason to be cautious with extending the format of abstracts, though, is that an overlong abstract decreases, rather than increases, reading efficiency.
There is a conventional distinction between informative and indicative abstracts. Although a diversity of definitions can be found in the literature on abstracting and summarisation, the leading criterion used for this distinction is the perspective taken. An informative abstract conveys in a reduced form the same message as the source text, as if it were a direct report of the research; an indicative abstract is an external account of what the source text is about. Informative abstracts are usually longer and may (for particular readers) serve as a substitute for the source text; indicative abstracts have a primary selection function.
Both types of abstracts are in some respect also suitable for the fulfilment of an orientation function. Indicative abstracts, describing the main steps in the source text, can serve as an orientation tool to gain insight into the global organisational coherence of the article. Informative abstracts lend themselves more easily to a more profound, content-oriented orientation. In a modular article, the organisational coherence is already dealt with by the map of contents. Therefore, an informative abstract is most suitable.
Figure 3a. The original printed abstract of Los & Delvigne, 1973.
Figure 3b. A rewritten version of the abstract, tailored to an electronic modular environment. The figures in the text represent links to other modules. As an example, six link labels are listed below. The codes in these labels are the unique identification codes of the target modules.
Figure 3a shows an example of an original paper abstract taken from our corpus and figure 3b shows a rewritten version that is tailored to an electronic modular environment. The article reports on experimental research in the field of molecular physics. In the electronic version, each module is provided with a standard navigation bar with icons linking the module (in this case the abstract) to other components of the article.
As for the textual changes, the main difference is that the electronic version refers more explicitly to the successive stages in the problem-solving process. In particular, the purpose of the study and the conclusions are treated more explicitly. Because the main division in modules is also based on the standard stages in a problem-solving process, the modified abstract can, in this way, probably better fulfil its orientation function. The added phrases are, in a slightly modified form, extracted from the source text. The abstract has also been divided into paragraphs; this is a type of revision that has turned out to have an advantageous effect on the readability of abstracts (Hartley, 1994). The changes have caused an increase in the total number of words from 175 to 323.
Another noticeable change is that two small pictures have been included, which depict graphs that play an important role in the article. In printed journal abstracts, visual information is usually left out because of space constraints; in words, these graphs could only have been mentioned indicatively. In an electronic abstract, visually reduced information can be included more easily. Although these graphs are not as fine-grained as in the source text, their shape provides sufficient information for specialised readers to assess the types of results that have been obtained.
To connect components of the abstract to related information in the modular network, hypertext links have been included. The links are represented by icons, which can be hidden if the reader prefers to read (and to print out) a smooth text. The icons are coloured according to the type of main module to which they refer ('Positioning', 'Methods' etc.). Further, the shape of the icon gives information about the 'proximity' of the module referred to: a square refers to a module of the same article, a lozenge refers to a module of a different article within the same research project and a bullet (not present in this abstract) refers to an external publication. An abstract in a modular environment should be linked to at least each main module of its source text.
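The icon conventions and the linking requirement can be stated compactly in code. The sketch below is an illustration only: the names `Shape`, `Link`, `icon_shape` and `unlinked_main_modules`, and the article and project identifiers "A1", "A2" and "P", are hypothetical. It also assumes that the main modules to be covered are the five content modules named earlier.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Shape(Enum):
    SQUARE = "module of the same article"
    LOZENGE = "module of a different article in the same research project"
    BULLET = "external publication"


class Link(NamedTuple):
    main_module: str                # e.g. 'Methods'; determines the icon colour
    target_article: str             # hypothetical article identifier
    target_project: Optional[str]   # None for an external publication


MAIN_MODULES = ("Positioning", "Methods", "Results", "Interpretation", "Outcome")


def icon_shape(link, article, project):
    """Choose the icon shape from the 'proximity' of the target module."""
    if link.target_article == article:
        return Shape.SQUARE
    if link.target_project is not None and link.target_project == project:
        return Shape.LOZENGE
    return Shape.BULLET


def unlinked_main_modules(links, article):
    """List the main modules of this article that the abstract does not link to."""
    linked = {link.main_module for link in links if link.target_article == article}
    return [m for m in MAIN_MODULES if m not in linked]


links = [
    Link("Positioning", "A1", "P"),
    Link("Methods", "A1", "P"),
    Link("Methods", "A2", "P"),     # same project, different article
    Link("Results", "A1", "P"),
    Link("Outcome", "A1", "P"),
]
print([icon_shape(link, "A1", "P").name for link in links])
# ['SQUARE', 'SQUARE', 'LOZENGE', 'SQUARE', 'SQUARE']
print(unlinked_main_modules(links, "A1"))
# ['Interpretation']
```

A check of this kind could support editors who evaluate whether an abstract meets the minimum linking requirement.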
Each link carries a label that pops up as soon as the reader passes the mouse over the icon. The label provides information about the relationship between the phrase at hand and the module referred to. All labels contain a brief clarifying sentence that is formulated in the indicative mode. Furthermore, the labels contain a characterisation of the link based on our typology of organisational relations and scientific discourse relations. The relation that presumably matters most for the reader is capitalised; in the abstract this is by default the organisational relation indicating that the link refers to a 'content module' presenting the scientific discourse ('Positioning' etc.), as opposed to the administrative module 'Meta-Information', of which the abstract is a component. By way of illustration, the labels of six links are listed in figure 3b.
When the abstract is in this manner linked to its source text, the reader of the abstract has an efficient tool to switch over to those modules that provide more information about the issues in which he is particularly interested, while keeping a grip on the broader coherence of the discourse. At the same time, the reader of a module can go the other way around and switch over to the abstract to gain insight into the broader coherence of the discourse. Due to this double interplay between abstract and source text, the electronic abstract can play a more significant role in the reading process than its printed counterpart.
In a modular electronic environment, the abstract has primarily an orientation function. It fulfils this function best when it provides a balanced representation that refers explicitly, in the informative mode, to the various stages in the problem-solving process. It also has to contain labelled links that connect phrases of the abstract to the related modules of the article. At least each main module of the source text should be linked to the abstract.
In our research project, an analytic procedure is being developed that specifies four main types of activities that have to be carried out in systematically composing an electronic abstract: 'goal-setting', 'analysis', 'reduction' and 'presentation'. This procedure will form a basis for developing practical guidelines for authors who compose abstracts of electronic modular articles, as well as for editors who evaluate these abstracts.
Standardised guidelines for composing abstracts for modular electronic articles probably require a more conscious writing attitude, which may have a daunting effect on the author. On the other hand, an inquiry by Hartley & Benjamin (1998) has revealed that more standardisation of printed journal abstracts in the medical and psychological fields has facilitated the writing process. Moreover, the same author will take advantage of a clearly structured network of modular articles when he himself plays the role of searcher or reader and, not unimportantly, his own work will in that case also be searched and read more effectively and efficiently.
This work is part of the 'Communication in Physics' project of the Foundation Physica; it is financially supported by the Foundation Physica, the Shell Research and Technology Centre Amsterdam, the Royal Dutch Academy of Sciences, the Royal Library, and Elsevier Science NL. I would like to thank Frans van Eemeren, Frédérique Harmsze and Joost Kircz for their valuable comments on earlier drafts of this paper. I am grateful to Keith Jones for correcting the English.
Figure 3a. The original printed abstract of Los, J., & Delvigne, G.A.L. (1973). Rainbow, Stueckelberg Oscillations and Rotational Coupling on the Differential Cross Section of Na + I --> Na+ + I-. Physica, 67, 166-196.
Relative differential cross sections have been measured for the atom-atom collision process at kinetic energies from 13 to 85 eV. The measurements show two types of oscillating features. The Stueckelberg oscillations are due to the interference of scattering from different potentials inside the pseudo-crossing of the covalent and ionic potential curves. Interference due to scattering from the ionic interatomic potential causes a rainbow structure that has been resolved completely. The measurements allow estimations of the covalent potential parameters and the pseudocrossing parameter H12. Semiclassical differential cross sections have been calculated using the lowest-order stationary-phase approximation, JWKB phase shifts and the Landau-Zener transition probability. Substituting the known ionic potential and the determined covalent one, there is a very good agreement between the calculated and measured differential cross sections due to collisions with small and intermediate impact parameters. For large impact parameters a rather serious disagreement arises between the relative intensities as well as the oscillation wavelengths of the corresponding cross section. The intensity discrepancy has been removed taking into account the phenomenon of rotational coupling.
Figure 3b. A rewritten version, tailored to an electronic modular environment, of the abstract of Los, J., & Delvigne, G.A.L. (1973). Rainbow, Stueckelberg Oscillations and Rotational Coupling on the Differential Cross Section of Na + I --> Na+ + I-. Physica, 67, 166-196. The figures in the text represent links to other modules. As an example, six link labels are listed below. The codes in these labels are the unique identification codes of the target modules.
The purpose of this study [1] is to test a semiclassical calculation method and the suitability of the atom-atom model for ion pair formation [2] to explain the chemi-ionization that forms the first step in a harpoon reaction.

Relative differential cross sections have been measured for the atom-atom collision process Na + I --> Na+ + I- at kinetic energies from 13 to 85 eV: [graph] [3]

We have used a molecular beam set-up, including a charge exchange sodium source and a hybrid between a collision chamber and a secondary beam for the iodine. This set-up allows for the experimental resolution of the Stueckelberg oscillations and the rainbow structure. The Stueckelberg oscillations are due to the interference of scattering from different potentials inside the pseudo-crossing of the covalent and ionic potential curves. Interference due to scattering from the ionic interatomic potential causes a rainbow structure that has been resolved completely.

The measurements allow estimations of the covalent potential parameters and the pseudocrossing parameter H12. Based on the potential, semiclassical differential cross sections have been calculated via the deflection curve: [graph]

In this calculation, we have used the lowest-order stationary-phase approximation, JWKB phase shifts [4] and the Landau-Zener transition probability.

When the known ionic potential and the determined covalent one are substituted, there is a very good agreement between the calculated and measured differential cross sections due to collisions with small and intermediate impact parameters. For large impact parameters a rather serious disagreement [5] arises between the relative intensities as well as the oscillation wavelengths of the corresponding cross section [6]. The intensity discrepancy has been removed taking into account the phenomenon of rotational coupling.

We conclude that the atom-atom model for ion-pair formation via potential curve crossing by means of Landau-Zener coupling and rotational coupling indeed explains the experimental results for atom-atom collisions if the semiclassical calculation method is used. Therefore, in future research atom-molecule collisions can be studied.