
Proposal for PhD research in the Department of Applied Psychology

Peter Flynn, Computer Centre, University College Cork, 1 June 2002

Interfaces to creating and developing structured documents

Why are structured text editors regarded as so difficult to use? They should be easier than other ways of dealing with textual information. What is it about interfaces and users' expectations that appears to raise the barrier to using structured documents instead of lowering it?


Software and techniques for creating and editing structured text documents have been much slower to develop and mature than those for unstructured documents, and they are widely regarded as difficult to use without extensive training. This constitutes a major hindrance (organisationally and financially) to the deployment of structured document techniques, which are becoming essential in both business and non-business applications of information technology.

This research will investigate three principal areas (explained in more detail in the items Techniques and Features below):

the requirements
for using structured document techniques and the extent to which these are realistic in terms of the resources available to organisations and individuals;
the expectations of users
editing structured documents, especially in relation to the skills needed and to the demands and facilities of the technology, and the nature and degree of any gap between expectation and provision;
the interface paradigms
used in existing software, and the quantity and quality of improvement that might be achieved to close the gap and improve usability and productivity.

The principal objectives are a) to identify which aspects of structured document techniques are responsible for the perceived difficulty of use, and b) to derive a model for structured document editing interfaces which maximises usability.


The introduction of the Extensible Markup Language (XML) and its companion developments1 has led to a rapid growth in the use of structured document techniques as a model for the way in which textual information is stored and processed on computers and networks. In this paradigm, programmatic controls known as markup are used to monitor the formation and manipulation of the component parts of a document.2 The use of programmed logic in this way not only helps to ensure the document's integrity but enables it to be further processed by automated systems because its structure can be detected and proved by software. [This approach (parsing and validation) addresses only the physical structure of the document, not its semantic content or the quality of the writing, but it provides a suitable framework(7) for addressing these other aspects.]
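As a rough illustration of the kind of control meant here (that every section has a heading, every cross-reference a matching target, and each list at least two items), the following Python sketch uses the standard library's `xml.etree.ElementTree`. The element and attribute names (`doc`, `section`, `title`, `list`, `item`, `xref`, `target`, `id`) are hypothetical and chosen only for the example; a real document type defines its own, and a working system would normally rely on a DTD or schema validator rather than hand-written checks.

```python
import xml.etree.ElementTree as ET


def check_structure(xml_text):
    """Return a list of structural problems found (empty if there are none)."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    problems = []

    # Rule 1: every section must have a title as a direct child.
    for section in root.iter("section"):
        if section.find("title") is None:
            problems.append("section without a heading")

    # Rule 2: every cross-reference must point at an id that exists.
    ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    for xref in root.iter("xref"):
        target = xref.get("target")
        if target not in ids:
            problems.append(f"cross-reference to missing target {target!r}")

    # Rule 3: every list must contain at least two items.
    for lst in root.iter("list"):
        if len(lst.findall("item")) < 2:
            problems.append("list with fewer than two items")

    return problems


# Hypothetical sample document that breaks each rule once.
SAMPLE = """
<doc>
  <section id="intro"><title>Introduction</title>
    <p>See <xref target="methods"/> and <xref target="nowhere"/>.</p>
  </section>
  <section id="methods">
    <list><item>Only one item</item></list>
  </section>
</doc>
"""

if __name__ == "__main__":
    for problem in check_structure(SAMPLE):
        print(problem)
```

Checks of this sort concern only the physical structure; nothing in the sketch assesses meaning or the quality of the writing.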

Many of these controls are obvious, and have been implemented over many years using applications of XML's parent technology, the Standard Generalized Markup Language (SGML, ISO 8879:1985). SGML, however, is a very large and complex standard, with many powerful but rarely-used options and hidden pitfalls even for the experienced programmer or document engineer. These make software development expensive and slow. XML was designed to remove all the optional features of SGML and make it lighter and simpler to program for, thus enabling its use on the Web, where document control criteria must be exercised in real time. On the Web, using a structured document system which is robust but flexible (as opposed to HTML, which is neither) will mean improved automation — standards can be developed for machine recognition and interpretation of the significant features of a document without the need for human interpretation (which relies on purely visual clues to meaning) (ISO 11179).

Applications of XML

Two principal application areas of XML have emerged in recent years:

`data' applications,
comprising mainly the treatment of rectangular numeric or categorical information for use in e-commerce, using XML as a lingua franca for transmission purposes, with the actual data residing in traditional databases at either end and being processed by conventional business applications;
`document' applications,
comprising the treatment of expository or narrative text for publication (on paper or electronically: this includes both traditional print-and-paper publishing and publishing on the Web, CD-ROM, and other media).

In the first area (the field of e-commerce information), the use of structured data techniques for data entry and maintenance is well supported. Software can present the user with a familiar interface broadly resembling traditional database applications. These appear either as specially designed form-fill screens or as a type of spreadsheet grid. Data entry and maintenance is largely prescriptive because of the tightly-defined requirements of business data — that is, most data elements must occur in a predetermined sequence and are compulsory (or only optional in very restricted circumstances). The grid or form-fill interface seems to have been universally accepted as efficient and effective for this class of applications.

However, in the second area (the document publishing field), interface development has been non-existent, slow, or stagnant, and yet the volume of information required often exceeds that of `data' applications. The markup requirements of narrative or expository text are often descriptive rather than prescriptive, with the author or editor rightly allowed very great latitude in composition. Methods of handling this kind of text have inherited the legacy of over 30 years' industry experience of text-editing, word-processing, and desktop publishing3, including 15 years of SGML, whereas XML `data' applications are very new.
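The difference between prescriptive and descriptive markup requirements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names (`order`, `customer`, `product`, `quantity`, `section`, `para`, `list`, `quote`, `figure`) and both content models are invented for the example, and real document types express such rules in a DTD or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical prescriptive model: these children, in this order, all compulsory.
ORDER_FIELDS = ["customer", "product", "quantity"]

# Hypothetical descriptive model: any number of these, in any order.
NARRATIVE_ELEMENTS = {"para", "list", "quote", "figure"}


def valid_order(xml_text):
    """A record is valid only if its children match the fixed sequence exactly."""
    record = ET.fromstring(xml_text)
    return [child.tag for child in record] == ORDER_FIELDS


def valid_narrative(xml_text):
    """A narrative section is valid if every child comes from the allowed set."""
    section = ET.fromstring(xml_text)
    return all(child.tag in NARRATIVE_ELEMENTS for child in section)


if __name__ == "__main__":
    print(valid_order(
        "<order><customer>A</customer><product>B</product>"
        "<quantity>1</quantity></order>"
    ))
    print(valid_narrative("<section><para/><quote/><para/><list/></section>"))
```

The first check suits a form-fill or grid interface, because there is exactly one acceptable shape; the second leaves the author free to compose, which is where the interface problem discussed here arises.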

Interface paradigms

The only interface paradigms in significant use in document-based XML are based either on a plain text display (typewriter-style) or on a synchronous typographic display.4

Plaintext editor displays
were standard until the advent of graphical user interfaces (GUIs) and are still in widespread use. They represent the content of a document `as-is', that is, at a 1:1 ratio where every character in the document file is displayed as a character on the screen and every character on the screen represents exactly one character in the document file. Any features or functions of text are usually visible as special embedded characters or commands. This is essential for the traditional use of computers for programming and data management, because complete control over the exact content of the data or the file is an absolute prerequisite.
The plaintext editors in use for structured documents grew out of the existing programmers' editors, and while some of them are very sophisticated5, their monodic use of the fixed-width typewriter font is widely (if mistakenly) regarded by the casual or untrained user as a deterrent, rather than the virtue which the experienced user finds it to be.
Examples of plaintext editors include ed, vi, and Emacs on Unix and other platforms; WinEdt, PFE, and Notepad on Microsoft Windows; BBEdit and TeachText on Apple Macs; and many others. Early wordprocessors such as PC-Write and WordPerfect are also members of this class of program.
Synchronous typographic editor displays
are designed to hide any explicit markup or notation, including that defining the structure. They are intended instead to restrict communication with the user exclusively to the typography of the display.6
While this method is adequate for ephemeral use in wordprocessors, it makes the interface almost unusable for structured documents. It usually provides no way for the user to associate text with markup other than by applying visual styling (typefaces and spacing). These `styles' can be applied arbitrarily anywhere in the document, even across structural boundaries, making it impossible to use automated controls. It also relies — like printed text — solely on visual clues for the interpretation of meaning7, and because the markup is hidden, even an experienced operator may not be able to detect where its boundaries lie.8
Examples of synchronous typographic editors include all the common office wordprocessors (Microsoft Word, Corel WordPerfect, Sun Star Office, Lotus Notes, etc), and most small desktop publishing systems (Adobe PageMaker, the obsolete but still widely-used Ventura, Microsoft Publisher, Adobe InDesign, etc).9
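The problem of styling applied across structural boundaries can be shown with a short Python sketch. It uses the standard library's XML parser to show that properly nested markup can be parsed (and therefore checked automatically), whereas markup that overlaps, as arbitrarily applied visual styles may, is rejected. The tag names `para`, `bold`, and `emph` are invented for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Styling that respects the structure: emph opens and closes inside bold.
nested = "<para><bold>bold <emph>and emphasised</emph></bold> text</para>"

# Styling that crosses a boundary: bold closes while emph is still open.
overlapping = "<para><bold>bold <emph>and</bold> emphasised</emph> text</para>"

if __name__ == "__main__":
    print(is_well_formed(nested))
    print(is_well_formed(overlapping))
```

A plaintext display makes the second case visible to the author; a synchronous typographic display would show both fragments identically on screen.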

Most structured document editors — in either of these two modes — therefore provide a combination of colourisation, sentinels, status panels, diagrammatic displays, style bars, document-tree windows, tell-tales, pop-ups, mouseovers, icon trays, toolbars, toggle buttons, and dozens of other devices to signal to the user what portion of the document currently lies under consideration.

In the case of structured editors, other information is also signalled: what is the nature of the current portion, where it fits into the document model, how it is to be formatted, and in some cases even where it came from or who is responsible for it. These signalling devices can be grouped into several categories:

  • Devices which have become popular enough for them to rank as de facto standards in their own right, such as the drop-down menu. Many of these are implemented as widgets (commonly-occurring reusable graphical components) at a low level in graphical user interfaces, but it is unclear to what extent any alternatives have been described, documented, tested, or evaluated.
  • Devices which are simple to operate but cumbersome to implement, because they need a comprehensive awareness of the entire document rather than just the portion visible to the user, such as cross-reference marking. This means they require significant processing power or memory usage to execute. Because they alter the nature of the displayed text, they can cause a significant change to the formatting and the quantity of text on display even though they themselves occupy little space.
  • Devices which require a well-trained user equipped with a large body of foreknowledge about the specifics of the document structure being edited, and very powerful hardware and software (for example, ArborText's EPIC).
  • Devices which do not take advantage of some of the more obvious developments in human interfaces to speed or ease the editing process (requiring the user to click twice rather than once, for example).
  • Devices which are known to have broken new ground (the hierarchical boxes in InContext and the cyclic use of the Enter key in STiLO), but these are the exceptions.

Condition of the interface

There appears to have been little or no discernible attempt to investigate a) the needs or expectations of the user with regard to this class of interaction (structured documents), b) the nature of their interaction with the edit interface itself (especially `user-friendliness'), or c) the functions which can now be provided in such interfaces.

This has led to a general uncritical adoption of the prevailing edit interface paradigms of the wordprocessor, text editor, and desktop publishing fields, apparently without sufficient consideration of their suitability, efficiency, or reliability. The justification for this approach is often the anecdotal grounds that `people know it already', which ignores the possibility that greater benefits may come from better training or knowledge.


There appear from observation to exist three sets of expectations with regard to creating and editing structured documents, namely:

  1. authors or creators expect certain behaviour patterns from the software. These expectations have grown up over nearly three decades' experience of using synchronous typographical and text-mode displays;
  2. document type designers10 expect that structural editing software will correctly activate the templates they create as part of their document analysis procedures11 and that the author or editor will be able to fill them with information;
  3. publishers (or other manipulators of documents) expect that the finished document will conform to the structural and informational pattern prescribed or allowed for documents of its type [As well as making sense, being a good read, and whatever other criteria are required.] so that it can be used or re-used effectively and efficiently with confidence.


At the intersection of these expectations lies the edit interface.

There is possibly a fourth expectation, that the task of data entry or modification of structured information can be carried out by untrained staff without error.

This is an unfortunate but realistic interpretation of the desires and assumptions of a very large section of industry, including the educational community, amply evidenced almost daily by questions raised in the common network forums for discussion. It is believed that this field is not commonly investigated, and has never been documented with customary scientific rigour.


Main hypothesis. That the authors' or editors' expectations of experience can be described in formal terms in such a way as to permit the modelling of a set of interface tools which maximise the usability of an editing interface, and contribute in some measurable way to improving reliability and robustness.

Secondary hypothesis. That such a set of tools would enable the creation and maintenance of structured information by users with a lower level of training.


Formal investigation of the field, analysis of the parameters, construction and application of a test for the hypotheses, and analysis of the results.

  1. Revision of the available research on editor interface usability with reference to information structure.
  2. Investigation of commonly-held beliefs and sets of assumptions in order to ascertain and quantify them or disprove and discard them.
  3. Analysis of structured document techniques to determine the human and machine requirements for implementation.
  4. Analysis of the features provided by existing systems and how these match the techniques above and contribute to the expectations below.
  5. Enquiry among representative samples of users and intending users of structured information to quantify and qualify their expectations, with a control group of non-users (i.e. users of unstructured systems).
  6. Design and testing of a model to match these expectations with the deterministic and heuristic factors of structured editing systems from the investigation of beliefs, assumptions, and expectations.
  7. Implementation of the model in a suitable pilot environment as a test of the main hypothesis.
  8. Depending on the results, piloting of the performance of the model under real-world editing conditions as a test of the secondary hypothesis.
  9. Evaluation of the results, conclusions, and suggestions for further work.

1 Principally the Extensible Stylesheet Language (XSL) and W3C Schemas, but also the large number of supporting or enabling technologies detailed on the W3C Web site.

2 For example, a structured document system can check that every section has a heading; that every cross-reference has a matching target; and that each list has at least two items.

3 As well as the preceding 500 years' experience of printing and 5,000 years of written language.

4 Usually (quite erroneously) called `What You See Is What You Get' (WYSIWYG): the phrase properly refers to the quality or accuracy of the display representing the typeset output, not to its use as the interface medium.

5 Emacs with the psgml, xxml, xslide, and tdtd `modes' (plugins) provides a comprehensive and reliable document engineering development environment the equal of most commercial systems.

6 Some synchronous typographic displays used for structured text can also reveal the structure, but we are dealing here with the default paradigm of this mode of display, not its variant forms or behaviour.

7 These clues (typeface, size, weight, and style [bold, italics, small capitals, etc], underlining, indenting, vertical and horizontal white-space, and many others) derive from the legacy of writing and printing mentioned earlier, and are not lightly to be discounted. But each has multiple purposes, dependent on semantic context which cannot yet be determined sufficiently accurately by machine. Their use by computers at the moment is therefore often ambiguous.

8 Some systems do allow separate markup display, such as Microsoft Word `named styles', but these are one-dimensional and not useful for observing structure.

9 It is instructive to note that most professional packages (Miles, XPress, 3B2, LaTeX, FrameMaker, XPP, etc) have always provided ways to defeat the synchronous display and view the underlying structure, and that WordPerfect still provides this feature.

10 This is a specific occupation in document engineering, and refers to the designer of the structure of different types (or classes) of documents: books, letters, reports, articles, purchase orders, etc. It should not be confused with the typographic design of their output, which is the field of the typographer or graphic designer.

11 The traditional document type design process includes Inspection, Analysis, one or more cycles of Design-Testing-Evaluation, Implementation, and cyclical Monitoring-Maintenance (Maler and El Andaloussi, 1997).

Outline contents

Note on nomenclature and semantics
Explanation of technical notation


Structured documents

Markup systems

Interfaces and attitudes

Some dichotomies of interface design

Research into edit interfaces

Research before SGML (1985)

Visual and non-visual interfaces

The effects of platform diversity

Research since SGML (1985)

Industrial pressures

Market dominance and published research

The re-emergence of Document Engineering as a discipline

Beliefs and assumptions: reliability and unreliability in industrial anecdote

Structured document techniques


What is demanded of the user

What is demanded of the software

Features available in software

Acceptance and resistance

The problems of `selling' structure

User attitudes

The revulsive reaction and self-deprecation

Training and education

Determining users' expectations

Hypotheses: satisfaction or dissatisfaction

The cohort survey

Positioning `before' and `after'

Sampling and selection

Measurement and techniques

Administration and data collection

Analysis of results

Exposure to information

The effects of experience

The effects of training

An experimental model for expectation and behaviour

The interaction of documents and humans

Measurement of software `power'

Adaptability and design

Extending the model

Software control and user expectation

Behavioural control or user anarchy?

Effects of exogenous control

A framework for testing

Pilot implementation

Software and design

Data selection and variability

Revisiting the cohort

Adapting to user demand




Forms and programs used

Pilot code

This page was last modified on May 23, 2011, at 08:03 PM