Developing Compatible Record Systems for Botanic Gardens in the Netherlands

Bert J.W. van den Wollenberg

Utrecht Botanic Gardens, Utrecht University, P.O.Box 80.162, 3508 TD Utrecht, the Netherlands

Abstract

From the moment computers became available as a way of managing databases, botanic gardens have undertaken efforts to incorporate these new developments into their everyday practice, in order to reduce the time and money spent on updating their vast data holdings and to allow for easy availability of this data.

The Dutch Agreements on data storage stem from the late 1970s, when a number of university botanical gardens expressed their intention to start automating their plant records. A set of Agreements was established, and the layout of the databases was more or less identical. The Agreements concerned mostly coded information in fields of fixed size. The Agreements were rather elaborate, and each participating garden was left the freedom to choose which data items were maintained, thereby acknowledging that the Agreements were a framework rather than a straitjacket.

Today, the number of different database systems used in the Netherlands is still increasing. The increased availability of hardware and software has created a wide variety of combinations. Also, each garden decides whether it needs a customized system. This is a serious obstacle to integration and data exchange.

Differences in database systems do not necessarily exclude the possibility of data exchange. When moving from one system to another, such as has happened at Utrecht Botanic Gardens, or when developing a new database system, the requirements imposed upon the current database system can be incorporated into the new database system. The only limitation to this is the extent to which these requirements conflict with a garden's need to record certain information, or conflict with the Dutch Agreements concerning coded information.

While the first Utrecht Botanic Gardens database system (called BUD1) completely reflected the Dutch Agreements, it did not, for a number of reasons, perform the central task required. The shift from BUD1 to its successor (called BUD2) actually meant expanding, stretching, and splitting up the contents of the Dutch Agreements.

After analysis of the requirements of an efficient Curators' Office, the International Transfer Format (ITF) requirements were analyzed, and interlinking with the ITF was established. This involved "translation" of certain data into ITF codes. The returned ITF files are handled automatically by a separate program, which incorporates the modifications into the database.

The Dutch Agreements have gradually evolved into a de facto Dutch Transfer Format, which developed from the initial 2 times 80-character fixed fields to 4 times 80-character fixed fields. The Dutch Transfer Format has one big disadvantage, however: it only includes coded information in fields of fixed size. It therefore cannot include information of variable size, such as information on source collecting data and verifiers.

The Dutch Transfer Format will become more important in the near future, since the Dutch Botanic Gardens Foundation wishes to establish a central database on the specializations of each of the affiliated Dutch botanic gardens. Although several Dutch botanic gardens are members of BGCI, only Utrecht Botanic Gardens is currently (1992) exchanging data with BGCI using the ITF. Political, financial and technical reasons have been the cause of this low participation in data exchange so far. With the increasing importance of conservation tasks for botanic gardens, data exchange through the ITF may become an important issue in the Netherlands in the near future.

Introduction

This paper first discusses the development of data storage for botanic gardens in the Netherlands. This has led to the establishment of a framework for recording plant data, and the subsequent establishment of more-or-less identical databases, for which the generic name 'Dutch Agreement' is used here.

Next, the current situation is discussed, elaborating on the development of the Dutch Transfer Format (DTF) as well as customized systems amongst botanical gardens in the Netherlands. The database system BUD2, used at Utrecht Botanic Gardens, is presented as a case study. Finally, future prospects regarding compatibility are reviewed, in which the DTF, the International Transfer Format (ITF) and a customized system are compared.

Data storage in botanical gardens in the Netherlands

The Dutch Agreement on Data-storage stems from the late seventies. Representatives from the botanical gardens of Gent and Meise (both in Belgium), and Leiden, Nijmegen, Utrecht, and Wageningen (in the Netherlands) decided to investigate the possibility of developing unified general outlines for computer databanks. The aim was to establish unity in collection data, ensure data exchange, and facilitate the future creation of a central database. This Dutch Agreement resulted in a report, which was published in 1979 (Barendse et al. 1979).

At that time, only Leiden, Nijmegen, Utrecht and Wageningen considered using (mainframe) computers. Data integrity was maintained manually, using accepted codes. For those gardens which did not consider computer usage at that time, the development of a computer card format for index-card systems was suggested. These computer cards were filled out, and those gardens with access to a mainframe used them as a standardized model for the agency that entered the data. Others used the computer cards as a card index system, facilitating future computer recording.

The Dutch Agreement concerned coded information in fields of fixed size, and was, for that time, rather elaborate. Each record item consisted of an A- and a B-record; the C-record was designed for flexible-length information, but was not included in databases at that time, being defined to cover future developments. The A- and B-records contained the fixed-field information, and each consisted of sequences of up to 80 characters.
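A fixed-field record of this kind can be read by slicing each 80-character line at agreed column positions. The sketch below is illustrative only: the field names, column positions and sample values are hypothetical and do not reproduce the actual layout of the Dutch Agreement shown in Fig. 2.

```python
# Hypothetical layout: (field name, start column, end column), 0-based, end exclusive.
# These positions are assumptions for illustration, not the Agreement's real layout.
A_RECORD_LAYOUT = [
    ("accession_number", 0, 8),
    ("family_code", 8, 12),
    ("genus", 12, 32),
    ("species", 32, 52),
]

RECORD_LENGTH = 80


def parse_fixed_record(line, layout):
    """Split one fixed-width line into a dict of stripped field values."""
    padded = line.ljust(RECORD_LENGTH)  # short lines are padded with blanks
    if len(padded) > RECORD_LENGTH:
        raise ValueError("record exceeds 80 characters")
    return {name: padded[start:end].strip() for name, start, end in layout}


# Made-up sample line built to match the hypothetical layout above.
example_line = "1986-042" + "RANU" + "Ranunculus".ljust(20) + "acris".ljust(20)
record = parse_fixed_record(example_line, A_RECORD_LAYOUT)
```

Because every field sits at a fixed position, any garden that knows the layout can read another garden's records without further negotiation, which is what made the identical database layouts possible.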

The general outlines of the Dutch Agreement became a standardized framework for recording of plant collections, and actually represented the physical layout of each database, rendering each database layout more or less identical!

The great advantage - and most worthwhile aspect - of the Dutch Agreement was that, within the rather elaborate framework, each participating garden had the freedom to choose which data items it actually stored; this also implied the freedom to use a more limited set of codes for each designated field. The meanings of the codes were still clear to all participants, however. Another major advantage was that each individual botanical garden could decide when to take the step of employing a computer. This aspect led Ogilvie to remark that the project 'lacked momentum' (Ogilvie 1983), but in my view it rather acknowledged the idea that the Agreement was a framework, not a straitjacket. This Agreement was in fact the basis for all further developments in the Netherlands.

From the mid-1980s onwards, botanical gardens started to move away from working on mainframes. It became increasingly clear that the mainframes were very user-unfriendly. In addition, even minor changes to their programs were very expensive. In 1986, Utrecht University Botanic Gardens was the first botanical garden in the Netherlands to move its database from a mainframe to a personal computer, although at the time such a transfer was considered impossible by many. The database ran on an IBM-AT. The resulting BUD1 database occupied 5.5 Mbytes, and still reflected the Dutch Agreement in detail. Later, a number of Dutch gardens also took up using a personal computer for this purpose, using other programs (e.g. Cardbox, Dbase), or using the Utrecht (BUD1) system.

Current situation concerning data storage

In the late 1980s, the number of different database systems used in the Netherlands was still increasing. The wide range of availability of hardware and software created a great variety of combinations.

Many gardens were considering using a more customized and less expensive system, which would better meet their needs. It became evident that each garden needed to identify why records were maintained, and which data it was necessary (even if not vital) to add to the data set already maintained, while at the same time trying to avoid having to adopt expensive software and hardware (Brown 1988).

Fig. 1: Example of a computer card used by Dutch botanical gardens (Dutch Agreement). The A-, B- and C-records are separated by double lines; the C-record was not included in databases at the time.

Fig. 2: Survey of the A- and B-records of the Dutch Agreement. The numbers indicate the exact space occupied by each field in the sequence (see also Fig. 3).

Added to this, the publication of the ITF in 1987 caused further discussions, fuelled by those gardens that wished to join the international data exchange. This resulted in two main developments: the establishment of the Dutch Transfer Format (DTF), and the incorporation of ITF requirements into customized systems. The latter is discussed using the case study of the Utrecht University Botanic Gardens systems, in which I was heavily involved.

The Dutch Transfer Format

Gradually the Computer Committee of the Dutch Botanic Gardens became aware that the increased use of different hardware and software, as well as the re-evaluation of the Dutch Agreement carried out while planning customized systems on personal computers, posed a serious threat to the integration and ease of data exchange. This resulted in a policy change. It was acknowledged that it was impossible to keep individual botanical gardens from using the system they wished to adopt. The only way to guarantee future data exchange was to turn the Agreement into a transfer format (subsequently called the Dutch Transfer Format). Each participating botanical garden was to maintain the ability to produce the required data in the DTF, using ASCII symbols.

Subsequently, the Dutch Transfer Format has developed from the initial 2 times 80-character fields to 4 times 80-character fields. It has one big disadvantage, however: it only includes coded information in fields of fixed size. It therefore cannot include information of variable size, such as information on source collecting data, or verifiers.
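The fixed-size limitation can be made concrete with a small export routine. This is a minimal sketch under stated assumptions: the field names and widths are hypothetical and the sample values are made up, so it does not reproduce the real DTF layout. It renders values as one 80-character ASCII line and refuses any value that does not fit its field.

```python
# Hypothetical field widths; the real DTF layout is not reproduced here.
DTF_LINE_WIDTHS = [("accession_number", 8), ("provenance_code", 1), ("collector", 20)]
LINE_LENGTH = 80


def format_fixed_line(values, widths):
    """Render a dict of values as one 80-character ASCII line of fixed-width fields."""
    parts = []
    for name, width in widths:
        text = str(values.get(name, ""))
        text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
        if len(text) > width:
            raise ValueError(f"{name!r} needs {len(text)} characters, field holds {width}")
        parts.append(text.ljust(width))
    return "".join(parts).ljust(LINE_LENGTH)


# Made-up values that fit the hypothetical fields.
line = format_fixed_line(
    {"accession_number": "1986-042", "provenance_code": "W", "collector": "J. Jansen"},
    DTF_LINE_WIDTHS,
)

# A free-text value longer than its field cannot be exported.
try:
    format_fixed_line(
        {"collector": "Expedition of the Utrecht herbarium, 1987"}, DTF_LINE_WIDTHS
    )
    overflow_detected = False
except ValueError:
    overflow_detected = True
```

Short coded values fit without loss, whereas free-text information of unpredictable length either has to be truncated or, as here, rejected.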

The Dutch Government set up the Dutch Botanic Gardens Foundation in 1989 to coordinate the joint activities of botanic gardens in the Netherlands. The Dutch Botanic Gardens Foundation has emphasised the establishment of a central database of specializations present in the affiliated Dutch botanical gardens. These specializations are collectively known as the (Dutch) Decentralized National Plant Collection. Only the specializations of each participating botanical garden will be compiled in a central database. As a result, taxonomic and nomenclatural conflicts between data of different gardens will be avoided. The compilation of this central database of the Decentralized National Plant Collection has been started recently.

Fig. 3: Example of the provenance codes used in the Dutch Agreement and the DTF, and the corresponding ITF codes.

Development of customized systems: a case example

Fig. 4: Differences between corresponding BUD2 and DTF record fields. The number preceding each field name indicates the available space (length); the number after the field name indicates whether the information that it contains belongs to one (1) or two (2) levels.

After Utrecht University Botanic Gardens switched from a mainframe to a personal computer in 1986, a special database was used, called BUD1 (Botanic Gardens Utrecht Database system v.1). While the BUD1 system fully conformed with the Dutch Agreement, it quickly became outdated.

Analysis of the Dutch Agreement by the Curators' Office of Utrecht Botanic Gardens led to the conclusion that one of its major disadvantages was the enormous redundancy in stored data. In particular the B-record of the Agreement incorporated data from taxon-, accession- and location-levels. This led to the result that the lowest level of data (location-related data) dictated the uniqueness of the record, in that each additional location of an accession actually resulted in an entirely new data item (an extra A- and B-record).
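The redundancy described above, and the way a split into levels removes it, can be sketched as follows. The data and field names are made up for illustration, and the function is an assumption about how such a split could be done, not the procedure actually used at Utrecht.

```python
# Flat records in the style described above: one row per location, so taxon and
# accession data are repeated. All values are made up for illustration.
flat_records = [
    {"taxon": "Ranunculus acris", "accession": "1986-042", "location": "Bed 12"},
    {"taxon": "Ranunculus acris", "accession": "1986-042", "location": "Greenhouse 3"},
    {"taxon": "Ranunculus acris", "accession": "1988-107", "location": "Bed 12"},
]


def split_into_levels(rows):
    """Store each taxon and accession once; keep one entry per location."""
    taxa = {}        # taxon name -> taxon number
    accessions = {}  # accession number -> taxon number
    locations = []   # (accession number, location) pairs
    for row in rows:
        taxon_number = taxa.setdefault(row["taxon"], len(taxa) + 1)
        accessions.setdefault(row["accession"], taxon_number)
        locations.append((row["accession"], row["location"]))
    return taxa, accessions, locations


taxa, accessions, locations = split_into_levels(flat_records)
```

In the flat form the taxon name is stored three times; after the split it is stored once, and only the location level grows when an accession is planted in an additional place.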

To solve the redundancy problem, a new system would have to be produced, more in line with the relational databases that were coming into use at that time. Detailed study of the data fields of the Agreement and the subsequent Dutch Transfer Format, also employed in BUD1, revealed that certain fields contained information that actually belonged to different levels, and so needed to be split up (Wollenberg 1989). For example, in fig. 4 the DTF field "Sex" contains information on the gender of plants (male/female), but also on whether the species is monoecious or dioecious. The former concerns individual plants, the latter the (abstract) taxon.
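Splitting such a mixed field amounts to routing each code to the level it belongs to. In the sketch below the single-letter codes are hypothetical; the real code list of the DTF "Sex" field is not reproduced here.

```python
# Hypothetical codes for a combined "Sex" field; the real DTF code list is not
# reproduced here.
PLANT_LEVEL = {"M": "male", "F": "female"}
TAXON_LEVEL = {"O": "monoecious", "D": "dioecious"}


def split_sex_field(code):
    """Return (taxon-level value, plant-level value) for one combined code."""
    if code in PLANT_LEVEL:
        return None, PLANT_LEVEL[code]
    if code in TAXON_LEVEL:
        return TAXON_LEVEL[code], None
    raise ValueError(f"unknown sex code: {code!r}")
```

With these assumed codes, `split_sex_field("M")` yields a value for the individual plant only, and `split_sex_field("D")` a value for the taxon only.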

The BUD2 database system at Utrecht is an example of a tailor-made database system, with its many advantages. BUD2 has become a relational system. The system includes special features to enable the incorporation of taxonomic and nomenclatural knowledge. Because synonyms are linked to the current correct taxonomic names, entering either as a query term leads to the same relevant information. Data integrity is maintained using the particular constraints that are applicable to each individual field, e.g. upper case characters only, numeric or date fields, value-lists, etc. When building this database, a thorough analysis was made of the information required, the utilities required (e.g. note-book facilities such as pop-up windows within the database), the grouping of data into relational levels, and automated label production. For example, the taxon-number level in BUD2 is the key feature that allows the application of a synonym bank, reducing the work needed to change a synonym into the corresponding current name (and vice versa) to changing two toggles (yes/no switches).
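The synonym mechanism can be sketched as names that share a taxon number, each carrying a yes/no toggle marking the current name. This is an illustration under assumptions, not the BUD2 implementation: the class, the function names and the taxon number are hypothetical, and the two plant names serve only as sample data.

```python
from dataclasses import dataclass


@dataclass
class Name:
    text: str
    taxon_number: int
    is_current: bool  # the yes/no toggle


# Sample data: two names sharing one (hypothetical) taxon number.
names = [
    Name("Lycopersicon esculentum", 101, False),
    Name("Solanum lycopersicum", 101, True),
]


def current_name(query, names):
    """Return the current name for the taxon that `query` belongs to."""
    by_text = {n.text: n for n in names}
    taxon_number = by_text[query].taxon_number
    for n in names:
        if n.taxon_number == taxon_number and n.is_current:
            return n.text
    raise LookupError(f"no current name for taxon {taxon_number}")


def swap_current(old, new, names):
    """Make `new` the current name by flipping the two toggles."""
    by_text = {n.text: n for n in names}
    if by_text[old].taxon_number != by_text[new].taxon_number:
        raise ValueError("names belong to different taxa")
    by_text[old].is_current = False
    by_text[new].is_current = True
```

Querying with either name returns the same current name, and a change of taxonomic opinion touches only the two toggles; accession and location data, which hang from the taxon number rather than from the name, stay untouched.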

The shift from BUD1 to BUD2, in 1989, was a major change, and not only in a technical sense. It also required reorganizing the Curators' Office, which had to adjust to new ways of implementing the administrative aspects of the scientific maintenance of the existing plant collections.

The shift also meant expanding, stretching, and splitting up the Dutch Agreement. The development of BUD2 has partly resulted in a violation of the Dutch Transfer Format. For some fields, BUD2 has more available space, sometimes in multiple fields, where the Dutch Transfer Format has but one field, with limited space.

Fig. 5: Representation of the levels in the BUD2 data of the Utrecht Botanic Gardens. Behind the fixed fields screen at each level, there is a flexible fields screen. Note that the highest level is not the plant-name level, but the taxon-number level.

While BUD2 was still in an early phase of development, the ITF was repeatedly examined to determine to what extent we could incorporate it into our new system. After comparison of the ITF and DTF requirements, it was concluded that most of the DTF fields that we used in BUD1 could be retained in BUD2, translating the information to ITF codes when exchanging information. But as the ITF covered most of our desired information that was lacking in the DTF, the ITF codes were applied directly to those fields in BUD2. This resulted in easy data-exchange with BGCI: producing the ITF transfer file is a "push-button" procedure.

The production of a file in the Dutch Transfer Format proved to be more difficult, and BUD2 no longer produces all the data required. Although this does not harm the data-exchange directly, at this moment some data cannot be provided, and this may need further action in the future.

Fig. 6: Comparison between the Dutch Transfer Format, the ITF and BUD2 (which contains many ITF features). Single-line arrows indicate ready availability of data. Double-line arrows require 'translation'.

Future developments

Although several Dutch botanic gardens are members of BGCI, only Utrecht Botanic Gardens is currently using the ITF to exchange data with BGCI. Political, financial, and technical reasons have been the cause of this low participation in data-exchange so far. With the increasing importance of conservation tasks for botanic gardens, data-exchange through the ITF may become an important issue in the Netherlands within the near future. The linking of databases that still can only supply data available from the Dutch Transfer Format will, however, only result in a small proportion of the requested information being transferable.

Fig. 6 compares the data available from the DTF and from the customized BUD2 system of Utrecht University Botanic Gardens (which has incorporated many ITF features) with the ITF. In the BUD2 system, for example, the MAKE_ITF program selects records using a specified algorithm, and subsequently 'translates' all codes of the known 'problem' fields into accepted ITF codes. Where the black arrows are used, I have also included those cases in which only a selected sample of the codes is copied to the ITF. Fig. 6 shows that, for the ITF field 'Sex', data are readily available from both the DTF and BUD2. For the DTF, however, a filter has to be applied to remove the codes used for e.g. 'monoecious' and 'dioecious'.
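The translation and filtering steps can be sketched with a lookup table per 'problem' field. This is not the MAKE_ITF program itself: the function name, field names and every code mapping below are hypothetical and do not reproduce the correspondences of Fig. 3.

```python
# Hypothetical translation tables: local code -> ITF code. A local code that is
# absent from a table has no ITF counterpart and is filtered out.
TRANSLATION = {
    "provenance": {"1": "W", "2": "G", "9": "U"},
    "sex": {"M": "M", "F": "F"},  # taxon-level codes such as 'D' are dropped
}


def to_itf(record):
    """Translate the known 'problem' fields; pass all other fields through."""
    out = {}
    for field, value in record.items():
        table = TRANSLATION.get(field)
        if table is None:
            out[field] = value          # no translation needed
        elif value in table:
            out[field] = table[value]   # translated code
        # else: code has no ITF equivalent, so the field is omitted
    return out


# Made-up record: the provenance code is translated, the taxon-level sex code
# is filtered out, and the accession number passes through unchanged.
itf_record = to_itf({"accession": "1986-042", "provenance": "1", "sex": "D"})
```

Keeping the mappings in tables rather than in program logic means a change to either code list requires editing only the table.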

Divergence in database systems does not necessarily preclude data-exchange. When moving from one system to another, or when developing a new database system, the requirements imposed upon the existing database system can be incorporated into the new database system. The only limitations are the extent to which these requirements conflict with garden objectives regarding monitoring certain information, and with the Dutch Agreement concerning coded information. The importance that a garden attaches to garden-grown endangered plants for conservation purposes determines to what extent it is willing to monitor data which are strictly superfluous for collection management. Another difficulty arises when a garden for which conservation is not a major task has to spend substantial funds to enable data-exchange. When overall funding is tight, these considerations logically determine the decision.

When developing compatibility of database systems, allowing participants sufficient time to reach a certain standard at their own speed, rather than trying to force all participants into a tight time schedule, has proved to be very productive. Once a state of compatibility and data-exchange with other participants has been established, adopting a new database system may bring you into conflicting situations. To prevent this, regular consultation with other participants is necessary. And last, but not least, if botanic gardens wish to become involved in the international network, adopting international standards such as the ITF may require early consideration, at the moment that decisions on (changes to) hardware and software are expected. Maintaining compatible records systems is more difficult than developing them.

References

Barendse, G.W.M., Bruinsma, B.F., Hoog, M.H., Ietswaart, J.H., Lammens, E., & Otten, K. (1979). Definitief Rapport Computer Commissie (Final report of the Computer Committee). University of Nijmegen, Nijmegen, Netherlands.

Brown, R.A. (1988). A guide to Computerization of Plant Records. American Association of Botanical Gardens and Arboreta, Computer Information Committee. Wayne, Pennsylvania, USA. ISBN 0-934843-02-3.

Ogilvie, F.M.P. (1983). Reference Systems for Living Plant Collections. Landscape Publication no. LP/8201. Dept. of Architecture, Heriot-Watt University, Edinburgh, UK.

Wollenberg, L.J.W. van den (1989). Aanzet tot een systeem-analytische benadering van het nieuw te implementeren geïntegreerd database-systeem BUD2 (Draft for the analysis of the integrated database system BUD2). Pp. 1-68. Unpublished. Utrecht University Botanic Garden, Utrecht, Netherlands.
