Plant Ontology Developers Guide

Our curation and management model is based on a tested protocol established by the Gene Ontology Consortium. PO development work is based on text files in OBO flat file format.

In January 2011, the Plant Anatomy Ontology (PAO) and the Plant Growth and Development Stage Ontology (PGDSO) were merged into one file: plant_ontology.obo. This file includes terms, relations, and definitions related to plant anatomical entities as well as plant growth and development stages. This allows users to download the entire ontology as a single file, and allows editors to create relations between terms in the two branches of the PO.
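For readers unfamiliar with the OBO flat file format, the sketch below shows roughly what term stanzas look like and how they might be read with plain Python. The stanza content, including the identifiers PO:0000001 and PO:0000002, the names, and the definition, is invented for illustration and is not taken from plant_ontology.obo. The parser handles only simple "tag: value" lines and is not a substitute for OBO-Edit or a full OBO parser.

```python
from collections import defaultdict

# Hypothetical OBO fragment; IDs, names, and definitions are placeholders.
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: PO:0000001
name: example structure
def: "A placeholder definition." [POC:curators]

[Term]
id: PO:0000002
name: example substructure
is_a: PO:0000001 ! example structure
relationship: part_of PO:0000001 ! example structure
"""


def parse_stanzas(text):
    """Return a list of (stanza_type, {tag: [values]}) pairs, one per stanza."""
    stanzas = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            # A line such as "[Term]" opens a new stanza.
            current = (line[1:-1], defaultdict(list))
            stanzas.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            # Drop the trailing "! comment" that OBO files use for readability.
            value = value.split(" ! ", 1)[0].strip()
            current[1][tag].append(value)
    return stanzas


terms = parse_stanzas(SAMPLE_OBO)
for kind, tags in terms:
    print(kind, tags["id"][0], tags["name"][0])
```

Header lines that appear before the first stanza (such as format-version) are skipped in this sketch; a real tool would keep them.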

In compliance with OBO Foundry principles, the PO is "open and available to be used by all without any constraint other than (a) its origin must be acknowledged and (b) it is not to be altered and subsequently redistributed under the original name or with the same identifiers." The live version of this file is available through our SVN repository as well as on our web browser. For the immediate future, the POC will maintain separate versions of the po_anatomy.obo and po_temporal.obo files at our SVN site. These files match the live version of plant_ontology.obo but lack the links between the two branches. Users are encouraged to switch to the merged file as soon as possible.

The combination of text files and a version control system allows us to track changes to each document, lets members work on the same documents concurrently, and allows the differences between curators' versions of a document to be merged reliably.

In order to update the ontologies, curators use OBO-Edit. This open source Java application implements the rules and constraints needed to maintain internal consistency among the ontologies.

Unique identifiers

As specified in the OBO Foundry principles, the PO must possess a unique identifier space within the OBO Foundry, so that the source of a term (i.e. class) can be immediately identified by the prefix of its identifier. The prefix for Plant Ontology terms is PO, with the syntax PO:nnnnnnn, where nnnnnnn is a zero-padded unique integer of seven digits. To ensure database integrity, unique identifiers are never removed after they have been published in a live release of the ontology. Instead, terms that are retired from the ontology are moved into the obsolete category.
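The identifier syntax described above can be checked mechanically. The following is a minimal sketch, assuming nothing beyond the PO:nnnnnnn pattern stated in this guide; the function names are hypothetical and are not part of any PO tool.

```python
import re

# "PO:" followed by exactly seven ASCII digits, as described in this guide.
PO_ID_PATTERN = re.compile(r"PO:[0-9]{7}")


def is_valid_po_id(identifier):
    """Return True if the string has the form PO:nnnnnnn."""
    return PO_ID_PATTERN.fullmatch(identifier) is not None


def format_po_id(number):
    """Build a zero-padded PO identifier from an integer."""
    if not 0 <= number <= 9_999_999:
        raise ValueError("PO identifiers hold at most seven digits")
    return f"PO:{number:07d}"


print(format_po_id(42))
print(is_valid_po_id("PO:42"))
```

A check like this only confirms the syntax. It cannot tell whether an identifier has actually been assigned or has since been made obsolete.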

To ensure that the same identifier is not used twice by different editors, each contributing group within the POC has been given non-overlapping ranges of numbers (see the Accession IDS Guide). These ranges automatically act as internal identifiers for the group that submitted the term.
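As an illustration of why non-overlapping ranges prevent collisions, the sketch below checks a set of ranges for overlap and looks up which group a number belongs to. The group names and the ranges themselves are hypothetical placeholders; the actual allocations are the ones recorded in the Accession IDS Guide.

```python
from itertools import combinations

# Hypothetical allocation (inclusive bounds); not the real POC ranges.
ID_RANGES = {
    "group_a": (1, 999),
    "group_b": (1000, 1999),
    "group_c": (2000, 2999),
}


def find_overlaps(ranges):
    """Return every pair of group names whose inclusive ranges overlap."""
    overlaps = []
    for (name_a, (start_a, end_a)), (name_b, (start_b, end_b)) in combinations(
        sorted(ranges.items()), 2
    ):
        if start_a <= end_b and start_b <= end_a:
            overlaps.append((name_a, name_b))
    return overlaps


def owner_of(number, ranges=ID_RANGES):
    """Return the group whose range contains the number, or None."""
    for name, (start, end) in ranges.items():
        if start <= number <= end:
            return name
    return None


print(find_overlaps(ID_RANGES))
print(owner_of(1500))
```

Because the ranges do not overlap, the numeric part of an identifier is enough to tell which group submitted the term.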

A term can have multiple IDs (one primary ID and one or more alternate IDs). Alternate IDs are created whenever two terms are merged internally or when terms have been added to the PO by merging external ontologies (e.g., TAIR and Gramene IDs).

Adding terms

New terms can often be added to the tips of the ontology (leaf terms) without disturbing the structure of the graph. For example, creating a new specific instance of a more general term will not disturb sibling or cousin terms. However, introducing a new root term or a term into the middle of the graph can disturb existing parent/child relationships, and needs to be performed with more care.

All new terms that are added to the PO should be posted as a request on our SourceForge tracker. This provides an opportunity for all curators and collaborators to comment on the proposed term and its definition and relations. The tracker item also provides a record of the decisions that were made about a term, when those decisions were made, and why.

Merging and splitting terms

Merging and splitting of terms that have descendants can have broad ripple effects. Such changes must be approved by consensus before they are committed. Any proposed merges or splits should be posted on the SourceForge tracker.

Two or more terms should be merged when the curators determine that they are describing the same class of objects. This often occurs when two ontologies are merged, and there is some overlap of classes between the ontologies. It can also occur if a decision is made only to include a more general class rather than several specific classes. For example, achene, berry and capsule were merged into the term fruit. Finally, merging may be necessary if the curators determine that two classes are redundant, such as when meristemoid was merged with initial cell.

Merging involves a target term and a source term (or terms). When the two terms are merged, the ID of the target term becomes the primary ID while the ID of the source term becomes a secondary ID. The term name of the target term remains as the primary name while the term name of the source term becomes an exact synonym. Definitions and comments are merged, and the curators must edit them to ensure that they are appropriate for the new merged term.
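The bookkeeping described above can be sketched in code. This is an illustration only, not how OBO-Edit implements merging. The Term class and its field names are hypothetical, and the identifiers PO:0000001 and PO:0000002 are placeholders rather than the real IDs of these terms. The example assumes that initial cell is the target, and carrying over the source term's existing alternate IDs and exact synonyms is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    id: str
    name: str
    definition: str = ""
    alt_ids: list = field(default_factory=list)
    exact_synonyms: list = field(default_factory=list)
    needs_review: bool = False


def merge_terms(target, source):
    """Return a new term that keeps the target's ID and name."""
    return Term(
        id=target.id,
        name=target.name,
        # Definitions are simply joined; a curator must still edit the result.
        definition=" | ".join(d for d in (target.definition, source.definition) if d),
        # The source ID becomes a secondary (alternate) ID.
        # Assumption: the source's own alternate IDs are carried over too.
        alt_ids=target.alt_ids + [source.id] + source.alt_ids,
        # The source name becomes an exact synonym.
        # Assumption: the source's own exact synonyms are carried over too.
        exact_synonyms=target.exact_synonyms + [source.name] + source.exact_synonyms,
        needs_review=True,
    )


# Placeholder IDs and definitions, for illustration only.
target = Term("PO:0000001", "initial cell", "Placeholder definition A.")
source = Term("PO:0000002", "meristemoid", "Placeholder definition B.")
merged = merge_terms(target, source)
print(merged.id, merged.alt_ids, merged.exact_synonyms)
```

The needs_review flag stands in for the manual step in which curators edit the combined definition and comments.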

Term definitions

As specified in the OBO Foundry principles, each term in the PO must have a textual definition, and terms should be defined so that their precise meaning within the context of a particular ontology is clear to a human reader.

Whenever possible, internationally accepted nomenclature and definitions obtained from standard reference works, journal articles, and other published sources are used, although they may be modified to fit the genus-differentia form. In cases when a published definition is unavailable, or when published definitions disagree with each other, definitions will be written by the curators. This is often the case for upper-level ontology terms, since many published definitions are written with specific taxa in mind, while PO definitions must be appropriate for all plant taxa to which a term can apply.

All definitions must have a reference that indicates the authority for the term. References are typically textbook or journal article citations. For uniformity, the PO uses citation database identifiers, such as PubMed IDs and ISBN numbers, as DBXrefs (http://plantontology.org/docs/dbxref/PO_DBXref.txt). If a definition is written by a curator (as is often the case with lower-level terms that have simple genus-differentia definitions referring to other PO terms), then the definition DBXref should have the form Database:curator initials. For example: POC:rw or TAIR:tb. Definitions that are written collectively by the POC curators are identified as POC:curators.

Images or diagrams can be helpful for supplementing text definitions, in cases where words cannot adequately describe anatomic or developmental relationships. Image files cannot be inserted into the OBO file, but links to reference images on the internet may be included. To ensure that the images remain available between live releases, curators should only provide links to websites that are approved by the group and that have stable URLs for their images.

Relations in the PO

Under the standard Plant Ontology data structure, terms are allowed to have a limited number of parent/child relationships. Below are brief descriptions of the relations currently used in the PO. For more detailed descriptions, including formal definitions, see the wiki page on Relations in the Plant Ontology.


is_a: The is_a relation is used to indicate the relationship between a specific class and a more general one. For example, megasporophyll is_a sporophyll and sporophyll is_a phyllome. This means that every instance of megasporophyll is a sporophyll. Since the is_a relation is transitive, every megasporophyll is also a phyllome.

part_of: The part_of relation is used to indicate that one class is part of another class. For example, ectocarp is part_of pericarp, which in turn is part_of fruit. The part_of relation is transitive, so every instance of ectocarp is also part of some fruit.

has_part: The has_part relation is used to indicate that one class always has an instance of another class as a part. For example, inflorescence has_part flower. This means that every instance of inflorescence has a flower as a part, but it does not imply that every flower is part of an inflorescence. The has_part relation is transitive.

derives_from: The derives_from relation is used to indicate that one plant structure succeeds another across a temporal divide in such a way that at least a biologically significant portion of the matter of the earlier structure is inherited by the later. For example, leaf-derived cultured plant cell derives_from leaf indicates that a significant portion of the matter of a leaf-derived cultured plant cell is inherited from some cell in a leaf.

develops_from: The develops_from relation is used to indicate that a plant structure develops from its parent term. For example, root hair cell develops_from trichoblast.

adjacent_to: The adjacent_to relation is used when one plant structure is in permanent contact with another plant structure. For example, anther wall endothecium adjacent_to anther wall exothecium. In this example, every instance of anther wall endothecium should be in permanent contact with (adjacent_to) some instance of anther wall exothecium. This does not imply that every anther wall exothecium is adjacent to some anther wall endothecium. If the latter were also true, that relation would have to be asserted separately. The adjacent_to relation is not transitive.

participates_in: The participates_in relation provides a link between an independent continuant in the PAO and an occurrent in the PGDSO. It is used to indicate that an anatomical entity only occurs during a particular plant growth or development stage. In the PO, the participates_in relation provides a way of more clearly defining structures that occur only in a particular growth stage or phase. For example, archegonium participates_in gametophytic phase and vascular tissue participates_in sporophytic phase. The participates_in relation can also be used post-compositionally to describe structures such as gametophyte, sporophyte, or seedling. For example, a user wishing to annotate to sporophyte should describe it as a whole plant that participates_in the sporophytic phase.
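The difference between transitive and non-transitive relations can be made concrete with a small sketch. It uses only the example terms given in the descriptions above and treats as transitive only the relations this guide states to be transitive (is_a, part_of, has_part). The function name and data layout are hypothetical, and the sketch follows chains of a single relation only; it does not combine different relations the way a full reasoner would.

```python
# Relations that this guide describes as transitive.
TRANSITIVE = {"is_a", "part_of", "has_part"}

# Asserted links taken from the examples in the relation descriptions.
ASSERTED = [
    ("megasporophyll", "is_a", "sporophyll"),
    ("sporophyll", "is_a", "phyllome"),
    ("ectocarp", "part_of", "pericarp"),
    ("pericarp", "part_of", "fruit"),
    ("anther wall endothecium", "adjacent_to", "anther wall exothecium"),
]


def reachable(term, relation, edges=ASSERTED):
    """Terms linked to `term` by `relation`, chaining only if it is transitive."""
    direct = {}
    for child, rel, parent in edges:
        if rel == relation:
            direct.setdefault(child, set()).add(parent)
    found = set(direct.get(term, ()))
    if relation not in TRANSITIVE:
        return found
    frontier = list(found)
    while frontier:
        node = frontier.pop()
        for parent in direct.get(node, ()):
            if parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found


print(sorted(reachable("megasporophyll", "is_a")))
print(sorted(reachable("ectocarp", "part_of")))
print(sorted(reachable("anther wall endothecium", "adjacent_to")))
```

For is_a and part_of the implied links (megasporophyll to phyllome, ectocarp to fruit) are returned along with the asserted ones, while adjacent_to returns only what was asserted directly.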

Other attributes for PO terms

The Plant Ontology data structure allows terms to have attributes other than their name, ID and definition. The two attributes that we use in the current PO are Synonym, which indicates an alternative name for the term, and Reference, which indicates the authority for the term. References are typically textbook or journal article citations. For uniformity, we use citation database identifiers, such as PubMed IDs and ISBN numbers, as DBXrefs. Any term can have multiple attributes, allowing several synonyms or references to be attached to a term. Terms that have been imported from participating databases will also carry external IDs (e.g. TAIR IDs and Gramene IDs).

Quality control and consistency checking

We will monitor the developing PO for inconsistencies by applying the true path rule, which insists that semantic coherence is maintained as terms are followed upwards to their ancestors. Editors regularly use the Rule Based Reasoner built into OBO-Edit to check for redundancies. As the use of cross-product definitions is expanded, the reasoner will also be used to check for implied relationships that should be asserted and to look for logical inconsistencies.
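One kind of redundancy such a check can catch is an asserted is_a link that is already implied by transitivity through another parent. The sketch below illustrates the idea only; it is not the algorithm used by the OBO-Edit reasoner, the function name is hypothetical, and the direct megasporophyll to phyllome link is added here purely as an example of a redundant assertion.

```python
def find_redundant_links(edges):
    """Return asserted (child, parent) links already implied via another parent."""
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, set()).add(parent)

    def ancestors(start):
        # All terms reachable by following asserted links upward from start.
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for parent in parents.get(node, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    redundant = []
    for child, parent in edges:
        # The link is redundant if a different direct parent already leads to it.
        for other in parents[child] - {parent}:
            if parent in ancestors(other):
                redundant.append((child, parent))
                break
    return redundant


# is_a links; the third one is a hypothetical redundant assertion.
IS_A_LINKS = [
    ("megasporophyll", "sporophyll"),
    ("sporophyll", "phyllome"),
    ("megasporophyll", "phyllome"),
]

print(find_redundant_links(IS_A_LINKS))
```

Removing the flagged link loses no information, since megasporophyll is_a phyllome still follows from the two remaining links.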