DAG-Edit and the flat-file format described below are no longer in use by the Plant Ontology.

Please follow this link to read about the newer OBO flat file format.


Understanding the flat file format.

The best way to visualize the ontologies is through a browser that represents the data as a hierarchical tree. The flat file format was originally developed as a flexible, extensible, machine-parseable representation. The DAG-Edit ontology editor can both read and write the flat file format, and parsers have been written to load flat files into a variety of databases and alternative formats. This page outlines the main features of flat-file-formatted ontologies and how to interpret them.

I. Meta-data about the file.

Each file begins with comments containing descriptive information about the file. Comment lines begin with an exclamation point (!) and generally have the following format:

!autogenerated-by:  DAG-Edit version 1.401  -Text produced by DAG-Edit that identifies the version. 
!saved-by:   Pankaj  -Name of the individual who created the version
!version:    $Revision: 1.5 $   -Each version corresponds to a revision with a number assigned from CVS
!Disclaimer: Copyright 2003 Plant Ontology Consortium  -Copyright statement
!Title:      Plant Growth and Development  -The title of the page.
!Date:       Wed Dec 10 14:05:08 EST 2003  -The date when the file was committed to the CVS
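
The header convention above can be read with a few lines of Python. This is a minimal sketch, not the DAG-Edit implementation: the function name parse_header_comments is hypothetical, and it assumes the header is an unbroken run of "!key: value" lines at the top of the file.

```python
def parse_header_comments(lines):
    """Collect '!key: value' comment lines from the top of a flat file."""
    metadata = {}
    for line in lines:
        if not line.startswith("!"):
            break  # assumption: the header ends at the first non-comment line
        # Split on the first colon only, so values such as
        # '$Revision: 1.5 $' keep their own colons.
        key, sep, value = line[1:].partition(":")
        if sep:
            metadata[key.strip().lower()] = value.strip()
    return metadata


header = [
    "!autogenerated-by:  DAG-Edit version 1.401",
    "!saved-by:   Pankaj",
    "!version:    $Revision: 1.5 $",
    "!Date:       Wed Dec 10 14:05:08 EST 2003",
    "$Plant structure ontology ; PO:0000001",
]
meta = parse_header_comments(header)
print(meta["version"])
```

Keys are lower-cased here only so that "Date" and "date" are treated alike; that normalization is a choice made for this sketch, not part of the format.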

II. Flat file format conventions: Ontology file

This section describes the formatting conventions used in the ontology flat files and what they mean.

A. Root terms

1. Each ontology starts with the root term, and child terms are listed below their parents. Indentation represents a term's depth below its parent. The root term is designated with a dollar sign ($).

B. Types of relationships between terms.

Relationship types are indicated by a symbol preceding each term. The symbol represents the relationship between a child term and its parent term (a term to term relationship).

1. Relationship type: IS A

A percent (%) symbol indicates that a term IS A type of (term to term relationship IS A) its parent term, which is placed higher in the tree.
For example, sepal IS A type of floral organ.

 %floral organ
  %sepal

2. Relationship type: PART OF

A less-than (<) sign is used for terms that are a part of (term to term relationship PART OF) a parent term placed higher in the tree.
For example: sepal is PART OF a flower.

%flower
 <sepal

3. Relationship type: DEVELOPS FROM

A tilde (~) sign is used for a term that develops from (term to term relationship DEVELOPS FROM) its parent term.
For example: trichome DEVELOPS FROM a trichoblast.

<sepal
 <epidermis
  <trichoblast
   ~trichome
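
The symbols described in this section map directly onto a lookup table. The following is a small illustrative sketch; the names RELATIONSHIP_SYMBOLS and read_relationship are hypothetical, and treating the dollar sign as a "ROOT" marker alongside the three relationship types is a convenience of this example.

```python
# Leading symbol -> meaning, as described in sections A and B above.
RELATIONSHIP_SYMBOLS = {
    "$": "ROOT",
    "%": "IS A",
    "<": "PART OF",
    "~": "DEVELOPS FROM",
}


def read_relationship(line):
    """Return (relationship, remaining text) for one ontology line."""
    text = line.lstrip(" ")  # indentation carries depth, not relationship
    symbol = text[:1]
    if symbol not in RELATIONSHIP_SYMBOLS:
        raise ValueError(f"no relationship symbol at start of {line!r}")
    return RELATIONSHIP_SYMBOLS[symbol], text[1:].strip()


example = ["<sepal", " <epidermis", "  <trichoblast", "   ~trichome"]
parsed = [read_relationship(line) for line in example]
for relationship, term in parsed:
    print(relationship, term)
```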

C. Inheritance and other features of terms.

Each row represents a single term. All of the information about that term (with the exception of the definition) is contained in that row, including parentage, relationship types, term identifiers, and term synonyms.

1. Inheritance.

In the following example, the hyphen (-) is used to show indentation from the left. In the flat files themselves, the indents (INDENTS) are one or more blank spaces rather than hyphens. Finer, more granular terms lie at the bottom of the tree, and parent terms are at the top. More granular terms are indented to a greater depth than their more general parent terms.

For example: sepal (unique identifier=PO:0000012) is a part of a flower, which is a type of organ, which is a part of the plant structure ontology.

$Plant structure ontology ; PO:0000001
-<organ ; PO:0000006
--%flower ; PO:0000007
---<sepal ; PO:0000012
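
The indentation rule can be turned into parent links by remembering the most recent term seen at each shallower depth. The sketch below is one way to do this, under stated assumptions: the function name parent_links is hypothetical, indents are spaces (as in the real files, not the hyphens used for illustration above), and blank or comment lines are skipped.

```python
def parent_links(lines):
    """Return (term, relationship symbol, parent) for each ontology line."""
    stack = []  # (depth, name) of each ancestor of the current line
    links = []
    for line in lines:
        stripped = line.lstrip(" ")
        if not stripped or stripped.startswith("!"):
            continue  # skip blank lines and header comments
        depth = len(line) - len(stripped)
        symbol = stripped[0]
        name = stripped[1:].split(";")[0].strip()
        # Drop every remembered term that is not shallower than this one;
        # whatever remains on top is the parent.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        parent = stack[-1][1] if stack else None
        links.append((name, symbol, parent))
        stack.append((depth, name))
    return links


lines = [
    "$Plant structure ontology ; PO:0000001",
    " <organ ; PO:0000006",
    "  %flower ; PO:0000007",
    "   <sepal ; PO:0000012",
]
links = parent_links(lines)
for name, symbol, parent in links:
    print(name, symbol, parent)
```

Comparing depths rather than indexing by depth means the sketch does not depend on how many spaces make up one level of indentation.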

2. Multiple parentage

A term with more than one parent is represented with the following syntax FOR EACH LOCATION IN WHICH THE CHILD NODE IS FOUND:

[INDENT][relationship type][term name][SPACE];[SPACE][PO:nnnnnnn][SPACE][relationship type to second parent][SPACE][second parent term name][SPACE];[SPACE][second parent term id PO:nnnnnnn]

For example, a sepal can be represented as a type of floral organ:

%floral organ ; PO:0000043
 %sepal ; PO:0000012 <flower ; PO:0000007


and also as a part of the flower:

%flower ; PO:0000007
 <sepal ; PO:0000012 %floral organ ; PO:0000043

3. Representation of synonyms.

If a term has one or more synonyms, each alternative name is represented in the flat files according to the following format:

[INDENT][relationship type][term name][SPACE];[SPACE][PO:nnnnnnn][SPACE];[SPACE]synonym:[alternative name]

For example: hair cell is a synonym of trichoblast.

 <trichoblast ; PO:0000033  ; synonym:hair cell
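
Putting the row conventions together, a single line can be split into its indent, primary relationship, name, identifier, synonyms, and additional parents. This is a hedged sketch rather than a complete parser: parse_term_line, TERM, and SYNONYM are hypothetical names, identifiers are assumed to look like "prefix:digits", and term names are assumed not to contain the characters ; $ % < ~ (escaping is not handled).

```python
import re

# A relationship symbol, a term name, then ' ; ' and an identifier.
TERM = re.compile(r"\s*([$%<~])\s*([^;]+?)\s*;\s*(\w+:\d+)")
# A '; synonym:' field; the name stops at the next ';' or relationship symbol.
SYNONYM = re.compile(r";\s*synonym:\s*([^;%<~]+)")


def parse_term_line(line):
    """Split one ontology row into its parts (see assumptions above)."""
    indent = len(line) - len(line.lstrip(" "))
    terms = TERM.findall(line)
    if not terms:
        raise ValueError(f"not a term line: {line!r}")
    # The first match is the term itself; later matches are extra parents.
    (symbol, name, term_id), other_parents = terms[0], terms[1:]
    return {
        "indent": indent,
        "relationship": symbol,
        "name": name,
        "id": term_id,
        "synonyms": [s.strip() for s in SYNONYM.findall(line)],
        "other_parents": other_parents,
    }


sepal = parse_term_line(" %sepal ; PO:0000012 <flower ; PO:0000007")
trichoblast = parse_term_line(" <trichoblast ; PO:0000033  ; synonym:hair cell")
print(sepal["other_parents"])
print(trichoblast["synonyms"])
```

The pattern tolerates extra spaces around the semicolons, so the doubled space in the trichoblast example does not need special handling.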

III. Flat file format conventions: Definition file

This section describes the formatting conventions used in the definition flat files and what they mean.

Terms in the ontology flat files are not required to have definitions. However, every effort should be made to provide one. If a definition exists, it must contain the following fields:

term: the name of the PO term to which the definition refers. [term name]
goid: the term's unique identifier/accession number. [PO:nnnnnnn]
definition: the definition of the term in free text format.  [definition]
definition_reference: one or more references for the definition, each in the form [Xref key]:[reference id]

A definition may also have a comment:

comment: free text  [not mandatory]

For example:

term: sepal
goid: TAIR:0000125
definition: A unit of the calyx.
definition_reference: ISBN:047124529
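
A definition file in this layout can be read as a sequence of "field: value" lines. The sketch below makes two assumptions that the text above does not state: each record begins with a "term:" line, and a field may not span multiple lines. The names parse_definitions, missing_fields, and REQUIRED are hypothetical.

```python
REQUIRED = ("term", "goid", "definition", "definition_reference")


def parse_definitions(lines):
    """Group 'field: value' lines into one dict per definition record."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and comments
        # Split on the first colon only; values such as 'TAIR:0000125'
        # contain colons of their own.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "term":
            # Assumption: a 'term:' line starts a new record.
            records.append({"definition_reference": []})
        if not records:
            continue  # a field appearing before any 'term:' line is ignored
        if key == "definition_reference":
            records[-1]["definition_reference"].append(value)
        else:
            records[-1][key] = value
    return records


def missing_fields(record):
    """List the mandatory fields that are absent or empty in a record."""
    return [field for field in REQUIRED if not record.get(field)]


sample = [
    "term: sepal",
    "goid: TAIR:0000125",
    "definition: A unit of the calyx.",
    "definition_reference: ISBN:047124529",
]
records = parse_definitions(sample)
print(records[0]["goid"], missing_fields(records[0]))
```

Because definition_reference may occur more than once, it is collected into a list; the optional comment field is stored like any other single-valued field.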