Annotation Association File Format

Collaborating databases and projects provide the POC project with a tab-delimited file, known informally as an "association file". This file carries links between database objects and PO terms. The database object may be a gene, transcript, protein, protein_structure, complex, germplasm (stock/cultivar), mutant, QTL, etc. Columns in the file are described below. A sample file containing associations from the Gramene database is provided for comparison.

File Name

File Format

The GO Annotation File (GAF) 2.0 format comprises 17 tab-delimited fields, several of which are not mandatory. These include two columns (16 and 17) that were not part of the GAF 1.0 format.
Make sure the column order is strictly followed. A column that is left blank must still be present as an empty field, so that the columns after it keep their positions.
Also see the Gene Ontology Annotation Format web page for more information.
* denotes required fields
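The point about blank columns can be illustrated with a minimal Python sketch. It assumes a hypothetical helper, format_association_line, that takes values keyed by column number and always writes all 17 fields, leaving an empty string wherever no value is given. The helper name and the dictionary layout are illustrative only and are not part of the format.

```python
NUM_COLUMNS = 17  # GAF 2.0 field count stated above


def format_association_line(values):
    """Build one tab-delimited line from a {column_number: text} mapping.

    Hypothetical helper: column numbers are 1-based, and any column that
    is not supplied is written as an empty field so later columns do not
    shift position.
    """
    for number in values:
        if not 1 <= number <= NUM_COLUMNS:
            raise ValueError(f"column number out of range: {number}")
    return "\t".join(values.get(n, "") for n in range(1, NUM_COLUMNS + 1))


# Only a few columns are filled here for brevity; a real line would also
# need the remaining required columns.
line = format_association_line(
    {1: "GR", 2: "GR:0060905", 3: "lrd10", 5: "PO:0007014"}
)
fields = line.split("\t")
```

Whatever is left blank, the line still splits into 17 fields, so column 5 is always found at the fifth position.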

Column   Content                        Example
1.  *    DB                             GR
2.  *    DB Object ID                   GR:0060905
3.  *    DB Object Symbol               lrd10
4.       Qualifier
5.  *    PO ID                          PO:0007014
6.  *    DB:Reference(|DB:Reference)    GR_ref:5655|PMID:2676709
7.  *    Evidence                       IMP
8.       With (or) From
9.  *    Aspect                         G
10.      DB Object Name                 lesion resembling disease-10
11.      DB Object Synonym(|Synonym)    spl4|bl5|spotted leaf-4
12. *    DB Object Type                 gene
13. *    taxon(|taxon)                  taxon:4527
14. *    Date                           20050303
15. *    Assigned by                    GR
16.      Annotation Extension           part_of(PO:0028002)
17.      Gene Product Form ID           UniProtKB:P12345-2
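As a rough illustration of how the column table maps onto a line of the file, the Python sketch below splits one line on tabs, checks that it has 17 fields, and checks that the columns marked with * are not empty. The field names in COLUMNS and the function parse_association_line are assumptions made for this example; the format itself defines only the column order.

```python
# Illustrative field names, one per column, in the order given in the table.
COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "po_id",
    "db_reference", "evidence", "with_or_from", "aspect", "db_object_name",
    "db_object_synonym", "db_object_type", "taxon", "date", "assigned_by",
    "annotation_extension", "gene_product_form_id",
]

# 1-based numbers of the columns marked with * in the table.
REQUIRED = {1, 2, 3, 5, 6, 7, 9, 12, 13, 14, 15}


def parse_association_line(line):
    """Split one association line into a dict keyed by the names above."""
    # Strip only the line ending, so trailing empty columns are preserved.
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    missing = [n for n in sorted(REQUIRED) if not fields[n - 1].strip()]
    if missing:
        raise ValueError(f"required columns are empty: {missing}")
    return dict(zip(COLUMNS, fields))


# The example values from the table, joined into a single line.
example_line = "\t".join([
    "GR", "GR:0060905", "lrd10", "", "PO:0007014",
    "GR_ref:5655|PMID:2676709", "IMP", "", "G",
    "lesion resembling disease-10", "spl4|bl5|spotted leaf-4", "gene",
    "taxon:4527", "20050303", "GR", "part_of(PO:0028002)",
    "UniProtKB:P12345-2",
]) + "\n"

record = parse_association_line(example_line)
```

Here record["po_id"] holds the PO term, and the blank Qualifier and With (or) From columns come back as empty strings rather than shifting the later columns.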

Description of the content:

Column 1. DB
Column 2. DB Object ID
Column 3. DB Object Symbol
Column 4. Qualifier
Column 5. PO ID
Column 6. DB:Reference
Column 7. Evidence
Column 8. With (or) From
Column 9. Aspect
Column 10. DB Object Name
Column 11. Synonym
Column 12. DB Object Type
Column 13. Taxon
Column 14. Date
Column 15. Assigned by
Column 16. Annotation Extension
Column 17. Gene Product Form ID
Note that several fields contain database cross-references (dbxrefs) in the format dbname:dbaccession. These fields are: PO ID (where dbname is always PO), DB:Reference, With (or) From, and Taxon (where dbname is always taxon). For the PO ID, do not repeat the 'PO:' prefix (i.e. always use PO:0000000, not PO:PO:0000000).
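The dbxref convention can be checked mechanically. The Python sketch below is one possible approach, not an official validator: split_dbxrefs and is_valid_po_id are hypothetical helper names, and the seven-digit pattern for PO IDs is an assumption inferred from the examples on this page (PO:0007014, PO:0000000).

```python
import re

# Assumption: a PO ID is "PO:" followed by exactly seven digits,
# as in the examples on this page.
PO_ID_PATTERN = re.compile(r"PO:\d{7}")


def is_valid_po_id(value):
    """Return True for e.g. PO:0007014, False for a doubled PO:PO: prefix."""
    return PO_ID_PATTERN.fullmatch(value) is not None


def split_dbxrefs(field):
    """Split a pipe-separated dbxref field into (dbname, dbaccession) pairs.

    An empty field (allowed for optional columns) yields an empty list.
    """
    if not field:
        return []
    pairs = []
    for item in field.split("|"):
        # Split at the first colon only; the accession may contain colons.
        dbname, sep, accession = item.partition(":")
        if not sep or not dbname or not accession:
            raise ValueError(f"not in dbname:dbaccession form: {item!r}")
        pairs.append((dbname, accession))
    return pairs


references = split_dbxrefs("GR_ref:5655|PMID:2676709")
taxa = split_dbxrefs("taxon:4527")
```

Because the split happens at the first colon, a doubled prefix such as PO:PO:0000000 still parses as a dbxref, which is why the PO ID gets its own stricter check in is_valid_po_id.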