80,000 Data Points and Growing…

May 14, 2012
November, 2004

In November CIMMYT unveiled a significant addition to the field of DNA fingerprinting for wheat and maize. Two databases, fashioned by molecular geneticist Marilyn Warburton and her team, are the largest public information sites of their kind.

The new databases can be accessed online via CIMMYT’s website (see links below) or requested on CD-ROM. They currently hold over 80,000 data points, and their dynamic design allows new information to be incorporated continually, so scientists worldwide can integrate their own data with the original studies. “This feature will perhaps be their greatest legacy,” says Warburton, “as people can add and compare their data with CIMMYT’s to address an infinite number of queries.” In fact, the databases are expected to double in size within one year. They record characterization information for CIMMYT varieties (pure lines and populations), breeding materials, and landraces, as well as materials from collaborating universities and national agricultural research programs in developing countries.

Of Widespread Interest

Those who stand to benefit from the project are as varied as the material in the databases. “The more people who know how to use it and do, the more useful it becomes,” Warburton predicts. Breeders can use it to gauge the likely success of a potential cross. Gene bank curators can work with more complete or correct information about a strain’s pedigree or origin. A wheat strain labeled only as originating in the former USSR, for example, could have come from anywhere in that vast area, an ambiguity that is otherwise difficult to resolve. The relatively new field of association analysis, which determines the function of specific genes, also stands to benefit. In a process a little like detective work, the databases bridge the gap between the physical traits of a variety and its DNA sequences.

Providing Access

“If you want something done, you have to do it yourself,” Warburton remarks, in reference to her newfound computer savvy. Because nothing on the market suited the project’s needs, Warburton learned Microsoft Access™ and modified it to manage the deluge of data. Working in Access as well, CIMMYT’s software developers Carlos Lopez, Juan Carlos Alarcón, and Jesper Norgaard built three specific tools to manipulate the data, with more in the works as the project grows. Other scientists, students, and assistants helped build the database by carrying out the individual laboratory studies recorded in the final product. Reformatting data to meet the input needs of different analysis programs can be tedious, toilsome work, and it nearly discouraged one postdoctoral scientist from finishing his program. To ease that burden, the fingerprinting database includes data translation tools that input and output data in multiple formats. Many supporters of the fingerprinting work have been around from the beginning, and funding came from a variety of sources including the European Commission, Germany’s Federal Ministry for Economic Cooperation and Development (BMZ), and more recently, the CGIAR Generation Challenge Program.

Efficient storage of multiple data types is essential for understanding and applying the vast universe of genes to improve wheat and maize varieties, which give developing countries better options to feed their hungry. By enabling faster, more efficient crop improvement that targets the needs of farmers, databases of the different data types will allow scientists to search for ideal traits and find the varieties with the genes that control them. The genes are like a giant toolbox filled with unknown gadgets: they are there, but it hasn’t always been clear what they do or how plants use them. Warburton and her team have started the process that, together with other data types, will allow each tool to be examined and labeled, furnishing scientists with clues to improve maize and wheat varieties.

Maize database: http://www.cimmyt.org/english/docs/manual/dbases/contents_mz.htm

Wheat database: http://www.cimmyt.org/english/docs/manual/dbases/contents_wh.htm