Rice (Oryza sativa) is the staple food for about half of the world's population and supplies about 20% of the calories in the human diet. Against a background of rapid global population growth and climate change, rice breeders face the challenge of developing new, sustainable varieties that not only increase yield but also improve grain nutritional quality and reduce environmental impact. Since the International Rice Genome Sequencing Project (IRGSP) released the first "gold standard" reference genome of the japonica cultivar Nipponbare in 2005, a growing number of rice genomes have been sequenced, assembled and annotated, and rice pan-genome studies have steadily increased. These genomic resources have greatly advanced rice genetics and breeding research and provide key data support for developing varieties with better quality and adaptability. To better support rice research, this article collates the database resources commonly used in the field, in the hope that these tools will be useful to researchers.
Research progress of rice genome
Comprehensive Rice Research Databases
RAP-DB Database
The Rice Annotation Project Database RAP-DB (https://rapdb.dna.affrc.go.jp/) was launched in 2004 to provide accurate annotation of the rice genome and to support in-depth analysis of its structure and function. Users can view the gene distribution of Nipponbare IRGSP-1.0 in GBrowse, search gene function and sequence by gene name, find homologous genes with BLAST, and convert gene IDs with ID Converter.
RGAP Database
The Rice Genome Annotation Project database RGAP (https://rice.uga.edu/), originally established at Michigan State University, mainly provides rice genome sequence and annotation data. The database uses Nipponbare as its reference genome and includes genome data and CDS data. Users can convert between RGAP locus IDs and RAP-DB gene IDs, and can run BLAST searches with known sequences to obtain the locus ID, gene structure and CDS sequence of a gene. The database was updated on September 3, 2024: the genome browser was upgraded to JBrowse 2, and the gene expression and co-expression data and their web pages were refreshed. The website has also added a syntelogs page, which contains two new syntelog data sets.
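Because RAP-DB and RGAP use different identifier styles, it can help to check which style an ID follows before converting it. The sketch below is a minimal illustration, assuming the commonly seen patterns `Os01g0100100` (RAP-DB) and `LOC_Os01g01010` (RGAP); the function names and the one-entry `EXAMPLE_MAP` table are hypothetical, and a real conversion table would be obtained from the databases themselves.

```python
import re

# Assumed ID patterns: RAP-DB IDs such as Os01g0100100 (7 digits after "g"),
# RGAP locus IDs such as LOC_Os01g01010 (5 digits after "g").
RAP_ID = re.compile(r"^Os(\d{2})g(\d{7})$")
RGAP_ID = re.compile(r"^LOC_Os(\d{2})g(\d{5})$")


def classify_gene_id(gene_id: str) -> str:
    """Return 'RAP-DB', 'RGAP', or 'unknown' based on the ID pattern."""
    gene_id = gene_id.strip()
    if RAP_ID.match(gene_id):
        return "RAP-DB"
    if RGAP_ID.match(gene_id):
        return "RGAP"
    return "unknown"


def chromosome_of(gene_id: str):
    """Extract the chromosome number encoded in either ID style, or None."""
    gene_id = gene_id.strip()
    match = RAP_ID.match(gene_id) or RGAP_ID.match(gene_id)
    return int(match.group(1)) if match else None


# Hypothetical, illustrative lookup table (RGAP locus ID -> RAP-DB gene ID).
EXAMPLE_MAP = {"LOC_Os01g01010": "Os01g0100100"}


def convert_ids(ids, mapping):
    """Convert IDs through a lookup table; unmapped IDs are returned as None."""
    return {gene_id: mapping.get(gene_id) for gene_id in ids}
```

For example, `convert_ids(["LOC_Os01g01010", "LOC_Os12g99999"], EXAMPLE_MAP)` leaves the second ID unmapped rather than guessing, which makes missing entries easy to spot.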
Gramene Database
Gramene (https://www.gramene.org/) is a comparative genomics database that integrates various data resources through comparative functional genomics and provides genome information for different Oryza species, including Oryza rufipogon, Oryza nivara and Oryza glaberrima. Plant Reactome, provided on this website, links reported gene signaling networks into pathways covering growth and development signaling, secondary metabolism, and biotic or abiotic stress responses. Users can view the specific data on the website.
RPAN Database
The rice pan-genome database RPAN (https://cgm.sjtu.edu.cn/3kricedb/index.php) is built from the 3K rice genomes and contains about 370 Mbp of the IRGSP genome and about 260 Mbp of new sequences. The database provides basic information on 3,010 rice accessions, including the sequence and gene annotation of the rice pan-genome, gene presence-absence variation (PAV) and expression profiles. Its basic search function lets users query the basic information and sequencing information of a single gene or rice accession. The advanced search function supports searching for the genes shared by multiple rice accessions or for the presence of multiple genes. Visualization functions include a tree browser for viewing the phylogeny of the 3K rice accessions and a genome browser for gene annotation and PAV.
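The two advanced-search questions described above (which genes are shared by a set of accessions, and which accessions carry a set of genes) reduce to set operations on a presence-absence table. The following is a minimal sketch with made-up data; the gene and accession names and the function names are hypothetical, and this is not RPAN's own implementation.

```python
# Hypothetical PAV table: gene -> set of accessions in which the gene is present.
pav = {
    "geneA": {"acc1", "acc2", "acc3"},
    "geneB": {"acc1", "acc3"},
    "geneC": {"acc2"},
}


def shared_genes(pav_table, accessions):
    """Genes present in every one of the given accessions."""
    wanted = set(accessions)
    return sorted(g for g, present in pav_table.items() if wanted <= present)


def accessions_with_all(pav_table, genes):
    """Accessions that carry every one of the given genes."""
    sets = [pav_table[g] for g in genes]
    return sorted(set.intersection(*sets)) if sets else []


def core_genes(pav_table, all_accessions):
    """Genes present in all accessions of the panel (often called 'core' genes)."""
    return shared_genes(pav_table, all_accessions)


print(shared_genes(pav, ["acc1", "acc3"]))          # genes shared by acc1 and acc3
print(accessions_with_all(pav, ["geneA", "geneB"]))  # accessions with both genes
print(core_genes(pav, ["acc1", "acc2", "acc3"]))     # genes found in every accession
```

Storing the table as sets keeps each query a simple subset or intersection test; for thousands of accessions a boolean matrix would likely be more memory-efficient.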
MBKbase Database
MBKbase (https://www.mbkbase.org/rice) is a comprehensive database for rice molecular breeding. By integrating high-quality rice reference genomes, it provides the genotype and associated phenotype at each gene locus in a population, reveals the relationships among germplasm, phenotype and genotype, and supports online integrated analysis and visualization of genotype and phenotype data. The database consists of five functional modules, including the following: the germplasm module provides information on rice germplasm resources, the phenotype module displays trait records, the genotype module integrates sequenced samples and SNP/InDel information to identify verified alleles, and the population module provides information on WGS populations and their phylogenetic trees. Together these modules provide important support for functional genomics research and molecular breeding of rice.
SNP-Seek Database
The rice SNP search database SNP-Seek (https://snpseek.irri.org/index.zul) was established by the International Rice Research Institute in 2014 and mainly provides information on rice genotypes, phenotypes and varieties. The database is based on the Nipponbare reference genome IRGSP-1.0, integrates SNP genotyping data from the 3,000 Rice Genomes Project, and includes phenotypic data from the International Rice Genebank Collection Information System (IRGCIS). Its purpose is to centralize access to rice research data and to provide computational tools for discovering new gene-trait associations and promoting rice improvement.
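As a rough illustration of how genotype and phenotype records can be combined to look for a gene-trait association, the sketch below groups a trait value by the allele each variety carries at one SNP. The variety names, alleles and trait values are invented, and a simple comparison of group means is only a first look, not a substitute for a proper association test.

```python
from collections import defaultdict
from statistics import mean


def mean_trait_by_allele(genotypes, phenotypes):
    """Average a trait per allele, skipping varieties without a phenotype record."""
    groups = defaultdict(list)
    for variety, allele in genotypes.items():
        if variety in phenotypes:
            groups[allele].append(phenotypes[variety])
    return {allele: mean(values) for allele, values in groups.items()}


# Hypothetical data: allele at one SNP per variety, and plant height in cm.
genotypes = {"v1": "A", "v2": "A", "v3": "G", "v4": "G"}
phenotypes = {"v1": 100.0, "v2": 110.0, "v3": 90.0}  # v4 has no phenotype record

result = mean_trait_by_allele(genotypes, phenotypes)
print(result)
```

Varieties with a genotype but no phenotype (here `v4`) are dropped rather than imputed, so the group sizes should be checked before interpreting any difference.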
Oryzabase Database
Oryzabase (https://shigen.nig.ac.jp/rice/oryzabase/), a comprehensive rice science database, was established by the Japan Rice Research Council in 2000 to collect as much information about rice as possible, including strain stocks, mutant information, chromosome maps, a gene dictionary and basic knowledge. The database spans research from classical rice genetics to the latest genomics and provides a wide range of genetic and genomic data on many active research topics.
Rice Gene-related Databases
RGI Database
The Rice Gene Index database RGI (https://riceome.hzau.edu.cn/) contains reference genomes of 16 major Asian rice accessions and processes the multiple genomes and their annotation in a unified way. A total of 119,783 non-redundant genes were identified to represent the whole Asian rice gene set, and a unified numeric Ortholog Gene Index (OGI) was established to indicate how each gene is represented across all accessions. RGI provides a wealth of modules and tools to help researchers query and visualize rice genes and their homologous relationships. Each gene has a "comprehensive graphic information card", which contains the homologous gene index, sequence and function information, and shows the transcript structure and phylogenetic tree.
RiceVarMap v2.0 Database
RiceVarMap v2.0 (http://ricevarmap.ncpgr.cn/), a comprehensive database of rice genome variation and functional annotation, provides information on 17,397,026 genomic variants, comprising 14,541,446 SNPs and 2,855,580 small INDELs, identified from sequencing data of 4,726 rice accessions. These variants were called with GATK against the Os-Nipponbare-Reference-IRGSP-1.0 reference genome, ensuring high-quality and complete genotype data. The database also contains phenotypic data and GWAS results to help researchers search for important SNPs related to various traits. The new version introduces more powerful query and visualization tools, such as an online tool for predicting regulatory variants.
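The variant counts quoted above can be checked with simple arithmetic, and the SNP/INDEL distinction itself follows from allele lengths. The sketch below assumes VCF-style REF and ALT alleles with a single alternate allele; the `variant_type` helper is hypothetical and is not the database's own classification code.

```python
def variant_type(ref: str, alt: str) -> str:
    """Classify a biallelic variant from its REF and ALT alleles."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"      # single-base substitution
    if len(ref) != len(alt):
        return "INDEL"    # insertion or deletion changes the allele length
    return "OTHER"        # e.g. multi-base substitution of equal length


# Counts as reported for RiceVarMap v2.0.
SNPS = 14_541_446
INDELS = 2_855_580
TOTAL = 17_397_026

# The two categories add up exactly to the reported total.
assert SNPS + INDELS == TOTAL

snp_fraction = SNPS / TOTAL
print(f"SNPs make up {snp_fraction:.1%} of all variants")
```

By this count, SNPs account for about 83.6% of the variants and small INDELs for the remaining 16.4%.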
RiceXPro Database
RiceXPro (https://ricexpro.dna.affrc.go.jp/) is a repository of rice gene expression profiles covering the whole growth process of rice plants under natural field conditions, rice seedlings treated with different plant hormones, and specific cell types and tissues isolated by laser microdissection (LMD). The database aims to characterize the expression profiles of all predicted genes in rice and to provide reference information for functional genomics. All expression profiles were generated on a single microarray platform, with probes based on the manually curated gene models in RAP-DB and the full-length rice cDNA sequences in the KOME database.
RiceENCODE Database
RiceENCODE (http://glab.hzau.edu.cn/RiceENCODE/index.html) is a comprehensive encyclopedia database of rice DNA elements that combines published three-dimensional chromatin interaction data (ChIA-PET, Hi-C) with epigenome data sets (ChIP-Seq, ATAC-Seq, MNase-Seq, FAIRE-Seq, WGBS, RNA-Seq). The database contains 694 data sets, making it the largest collection of rice epigenome data at present.
RiceGE Database
The Rice Functional Genomic Express Database RiceGE (http://signal.salk.edu/cgi-bin/RiceGE5) focuses on rice gene expression patterns and gene function annotation. The database integrates a large amount of transcriptome data generated by high-throughput sequencing and lets users query the expression of specific genes at different developmental stages or in response to environmental stimuli. It also contains information such as gene cloning status and gene function verification, which gives researchers a valuable resource for understanding the function and regulation of rice genes.
RiceRelativesGD Database
The rice relatives genome database RiceRelativesGD (http://ibi.zju.edu.cn/ricerelativesgd/) provides gene and genome resources for 16 rice-related species in Gramineae, including important rice weeds such as weedy rice and barnyard grass (Echinochloa crus-galli), the ancient crop Zizania latifolia, nine wild rice species and African cultivated rice. The database specifically provides the genes unique to indica and japonica rice in the various rice-related species, together with their annotations, and offers online services such as a genome browser, a search tool and a phylogenetic tree construction tool to help researchers make better use of the genome data.