Technical Report
Of Genes and Regulatory Elements: A New View of the Zebrafish Genome
By Priyanka Jamadagni
The technical report section features new methods and resources that are likely to be of broad use to the zebrafish community. We encourage technical report articles from early career researchers and trainees. If you are interested in contributing an article, please contact Karuna Sampath (email: K.Sampath@warwick.ac.uk). In this issue, we feature a technical report piece by Priyanka Jamadagni. Priyanka is a doctoral student at the Institut National de la Recherche Scientifique (INRS) in Canada, where she works in the laboratory of Dr. Kessen Patten. Priyanka’s research focuses on understanding the role of the chromatin remodeler CHD7 in brain development and in CHARGE syndrome/autism spectrum disorder using zebrafish as a model.
In this issue of the IZFS News Splash, Priyanka's technical report features a recent paper by Yang et al. (https://doi.org/10.1038/s41586-020-2962-9).
Since the zebrafish emerged in the 1980s as a promising genetic model for studying vertebrate development and modelling human diseases, there have been multiple efforts to sequence its genome and elucidate its structural and functional constitution. Key endeavours include: i) the zebrafish genome-sequencing project initiated at the Wellcome Trust Sanger Institute in 2001, which led to the most recent reference genome, GRCz11; ii) profiling of histone modification/methylation patterns, typically using ChIP-seq (Vastenhouw et al., 2010; Aday et al., 2011); and iii) deduction of higher-order enhancer chromatin structure (Kaaij et al., 2016). Despite the progress in assembling a reference zebrafish genome, detailed annotations of the cis-regulatory elements are incomplete and particularly limited in the context of tissue-specific regulators. In addition, poor sequence read quality arising from heavy heterochromatin and transposable elements in chromosome 4 has limited our understanding of large segments of what is thought to be a rudimentary zebrafish sex chromosome.
A recent article by Yang et al. now provides a new, comprehensive annotation of the cis-regulatory elements and their 3D structure within the zebrafish genome (Yang et al., 2020). The authors profiled eleven adult tissues and two embryonic tissues in an exhaustive analysis employing RNA-seq, ATAC-seq, whole-genome bisulfite sequencing (WGBS), Hi-C and ChIP-seq, generating 10 billion reads across 161 genomic data sets, along with a new reference genome assembly that resolves chromosome 4.
Profiling of transcribed regions identified 14,764 transcripts that exhibited tissue-specific patterns. 13,285 were identified as previously unknown transcripts, which included long noncoding RNAs (lncRNAs), previously unknown mRNA isoforms and importantly, 3,739 previously uncharacterized protein-coding genes. Amongst the novel zebrafish transcripts with one-to-one human orthologues, 47% were found to show tissue-specific patterns in humans, suggesting that these genes might play critical and conserved roles in the tissues where they are uniquely expressed. These data provide a rich resource that the zebrafish community can now mine, test and validate.
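As a rough illustration of what "tissue-specific" can mean quantitatively, the sketch below computes the tau specificity index for one transcript across tissues. This is an assumption made for illustration only: the summary above does not say which specificity metric Yang et al. used, and the function name and the example expression values are hypothetical.

```python
def tau_index(expression):
    """Tau tissue-specificity index for one transcript.

    `expression` holds non-negative expression values, one per tissue.
    Returns 0.0 for perfectly uniform expression and 1.0 when the
    transcript is detected in a single tissue only.
    NOTE: illustrative metric, not necessarily the one used in the paper.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("need expression values for at least two tissues")
    peak = max(expression)
    if peak <= 0:
        return 0.0  # not expressed anywhere: no specificity to report
    # Average shortfall of each tissue relative to the top tissue.
    return sum(1 - x / peak for x in expression) / (n - 1)


# Hypothetical profiles across four tissues.
brain_only = [10, 0, 0, 0]
housekeeping = [5, 5, 5, 5]
print(tau_index(brain_only), tau_index(housekeeping))
```

A transcript would then be called tissue-specific when its index exceeds a chosen cut-off; that threshold is a design choice and is not taken from the paper.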
Yang and co-workers identified 116,353 previously unidentified ATAC-seq peaks. Using defined criteria for the different classes of cis-regulatory elements, they found that 40.9% of the predicted promoters and 62.5% of the predicted enhancers were novel, and that 71.3% of these enhancers were tissue-specific, located near genes and potentially important for tissue-specific functions. Using a correlation-based strategy, they generated 340,527 ATAC-seq peak-to-gene links with a false discovery rate (FDR) below 0.01, and predicted 96,540 enhancer-to-gene links, 37,241 of which were also supported by peak-to-gene links. They also tested enhancer tissue-specificity with GFP-based reporter assays for 32 of the identified elements, and >90% showed restricted expression patterns. It remains to be determined whether the GFP-based reporters reliably depict the endogenous expression patterns, and whether the vast number of remaining enhancers can be validated.
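The general idea behind a correlation-based peak-to-gene link with FDR control can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the use of Pearson correlation, a permutation test for p-values and Benjamini-Hochberg adjustment are choices made here for the example, and all function names and inputs are hypothetical.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / (var_x * var_y) ** 0.5


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Empirical one-sided p-value for a positive correlation."""
    rng = random.Random(seed)
    observed = pearson(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(x, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def link_peaks_to_gene(gene_expr, peaks, fdr=0.01):
    """Names of peaks whose accessibility tracks gene expression.

    `gene_expr` is one gene's expression across tissues; `peaks` maps a
    peak name to its accessibility across the same tissues.
    """
    names = list(peaks)
    pvals = [permutation_pvalue(peaks[n], gene_expr) for n in names]
    qvals = benjamini_hochberg(pvals)
    return [n for n, q in zip(names, qvals) if q < fdr]
```

In practice the candidate peaks for each gene would be restricted to a genomic window around it, and the multiple-testing correction would run over all tested pairs genome-wide rather than per gene; both details are omitted here for brevity.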
Among the identified elements, promoters showed the highest degree of sequence conservation across species, whereas enhancers were less conserved. Between 60% and 90% of the conserved zebrafish enhancers were also predicted as candidate enhancer elements by the NIH ENCODE and Roadmap Epigenomics projects (http://www.roadmapepigenomics.org). Elements with no detectable sequence conservation in humans were conserved in other fish species, with increased fish PhyloP scores. Investigating these elements could enhance our understanding of fish-specific genome regulatory patterns and the evolution of regulatory elements. Through scATAC-seq, the authors identified 25 clusters of cells that display key cell-type-specific transcription factor motifs in the zebrafish brain. Motif enrichment analysis found neuronal transcription factors such as Sox9 and Olig2 enriched in different clusters, potentially suggesting cell-type-specific regulation by these factors in different areas of the zebrafish brain.
Through whole-genome bisulfite sequencing, the authors observed unmethylated regions (UMRs) that overlapped with candidate promoters and proximal ATAC-seq peaks, and low-level-methylated regions (LMRs) that overlapped more with candidate enhancers and distal ATAC-seq peaks. Enrichment of differentially methylated regions (DMRs) and tissue-specific hypomethylated DMRs (hypoDMRs) identified potential tissue-specific cis-regulatory elements. Analysis of tissue-specific enhancers identified motifs that were enriched in the same tissues in zebrafish and humans. A three-node network analysis showed that the overall patterns of the networks were highly similar between zebrafish and humans, underscoring the value of zebrafish as a model to study human transcription factor regulatory circuits.
To study higher-order chromatin structure, the authors generated high-resolution Hi-C data and predicted topologically associating domains (TADs) in the brain and in the muscle. Their analysis found that most of the loop anchors had convergent CTCF-binding motifs. In the brain, 98.6% of the predicted loops were between regions that contain at least one promoter or enhancer, and 91.6% of enhancer–promoter loops overlapped with predicted enhancer–promoter or distal ATAC-seq peak–promoter linkage pairs. Further motif analysis showed that CTCF and BORIS were enriched in shared loops, whereas tissue-specific transcription factors were enriched in tissue-specific chromatin interactions.
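For readers less familiar with the term, "convergent" means that the CTCF motif at the upstream anchor lies on the forward strand and the motif at the downstream anchor lies on the reverse strand, so the two motifs point towards each other. The sketch below classifies loops by this rule; the function names and the example loops are hypothetical and are not taken from the paper's data.

```python
def ctcf_orientation(left_strand, right_strand):
    """Classify a loop by the CTCF motif strands at its two anchors.

    `left_strand` is the motif strand at the upstream anchor and
    `right_strand` the strand at the downstream anchor ("+" or "-").
    """
    pair = (left_strand, right_strand)
    if pair == ("+", "-"):
        return "convergent"  # motifs point towards each other
    if pair == ("-", "+"):
        return "divergent"   # motifs point away from each other
    if pair in {("+", "+"), ("-", "-")}:
        return "tandem"      # motifs point the same way
    raise ValueError(f"unexpected strand pair: {pair!r}")


def convergent_fraction(loops):
    """Fraction of loops whose anchors carry convergent CTCF motifs."""
    if not loops:
        raise ValueError("no loops given")
    n_conv = sum(ctcf_orientation(l, r) == "convergent" for l, r in loops)
    return n_conv / len(loops)


# Hypothetical anchor strand pairs for four loops.
example_loops = [("+", "-"), ("+", "-"), ("-", "+"), ("+", "+")]
print(convergent_fraction(example_loops))
```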
The authors identified three sets of zebrafish evolutionary breakpoints by aligning the zebrafish genome against the chicken, mouse and human genomes, respectively, and found an association between TAD stability and conservation of the expression pattern. They suggest that strong chromatin interactions may contribute to TAD stability during evolution, and that breaking of TADs with strong interactions may be selected against because these interactions may be physiologically important. Next, the authors performed nanopore long-read sequencing, 10x Genomics sequencing and Bionano optical mapping on a single Tübingen female zebrafish. Combining these data with the Hi-C data from the brain allowed de novo assembly of a new version of chromosome 4. This represents an improved resource to study sex determination and the genes on this understudied zebrafish chromosome.
The work by Yang et al. provides an extremely detailed and comprehensive analysis of candidate cis-regulatory elements, opening a rich and wide genomic landscape to be explored by zebrafish researchers. This will facilitate discovery of the functions of the newly discovered genes and uncharacterised regulatory elements during growth and development. The datasets and reference genome assembled in this study will likely serve as valuable new resources for comparative genomics and can significantly enhance the study of gene regulation in normal physiology and disease models using zebrafish.
References:
1. Yang, H., Luan, Y., Liu, T. et al. A map of cis-regulatory elements and 3D genome structures in zebrafish. Nature 588, 337–343 (2020). https://doi.org/10.1038/s41586-020-2962-9
2. Vastenhouw, N., Zhang, Y., Woods, I. et al. Chromatin signature of embryonic pluripotency is established during genome activation. Nature 464, 922–926 (2010). https://doi.org/10.1038/nature08866
3. Aday, A.W., Zhu, L.J., Lakshmanan, A. et al. Identification of cis regulatory features in the embryonic zebrafish genome through large-scale profiling of H3K4me1 and H3K4me3 binding sites. Dev Biol 357, 450–462 (2011). https://doi.org/10.1016/j.ydbio.2011.03.007
4. Kaaij, L.J.T., Mokry, M., Zhou, M. et al. Enhancers reside in a unique epigenetic environment during early zebrafish development. Genome Biol 17, 146 (2016). https://doi.org/10.1186/s13059-016-1013-1
Priyanka Jamadagni
PhD student
Kessen Patten Lab
INRS - Centre Armand Frappier Santé Biotechnologie.
Laval, QC H7V 1B7
Email: Priyanka.jamadagni@iaf.inrs.ca