Advances in DNA Sequencing Technologies for High Resolution HLA Typing
Recent advances in DNA sequencing technologies, so-called Next Generation Sequencing (NGS), have brought breakthroughs in deciphering the genetic information of all living species at large scale and at an affordable cost.
With the introduction of DNA barcode (index) sequences, it became possible to multiplex samples from hundreds of individuals, so that selected genomic regions can be genotyped faster, more cheaply, and at higher resolution.
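The idea of barcode multiplexing can be sketched in a few lines of Python. This is a minimal illustration, not the laboratory's pipeline: the barcode sequences, sample names, and the in-line 6-base prefix layout are all hypothetical, and on Illumina instruments the index is typically read in a separate index read rather than as a prefix of the insert.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample table; real index sequences and lengths differ.
BARCODES = {
    "ACGTAC": "sample_001",
    "TGCATG": "sample_002",
}
BARCODE_LEN = 6  # assumed in-line barcode length


def demultiplex(reads, barcodes=BARCODES, barcode_len=BARCODE_LEN):
    """Group pooled reads by the sample whose barcode exactly matches the read prefix."""
    by_sample = defaultdict(list)
    for read in reads:
        prefix, insert = read[:barcode_len], read[barcode_len:]
        sample = barcodes.get(prefix)
        if sample is None:
            # Keep the whole read so unassigned data can be inspected later.
            by_sample["unassigned"].append(read)
        else:
            # Store only the insert, with the barcode trimmed off.
            by_sample[sample].append(insert)
    return dict(by_sample)


# Toy pooled run: two reads from sample_001, one from sample_002, one unreadable barcode.
pooled = ["ACGTACGGATTC", "TGCATGGGATCC", "ACGTACGGATTA", "NNNNNNGGATTC"]
groups = demultiplex(pooled)
```

Exact matching is the simplest possible rule; production demultiplexers usually tolerate a small number of barcode mismatches, which is omitted here.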
Here we present Histogenetics' experience and accomplishments in applying NGS to large-scale, high-resolution HLA typing. Histogenetics established Sanger capillary sequencing in 2006 for large-volume sequence-based typing (SBT), and more than 3.8 million samples were typed with that technique. This existing infrastructure helped us transition to NGS technologies without compromising accuracy, typing volume, or speed. In March 2013 Histogenetics introduced a hybrid approach combining Sanger and Illumina MiSeq DNA sequencing. A total of 460,190 samples were typed with MiSeq + Sanger to validate the MiSeq data during the transition to NGS, as shown in the table below.
High-resolution typing was achieved using the MiSeq NGS platform. A comparison of the resolution levels of the NGS and Sanger sequencing techniques for registry donors is shown below.
After establishing the new platform, in October 2013 we introduced Illumina MiSeq as the first-line method for high-volume, high-resolution HLA typing. To date we have typed close to 5 million individuals using SBT. While pushing for higher typing volume, we have also excelled in quality and accuracy, thanks to the strict quality control and quality assurance policy established in Histogenetics' high-throughput HLA typing process.
The National Marrow Donor Program (NMDP) is one of Histogenetics' major clients and has a strict quality control program in which, on average, 3% of the samples in every testing batch are blind QC samples. The table below shows error-free typing for NMDP registry donors.
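The blind QC scheme described above amounts to simple arithmetic plus a concordance check. The sketch below is an illustration under stated assumptions: the batch size, sample identifiers, and allele calls are made up, and the function names are hypothetical rather than part of any NMDP or Histogenetics system.

```python
def expected_qc_count(batch_size, qc_fraction=0.03):
    """Expected number of blind QC samples in a batch at the stated average rate."""
    return round(batch_size * qc_fraction)


def concordance(reported, expected):
    """Return (concordance rate, discordant sample IDs) for the blind QC samples.

    Each typing is a pair of alleles at one locus; the order of the two
    alleles is arbitrary, so typings are compared after sorting.
    """
    discordant = [
        sample_id
        for sample_id, alleles in expected.items()
        if sorted(reported.get(sample_id, ())) != sorted(alleles)
    ]
    rate = 1 - len(discordant) / len(expected)
    return rate, discordant


# Made-up example: a 1000-sample batch and two blind QC samples typed at HLA-A.
n_qc = expected_qc_count(1000)
expected = {
    "qc1": ("A*01:01", "A*02:01"),
    "qc2": ("A*03:01", "A*24:02"),
}
reported = {
    "qc1": ("A*02:01", "A*01:01"),  # same typing, alleles listed in the other order
    "qc2": ("A*03:01", "A*24:02"),
}
rate, discordant = concordance(reported, expected)
```

In this toy case every blind QC sample is concordant, which corresponds to the error-free result the text refers to.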
Despite these excellent results with Illumina MiSeq technology, we have been exploring other technologies, such as single-molecule sequencing on the Pacific Biosciences RS II.
Compared with Sanger sequencing and other NGS platforms, the MiSeq platform has delivered higher-resolution HLA typing results that are faster and more cost-effective to obtain, with an easier workflow. However, it has some shortcomings: its read length is shorter than that of Sanger and PacBio, which could result in missed insertions, and it cannot phase the exons unless additional amplicons are introduced. In addition, MiSeq has a long run time and produces sequencing artifacts in certain amplicons. Finally, depending on a single technology and a single company is a risk if reagent quality fails or becomes substandard.
The PacBio platform has the following advantages over the MiSeq platform: long read lengths with excellent phasing of the exons and introns, and short run times. It also provides us with an excellent alternative technology. The disadvantages of PacBio compared with MiSeq are limited barcoding (multiplexing) capacity and a longer sample preparation time.
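The phasing point in this comparison can be illustrated with a toy example. When two heterozygous positions are too far apart to be covered by one short read, the unphased calls are consistent with more than one pair of haplotypes; a single long read spanning both positions selects one of them. The sketch below assumes made-up bases and hypothetical function names, and ignores sequencing error.

```python
from itertools import product


def possible_haplotype_pairs(unphased_sites):
    """Enumerate haplotype pairs consistent with unphased heterozygous calls.

    unphased_sites is a list of (base_a, base_b) tuples, one per heterozygous
    site. With n sites there are 2 ** (n - 1) distinct unordered pairs.
    """
    pairs = set()
    for choice in product((0, 1), repeat=len(unphased_sites)):
        hap1 = "".join(site[c] for site, c in zip(unphased_sites, choice))
        hap2 = "".join(site[1 - c] for site, c in zip(unphased_sites, choice))
        pairs.add(frozenset((hap1, hap2)))
    return pairs


def resolve_with_long_read(pairs, observed_haplotype):
    """Keep only the pairs that contain a haplotype observed on one long read."""
    return [pair for pair in pairs if observed_haplotype in pair]


# One heterozygous site in exon 2 (A/G) and one in exon 3 (C/T), unphased.
sites = [("A", "G"), ("C", "T")]
ambiguous = possible_haplotype_pairs(sites)

# A long read covering both exons shows A and C on the same molecule.
resolved = resolve_with_long_read(ambiguous, "AC")
```

Here `ambiguous` holds two candidate pairs (AC with GT, or AT with GC), and the long read leaves only the first.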
Since October 2014, we have routinely used PacBio for class I typing to resolve exon-shuffling ambiguities and new alleles. We have performed more than 5,000 HLA-ABC typings on the PacBio platform, sequencing a 1 kb amplicon that includes the ARS region (exon 2 and exon 3). We are incrementally extending the coverage length, and for special projects we can now routinely type the full gene length of approximately 3.5 kb, which includes eight exons and seven introns. Typing full-length class II genes is more challenging because of their length, approximately 18 kb or longer. However, typing of 5 kb fragments that include exon 2 and the rest of the downstream exons to the 3’ translated regions is underway.
Another very important issue with NGS is the interpretation, presentation, and visualization of the data. The focus should be on unambiguously matching patients and potential donors in the regions that define the Antigen Recognition Sites (ARS), while noting the similarities and variations in other regions of the gene.
Below is an example presentation of sequence matching at the ARS between a patient and potential donors.
The figure above is a schematic presentation of an HLA typing report that compares a patient and potential donors, focusing on the ARS regions.
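The kind of comparison such a report summarizes can be sketched as follows. This is a simplified illustration, not the reporting software: the per-exon sequences are short made-up strings, only one allele per individual is compared, the sequences are assumed to be already aligned to equal length, and the function names are hypothetical. The ARS exons are taken as exon 2 and exon 3, as stated earlier for class I.

```python
ARS_EXONS = frozenset({"exon2", "exon3"})  # class I ARS exons, per the text


def mismatches(seq_a, seq_b):
    """Return the 1-based positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    return [i for i, (x, y) in enumerate(zip(seq_a, seq_b), start=1) if x != y]


def compare_alleles(patient, donor, ars_exons=ARS_EXONS):
    """Compare patient and donor region by region, separating ARS from other regions."""
    report = {"ars_match": True, "ars_mismatches": {}, "other_differences": {}}
    for region, patient_seq in patient.items():
        diffs = mismatches(patient_seq, donor[region])
        if not diffs:
            continue
        if region in ars_exons:
            report["ars_match"] = False
            report["ars_mismatches"][region] = diffs
        else:
            # Differences outside the ARS are noted but do not break the ARS match.
            report["other_differences"][region] = diffs
    return report


# Toy sequences: donor_a differs only outside the ARS, donor_b differs in exon 2.
patient = {"exon1": "ATGGCC", "exon2": "GCTCCC", "exon3": "GGGTCT", "exon4": "GACCCC"}
donor_a = {"exon1": "ATGGCC", "exon2": "GCTCCC", "exon3": "GGGTCT", "exon4": "GACCCT"}
donor_b = {"exon1": "ATGGCC", "exon2": "GCTCAC", "exon3": "GGGTCT", "exon4": "GACCCC"}

report_a = compare_alleles(patient, donor_a)
report_b = compare_alleles(patient, donor_b)
```

In this example `donor_a` is an ARS match with one noted difference in exon 4, while `donor_b` is an ARS mismatch at position 5 of exon 2.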
Recent progress in sequencing technologies and laboratory processes, together with advanced informatics, enables us to obtain a clearer representation of the MHC and other immune response genes. This will in turn help us understand the puzzles of complex genetic systems that can serve as the basis for health and disease.