RDML qPCR Data Format – Ready For The Next Level?

Andreas Untergasser1, Steve Lefever2, Jasper Anckaert2, Jan M Ruijter3, Jan Hellemans4, Jo Vandesompele2,4
1University of Heidelberg, Heidelberg, Germany; 
2Ghent University, Ghent, Belgium; 
3Academic Medical Center, Amsterdam, The Netherlands; 
4Biogazelle, Zwijnaarde, Belgium

Quantitative PCR (qPCR) is the gold standard method for accurate and sensitive nucleic acid quantification. To improve the quality and transparency of experiment design, data analysis and reporting of results, the MIQE guidelines were established in 2009 (Bustin et al., Clinical Chemistry). The Real-time PCR Data Markup Language (RDML) was designed to establish a vendor-independent, freely available, XML-based file format to store and exchange qPCR data (Lefever et al., NAR). RDML stores the raw data acquired by the machine as well as the information required for its interpretation, such as sample annotation, primer and probe sequences and cycling protocol.
Today, several instrument manufacturers have realized its potential and implemented functionality to export data in the RDML format. Third-party software (LinRegPCR and qbasePLUS) uses this information for advanced data analysis. Because RDML is so flexible, most current software uses only parts of the format. Furthermore, with different RDML versions available, the need to convert between versions became obvious. The open source editor RDML-Ninja was designed to edit RDML files and convert between different versions (sourceforge.net/projects/qpcr-ninja/). It should serve as a reference implementation of the RDML format and assist researchers, reviewers and software developers by offering access to all data in an RDML file.
Ultimately, RDML could be extended to store all information required by MIQE. Currently, the information required by MIQE seems overwhelming to a researcher, but RDML offers an easy way out. All the information would be entered only once and stored in a basic RDML file. Researchers would not have to re-enter this information with every qPCR run, but would import from this RDML file only the parts needed for the current run. Furthermore, integrating MIQE into RDML and RDML-Ninja would make it possible to check to what extent MIQE information is provided, by calculating the checklist completeness from a given RDML file. We would like to discuss this vision, its chances and its applicability.
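The checklist-completeness idea can be illustrated with a minimal sketch. It assumes a simplified, un-namespaced XML document and a hypothetical four-item checklist; the element paths, the checklist items and the function name checklist_completeness are illustrative assumptions and do not reproduce the actual RDML schema, the MIQE checklist or the RDML-Ninja implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified RDML-like document. Real RDML files are
# namespaced and much richer; these element names are illustrative only.
EXAMPLE_XML = """
<rdml>
  <sample id="liver_1"><description>Mouse liver cDNA</description></sample>
  <target id="Gapdh">
    <sequences>
      <forwardPrimer><sequence>ACGTACGTACGT</sequence></forwardPrimer>
      <reversePrimer><sequence>TGCATGCATGCA</sequence></reversePrimer>
    </sequences>
  </target>
</rdml>
"""

# Hypothetical checklist: item label -> element path that must exist
# and contain non-empty text for the item to count as provided.
CHECKLIST = {
    "sample description": "sample/description",
    "forward primer sequence": "target/sequences/forwardPrimer/sequence",
    "reverse primer sequence": "target/sequences/reversePrimer/sequence",
    "thermal cycling protocol": "thermalCyclingConditions/step",
}


def checklist_completeness(xml_text, checklist):
    """Return (fraction of items provided, list of missing items)."""
    root = ET.fromstring(xml_text)
    missing = []
    for item, path in checklist.items():
        node = root.find(path)
        if node is None or not (node.text or "").strip():
            missing.append(item)
    fraction = 1 - len(missing) / len(checklist)
    return fraction, missing


fraction, missing = checklist_completeness(EXAMPLE_XML, CHECKLIST)
print(f"{fraction:.0%} complete; missing: {missing}")
```

In this toy file the cycling protocol is absent, so three of the four assumed items are reported as provided. A real checker would need the namespaced paths of the RDML version in use and would probably have to evaluate every sample and target rather than only the first match.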

Back to qPCR BioStatistics & BioInformatics

Impact of Smoothing on Parameter Estimation in Quantitative DNA Amplification Experiments

Stefan Rödiger1, Andrej-Nikolai Spiess2, Michał Burdukiewicz3
1BTU Cottbus – Senftenberg, Senftenberg, Germany; 
2University Medical Center Hamburg-Eppendorf, Hamburg, Germany;
3University of Wroclaw, Wroclaw, Poland

Quantitative real-time polymerase chain reaction (qPCR) is one of the most precise DNA quantification methods. The parameters quantification cycle (Cq) and amplification efficiency (AE) are commonly calculated from distinct location indices of the amplification curve (threshold fluorescence, first- or second-derivative maxima) to quantify qPCR reactions. Consequently, precise curve analysis is a prerequisite for quantifying the copy number in samples [1]. Several smoothing and filtering methods for minimizing inherent noise in qPCR data have been proposed in the peer-reviewed literature. Smoothing steps are frequently employed during amplification curve analysis and generally taken for granted, which raises the question of whether any of these methods should be used without attention to their possible implications.
The smoothers and filters we compared in our investigation are widely used to compensate for noisy data. We found no fundamental controversy in the scientific community about the smoothers and filters used in our study. All of them are thoroughly tested, peer-reviewed, and well accepted. In our study we specifically addressed the question of which smoothers are appropriate for amplification curve data acquired by isothermal amplification or qPCR.
Due to the lack of comprehensive models, we chose an empirical approach in combination with amplification curve simulation to evaluate the smoother and filter functions in a testable scenario. For this purpose, we analyzed the impact of the smoothing methods on real-world data from different thermal cyclers (low-throughput and high-throughput instruments) as well as different amplification methods. We also used “user-controlled” noise structures based on Monte Carlo simulations in our analysis.
Our results indicate that the choice of smoothing algorithm affects the estimation of Cq and AE considerably. The commonly employed moving average filter performed worst in all qPCR scenarios. The least bias was observed for the Savitzky-Golay smoother, cubic splines and the Whittaker smoother. In general, these showed a low sensitivity to differences in AE, whereas other smoothers such as the running mean introduced a significant AE-dependent bias. We developed open source software packages to facilitate the selection of smoothing algorithms that can be incorporated in an analysis pipeline of qPCR experiments. The findings of our study were implemented in the R packages chipPCR and qpcR [2,3], freely available from “The Comprehensive R Archive Network”. We anticipate that our findings will serve as guidelines for the selection of an appropriate smoothing algorithm in diagnostic qPCR applications. However, the general feasibility of qPCR data smoothing remains to be demonstrated.
[1] Pabinger and Rödiger et al., Biomolecular Detection and Quantification (2014), 1/1, 23-33. [2] Spiess AN et al., Clinical Chemistry (2015), preprint. [3] Rödiger et al. (under revision), Bioinformatics (Oxf.)
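One way a smoother can bias Cq can be shown with a toy calculation. The sketch below assumes a noise-free logistic amplification curve, a centred five-point moving average and a fixed fluorescence threshold of 0.05; all function names and parameter values are illustrative assumptions. It does not reproduce the study's data or the chipPCR/qpcR implementations.

```python
import math


def logistic_curve(n_cycles=40, plateau=1.0, midpoint=20.0, slope=1.5):
    """Noise-free sigmoid fluorescence for cycles 1..n_cycles."""
    return [plateau / (1 + math.exp(-(c - midpoint) / slope))
            for c in range(1, n_cycles + 1)]


def moving_average(values, window=5):
    """Centred moving average; the window is truncated at both ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def threshold_cq(values, threshold):
    """Fractional cycle at which the curve first crosses the threshold."""
    for i in range(1, len(values)):
        if values[i - 1] < threshold <= values[i]:
            frac = (threshold - values[i - 1]) / (values[i] - values[i - 1])
            return i + frac  # values[i - 1] belongs to cycle i
    return None


raw = logistic_curve()
smoothed = moving_average(raw)
cq_raw = threshold_cq(raw, 0.05)
cq_smoothed = threshold_cq(smoothed, 0.05)
print(f"Cq raw: {cq_raw:.2f}, Cq after moving average: {cq_smoothed:.2f}")
```

Because the early part of the curve is convex, averaging neighbouring cycles raises the fluorescence values there, so the smoothed curve crosses the threshold earlier even without any noise. Since the size of this shift depends on the steepness of the curve, a bias of this kind would be expected to depend on AE.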

Occurrence of unexpected PCR artefacts warrants thorough quality control

Adrián Ruiz-Villalba1, Bep van Pelt-Verkuil2, Quinn Gunst1, Maurice van den Hoff1, Jan Ruijter1
1Department of Anatomy, Embryology and Physiology, Academic Medical Centre (AMC), Amsterdam, The Netherlands; 
2Department of Innovative Molecular Diagnostics, University of Applied Sciences, Leiden, the Netherlands

A recent comparison of qPCR data analysis methods showed that some of the amplification curve analysis methods perform better than the classic standard curve and Cq approach on indicators like variability and sensitivity. In this comparison the possibility to characterize the amplification curve and thus assess its quality remained under-exposed. To this end, different datasets have been compared. Our results show that for a significant fraction of the genes, low initial target concentrations lead to the amplification of artifacts independently of the primer specificity. These non-specific amplification curves are indistinguishable from those resulting in the correct products; they show similar baseline, PCR efficiency and plateau fluorescence behaviours. The validation of specific amplification curves requires a quality control in which the design of the plate, a melting curve analysis (MCA) and electrophoresis gels are combined. In addition, our data suggest that the relative concentration of the template in the cDNA input and of the primers determines the appearance of the PCR artifacts. Unexpectedly, the presence of non-template foreign cDNA seems to be an essential requirement for the amplification of the correct specific qPCR target.

The PrimerBank database: an analysis of high-throughput primer validation

Athanasia Spandidos1,2,3, Xiaowei Wang1,2,4, Huajun Wang1,2, Brian Seed1,2
1Center for Computational and Integrative Biology, Massachusetts General Hospital, Boston, MA; 
2Department of Genetics, Harvard Medical School; 
3Current address: 1st Department of Pathology, National and Kapodistrian University of Athens, Athens, Greece;
4Current address: Division of Bioinformatics and Outcomes Research, Department of Radiation Oncology, Washington University School of Medicine, St Louis, MO.

qPCR remains the gold standard for validating gene expression measurements from high-throughput methods such as DNA microarrays; however, non-specific amplification is frequently an issue. To overcome this, we developed the PrimerBank database, a public resource containing primers that can be used under stringent and allele-invariant amplification conditions. PrimerBank can be used for the retrieval of human and mouse primer pairs for gene expression analysis by PCR and RT-qPCR. Currently, the database contains 497,156 primers which cover 17,076 and 18,086 genes for the human and mouse species, respectively, corresponding to around 94% of all known protein-coding gene sequences. PrimerBank also contains information on these primers such as Tm, location on the transcript and expected PCR product size. Primer pairs covering most known mouse genes have been experimentally validated by amplification plot, gel electrophoresis, DNA sequence and thermal denaturation profile analysis, and all the experimental validation information together with primer information can be freely retrieved from the PrimerBank website (http://pga.mgh.harvard.edu/primerbank/). The database can be searched using various search terms. One of the advantages of PrimerBank primers is that they have been designed to work under a common PCR thermal profile, so they can be used for high-throughput RT-qPCR in parallel or genome-wide RT-qPCR. The expression profiles of thousands of genes can be determined simultaneously using available high-throughput platforms, making PrimerBank primers useful for gene expression analysis on a genome-wide scale.

Unexpected System-specific Periodicity In Quantitative Real-Time Polymerase Chain Reaction Data And Its Impact On Quantification

Andrej-Nikolai Spiess1, Stefan Rödiger2, Thomas Volksdorf3, Joel Tellinghuisen4
1Department of Andrology, University Hospital Hamburg-Eppendorf, Germany; 
2Faculty of Natural Sciences, BTU Cottbus – Senftenberg, Cottbus, Germany; 
3Department of Dermatology, University Hospital Hamburg-Eppendorf, Germany; 
4Department of Chemistry, Vanderbilt University, Nashville, Tennessee, USA

The “baseline noise” of quantitative real-time PCR (qPCR) data is a feature of every qPCR curve and has substantial impact on quantitation. In principle, two different forms of baseline noise can be encountered: (i) the dispersion of fluorescence values in the first few cycles of a qPCR curve around their mean (within-sample noise) and (ii) the dispersion of fluorescence values between different qPCR curves at the same cycle (between-sample noise). The predominant effect that results in between-sample noise is an overall shift of the qPCR curve on the y-axis (“baseline shift”), which is frequently compensated by “baselining” qPCR data. Common approaches are to subtract an averaged (Lievens et al., 2012; Rutledge, 2011; Goll et al., 2006), iteratively estimated (Ramakers et al., 2003; Ruijter et al., 2009) or lower asymptote derived (Tichopad et al., 2003; Peirson et al., 2003; Spiess et al., 2008) baseline value from all fluorescence values prior to quantitation (compare Table 1 in Ruijter et al., 2013).
Recently, we showed preliminary results on a published large scale technical replicate dataset (Ruijter et al., 2013) that indicated between-sample periodicity for fluorescence values at early and late cycle numbers (Tellinghuisen & Spiess, 2014). A more detailed interrogation of the between-run noise periodicity revealed that this effect occurs at all cycle numbers and constitutes a second and completely independent noise component that adds to the overall baseline shift. Most importantly, periodic noise persists even after classical “baselining” and results in a propagation of periodicity into estimated Cq values when using fixed threshold methods (LinReg, FPKM, DART, FPLM), hence resulting in periodic Cq values. In contrast, Cq values obtained from variable threshold methods based on first- or second-derivative maxima (Cy0, Miner, 5PSM) or from normalization of fluorescence data are completely devoid of periodic noise, corroborating the feasibility of these approaches.
The origin of periodic noise in qPCR data remains elusive. By employing a larger cohort of published and self-generated high-replicate qPCR data from different platforms, we used classical algorithms of time series/signal analysis (i.e. autocorrelation analysis) to characterize the periodicity in more detail. Interestingly, we generally observed a periodicity of 24/12 for 384/96-well plate systems, respectively. These findings strongly suggest an effect of uneven temperature profiles in Peltier block systems or variable liquid deposition of manual/automated multichannel pipetting systems, manifesting themselves as periodic qPCR data. We will present ways to eliminate periodic noise from qPCR data, resulting in a more reliable estimation of Cq values.
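How autocorrelation analysis can expose such a periodicity is sketched below on synthetic data. The sketch assumes replicate Cq values listed in row-wise well order, with a noise-free sinusoidal component whose period equals the number of plate columns. The amplitude, the helper names and the restriction to lags of 2 or more are illustrative assumptions and are not taken from the datasets or methods of the study.

```python
import math


def autocorrelation(values, lag):
    """Sample autocorrelation of a sequence at a given positive lag."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    cov = sum((values[i] - mean) * (values[i + lag] - mean)
              for i in range(n - lag))
    return cov / var


def dominant_period(values, max_lag):
    """Lag between 2 and max_lag with the highest autocorrelation."""
    acf = {lag: autocorrelation(values, lag) for lag in range(2, max_lag + 1)}
    return max(acf, key=acf.get)


def synthetic_plate(n_wells, period, base_cq=20.0, amplitude=0.1):
    """Hypothetical replicate Cq values with a column-wise periodic shift."""
    return [base_cq + amplitude * math.sin(2 * math.pi * i / period)
            for i in range(n_wells)]


plate_384 = synthetic_plate(384, 24)  # 16 rows x 24 columns
plate_96 = synthetic_plate(96, 12)    # 8 rows x 12 columns
print(dominant_period(plate_384, 36), dominant_period(plate_96, 18))
```

On real data, random noise lowers the autocorrelation peaks, so the full autocorrelation function is usually inspected rather than only its maximum.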

Removal of Between-Plate Variation in qPCR with Factor Correction: Completion of the Analysis Pipeline Supported by RDML

Jan Ruijter1, Jan Hellemans2, Adrian Ruiz-Villalba1, Maurice Van Den Hoff1, Andreas Untergasser3
1Academic Medical Center, the Netherlands; 
2Biogazelle, Belgium; 
3Heidelberg University, Heidelberg, Germany

Quantitative PCR is the method of choice in gene expression analysis. However, the number of experimental conditions, target genes and technical replicates quickly exceeds the capacity of the qPCR machines. Statistical analysis of the resulting data then requires the correction of between-plate variation. The use of calibrator samples, with replicate measurements distributed over the plates, assumes a multiplicative difference between plates. However, random and technical errors in these calibrators will propagate to all samples on the plate. To avoid this effect, the systematic bias can be corrected more reliably with Factor Correction when there is maximal overlap between plates [Ruijter et al. Retrovirology, 2006]. The original Factor Correction program is based on Excel input and calculates corrected target quantities. To implement this correction in the analysis pipeline from raw data through LinRegPCR into qbase-plus, a new version of the program was created to handle RDML files. This version saves the corrected N0 values as efficiency-corrected Cq values to be used in further calculations. This program thus completes the analysis pipeline of qPCR data supported by RDML.
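The multiplicative model behind this kind of correction can be sketched as follows. The example assumes that every condition is measured on every plate (full overlap) and estimates one factor per plate as the geometric mean of that plate's ratios to the per-condition geometric means. This is a simplified illustration, not the published Factor Correction program; the function name factor_correct and the numbers are hypothetical.

```python
import math


def factor_correct(plates):
    """Estimate one multiplicative factor per plate and divide it out.

    plates maps plate name -> {condition: measured quantity}. Only the
    conditions present on all plates are used to estimate the factors.
    """
    shared = set.intersection(*(set(p) for p in plates.values()))
    # Per-condition mean of log quantities across plates.
    log_mean = {c: sum(math.log(p[c]) for p in plates.values()) / len(plates)
                for c in shared}
    factors = {}
    for name, p in plates.items():
        deviations = [math.log(p[c]) - log_mean[c] for c in shared]
        factors[name] = math.exp(sum(deviations) / len(deviations))
    corrected = {name: {c: q / factors[name] for c, q in p.items()}
                 for name, p in plates.items()}
    return factors, corrected


# Hypothetical quantities: plate_2 reads four times higher than plate_1.
plates = {
    "plate_1": {"control": 50.0, "treated": 200.0},
    "plate_2": {"control": 200.0, "treated": 800.0},
}
factors, corrected = factor_correct(plates)
print(factors)
print(corrected)
```

Because every overlapping condition contributes to the factor estimate, an error in a single measurement is averaged out rather than passed on to the whole plate, which is the advantage over a single calibrator sample described above.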

Improved Small RNA Library Preparation Workflow for Next Generation Sequencing

Sabrina Shore, Jordana Henderson, Anton McCaffrey, Gerald Zon, Richard Hogrefe
TriLink Biotechnologies, United States of America

Next generation sequencing (NGS) can be used to analyze microRNAs (miRNAs), small non-coding RNAs that are important therapeutic targets and diagnostic markers. Commercially available small RNA sequencing library preparation kits require large inputs (100 ng) and a laborious gel purification step, which is not amenable to automation. Additionally, commercial kits are hindered by adapter dimer formation, where 5′ and 3′ adapters ligate without an intervening RNA insert. Adapter dimers preferentially amplify relative to the library during PCR amplification. This is exacerbated at low RNA inputs, where adapter dimers can greatly diminish usable sequencing reads. We describe an optimized small RNA library preparation workflow which suppresses adapter dimers, allows for RNA inputs as low as 1 ng and eliminates the need for gel purification. Chemically modified adapters and an optimized protocol were employed to suppress adapter dimers while still allowing for efficient library ligation. Library preparation with modified adapters was compared to the Illumina TruSeq® Small RNA Sample Prep Kit. Non-gel purified samples were purified with the Agencourt® AMPure® XP Kit. Samples were sequenced on an Illumina HiSeq™. Our modified adapter workflow was benchmarked against the TruSeq® Kit at 100 and 10 ng inputs. The modified adapter workflow allows RNA inputs as low as 1 ng and generates less than 1% adapter dimer reads when gel purified (Table 1). In contrast, the TruSeq® Kit yields 14% and 51% adapter dimer reads at 100 and 10 ng inputs, respectively. TriLink’s modified adapter workflow with magnetic bead-based size selection yields less than 10% dimer for all input levels, while the TruSeq® Kit results in a minimum of 14% dimer reads at the highest input level. TriLink’s modified adapter workflow improves small RNA library preparation by significantly reducing adapter dimer formation.
In fact, our improvements allow for sequencing from 1 ng of total RNA without compromising valuable sequencing reads, which was previously not feasible. TriLink’s workflow with magnetic bead-based size selection, an automatable technique, results in lower amounts of dimer reads than current methods using gel purification. Replacement of gel purification with an automatable purification step results in less hands-on time, better reproducibility and higher throughput. The modified adapter workflow surpasses other currently available technologies and provides significant improvement to small RNA NGS.

Back to Non-coding RNAs 2

Development and Optimisation of PCR Assays to Analyse MicroRNAs and their Target Genes

David Arthur Simpson
Queen’s University Belfast, United Kingdom

Introduction: Endothelial progenitor cells (EPCs) isolated from blood release microRNA-containing extracellular vesicles (EVs) which can potentially be harnessed to modulate angiogenesis as a treatment for vascular disease. These EVs contribute to the pool of circulating microRNAs, which provide biomarkers for many conditions. Quantification of microRNAs is required both to study their role in vascular repair and to exploit their potential as biomarkers. Multiple issues need to be considered in the design of a successful assay. RT-PCR reagents and template concentrations must be optimised. The final reaction volume must be minimised to reduce reagent use and amount of template required. Increasing the rate of thermal cycling brings time-savings and can be critical for certain clinical applications. The specificity of the assay with regard to microRNA isomiRs must be defined.
Methods: Probe-based assays and polyadenylation followed by oligo-dT primed reverse transcription and PCR with SYBR Green were employed. PCR reactions were set up using an Echo liquid handler (Labcyte), which uses acoustic energy to transfer 25 nl droplets. qPCR was performed using a 384-well LightCycler 480 (Roche) or a rapid thermal cycler (xxpress, BJS Biotechnologies). RT-PCR products were sequenced using the Ion Torrent platform (Life Technologies).
Results: The quality of data was maintained as qPCR volumes were reduced to 2 µl. MicroRNAs can be amplified from plasma in less than 10 min using the xxpress cycler. Sequencing of RT-PCR products provides a profile of isomiRs for specific microRNAs comparable to sequencing of entire small RNA libraries.
Conclusion: The ability to accurately transfer nanolitre volumes and therefore adopt very low reaction volumes facilitates rapid optimisation of PCR reaction conditions and saves reagents and template. Characterisation of microRNA isomiRs and rapid detection of them from plasma broadens their potential as clinical biomarkers.

MiRNA Profiling In Tumor Tissue, Body Fluids And Exosomes – A Combinational Techniques Approach Of NGS And QPCR.

Robert P. Loewe
GeneWake GmbH, Germany

miRNA has gained a pivotal role in molecular diagnostics and disease analysis. Due to its stable nature and its abundant presence in the tissue of origin, in microparticles or freely circulating, it is a viable and interesting analyte. To understand the dynamics and spectrum of this biological dilution – from the production site down to circulation – we analysed glioblastoma tissue in conjunction with cerebrospinal fluid (CSF) and serum from the same patients. As the generation of microvesicles, general gene expression and epigenetic changes (methylation and hydroxymethylation of DNA) might mechanistically contribute to the phenomenon, these characteristics were also measured. miRNA was primarily analysed via miRNA-Seq on a HiSeq instrument to allow a certain depth of the data. The data from the different biological levels and source materials were brought together and complemented with further information obtained by qPCR. The profiling procedure and results underlining the inherent dynamics will be presented.

New sensitive and specific method for microRNA analysis

Robert Sjöback1, Lukáš Valihrach2, Mikael Kubista1,2
1TATAA Biocenter AB, Sweden; 
2Institute of Biotechnology, CAS, Czech Republic

MicroRNAs (miRNAs) are small non-coding RNAs that function as important biological regulators in viruses, plants, and animals (1). The single-stranded miRNAs regulate gene expression at the post-transcriptional level, and the dysregulation of miRNAs is associated with various human diseases (2-4). Recent reports show that microRNAs are abundant not only in tissues but also in body fluids, and they show great potential as minimally invasive biomarkers in the diagnosis and prognosis of cancers and other diseases (5, 6). However, there are two main challenges when analyzing miRNAs by molecular methods: 1) miRNAs are short, usually not more than 22 nucleotides, which is the length of conventional PCR primers; 2) closely related miRNAs may differ in only a single nucleotide position. Current methods approach these challenges by extending the length of the miRNA using either miRNA-specific RT primers or non-specific RT primers, which compromises specificity and sensitivity (7). Here we present a novel method for the detection and quantification of short nucleic acids that has higher sensitivity than current approaches with concomitant enhanced specificity.
1) He et al., Nat Rev Genetics 2004, 5: 522-31. 2) Esquela-Kerscher A & Slack FJ, Nat. Rev. Cancer 2006, 6: 259-69. 3) Michael et al., Mol. Cancer Res. 2003, 1: 882-91. 4) Dimmeler S & Nicotera P, EMBO Mol. Med. 2013, 5: 180-90. 5) Brase JC et al., Mol. Cancer 2010, 26: 306. 6) Chen et al., Cell Res. 2008, 18: 997-1006. 7) Mestdagh P et al., Nature Methods 2014, 11:809-815.
