What we do

This page lists short examples of projects where we have helped clients. We hope they give you a reasonable idea of our expertise and capabilities.

  • Sequence design & processing
    • Assays (e.g. PCR, TaqMan, ligation)
    • Probes (e.g. arrays, capture adapters)
    • Custom indexes (e.g. barcodes / zipcodes)
    • NGS processing (e.g. variant calling)
  • Optimization & troubleshooting
    • Data exploration, failure analysis (e.g. system improvements)
    • Statistical modeling, machine learning (e.g. predictive scores)
  • Process & technical
    • Process automation (e.g. pipeline design, scripting)
    • Technology review, prototype simulations (e.g. business intelligence)
    • Software validation, QC, porting, refactoring, documentation


Sample quality assay designs

Given: ‘We need assays to measure DNA sample quality and library prep efficiency. Can you help?’

Work: Devised two sets of (digital PCR) TaqMan assays with nested primers that create a ladder of amplicon sizes to assess sample quality. Designed additional (ddPCR) assays spanning library adapter components to quantify ligation efficiency.

Results: Both sample quality and library prep efficiency can now be accurately measured; robust ddPCR metrics are incorporated into the routine R&D process.

“Zipcode” sequence designs

Given: ‘We have a new platform […using technology that does XYZ…] and want to track inputs. Can you design Zipcodes?’

Work: After learning the full technical details of the system, settled on a set of sequence design constraints with the client. The approach was 90% conservative and 10% exploratory: the majority of sequences fall within preconceived bounds (e.g. %GC), but some “push the envelope” to provide data for future rounds of design. Generated and selected sequences that meet the design constraints and also exhibit pair-wise orthogonality (i.e. every member of the set differs from every other by at least ‘X’). Work included simulations with various degrees of added error to anticipate performance (and to help pick ‘X’).

Results: Set of 1000 zipcode sequences synthesized, evaluated and put to use on new platform. We also learned some rules applicable to future zipcode designs.
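
The selection step described above can be illustrated with a minimal Python sketch. This is not the code used on the project: Hamming distance stands in for the unspecified difference measure ‘X’, and the function names, sequence length, GC bounds and greedy random search are all assumptions made for illustration.

```python
import random


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def pick_zipcodes(n, length=12, min_dist=4, gc_bounds=(0.4, 0.6),
                  max_tries=100_000, seed=0):
    """Greedily collect up to n random sequences that fall within the GC
    bounds and differ from every accepted sequence by at least min_dist.
    All default values are hypothetical, not the client's constraints."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(max_tries):
        if len(chosen) == n:
            break
        cand = "".join(rng.choice("ACGT") for _ in range(length))
        if not gc_bounds[0] <= gc_fraction(cand) <= gc_bounds[1]:
            continue  # outside the preconceived %GC bounds
        if all(hamming(cand, other) >= min_dist for other in chosen):
            chosen.append(cand)  # orthogonal to everything kept so far
    return chosen


zipcodes = pick_zipcodes(20)
```

A greedy search like this is simple but not optimal; a real design would typically add further constraints (e.g. homopolymer runs, secondary structure) and test the set against simulated errors before fixing ‘X’.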

NGS system performance analysis

Given: ‘Our system works well but coverage is not balanced. What might cause this and how might we fix it?’

Work: Analyzed system (sequence) features in light of historical data, then designed experiments (and oligos) to confirm suspected causes of unusually high and low signals. New experimental results suggested possible system changes (e.g. buffer composition, process timing) and refinements to sequence design rules.

Results: Better understanding of signal sensitivity to sequence features, better balanced coverage and improved system.

Predicting capture probe performance

Given: ‘Probes [… in system XYZ…] consistently yield high or low signals. Can we figure out why?’

Work: Calculated probe sequence features and attempted to correlate these with signal performance. Not surprisingly, %GC content alone accounted for much of the variance, but other factors were clearly at work. Built a number of different machine learning models (regressions, decision trees, neural networks) to predict raw signals and “GC-normalized” signals. Models revealed features correlating with performance, and also allowed de novo scoring of new probes.

Results: List of (sequence) features that impact performance, along with several models for predicting that performance.
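
To make the feature and “GC-normalized” ideas above concrete, here is a minimal Python sketch. It assumes that GC-normalization means taking the residual of a straight-line fit of signal against %GC; the actual normalization, the feature set and all names below are hypothetical, and the example numbers are made up.

```python
def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_homopolymer(seq):
    """Length of the longest run of one repeated base (non-empty seq)."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def gc_normalize(probes, signals):
    """Signal left over after removing the linear trend with %GC."""
    gcs = [gc_fraction(p) for p in probes]
    slope, intercept = linear_fit(gcs, signals)
    return [s - (slope * g + intercept) for g, s in zip(gcs, signals)]


# Made-up probes and signals, for illustration only.
probes = ["ATATATAT", "ATGCATGC", "GGCCGGCC", "AAAAGGGG"]
signals = [10.0, 22.0, 30.0, 14.0]
residuals = gc_normalize(probes, signals)
```

Residuals like these can then be modeled against the remaining features (here, the homopolymer run length) to look for effects beyond %GC.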

Simulating target coverage options

Given: ‘We’re considering [… technology options…] and would like numbers. Can you help?’

Work: After discussing options and goals, arrived at a list of sequencing options to simulate and evaluate (e.g. sequencing approaches X, Y, Z, each with given limits of coverage). For each option, tallied the targets likely to be covered or missed and partitioned these into classes (e.g. technology X should cover 88% of COSMIC targets in exons, miss >50% of variants in promoter regions, etc). Scripted the steps of the process to simulate factors such as poor quality DNA (i.e. short fragments) or low coverage.

Results: Data tables summarizing outcomes if clients were to choose particular technology options.
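
The tallying step above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the project's scripts: each target is reduced to a region class and a minimum usable fragment length, each option to a set of region classes it can reach, and every name and number below is hypothetical.

```python
from collections import Counter, namedtuple

# Hypothetical target record: name, region class, shortest fragment that
# can still cover the target.
Target = namedtuple("Target", "name region min_fragment")


def tally_coverage(targets, covered_regions, fragment_length):
    """Count covered / missed targets per region class for one option."""
    tally = {}
    for t in targets:
        covered = (t.region in covered_regions
                   and fragment_length >= t.min_fragment)
        counts = tally.setdefault(t.region, Counter())
        counts["covered" if covered else "missed"] += 1
    return tally


# Made-up targets, for illustration only.
targets = [
    Target("t1", "exon", 100),
    Target("t2", "exon", 250),
    Target("t3", "promoter", 100),
]

# Same option (exons only) with good DNA versus short, degraded fragments.
good = tally_coverage(targets, {"exon"}, fragment_length=300)
degraded = tally_coverage(targets, {"exon"}, fragment_length=150)
```

Running the same tally for each option and each simulated condition yields the kind of per-class summary table described in the results.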

Porting VB algorithm into Python

Given: ‘We are interested in an algorithm, but it’s written in (poorly documented) Visual Basic so we can’t automate it.’

Work: Defined test cases manually, based on the academic paper from which the algorithm originated (the algorithm groups “similar” motifs in antibody binding pockets). Walked through the given Visual Basic code to verify that its steps and calculations matched the paper. Decided to implement the Python code as a class that clients could use either standalone (via a simple command line interface) or within a larger Python framework.

Results: Documented Python class (and module with test cases) implementing algorithm of interest.