
Whole Genome Profiling

Whole Genome Profiling (WGP™) is a novel technology developed by Keygene for building a physical map of Bacterial Artificial Chromosome (BAC) clones. It uses sequence-based tags to characterize the BACs and anchors them in contigs based on tag overlap.
WGP generates short sequence tags spaced roughly every 2 to 3 kb across the BAC clones, which supports high-quality contig building. The resulting map provides a framework for the assembly of entire genomes: Whole Genome Shotgun (WGS) information can be placed on the framework to establish a high-quality genome sequence. Because BAC clones are positioned in contigs across the genome, gene cloning and re-sequencing projects in regions of interest are greatly facilitated.
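
The idea of grouping BACs by tag overlap can be illustrated with a minimal sketch. The BAC names, tag identifiers and the `min_shared` threshold below are hypothetical, and the grouping rule (simple single linkage on a raw shared-tag count) is only an illustration of the principle; it is not the scoring that FPC or the WGP pipeline applies.

```python
from itertools import combinations

# Hypothetical toy data: BAC name -> set of WGP tag identifiers found in that BAC.
bac_tags = {
    "BAC_A": {"t1", "t2", "t3", "t4"},
    "BAC_B": {"t3", "t4", "t5", "t6"},
    "BAC_C": {"t6", "t7", "t8"},
    "BAC_D": {"t9", "t10"},
}


def shared_tag_counts(bac_tags):
    """Return {(bac1, bac2): shared tag count} for pairs sharing at least one tag."""
    counts = {}
    for a, b in combinations(sorted(bac_tags), 2):
        n = len(bac_tags[a] & bac_tags[b])
        if n:
            counts[(a, b)] = n
    return counts


def group_by_overlap(bac_tags, min_shared=1):
    """Group BACs that are linked, directly or indirectly, by >= min_shared tags."""
    parent = {bac: bac for bac in bac_tags}

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), n in shared_tag_counts(bac_tags).items():
        if n >= min_shared:
            parent[find(a)] = find(b)

    groups = {}
    for bac in bac_tags:
        groups.setdefault(find(bac), []).append(bac)
    return sorted(sorted(g) for g in groups.values())


print(shared_tag_counts(bac_tags))
print(group_by_overlap(bac_tags))
print(group_by_overlap(bac_tags, min_shared=2))
```

Raising `min_shared` makes the grouping stricter: BACs joined by only a few shared tags fall apart into separate groups or singletons.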

A schematic overview of the Whole Genome Profiling Technology

WGP Technology

An example of WGP results for a BAC contig in Arabidopsis:

BAC contig arabidopsis

The FingerPrinted Contigs (FPC) software (C. Soderlund, I. Longden and R. Mott, 1997: FPC: a system for building contigs from restriction fingerprinted clones. Comput. Appl. Biosci. 13: 523-535) can produce BAC clone alignments and contigs.

Example of an FPC contig from WGP data:

FPC

Result files

Customers will receive the following information:

  1. Report with WGP data metrics
  2. Sequences of all filtered WGP tags in FASTA and TAG format
  3. FPC input file
  4. High- and reduced-stringency WGP physical maps in FPC output format
  5. All of the above data combined in MS Access or MySQL database format
  6. WGP-compatible FPC version in Linux executable (binary) format to allow for additional FPC analysis by the customer

Example WGP data metrics:

Overview WGP results for Arabidopsis

genome size                          130 Mbp
WGP tag length (incl. RE site)       26
nr BACs tested                       6,144
genome equivalents BACs tested*      5.9
enzyme combination WGP tags          EcoRI/MseI
nr reads generated (M)               28.2 M
nr deconvolutable reads (M)          12.1 M (43%)
nr unique tags                       65,734
nr tagged BACs (FPC ready)           4,599 (70%)
average nr tags/BAC                  40
average nr reads/tag                 71

FPC results

nr contigs                           273
nr singletons                        551 (12%)
nr BACs in contig                    4,048 (88%)
average BACs/contig                  15
N50 BACs/contig                      25
average contig size (Mbp)*           0.408
N50 contig size (Mbp)*               0.702
genome coverage (Mbp)*               111.6
% genome coverage                    86%
Deconvolution
FPC input
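
Several of the derived figures in the metrics above follow directly from the counts, and the N50 values use the usual N50 definition. The sketch below recomputes some of the percentages from the table and shows one common way to compute an N50. The helper names `percent` and `n50` and the contig sizes passed to `n50` are hypothetical, chosen only to illustrate the calculation.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to the nearest integer."""
    return round(100 * part / whole)


def n50(sizes):
    """Largest size s such that items of size >= s sum to at least half the total."""
    ordered = sorted(sizes, reverse=True)
    if not ordered:
        raise ValueError("sizes must not be empty")
    half = sum(ordered) / 2
    running = 0
    for size in ordered:
        running += size
        if running >= half:
            return size


# Figures copied from the Arabidopsis example table.
reads_generated_m = 28.2
reads_deconvolutable_m = 12.1
tagged_bacs = 4599
singletons = 551
bacs_in_contigs = 4048
contigs = 273
coverage_mbp = 111.6
genome_mbp = 130

print(percent(reads_deconvolutable_m, reads_generated_m))  # deconvolutable reads, %
print(percent(singletons, tagged_bacs))                    # singleton BACs, %
print(percent(bacs_in_contigs, tagged_bacs))               # BACs in contigs, %
print(percent(coverage_mbp, genome_mbp))                   # genome coverage, %
print(round(bacs_in_contigs / contigs))                    # average BACs per contig

# Hypothetical contig sizes in kbp, only to show how an N50 is obtained.
print(n50([900, 700, 400, 300, 200, 100]))
```

Singletons and BACs in contigs are expressed relative to the FPC-ready (tagged) BACs, which is why the two counts add up to 4,599.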

Example of a TAG file:

WGP Tag sequence                      Tag nr  BAC-ID
AAGCTTAACTCGTCGTCTTCGTTTAATCCGCTAATG  226590  AT.E010.A13,AT.E012.J17
AAGCTTAATTACGTTTGACGACAAAGGGAGGTGAGG  194750  AT.E012.L19,AT.E008.K05,AT.E005.D09,AT.E013.I14
AAGCTTAATTGATTACAGAGTAGACATATACTATAC  153389  AT.E002.D07,AT.E010.F11,AT.E006.F05
AAGCTTACATCTAACAAGTATCTGGCCTTTCTTAAC  170506  AT.E003.O23,AT.E006.L18,AT.E008.I19,AT.E005.F18,AT.E014.C03,AT.E012.E10,AT.E007.J01
AAGCTTATTGTGTTACCAACTCTGTCTCATATCATG  155684  AT.E003.M08,AT.E010.M22,AT.E002.B04
AAGCTTCAACTACCTATTCTTGTTCATGGCTTGCAA  172316  AT.E003.A15,AT.E005.J24,AT.E010.D02,AT.E013.I04
AAGCTTCAATAACCATATTTAATTCAATAACAACAT  156157  AT.E003.G21,AT.E002.F21
AAGCTTCATCTAAGAGTGCAATCACCCCAATAGGCT  173006  AT.E003.L20,AT.E004.P10,AT.E010.O02
AAGCTTCTGAATATCTTCAATTCAAGGAAGAGCTCG  212100  AT.E010.N03,AT.E007.L22
AAGCTTCTGCTGCCTCCACTATTAACACCGGCGAAG  197601  AT.E005.H24
AAGCTTCTTCCTTTTTCTTATGGATCTTCAAAATCT  228005  AT.E011.A01,AT.E010.A12
AAGCTTGGAGAGAAGTACATGTTGGTTAGGATTTGA  143335  AT.E001.C19,AT.E006.K24,AT.E004.N21,AT.E012.G21,AT.E009.B12
AAGCTTGTGTAAAAATTAAGCCTCCATCTAGCTGGA  163337  AT.E002.I24,AT.E008.B17
AAGCTTTATGAAGGATGTTAAAAGATCCTCCCTTTC  200202  AT.E013.M18,AT.E005.M23
AAGCTTTGAAGTGCTTCAACTCTCATTCTGTTTTCT  148310  AT.E007.N23,AT.E001.M17
AAGCTTTGATGGTATAGCCGAATTCCGTATGAGTCT  180442  AT.E003.C11
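
A TAG file of this kind can be turned into a per-BAC tag list with a few lines of Python. The sketch below assumes each data line holds a tag sequence (A, C, G, T only), a numeric tag number, and a comma-separated list of BAC IDs that do not start with a digit. The exact column separator in delivered files is not specified here, so the pattern accepts any or no whitespace between fields. The function names `parse_tag_line` and `tags_per_bac` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed layout: <tag sequence> <tag nr> <BAC-ID>[,<BAC-ID>...]
# Whitespace between the three fields is optional in this pattern.
TAG_LINE = re.compile(r"^([ACGT]+)\s*(\d+)\s*(\S.*)$")


def parse_tag_line(line):
    """Split one TAG line into (sequence, tag number, list of BAC IDs)."""
    match = TAG_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a TAG data line: {line!r}")
    sequence, tag_nr, bac_field = match.groups()
    bac_ids = [bac.strip() for bac in bac_field.split(",") if bac.strip()]
    return sequence, int(tag_nr), bac_ids


def tags_per_bac(lines):
    """Invert the tag -> BACs listing into a BAC -> set of tag numbers mapping."""
    index = defaultdict(set)
    for line in lines:
        if not line.strip():
            continue  # skip empty lines
        _, tag_nr, bac_ids = parse_tag_line(line)
        for bac in bac_ids:
            index[bac].add(tag_nr)
    return dict(index)


# Two data lines taken from the example above (header line omitted).
sample = [
    "AAGCTTAACTCGTCGTCTTCGTTTAATCCGCTAATG 226590 AT.E010.A13,AT.E012.J17",
    "AAGCTTCTGCTGCCTCCACTATTAACACCGGCGAAG197601AT.E005.H24",
]

for bac, tags in sorted(tags_per_bac(sample).items()):
    print(bac, sorted(tags))
```

The BAC-to-tags mapping is the natural starting point for counting shared tags between pairs of BACs.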

An example of FPC output: BAC clone assignment to WGP contigs, with the start (L_Pos) and end (R_Pos) positions of each BAC on its contig. Positions are given in CB (Consensus Band) units, which represent the average distance between neighboring WGP tags. ctg0 holds singleton BACs that do not map to any contig.

BAC_name     ctg_No   # Tags  L_Pos  R_Pos
AT.E001.A02  ctg50    124     103    226
AT.E001.A03  ctg148   156     31     186
AT.E001.A04  ctg146   46      277    322
AT.E001.A05  ctg259   187     89     275
AT.E001.A06  ctg26    231     131    361
AT.E001.A07  ctg0     6
AT.E001.A08  ctg0     97
AT.E001.A09  ctg196   166     420    585
AT.E001.A10  ctg0     1
AT.E001.A11  ctg153   116     977    1092
AT.E001.A12  ctg149   186     362    547
AT.E001.A13  ctg148   247     261    507
AT.E001.A14  ctg128   234     938    1171
AT.E001.A15  ctg156   112     0      111
AT.E001.A16  ctg9     165     1153   1317
AT.E001.A18  ctg213   3       377    379
AT.E001.A19  ctg78    40      36     75
AT.E001.A20  ctg141   235     90     324
AT.E001.A21  ctg252   178     229    406
AT.E001.A23  ctg119   69      459    527
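
Placements like these can be grouped per contig and compared as intervals in CB units. The sketch below uses a few rows from the example, read as BAC name, contig number, tag count, L_Pos and R_Pos; that column split is an assumption about the listing, and `BacPlacement`, `by_contig` and `overlap_cb` are hypothetical names introduced for illustration.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class BacPlacement(NamedTuple):
    bac: str
    contig: int            # 0 means singleton (ctg0)
    n_tags: int
    l_pos: Optional[int]   # None for singletons
    r_pos: Optional[int]   # None for singletons


# Rows from the example above, under the assumed column split.
rows = [
    BacPlacement("AT.E001.A02", 50, 124, 103, 226),
    BacPlacement("AT.E001.A03", 148, 156, 31, 186),
    BacPlacement("AT.E001.A13", 148, 247, 261, 507),
    BacPlacement("AT.E001.A07", 0, 6, None, None),
]


def by_contig(rows):
    """Group placed BACs by contig, ordered by start position; singletons are skipped."""
    groups = defaultdict(list)
    for row in rows:
        if row.contig != 0:
            groups[row.contig].append(row)
    return {ctg: sorted(members, key=lambda r: r.l_pos) for ctg, members in groups.items()}


def overlap_cb(a, b):
    """Overlap of two BACs in CB units; 0 if on different contigs or not overlapping."""
    if a.contig == 0 or a.contig != b.contig:
        return 0
    return max(0, min(a.r_pos, b.r_pos) - max(a.l_pos, b.l_pos) + 1)


for ctg, members in sorted(by_contig(rows).items()):
    print(f"ctg{ctg}:", [(m.bac, m.l_pos, m.r_pos) for m in members])

# AT.E001.A03 and AT.E001.A13 share contig 148 but their intervals do not touch.
print(overlap_cb(rows[1], rows[2]))
```

In the rows used here, the tag count equals R_Pos - L_Pos + 1, which is a convenient sanity check when reading such a listing.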

Example of how WGP data can be integrated with WGS data:

WGP WGS integration

The WGP technology is described in a Genome Research publication:

Genome Research publication

The WGP™ technology is covered by patents and patent applications owned by Keygene N.V. WGP is a trademark of Keygene N.V.
