Whole Genome Profiling
Whole Genome Profiling (WGP™) is a novel technology developed by Keygene for building a physical map from Bacterial Artificial
Chromosome (BAC) clones. It uses sequence-based tags to characterize the BACs and anchor them in contigs based on tag overlap.
WGP generates short sequence tags spaced roughly every 2 to 3 kb across the BAC clones, which supports high-quality contig building.
The resulting map provides a framework for the assembly of entire genomes: whole-genome shotgun (WGS) data can be placed on the framework to establish a
high-quality genome sequence. Because BAC clones are positioned in contigs across the genome, gene cloning or re-sequencing projects in regions of
interest are greatly facilitated.
A schematic overview of the Whole Genome Profiling Technology

- (A) BAC library: BAC clones are available in a 384-well plate format
- (B) BAC pooling and DNA extraction: DNA is extracted after pooling each row and each column
- (C) Template preparation and sequencing: pooled BAC DNA is digested (EcoRI/MseI) and amplified after ligation of barcoded adaptors for pool identification. Pooled DNA is sequenced on the Illumina GA or HiSeq
- (D) Deconvolution: sequence tags (30-50 per BAC) are assigned to individual BACs based on presence in 1 row and 1 column pool
- (E) Contiging: overlapping sequence tags from individual BAC clones generate contigs. Together these contigs represent a sequence-based physical map of the genome
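The deconvolution rule in step (D) can be sketched in a few lines. This is a minimal illustration, not Keygene's pipeline: the pool names (`row:A`, `col:1`), the tag labels, and the function `deconvolute` are hypothetical, and the sketch simply drops any tag that appears in more than one row or column pool.

```python
from collections import defaultdict

def deconvolute(pool_tags):
    """Assign a tag to a well when it occurs in exactly one row pool
    and exactly one column pool (hypothetical pool naming: 'row:A', 'col:1')."""
    rows_by_tag = defaultdict(set)
    cols_by_tag = defaultdict(set)
    for pool, tags in pool_tags.items():
        kind, label = pool.split(":")
        target = rows_by_tag if kind == "row" else cols_by_tag
        for tag in tags:
            target[tag].add(label)

    tags_by_well = defaultdict(set)
    for tag, rows in rows_by_tag.items():
        cols = cols_by_tag.get(tag, set())
        # Unambiguous only if the tag points at a single row/column intersection.
        if len(rows) == 1 and len(cols) == 1:
            well = next(iter(rows)) + next(iter(cols))
            tags_by_well[well].add(tag)
    return dict(tags_by_well)

# Toy input: t2 occurs in two row pools, so it cannot be assigned to one well.
pools = {
    "row:A": {"t1", "t2"},
    "row:B": {"t2", "t3"},
    "col:1": {"t1", "t3"},
    "col:2": {"t2"},
}
wells = deconvolute(pools)
print(wells)
```

In this toy input, `t1` and `t3` each map to a single well, while `t2` is left unassigned because two row pools contain it.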
An example of WGP results for a BAC contig in Arabidopsis:

- (A) Schematic overview of a complete BAC contig with orders produced by FPC; colored bars represent individual BAC clones
- (B) Details for a zoomed-in region with the presence of a detected WGP tag indicated by 'x'. Gaps in BACs represent missing tags due to insufficient deconvoluted reads
The FingerPrinted Contigs (FPC) software (C. Soderlund, I. Longden and R. Mott, 1997. FPC: a system for building contigs from restriction fingerprinted clones. Comput. Appl. Biosci. 13: 523-535) can produce BAC clone alignments and contigs.
Example of an FPC contig from WGP data:

Result files
Customers will receive the following information:
- Report with WGP data metrics
- Sequences of all filtered WGP tags in FASTA and TAG format
- FPC input file
- High- and reduced-stringency WGP physical maps in FPC output format
- All of the above data combined in MS Access or MySQL database format
- WGP-compatible FPC version as a Linux executable (binary) to allow for additional FPC analysis by the customer
Example WGP data metrics:
| Overview WGP results for Arabidopsis | |
|---|---|
| genome size | 130 Mbp |
| WGP tag length (incl. RE site) | 26 |
| nr BACs tested | 6,144 |
| genome equivalents BACs tested* | 5.9 |
| enzyme combination WGP tags | EcoRI/MseI |
| nr reads generated (M) | 28.2 M |
| nr deconvolutable reads (M) | 12.1 M (43%) |
| nr unique tags | 65,734 |
| nr tagged BACs (FPC ready) | 4,599 (70%) |
| average nr tags/BAC | 40 |
| average nr reads/tag | 71 |
| FPC results | |
| nr contigs | 273 |
| nr singletons | 551 (12%) |
| nr BACs in contig | 4,048 (88%) |
| average BACs/contig | 15 |
| N50 BACs/contig | 25 |
| average contig size (Mbp)* | 0.408 |
| N50 contig size (Mbp)* | 0.702 |
| genome coverage (Mbp)* | 111.6 |
| % genome coverage | 86% |
| Deconvolution | |
| FPC input | |
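Several of the percentages and averages in the table can be re-derived from the raw counts. The short calculation below is a sketch of that arithmetic; the choice of denominators (for example, tagged BACs for the singleton percentage) is an assumption inferred from the numbers, not something the report states.

```python
# Values copied from the Arabidopsis metrics table.
genome_size_mbp = 130.0
reads_m = 28.2
deconvolutable_reads_m = 12.1
tagged_bacs = 4599
contigs = 273
singletons = 551
bacs_in_contigs = 4048
coverage_mbp = 111.6

# Assumed denominators are noted next to each derived value.
pct_deconvolutable = 100 * deconvolutable_reads_m / reads_m    # of all reads
pct_singletons = 100 * singletons / tagged_bacs                # of tagged BACs
pct_in_contigs = 100 * bacs_in_contigs / tagged_bacs           # of tagged BACs
avg_bacs_per_contig = bacs_in_contigs / contigs
avg_contig_size_mbp = coverage_mbp / contigs
pct_genome_coverage = 100 * coverage_mbp / genome_size_mbp

print(f"deconvolutable reads: {pct_deconvolutable:.0f}%")
print(f"singletons: {pct_singletons:.0f}%, in contigs: {pct_in_contigs:.0f}%")
print(f"average BACs/contig: {avg_bacs_per_contig:.0f}")
print(f"average contig size: {avg_contig_size_mbp:.3f} Mbp")
print(f"genome coverage: {pct_genome_coverage:.0f}%")
```

Under these assumptions the derived values agree with the reported ones to within rounding, and singletons plus BACs in contigs add up to the number of tagged BACs.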
Example of a TAG file:
| WGP Tag sequence | >Tag nr | >BAC-ID |
|---|---|---|
| AAGCTTAACTCGTCGTCTTCGTTTAATCCGCTAATG | 226590 | AT.E010.A13,AT.E012.J17 |
| AAGCTTAATTACGTTTGACGACAAAGGGAGGTGAGG | 194750 | AT.E012.L19,AT.E008.K05,AT.E005.D09, AT.E013.I14 |
| AAGCTTAATTGATTACAGAGTAGACATATACTATAC | 153389 | AT.E002.D07,AT.E010.F11,AT.E006.F05 |
| AAGCTTACATCTAACAAGTATCTGGCCTTTCTTAAC | 170506 | AT.E003.O23,AT.E006.L18,AT.E008.I19, AT.E005.F18,AT.E014.C03,AT.E012.E10, AT.E007.J01 |
| AAGCTTATTGTGTTACCAACTCTGTCTCATATCATG | 155684 | AT.E003.M08,AT.E010.M22,AT.E002.B04 |
| AAGCTTCAACTACCTATTCTTGTTCATGGCTTGCAA | 172316 | AT.E003.A15,AT.E005.J24,AT.E010.D02, AT.E013.I04 |
| AAGCTTCAATAACCATATTTAATTCAATAACAACAT | 156157 | AT.E003.G21,AT.E002.F21 |
| AAGCTTCATCTAAGAGTGCAATCACCCCAATAGGCT | 173006 | AT.E003.L20,AT.E004.P10,AT.E010.O02 |
| AAGCTTCTGAATATCTTCAATTCAAGGAAGAGCTCG | 212100 | AT.E010.N03,AT.E007.L22 |
| AAGCTTCTGCTGCCTCCACTATTAACACCGGCGAAG | 197601 | AT.E005.H24 |
| AAGCTTCTTCCTTTTTCTTATGGATCTTCAAAATCT | 228005 | AT.E011.A01,AT.E010.A12 |
| AAGCTTGGAGAGAAGTACATGTTGGTTAGGATTTGA | 143335 | AT.E001.C19,AT.E006.K24,AT.E004.N21, AT.E012.G21,AT.E009.B12 |
| AAGCTTGTGTAAAAATTAAGCCTCCATCTAGCTGGA | 163337 | AT.E002.I24,AT.E008.B17 |
| AAGCTTTATGAAGGATGTTAAAAGATCCTCCCTTTC | 200202 | AT.E013.M18,AT.E005.M23 |
| AAGCTTTGAAGTGCTTCAACTCTCATTCTGTTTTCT | 148310 | AT.E007.N23,AT.E001.M17 |
| AAGCTTTGATGGTATAGCCGAATTCCGTATGAGTCT | 180442 | AT.E003.C11 |
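Because each TAG record lists the BACs that share a tag, the file can be inverted into a per-BAC tag set and a count of shared tags per BAC pair, which is the kind of overlap evidence contig building relies on. The sketch below uses three rows copied from the table above; the in-memory tuple layout and the helper `parse_bac_ids` are hypothetical, since the on-disk TAG format is not specified here.

```python
from collections import defaultdict
from itertools import combinations

# (tag number, BAC-ID field) for three rows of the example TAG table.
tag_rows = [
    (226590, "AT.E010.A13,AT.E012.J17"),
    (194750, "AT.E012.L19,AT.E008.K05,AT.E005.D09, AT.E013.I14"),
    (197601, "AT.E005.H24"),
]

def parse_bac_ids(field):
    """Split a comma-separated BAC-ID field, tolerating stray spaces."""
    return [bac.strip() for bac in field.split(",") if bac.strip()]

tags_by_bac = defaultdict(set)   # BAC ID -> set of tag numbers
shared_tags = defaultdict(int)   # (BAC, BAC) -> number of tags in common

for tag_nr, field in tag_rows:
    bacs = parse_bac_ids(field)
    for bac in bacs:
        tags_by_bac[bac].add(tag_nr)
    # Every pair of BACs listed for the same tag shares that tag.
    for pair in combinations(sorted(bacs), 2):
        shared_tags[pair] += 1

for pair, count in sorted(shared_tags.items()):
    print(pair, count)
```

With a full TAG file, pairs with many shared tags would be strong candidates for overlapping clones; the threshold used to call an overlap is a separate choice made by the contig-building software.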
An example of FPC output: BAC clone assignment to WGP contigs and start (L_Pos) and end positions (R_Pos) of BACs on these contigs. Units of measure are CB-units (Consensus Band), representing the average distance between neighboring WGP tags. Ctg0 represents singleton BACs, not mapping to any contig.
| BAC_name | ctg_No | # Tags | L_Pos | R_Pos |
|---|---|---|---|---|
| AT.E001.A02 | ctg50 | 124 | 103 | 226 |
| AT.E001.A03 | ctg148 | 156 | 31 | 186 |
| AT.E001.A04 | ctg146 | 46 | 277 | 322 |
| AT.E001.A05 | ctg259 | 187 | 89 | 275 |
| AT.E001.A06 | ctg26 | 231 | 131 | 361 |
| AT.E001.A07 | ctg0 | 6 | | |
| AT.E001.A08 | ctg0 | 97 | | |
| AT.E001.A09 | ctg196 | 166 | 420 | 585 |
| AT.E001.A10 | ctg0 | 1 | | |
| AT.E001.A11 | ctg153 | 116 | 977 | 1092 |
| AT.E001.A12 | ctg149 | 186 | 362 | 547 |
| AT.E001.A13 | ctg148 | 247 | 261 | 507 |
| AT.E001.A14 | ctg128 | 234 | 938 | 1171 |
| AT.E001.A15 | ctg156 | 112 | 0 | 111 |
| AT.E001.A16 | ctg9 | 165 | 1153 | 1317 |
| AT.E001.A18 | ctg21 | 303 | 77 | 379 |
| AT.E001.A19 | ctg78 | 40 | 36 | 75 |
| AT.E001.A20 | ctg141 | 235 | 90 | 324 |
| AT.E001.A21 | ctg252 | 178 | 229 | 406 |
| AT.E001.A23 | ctg119 | 69 | 459 | 527 |
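A table like this is straightforward to post-process. The sketch below takes a few rows from the example, separates singletons (ctg0) from placed BACs, groups placed BACs by contig, and checks an observation from the example data: for every placed BAC, R_Pos - L_Pos + 1 equals the tag count. That relation is read off this example rather than taken from FPC documentation, and the tuple layout is hypothetical.

```python
from collections import defaultdict

# (BAC_name, ctg_No, number of tags, L_Pos, R_Pos); positions are CB units.
rows = [
    ("AT.E001.A02", "ctg50", 124, 103, 226),
    ("AT.E001.A03", "ctg148", 156, 31, 186),
    ("AT.E001.A07", "ctg0", 6, None, None),
    ("AT.E001.A13", "ctg148", 247, 261, 507),
    ("AT.E001.A15", "ctg156", 112, 0, 111),
]

# ctg0 collects singleton BACs that do not map to any contig.
singleton_bacs = [name for name, ctg, *_ in rows if ctg == "ctg0"]
placed = [row for row in rows if row[1] != "ctg0"]

# Group placed BACs by contig, ordered by their left position.
bacs_by_contig = defaultdict(list)
for name, ctg, n_tags, l_pos, r_pos in sorted(placed, key=lambda r: r[3]):
    bacs_by_contig[ctg].append(name)

# Observed in the example: the CB-unit span of a BAC equals its tag count.
span_matches = all(
    r_pos - l_pos + 1 == n_tags for _, _, n_tags, l_pos, r_pos in placed
)

print("singletons:", singleton_bacs)
print("contigs:", dict(bacs_by_contig))
print("span equals tag count:", span_matches)
```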
Example of how WGP data can be integrated with WGS data:

The WGP technology is described in a Genome Research publication:

