Use the menu above to examine haplotypes subgrouped by haplogroup. Some haplogroups are further subdivided by clades (shown in italics). Terminology is discussed below. To find the subgroup assigned to a particular person or pedigree in the project, refer to the participants table.

Haplotype and STR

A haplotype is the specific genetic signature of a person's DNA. The set of numbers in the table below illustrates a 37-marker haplotype for a man named Fergus. Those markers are referred to as STRs (Short Tandem Repeats).

Marker      Fergus
DYS393      13
DYS390      23
DYS19       14
DYS391      11
DYS385      11-14
DYS426      12
DYS388      12
DYS439      11
DYS389i     13
DYS392      13
DYS389ii    30
DYS458      16
DYS459      9-10
DYS455      11
DYS454      11
DYS447      25
DYS437      16
DYS448      20
DYS449      29
DYS464      15-15-17-18
DYS460      11
Y-GATA-H4   11
YCAII       19-23
DYS456      16
DYS607      15
DYS576      19
DYS570      16
CDY         37-37
DYS442      13
DYS438      13
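
As a minimal sketch of how the table above can be read, the Python below parses Fergus's row into a mapping from marker name to repeat counts. The names MARKERS, VALUES, parse_haplotype and marker_count are illustrative assumptions, not part of any project tool. Multi-copy markers such as DYS385 report one value per copy, which is why 30 named markers yield 37 values.

```python
# Marker names in the order they appear in the table (illustrative names).
MARKERS = [
    "DYS393", "DYS390", "DYS19", "DYS391", "DYS385", "DYS426", "DYS388",
    "DYS439", "DYS389i", "DYS392", "DYS389ii", "DYS458", "DYS459", "DYS455",
    "DYS454", "DYS447", "DYS437", "DYS448", "DYS449", "DYS464", "DYS460",
    "Y-GATA-H4", "YCAII", "DYS456", "DYS607", "DYS576", "DYS570", "CDY",
    "DYS442", "DYS438",
]

# Fergus's row, copied from the table.
VALUES = ("13 23 14 11 11-14 12 12 11 13 13 30 16 9-10 11 11 25 16 20 29 "
          "15-15-17-18 11 11 19-23 16 15 19 16 37-37 13 13")


def parse_haplotype(markers, values):
    """Map each marker name to a list of repeat counts (one per copy)."""
    fields = values.split()
    if len(fields) != len(markers):
        raise ValueError("number of values does not match number of markers")
    return {name: [int(n) for n in field.split("-")]
            for name, field in zip(markers, fields)}


def marker_count(haplotype):
    """Count individual STR values, counting every copy of a multi-copy marker."""
    return sum(len(copies) for copies in haplotype.values())


fergus = parse_haplotype(MARKERS, VALUES)
print(fergus["DYS385"])      # the two copies of DYS385
print(marker_count(fergus))  # total STR values in the haplotype
```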

Haplogroup

A haplogroup is the set of haplotypes sharing a common characteristic. All persons within a specific haplogroup share a common ancestor. The relationship between different haplogroups is slowly being worked out, forming a family tree of mankind as a whole. To date there are 21 primary haplogroups, A through T, and numerous subgroups. The diagram below is a simplification for clarity; the accepted standard for the complete tree is maintained by the International Society of Genetic Genealogy (ISOGG) and can be found on their page, Y-DNA Haplogroup Tree.

A person's haplogroup is determined by SNP tests. In many instances a person's haplogroup can be predicted from the presence of key markers in the haplotype. Another prediction method is to look at the haplogroup of closely related persons. In this project persons are subgrouped by haplogroup, whether known or predicted.

Phylogenetic Tree

Clade

A clade is a subgroup within a haplogroup consisting of a progenitor and all his descendants. To simplify by way of analogy, think of a haplogroup as a ggggg-grandfather together with all his male descendants, each of whom carries a haplotype. A clade would then be any one of his grandsons together with all the males descended from that grandson. In this project subgroupings by haplogroup are subdivided by clades based on either confirmed or predicted SNP mutations.
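
The analogy can be sketched in a few lines of Python. The pedigree below is entirely hypothetical, and the names SONS and clade are assumptions made for illustration. The point is only that a clade is a founder plus everyone descended from him, so a grandson's clade is always contained in the progenitor's.

```python
# Hypothetical male-line pedigree: each key is a father, each value his sons.
SONS = {
    "Progenitor": ["Son1", "Son2"],
    "Son1": ["Grandson1", "Grandson2"],
    "Son2": ["Grandson3"],
    "Grandson1": ["Man1", "Man2"],
    "Grandson3": ["Man3"],
}


def clade(founder, sons=SONS):
    """Return the founder plus every male-line descendant, depth first."""
    members = [founder]
    for son in sons.get(founder, []):
        members.extend(clade(son, sons))
    return members


whole_group = clade("Progenitor")   # stands in for the haplogroup
sub_clade = clade("Grandson1")      # one grandson and his descendants
print(sub_clade)
print(set(sub_clade) <= set(whole_group))  # the clade sits inside the group
```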

Cluster

A group of haplotypes with similar STR values is a cluster. A cluster may be indicative of a clade, but because it depends on how one defines "similar", a cluster might contain persons in different haplogroups, whereas a clade does not. For example, consider the 37-marker R1bSTR47Scots modal haplotype defined by McEwan in 2005. All the haplotypes within a genetic distance of 4 of that modal could be called similar. That definition would produce a cluster containing persons who are in different clades within Haplogroup R1b-L1335, namely S756, S764 and S691. In this project subgroupings by clade are further divided by any clusters that are observed. Cluster names are italicized in project subgroup labels.

In many instances what were clusters became clades once new SNP mutations were discovered. Given that a clade is a cluster (but not vice versa), the cluster name is retained since it is likely a familiar term still in use.
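
To make the idea of "similar" concrete, here is a rough sketch that groups haplotypes by genetic distance from a modal. Everything in it is an assumption for illustration: the modal and kit values are made up (they are not the R1bSTR47Scots modal), only five single-copy markers are used, and genetic distance is taken as the simple sum of absolute differences in repeat counts. Real comparison tools treat multi-copy markers and large jumps with more care.

```python
# Hypothetical modal values for a handful of single-copy markers.
# These are NOT the real R1bSTR47Scots modal values.
MODAL = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11, "DYS439": 12}

# Hypothetical kits to compare against the modal.
HAPLOTYPES = {
    "kit_A": {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 10, "DYS439": 12},
    "kit_B": {"DYS393": 13, "DYS390": 25, "DYS19": 15, "DYS391": 11, "DYS439": 13},
    "kit_C": {"DYS393": 12, "DYS390": 22, "DYS19": 15, "DYS391": 10, "DYS439": 11},
}


def genetic_distance(modal, haplotype):
    """Sum of absolute repeat-count differences over the modal's markers."""
    return sum(abs(haplotype[marker] - count) for marker, count in modal.items())


def cluster(modal, haplotypes, max_distance=4):
    """Names of all haplotypes within max_distance of the modal."""
    return [name for name, hap in haplotypes.items()
            if genetic_distance(modal, hap) <= max_distance]


for name, hap in HAPLOTYPES.items():
    print(name, genetic_distance(MODAL, hap))
print(cluster(MODAL, HAPLOTYPES))
```

Changing max_distance changes who is in the cluster, which is exactly why a cluster, unlike a clade, depends on the definition chosen.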

SNP

SNP is an abbreviation for Single Nucleotide Polymorphism. For the typical project participant the details beyond that are not particularly important. What is important to know is that the presence of an SNP mutation can be measured in someone's DNA, and that this is the parameter that defines one's haplogroup. The subject is confusing because label names vary by vendor, newly discovered SNP mutations need to be incorporated, and the differences between predicted, tested and terminal SNP mutations are confusing in and of themselves. The diagram below is an abbreviated illustration of some of the terminology.

Terminal

myFTDNA Match List

The chart below is an example of a myFTDNA match list. Two frequently asked questions are (1) why does my match list include people of different haplogroups, and (2) how can I be classified as, say, DF21 when nobody in my match list is DF21? Answers:

  1. When no terminal SNP is shown, such as for Tom, the haplogroup is an FTDNA prediction that typically falls short of the actual haplogroup. Harry, on the other hand, is known to be positive for L21, and thus his haplogroup is shown as R-L21 and his terminal SNP is L21. Since Tom, Dick and Harry are so closely related, they probably are all R-L21.
  2. Project predictions are able to use marker patterns, paper trails, results in other projects and tests done at other labs. As a result, a project-predicted haplogroup can be downstream of both FTDNA-predicted haplogroups and tested haplogroups.
Match List
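
The reasoning in the two answers can be sketched as picking the most downstream label among the available evidence. This is only an illustration: PATH is an abbreviated line of descent with intermediate SNPs omitted, the function name most_downstream is hypothetical, and the R-M269 label stands in for a generic vendor prediction that the text does not specify. The approach only works when every label lies on a single line of descent.

```python
# Abbreviated upstream-to-downstream path (intermediate SNPs omitted).
PATH = ["R-M269", "R-L21", "R-DF21"]


def most_downstream(labels, path=PATH):
    """Return whichever label sits furthest down the path."""
    return max(labels, key=path.index)


# Answer 1: a predicted label plus a close relative's tested result.
match_list_evidence = ["R-M269", "R-L21"]
print(most_downstream(match_list_evidence))

# Answer 2: the project adds evidence from outside the match list.
project_evidence = match_list_evidence + ["R-DF21"]
print(most_downstream(project_evidence))
```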

Next Generation Sequence Tests

These are comprehensive tests used to discover new SNPs, characterize a genetic yDNA line and position it in the yDNA phylogenetic tree. The principal examples are the BigY and YElite tests. Like STR testing, they provide a measure of relatedness that is subject to a large uncertainty, but unlike STR testing, the relatedness of subgroups of men can be organized into a chronological ordering of their most recent common ancestors. The following phylogenetic tree, drawn from the R-Sct2VA group, illustrates the point. The times A, B and C can be estimated, but no more accurately than a time to most recent common ancestor (TMRCA) can be estimated from STR data. The advantage of the SNP phylogenetic tree is that we know that B is further back in time than A and that C is further back than A or B. The hope is that when enough data of this type is available, an analysis coupling the STR data to the SNP data will yield more accurate TMRCA estimates than are now possible.

NGS Example

Different Haplogroups

References

  1. ISOGG Wiki
  2. Haplotype, genetic distance and TMRCA tables (with these mutation rates) were generated using a modification of McGee's Y-DNA Comparison Utility, i.e. the Modified Y-DNA Comparison Utility.
  3. Phylogenetic trees on this site were created using MEGA, and PHYLIP-compatible input tables were generated using McGee's Y-DNA Comparison Utility.
  4. Network diagrams were created using Network with inverse mutation rate weighting.
  5. Clade and interclade TMRCA were computed using Nordtvedt's variance method.
  6. TMRCA from SNP Counts were computed from an approximation based on the method used by YFULL
  7. Maximum Likelihood Estimate TMRCA calculator



Copyright © 2006 Fergus(s)on Y-DNA Project