Genetic Networks, Part 2: How to Identify Misclassified Members

In Part 1 of the Genetic Network blog post series, I defined genetic networks and provided several visualizations for the types of DNA matches forming genetic networks. I labeled these matches as A (test taker), B (match being viewed), and C (matches that match both A and B).

Genetic network: DNA tester (A), Viewed match (B), and shared matches (C)

As the Part 2 title suggests, some shared matches (C) are not true members of the genetic network being visualized between the tester (A) and the viewed match (B). I call these matches “misclassified genetic network matches” and denote them as “C” with strikethrough text, i.e., C.

Misclassified genetic network matches (C) don’t actually share the same common ancestor that “A” and “B” share together, despite matching both “A” and “B”. These misclassified matches become problematic when you review the family trees of shared matches to determine the common ancestor tying “C” to both “A” and “B”. With misclassified genetic network matches (C), you will not be able to find the common ancestor, which wastes precious time and creates frustration.

To explicate further, the so-called misclassified genetic network match (C) is a match to “A” and is a match to “B”, but not through the same ancestral line that “A” and “B” share together. This is why I call it a misclassified genetic network match. In this situation, the misclassified match (C) shares one ancestral line with the tester (A) and another ancestral line with the viewed match (B), and neither is the line that “A” and “B” share. Yet, because “C” matches “A” and “C” matches “B”, “C” shows up on testing websites as a shared match.

Using the same A-B-C graphic as presented earlier, we can visualize the effects of misclassified genetic network matches in the image below.

Visualization of a true genetic network compared to the effects of misclassified match on a genetic network

In the above image, Figure 1 presents a true genetic network where the DNA tester (A), the viewed match (B), and the shared matches (C) all share the same common ancestor. Figure 2, on the other hand, presents what occurs when a misclassified genetic network match is included in the presumed genetic network. Here, DNA testing websites include the shared “C” match because it matches A and B. However, the reality for misclassified matches is that “C” shares one ancestor in common with the DNA tester (A) and another common ancestor with the viewed match (B). “C” does not share the same common ancestor that “A” and “B” share together, making it a misclassified genetic network match (C). It’s important to identify misclassified matches before starting to review or build out the family trees of shared matches (C), and the identification process is presented next. First, however, let’s visualize misclassified genetic network matches another way so the concept is solidified.

As another example, the image below uses the vertical family tree presented in Part 1 of the blog series but with a new shared match displayed as “C4”. Note that “A” and “B” still share the same great grandparents in generation four that were previously shown in Part 1. However, the misclassified genetic network match (C4) does not share the same “yellow” great grandparents. The misclassified match (C4) separately shares a different set of great grandparents with “A” (purple) and a different set of great grandparents with “B” (pink). Match C4 is a true match to “A” and to “B”, but because C4 does not share the same common ancestor that both “A” and “B” share with each other, C4 is a misclassified genetic network match (C4). So, “C4” is a member of another genetic network with “A” and a member of another genetic network with “B” through different ancestral lines. From the perspective of the test taker, A and B share different segment(s) of DNA than do A and C4.

Misclassified genetic network match example
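The logic of the example above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each person is reduced to a set of ancestral-couple labels borrowed from the colors in the figure, "C1" is added here as a stand-in for a true network member, and the function name is hypothetical. Real testing sites do not expose ancestors this way.

```python
# Hypothetical data: each person is reduced to the ancestral couples they
# descend from, labeled with the colors used in the figure.
ancestors = {
    "A": {"yellow", "purple"},
    "B": {"yellow", "pink"},
    "C1": {"yellow"},           # stand-in for a true network member
    "C4": {"purple", "pink"},   # shares purple with A and pink with B
}


def classify_shared_match(tester, viewed, candidate, ancestors):
    """Classify a candidate relative to the tester/viewed-match network."""
    shares_with_tester = ancestors[candidate] & ancestors[tester]
    shares_with_viewed = ancestors[candidate] & ancestors[viewed]
    if not (shares_with_tester and shares_with_viewed):
        return "not a shared match"

    # The network is defined by the ancestors the tester and viewed match share.
    network_ancestors = ancestors[tester] & ancestors[viewed]
    if ancestors[candidate] & network_ancestors:
        return "true member"
    return "misclassified"


for name in ("C1", "C4"):
    print(name, classify_shared_match("A", "B", name, ancestors))
```

Both C1 and C4 pass the "matches A and matches B" test that a shared match list applies, but only C1 connects through the couple that A and B have in common.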

Identifying a Misclassified Genetic Network Match
While there are typically only a few misclassified matches within identified networks, you want to make sure the shared matches on the list are true members before beginning to research their family trees more fully. This ensures your time is well spent. There are three approaches to identify misclassified genetic network matches: Ancestry’s SideView™ technology, viewed match switching, and family tree building.

Ancestry’s SideView™ Technology
If you’re using Ancestry to view your matches, the first and easiest approach to identify misclassified matches is to use their SideView™ technology, which indicates the side of your family tree to which a match likely belongs. So, if you (A) are reviewing a match for a cousin (B) on your maternal side, and SideView™ indicates that one of the shared matches (C) is on your paternal side, then the paternal shared match may be a misclassified genetic network match. This example is depicted in the image below. (Note: if you haven’t labeled your matches as maternal or paternal, Ancestry lists them as Parent 1 or Parent 2.) While I have never found SideView™ to be incorrect, there are ways to verify that a maternal-paternal mismatch is in fact a misclassified genetic network match. This is discussed next.

Ancestry SideView and genetic network example
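If you keep your own notes on which side each match falls, the maternal-paternal comparison can be written as a small check. The sketch below assumes the side labels were copied by hand into a dictionary; the match names, the labels, and the function name are all hypothetical, and nothing here calls an Ancestry interface.

```python
def flag_side_mismatches(viewed_side, shared_sides):
    """Return shared matches whose side is opposite to the viewed match's side.

    Matches labeled anything other than "maternal" or "paternal" (for example
    "both" or "unassigned") are not flagged, because they give no clear signal.
    """
    opposite = {"maternal": "paternal", "paternal": "maternal"}
    if viewed_side not in opposite:
        return []
    return sorted(
        name for name, side in shared_sides.items()
        if side == opposite[viewed_side]
    )


# Hypothetical labels for the shared matches of a maternal-side viewed match.
shared_sides = {
    "C1": "maternal",
    "C2": "paternal",
    "C3": "unassigned",
}

suspects = flag_side_mismatches("maternal", shared_sides)
print(suspects)
```

A flagged match is only a suspect. As the post notes, the mismatch still needs to be verified with one of the other approaches.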

Viewed Match Switching
The second approach is to use “viewed match switching”, which entails 1) swapping out the “B” match with one of the “C” matches you suspect is misclassified, 2) selecting the shared matches filter again, and 3) then observing how the membership of the genetic network changes (i.e., the other listed “C” matches). If the new genetic network for the swapped C match is relatively stable, meaning very few of the “other” C matches change, then the “C” match is probably not a misclassified genetic network match. However, if many of the shared matches change, then the “C” match is probably a misclassified genetic network match with both “A” and “B”. See the example below for further clarification.

Kline/McMasters Genetic Network Example
Let me show you an example from my own research. My uncle has a maternal-side match (B) to someone who descends from his Kline fifth great grandfather. The Ancestry shared match page for this example was presented above. The shared matches between my uncle (A) and the viewed match (B) produced 5 matches (C1 through C5), which are denoted in the second column in the table below. Four of the five shared matches have family trees, and I confirmed that three of the four descend from my uncle’s 5x Kline great grandfather through different child lines. For one of the matches (C4), which was presented earlier as a paternal match, I could not find the connection between myself and the match on the Kline line.

Kline McMasters example of a genetic network; data from Ancestry

To determine whether C4 is a misclassified genetic network match with my uncle (A) and the viewed match (B), I swapped out the “B” match with C4. This produced a dramatic change in the list of shared matches (see the third column in the above table). All the original shared matches with “B” (C1 through C5 except C4) were replaced with 48 new shared matches (C6 through C53). After analyzing the 48 new shared matches, I found that most appear to share common ancestry with my uncle’s McMasters 6x great grandfather. Subsequent genealogical research also discovered that the original “B” match shares a different common ancestor with my uncle and a different common ancestor with C4. So, C4 is a valid match to my uncle (A) and my original “B” match. However, C4 does not share the same common ancestor that my uncle (A) shares with “B”. Therefore, C4 is a misclassified genetic network match for the genetic network between “A” and “B”.
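The comparison behind viewed match switching can also be expressed as a simple overlap measure. The sketch below uses the match labels from the Kline/McMasters example (C1 through C5, then C6 through C53) and computes the share of the other original matches that survive the switch. The function name is hypothetical, and the post gives no numeric cutoff for "relatively stable", so any threshold you apply to the result is your own assumption.

```python
def retained_fraction(original_shared, switched_shared, suspect):
    """Fraction of the other original shared matches still listed after
    the suspect match replaces the viewed match."""
    others = set(original_shared) - {suspect}
    if not others:
        return None  # nothing to compare against
    return len(others & set(switched_shared)) / len(others)


# Shared matches with the original viewed match (B): C1 through C5.
original = [f"C{i}" for i in range(1, 6)]
# Shared matches after switching the viewed match to C4: C6 through C53.
switched = [f"C{i}" for i in range(6, 54)]

fraction = retained_fraction(original, switched, "C4")
print(len(switched), fraction)  # 48 0.0
```

A value near 1 means the network stayed largely the same, which points to a true member. A value near 0, as here, means the suspect sits in a different network.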

Family Tree Building
The third approach is to build out the respective trees for yourself (A) and your matches (B and C) to find the common ancestral lines. While tree building is a necessary task when analyzing genetic networks (see Part 4 of this blog series, which is forthcoming Sept. 30th), this may not always be feasible, especially when the purpose of genetic genealogy is to discover new ancestral lines and we have limited or no documentary evidence confirming the common ancestor. However, to the extent that your genealogy and your match’s genealogy are known, presented, and accurate, it can help. Family tree building takes a lot of time, but it can help you identify when you share more than one ancestral line with a match and thus alert you to the possibility that the DNA you share with a match may come from different or multiple lines.
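When both trees are recorded in some structured form, checking for more than one shared line is a matter of comparing ancestor sets. The sketch below is a minimal illustration, assuming the trees are merged into one hypothetical child-to-parents dictionary with invented names; it reports the most recent shared ancestors so that deeper ancestors on the same line are not counted twice.

```python
def ancestors_of(person, parents):
    """Collect every known ancestor of a person from a child -> parents map."""
    found = set()
    stack = list(parents.get(person, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, []))
    return found


def most_recent_shared(a, b, parents):
    """Shared ancestors of a and b, excluding ancestors of other shared ones."""
    shared = ancestors_of(a, parents) & ancestors_of(b, parents)
    older = set()
    for person in shared:
        older |= ancestors_of(person, parents)
    return shared - older


# Hypothetical merged tree: "you" and "match" connect on two separate lines.
parents = {
    "you": ["your_mother", "your_father"],
    "your_mother": ["kline_ancestor"],
    "your_father": ["mcmasters_ancestor"],
    "match": ["match_mother", "match_father"],
    "match_mother": ["kline_ancestor"],
    "match_father": ["mcmasters_ancestor"],
    "kline_ancestor": ["older_kline_ancestor"],
}

lines = most_recent_shared("you", "match", parents)
print(sorted(lines))  # ['kline_ancestor', 'mcmasters_ancestor']
```

More than one entry in the result is the warning sign described above: the shared DNA could come from either line, or from both.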

Conclusion
The important takeaway from Part 2 of this blog series is not that you need to build out the family trees for every shared DNA match to determine all possible genetic connections, but rather that when you do build out trees, you need to be aware that not all shared matches (C) are true genetic network matches. This knowledge can help you become more critical of your assumptions so that your genetic research conclusions are more accurate. It can also prevent you from going down rabbit holes or getting frustrated trying to figure out how all matches in the genetic network converge onto a common ancestor when some misclassified matches will not converge.

In Part 3 of the blog series, which will be posted on September 23, I describe how to find genetic networks within your list of DNA matches.


Acknowledgment: The image used within the header at the top of the blog post was created using Microsoft’s Copilot AI-powered assistant (DALL-E 3) and added to the title slide. AI tools were not used to generate the blog’s intellectual content or provide writing assistance. The post was authored solely by me.



4 thoughts on “Genetic Networks, Part 2: How to Identify Misclassified Members”

  1. Interesting blog post, nice illustrations. Regarding these misclassified members. Would clustering not be a way to get around this issue? Because the chance that a misclassified match would appear in the cluster, matching all or most other members of that cluster as well, would be quite small I would think?

    • Thanks for your comments. By clustering, I’m assuming you’re referring to auto clusters such as those offered by MyHeritage, GEDmatch, or Genetic Affairs. My understanding of auto clusters is that they still use “shared matches”, which could suffer from misclassification. They do not look at specific DNA segments but rather at whether matches match one another. However, they do assess the shared matches of your shared matches, thereby reducing the possibility of misclassification. In any case, the amount of misclassification is typically quite small. In my next post in the series, I discuss auto clustering and other methods for genetic network identification.
