=== Are WikiTree circles lopsided? ===

Author : Bernard Vatant
First publication : 2020-11-09
Last modification : 2020-11-10

== Notes of conversations with Julie Kelts and Eva Ekeblad, November 2020 ==

After the original publication of the "Cent cercles de Jean-Joseph", a conversation started on the WikiTree G2G forum, followed by private exchanges with Julie and Eva around the idea of a "100 Circles" WikiTree project, which would allow WikiTreers to engage in similar work around other profiles. This project is not yet public, but preliminary exchanges about its concept have already raised a number of questions, which I try to put in order below.

#Q1 : Reading about circles, or spheres, or geometry at large, people will be led to think about graphical representations. Are such representations possible, using modern, fancy, dynamic graphical interfaces? Would they be useful? Can we even try to have useful mental representations of such a geometry?
#A1 : Unfortunately, the answer to all of the above is, on the whole, "hardly". The WikiTree graph has no dimension, in the sense the term has in the usual two- or three-dimensional (or higher) vector spaces. Any projection into a space our eyes and brain can grasp will not scale beyond the first few circles. The first reason is the sheer size of the circles: the first three alone, when "complete" (see #A2 below for what is meant by that), will typically gather about 500 profiles. The second reason is graph complexity, including the multiplicity of minimal paths between two profiles, which can occur even in the first circles. Of course, the graph can be flattened on a screen and displayed as 2D circles around the center, but this representation is workable up to the third circle at best, where about one profile per degree of arc makes it barely readable. But I am not an expert in graphical applications, and here we should stick to the principle that "people who say it's impossible should not disturb those who are doing it".
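To make the notions of "circle" and "multiplicity of minimal paths" in #A1 concrete, here is a minimal Python sketch. It assumes the graph is available as a plain adjacency dictionary, where each edge stands for whatever relationship counts as one step. The function name `circles`, the toy graph and the profile IDs are all hypothetical, and nothing here uses an actual WikiTree API. A breadth-first search from the center gives the distance of each profile, hence its circle, and counts how many distinct minimal paths lead to it.

```python
from collections import deque

def circles(graph, center, max_radius):
    """Breadth-first search from `center`, limited to `max_radius` steps.

    Returns a pair:
      - {distance: sorted list of profile ids at that distance}
      - {profile id: number of distinct minimal paths from the center}
    """
    dist = {center: 0}
    n_paths = {center: 1}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == max_radius:
            continue  # do not expand beyond the outermost requested circle
        for v in graph.get(u, ()):
            if v not in dist:
                # first time v is reached: it belongs to the next circle
                dist[v] = dist[u] + 1
                n_paths[v] = n_paths[u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                # v is reached again by another minimal path
                n_paths[v] += n_paths[u]
    rings = {}
    for profile, d in dist.items():
        rings.setdefault(d, []).append(profile)
    return {d: sorted(ps) for d, ps in rings.items()}, n_paths

# Hypothetical toy graph: D is reachable from A through B or through C.
toy = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

rings, n_paths = circles(toy, "A", 3)
for d in sorted(rings):
    print(d, len(rings[d]), rings[d])
print("minimal paths to D:", n_paths["D"])
```

Even in this five-profile toy graph, D sits on the second circle with two minimal paths leading to it, which is the situation that makes any flat drawing ambiguous. The size of each circle (`len(rings[d])`) is the quantity that grows too fast for a readable 2D layout.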
#Q2 : What about gaps in the available data? Circles in WikiTree will be less and less complete with growing distance, because of unknown parents, children whose fate is unknown, migrations, etc. This lack of data will not be the same for endogamous first circles (such as those of Jean-Joseph or Olof) and, for example, for American descendants of immigrants from many places, which are not all equally easy to search. This will lead to "lopsided circles", strongly biased towards easy-to-follow paths.
#A2 : This will be true at some point whatever the original center profile, past the first few circles. For European commoners living in the 19th century, ancestors can rarely be traced back before 1700, let alone before 1500, unless they have a few, often uncertain, aristocratic lineages. Even in endogamous areas with good records like those of Jean-Joseph, many known children have an unknown fate, and possibly important circle expansions will be missing, including migrations and hence potential shortcuts towards completely different parts of the tree. Attempts to complete the first circles can provide an assessment of the relative importance of the issue in various countries and periods, but at large scale, local differences are likely to be smoothed out by the law of large numbers. In all cases, we have to bear in mind that absolute completeness of circles is not reachable [1], because it would mean having reliable data for *billions* of profiles, and the task force to check them all. The current size of the WikiTree database and task force is still too small by at least two orders of magnitude. But in genealogy as in SETI, "the absence of proof is not the proof of absence", and assessing what is missing, and acknowledging that it is missing, is often the most difficult of all tasks. As everywhere, knowledge is, as the old master Kong said, "What you know, know you know it; what you don't know, know you don't know it".
Nevertheless, one can reasonably aim to complete the circles, meaning by that searching, sourcing, and building at least minimal profiles for every possible expansion in all directions, up to the fourth circle and a good part of the fifth, within a reasonable genealogist's lifetime. Beyond the fifth circle, all bets are off, depending on the task force. A single genealogist, however dedicated, is likely to give up on the hard paths and follow only the easy ones, hence the bias already mentioned, which will also be addressed below in the "aristocracy" section (TBD).
#Q3 : What about the quality of data? The definition of most circles is based on paths spanning 20 or 30 steps, and often at least one of those steps is flawed by a lack of sources, a dubious filiation, or other quality issues.
#A3 : Nothing can be done about it, except to keep working towards improving WikiTree quality. On the other hand, systematic exploration of circles makes it possible to discover such dubious connections, and to work on fixing them with their respective profile managers. This project relies on the hope, perhaps wishful thinking, that such cases do not affect the overall shape of the circle distribution too much.
#Q4 : Is WikiTree not, like many other genealogical databases, biased by a focus on searching for notable ancestors or relatives, leading to an excessive weight in the graph of, e.g., European aristocracy?
#A4 : It certainly is. But this is not only due to "aristotropism" [2], the desire to find notable relatives or ancestors. For most "commoners", the search for ancestors will hit brick walls in every line somewhere between 1500 and 1700. The lines which are documented further back are very likely to be notable ones, and in most cases belong to European aristocracy. Two factors make those paths easy to follow: the strong endogamy of that aristocracy, and the care its families devoted to keeping track of their genealogy, which was needed as a proof of nobility.
Stumbling upon some descendants of Aliénor d'Aquitaine somewhere in your circle expansion is simply unavoidable, even if you were not looking for them explicitly. And given their number, they will often be closer than you would expect. [3]

[1] There is quite a similarity between this situation and Gödel's first incompleteness theorem, although genealogy is not grounded in formal logic. We have to live with a situation where some genealogical assertions will remain forever undecidable, such as a disputed filiation with good arguments for and against it, or the absence of marriage or children for someone about whom nothing is known beyond their birth.

[2] This term is rare but sometimes used in French (aristotropisme) to denote the attraction of the "bourgeoisie" towards aristocracy, and related practices like "borrowing" a noble family name, adding a particle to the family name, finding ancestors in the nobility by any means, etc.

[3] Roglo, a genealogical database of over 8 million interconnected profiles, with a strong representation of French and European nobility, claims (as of November 2020) about 1.4 million descendants of Aliénor, that is, about 15% of the profiles in the database. These include my ancestor Catherine de Kerenor (WikiTree ID : Kerenor-1).