Maximum number of profiles per tree?

Dallan, My account is barely working. Have I hit some maximum (trees x profiles ) that is causing or related to persistent stalls and 502 or 503 timeouts?

The performance of the Morrel-Klingsmith tree’s 51 profiles has slowed to a crawl for loading or updating data and frequently stalls out entirely, whereas other trees with fewer profiles are performing slowly but somewhat better. Most attempts to load data to Ancestry profiles today simply stalled with a “wait” message that displays for hours.

I sometimes get gateway 502 or 503 errors as well, just to complicate the question. The only other time this happens is when I try to run a 7 cM one-to-many Tier 1 report. When I go to 8 cM I usually squeak by, so can I take this for an ISP bottleneck? If so, I will challenge Comcast to live up to the upload rate I pay for.

I ran local cleanup programs and compared Edge with Chrome by trying to load a 20k selection from a 40k-plus m_ Ancestry file. I have many more plugins in Chrome but concluded that there is little difference, except that the Edge bug (still not showing the cM up and down carets in the filter box) still exists, which makes Edge unusable for match files of more than 20k.

Your thoughts?? Server or db limits on profiles per tree-- where am I falling down? The only clue may be that a file with <18K matches did load in reasonable time. It seems to be the ones where the original data set exceeds 20k that bog down.

Jenny

You can have up to 70,000 people per tree, but I expect the problem is with the number of DNA profiles you have. Can you tell me what tree you’re having the problems with (the URL)? I will look into it.

Morrel-klinglesmith will send url later

Just in case anyone else is reading this, jenniferfranklin has nearly 500,000 dna matches spread out over the various dna profiles. I hadn’t anticipated that people would have so many matches. I will make some changes on Saturday to handle this case.

In case anyone is wondering why I have so many matches: in addition to having kits across various platforms, I swap match-list view rights with Ancestry matches on lines I am actively researching, even when (or especially when) the source of a segment shared with one of my kits is known, depending on how many generations are involved. I do this because there is a really good chance that person inherited segments that my close relatives missed, and because I get a lot of insight on family groups within a line, especially in the black-box world of Ancestry.

This approach has helped me in numerous ways, but especially so since I took up graphing! Most of all I have a much greater appreciation for how those >=20 cMs shared match lists can be misleading!

Also, once I have access to a shared match list, I make it a point to share diagrams and screen captures and save source files so my matches can more easily create their own accounts and get started. Sharing cluster screenshots has been a great conversation starter.

Note: Until we can isolate permission to view to single dna profiles within a tree, I do not share trees that would expose other matches’ profiles.

jen




I am not an Ancestry user but I had no idea you could do that over there! That is way cool! But I am interested in this thread because I do use most of the other DNA websites & manage DNA for 13 family members so I anticipate many DNA profiles.

Users now have the ability to edit some of their DNA profiles and “archive” them. Archived profiles aren’t downloaded into the browser, so they don’t overload the system. This should help people with lots of DNA profiles.

works great

“greater appreciation for how those >=20 cMs shared match lists can be misleading”

What did you find out about the 20CM+ ‘shared’ matches - what was misleading? Curious.

So far, archiving the profiles duplicated by the new combined kits I’m creating on GEDmatch for three of my people is not overcoming my complete inability to use RootsFinder at the moment.
I’ve had nothing but “bad gateway”, “system unavailable” and assorted other messages, or unrefreshed screens, for quite some time now when I try to do anything. (The last few attempts have been to try to add updates from GEDmatch and MyHeritage.)
So I guess I’m hitting another manifestation of Jenny’s initial problem here of overloading the system?
Tree LornaHen, lots of profiles
(need this working as preparing demo of RootsFinder for a forthcoming DNA day end of July )

The server ran out of memory when it tried to read the dna kits, which is what caused the errors. I’ve increased the memory on the server, so it should be ok now, but we need to figure out a longer-term solution.

You have around 500,000 matching kits among all of your profiles: 140,000 from Ancestry, 200,000 from GEDmatch, and 160,000 from MyHeritage. If a matching kit appears in any unarchived DNA profile, it’s downloaded as soon as you open any DNA profile. This is what’s causing the memory issues.

You’re ok for the demo, but if you want to add more kits going forward, I need to come up with a better solution:

  1. I could change the server to only read DNA kits for the type of profile (Ancestry, GEDmatch, or MyHeritage) that you were opening. This would mean in your case that you’d only read 200,000 kits at once (for GEDmatch profiles), which wouldn’t have memory issues, but this would make it more difficult for me to someday be able to provide views that showed all of the matching kits across all types of profiles at once. (The McGuire charts are a step in that direction.)

  2. What if I gave you a way to copy your tree to another tree? You could then copy your current tree into a “maternal” tree and a “paternal” tree, and remove the paternal-side DNA profiles from your maternal tree, and remove the maternal DNA profiles from your paternal tree? This only works if you can separate your DNA profiles into maternal and paternal. (It looks like you have around 130 profiles uploaded so far.) I’m not sure how easy that is?

  3. Any other ideas?
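
A rough sketch of what option 1 might look like, purely as an illustration. The profile tuples, the `kits_to_load` name and the kit IDs are all made up and are not RootsFinder's actual server code. It only shows how filtering by company shrinks the set of kits read at once.

```python
# (company, archived, matching kit IDs) per profile; all values are made up.
profiles = [
    ("Ancestry", False, {"A1", "A2", "A3"}),
    ("GEDmatch", False, {"G1", "G2"}),
    ("GEDmatch", False, {"G2", "G3"}),
    ("MyHeritage", False, {"M1"}),
    ("GEDmatch", True, {"G9"}),  # archived, so never loaded
]


def kits_to_load(profiles, opened_company=None):
    """Return the distinct kit IDs to read when a DNA profile is opened.

    opened_company=None models the behaviour described in the thread
    (every unarchived profile is read); passing a company models option 1.
    """
    kits = set()
    for company, archived, kit_ids in profiles:
        if archived:
            continue  # archived profiles are skipped entirely
        if opened_company is not None and company != opened_company:
            continue  # option 1: ignore other companies' profiles
        kits |= kit_ids
    return kits


everything = kits_to_load(profiles)
gedmatch_only = kits_to_load(profiles, "GEDmatch")
print(len(everything), len(gedmatch_only))  # 7 3
```

The trade-off raised in option 1 shows up here too: once the loader filters by company, a view that needs kits from every company at once has to call it without a filter and pay the full memory cost.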

I’m going to bring this conversation to the attention of everyone who has a lot of kits, so we can get everyone’s input.

To be clear, is the total calculated across all active profiles within an active tree or across all trees in an account? Would not there be a way to also archive trees not being worked on if it is the latter?

And also, how do trees you are invited to work? (I was recently invited to a tree with more than 20 profiles.)

It seems to me that I most often work on profiles across platforms within the same tree, and then only for the same person or small set of persons, btw. In my current case it is my profile across 4 platforms and then, only if necessary, similar sets of kits across platforms for my paternal half siblings.

In any case, a warning like an interactive gasoline tank dial advising me of the need to archive more profiles to get an optimal workspace would help me manage server resources better.

jen

If anyone else is looking for the “Archive” feature, when you select a DNA Profile look for the pencil icon to edit DNA profile and there is an “Archive this profile” box to “check”. I have about 16 separate profiles and when I did about half I did a refresh on RF and it seemed to run faster to archive the second half.


I’m entering this discussion at Dallan’s request. I will need some clarification to make an intelligent contribution. Let me lay out my experience with profiles which may be somewhat different from other contributors. Then I will raise some questions.

I have not experienced the kind of slowdown Jenny discussed at the beginning of this string. Early on, I had some slowdown issues but none recently.

I currently have 11 profiles associated with the Morrel-Smylie-Allen-Sykes Family Tree which was generated by my original downloaded GEDCOM from Ancestry. Four of the profiles are from GEDmatch and were updated a number of times prior to the GEDmatch switch to Genesis GEDmatch. The tree included some manual updating at RF. The other profiles were generated by my cousin Jenny and transferred to me. Two of those are for other cousins who are family members.

Within the last week, I created a new Ancestry GEDCOM because of significant updating of the Ancestry tree and uncertain pending changes to Ancestry’s platform. The tree, Morrel Famliy Tree Update 14 June 2019, seems to have transferred well (though I have been unable to figure out the GEDCOM Media Uploader piece, as yet). Since I was creating an enhanced tree, I also created a new John Morrel profile (ID A534979 Updated) from GEDmatch using the Tier One One-To-Many copied to a spreadsheet. That also worked. This should be my primary personal profile now. However, I was reluctant to delete the earlier one (ID A534979) because it probably includes GEDmatch kits that did not transfer to Genesis GEDmatch for one reason or another.

Questions:

  1. Are the profiles intended to be limited to a particular number of matches 2,000, 3,000 etc?
  2. If so, is this uniform across GEDmatch, FTDNA, Ancestry, etc.?
  3. Dallan, when you speak of a limit of 400,000 matching DNA kits across all profiles they’ve created, in my case would that refer only to the profiles I created (the GEDmatch profiles) or would it include profiles Jenny transferred to me?
  4. Since I manage four GEDmatch kits directly for myself, my brother, our uncle and our first cousin, a next step would be to create profiles for the other three (or use the existing GEDmatch profiles for them). (I’m assuming here that the 100,000 matching DNA kits Dallan mentioned to me for a single RootsFinder tree refers to the updated tree and the new profile I created for myself?) Given the changes in both the Ancestry-based RF tree and the Genesis GEDmatch platform, I assume it would make sense to create new profiles. Do you agree?
  5. In order not to overload the system, is there a way to put some boundaries on the FTDNA and Ancestry profiles in my account?

Many thanks for any observations. John


Is not one solution to pay more proactive attention to archiving? It doesn’t solve the mega-match (cross-profile) problem, but it would keep multiple-kit users out of the weeds if the system were more transparent about resources, e.g. by giving a status warning.

I have experienced some times when the system locks when I save matches to my tree, but otherwise no lasting slowdowns. At times I wish parts would load quicker.

Before reading the thread I didn’t know how to use the archive, but I think I understand now. I’m currently bouncing between two trees so as to focus on a single family I share with others who help. It is a chore to update profiles twice, once in each tree.

You mentioned creating multiple trees for maternal/paternal, for example. I would prefer a way to create groups of profiles that can be toggled on/off depending on which side I’m working on. Would that help?

Opening only data from the company being worked with sounds like an excellent first step, Dallan.
Didn’t realise I had so many profiles.
I’ll work thru reminding myself why some are there, update their descriptions for my benefit, and probably archive a few more.
Splitting the tree /profiles would be a last resort for me. I hate having to think about who belongs where in trees/dna no matter what site I’m working on.

I’ve never managed to run a McGuire chart on RF but when I do, I hope RF will know who is linked as a descendant of the focus and include them regardless of source company.
In my own database (ie not RF) I delete anything under 7cM and my default for working with matches is 12cMs (largest segment) although I do occasionally go down to 10.

So both a warning about excessive data, and then an option to be able to delete smaller matches in bulk when things get “full” might be useful?
I try to remember not to import them but that doesn’t mean I always remember and there’s always the thought that I might be missing a vital clue!

This is just about profiles in the open tree. Profiles in other trees don’t matter.

Off-topic, but we don’t import media from Ancestry GEDCOMs yet – just from desktop genealogy programs like RootsMagic so far. I’ll be working on importing media from Ancestry this month or next.

It’s the total of all matches associated with all unarchived profiles attached to a single tree, regardless of who created them. To give an example: suppose you had a tree with three DNA profiles:

Profile One has 10,000 matches from GEDmatch
Profile Two has 10,000 matches from GEDmatch, but 2,000 of them are the same DNA kits as Profile One.
Profile Three has 20,000 matches from Ancestry, but it is archived.

Then the total number of matching DNA kits that we’re talking about is 10,000 from Profile One, 8,000 from Profile Two, and 0 from Profile Three, making a total of 18,000 matching DNA kits.
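
The tally above is just a set union over the unarchived profiles. A minimal sketch in Python, assuming a hypothetical `DnaProfile` record and invented kit IDs (nothing here reflects RootsFinder's real data model):

```python
from dataclasses import dataclass, field


@dataclass
class DnaProfile:
    """Hypothetical stand-in for a DNA profile attached to a tree."""
    name: str
    company: str
    match_kit_ids: set = field(default_factory=set)
    archived: bool = False


def count_loaded_kits(profiles):
    """Count distinct matching kits across all unarchived profiles in one tree."""
    loaded = set()
    for profile in profiles:
        if not profile.archived:
            loaded |= profile.match_kit_ids  # duplicates collapse in the set
    return len(loaded)


# Rebuild the three-profile example with made-up kit IDs.
one = DnaProfile("Profile One", "GEDmatch", {f"G{i}" for i in range(10_000)})
# Profile Two shares 2,000 kits (G8000..G9999) with Profile One.
two = DnaProfile("Profile Two", "GEDmatch", {f"G{i}" for i in range(8_000, 18_000)})
three = DnaProfile(
    "Profile Three", "Ancestry", {f"A{i}" for i in range(20_000)}, archived=True
)

total = count_loaded_kits([one, two, three])
print(total)  # 18000
```

Unarchiving Profile Three in this sketch would raise the total to 38,000, since none of its Ancestry kit IDs overlap with the GEDmatch ones.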

Yes

It’s possible to limit the profiles when you import them, to exclude profiles that don’t have a high cM overlap. Maybe another option would be to allow you to delete low-overlap profiles after they’ve been imported?
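
For what it's worth, trimming low-overlap matches after import could be as simple as a threshold filter. This is only a sketch: the `shared_cm` field, the `prune_matches` helper and the sample matches are hypothetical, and the 7 cM cutoff is just the figure one poster mentioned using in their own database.

```python
def prune_matches(matches, min_cm):
    """Keep only matches sharing at least min_cm centimorgans."""
    return [m for m in matches if m["shared_cm"] >= min_cm]


# Made-up matches for one imported profile.
matches = [
    {"kit": "K1", "shared_cm": 6.5},
    {"kit": "K2", "shared_cm": 7.0},
    {"kit": "K3", "shared_cm": 12.4},
    {"kit": "K4", "shared_cm": 48.0},
]

kept = prune_matches(matches, min_cm=7.0)
print([m["kit"] for m in kept])  # ['K2', 'K3', 'K4']
```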

Yes, regardless of what we decide, it’s clear that I have to at least give a warning when you surpass say 200,000 matching DNA kits, and I should display a count of the number of matching DNA kits when you open a DNA profile.
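
The count-plus-warning idea could look roughly like this. It is a sketch only: the function name and message wording are invented, and the threshold is simply the "say 200,000" figure floated above, not a confirmed limit.

```python
WARN_THRESHOLD = 200_000  # assumed from the "say 200,000" suggestion in the thread


def kit_count_message(kit_count, threshold=WARN_THRESHOLD):
    """Build the text shown when a DNA profile is opened."""
    message = f"{kit_count:,} matching DNA kits loaded for this tree."
    if kit_count > threshold:
        # Only nag once the tree is past the threshold.
        message += f" This is over {threshold:,}; consider archiving some DNA profiles."
    return message


print(kit_count_message(18_000))
print(kit_count_message(500_000))
```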

That’s an interesting idea: give you a way to archive & unarchive a set of DNA profiles at once.
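
Archiving or unarchiving a set of profiles at once might be sketched like this, assuming a hypothetical mapping of group names to profile names. The group labels, profile names and the `set_group_archived` helper are all invented for illustration.

```python
def set_group_archived(profiles, groups, group_name, archived):
    """Archive or unarchive every profile in a named group.

    profiles maps profile name -> archived flag and is updated in place.
    Returns how many profiles actually changed state.
    """
    changed = 0
    for name in groups[group_name]:
        if profiles[name] != archived:
            profiles[name] = archived
            changed += 1
    return changed


# Made-up profiles, all active to begin with.
profiles = {"Mum GEDmatch": False, "Mum Ancestry": False, "Dad GEDmatch": False}
groups = {
    "maternal": ["Mum GEDmatch", "Mum Ancestry"],
    "paternal": ["Dad GEDmatch"],
}

# Working on the maternal side: switch the paternal group off in one step.
changed = set_group_archived(profiles, groups, "paternal", archived=True)
print(changed, profiles["Dad GEDmatch"])  # 1 True
```

Because the groups are just labels over profiles in a single tree, nothing has to be copied or updated twice, which is the chore with separate maternal and paternal trees.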

As a reference point for others, I see a slow down around 100,000 matches. I did a tally of all 16 profiles, some only have 4000-6000 others are right at 20,000. So the new “archive” button greatly improved “latency?” for me as long as my total is less than 100,000 matches.
I like Jennifer Franklin’s comment about a gas gauge, or maybe like a battery life indicator, green, orange and red as a health indicator. Dallan, it would also be nice to see a specific number to indicate exactly what the system is “juggling” and (2nd request) I personally would love for the filters to remain saved as I leave a profile, go to a new profile then return to the first. I have no idea how big of a deal that is.

Love the idea of being able to group profiles, regardless of whether this is for potential archiving for performance or not.
By using the fan chart filter, and/or a filter on the description field?