Closed Bug 522847 Opened 15 years ago Closed 14 years ago

Terrible accessibility performance on this testcase

Categories

(Core :: Disability Access APIs, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: bzbarsky, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: access, perf)

Attachments

(1 file, 1 obsolete file)

Attached file Testcase
This is spun out of bug 522748.  When accessibility is enabled, the attached testcase becomes about 100x slower (click the "run test" button to test).
Depends on: 522748
Thhoughts anyone?
In case it helps, profiling shows that a lot of time is spent under nsDocAccessible::ContentAppended calling InvalidateCacheSubtree.  All percentages below are percentages of the time spent under nsDocAccessible::ContentAppended.

InvalidateCacheSubtre calls FireDelayedAccessibleEvent(unsigned int,...) (76%), which calls a different signature of the same, which calls nsAccEvent:ApplyEventRules (73%); time is spent under this, as follows:

24% nsCoreUtils::AreSiblings (operates on nsIDOMNode instead of nsINode; why?)
20% nsAccEvent::QueryInterface (GetAccEventPtr is killing you here)
11% nsAccEvent::Release (probably same thing)
6%  nsAccReorderEvent::QueryInterface (same thing)

Can't you just store nsAccEvent in the array?  Especially given that you assume the QI will succeed?

The other main callee of InvalidateCacheSubtree is FireDelayedAccessibleEvent(nsIAcessibleEvent*) (23%).  Once again, all its time is spent under nsAccEvent::ApplyEventRules.

That said, ApplyEventRules doesn't have _that_ much QI and such going on to be showing up in the profile like that.  How many calls to ApplyEventRules are we making here, exactly?
Thanks for the ping Marco, and thanks for profiling Boris in the meantime I was also looking (I collided with your comment) :)

Shark data:
http://people.mozilla.com/~dbolter/perf/522847.mshark

For each event we're spending over 50% of the time in nsAccessibilityService::InvalidateSubtreeFor with accessibility performance pain points including:
nsAccEvent::ApplyEventRules
nsCoreUtils::IsAncestorOf
nsAccEvent::ApplyToSiblings
nsCoreUtils::AreSiblings
(In reply to comment #2)
> Can't you just store nsAccEvent in the array?  Especially given that you assume
> the QI will succeed?

This seems reasonable. Ginn, do you know the story here?
(In reply to comment #2)
> How many calls to ApplyEventRules are we
> making here, exactly?

Good question and I'm not sure how many calls to ApplyEventRules are made as compared to the number of events we process. Ideally not 1:1! This is something we should confirm.

I think we need to improve the performance of ApplyEventRules, but also to run it less often when the DOM is going crazy. Thinking aloud maybe we need to come up with a good throttling algorithm for a noisy DOM -- possibly a timeout based on our event queue length. If length > foo set next timeout for max(foo+bar, baz).
Just to be clear, my current suspicion is that the number of ApplyEventRules calls is O(N^2) in number of DOM manipulations in this case....
Yeah I'm not sure how that's happening.
Ah, no.  The number of ApplyEventRules calls is O(N) in number of nodes, but each call is O(N) in number of nodes.

The code in FireDelayedAccessibleEvent, in its entirety is:

1629   mEventsToFire.AppendElement(aEvent);
1632   nsAccEvent::ApplyEventRules(mEventsToFire);
1635   return PreparePendingEventsFlush();

So if a testcase calls appendChild 1000 times, we will call this code 1000 times and call ApplyEventRules 1000 times, each time with a longer and longer array.
It is a good suggestion to just store nsAccEvent*.
It would save a lot time for us.
Thanks, bz.

For ApplyEventRules() and ApplyToSiblings(), I guess something was wrong.
It's a bit complicated. I should think it over.
before this patch, ApplyEventRules takes 31.7s, after, it takes 18.6s.
Attachment #408577 - Flags: superreview?(bzbarsky)
Attachment #408577 - Flags: review?(surkov.alexander)
Attachment #408577 - Attachment is patch: true
Attachment #408577 - Attachment mime type: application/octet-stream → text/plain
Do you need to release nsAccEvent* during cycle collection's unlink?

Can you consider to have inline method to remove from array and release n elements?

nit: I would prefer to not have braces for single 'for' statement

can you use CallQueryInterface instead of aEvent->QueryInterface(NS_GET_IID(nsAccEvent)?
Attachment #408577 - Attachment is obsolete: true
Attachment #408577 - Flags: superreview?(bzbarsky)
Attachment #408577 - Flags: review?(surkov.alexander)
Bug 513156 makes ApplyEventRules() being called O(N) times.
Before it was committed, ApplyEventRules() takes 6.6s.

I'm trying to find a way to fix without back it out.
Really? What aspect of that (Canvas/WebGL) bug hit our perf?
For what it's worth, I think patches should go into bugs blocking this tracking bug, since there are multiple issues here which will need multiple patches, reviews, and checkins.
(In reply to comment #13)
> Really? What aspect of that (Canvas/WebGL) bug hit our perf?

Sorry, I meant Bug 513213.
Depends on: 524696
Blocks: 513213
(In reply to comment #12)
> Bug 513213 makes ApplyEventRules() being called O(N) times.
> Before it was committed, ApplyEventRules() takes 6.6s.
> 
> I'm trying to find a way to fix without back it out.

Can you describe please how it happens. The meantime we have

EventsNumber * seconds(new version of ApplyEventRules)

previously we had

seconds(old version of ApplyEventRules) == seconds("for" cycle from 0 to EventsNumber) * seconds(new version of ApplyEventRules).

which should be the same. What do I miss?
ApplyEventRules has been O(N^2) ever since it appeared, as far as I know...  It _is_ possible that the constant might have changed somehow as part of bug 513213.  But fundamentally, the O(N^2) is the problem.
Can a problem be in ApplyEventRules is called sync after bug 513213 (when we are about to fire new event then we call ApplyEventRules) but prior to it the method was called async?
(In reply to comment #18)
> Can a problem be in ApplyEventRules is called sync after bug 513213 (when we
> are about to fire new event then we call ApplyEventRules) but prior to it the
> method was called async?

I guess so. Perhaps getparent would fail if it called async.
I didn't debug into it yet.

I tried to get rid of ApplyToSiblings(), but ApplyEventRules() still takes about 15s.

The design of ApplyEventRules() is based on most of the events are from siblings nodes, so ApplyToSiblings() would reduce the number of calls to IsAncestorOf().

This testcase kills the design, because it has 5000 rows x 4 columns.
We still call IsAncestorOf() for table cells in different rows.
Big numbers.
Could we cache some information about DOM node position in accessible object like it's level or something (not sure when), so we could get rid many cases of DOM traverse?
Could we also use nsINode instead of nsIDOMNode in the accessible event?  That wouldn't change the growth order, but make the operations _much_ cheaper...

Alternately, would a non-list data structure be better for pending accessible events?  Note that we're looking into just such a change for pending restyles, for similar reasons.
(In reply to comment #21)
> Alternately, would a non-list data structure be better for pending accessible
> events?  Note that we're looking into just such a change for pending restyles,
> for similar reasons.

Yeah why not. What data structure would be cheaper?
Good question!  Note I said "looking into".  See m.d.t.layout "Lazy frame construction" thread for some relevant posts; I have another one coming up.
(In reply to comment #19)
> This testcase kills the design, because it has 5000 rows x 4 columns.
> We still call IsAncestorOf() for table cells in different rows.
> Big numbers.

I was wrong.
We're firing events for every row, not every cell.

But, for HIDE event, when the event was appended the parent of DOMNode was nsHTMLTableSectionElement. When the next event comes in, the parent of last event's DOMNode changed to nsDocumentFragment. So we couldn't find siblings for mDOMNode of tailEvent.

Perhaps eCoalesceFromSameSubtree never works for HIDE event?
(In reply to comment #24)
> (In reply to comment #19)
> > This testcase kills the design, because it has 5000 rows x 4 columns.
> > We still call IsAncestorOf() for table cells in different rows.
> > Big numbers.
> 
> I was wrong.
> We're firing events for every row, not every cell.
> 
> But, for HIDE event, when the event was appended the parent of DOMNode was
> nsHTMLTableSectionElement. When the next event comes in, the parent of last
> event's DOMNode changed to nsDocumentFragment. So we couldn't find siblings for
> mDOMNode of tailEvent.
> 
> Perhaps eCoalesceFromSameSubtree never works for HIDE event?

At least not every time ;) What is an nsDocumentFragment? Does it represent something not yet attached to the DOM?
> Perhaps eCoalesceFromSameSubtree never works for HIDE event?

It probably works for style changes; not so well for DOM changes.

> What is an nsDocumentFragment?

It's just a way of keeping track of a bunch of nodes.  In this testcase, the rows are removed from the DOM, put into the document fragment, then the document fragment is put back in the DOM (which puts all the nodes back in the DOM).
Thanks Boris.

(In reply to comment #24)
> (In reply to comment #19)
> 
> But, for HIDE event, when the event was appended the parent of DOMNode was
> nsHTMLTableSectionElement. When the next event comes in, the parent of last
> event's DOMNode changed to nsDocumentFragment. So we couldn't find siblings for
> mDOMNode of tailEvent.
> 

This sounds like a different problem than bug 493531 (?)

We might need to do more immediate processing of/and fire hides faster. This would screw up our event ordering but might be the lesser of two evils. I guess storing more state with each event would work too but feels non performant.
(In reply to comment #24)

> Perhaps eCoalesceFromSameSubtree never works for HIDE event?

it was true for DOM hide events prior to bug 513213, now we coalesce events before DOM node is removed from the tree.
(In reply to comment #28)
> it was true for DOM hide events prior to bug 513213, now we coalesce events
> before DOM node is removed from the tree.

No, it's still not working, because you're comparing an already removed node (thisEvent) with a yet removed node (tailEvent).
(In reply to comment #29)
> (In reply to comment #28)
> > it was true for DOM hide events prior to bug 513213, now we coalesce events
> > before DOM node is removed from the tree.
> 
> No, it's still not working, because you're comparing an already removed node
> (thisEvent) with a yet removed node (tailEvent).

Right, I forgot this. Bug 513213 affects on reorder events in some cases only. Iirc I wanted to coalesce hide events by their reorder events while they are in tree still.
Depends on: 538633, 342045
The performance of nsAccEvent::ApplyEventRules scales badly with the size of the queue, for queue sizes over about 3000 items it is more costly to process the queue in nsAccEvent::ApplyEventRules than to fire all events as they were. Have a real life case where a queue size of over 10000 spends several seconds in nsAccEvent::ApplyEventRules. During this time the Firefox UI is unresponsive.
RCI, that's what bug 538594 is about.
Closing as fixed by related performance work (see dependencies).
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: