Diagnosing Wine Richedit Performance Problems

Recently somebody asked me to look at why a particular app is slow to scroll its log window. There are many ways to look for performance problems, and I needed lots of them to understand this one. Here's the short story of how I approached speeding the app up.

Running the app under perf record / perf report (and getting it to insert 5000 lines into its log window) shows that a function ME_WrapMarkedParagraphs in riched20.dll uses a noticable amount of CPU time.

Running the app with WINEDEBUG=+richedit,+timestamp and looking at the repeating section of log corresponding to inserting one line into the log showed that the app was actually doing three insertions per line, and each one was slow. The app vendor could improve performance under wine by concatenating the strings before appending them to the log.

Adding a TRACE("line %d\n",__LINE__); at the top and bottom of ME_WrapMarkedParagraphs and reproducing the bug while running with WINEDEBUG=+richedit,+timestamp showed that ME_WrapMarkedParagraphs was indeed slow when the number of lines in the richedit was about a few thousand -- and that it was called five times per insertion into the richedit!

I noticed I was wasting a lot of time waiting for the app to start and getting it to insert 5000 lines of text into the log, so I wrote a tiny c program that inserted the 5000 lines all at once and added another 1000 one by one, and verified that it reproduced the slowness. I then filed wine bug 30627 describing the problem, and attached the source for the test program.

I used grep to find all callers to ME_WrapMarkedParagraphs, added TRACE("line %d\n", __LINE__); statements at each one, and ran my test program. This found three places where ME_WrapMarkedParagraphs was being called. One of them, ME_Repaint, is called from several places, so I repeated the process with that function. The app calls EX_SETSEL to position the selection at the end of the buffer. You wouldn't expect this to cause a redraw, but the log showed that ME_SetSelection calls ME_Repaint and ME_WrapMarkedParagraphs. Looking at the source, this is done to draw over the old selection -- but in this case, there is no old selection, so the redraw is wasted. Skipping the call to ME_Repaint reduced the time to insert a line by about one seventh. So I submitted a patch to avoid calling ME_Repaint in that case, and it was accepted into the winehq tree.

At this point it dawned on me that wine's riched20's problems on large documents probably come mostly from that linear scan down the linked list of all paragraphs, and there are probably about six left, most of which are calls to ME_WrapMarkedParagraphs. After chatting with previous riched20 contributors Dylan and Krzysztof, I decided to try adding a global dirty flag; ME_WrapMarkedParagraphs would return immediately if that flag wasn't set. Lazily, I used riched20's test suite (cd tests; make test) to find a few places that I'd missed. The resulting patch helps quite a bit; it's attached to bug 30627.

I then tested with the real app; it worked great up to 9995 lines, and then slowed down mysteriously. WINEDEBUG=+richedit showed that the app limits the number of lines in the richedit to 10000 lines, so it started deleting the first line of the buffer on each append about then. So I wrote a new test app that did that. Profiling and logging with WINEDEBUG=+richedit didn't show why it was slow, but logging with WINEDEBUG=+relay showed it clearly: riched20 was calling InvalidateRectangle() a bazillion times! I filed bug 30653 for that along with a draft patch which fixes the bug. A few days later, Dylan submitted a real fix, which entirely gets rid of one linear scan. (Yay Dylan!)

The real app, while still unhappy about speed in general, can limp along now that 30653 is fixed. Eventually, though, we'll need to come back to bug 30627, and get rid of the remaining linear scans.