I already tested with Bayes turned off, and the results are definitely better. Some years ago I ran several tests with the Bayes filter on another mail system. I found that teaching the filter works well if you have enough spam AND non-spam (at least 100 mails each before running the as_learn).
The basic problem is that you easily get 100 spam mails from your customers, but you do not get a good set of non-spam.
You can offer the users folders where they can put spam and non-spam, but they only put spam in them.
Teaching the filter with a lot of spam but only a handful of non-spam always gave bad results (many spam mails got negative points and thus went through).
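To make the "at least 100 mails each" rule of thumb concrete, here is a minimal sketch of a check you could run before training. It assumes one file per mail in each learning folder (Maildir-style); the function names and the threshold constant are my own, hypothetical names, not part of any mail system.

```python
from pathlib import Path

# Rule of thumb from my tests above, not a documented limit of any filter.
MIN_PER_CLASS = 100


def count_mails(folder):
    """Count regular files in a folder, assuming one file per mail."""
    return sum(1 for p in Path(folder).iterdir() if p.is_file())


def ready_to_train(spam_count, ham_count, minimum=MIN_PER_CLASS):
    """Return True only if BOTH classes have at least `minimum` mails."""
    return spam_count >= minimum and ham_count >= minimum
```

You would call count_mails() on the spam folder and on the non-spam folder and only start the learning run if ready_to_train() returns True. With 500 spam mails and 7 non-spam mails it returns False, which is exactly the situation described above.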
I also had no good experience with the autolearn feature, which I can understand. If we get well-made spam mails offering some pills that are not detected as spam, then autolearn will of course end up classifying these mails as good.
Overall, my experience also says: do not use Bayes if you are not able to teach the system with a good selection of spam and non-spam. Do not rely on users filling your learning folders with mails.
Of course, this is just my personal experience, which might be different in other situations.