Alfedenzo (alfedenzo) wrote,


A couple of weeks ago I wrote a quick and dirty Python script to scrape the local Zehrs flier. Last night I tossed a GUI around it and hooked it up to a Bayesian classifier so it can sort items into things I'm interested in and things I'm not.

Unfortunately, it seems that the classifier is too unstable. Marking interest in a few things drags many other, apparently unrelated things over to the 'interested' side. Telling it that I'm not actually interested in adult diapers causes it to decide that I'm not interested in the items I originally marked as interesting.

Can anybody who's more familiar with Bayesian classifiers explain why telling it I'm not interested in VEET IN-SHOWER HAIR REMOVER makes it think I'm less interested in MAPLE LEAF BACON, even though the two have no words in common?

I'm using a pair of classifiers, one for 'good' and the other for 'bad'. If one scores high and the other low, the item gets marked interested or not interested accordingly. Otherwise it's undecided.
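
Roughly, the decision rule looks like the sketch below. This is not the actual code from the script: the function name `decide`, the assumption that both scores fall between 0 and 1, and the 0.7 / 0.3 thresholds are all made up for illustration.

```python
def decide(good_score, bad_score, high=0.7, low=0.3):
    """Combine the two classifier scores into a single label.

    Assumptions (not from the real script): scores are in [0, 1],
    and `high` / `low` are arbitrary illustrative cutoffs.
    """
    # 'good' classifier is confident and 'bad' classifier is not
    if good_score >= high and bad_score <= low:
        return "interested"
    # 'bad' classifier is confident and 'good' classifier is not
    if bad_score >= high and good_score <= low:
        return "not interested"
    # the two classifiers disagree, or neither is confident
    return "undecided"
```

With these made-up thresholds, an item only leaves the undecided pile when the two classifiers clearly disagree in the expected direction; if both score high or both score low, it stays undecided.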

Edit: Problem solved. Reason given in comments. Now it's working like a dream.