Content Editing Tool For Your Web Site | posted: 4/14/08
The Curse Word Filter is our first step towards automating the editor's job. It might save you time in its current form, so we put it here for your convenience. Just add the script below to the Perl code that processes your blog or text input and put it to work as you gradually add to your negative word list. This is a concept piece, a work in progress. Your collaboration and development ideas would be greatly appreciated. I would also welcome the opportunity to add this to the comment section of a working web site.
Description | posted: 4/14/08
The curse word filter consists of two parts: the textarea element and the negative word file. The content of the textarea is compared with the negative word file, and if a match occurs the offending text is replaced with asterisks. With more creative programming (using if statements) you may be able to sort out different types of content and create different responses for different types of abrasive input. Maybe even sort out words like 'abrasive'.
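Here's a minimal sketch of that compare-and-replace idea in Perl. It is not the script from this page, just one way it could be done. The name mask_words is made up for illustration, and replacing each match with one asterisk per letter is my assumption; the description above only says the text is replaced with asterisks.

```perl
use strict;
use warnings;

# Hypothetical helper (not the page's actual script): replace every
# listed word found in $text with a run of asterisks.
sub mask_words {
    my ($text, @negatives) = @_;
    for my $word (@negatives) {
        next unless length $word;
        # \Q...\E quotes any regex metacharacters in the word;
        # \b keeps the match to whole words; /i ignores case.
        # Assumption: one asterisk per character of the listed word.
        $text =~ s/\b\Q$word\E\b/'*' x length($word)/gie;
    }
    return $text;
}

# Stand-in for the textarea content and the negative word list.
print mask_words("You are such a dweeb today", "dimwit", "dweeb"), "\n";
```

In a real form handler the first argument would be the submitted textarea value and the list would come from the negative word file.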
Script
Try it.
Try the words 'dimwit' or 'dweeb', which we've included in the curse word list for this demo. Notice that dimwit and dweeb work but dimwitted and dweebalo don't. The script needs a more precise match that allows for leading and trailing spaces around the flagged word or phrase. The script will, however, sort out 'dweeb.' with trailing punctuation such as a period. It's a work in progress. Your feedback will be greatly appreciated (with or without the bad words deleted).
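One possible way to tighten the match is sketched below. This is an assumption about how it could be done, not how the demo does it. The helper name build_pattern is hypothetical. It uses lookarounds so a flagged entry only matches when it is not glued to other letters or digits, and it lets the words of a phrase be separated by any amount of white space.

```perl
use strict;
use warnings;

# Hypothetical helper: build one case-insensitive pattern for a
# flagged word or phrase taken from the negative word list.
sub build_pattern {
    my ($entry) = @_;
    # Quote each word, then allow any run of white space between
    # the words of a phrase such as "donkey ass".
    my $body = join '\s+', map { quotemeta($_) } split ' ', $entry;
    # (?<!\w) and (?!\w) reject matches that sit inside a longer word,
    # while spaces or punctuation on either side are fine.
    return qr/(?<!\w)$body(?!\w)/i;
}

my $dweeb = build_pattern('dweeb');
for my $input ('dweeb', 'dweeb.', ' dweeb ', 'dweebalo') {
    printf "%-10s %s\n", "'$input'", ($input =~ $dweeb ? 'flagged' : 'passed');
}
```

With this pattern 'dweeb', 'dweeb.' and ' dweeb ' are flagged while 'dweebalo' passes, so longer forms still have to be listed as their own entries.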
Installation
Here's what I would do. First, copy and paste the code above into the Perl code you use to process blog or text entries. You may have already imported the CGI or Text modules, in which case delete the calls to those modules here so they are not duplicated.
Next, create your negative word text file, negatives.txt. Here it's in the same directory as the Perl code. The word list is a comma separated list with no spaces, for example: dweeb,dweebette,dweeber,dunce . . . etc. There's no comma after the last entry. You might also want to add phrases: donkey ass, cow pusher, dimwitted dweebalo, etc. Careful. You're practicing a new brand of censorship here that carries some responsibility.
I would call the script using either the form action attribute or the JavaScript httpRequest function. The second is more popular and allows increased design flexibility. You might want to take a look at the JavaScript file we use on this page if you're using Ajax. Ajax is difficult to debug because, if the script doesn't run, you don't have access to the standard error messages available from the Perl code being called.
That's pretty much it, except for a couple of other details that are a matter of style and practice. I think it's a good idea, for example, to put in some sort of character filter so people can't submit script or code to your site: characters like < and >, or the parentheses '(' and ')' that are used in function calls. More labor intensive code, with a budget, would scramble these characters and unscramble them for display. It would also be interesting to add a few 'if' statements in the Perl code to flag certain key words or phrases and splash different messages for different types of offensive words. The curse word filter is a work in progress.
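Two of these steps can be sketched in a few lines of Perl. Treat this as a rough illustration under stated assumptions, not the page's script. The helper names load_negatives and neutralize_markup are made up. The first reads the comma separated negatives.txt described above. The second is a swapped-in technique: rather than scrambling and unscrambling the characters, it turns them into HTML character references so they display as text.

```perl
use strict;
use warnings;

# Hypothetical helper: read a comma separated list such as
# "dweeb,dweebette,donkey ass" into a list of entries.
sub load_negatives {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;                       # read the whole file in one go
    my $list = <$fh>;
    close $fh;
    $list = '' unless defined $list;
    my @entries;
    for my $entry (split /,/, $list) {
        $entry =~ s/^\s+|\s+$//g;   # trim stray spaces and newlines
        push @entries, $entry if length $entry;
    }
    return @entries;
}

# Hypothetical helper: replace & < > ( ) with HTML character references
# so submitted markup or script is shown as text instead of being run.
sub neutralize_markup {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;', '<' => '&lt;',  '>' => '&gt;',
        '(' => '&#40;', ')' => '&#41;',
    );
    $text =~ s/([&<>()])/$entity{$1}/g;
    return $text;
}

# Assumption: negatives.txt sits next to this script, as described above.
my @negatives = -e 'negatives.txt' ? load_negatives('negatives.txt') : ();
print scalar(@negatives), " entries loaded\n";
print neutralize_markup('<script>alert(1)</script>'), "\n";
```

The ampersand is included so that text which already contains something like &lt; is not confused with converted input.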
There are a few companies around that let you link to their word lists, but I think I would prefer the flexibility of designing my own negative word list, suitable for individual web sites. A basic list would have a lot of obvious words in it, which I set up for download here. However, it might be kind of iffy to email this around as there are government agencies, I hear, monitoring our emails. I don't want to be at the top of that list. Good luck. Call me. Just don't call me anything bad.