JavaNLP meeting notes for 12/05/02

This was the last meeting for the quarter. The plan for the next couple of months is to work on improving the quality and usefulness of what's currently in the repository, rather than starting any major new projects. Please continue to look through the code you're responsible for and determine whether it belongs in the repository and/or is sufficiently well written and documented that others might benefit from it. We will pick a date/time/place for the next meeting after we return from winter break.

Remember that after you've committed stable new code that you want others to be able to see and use, you should release new javadocs to the website and create a new official javanlp.jar file. Use the scripts /u/nlp/java/install-javanlp-javadoc.sh and /u/nlp/release-javanlp-jar.sh respectively. The javadocs show up on the website (accessible from the main javanlp pages) and the javanlp.jar files are put in /u/nlp/java/release, with /u/nlp/java/release/javanlp.jar always pointing to the latest one. Use this jarfile when you're working on an outside application that makes use of the JavaNLP code.

Summary of progress since last meeting:

* ant is now installed on the nlp machines. It is a far superior alternative to the makefiles, since it's faster and smarter. Just run bin/setup.csh and then "ant" to compile stuff, "ant all" to clean and then compile, or "ant javadoc" to get local javadocs.
* The Porter stemmer in the process package now uses the new system and works correctly.
* Various improvements to the tagger and parser packages
* Classify API continues to come along
  - classify package now has the new stuff, old stuff is removed or moved to old dir
  - we now have a few IClassifiers that can be trained/tested/saved
* New classes in process for turning a document into a list of sentences and POS-tagging those sentences

Tasks for next week(s):

Everyone:
- add @author tags to the package and class docs you're in charge of (so people know whom to contact with questions/problems/etc)
- take stock of your code in the repository and remove/move old stuff and clean up existing stuff

Dan:
- external classify API (see the todo list we made)
- more parser stuff
- Naive Bayes IClassifier implementation

Sep:
- new Document for 20 news groups
- try to use classify API to do textcat on 20 news groups docs

Huy:
- HMM experiments

Kristina:
- new IClassifiers? (decision tree, SVM, maximum entropy, naive bayes, etc?)

Chris:
- talk to Tim about PTBTokenizer and make sure you have the best of both your changes in the official release

Thanks for a great quarter. I think the progress we've made is self-evident and you should all be proud of your contribution. See you in '03.

Thanks,
js