[Fsf-india] HTML.Template.java (fwd)
Frederick Noronha
fred@bytesforall.org
Tue, 19 Mar 2002 10:47:50 +0530 (IST)
From Philip S Tellis <philip.tellis@iname.com>
---------- Forwarded message ----------
Hi, I just thought I'd let you know about developments on my
contributions to the OSI.
I've started a new project - HTML.Template.java. This is a templating
system for servlet writers. It is based on the HTML::Template Perl
module written by Sam Tregar. The purpose of the module is to separate
code from design, so that programmers and page designers can work on
their own files. The project aims to be 100% compatible with
HTML::Template and not include any extra features. Ideally, we'd like
to use the same templates for our Perl and Java programs.
You can check out the project homepage at
http://html-tmpl-java.sourceforge.net/
The code is distributed under either the GPL or the Artistic licence.
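To show what separating code from design looks like in practice, here is
a minimal sketch in Python. It is not the HTML.Template.java API; it only
handles the <TMPL_VAR NAME=...> tag from HTML::Template, and the function
name render and the parameter name user are made up for the illustration.
Rendering unknown names as an empty string is a choice made in this
sketch.

```python
import re

# Matches <TMPL_VAR NAME=foo>, with optional quotes around the name.
TMPL_VAR = re.compile(r'<TMPL_VAR\s+NAME\s*=\s*["\']?(\w+)["\']?\s*>', re.IGNORECASE)


def render(template: str, params: dict) -> str:
    """Replace each TMPL_VAR tag with the matching parameter value.

    Names missing from params render as an empty string (sketch choice).
    """
    return TMPL_VAR.sub(lambda m: str(params.get(m.group(1), "")), template)


# The designer owns the template text; the program only supplies values.
template = "<html><body><h1>Hello, <TMPL_VAR NAME=user></h1></body></html>"
page = render(template, {"user": "Philip"})
print(page)
```

The template stays plain HTML that a designer can edit, and the program
never builds markup itself.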
Apart from that project, which I run entirely on my own, I'm also
contributing to the everybuddy project. Everybuddy is an instant
messenger client for MSN, Yahoo, AOL, ICQ, Jabber and more. The new
modular structure means that anyone can write a module for a new
service and plug it in. You don't even have to restart everybuddy for
that.
My contribution involves a lot of work on the currently unmaintained
yahoo code. I've added an option for invisible logins (a must for
people like me who get bombarded with chat requests the moment I appear
online). I've also added code to authorise add-contact requests, and
fixed a few bugs that were present initially and caused eb to either
crash or hang.
You can see the everybuddy homepage at http://www.everybuddy.com/. It
hasn't been updated in a while.
I've also contributed code to the namazu project. Namazu is a search
engine that uses a full text index, i.e., no database is required. This
is ideal in situations where you have a small website and don't need a
database. There's no point installing a database server only for your
search engine. Namazu helps out here. Namazu is a Japanese project
(namazu is the Japanese word for catfish). You can see the namazu
project at http://www.namazu.org/
My contributions include adding code for running the search through
server side includes and passing the index name through the PATH_INFO
variable instead of a query string (this helped in integrating it with
mailman).
These two are officially part of the namazu tree (the second one will
make it into 2.1).
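As an illustration of the PATH_INFO idea, a URL such as
/cgi-bin/namazu.cgi/lists?query=foo (an illustrative URL) carries the
index name "lists" in the path, so the query string is left free for the
search itself. The Python sketch below is not the namazu patch; the
function name index_name, the fallback query parameter "index", the
default "main" and the rule for safe names are all assumptions made for
the example.

```python
import re
from urllib.parse import parse_qs

# Hypothetical rule: index names are plain identifiers, nothing path-like.
SAFE_NAME = re.compile(r"^[A-Za-z0-9_-]+$")


def index_name(environ, default="main"):
    """Pick the index name from PATH_INFO, else from the query string."""
    # PATH_INFO holds whatever follows the script name, e.g. "/lists".
    segments = [s for s in environ.get("PATH_INFO", "").split("/") if s]
    if segments:
        candidate = segments[0]
    else:
        # Fall back to a hypothetical "index" query parameter.
        query = parse_qs(environ.get("QUERY_STRING", ""))
        candidate = query.get("index", [default])[0]
    # Reject anything that could escape the index directory.
    return candidate if SAFE_NAME.match(candidate) else default


print(index_name({"PATH_INFO": "/lists", "QUERY_STRING": "query=foo"}))
```

In a real CGI program the environ argument would be os.environ.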
I've also developed a namazu add-on that handles stopwords and synonyms.
If you've used Google, you'll have seen stop words before. Basically,
there's no point in searching for words like `a, to, the, it', which
occur in almost every document. A stop list is a list of all such
words, and my code eliminates them from the search.
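A minimal Python sketch of stop word removal, not the add-on's actual
code. The stop list holds only the four words mentioned above, and
keeping the original query when every word is a stop word is an
assumption of this sketch.

```python
# Illustrative stop list; a real one would be much longer.
STOP_WORDS = {"a", "to", "the", "it"}


def strip_stop_words(query: str) -> str:
    """Drop stop words from a whitespace-separated query."""
    kept = [w for w in query.split() if w.lower() not in STOP_WORDS]
    # If every word was a stop word, keep the query rather than search for nothing.
    return " ".join(kept) if kept else query


print(strip_stop_words("the fees to the exam"))
```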
The synonym part was developed as part of our work at NCST. While
analysing search patterns, (the search is mainly used by students at
NCST), I noticed three things.
- people entered questions.
  eg: When is the CST exam?
      How much are the fees for G-level?
      etc.
- people can't spell
- people used alternate terms (synonyms :) for words on the pages
  eg: technical assistant instead of technical associate
  or `umbrella words' - that covered a broad range.
  eg: oops for oopj, oopc and oops
      adbms for dbms, rdbms, odbms and adbms
Naturally, the exact match search would return zero results in most
cases. This wasn't the kind of service we wanted to provide (I'll get
to the kind we wanted later).
So I developed the synonyms add-on, which replaces words with their
alternatives.
I have entries like this:
when            /when|da(te|y)|time|sun|mon(th)?|tue|wed|thu|fri|sat|week/
where           /where|place|centre|location/
conducted       /conduct|held/
qualification   /qualif|require|score/
begin           /begin|start/
oops            /oop[cjs]/
adbms           /[a-z]dbms|advance database/
assistantship   /associate/
assistant       /ass(ociate|istant)/
assistanceship  /associate/
cursors         /curses/
collage         /college/
The word on the left is the one to be replaced; the regex on the right
is what replaces it.
So if someone searches for `when', we actually search for when, day,
date, time, sun, mon, tue, ..., week or month.
The others work the same way.
The last two entries are common spelling mistakes specific to our
domain.
For the Linux users group, I have other synonyms, eg:
indianization /(indian|local)i[sz]ation/
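The replacement step can be sketched in Python as follows. This is not
the add-on's code, only an illustration of the idea: the rows are copied
from the table above, the function names load_synonyms and expand_query
are made up, and writing the result as /regex/ simply mirrors the table's
notation.

```python
import re

# A few rows copied from the table above: word, then /regex/.
TABLE = """\
when /when|da(te|y)|time|sun|mon(th)?|tue|wed|thu|fri|sat|week/
where /where|place|centre|location/
oops /oop[cjs]/
collage /college/
"""


def load_synonyms(text):
    """Parse 'word /regex/' lines into a dict of word -> regex source."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first run of whitespace only; the regex may contain spaces.
        word, pattern = line.split(None, 1)
        table[word.lower()] = pattern.strip().strip("/")
    return table


def expand_query(query, synonyms):
    """Replace each query word that has an entry with its /regex/ form."""
    out = []
    for word in query.split():
        pattern = synonyms.get(word.lower())
        out.append(f"/{pattern}/" if pattern else word)
    return " ".join(out)


synonyms = load_synonyms(TABLE)
print(expand_query("where is collage", synonyms))
# The regex for `oops' also matches the page word `oopj'.
print(bool(re.search(synonyms["oops"], "oopj")))
```

Words without an entry, like `is' here, pass through unchanged.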
These two patches are useful to a few people, but the namazu developers
have decided not to include them in the main tree yet. We do, however,
make them available to anyone who wants them under the terms of namazu
itself (GPL).
Our ultimate goal in this kind of a search system is to build a system
(Sandesh) that will answer student queries by email. A student sends in
a standard query to some staff (sometimes they even send it to the
director!). The staff will forward the mail to Sandesh. Sandesh in
turn will analyse the mail, and look for text that looks like a
question or something the student wants to know. It will then generate
a search string, and pass it on to our search engine. The search engine
will return a list of pages, that Sandesh will look at, and return clips
from the page that match the query.
This is good for cases where we have FAQs on a single page (in fact, the
perldoc -q method comes to mind).
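The first two steps of that pipeline (spot the question, then build a
search string) might look something like the Python sketch below. This
is only a guess at the shape of such a system, not Sandesh itself:
treating any sentence that ends in a question mark as a question, the
stop list, and the function names are all assumptions.

```python
import re

# Illustrative stop list (assumption); the real list is not given here.
STOP_WORDS = {"a", "to", "the", "it", "is", "are", "for", "how", "much"}


def find_questions(mail_body):
    """Return sentences that end in a question mark."""
    # Collapse line breaks so a question wrapped across lines stays whole.
    text = " ".join(mail_body.split())
    return [m.group(0).strip() for m in re.finditer(r"[^.?!]*\?", text)]


def search_string(question):
    """Turn one question into a lower-case keyword string."""
    words = re.findall(r"[A-Za-z0-9-]+", question.lower())
    return " ".join(w for w in words if w not in STOP_WORDS)


mail = """Dear Sir,
I am a student. When is the CST exam?
How much are the fees for
G-level? Thanks."""

for question in find_questions(mail):
    print(question, "->", search_string(question))
```

The resulting keyword strings would then be handed to the search engine,
where the synonym handling takes over.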
As far as the official open source projects from NCST go, I guess I
should just go ahead and tell you what they are.
The first one (that we're sure of releasing) is a web based calendar.
It was made primarily for scheduling courses over the web, so students
could check up on their classes. The design was extensible, and we
managed to use it for different options, even sending out birthday
wishes for staff in our department.
We're now planning on releasing it part by part. Each part would be
usable on its own for different projects. The first that will be
released is the Calendar perl module. This is nothing but an
abstraction around the unix cal program. It provides an abstract
object oriented interface to cal for perl programmers. Additionally, it
has hooks to attach text to each date. This will be used in conjunction
with a Schedule class that will populate the Calendar with events.
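To give a feel for the idea, here is a rough Python sketch of a calendar
object with a hook for attaching text to dates. It is not the Calendar
perl module: it swaps in Python's standard calendar module for the unix
cal program, and the class name NotedCalendar, its methods and the
sample event are invented for the illustration.

```python
import calendar
from collections import defaultdict


class NotedCalendar:
    """A month grid, like the output of cal, with text attached to dates."""

    def __init__(self, year, month):
        self.year = year
        self.month = month
        self.notes = defaultdict(list)  # day of month -> list of strings

    def attach(self, day, text):
        """Hook for a scheduler: attach text to one day of the month."""
        last_day = calendar.monthrange(self.year, self.month)[1]
        if not 1 <= day <= last_day:
            raise ValueError(f"day {day} is not in {self.year}-{self.month:02d}")
        self.notes[day].append(text)

    def render(self):
        """Return the month grid followed by one line per attached note."""
        grid = calendar.TextCalendar(calendar.SUNDAY).formatmonth(self.year, self.month)
        lines = [grid.rstrip("\n")]
        for day in sorted(self.notes):
            for text in self.notes[day]:
                lines.append(f"{day:2d}: {text}")
        return "\n".join(lines)


cal = NotedCalendar(2002, 3)
cal.attach(19, "Perl course, 10:00")  # sample event, made up
print(cal.render())
```

A separate schedule class would only need to call attach for each event,
which keeps the calendar usable on its own.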
The only delay right now is in setting up a download server in NCST (or
at least in India) to host these. We had initially planned on
SourceForge, and I even created the account there, but then their terms
of service changed, and we decided not to go ahead. The next option was
Savannah from GNU, but at that point we decided that it would be good to
have a server in India itself.
Well, hope you find this information useful.
Philip
--
One uses power by grasping it lightly. To grasp with too much force is to be
taken over by power, thus becoming its victim.
-Bene Gesserit Axiom
Visit my webpage at http://www.ncst.ernet.in/~philip/
Read my writings at http://www.ncst.ernet.in/~philip/writings/
MSN philiptellis Yahoo! philiptellis
AIM philiptellis ICQ 129711328