Posts

Showing posts from November, 2009

BWT for NLP (2)

I show how the Burrows-Wheeler Transform can be used to compute the similarity between two strings. We submitted results from this method (along with results from the Context-Chain metric developed by my colleagues Frank Schilder and Ravi Kondadadi) for the Automatically Evaluating the Summaries of Peers (AESOP) task of the TAC 2009 conference. The task was to produce an automatic metric for evaluating machine-generated summaries (i.e., system summaries) against human-generated summaries for the TAC '09 Update Summarization Task. Clearly, the automatic metric is just a function that produces a similarity score between the system summary and the human-generated (the so-called model) summary. The proposed metrics were evaluated by comparing their rankings of the system summaries from different peers with the ranking produced by human judges.

Similarity Metric

We use an estimate of the conditional "compressibility" of the model summary given the system summary as the