Thursday, 3 January 2013

During the meeting held on 23rd November 2012, we first reviewed the comments made by Jeremy Ellman on the progress we had made so far. We were told that we would be given a lecture in the coming week to advise us on breaking the work down into individual tasks.

Prior to this, we had each successfully added the removal of stop words to the program. We presented this to the module tutor, who was satisfied with the output, so we decided that the next step would be to research how the text could be cleaned even further.

After an extensive amount of research, we discovered that there are many ready-made functions within Python that allow the user to clean the text to a higher standard. We decided that the next step to move the project forward would be to divide the project into sections and allocate tasks to each member according to their expertise. The first task we decided on was to clean the text; this had already been researched and just required implementing, so Ste was allocated this. The next step was the stemming of the text, which David and Anupreet were given. Feature vectors were the next task, which Ste was given. Text similarity was decided as the next task, and Andy, Alex and Ste were given this. The final section was clustering, which Andy, Alex and Anupreet took on.
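
To make the allocated stages more concrete, here is a minimal sketch of how cleaning, stemming, feature vectors and text similarity might fit together. It is not the group's actual code. The stop-word list, the crude suffix-stripping stemmer and all function names are assumptions made for illustration, and it uses only the Python standard library. Clustering is left out.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "on"}


def clean(text):
    """Lowercase the text, keep alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    """Very crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        # Only strip if at least three characters remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def feature_vector(text):
    """Bag-of-words term counts over cleaned, stemmed tokens."""
    return Counter(stem(t) for t in clean(text))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    doc1 = feature_vector("The cats jumped on the mats")
    doc2 = feature_vector("A cat is jumping")
    print(cosine_similarity(doc1, doc2))
```

Each stage is kept as a separate function, mirroring the way the tasks were divided among members, so any one stage could be swapped for a library implementation without touching the others.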

Attendees: Stephen Brown, Andrew Hill, David Wajiya, Anupreet Kaur, Alexandru Palade 

