Saturday, 5 January 2013

As decided at the last session, the next meeting was held on 27 November 2012. All the members of the team arrived in good time, and we started the day's discussion.

As the next step was stemming, David and Anupreet came with the required research on it, and both provided code. The code was then implemented in the presence of all five members, with edits made after listening to everyone's views. Stemming ultimately showed good results, and we ended up with one properly working implementation of it. Everyone was happy with this portion of the system.

At the end of the meeting it was decided that the next meeting would cover the generation of feature vectors. The members allocated to that task agreed to research feature vectors beforehand. With that, the meeting came to an end and we left.

Members Present: Alexandru Palade, David Wijaya, Anupreet Kaur, Ste Brown, Andrew Hill.



Thursday, 3 January 2013

During the meeting held on 23 November 2012 we first reviewed the comments made by Jeremy Ellman on the progress we had made so far. We were told that in the coming week we would be given a lecture to advise us on breaking the work down into individual tasks.

Prior to this we had each successfully added the removal of stop words to the program. We presented this to the module tutor, who was satisfied with the output, so we decided that the next step would be to research how the text could be cleaned even further.

After an extensive amount of research we discovered that Python offers many ready-made functions that allow the text to be cleaned to a higher standard. We decided that the next step to move the project forward would be to divide the project into sections and allocate tasks to each member according to their expertise. The first task was to clean the text; this had already been researched and only required implementing, and it was allocated to Ste. The next step was stemming the text, which was given to David and Anupreet. Feature vectors were the next task, given to Ste. Text similarity came next and was given to Andy, Alex and Ste. The final section was clustering, which Andy, Alex and Anupreet took on.
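To make the stages above more concrete, here is a minimal sketch of how cleaning, stop-word removal, feature vectors and text similarity could fit together in Python. It is not the team's actual code: the function names, the tiny stop-word list and the choice of word counts with cosine similarity are all assumptions made for illustration, and stemming and clustering are left out.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "on", "in", "to"}


def clean_text(text):
    """Lowercase the text and return its runs of letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [token for token in tokens if token not in STOP_WORDS]


def feature_vector(text):
    """Build a simple word-count vector from cleaned, filtered tokens."""
    return Counter(remove_stop_words(clean_text(text)))


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two word-count vectors."""
    dot = sum(count * vec_b[term] for term, count in vec_a.items())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    first = feature_vector("The cat sat on the mat.")
    second = feature_vector("A cat is on the mat!")
    print(cosine_similarity(first, second))
```

A stemming step, if added, would most naturally sit between stop-word removal and the counting in feature_vector, so that different forms of the same word share one count.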

Attendees: Stephen Brown, Andrew Hill, David Wijaya, Anupreet Kaur, Alexandru Palade