I originally intended to blog a lot about my Masters here. I haven't, partially due to discouragement from my department. I understand the concern about leaking results, but I feel like I've missed out on a lot by not being able to communicate more about what I've learned and my work. So, I'm just going to write about it anyway, though I won't get too specific.
To start, I have a prototype system that tests my hypothesis, and it mostly works. It's not ideal, and I like to tinker with it to try to make it better. Here are some concerns right now:
- efficiency; computational efficiency isn't key to my work, but it helps me test the accuracy of the algorithm. Waiting 15-30 minutes to get a result and then analysing it to see where any problems may lie is a lot less productive than getting a result in one minute. So, I'll work on that.
- background knowledge; I need to understand two domains: linguistics and natural language processing. In my first year, I acquainted myself with a lot of linguistics, but lately I've only paid attention to computation. I want to expand my understanding of linguistics in the hope that it will serve my Masters better.
- visualisation; it's been hard to explain clearly to other people exactly how the algorithm operates if they're not already deep into computer science. I'd like it to visualise its progress as it runs so people can see its behaviour.
- statefulness; right now each run of the system starts fresh, and a grammar must be retrained from scratch. What I would prefer is a built grammar that persists between runs and that I can continue to add to. I've used some of my GXml work to help with this a little, but I need to do more.
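
To make the last point a bit more concrete without giving anything away, here is a minimal sketch of what "a grammar that persists between runs" could look like. It is not my actual system: the grammar here is just a count of rule strings, the storage is JSON rather than XML, and the names `load_grammar`, `save_grammar` and `train` are made up for illustration.

```python
import json
from collections import Counter
from pathlib import Path


def load_grammar(path):
    """Return the saved rule counts, or an empty grammar if nothing was saved yet."""
    p = Path(path)
    if not p.exists():
        return Counter()
    with p.open(encoding="utf-8") as f:
        return Counter(json.load(f))


def save_grammar(grammar, path):
    """Write the rule counts to disk so the next run can pick them up."""
    with Path(path).open("w", encoding="utf-8") as f:
        json.dump(grammar, f, indent=2, sort_keys=True)


def train(grammar, rules):
    """Add newly observed rules to the grammar in place (hypothetical training step)."""
    grammar.update(rules)
    return grammar
```

Each run would then call `load_grammar`, train on whatever new data it has, and call `save_grammar` before exiting, so the grammar grows over time instead of being rebuilt from scratch.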