Multi Level Predictive Language Model
The topic is available also in Hungarian.
Simply speaking, the goal of grammar learning is to learn the right order of words in a sentence. This assumption is the fundamental statement of modern, predictive language models, e.g., USE, BERT, InferSent, GPT-3. Such models are typically trained by involving transfer learning and also provide high quality language models. For more information, see: https://huggingface.co/
However, the mentioned techniques are typically unimodal regarding the atomic units of the language. Most models involve word based representation. However, some approaches are based on characters and hyphens. The goal of this project is to broaden this scope and involve different modalities, e.g., characters, hyphens, words, compound words, sentences, POS tags, etc. The goal is to learn how to combine the mentioned modalities. The student will work on a novel architecture of artificial neural networks and will compare its performance to state of the art techniques.