AI Innovation

Named Entity Recognition from Online News

May 27, 2018

This is a project from the Natural Language Processing course in my Masters in Data Science program. The project aimed to create a series of models for the extraction of Named Entities (People, Locations, Organizations, Dates) from news headlines obtained online. We created two models: a traditional Natural Processing Language Model using Maximum Entropy , and a Deep Neural Network Model using pre-trained word embeddings. Accuracy results of both models show similar performance, but the requirements and limitations of both models are different and can help determine what type of model is best suited for each specific use case.

The final conclusion is that, as the Deep Learning Model is less dependent on specific language grammar rules, it is more generalizable (given embeddings and some labeled corpora is provided in any language) whereas the Maximum Entropy model will perform poorly on an language where there is no Domain Knowledge to create the required features.

Project Resources

Final Presentation

Final Report

View Source Code

All source code for this project is available on GitHub.

GitHub Repository
Back to All Articles