FilipinoStanfordPOSTagger

Open source, 12 stars, 2 forks, 1 issue, 1 watcher, last commit 6 years ago

About FilipinoStanfordPOSTagger

FilipinoStanfordPOSTagger is a part-of-speech tagging tool for labeling Filipino sentences. It builds on the Stanford Part-of-Speech Tagger and supplies a tagger model trained specifically for Filipino. The model was trained with the owlqn2 optimization and the feature sets left5words, distsim, prefix(6), and prefix(2,1), and it uses the pipe symbol ('|') as its delimiter. The tagset used by the model is listed in the project's documentation. The tagger may also be used in other languages, as noted on the Stanford POS Tagger homepage. For usage instructions, refer to the tagger's Java documentation. This tool serves researchers, linguists, and developers working

Platforms

Web, Self-hosted

Links

Using the Stanford part-of-speech tagger, we developed a Filipino tagger model for tagging Filipino sentences. See the PACLIC 31 paper: https://paclic31.national-u.edu.ph/wp-content/uploads/2017/11/PACLIC_31_paper_5.pdf

The Stanford POS Tagger is available at: https://nlp.stanford.edu/software/tagger.shtml

The Filipino tagger model was trained with the owlqn2 optimization and the following feature sets: left5words, distsim, prefix(6), and prefix(2,1). The delimiter is the '|' symbol.
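
The training settings above can be sketched as a set of Java properties. This is a minimal illustration, not the authors' actual configuration file: the property keys `arch`, `search`, and `tagSeparator` are the names the Stanford tagger uses in its properties files, but mapping the README's "delimiter" to `tagSeparator` is an assumption. In real Stanford configurations `distsim` takes arguments (such as a cluster file); the README does not give them, so they are left out here. The class name `FilipinoTaggerProps` is hypothetical.

```java
import java.util.Properties;

class FilipinoTaggerProps {
    /** Builds properties mirroring the training setup described in the README. */
    static Properties build() {
        Properties p = new Properties();
        // Feature sets exactly as listed in the README; distsim's arguments
        // are not given there, so none are supplied here.
        p.setProperty("arch", "left5words,distsim,prefix(6),prefix(2,1)");
        // Optimization method named in the README.
        p.setProperty("search", "owlqn2");
        // Assumption: the README's "delimiter" corresponds to tagSeparator.
        p.setProperty("tagSeparator", "|");
        return p;
    }

    public static void main(String[] args) {
        // Print each setting in "key = value" form.
        build().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

A complete training configuration would also need entries such as the model output path and the training file, which the README does not specify.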

The tagset (list of tags) used by the Filipino tagger model can be accessed here: http://goo.gl/dY0qFe

For instructions on how to use the Stanford POS Tagger and the Filipino tagger model, read the tagger's Java documentation. The tagger may also be used in other languages, as shown on the Stanford POS Tagger homepage linked above.
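
Once the tagger has been run as described in its Java documentation, its output can be post-processed with plain Java. The sketch below assumes that output is a line of whitespace-separated tokens, each a word joined to its tag by '|'. That format is an assumption based on the delimiter mentioned above, not something the README states. The class name `TaggedOutputParser` and the tags `TAG_A` and `TAG_B` are placeholders, not entries from the real tagset.

```java
import java.util.ArrayList;
import java.util.List;

class TaggedOutputParser {
    /** Splits one tagged line into {word, tag} pairs, using the last '|' in each token. */
    static List<String[]> parse(String taggedLine) {
        List<String[]> pairs = new ArrayList<>();
        for (String token : taggedLine.trim().split("\\s+")) {
            int sep = token.lastIndexOf('|');
            if (sep <= 0) {
                continue; // skip tokens with no separator or no word before it
            }
            pairs.add(new String[] { token.substring(0, sep), token.substring(sep + 1) });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // TAG_A and TAG_B are placeholders, not tags from the real tagset.
        for (String[] pair : parse("salita|TAG_A isa|TAG_B")) {
            System.out.println(pair[0] + "\t" + pair[1]);
        }
    }
}
```

Splitting on the last '|' rather than the first keeps a word intact if it happens to contain the delimiter itself.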