Programme system of word stem forming from words in Hungarian texts

Authors

  • Zoltán Balogh

Abstract

Because Hungarian uses prefixes and suffixes in a distinctive way, text analysis methods developed for English cannot be applied to it directly. When analyzing the word stock of information storage and retrieval systems, or textual information stored in data banks, it is frequently necessary to trace word forms back to their stems; to recognize and separate the glides (linking sounds) that often precede suffixes; to remove suffixes and prefixes; to recognize parts of speech; and to match the resulting forms against dictionary entries.
The programme system solves these problems for Hungarian texts. It relies on the following aids:
  • tables of suffixes and prefixes;
  • a so-called null-dictionary listing the form (function) words of the texts;
  • a dictionary of content-carrying word stems.
All of these dictionaries can be enlarged automatically as a result of the processing; human control, however, remains indispensable. Further problems are the assimilation of suffixes and the handling of word stem elisions and phonetic changes.
The programme system is written in PL/1 (F) and runs on an IBM 360 under OS.
The structure of the system and the results of an experimental processing run are described.
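
The dictionary-driven stripping described in the abstract can be illustrated with a minimal sketch. It is not the author's PL/1 implementation: the tiny tables, the names (SUFFIXES, PREFIXES, LINKING_VOWELS, NULL_DICTIONARY, STEM_DICTIONARY, stem) and the lookup order are all assumptions chosen for illustration, and suffix assimilation, stem elision and phonetic changes are deliberately left out.

```python
from typing import Iterator, Optional

# Hypothetical, deliberately tiny tables; a real system would need far larger ones.
SUFFIXES = ("ban", "ben", "nak", "nek", "k", "t")
PREFIXES = ("meg", "el", "ki", "be")
LINKING_VOWELS = ("a", "e", "o", "ö")          # glides that may precede a suffix
NULL_DICTIONARY = {"a", "az", "és", "hogy", "is"}   # form (function) words
STEM_DICTIONARY = {"ház", "könyv", "ír", "olvas"}   # content-carrying stems


def candidates(form: str) -> Iterator[str]:
    """Yield the form itself, then suffix-stripped variants, then variants
    with a trailing linking vowel removed as well."""
    yield form
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if form.endswith(suffix) and len(form) > len(suffix):
            base = form[: -len(suffix)]
            yield base
            if len(base) > 1 and base[-1] in LINKING_VOWELS:
                yield base[:-1]


def stem(word: str) -> Optional[str]:
    """Return None for form words, a known stem when one is found,
    or the lowercased word unchanged when nothing in the tables matches."""
    word = word.lower()
    if word in NULL_DICTIONARY:
        return None
    # Try the word as written first, then with each matching prefix removed.
    forms = [word] + [
        word[len(p):] for p in PREFIXES
        if word.startswith(p) and len(word) > len(p)
    ]
    for form in forms:
        for candidate in candidates(form):
            if candidate in STEM_DICTIONARY:
                return candidate
    # Unknown word: in the spirit of the abstract, a candidate for human review.
    return word


if __name__ == "__main__":
    for w in ("házak", "könyvek", "házban", "megírt", "az"):
        print(w, "->", stem(w))
```

Accepting a candidate only when it appears in the stem dictionary is one simple way to avoid over-stripping; words that fall through unchanged are the ones a human reviewer would inspect before any dictionary is enlarged.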

Published

2019-01-17

How to Cite

Balogh, Z. Programme system of word stem forming from words in Hungarian texts, Scientific and Technical Information, 21(7), p. 487–494, 2019.

Section

Articles