The Hebrew language poses special challenges to developers of natural language processing systems: its script, its right-to-left direction of writing, its rich morphology, its root-and-pattern word-formation system, and the lack of databases containing comprehensive archives of how the language is used. Together, these issues make Hebrew processing a difficult and challenging task.
There are many examples of natural language processing applications, including automatic translation from one natural language to another; conversion of speech into text and vice versa; natural language user interfaces for computerized systems; automatic summarization of documents; and spelling and style checking. The vast majority of these applications require a solid infrastructure of developments based on linguistic knowledge, such as corpora of text and speech that document how people use the language, computerized lexicons and dictionaries, morphological and syntactic analyzers, disambiguation programs, part-of-speech taggers, word sense disambiguators, semantic networks and so on.
Such developments cannot be driven by the needs of the software industry, because the market for software that requires Hebrew processing is limited. So far, no proper computational infrastructure for Hebrew has been built that addresses fundamental issues such as those described above and is available to the research community without restrictions. As a result, more complicated systems, such as automatic translation systems, cannot be built.
The activity of this knowledge center focuses on developing infrastructure for Hebrew language processing and on applied research in the field, in order to create the theoretical and practical foundations required for advanced human-computer interfaces. All tools and resources created in the center will be freely available to the research community as open source products via the center's website. The collaboration among a large number of researchers will make it possible to define standards and formats for data representation, so that the results of one research project can be used directly in others. Research students of the center's principal researchers will also participate in the development. Furthermore, the resources developed in the center will be available to academic institutions for teaching purposes. In this way, the center will also serve as a catalyst for the development of up-to-date courses that will train additional computational linguistics practitioners in Israel.