ASR Gooyesh Pardaz Products
 
Statistical Models for Persian
Overview

 

The statistical language models for Persian come in three types: unigram (monogram), bigram, and trigram. They are extracted from a Persian text corpus of about 10 million words. The unigram model gives the number of occurrences of each word or POS tag in the corpus. The bigram model gives the number of occurrences of every pair of adjacent words, POS tags, or classes. The trigram model gives the number of occurrences of every triple of adjacent words, POS tags, or classes. These statistics are therefore available for sequences of words (word-based n-gram), POS tags (POS-based n-gram), and classes (class-based n-gram). The Persian text corpus is still under development, and the extracted statistics are updated as it grows. Further statistical models can be extracted from the corpus for other systems that use language models. The models can be used in speech recognition systems, intelligent typing systems, or OCR systems.
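
The counting described above can be illustrated with a minimal Python sketch. It is not the product's actual extraction tool: the function name count_ngrams and the toy transliterated token list are assumptions made only for illustration, and the real corpus is far larger.

```python
from collections import Counter


def count_ngrams(tokens, n):
    """Count every run of n adjacent items in a token sequence."""
    return Counter(
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


# Hypothetical toy input (transliterated); stands in for a corpus of words.
tokens = ["man", "ketab", "ra", "khandam", "va", "man", "ketab", "ra", "didam"]

unigrams = count_ngrams(tokens, 1)  # occurrences of each single word
bigrams = count_ngrams(tokens, 2)   # occurrences of each adjacent pair
trigrams = count_ngrams(tokens, 3)  # occurrences of each adjacent triple

# Print the most frequent entry of each model.
for name, counts in (("unigram", unigrams), ("bigram", bigrams), ("trigram", trigrams)):
    print(name, counts.most_common(1))
```

The same function would produce a POS-based or class-based model if the input list held POS tags or class labels instead of words, since it only looks at adjacent items in the sequence.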
