Posts

Workshop on Introduction to Computational Phonetics, Phonology and Prosody

23 – 29 December, 2012. Department of Computer Science & Engineering, Tezpur University. Objective of the Workshop: In this workshop, leading figures from academia will introduce participants to the basic concepts and the state of the art in speech processing and transcription methods, particularly in the Indian language context. Emphasis will be placed on hands-on practice in NE language transcription. Target participants: Teachers, researchers and students of Linguistics, English/Assamese/Bodo with specialization in linguistics, Computer Science, and Electronics (speech processing) from colleges and universities. The workshop will also be open to special interest groups. Prior requirement: Familiarity with grammar and linguistics. Resource persons: From IIIT-Hyderabad, IIT-Guwahati and Tezpur University. Registration Fees: 200.00 for students and 500.00 for

Computational Linguistic works on Assamese, published till 8-October-2012

The research papers listed here were presented or published over the past decade (2002-2012) at various seminars, workshops and conferences, and in journals. They have represented the Assamese language at the state, national and international levels and have put Assamese on the map of computational linguistics. All of this work has come from only three laboratories: the language processing centre at Tezpur University, RCILTS at IIT Guwahati, and the Department of Computer Science at Gauhati University. In addition, central government language processing labs such as CDAC, CIIL and LDC-IL have done various kinds of work on Assamese, but most of it has not been made public, so the number of publicly available works is quite small.

Analysis and evaluation of stemming algorithms: a case study with Assamese; N. Saharia, U. Sharma and J.K. Kalita; in Proceedings of the International Conference on Advances in Computing, Communications and Informatics (ICACCI), Chennai, 2012. Dynamic segmentation of vocal extract for Assamese Speech to Text Conversion using RNN; K.

Part of Speech tagging of Assamese

Part of Speech (POS) tagging is the process of marking up the words and punctuation characters in a text with appropriate POS labels. POS tagging faces many problems. Many words that occur in natural language texts are not listed in any catalog or lexicon, and a large percentage of words are ambiguous with respect to lexical category. The challenges of our work on POS tagging for Assamese, an Indo-European language, are compounded by the fact that very little prior computational linguistic work exists for the language, even though it is a national language of India spoken by over 30 million people. Assamese is a morphologically rich, inflectional language with free word order. Although POS-tagged corpora for some Indian languages such as Hindi, Bengali, and Telugu have recently become available, no POS-tagged corpus for Assamese was available until we started creating one for the work presented here. Another problem was that a clearly defined POS tagset for Assamese was un
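The two problems named above, words missing from the lexicon and words with more than one possible category, can be illustrated with a most-frequent-tag baseline. This is only a minimal sketch and not the method used in the work described here: the function names, the tag labels, and the tiny English toy corpus are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy training data (English, for illustration only).
TOY_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("chases", "VERB"),
     ("the", "DET"), ("cat", "NOUN"), (".", "PUNCT")],
    [("the", "DET"), ("run", "NOUN"), ("ends", "VERB"), (".", "PUNCT")],
    [("dogs", "NOUN"), ("run", "VERB"), (".", "PUNCT")],
    [("cats", "NOUN"), ("run", "VERB"), (".", "PUNCT")],
]


def train_baseline_tagger(tagged_sentences):
    """Build a word -> most frequent tag lexicon plus a fallback tag."""
    word_tag_counts = defaultdict(Counter)
    tag_counts = Counter()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            word_tag_counts[word][tag] += 1
            tag_counts[tag] += 1
    # Ambiguous words are collapsed to the tag seen most often in training.
    lexicon = {
        word: counts.most_common(1)[0][0]
        for word, counts in word_tag_counts.items()
    }
    # Words absent from the lexicon get the most frequent tag overall.
    default_tag = tag_counts.most_common(1)[0][0]
    return lexicon, default_tag


def tag_words(words, lexicon, default_tag):
    """Tag each word from the lexicon, falling back to the default tag."""
    return [(word, lexicon.get(word, default_tag)) for word in words]


lexicon, default_tag = train_baseline_tagger(TOY_CORPUS)
tagged = tag_words(["the", "run", "zebra", "."], lexicon, default_tag)
print(tagged)
```

In this toy run, "run" is tagged VERB even though it follows "the", because the baseline ignores context, and the unseen word "zebra" simply receives the fallback tag. Both weaknesses grow in a morphologically rich language, where a single stem surfaces in many inflected forms that a small lexicon will not contain.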