DESIGN AND IMPLEMENTATION OF PAY AS YOU GO CALLING SYSTEM

in COMPUTER ENGINEERING PROJECT TOPICS AND MATERIALS, COMPUTER SCIENCE EDUCATION PROJECT TOPICS, COMPUTER SCIENCES PROJECT TOPICS AND MATERIALS on February 15, 2021

CHAPTER ONE

INTRODUCTION

1.1 Background of the Study

A text-to-speech (TTS) system automatically converts text into speech that resembles, as closely as possible, a native speaker of the language reading that text. A TTS synthesizer is the technology that lets a computer speak to the user. The system takes text as input; a computer algorithm called the TTS engine then analyses and pre-processes the text and synthesizes the speech using mathematical models. The TTS engine usually generates sound data in an audio format as its output (Dutoit, 2013).

The text-to-speech (TTS) synthesis procedure consists of two main phases. The first is text analysis, where the input text is transcribed into a phonetic or some other linguistic representation. The second is the generation of speech waveforms, where the output is produced from this phonetic and prosodic information. These two phases are usually called high-level and low-level synthesis (Suendermann & Black, 2010). A simplified version of this procedure is presented in Figure 1 below. The input text might be, for example, data from a word processor, standard ASCII text from e-mail, a mobile text message, or scanned text from a newspaper. The character string is then pre-processed and analysed into a phonetic representation, which is usually a string of phonemes with some additional information for correct intonation, duration, and stress. Speech sound is finally generated by the low-level synthesizer using the information from the high-level one. The artificial production of speech-like sounds has a long history, with documented mechanical attempts dating to the eighteenth century (Allen & Klatt, 2017).
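The two phases described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the system built in this project: the lexicon, the phoneme symbols, the function names, and the tone-per-phoneme "waveform" are all hypothetical stand-ins, and prosody (intonation, duration, stress) is left out entirely.

```python
import math
import re

# Hypothetical toy lexicon: word -> phoneme symbols (illustrative only).
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}


def text_analysis(text):
    """High-level synthesis: normalise the text and map words to phonemes."""
    words = re.findall(r"[a-z']+", text.lower())
    phonemes = []
    for word in words:
        # Assumption: unknown words are simply spelled out letter by letter.
        phonemes.extend(LEXICON.get(word, list(word.upper())))
        phonemes.append("_")  # word-boundary marker, rendered as a pause
    return phonemes


def generate_waveform(phonemes, sample_rate=8000, unit_ms=80):
    """Low-level synthesis placeholder: one short tone burst per phoneme."""
    n = sample_rate * unit_ms // 1000  # samples per phoneme
    samples = []
    for ph in phonemes:
        if ph == "_":
            samples.extend([0.0] * n)  # silence at word boundaries
            continue
        # Arbitrary, deterministic pitch per symbol; real systems use
        # recorded or modelled speech here, not sine tones.
        freq = 200 + sum(map(ord, ph)) % 400
        samples.extend(
            math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)
        )
    return samples


def synthesize(text):
    """Run both phases: text -> phonemes -> audio samples."""
    return generate_waveform(text_analysis(text))
```

Calling `synthesize("Hello, world!")` returns a list of floating-point samples that could be written to an audio file. The point of the sketch is the separation of concerns: everything language-specific lives in `text_analysis`, while `generate_waveform` only sees the phonetic string.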

Speech synthesis is a field of computer science that deals with designing computer systems that convert written text into speech. The technology allows a computer to deliver a written text as speech through a loudspeaker or telephone (Allen & Klatt, 2017). Because speech technology is still emerging, not all developers are familiar with it. While the basic functions of both speech synthesis and speech recognition take only minutes to understand, computerized speech provides subtle and powerful capabilities that developers will want to understand and utilize (Rubin & Baer, 2011).

Automatic speech synthesis is one of the fastest-developing fields in speech science and engineering. As part of the new generation of computing technology, it is the next major innovation in man-machine interaction after automatic speech recognition (ASR), and it supports Interactive Voice Response (IVR) systems.

The basic idea of text-to-speech (TTS) technology is to convert written input to spoken output by generating synthetic speech. There are several ways of performing speech synthesis:

  1. Simple voice recording and playing on demand;
  2. Splitting speech into 30-50 phonemes (basic linguistic units) and re-assembling them into a fluent speech pattern;
  3. Using approximately 400 diphones (units obtained by splitting speech at the centre of the phonemes rather than at the transitions between them).
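The third method in the list above can be illustrated with a short Python sketch. It assumes a phoneme sequence is already available and that an inventory of recorded units exists; the function names, the `_` silence marker, the `A-B` naming scheme, and the inventory layout are all hypothetical choices made for this illustration.

```python
def to_diphones(phonemes, boundary="_"):
    """Turn a phoneme sequence into diphone names.

    Each unit runs from the centre of one phoneme to the centre of the
    next, so the transition between the two sounds stays inside the unit.
    The sequence is padded with a silence marker at both ends.
    """
    padded = [boundary, *phonemes, boundary]
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


def concatenate(diphones, inventory):
    """Join the recorded samples of each diphone into one signal.

    `inventory` is a hypothetical dict mapping a diphone name to a list
    of audio samples; a real system would also smooth the joins.
    """
    samples = []
    for name in diphones:
        if name not in inventory:
            raise KeyError(f"no recording for diphone {name!r}")
        samples.extend(inventory[name])
    return samples
```

For the phoneme sequence `["HH", "AH", "L", "OW"]`, `to_diphones` yields five units: `_-HH`, `HH-AH`, `AH-L`, `L-OW`, and `OW-_`. Because the cuts fall in the relatively stable middle of each phoneme, the joins between units tend to be less audible than when whole phonemes are glued together at their transitions.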

The most important qualities of a modern speech synthesis system are its naturalness and intelligibility. Naturalness is how closely the synthesized speech resembles real human speech. Intelligibility, on the other hand, describes the ease with which the speech is understood. Maximizing these two criteria is the main development goal in the TTS field (Suendermann & Black, 2010).

1.2 Statement of the Problem

A text-to-speech (TTS) system is designed to improve the efficiency and navigation quality of a computer system by exploiting different types of technologies.

The problems this study addresses are as follows:

  1. Designing a user-friendly interface for the blind was challenging because the researcher had no prior knowledge of XML or the Java programming language.
  2. Implementing an isolated-word speech synthesizer capable of converting text and responding with speech was difficult because the researcher had limited access to the hardware and software needed for the program.

These were the major problems encountered in this study, and the researcher did her best to address them.

1.3 Aim and Objectives of the Study

The aim of the project is to design and implement a text-to-speech audio system. The specific objectives are:

  1. To design a user-friendly interface for the blind and the elderly.
  2. To implement an isolated whole-word speech synthesizer that is capable of converting text and responding with speech.
  3. To validate the automatic speech synthesizer developed during the study.

1.4 Significance of the Study

The text-to-speech audio system will be very useful to any researcher who wishes to work on designing and implementing a text-to-speech audio system.

This text-to-speech synthesizing system will enable semi-literate people to access and read through electronic documents, thus bridging the digital divide. The technology will also be applicable in areas such as banking, telecommunications (automatic system voice output), transport, Internet portals, PC access, e-mail, administrative and public services, cultural centres, and many others. The system will be very useful to individual users and also to the government. It will make it easier to communicate with the blind and the elderly, and it will also be useful to those who cannot write but can speak.
