gocr ep0461127-001
Europäisches Patentamt
European Patent Office
Office européen des brevets

Publication number: 0 461 127 B1

EUROPEAN PATENT SPECIFICATION

Date of publication of patent specification: 21.09.9_ Bulletin 9_/39
Int. Cl.6: __O_ _/00, _06F __/28
Application number: 90903019.3
Date of filing: 26.0_.90
International application number: PCT/US90/00389
International publication number: WO 90/09020 09.08.90 Gazette 90/_9

INTERACTIVE LANGUAGE LEARNING SYSTEM.

Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid (Art. 99(1) European patent convention).

Jouve, 18, rue Saint-Denis, 75001 PARIS
Description

FIELD OF THE INVENTION

The invention relates to computer systems with speech capabilities. More particularly, the invention pertains to a computerized interactive language learning system which provides visual text displays and associated digitized audio speech.

BACKGROUND AND SUMMARY OF THE INVENTION

As communications and high speed transportation continue to make our world seem smaller, knowing a second language becomes more important and valuable. Unfortunately, traditional language instruction in the classroom by itself generally does not, due to time constraints, sufficiently immerse the student in the second language he or she is studying to ensure rapid learning.

While written materials (e.g., textbooks, workbooks, and the like) provide some opportunity for the student to study by himself, written materials cannot effectively assist the student in pronunciation and other aural aspects of language learning. Although some written language study materials are accompanied by prerecorded audio tapes or records allowing the student to listen to the language being spoken, even these prerecorded audio materials have the disadvantage that they cannot provide the student with feedback about his or her pronunciation. In the past, the only way to obtain effective spoken language drills and practice outside of the classroom environment was to hire a language tutor (an expensive proposition) or to spend time with someone who was already fluent in the unfamiliar language.

The concept of using computer hardware/software to provide synthesized or digitized spoken language is generally known. The following is a somewhat representative (but by no means exhaustive) listing of prior publications, prior issued U.S. patents, and published software packages relating to computer-assisted language learning with speech capabilities:

U.S. Patent No. 4,579,533 to Anderson et al;
U.S. Patent No. 4,591,929 to Newsom;
U.S. Patent No. 4,749,353 to Breedlove;
U.S. Patent No. 4,695,962 to Goudie;
U.S. Patent No. 4,710,877 to Ahmed;
Brower, "Word Torture Eases Pain Of Language Learning", 2 MacWEEK n.48, p. 14 (29 Nov. 1988);
Parham, "Computers That Talk", 8 Classroom Computer Learning n. 6, pp. 26-36, 63 (March 1988);
_ack, "Worte & Satze: A German Tutor For Kids Or Adults", 2 Color Computer Magazine n.3, p. 20 (May 1984);
Barbour, "Computerized Speech: Talking Its Way Into The Classroom", 6 Electronic Learning, n.4, p. 15 (Jan 1987);
PEAL SOFTWARE (Los Angeles, California), "Representational Play", "Keytalk", and "Exploratory Play" software packages;
"E Z Pilot II Authoring System" software by Hartley Courseware, Inc., Dimondale, Michigan;
"Smoothtalker Version 2.0" software by First Byte Inc.;
"Experlogo-Talker/Prologo" software by Experintelligence, Inc.;
"Voice Master Version 4.0" system by Covox Inc.;
"Basic Language Series -- Spatial Concepts" by Science Research Association;
"Talking Text Writer" and "Talking Text Speller" software published by Scholastic Inc., Jefferson City, Missouri;
"Reading Skills Development Program" software available from American Educational Computer, Inc., Oklahoma City, Oklahoma;
"Writing To Read" by International Business Machines;
"Language Experience" software series from Teacher Support Software, Gainesville, Florida; and
Houghton Mifflin's "Listen and Learn" series, Houghton Mifflin Educational Software Division, Hanover, New Hampshire.

Additional patents generally relating to learning aids with speech synthesizers include:

U.S. Patent No. 4,769,846 to Simmons;
U.S. Patent No. 4,403,965 to Hawkins;
U.S. Patent No. 4,421,487 to Laughon et al;
U.S. Patent No. 4,457,719 to Dittakavi et al; and
U.S. Patent No. 4,549,867 to Dittakavi.

The Anderson et al '533 patent cited above discloses a microprocessor based electronic teaching aid which enables the student viewing a display to designate any word or portion of text for vocalization by synthesized speech techniques. The "reading" material provided by the system is stored in a preprogrammed (fixed) source: read only memory. Pointers are used to point to the start addresses for the words. Mass storage devices are avoided in favor of semiconductor ROM memory. Speech data is stored in the memory as individual words in a dictionary. No facility for inputting digitized student utterances into the system is provided.

U.S. Patent No. 4,591,929 to Newsom teaches a second language learning system connected to a magnetic tape recorder. An electronic interface controls the tape recorder functions. The last phrase played back by the tape recorder is converted into digital form and stored in an electronic store to permit the student to reproduce the phrase as many times as desired without having to rewind the tape. The student
can also record his own voicing of a phrase in a different portion of the electronic store and can then selectively reproduce the teaching phrase or his response -- re-recording his voicing until satisfied.

U.S. Patent No. 4,710,877 to Ahmed discloses a computer-based language learning system including a speech synthesis capability using linear predictive coding. A menu driven student interface is used to step a student through preprogrammed lessons featuring visual and synthesized speech stimuli.

U.S. Patent No. 4,695,962 to Goudie teaches a system which attempts to increase the naturalness of synthesized speech produced from linear predictive encoded speech data by substituting different data depending upon whether words are reproduced in isolation in a word mode or together with other words in a phrase mode.

The Breedlove '353 patent discloses a hand-held microprocessor based system that converts student utterances into digital form and allows the student to store the digitized utterances in memory associated with student inputted text such as correct word spelling.

The "Word Torture" software program referenced above is another example of a computer-assisted language learning system. This program, published by Hyperglot Software Co. of Knoxville, TN, is designed to run on an Apple Macintosh personal computer equipped with a "Hypercard" programmable database which supports digitized and synthesized sound. Foreign language study stacks provide automated vocabulary drills that work from English to a foreign language or vice versa, and permit users to adjust interval times and add new words. The system also provides digitized pronunciations of foreign language alphabets.

Other systems (including the Scholastic Software "Talking Text Writer" program) are essentially talking word processors with speech synthesis capabilities to allow students to hear whatever is typed as well as hear text entered by the teacher.

However, as observed by Parham in his survey article "Computers That Talk" discussed above, language arts system developers have in the past had great difficulties providing acceptable, useful systems. Known text-to-speech synthesis algorithms are capable of converting written text into synthesized spoken words by referencing prestored "phonemes" (sets of sounds). The "Smoothtalker", "Experlogo-Talker" and "Talking Text Writer" systems referenced above are examples of systems which use text-to-speech synthesis. While text-to-speech synthesis may be acceptable for talking word processors, user interfaces, or the like, known algorithms cannot produce the range of inflections (stress and intonations) and pronunciations required for language learning.

The digitized speech approach (i.e., in which actual human speech is converted to digital signals using digitizing hardware for later reproduction) is capable of producing speech as realistic as recorded voice -- in any language and including accent and inflection. However, the use of digitized speech is extremely memory intensive (a limitation which has proven to be a major roadblock in its use in the past). A single second of digitized speech can occupy 64 Kbytes of storage space (somewhat less if compression algorithms are used). To reduce the amount of memory required, some system developers have used methods for reusing words by encoding and storing individual words and phrases individually. This has, however, been a problematic approach for language learning in the past -- since it has been shown that students learn best when presented with words in natural context (and the same word or phrase is often pronounced differently depending upon context -- see the Goudie '962 patent referenced above).

Most prior digitized speech systems have been limited to playing back prestored digitized speech. However, some prior systems also permitted the student to digitize his own speech for later play back. For example, Covox, Inc. claims that its "Voice Master" speech synthesis system speaks in the user's own voice, in any language, and with any accent. To record speech, a "learn" command is inputted and the student speaks into a microphone. To play back the recorded speech, the student inputs the "speak" command. Up to 64 different words, phrases or other sounds can be in memory at any one time -- with additional words being stored on disk and loaded as needed.

See also U.S. Patent No. 4,591,929 to Newsom discussed above, which teaches: (a) digitizing a spoken phrase spoken by the user and storing the digitized user's phrase in an electronic store along with a digitized teaching phrase (played back from a tape recorder); and (b) permitting the user to selectively reproduce the teaching phrase or his own response. However, Newsom provides only minimal digitized speech storage (e.g., a single teaching phrase) and requires the student to control the functions of a tape recorder in order to select a different teaching phrase. The process of rewinding/fast forwarding a tape recorder is extremely cumbersome. Moreover, Newsom provides no facility for integrating textual material, graphical or other display, or other study aids with his strictly oral lesson.

DE-A-3700796 describes a voice trainer which makes use of a display for displaying speech graphically in the form of intonation curves and frequency diagrams.

Hence, although much prior work has been done in the area of computer-assisted language learning, there is room for much further improvement.

For example, no one in the past has successfully developed a truly interactive computer assisted language learning system which integrates visual displays with preprogrammed digitised speech and which also interactively digitizes student speech and permits the student to easily listen to his own pronunciation and compare it with the digitised pronunciation of a model word or phrase he selects. Significantly, the present invention may provide the very first truly interactive computer assisted language learning system which allows a student to select a model phrase from text displayed on an electronic display; record (in digitised form) his own pronunciation of that phrase; and instantly listen to the digitised vocal version of the selected phrase and his own recorded pronunciation for comparison purposes.

According to one aspect of the present invention there is provided an interactive language learning system comprising
storing means for storing in digital form data representative of a model of speech version of a passage of language and for storing data representative of user input speech,
a display for displaying visual information corresponding to the passage, and
selecting means operatively connected to said display and to said storing means and operative to provide a comparison of the user input speech with the stored model version of the passage of language,
characterised in that
said storing means is arranged to store a digitised speech version of the passage of language and also to store a digital data text version of said same passage,
said selecting means is operable by a user to select a portion of said passage and to cause text corresponding to said selected passage portion to be displayed on said display based on said stored digital data text version, and in that the system includes speech processing means for:
selecting the portion of said stored digitised model speech version corresponding to said selected portion of said passage,
converting said selected digitised model speech version portion to audio signals for use in generating speech sounds,
converting audio signals representing user input speech into digitised signals representing said user speech input and
subsequently reconverting said digitised speech signals representing said user input speech to audio signals.

Many other significant advantageous features are provided by embodiments of the present invention, including the following.

- SoundSort: A text reconstruction exercise based on aural clues. In accordance with this feature of the invention, the system automatically randomizes the order of plural phrases, provides digitized utterances of the phrases in the randomized order, and requires the student to reconstruct the original order using a visual display interface.
- An audio CLIP mode which permits the student to select any (random) portion of displayed text (e.g., a phrase, a small part of a phrase, a single word, a syllable, or a phoneme) using cursor controls and to control the system to play the digitized speech corresponding to that selected portion. This feature allows the student to concentrate on difficult phrases.
- Integration of digitized sound in a high-level authoring system (as distinct from an authoring language) is provided. An easy-to-use "WYSIWYG" ("What you see is what you get") user interface reduces or eliminates mistakes and associated frustration and does not require the user to have any programming ability.
- An extremely flexible authoring system allows a teacher to link recorded digitized speech with customized on-screen text (which may but need not match the digitized speech). This allows a wide variety of free-form exercises to be created.
- The system permits the student to hear his own speech and the correct (model) speech, each at a keystroke, with no delay.
- Teacher-composed customized help screens and instructions can be referred to by the student upon depressing a single keystroke. This feature permits great increases in the number of possible teacher-created lesson formats and also provides great flexibility in customization and ease of use not provided in other systems.
- Despite the fact that digitized speech is employed, interrupt driven hardware in conjunction with software operating in the background permits essentially continuous replay of digitized audio data stored on a mass storage device -- without pauses due to loading and reloading of memory (for up to 23 hours of continuous speech from a CD ROM mass storage device for example).

The presently preferred exemplary embodiment of the invention provides a system including several functional modules which are implemented in hardware, software or both. A digital speech processor connected to a conventional personal computer is used to convert digitized speech data to audio signals and vice versa under control of a memory resident interrupt driven software module (this module handles all play and record requests for the speech processor). A public domain RAM disk driver sets aside memory for use as a simulated (virtual) disk drive. In the preferred embodiment, all recorded speech is placed on the virtual disk first, then copied to other mass storage devices (e.g., floppy disk).

The personal computer processor executes program control steps in the preferred embodiment which provide a wide variety of useful functions.
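
The storage figures quoted in this section (on the order of 64 Kbytes for one second of digitized speech, and up to 23 hours of continuous speech from a CD ROM) can be put in perspective with simple arithmetic. The Python sketch below is illustrative only. The sampling rates and sample sizes are assumptions chosen to show the scale involved, since the text here does not state them, and the function names are hypothetical rather than part of the described system.

```python
# Back-of-envelope storage arithmetic for digitized speech.
# All sampling parameters below are ASSUMPTIONS for illustration;
# the specification does not state them at this point.

def bytes_per_second(sample_rate_hz: int, bits_per_sample: int) -> float:
    """Storage rate of single-channel sampled audio, in bytes per second."""
    return sample_rate_hz * bits_per_sample / 8


def storage_for_hours(hours: float, rate_bytes_per_s: float) -> float:
    """Total bytes needed to hold `hours` of audio at the given byte rate."""
    return hours * 3600 * rate_bytes_per_s


# Assumed uncompressed case: 32 kHz sampling, 16 bits per sample.
# This is one combination that lands near the 64 Kbytes-per-second figure.
uncompressed_rate = bytes_per_second(32_000, 16)

# Assumed compressed case: 16 kHz sampling, 4 bits per sample
# (a common ADPCM sample size; hypothetical here).
compressed_rate = bytes_per_second(16_000, 4)

# Storage for 23 hours of speech at the assumed compressed rate.
total_megabytes = storage_for_hours(23, compressed_rate) / 1_000_000

if __name__ == "__main__":
    print(f"uncompressed: {uncompressed_rate:.0f} bytes/s")
    print(f"compressed:   {compressed_rate:.0f} bytes/s")
    print(f"23 hours compressed: {total_megabytes:.1f} million bytes")
```

Under these assumed parameters, 23 hours of speech needs several hundred million bytes, which is the scale of a CD ROM rather than of a floppy diskette. Different sampling parameters would give different totals.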
These functions may be divided into "teacher" functions (used to create and compose lessons and exercises); and "student" functions (performed by the student for learning purposes). The student functions generally operate on lessons and exercises previously created by the teacher using the teacher functions.

One of the teacher functions is a "Text Writer" word processor permitting the teacher to compose texts. A lesson authoring utility is then used to record segments of sound (phrases) which are linked to phrases in on-screen text(s) composed with the word processor. The teacher may also select a second (page two) textual display format to be presented as instructions or help to the student. After recording the phrases, the teacher selects which of three student functions will be used with the newly created lesson. The teacher may, therefore, create texts and exercises appropriate to any of the three functions.

Three student functions are provided in the preferred embodiment: (a) AudioLab (which provides aural and oral practice and learning); (b) SoundSort (an aural text reconstruction exercise); and (c) AudioWrite (a writing exercise focusing on listening comprehension).

The AudioLab student function in the preferred embodiment provides three modes: (i) PREVIEW, (ii) LAB, and (iii) CLIP.

In the PREVIEW mode, the student can listen to an entire prerecorded lesson with the option to view the corresponding complete text on the personal computer display screen. Thus, the student hears the digitized model speech of a lesson and can also view the displayed corresponding text (generally the text of the speech) as an audio-visual lesson.

In the LAB mode, the student can select individual phrases from the recording. The student may also view the complete text on the display -- or only the text corresponding to a phrase selected by the student. The student can also record himself speaking any individual phrase of his choosing, and play back his own speech and the corresponding preprogrammed model digitized speech so as to compare the two.

In the CLIP mode, the student can work with any selected portion of the current phrase (down to 0.1 seconds long in the preferred embodiment). The student can play the entire original phrase or only a portion of the phrase he selects; record himself speaking; and compare his played back speech to the original. Moreover, the student can examine phrases in three different ways in the preferred embodiment: forwards (e.g., "This/is/an/el/e/phant"); backwards (e.g., "phant" -- "e/phant" -- "el/e/phant"); or middle (e.g., "is/an").

The SoundSort function provides a computer puzzle exercise which randomizes (jumbles) the order of phrases in a lesson text. A column of symbols is displayed representing the phrases in the lesson text. The student must restore the symbols into the correct order by moving the symbols around the display screen (using interactive cursor controls and the like). The only clues provided by the preferred embodiment as to the correct order of the phrases are aural versions of the phrases obtained by listening to selected phrases (as many times as the student desires) and by listening to the complete, original lesson. The text is not shown on the screen in the preferred embodiment -- requiring the student to listen to the phrases and reorder them into the proper context.

The AudioWrite function of the preferred embodiment provides the digitized speech lesson one phrase at a time, and requires the student to type or reconstruct what he hears (with complete freedom of correction and repetition). The phrase typed in by the student is then compared to the original text, and any differences are flagged as errors. Punctuation, spacing and capitalization are provided by the system in the preferred embodiment and are thus not tested.

Thus, the highly integrated speech and visuals provided by the present invention permit a student to:
see, hear, record and compare complete text or dialogue, phrase by phrase (or by selected portions of phrase);
practice listening comprehension; and
instantly, randomly access any part of a recorded selection.
The system also provides teachers with an easy-to-use utility for creating an infinite variety of exercises.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other features and advantages of the present invention will be better and more completely understood by studying the following detailed description of a presently preferred exemplary embodiment in conjunction with the appended sheets of drawings, of which:

FIGURE 1 is a schematic block diagram of a presently preferred exemplary embodiment of an interactive language learning system in accordance with the present invention;
FIGURE 2 is a high level schematic flow chart-type description of the options presented to a student by the system shown in FIGURE 1;
FIGURES 3A-5 are graphical flow illustrations of the options shown in FIGURE 2;
FIGURES 6A-6B are together a flow chart of exemplary program control steps performed by the FIGURE 1 system to provide the options shown in FIGURE 2;
FIGURES 7A-7D are together a flow chart of exemplary program control steps relating to the AudioLab routine function shown in FIGURE 6;
FIGURES 8A-8B are together a schematic flow chart of exemplary program control steps related to the AudioWrite routine (function) shown in FIGURE 6;
FIGURES 9A-9C are together a schematic flow chart of exemplary program control steps performed by the FIGURE 1 system upon execution of the SoundSort routine (function) shown in FIGURE 6;
FIGURE 10 is a high-level schematic flow chart-type diagram of the options presented to a teacher by the FIGURE 1 system to permit the teacher to create lessons;
FIGURES 11A-11E are together a schematic flow chart of exemplary program control steps performed by the FIGURE 1 system to permit a teacher to create lessons;
FIGURE 12 is a flow chart of exemplary program control steps performed by the "Select File" routine shown in FIGURE 11A;
FIGURE 13 is a schematic flow chart of exemplary program control steps performed by the "Choose Drive" routine shown in FIGURE 12;
FIGURE 14 is a schematic flow chart of exemplary program control steps performed by the "DIR MENU" routine shown in FIGURE 12; and
FIGURES 15A-15B are together a schematic flow chart of exemplary program control steps performed by the FIGURE 1 system to execute the "FILE MENU" routine shown in FIGURE 12.

DETAILED DESCRIPTION OF PRESENTLY PREFERRED EXEMPLARY EMBODIMENTS

FIGURE 1 is a schematic block diagram of a presently preferred exemplary embodiment of an interactive language learning system 50 in accordance with the present invention. In the preferred embodiment, system 50 includes a conventional personal computer 52 (e.g., IBM PC or true compatible provided with a conventional DOS disk operating system version 2.1 or higher and at least 384 kilobytes of random access memory); a keyboard input device 54; a mass storage device 56 (which may be one or more floppy diskette drives and associated floppy diskettes, Winchester-type hard disk drives and/or CD ROM drives); a conventional CRT-type display 58; and a speech processor 60 connected to an appropriate audio input/output device (a conventional headset-type speaker/microphone arrangement 62a and/or a microphone/loudspeaker combination 62b with appropriate external audio amplifiers as necessary).

In the preferred embodiment, speech processor 60 is a modified conventional model VP625 PC-compatible digital processor board manufactured by ANTEX Electronics of Gardena, California. This conventional speech processor, which is described in readily available ANTEX Electronics published specifications and can be purchased directly from the manufacturer, plugs directly into a so-called expansion slot of personal computer 52, and makes available on the personal computer rear panel an audio input/output socket. Speech processor 60 converts audio signals applied to its audio input into ADPCM (Adaptive Differential Pulse Code Modulation) encoded digital data in a conventional manner for storage on mass storage device 56 -- and also converts previously recorded ADPCM encoded digital data stored on the mass storage device into audio signals provided at the speech processor audio output socket (also in a conventional manner).

Speech processor 60 samples the audio waveform presented at its audio input (e.g., from the microphone of headset 62a or from separate microphone 62b) at either 8, 12 or 16 kHz using ADPCM encoding technology. At the 16 kHz sampling rate, full fidelity sound is produced with a frequency response of 20 Hz to 7.0 kHz. The use of ADPCM provides a data reduction of better than 2-to-1 over other standard digitization techniques.

Speech processor 60 of the preferred embodiment operates in background under interrupt control of the conventional DOS disk operating system and the associated microprocessor internal to personal computer 52 -- using any one of several programmable interrupt and I/O addresses. Speech processor 60 also provides software programmable volume controls on both audio input and output and a software-addressable level detector to provide an indication of signal amplitude during record/playback.

In the preferred embodiment, speech processor board 60 is modified so that headset 62a can be connected directly to it using a DB-9 headset connector and is also provided with a program controlled volume level and dual input capability (to support both the astatic microphone of an AKG K-18 headset and an external 5V signal). A microphone preamplifier stage is also included to provide an increased signal-to-noise ratio.

Mass storage device 56 in the preferred embodiment stores three types of digital signal information: a) digitized speech information; b) text information associated with the speech information; and c) program control instructions (which control the processor and other associated components within personal computer 52 to perform the interactive language learning functions provided by the present invention). Keyboard 54 is used to permit the user (student or teacher) to interact with the execution of the program control steps, while display 58 permits the user to view graphics, text and other visually-presented information.

STUDENT FUNCTIONS

FIGURE 2 is a high-level flowchart-type diagram of the options presented to a student by system 50, and FIGURES 3A-5 are graphical illustrations of
1 1 EP O 46_ _2l B_ 12 these options. The options provided by the preferred (to decrease volume level) or the right arrow key (to embodiment to the student in effect constitute an au- increase volume level) -- and this volume level adjust- dio visual user interface with which the student may ment remains in effect for all functions (programs) interact in order to learn a second language. provided by system 5O. To begin playing a prerecord- Upon starting system 5O (e.g., by powering on _ ed lesson, the student depresses the F2 function on personal computer 52 and its associated peripherals keyboard 54 on the preferred embodiment. In the pre- and controlling the personal computer to begin exe- ferred embodiment, this causes system 5O to begin cuting program instructions stored in mass storage producing audio in headset 62a by controlling speech device 56, FIGURE 2, block 1 OO), a display title processor 6O to convert digitized speech stored on screen is displayed on display 58 (block 1 O2) and sys- 1o mass storage device 56 into audio signals. tem 5O then prompts the student for an audio disk In the preferred embodiment, one or more (block 1 O4). Typically, lessons are stored on floppy screens of text may be associated with a particular diskettes so that the student may easily change les- blockofstored digitized speech, and in the PREVIEW sons by simply inserting anotherdiskette into person- mode this text may be displayed by display 58 while al computer 52. System 5O displays the names ofthe 1_ system 5O produces the converted audiosignalsfrom Iessons stored on the audio diskto permit the student the digitized speech. This associated text is typically to change lessons if he desires. 
A main menu is then displayed on display 58 which permits the student to select between five different options in the preferred embodiment: 1) AudioLab; 2) SoundSort; 3) AudioWrite; 4) Change Audio Disk; and 5) Exit. The student uses up arrow and down arrow cursor control keys in the preferred embodiment to select one of the five options, and then depresses the enter key to cause that option to be executed.
The exit option causes the interactive language learning functions provided by the preferred embodiment to terminate execution. The option to change audio disk causes system 50 to prompt for a new audio disk (block 104). The AudioLab, SoundSort and AudioWrite options perform interactive language learning functions that will now be explained.
The AudioLab function provides the student with practice in pronunciation and listening comprehension. In the preferred embodiment, the AudioLab option or function has three different modes: PREVIEW (block 108); LAB (block 110); and CLIP (block 112). In the PREVIEW mode, the student listens to the entire selected text and also sets the playback volume for all of the routines (AudioLab, SoundSort and AudioWrite). In the LAB mode (block 110), the student listens to phrases from the text and may also record his own speech and may compare his played back voice to the original. In both the PREVIEW and LAB modes the student may choose to see the phrases and text in different combinations again on display 58 or he can choose to listen without viewing the text. From the LAB mode, the student may select the CLIP mode (block 112). In the CLIP mode, the student may choose to work on any selected portion of a phrase to permit him to practice difficult sounds.
Upon selecting the AudioLab option from main display 106 in the preferred embodiment, system 50 begins performing the AudioLab PREVIEW mode (block 108). FIGURE 3A is a graphical description of some of the options presented by the preferred embodiment in the PREVIEW mode. The student may adjust volume level by depressing the left arrow key
actual text corresponding to the speech being reproduced (since the student may then "read along" with the digitized speech being played back), but it may have some other contents -- depending upon what the teacher desires (as will be explained). To stop the speech (and text) generation, the student may depress the ESC (escape) key of keyboard 54. To resume speech/text reproduction, the student may depress the F2 key again. Depressing the ESC key another time returns the student to the main menu (FIGURE 2, block 106). Depressing the ENTER key causes the LAB mode to be entered (FIGURE 2, block 110). On-line help is available by depressing the F1 key in the preferred embodiment.
FIGURE 3B is a graphical illustration of options available to the student in the LAB mode of the AudioLab function (FIGURE 2, block 110). In this LAB mode, the student can select different phrases (e.g., sentences) to listen to in isolation one or more times. If the student wishes to concentrate on a specific phrase, he selects the LAB mode by depressing the ENTER key. Once in the LAB mode, the student may select the phrase he wishes to concentrate on using cursor control keys. If in the PREVIEW mode the student has his text turned off (this is the default mode), then in the LAB mode only the selected phrase will be displayed by display 58. If, on the other hand, the student in the PREVIEW mode selected that the text should be displayed (by depressing F7), the full text is displayed on display 58 but the selected phrase is highlighted. The left arrow and the right arrow keys in the preferred embodiment move the display "work box" to different phrases, and the F6 key is used to turn a phrase on and off -- thereby selecting the phrase to be treated using the LAB mode.
Once the student has selected a phrase, he can depress the F2 key to control speech processor 60 to play back the digitized speech corresponding to that phrase. By depressing the F3 key, the student may record his own pronunciation of the same phrase. Once the F3 key is depressed, the student is prompted to depress the space key to begin recording (and
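The main menu selection described for this page (five options, up and down arrow keys to move the highlight, ENTER to execute) can be sketched as follows. This is an illustrative sketch only: the `MainMenu` class, the key names, the initially highlighted option, and the choice to stop at the first and last entries instead of wrapping around are all assumptions that the specification does not state.

```python
from dataclasses import dataclass

# The five main menu options, in the order the specification lists them.
OPTIONS = ("AudioLab", "SoundSort", "AudioWrite", "Change Audio Disk", "Exit")


@dataclass
class MainMenu:
    """Hypothetical model of the main menu highlight (FIGURE 2, block 106)."""

    index: int = 0  # assumption: the first option starts out highlighted

    def press(self, key: str):
        """Handle one key; return the chosen option on ENTER, otherwise None."""
        if key == "UP":
            # Assumption: the highlight stops at the top instead of wrapping.
            self.index = max(0, self.index - 1)
        elif key == "DOWN":
            self.index = min(len(OPTIONS) - 1, self.index + 1)
        elif key == "ENTER":
            return OPTIONS[self.index]
        return None


menu = MainMenu()
for key in ("DOWN", "DOWN"):
    menu.press(key)
selected = menu.press("ENTER")
print(selected)  # AudioWrite
```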
gocr ep0461127-008
may adjust the recording level using the left arrow and right arrow keys). Upon depressing the space bar, speech processor 60 begins converting audio applied at its audio input (e.g., from the microphone in headset 62a) into digitized speech information and storing the digitized speech (on a virtual disk). When the student is through speaking, he depresses the space bar again to stop recording. The student may then depress the F4 key to instantly play back his own just-recorded speech -- or depress the F2 key to listen again to the model pronunciation of the selected phrase. Depressing the ESC key returns system 50 to the PREVIEW mode (FIGURE 2, block 108), while depressing the F5 key controls system 50 to perform the AudioLab CLIP mode (FIGURE 2, block 112).
The CLIP mode in the preferred embodiment allows the student to analyze any section or part of a phrase selected in the LAB mode. A graphical illustration of options presented to the student in the CLIP mode is shown in FIGURE 3C. In the preferred embodiment, the CLIP mode permits the student to examine any section or part of the model digitized speech recording down to a single phoneme (0.1 seconds in duration). In addition, the complete phrase can be heard by depressing a control key (e.g., F2).
In the CLIP mode, the cursor control keys (up arrow, down arrow, left arrow, right arrow) are used to select which part of the current phrase is to be played back. A graphical illustration of the length of the currently selected portion of the phrase is displayed at the bottom of display 58 in the preferred embodiment. In the preferred embodiment, this graphical illustration includes a horizontal line having a length proportional to the length of the selected phrase portion. The length and position of this horizontal line change in response to cursor controls to change the length and position of the selected portion relative to the current phrase.
Once the student has selected the portion of the phrase he wishes to concentrate on, he depresses the F5 key to listen to the selected portion. As in the LAB mode, the student may record and play back his own voice using the F3 and F4 keys, and alternate the playback of his voice with playback of the selected clip by toggling the F5 and F4 keys. Depressing the F8 key resets the clip to allow the student to select a different part of the phrase. Depressing the ESC key returns system 50 to the LAB mode (FIGURE 2, block 110).
The AudioWrite function (FIGURE 2, block 114) provides the student with an exercise in listening and writing by requiring the student to type phrases he hears. The student may listen to each phrase as many times as he wishes, and may also listen to the entire text before concentrating on each phrase (since in the preferred embodiment the typing exercise operates on a phrase-by-phrase basis). Once the student has typed a particular phrase, he can depress the ENTER key to control system 50 to compare the text typed in by the student with model text and indicate any errors in the student-generated text.
FIGURE 4 is a graphical illustration of the options available to the student in the AudioWrite function (FIGURE 2, block 114). Upon initiating the AudioWrite function, the student may depress the F2 key to listen to the entire text, or simply depress the ENTER key to start the exercise without listening to the whole text. Once the exercise has begun, system 50 controls speech processor 60 to produce the audio corresponding to the first phrase of an exercise without displaying the corresponding text on display 58. The student may depress the F3 key to control speech processor 60 to replay the spoken phrase (the phrase may be replayed as many times as the student wishes). The student then enters text by depressing the keys on keyboard 54 (in the preferred embodiment, system 50 adds spaces, punctuation and capitalization so the student may concentrate on spelling and grammar). The student may depress the ENTER key at any time to check his progress. Upon depressing the ENTER key, system 50 compares the text keyed in by the student with the text version of the phrase being spoken by speech processor 60 -- and highlights any portions of the student-typed text which do not correspond to the model text. The student may use his cursor control keys to move the cursor to the erroneous characters and correct his mistakes by retyping over the incorrect characters already there. The student may depress the F9 key at any time to control system 50 to display the correct word corresponding to the student's inputted word the cursor points to. The displayed model word disappears when any other key is depressed. When the student has correctly entered a phrase, he depresses the enter key to hear the next phrase. When the student finishes the exercise, he may depress the F2 key to listen to the entire text, or depress the ESC key to return to the main menu (FIGURE 2, block 106).
The SoundSort function provided by the preferred embodiment of the present invention presents the student with an audio puzzle to solve. The "pieces of the puzzle" are the phrases in the model text -- but jumbled in a randomized order. The student must put the phrases back in the correct order based only on aural clues. In the preferred embodiment, each phrase is identified by a symbol (e.g., the letters A, B, C) displayed on display 58. The student controls system 50 to move the letters from a "jumbled" ordered list displayed on the left-hand column of display 58 to a correctly ordered list in the right-hand display column based on context of the associated phrases. Thus, system 50's SoundSort function associates a randomly ordered sequence of phrases with displayed symbols, and then requires the student to reorder the displayed symbols to correspond to the correct order of the phrase sequence. The student may
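The switching between the three AudioLab modes (ENTER from PREVIEW to LAB, F5 from LAB to CLIP, ESC to step back one level) can be summarized as a small transition table. The sketch below is only an illustration of that key-driven structure: the `Mode` enum, the `next_mode` and `run` helpers, and the key names are hypothetical, and the two-step ESC behavior in PREVIEW (first stop playback, then leave) is simplified to a single step.

```python
from enum import Enum


class Mode(Enum):
    MAIN_MENU = "MAIN_MENU"
    PREVIEW = "PREVIEW"
    LAB = "LAB"
    CLIP = "CLIP"


# (current mode, key) -> next mode, following the key bindings in the text.
TRANSITIONS = {
    (Mode.PREVIEW, "ENTER"): Mode.LAB,
    # Simplification: assumes playback is already stopped when ESC is pressed.
    (Mode.PREVIEW, "ESC"): Mode.MAIN_MENU,
    (Mode.LAB, "F5"): Mode.CLIP,
    (Mode.LAB, "ESC"): Mode.PREVIEW,
    (Mode.CLIP, "ESC"): Mode.LAB,
}


def next_mode(mode: Mode, key: str) -> Mode:
    """Return the mode after `key`; keys that do not switch modes leave it unchanged."""
    return TRANSITIONS.get((mode, key), mode)


def run(keys, start=Mode.PREVIEW) -> Mode:
    """Apply a sequence of key presses, starting in PREVIEW by default."""
    mode = start
    for key in keys:
        mode = next_mode(mode, key)
    return mode


print(run(["ENTER", "F5", "ESC"]).value)  # LAB
```

Keys such as F2, F3 and F4 act within a mode and therefore fall through to the "unchanged" default in this sketch.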
gocr ep0461127-009
listen to the phrases as many times as he wishes, but system 50 does not display the text associated with the phrases -- only the symbols associated with the phrases.
FIGURE 5 is a graphical illustration of the SoundSort function depicted at FIGURE 2, block 116. Upon initiating the SoundSort function, display 58 displays a vertical column of symbols (e.g., A, B, C, D) -- each one symbolizing a phrase which is a portion of a sentence or passage. The student may depress the F2 key to listen to the entire text in the correct order -- or he may use his up arrow and down arrow cursor control keys to highlight one of the displayed symbols. One symbol is always highlighted in the preferred embodiment. Depressing the F3 key causes digitized speech processor 60 to reproduce the speech corresponding to the highlighted phrase. The student may then use his right arrow cursor control key to move the symbol corresponding to the phrase he has just heard to the center column -- and may use the up arrow and down arrow cursor control keys to move the symbol up and down the center column -- and then use the right arrow key to "park" the symbol in a desired position in the right-hand column. The object of the exercise is to move all of the symbols from the left-hand column to the right-hand column -- and to rearrange the order of the symbols so that their rearranged order corresponds to the correct order of the phrases. The space bar may be depressed to change columns (e.g., from the left-hand column to the right-hand column or vice versa). By depressing the ENTER key, the student is provided with an indication of his progress -- since system 50 will highlight those symbols moved to the right-hand column that are in the correct order. The student may depress the F2 key at any time to listen to the entire text -- and use the left arrow and right arrow cursor control keys to select the starting point of the text to be reproduced in audio form. This feature is especially useful when a long passage or string of sentences is being operated upon (since the student may, for example, wish to concentrate only on the last half of the passage and may not therefore wish to listen to the entire passage from beginning to end). To exit the SoundSort function, the student may depress the ESC key to return to the main menu (FIGURE 2, block 106).
Now that the overall student interface provided by system 50 has been described, a detailed description of exemplary program control steps performed by personal computer 52 under software control in the preferred embodiment to provide that student interface will be presented in connection with FIGURES 6-9C.
FIGURES 6A-6B are together a schematic flow diagram of an exemplary program control main routine performed by the preferred embodiment system 50. Upon starting system 50, as described previously, a title is first displayed (block 150), and then the system prompts the student for an audio disk (block 152) and waits for the student to depress a key (block 154). System 50 then determines whether the floppy diskette in the floppy diskette drive contains correctly formatted lesson data (decision block 156). If the diskette being tested is not appropriately formatted, a warning message is displayed on display 58 (block 158), system 50 waits for the student to depress another key (block 160), and then rechecks the diskette contents (block 156).
Once an appropriate diskette is inserted into the floppy diskette drive, system 50 reads a title of the lesson (and a page two flag) stored on the diskette and displays that title on display 58 (block 162). If the student wishes to choose another lesson, he depresses an appropriate key (decoded by blocks 164, 166) which causes system 50 to repeat blocks 152-166. If the student is satisfied with the lesson on the current diskette, he depresses, for example, the N key (decoded by blocks 164, 166), which controls system 50 to read the contents of the current diskette (block 168). System 50 then displays a main menu display format (block 170) and waits for the student to select one of the five options described previously (decode block 172). The student may select execution of the AudioLab routine (block 174, shown in greater detail in FIGURES 7A-7D), the AudioWrite routine 176 (shown in greater detail in FIGURES 8A-8B), or the SoundSort routine (shown in greater detail in FIGURES 9A-9C) using cursor controls and the ENTER key as described previously.
Referring now to FIGURE 7A, the AudioLab routine in the preferred embodiment first queries an internal flag to determine whether this is the first time the student has used the AudioLab function in this session -- and if so, displays an introductory screen (blocks 180-186). System 50 then awaits depression of a key (block 188), and decodes that key (block 190) to determine what operation to perform next. Depression of the ESC key causes a return to the FIGURE 6A-6B main routine (block 192). If the student depresses the F1 key, the currently displayed screen is saved, and a help screen is displayed in its place (block 194). Upon depression of a further key (block 196), the saved work screen is returned to the display (block 198).
If the student depresses the F9 key, system 50 determines whether there is a "page two" associated with the currently displayed text (e.g., by checking a page two flag that is set when a page two exists -- decision block 200). In the preferred embodiment, a teacher generates a lesson or exercise by recording a spoken passage; inputting a main screen of information corresponding to that passage (this main screen is typically the textual version of the spoken passage, but may be any text the teacher wishes), and may also key in a "page two" screen providing supplementary text associated with the spoken
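The SoundSort puzzle described on this page (phrases jumbled into a random order, one letter per jumbled phrase, and a progress check on ENTER) might be sketched as below. The function names, the list-of-slots representation of the right-hand column, and the reading of "in the correct order" as "in the correct slot" are assumptions made for illustration; the specification does not give the actual algorithm.

```python
import random
import string


def jumble(phrases, rng):
    """Return {symbol: phrase}, with letters assigned in the randomized order."""
    shuffled = list(phrases)
    rng.shuffle(shuffled)
    # zip() stops at 26 letters; enough for a short passage in this sketch.
    return dict(zip(string.ascii_uppercase, shuffled))


def correct_slots(right_column, symbol_to_phrase, phrases):
    """Return the right-column slot indexes whose symbol is correctly placed.

    `right_column` has one entry per slot: a symbol, or None if still empty.
    """
    return [
        i
        for i, symbol in enumerate(right_column)
        if symbol is not None and symbol_to_phrase[symbol] == phrases[i]
    ]


phrases = ["Cats", "have", "four", "legs."]
# A fixed jumbled assignment, so the result below does not depend on the RNG.
symbols = {"A": "four", "B": "legs.", "C": "Cats", "D": "have"}
print(correct_slots(["C", "A", None, "B"], symbols, phrases))  # [0, 3]

# A randomized assignment uses the same letters in a seed-dependent order.
print(sorted(jumble(phrases, random.Random(0))))  # ['A', 'B', 'C', 'D']
```

Comparing phrases by their text rather than by index means that two identical phrases are interchangeable, which seems the natural behavior for an exercise based only on what the student hears.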
gocr ep0461127-010
phrase. Thus, some lessons may have a page two screen associated with them and others may not. In the preferred embodiment, page two text format is stored in files separately from the page one display formats -- and the existence of the page two file is flagged in a file called "file.dat". If a page two display format does exist, it is displayed in a manner similar to the way the help screen is displayed (blocks 202, 204, 206) in response to depression of F9.
The student may affect the volume of audio produced by speech processor 60 in the preferred embodiment by depressing the right arrow key (to increase the volume level) or the left arrow key (to decrease volume level). Volume is controlled by writing a new volume level byte to speech processor 60 (blocks 208, 212) in a conventional fashion. The preferred embodiment also displays the current volume level on the lower right-hand portion of display 58 in the form of a horizontal bar the length of which indicates volume level (blocks 210, 214).
The student may select whether or not he wishes display 58 to display the page one text corresponding to the spoken passage by depressing the F7 key. "Text off" is the default condition. If the student depresses the F7 key when the text is already displayed (decision block 216), the text off flag is set (block 218) to suppress the display of text. If the student depresses the F7 key when system 50 is not displaying text (decision block 216), a text on flag is set (block 220) to result in display of the complete page one text associated with the current lesson.
When the student depresses the F2 key, he causes the entire text of the lesson to be displayed on display 58 (but only if the text on flag is set by block 220; blocks 222, 224). System 50 then controls speech processor 60 to reproduce the audio corresponding to the lesson by reading digitized speech information from mass storage device 56 and converting it to audio signals. In the preferred embodiment, digitized speech is stored on mass storage device 56 in the form of separate and discrete phrases. Files are packed 4-bit ADPCM sound data in the preferred embodiment. Speech processor 60 in the preferred embodiment accepts a string of several file names to be played in sequence. Each separate recorded phrase file is loaded and played with an interval of 0.25 seconds between to give the impression of continuous replay. Up to 186 seconds of audio can be played from a single floppy diskette, and up to twenty-three hours may be played from a CD ROM storage device. In the preferred embodiment, the actual mechanism for presenting digitized speech to speech processor 60 includes reading digital information from mass storage device 56. Speech processor 60 then reads the digitized information into its own 32K buffer and converts the information to audio form. When speech processor 60 reaches the end of the data stored in its buffer, it automatically generates an interrupt request which is serviced by a conventional interrupt handler performed by the processor of personal computer 52. This conventional interrupt handler (which is provided with conventional speech processor 60) reads the next portion of the digitized speech file from mass storage device 56 and transfers the data to speech processor 60. Since the transfer of information is performed under interrupt control for only small blocks of data at a time, the process is virtually transparent to the user and results in only a negligible slowing of the response time of personal computer 52.
Once system 50 begins reproducing the audio corresponding to a particular lesson passage (block 226), it continues to produce the entire audio passage until it reaches the end of the passage or until the student depresses the ESC key (decision block 228). Upon the occurrence of either of these two events, display 58 is cleared and a command line is displayed to permit the user to select another option (block 230).
The user may at any time depress the enter key to enter the LAB mode (blocks 194-230 being part of the PREVIEW mode discussed previously). Upon entering the LAB mode, system 50 first determines whether text display is on or off (e.g., by testing the value of the text on flag; decision block 232). If text display is off, system 50 displays the "current phrase" on display 58 (block 234) -- that is, the phrase that was being "played back" while in the PREVIEW mode. PREVIEW mode plays all the phrases. When the student first enters Text Lab mode, the first phrase is the current phrase. System 50 then waits for the student to depress a control key to select one of the LAB options (block 236, decode block 238).
The LAB mode provides its own help screen support upon depression of the F1 key (blocks 240-244), and permits the user to exit back to the PREVIEW mode upon depressing the ESC key (block 246, with control being returned back to FIGURE 7A block 188). Similarly, depressing the F9 key causes a "page two" display screen format to be displayed on display 58 if such a "page two" format exists (blocks 248-254).
If the user depresses the F2 key, system 50 reproduces the audio corresponding to the current phrase (block 256). Moving the cursor control keys down arrow or right arrow causes system 50 to select the "next phrase" (that is, the next file in a sequence of files that store the digitized speech phrases corresponding to the current lesson) while moving the up arrow or left arrow cursor control keys causes selection of the previous phrase (blocks 258, 264) (blocks 260, 262, 266, 268). The F6 key causes a phrase selected by the cursor control keys to be turned on (i.e., flagged) and turned off (i.e., unflagged) (blocks 270-274). A phrase that is turned on is displayed, while turning off a phrase causes that phrase to cease being displayed (block 276). The student may at any time turn display of the complete main text on and off by depressing the F7 key (blocks 278-284).
gocr ep0461127-011
If the phrase flag is on, each phrase will be displayed as it is needed, even if text is off. If the phrase flag is off, each phrase will be erased as it is reached, even if text is on. The student may also at any time depress the F3 key in the preferred embodiment to record his own voice. When the F3 key is depressed, system 50 first gives the student the option to increase or decrease record level gain using the left arrow and right arrow cursor control keys (block 286, decode block 288, blocks 290, 292). If the student alters the record gain, the new gain is displayed in the lower right-hand corner of display 58 (blocks 294, 296) and the new record gain level is written to speech processor 60 in a conventional manner. The student depresses the space bar to begin the recording process (block 298). When recording is begun, speech processor 60 is controlled to begin converting signals at its audio input into digitized speech signals and writing those digitized speech signals onto the virtual disk. This process continues until either the user depresses the space bar again to terminate recording or until a preset recording time (the length in time of the model phrase in the preferred embodiment) has elapsed (block 302). A record flag is then set (block 304) to indicate that a phrase has been recorded, and the command line is displayed once again (block 306). If the student now depresses the F4 key to playback his recorded phrase, it is first determined whether the record flag has been set (decision block 308) -- and if it has been set, system 50 controls speech processor 60 to convert the student's stored digitized speech into audio (block 310).
The LAB mode thus permits the student to concentrate on a specific phrase from the prerecorded spoken lesson. If the student has trouble with a particular phrase, however, he may wish to listen to small pieces of that phrase in isolation (e.g., one syllable at a time) so he can learn how to speak the entire phrase. The preferred embodiment of the present invention allows the student to concentrate on any portion of the current phrase by depressing the F5 key to enter the CLIP mode. Upon entering the CLIP mode, system 50 displays a "clip" line (a horizontal line at the bottom of the display indicating the length and position of the current "clip" relative to the current phrase display) and a new command line (block 312) and then waits for the student to depress a key. Depression of the ESC key deletes the CLIP line and returns to FIGURE 7B block 236. A help screen is provided for the CLIP mode (block 318-322), and the CLIP mode also permits the user to play the current phrase from beginning to end by depressing the F2 key (block 324). Similarly, the student may record and play back his own speech just as in the LAB mode (blocks 326-348) by depressing the F3 and F4 keys, respectively.
Briefly, the clip mode provides two indexes into the digitized speech file relating to the currently selected phrase: r (the "right-hand pointer" -- which points to the end of the "clip"); and l (the "left-hand pointer" -- which points to the beginning of the clip). The right-hand pointer r is incremented and decremented by the right-arrow and down-arrow cursor control keys, respectively between the values of L (the beginning of the current phrase) and R (the end of the current phrase). Right-hand pointer r points to the end of the portion of the digitized speech phrase that is selected (blocks 350-354, 368-372). The left-hand pointer l is decremented and incremented by the left-arrow and up-arrow cursor control keys, respectively between the values of L and R (which thus set a range for both l and r -- l and r cannot pass each other). The left-hand pointer l points to the beginning of the "clip" (blocks 356-366).
In the preferred embodiment, the left-hand pointer l and the right-hand pointer r define absolute time offsets into the file containing digitized data representing the current phrase. Thus, depressing the up-arrow key moves the left-hand pointer l to the right (toward the end of the phrase); depressing the left arrow key moves the left-hand pointer l to the left (toward the beginning of the phrase); depressing the down arrow key moves the right-hand pointer r to the left (toward the beginning of the phrase); and depressing the right arrow key moves the right-hand pointer r to the right (toward the end of the phrase).
In the preferred embodiment, the "clip" mode works on the basis of time. That is, system 50 controls the speech processor 60 (and associated disk read interrupt routine) to seek directly to the point in the phrase file pointed to by the left-hand pointer l and to begin playing back the file from that point until the point pointed to by the right-hand pointer r (at which point the play back ceases) (block 378). The effect is that the user can select and "play back" any arbitrarily small portion of the current phrase (within the range of resolution of variables l and r -- 0.1 seconds in the preferred embodiment) without having to hear the remaining part of the phrase (and also without having to wait for the delays during which the remaining portions of the phrase would be played back). In the preferred embodiment, the CLIP mode is more than merely a "mute" function since it actually presents only the desired digitized speech data to speech processor 60 for conversion to audio signals.
FIGURES 8A-8B are together a flow chart of exemplary program control steps performed by system 50 to implement the AudioWrite function shown in FIGURE 6. When the student selects the AudioWrite function (see FIGURE 6, blocks 172, 176), instructions in a command line are displayed (block 380) and then system waits for the student to select one of the options presented to him by the AudioWrite function. Depressing the ESC key causes control to return to the main routine (FIGURE 6, block 170). The student may depress the F2 key to play back the audio corresponding to the current lesson (blocks 386, 388). De-
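The two CLIP mode pointers can be modeled as integers counted in 0.1 second steps, clamped so that they stay between L and R and never pass each other. The sketch below follows the key assignments given in the text (up and left arrows move l, right and down arrows move r, F8 resets the clip); the `Clip` class, the initial clip covering the whole phrase, and the choice to let l and r meet are assumptions.

```python
TICKS_PER_SECOND = 10  # pointer resolution of 0.1 seconds


class Clip:
    """Hypothetical model of the CLIP mode pointers l and r."""

    def __init__(self, phrase_ticks: int):
        self.L = 0             # beginning of the current phrase
        self.R = phrase_ticks  # end of the current phrase
        self.reset()

    def reset(self) -> None:
        # Assumption: a reset (F8) selects the whole phrase again.
        self.l, self.r = self.L, self.R

    def press(self, key: str) -> None:
        if key == "UP":       # l moves toward the end of the phrase
            self.l = min(self.l + 1, self.r)
        elif key == "LEFT":   # l moves toward the beginning
            self.l = max(self.l - 1, self.L)
        elif key == "DOWN":   # r moves toward the beginning
            self.r = max(self.r - 1, self.l)
        elif key == "RIGHT":  # r moves toward the end
            self.r = min(self.r + 1, self.R)
        elif key == "F8":
            self.reset()

    def span_seconds(self):
        """Return (start, end) of the selected clip in seconds."""
        return (self.l / TICKS_PER_SECOND, self.r / TICKS_PER_SECOND)


clip = Clip(20)  # a 2.0 second phrase
for key in ["UP"] * 3 + ["DOWN"] * 5:
    clip.press(key)
print(clip.span_seconds())  # (0.3, 1.5)
```

Keeping the pointers as whole ticks avoids accumulating floating-point error over many key presses; the conversion to seconds happens only when a span is reported.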
gocr ep0461127-012
pressing the F1 key or the enter key causes system 50 to display a command line (block 390) and then play back the first phrase from the current lesson (block 392). System 50 then waits for the student to input either the words matching the phrase he just heard or a control key (block 394). If a word from the model text is displayed upon depressing this key, this word is removed from the display (blocks 396-400) -- and likewise, display of the elapsed time is suppressed if the elapsed time is displayed when the next key is depressed (blocks 402 and 406).
Decode block 408 then determines which key the user has depressed. If the user depresses the F3 key, the time that has elapsed since the AudioWrite exercise began is displayed in a conventional manner (block 410). The cursor control keys cause the cursor to move to the right or the left on display 58 (blocks 412-418).
The "object" of the AudioWrite exercise is for the student to input alphanumeric characters which match the phrase he is hearing from speech processor 60 (and thus also the textual version of the same phrase from the main text). If the student inputs an alphanumeric key, the character corresponding to the key he inputs is displayed on the display and the cursor is moved one character to the right (blocks 420, 422). Block 420 also causes the character corresponding to the key depressed by the user to be added to a text string buffer for analysis when the user depresses the enter key. System 50 automatically "fills in" spaces and punctuation and changes the case of the displayed characters if necessary to match the "model" text. If the user depresses the delete key, the character displayed immediately to the left of the cursor is deleted from the display (and also from the text string buffer) (blocks 424, 426). If the student depresses the F9 key, a word from the model text corresponding to the exercise is displayed in the lower right-hand corner of display 58 in the preferred embodiment (blocks 428, 430). Depressing the enter key causes system 50 to check the user inputted contents of the text string buffer against the model text string (character by character) and indicate errors in the user inputted string -- as will now be explained.
Upon depressing the enter key, system 50 first displays the elapsed time in the lower right-hand corner of display 58 (block 432). System 50 then scans through the student inputted text string buffer one character at a time beginning with the first character in the buffer (block 434). System 50 compares, for example, the first character in the student inputted buffer with the first character of a model text string stored on mass storage device 56 corresponding to the current phrase. If these two characters correspond, no action is taken (decision block 436). On the other hand, if the characters do not correspond, the display of the first character is highlighted on display 58 (block 438). This process (blocks 434-438) continues until all letters in the student inputted text string buffer have been compared with the model text string characters (spaces and punctuation being ignored). If any letters are incorrect (decision block 440), system 50 moves the cursor to the beginning of the first word that has a wrong character to permit the student to correct his error (block 442). If all characters of the student inputted text string correspond exactly to the characters in the model text string (meaning that the student-inputted string is both entirely correct and complete), system 50 waits for enter to be depressed, then advances to the next phrase (block 444) and repeats blocks 392-442 for that next phrase. If the entire lesson has been analyzed (as tested for by decision block 446), an end of lesson message is displayed (block 448) and upon inputting another key (wait block 450) control returns to FIGURE 6 block 170.
FIGURES 9A-9C are together a detailed flow chart of exemplary program control steps performed by system 50 to implement the SoundSort function of the preferred embodiment of the present invention. As will be recalled from the discussion above, the SoundSort function presents the student with a game in which he is expected to move symbols on display 58 corresponding to phrases of a sentence or passage into the correct order (after system 50 has reordered the phrases into a random order).
Upon initiating the SoundSort routine (from decode block 172, FIGURE 6), it is first determined (e.g., by checking a flag) if this is the first time the student has used SoundSort in this session (decision block 452). If the current execution is the first time of use, an introductory screen explaining how to play the SoundSort game is displayed by display 58 (blocks 454, 456). System 50 then accesses a sentence or passage of the lesson stored on mass storage device 56, this sentence including plural phrases. In the preferred embodiment, only passages with a relatively small number of (e.g., a maximum of 21) phrases are especially suitable for SoundSort since the SoundSort function uses the first 21 phrases of a given lesson (additional phrases are ignored). SoundSort then randomizes the sequence of phrases within the lesson (e.g., using a conventional pseudo-randomizing algorithm) to provide a randomized ("jumbled") sequence of the original phrase sequence.
System 50 then assigns a symbol (alphabetical letters in the preferred embodiment) to each one of the random-order phrases (block 458). For example, suppose the four phrase sequence involved is: "Cats" "have" "four" "legs." With each word being a discrete phrase, block 458 might randomize the order of the phrases to result in: "Four" "legs" "cats" "have", and then assign the symbol A to symbolize the phrase "four"; the symbol B to symbolize the phrase "legs"; the symbol "C" to symbolize the phrase "cats"; and the symbol "D" to symbolize the phrase "have". Sys-
gocr ep0461127-013
tem 50 then displays on display screen 58 the symbols corresponding to the reordered phrase sequence in the left-hand column of the display (see FIGURE 5) so that the phrases, if heard in the order of the symbols displayed on the display, would be in the randomized order (block 460). System 50 then waits for the student to depress a key to select the next function to be performed (blocks 462, 464).
The SoundSort function 178 in the preferred embodiment provides a help screen giving the student instructions for what to do next if he gets confused (blocks 472-476). The student may exit the SoundSort function 178 at any time by depressing the ESC key (decode block 464). If the student confirms he wishes to leave the SoundSort function, a return to main routine block 170, FIGURE 6A is performed (blocks 468, 470). On the other hand, if the student does not confirm he wishes to leave the SoundSort function, he is returned back to the get key block 462 to select the next function (decision block 468).
If the student depresses the F3 key in the preferred embodiment, system 50 plays back the phrase associated with the symbol the cursor is presently pointing to (block 478). Upon depressing the F2 key, system 50 displays on display 58 a prompt which prompts the user for "starting point?" (block 480), and then waits for the user to input another selection (blocks 482, 484). By striking the F2 key, the student may playback the entire sequence of phrases in their correct order -- or can select a portion of the correctly ordered sequence of phrases to hear the audio corresponding to. After depressing the F2 key, if the student again depresses F2 (or depresses the ENTER key), system 50 plays back a phrase sequence beginning at a portion of the sequence pointed to by a pointer called a "start point" which is initially set at the beginning of the correctly ordered phrase sequence (but may be changed by the student) (block 494). Once the phrase sequence playback has begun, it will continue to the end of the sequence of phrases or until the student again hits the F2 (decision 496). If, instead of striking the F2 key or the ENTER key, the student depresses the left arrow or right arrow keys in the preferred embodiment, the effect will be to change the value of start point. In particular, if the student depresses the right arrow key, the start point pointer value is advanced in the phrase sequence and its new value is displayed (blocks 486, 488). Similarly, by depressing the left arrow key, the start point pointer value is retracted toward the begin-
ample, rather than the entire sequence (which may be of arbitrary length).
In the preferred embodiment, the left arrow and right arrow cursor control keys only have the effect of changing the phrase sequence playback beginning point after the F2 key has been depressed. Otherwise, they control movement of the displayed symbols on display screen 58. Depressing the right arrow key causes system 50 to first determine whether the cursor is in the left column or the center column (decision blocks 498, 502, respectively). The objective in the SoundSort function is to move the symbols displayed in the left column further to a center column -- and then to move those symbols into a right-hand column in the correct order based upon aural clues. If the cursor is in the left column (and thus is pointing to a symbol displayed in the left column), and the user depresses the right arrow key, the symbol pointed to by the cursor is removed from the left column and displayed in the center column (block 500), using conventional screen control techniques. Similarly, if the cursor is pointing to a symbol in the middle column and the user depresses the right arrow key, the symbol is moved to a right column position so long as there isn't already a symbol displayed immediately to the right in the right column (decision block 502, 506). Striking the left arrow key permits the student to move a symbol in the right column back to the middle column or from the middle column back to the left column (blocks 510-520).
In the preferred embodiment, the student changes the order of symbols by moving them to the center column and then moving them vertically before placing them into "slots" in the right column (these slots correspond to entries in an array maintained in memory). Upon depressing the up arrow key, for example, system 50 first determines whether the cursor is pointing to a symbol in the center column (decision block 522). If so, the pointed to symbol is moved up one row in the center column (unless the symbol is already in the top row in which case it is wrapped around to the bottom) (blocks 524-528). If the cursor points to a symbol in the left-hand or right-hand column, on the other hand, the cursor is moved up one row (block 530) and then system 50 determines whether the new cursor position is on the letter in the left or right column (decision block 532). If the cursor does not point to a letter in its new position, it is either moved up or wrapped around (decision block 534, 536).
Similar symbol movement occurs upon de- ning of the phrase sequence (blocks 49O, 492). This pressing the down arrow key (blocks 54O-56O). allows the student to concentrate on the last portion Depressing the space bar in the preferred em- of the correctly ordered phrase sequence, for exam- bodiment controls the cursor to move between left ple, (or on any portion of the phrase sequence since and right columns. For example, if there are symbols he can strike the F2 key to discontinue phrase se- __ displayed in both the left column and the right column quence playback) and is especially useful for long and the cursor is presently in the centercolumn, strik- phrase sequences since it permits the student to lis- ing the space bar will do nothing. Space only has an ten to three or four phrases in the sequence, for ex- effect ifthe cursor is in either the left or right column. _3
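The symbol movement rules described above can be illustrated with a minimal sketch. It is not the patent's implementation: the class name, the column constants and the grid representation are hypothetical, and two behaviours the text does not spell out are assumed here, namely that a left-to-center move is also refused when the target cell is occupied, and that a vertical move in the center column is refused when the destination row already holds a symbol.

```python
from dataclasses import dataclass, field

LEFT, CENTER, RIGHT = 0, 1, 2  # column indices (hypothetical names)


@dataclass
class SoundSortBoard:
    """Illustrative model of the three-column work screen (an assumption)."""
    rows: int
    cells: list = field(default_factory=list)  # cells[column][row] -> symbol or None

    def __post_init__(self):
        if not self.cells:
            self.cells = [[None] * self.rows for _ in range(3)]

    def _shift(self, col, row, step):
        """Move a symbol one column sideways if the neighbouring cell is free."""
        target = col + step
        if not 0 <= target <= RIGHT or self.cells[col][row] is None:
            return False
        if self.cells[target][row] is not None:  # assumed: never overwrite a symbol
            return False
        self.cells[target][row], self.cells[col][row] = self.cells[col][row], None
        return True

    def move_right(self, col, row):
        """Right arrow: left -> center, or center -> right-hand slot."""
        return self._shift(col, row, +1)

    def move_left(self, col, row):
        """Left arrow: right -> center, or center -> left."""
        return self._shift(col, row, -1)

    def _vertical(self, row, step):
        """Move a center-column symbol one row, wrapping at the top or bottom."""
        center = self.cells[CENTER]
        if center[row] is None:
            return None
        target = (row + step) % self.rows
        if center[target] is not None:  # assumed: occupied rows block the move
            return None
        center[target], center[row] = center[row], None
        return target

    def move_up(self, row):
        return self._vertical(row, -1)

    def move_down(self, row):
        return self._vertical(row, +1)


if __name__ == "__main__":
    board = SoundSortBoard(rows=4)
    board.cells[LEFT] = ["C", "A", "D", "B"]  # scrambled symbols in the left column
    board.move_right(LEFT, 0)                 # "C" goes to the center column
    row = board.move_up(0)                    # top row wraps around to the bottom
    board.move_right(CENTER, row)             # "C" drops into a right-hand slot
    print(board.cells[RIGHT])
```

Keeping the right column as a plain list of slots mirrors the remark that the slots correspond to entries in an array maintained in memory.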
In the preferred embodiment, the space bar will only move the cursor to columns where symbols are displayed. It only moves the cursor between the left and the right columns (and is ignored when the cursor is in the center column), and always results in the cursor pointing to the uppermost symbol in the new column (blocks 562, 564, 566).

Depressing the ENTER key controls system 50 to check the right column entries to determine which ones are correct and which ones are incorrect so that the student can monitor his progress. Upon depressing the ENTER key, system 50 examines the contents of the right column positions one at a time (block 568). If the student has moved a symbol into a certain position, system 50 compares that symbol with a symbol order string (array) it formed at block 458 indicating the correct order of the symbols (decision block 570). If the symbol under examination in the right-hand column corresponds to the symbol order (array) in the model symbol string, it is marked on the display as being correct (blocks 572, 576). If the symbol is incorrect, it is marked on the display as being wrong (e.g., by highlighting) (blocks 572, 574). After all of the right-hand column positions have been marked correct or incorrect by blocks 570-576, system 50 determines whether any one has been marked incorrect (decision block 578). If at least one symbol in the right-hand column is wrong (or missing), an elapsed time indicator is displayed along with new command lines and system 50 then waits for the student to depress a key (blocks 580, 582). Upon depressing a key, the work screen is restored to permit the student to continue moving the symbols (blocks 584, 462, 464). If, on the other hand, the student has successfully moved all of the symbols to the right-hand column in the correct order, an end message is displayed (block 586) and control returns to the main routine (blocks 170, 172).

STUDIO ROUTINE

FIGURES 10-16 describe utilities provided by the presently preferred exemplary embodiment of the present invention to permit a teacher to form and/or customize lessons and exercises for use by students. FIGURE 10 is a high-level flow chart-type diagram of the user interface presented to the teacher. Upon starting the studio routine in the preferred embodiment, the title screen is displayed along with options available to the teacher (block 600). In the preferred embodiment, the teacher may select between four different options: (1) an instruction display; (2) a Text Writer word processor-type function; (3) an AudioLab studio function; (4) exit. In the preferred embodiment, selecting option number 1 displays an instruction screen (block 604) in which the teacher is told about a suggested general methodology for using the AudioLab studio and Text Writer functions. Briefly, the teacher generally first uses the Text Writer function to type in one or more screens of text the students are to view on the screen during the lesson -- including the page one and page two screens described previously. The page one screen generally is (but need not be) the textual version of the recorded audio. The page two screen may be help or instructions. After the Text Writer function (block 606) has been used to input one or two screens of text, the AudioLab studio function is used to convert spoken audio into digitized speech phrase files stored on mass storage device 56 and to associate that recorded audio with the text previously entered using the Text Writer function. Specifically, the teacher first chooses recording text (block 608) and then may choose whether or not to include a page two help screen (blocks 610, 612). The teacher then marks and records phrases using speech processor 60 (block 614), and is finally permitted to select the student menu layout for the lesson (block 616).

In the preferred embodiment, in the Text Writer routine (block 606) a format different from ASCII is used for convenience and a utility is provided for converting from ASCII to the different format. Preferably, the text files created by the Text Writer program routine (block 606) are of limited length so that they can each fit on a single display screen (80x21).

FIGURE 11A is a detailed flow chart of exemplary program control steps performed by the studio routine shown in FIGURE 10. As mentioned previously, upon initiating this studio routine, the title screen is displayed (block 600) and then the keyboard input is decoded to allow the teacher to select one of four options (block 602). Instructions may be displayed, the Text Writer conventional word processor may be called (block 606) or the studio routine may be exited. Once the teacher has inputted one or two text screens using the Text Writer word processor, he may select the AudioLab studio routine to actually assemble the textual and audio components of a lesson (beginning at block 620). Upon selecting the AudioLab studio function, system 50 first calls a select file routine (named "selfile" in the preferred embodiment) to choose a main text format to be associated with the lesson. A detailed flow chart of the select file routine 620 is shown in FIGURE 12.

Upon initiating the select file routine 620, system 50 first displays a command line (block 622) and then calls a routine called "choose drive" 624 to permit the teacher to select which of several drives he wishes to use. As is well known, personal computer 52 may have one or more hard disk drives and one or more floppy diskette drives (all of which are shown schematically in FIGURE 1 as mass storage block 56). Generally, the teacher wishes to store lessons on floppy diskettes so that they can be easily copied and distributed to students. The choose drive routine 624 (shown in detail in FIGURE 13) uses conventional MS/DOS utilities in the preferred embodiment to count the number of disk drives (block 626), then
clears the display screen 58 (block 628) and then displays a window setting forth the drive designations of each of the existing mass storage drives (blocks 630, 632). The personal computer 52 keyboard buffer is then cleared (block 634) and system 50 prompts for the teacher's choice (blocks 636-642). Depressing the F1 key displays help text (blocks 644-648). If the teacher depresses the up arrow or down arrow cursor control keys, different drive designation options displayed by display 58 are highlighted so as to permit the teacher to alter the drive selection in a conventional manner (blocks 650-660). If the teacher keys in a valid drive designation letter rather than using the cursor controls, that value is selected as the designated drive (block 662). Otherwise, depressing the ENTER key causes the drive designation selected by the cursor control keys to be selected. Upon depressing the ENTER key, system 50 first determines whether the A or B floppy diskette drives have been chosen (decision block 664). If not, then a hard disk has been selected and the hard disk designation is returned (block 666). If a floppy diskette drive has been selected (decision block 664), the keyboard buffer is cleared once again (block 668) and the system prompts the teacher to insert a diskette in the diskette drive (blocks 670, 672). Striking the F1 key at this point displays help text (blocks 674-680). If the teacher depresses the ESC key, the routine is aborted (decision block 682, block 684). If any other key is depressed, routine 624 returns to FIGURE 12 block 686, with the A or B drive designation selected (block 666).

Referring once again to FIGURE 12, system 50 then determines the reason why the choose drive routine 624 was exited. If the reason was that the teacher depressed the ESC key at block 638, control returns to FIGURE 11A block 620 with a null return string (decision block 686, block 688). If, on the other hand, the teacher depressed the ESC key at FIGURE 15 block 682, routine 624 is called again to permit choosing of another drive (decision block 690).

In the preferred embodiment, all files associated with a particular lesson are preferably collected within a common subdirectory. The teacher may create the subdirectory before initiating the FIGURE 10 routine using conventional DOS utilities, or a conventional create directory routine may be included in the select file routine to permit the teacher to create a subdirectory on the file.

Once a valid subdirectory exists on mass storage device 56, system 50 permits the teacher to select between different subdirectories that may exist if multiple subdirectories exist. Specifically, the flag USEDIRS is set by system 50 whenever at least one user subdirectory exists on mass storage device 56 (blocks 746, 744). A flag CHDRIVES is set to eventually require the user to choose another drive using the choose drive routine 624 (blocks 748-752) if no valid subdirectory exists. Otherwise, the flag CHDIRS is set to 0 (block 752) and decision block 754 determines whether it is necessary for the teacher to select between user subdirectories (e.g., if more than one subdirectory exists). If subdirectory selection is required, a routine DIRMENU 756 is called to permit subdirectory selection. A detailed flow chart of this routine 756 is shown in FIGURE 14.

The DIRMENU routine 756 first uses conventional DOS utilities to find and record all subdirectories on mass storage device 56 (block 758). This option is not available in Studio. The new lesson is always saved on a floppy disk (drive A). So long as additional subdirectories can be created, the teacher is given the option to create a new subdirectory for the new lesson (blocks 760, 762). Next, instructions and a list of all of the subdirectory names existing on mass storage device 56 are displayed (block 764), and system 50 then awaits user input (blocks 756-774). The cursor control keys are used to highlight different displayed subdirectory names in a conventional manner (blocks 776-782), and a conventional help facility is also provided (blocks 784-790). Upon depressing the ENTER key, system 50 determines whether the teacher selected the option to create a new subdirectory (decision block 792), and if so, may create a new subdirectory in an entirely conventional manner using the DOS "MKDIR" utility or the like. Otherwise, the subdirectory name is stored (block 794) and a return to routine 620 is performed (block 796).

The teacher may depress the ESC key at any time to select another drive or another diskette (and thus call the choose drive routine 624) (decision block 798). Otherwise, the teacher is permitted to select between files within the selected subdirectory using the FILEMENU routine 804. A detailed schematic flow chart of the exemplary program control steps related to the FILEMENU routine 804 is shown in FIGURES 15A-15B.

Referring now to FIGURE 15A, routine 804 first scans the selected subdirectory (using conventional DOS utilities) for all files having the extension ".tlt" (block 806). If more such files exist than will fit on the display, a warning message is displayed (blocks 808-812). If no such files exist, system 50 determines whether the teacher has mistakenly removed the diskette from the drive (blocks 814, 816) and, if he has, displays an error message (block 818). If no such files exist and the diskette is still within the drive, an error message indicating that no text files exist is displayed (block 820).

If decision block 814 determines that some ".tlt" files do exist, system 50 displays instructions and a list of the ".tlt" files (block 820) and then permits the teacher to select one of the listed files (blocks 822-828). By manipulating the cursor control keys, the teacher can highlight any selected file name displayed on display 58 (blocks 830-836), and may select the highlighted file by depressing the ENTER key
(blocks 838, 840). Depressing the ESC key exits routine 804 without selecting a file name (block 842).

Referring now once again to FIGURE 12, if the teacher failed to select a file name (determined by decision block 844), the flags are set appropriately (block 846) to permit the teacher to select another subdirectory (decision block 848, blocks 752, 756). Otherwise, the selected file name is returned at block 850 to FIGURE 11A block 852.

FIGURE 11A block 852 reads the text of the selected file and displays it on display 58 (blocks 852, 854). System 50 then asks the teacher whether he wishes to accept the text (blocks 856, 858). If he does not accept the text, routine 620 is called again to permit him to select another file. Otherwise, the teacher is prompted to enter a new lesson title (block 860) and is asked whether he wishes to include a second page of help or instructions in the lesson (blocks 862, 864). If a page two screen is to be included, routine 620 is called to permit selection of the file containing the page two text and the teacher is given the opportunity to view and accept this page two text (blocks 866, 872). The second page need not necessarily be related to either the main page of text or the recorded speech -- permitting great flexibility to the teacher in creating lesson formats. However, the page two screen typically is supplementary textual material or instructions which may be displayed by the student upon depressing a key.

Once both the main text screen and the page two text screens have been selected and accepted by the teacher, system 50 prompts the teacher to insert a data disk (blocks 874, 876) which should preferably be blank in the preferred embodiment to provide sufficient room (e.g., minimum 360K) for storing digitized speech corresponding to the lesson being created (blocks 878-884). System 50 in the preferred embodiment thus insists that a blank formatted diskette is used for each lesson to ensure that recordings are transferred whole onto the diskette and thus decrease access time (by eliminating searches for different related speech files).

System 50 then displays once again the main text screen selected by routine 620 and accepted at blocks 856, 858 (block 886) and waits for the teacher to select a string of text on the display (blocks 888, 890). Briefly, in the preferred embodiment, the teacher first selects a phrase using the cursor control keys and then records digitized speech corresponding to the phrase using the F4 key. The teacher may re-record a given phrase if necessary. Depressing the ENTER key causes system 50 to move on to the next phrase. The "<" and ">" keys may be used to skip over displayed words the teacher does not wish to record. Each recorded phrase may be up to ten seconds long.

Blocks 892, 894 are used to skip over displayed words (so that not all words of the main text screen need to correspond to a recorded phrase). The left arrow and right arrow keys are used to lengthen and shorten the currently selected phrase, with the selected phrase being highlighted on the display to permit the teacher to see what phrase he has selected (blocks 896-902). Depressing the F3 key causes system 50 to determine whether a digitized speech file corresponding to the currently selected phrase has already been recorded (decision block 904). If one has been recorded, that recording can be played (block 906) -- allowing the teacher to hear what he has recorded corresponding to the phrase. Depressing the F4 key allows the teacher to record (or re-record) up to ten seconds of digitized speech corresponding to the selected phrase (blocks 920-926). Once a phrase has been recorded, it is stored on the blank formatted diskette and the teacher is given an indication of the amount of free disk space remaining in seconds (blocks 928, 930). During recording, volume levels are continuously displayed and recording time is also displayed in seconds. The teacher speaks into the headset 62a microphone (or separate microphone 62b), and speech processor 60 converts his speech into digitized data which is stored in the virtual disk of personal computer 52. Once the teacher again depresses the space bar (or there is a time out) (decision block 924), the information stored in the virtual disk is transferred to the floppy diskette or hard disk. This procedure greatly increases sound quality because final storage does not take place on an interrupt driven basis.

In the preferred embodiment, each digitized speech phrase is provided with a unique name. Specifically, each digitized phrase file is automatically numbered in the preferred embodiment with sequentially ascending numbers (e.g., text1.SO, text2.SO, etc.) and the start and end of each text phrase is flagged in the main text file corresponding to the lesson. For example, a character sequence such as "textstart(1)" may be added to the text file at the point the teacher marked as corresponding to the first recorded phrase, and a character sequence such as "textend(1)" may be added to the text file at the point the teacher selected as the end of the corresponding text phrase. In this way, a linkage is established between teacher-selected text strings within the main text file and discrete files stored on mass storage device 56 containing corresponding digitized text -- with a one-to-one correspondence generally existing between text strings and digitized sound files in the preferred embodiment. The backspace key allows the teacher to easily move to a previously recorded phrase in order to re-record it or the like (blocks 932, 934). Depressing the ENTER key causes system 50 to go on to the next phrase if the previous phrase has been recorded (decision block 936) -- or if all phrases have been recorded, to move on to block 938 (which permits the teacher to listen to the entire recorded speech on an uninterrupted basis or to depress the
ENTER key to save the lesson (see blocks 940-944)).

Finally, now that the teacher has stored the lesson he can preprogram which of the student functions will be available to the students for a particular lesson. In the preferred embodiment, the AudioLab function is always available to the student. However, certain exercises may not be suitable for the SoundSort or AudioWrite function. At block 946, system 50 prompts the teacher to choose student functions that should be available to students and to delete functions that should not be available to the student. Blocks 950-958 result in displaying the same main menu seen by the student and permitting the teacher to delete one or both of the AudioWrite or SoundSort options or to undelete those options (blocks 956, 954, respectively). Depressing the ENTER key saves the student selection data (block 958) and returns control to FIGURE 11A block 602 to permit the teacher to either exit the studio routine or to work on creating another lesson.

The present invention thus provides an extremely flexible environment for creating preprogrammed audio visual lessons in which both the audio portion and the textual portion can be programmed by the teacher. Once lessons have been constructed in this fashion, they can be used by the student in a variety of different ways to develop different skills. For example, the AudioLab student function works on reading and listening comprehension; the AudioWrite function concentrates on listening, comprehension and writing skills; while the SoundSort function concentrates on listening, comprehension, grammatical and other skills. Since the same lesson can be used for various functions, the burden on the teacher is eased, while great flexibility is maintained. All of these features are provided by a truly interactive language learning system in which the student is exposed to both audio and visual stimuli and is capable of listening to recorded model digitized speech and/or to his own attempts to pronounce the speech using self-correction methodology.

While the invention has been described in connection with what is presently considered to be the most practical and preferred embodiments, it is to be understood that the invention is not to be limited to the disclosed embodiments, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims.

Claims

1. An interactive language learning system comprising
storing means (56) for storing in digital form data representative of a model of speech version of a passage of language and for storing data representative of user input speech,
a display (58) for displaying visual information corresponding to the passage, and
selecting means (54) operatively connected to said display (58) and to said storing means and operative to provide a comparison of the user input speech with the stored model version of the passage of language,
characterised in that
said storing means (56) is arranged to store a digitised speech version of the passage of language and also to store a digital data text version of said same passage,
said selecting means (54) is operable by a user to select a portion of said passage and to cause text corresponding to said selected passage portion to be displayed on said display (58) based on said stored digital data text version, and in that the system includes speech processing means (60) for:
selecting the portion of said stored digitised model speech version corresponding to said selected portion of said passage,
converting said selected digitised model speech version portion to audio signals for use in generating speech sounds,
converting audio signals representing user input speech into digitised signals representing said user speech input and
subsequently reconverting said digitised speech signals representing said user input speech to audio signals.

2. A system according to claim 1, wherein:
said system further includes a transducer (62b) which converts user speech to audio signals;
said speech processing means (60) includes means connected to said transducer (62b) for converting said audio signals to digitized speech signals and for temporarily storing said digitized speech signals;
said display (58) also displays a symbol; and
said system further includes user input means (54) for permitting said user to (i) select said portion of said passage by manipulating the position of said symbol displayed by said display with respect to said displayed text, and (ii) control said speech processing means to rapidly alternate (a) converting said temporarily stored digitized speech signals representing his own speech to audio signals, and (b) converting said digitized speech signals corresponding to said selected portion of said passage to audio signals so as to alternately generate sounds corresponding to said user's speech and sounds corresponding to said stored digitized speech version.
3. A system according to claim 1, wherein:
said speech processing means (60) generates processor interrupts; and
said system further includes interrupt means for reading said digitized speech version from said storing means for conversion to audio signals by said speech processing means in response to said generated processor interrupts.

4. A system according to claim 1, wherein:
said selecting means (54) includes means for selecting the position and length of a portion of said passage;
said speech processing means (60) includes further selecting means for selecting only those portions of said stored digitized speech version corresponding to said selected passage portion; and
said speech processing means also includes means for converting only said selected stored digitized speech version portions to audio signals.

5. A system according to claim 1, wherein said first-mentioned selecting means (54) comprises cursor control means manipulable by said user for selecting portions of said text displayed by said display and for thereby selecting corresponding portions of said stored digitized speech version for conversion to audio signals.

6. A system according to claim 5, further including means connected to said cursor control means (54) and to said display (58) for causing said selected text portions displayed by said display to have a different appearance than the non-selected displayed text portions.

7. A system according to claim 5, wherein said system further includes text display selection means, manipulable by said user and operatively connected to said display, for alternately selecting: (a) display of only said selected text portions, and (b) display of the entire textual version of said passage including said selected text portion.

8. A system according to claim 1, wherein said speech processing means (60) includes means for converting between audio signals and adaptive differential pulse code modulation encoded digitized speech signals representing said audio signals.

9. An interactive language learning system according to claim 1, wherein said system can provide in digital form data representative of speech signals, said digitized speech signals representing a sequence of spoken phrases having initial order;
and said system includes
re-ordering means (52) for re-ordering said plural phrases into a sequence having an order different from said initial order;
said selecting means (54) operatively connected to said re-ordering means and operable by a user for permitting said user to further re-order said plural phrases into a user-specified order; and
said speech processing means (60) is connected to said re-ordering means (52) and is responsive to said digitized speech signals, for generating audible versions of said phrases so as to provide audible cues to said user.

10. An interactive language learning system according to claim 9, further including a display (58) for displaying symbols representing said plural phrases in at least said user-specified order.

11. An interactive language learning system according to claim 9, including symbol display means (58) connected to said re-ordering means (52) for associating a symbol with each of said plural phrases and for presenting a display of said symbols in said re-ordered sequence;
testing means connected to said input means for comparing the user-selected re-ordered sequence with said initial order.

12. A system according to claim 11, wherein:
said selecting means (54) includes means for selecting any one of said symbols; and
said speech processing means (60) is also connected to said user selecting means and to said symbol display means and includes means for converting the stored digitized speech associated with said selected symbol to audio signals.

13. A system according to claim 11, wherein:
said symbol display means (58) includes:
a left-hand display column which displays said symbols in said first-mentioned re-ordered sequence, and
a right-hand display column which displays said symbols in said user-specified further order; and
said selecting means (54) includes means for moving said symbols from said left-hand display column to said right-hand display column in response to user commands.

14. A system according to claim 13, wherein said speech processing means (60) is also connected to said user selecting means (54) and includes means for generating sounds corresponding to said phrases in said initial order.
gocr ep0461127-019
15. A system according to claim 11, wherein said speech processing means (60) is also connected to said user selecting means (54) and includes means for selecting a starting point within said re-ordered sequence, said selected starting point being different from the beginning of said sequence, and for converting said corresponding digitized speech signals to audio signals in said initial order of said phrases beginning from said starting point so as to provide audible speech corresponding to less than said entire sequence of phrases.

Patentansprüche (German claims, in English translation)

1. An interactive language learning system comprising storage means (56) for storing, in digital form, data representing a model of the spoken version of a language text and for storing data representing user input speech, a display (58) for displaying visual information corresponding to the text, and selecting means (54), operatively connected to said display (58) and to said storage means, for providing a comparison of the user input speech with the stored model version of the language text, characterized in that said storage means (56) is arranged to store a digitized speech version of the language text and also to store a digital data text version of said same text, said selecting means (54) is operable by a user to select a portion of said text and to cause text corresponding to the selected text portion to be displayed on said display (58) on the basis of said stored digital data text version, and in that the system comprises speech processing means (60) for: selecting the portion of said stored digitized model speech version corresponding to said selected portion of said text; converting said selected digitized model speech version portion to audio signals for use in generating speech sounds; converting audio signals representing the user input speech into digitized signals representing said user speech input; and subsequently converting said digitized speech signals representing said user input speech back into audio signals.

2. A system according to claim 1, wherein: said system further includes a transducer (62b) which converts user speech into audio signals; said speech processing means (60) includes means, connected to said transducer (62b), for converting said audio signals into digitized speech signals and for temporarily storing said digitized speech signals; said display (58) also displays a symbol; and said system further includes user input means (54) for allowing said user (i) to select said portion of said text by manipulating the position of the symbol displayed by said display relative to said displayed text, and (ii) to control said speech processing means to alternate rapidly between (a) converting said temporarily stored digitized speech signals representing his own speech into audio signals and (b) converting the digitized speech signals corresponding to said selected portion of said text into audio signals, so as to alternately generate sounds corresponding to said user's speech and sounds corresponding to said stored digitized speech version.

3. A system according to claim 1, wherein said speech processing means (60) generates processor interrupts; and said system further includes interrupt means for reading said digitized speech version out of said storage means, for conversion to audio signals by said speech processing means, in response to said generated processor interrupts.

4. A system according to claim 1, wherein: said selecting means (54) includes means for selecting the position and length of a portion of said text; said speech processing means (60) includes further selecting means for selecting only those portions of said stored digitized speech version which correspond to said selected text portion; and said speech processing means also includes means for converting only said selected stored digitized speech version portions to audio signals.
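Claim 4 recites selecting the position and length of a text portion and converting only the corresponding parts of the stored digitized speech. One way this could be realised is sketched below in Python, assuming a per-word alignment table that links character offsets in the text to byte offsets in the speech data. The `Segment` type, the function `speech_for_selection`, and the alignment table itself are hypothetical; the claims do not state how the correspondence is stored.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    text_start: int    # character offset of the word in the text
    text_end: int      # one past the last character of the word
    audio_start: int   # byte offset of the word in the digitized speech
    audio_end: int     # one past the last byte of the word


def speech_for_selection(segments, speech, sel_start, sel_length):
    """Return only the speech bytes for words overlapping the selected text."""
    sel_end = sel_start + sel_length
    hits = [
        s for s in segments
        if s.text_start < sel_end and s.text_end > sel_start
    ]
    if not hits:
        return b""
    first = min(s.audio_start for s in hits)
    last = max(s.audio_end for s in hits)
    return speech[first:last]
```

A cursor-based selection would supply `sel_start` and `sel_length`, and the returned bytes would then be passed to whatever decoder turns the stored speech into audio signals.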
gocr ep0461127-020
5. A system according to claim 1, wherein said first-mentioned selecting means (54) comprises cursor control means, manipulable by said user, for selecting portions of said text displayed by said display and for thereby selecting corresponding portions of said stored digitized speech version for conversion to audio signals.

6. A system according to claim 5, further including means, connected to said cursor control means (54) and to said display (58), for causing said selected text portions displayed by said display to have a different appearance than the non-selected displayed text portions.

7. A system according to claim 5, wherein said system further includes text display selection means, manipulable by said user and operatively connected to said display, for alternately selecting (a) display of only said selected text portions, and (b) display of the entire text version of said text including said selected text portion.

8. A system according to claim 1, wherein said speech processing means (60) includes means for converting between audio signals and adaptive differential pulse code modulation encoded digitized speech signals representing said audio signals.

9. An interactive language learning system according to claim 1, wherein said system can provide, in digital form, data representing speech signals, said digitized speech signals representing a sequence of spoken phrases having an initial order, and said system includes: re-ordering means (52) for re-ordering said plural phrases into a sequence having an order different from said initial order; said selecting means (54), operatively connected to said re-ordering means and operable by a user, for allowing said user to further re-order said plural phrases into a user-specified order; and wherein said speech processing means (60) is connected to said re-ordering means (52) and is responsive to said digitized speech signals to generate audible versions of said phrases so as to provide audible cues for said user.

10. An interactive language learning system according to claim 9, further including a display (58) for displaying symbols representing said plural phrases in at least said user-specified order.

11. An interactive language learning system according to claim 9, including: symbol display means (58), connected to said re-ordering means (52), for associating a symbol with each of said plural phrases and for presenting a display of said symbols in said re-ordered sequence; and testing means, connected to said input means, for comparing the user-selected re-ordered sequence with said initial order.

12. A system according to claim 11, wherein said selecting means (54) includes means for selecting any one of said symbols; and said speech processing means (60) is also connected to said user selecting means and to said symbol display means and includes means for converting the stored digitized speech associated with said selected symbol to audio signals.

13. A system according to claim 11, wherein said symbol display means (58) includes: a left-hand display column which displays said symbols in said first-mentioned re-ordered sequence, and a right-hand display column which displays said symbols in said user-specified further order; and wherein said selecting means (54) includes means for moving said symbols from said left-hand display column to said right-hand display column in response to user commands.

14. A system according to claim 13, wherein said speech processing means (60) is also connected to said user selecting means (54) and includes means for generating sounds corresponding to said phrases in said initial order.

15. A system according to claim 11, wherein said speech processing means (60) is also connected to said user selecting means (54) and includes means for selecting a starting
gocr ep0461127-021
39 EP O 46_ _2l B_ 4O punktes in der besagten ungeordneten Folge ent- 2. Système selon la revendication 1 , dans lequel. Ie- hält, wobei sich der besagte ausgewählte An- dit système comporte en outre un transducteur fangspunkt vom Anfang der besagten Folge un- (62b) qui convertit la voix d'utilisateur en signaux terscheidet, und zum Verwandeln der besagten audio; entsprechenden digitalisierten Sprachsignale in _ ledit moyen de traitement vocal (6O) Tonsignale in der besagten Anfang