Europäisches Patentamt
(19) European Patent Office
Office européen des brevets
(11) EP 0 522 591 B1
(12) EUROPEAN PATENT SPECIFICATION
(45) Date of publication and mention of the grant of the patent: 22.03.2000 Bulletin 2000/12
(51) Int. Cl.7: G06F 17/30
(21) Application number: 92111820.4
(22) Date of filing: 10.07.1992
(54) Database retrieval system for responding to natural language queries with corresponding tables
Datenbankauffindungssystem zur Beantwortung natursprachlicher Fragen mit dazugehörigen Tabellen
Système de recouvrement de données pour répondre aux interrogations en langage naturel avec des tables correspondantes

Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).

Printed by Xerox (UK) Business Services 2.16.7/3.6
Description

[0001] This invention relates to an information retrieval system for retrieving information from a database, comprising an analysis unit using a dictionary for analyzing a natural language query.

[0002] Fig. 1 is a diagram illustrating a conventional database retrieval system for retrieving data from a table formatted database in response to a natural language query. A natural language query is a request for data that is set forth in a natural language, such as English, Japanese, French, etc. The illustrated database retrieval system is described in more detail in "Kinukawa, A Natural Language Interface Processor Based on the Hierarchical-Tree Structure Model of Relation Table, [journal title illegible in source], Vol. 27, No. 5 (1986), pp. 499-509." This system is designed to process queries in Japanese. For the examples described below, the English translations of Japanese words and phrases are provided in parentheses.

[0003] The database retrieval system shown in Fig. 1 includes an input unit 2, such as a keyboard, for entering a natural language query 1. The system also includes a communications controller 3 for forwarding the natural language query 1 to a retrieval sentence analysis unit 5. The retrieval sentence analysis unit 5 processes the input query 1 to produce a hierarchical model of the query. The system additionally includes a word dictionary 4, which is constructed on the basis of the content of a database 9, and a hierarchical table model 6 for hierarchically expressing the content of the database. The dictionary 4 and hierarchical table model 6 are used by the retrieval sentence analysis unit 5 in analyzing the natural language query 1. The retrieval sentence analysis unit 5 performs both vocabulary analysis and syntactic/semantic analysis on the natural language query 1. The retrieval sentence analysis unit 5 produces a retrieval sentence analysis result 7 as output that is forwarded to a retrieval processing unit 8. The retrieval processing unit 8 uses the retrieval sentence analysis result 7 to retrieve data from the database 9.

[0004] The depiction of the conventional database retrieval system shown in Fig. 1 is a functional description intended to show the interaction between the respective components of the system. The components shown in Fig. 1 are, in fact, implemented in a data processing system 10, such as that shown in Fig. 2. The data processing system 10 includes a central processing unit (CPU) 11, a memory 12, the communications controller 3, an output device 17 and the input unit 2. Each of these components is coupled to a bus 13. The retrieval sentence analysis unit 5 and the retrieval processing unit 8 are implemented in software that is executed by the CPU 11 (Fig. 2). The software is stored in the memory 12. The word dictionary 4 (Fig. 1), the hierarchical table model 6 and the database 9 are stored within the memory 12 (Fig. 2).

[0005] Fig. 3a provides a more detailed depiction of an example of the word dictionary 4. As this figure shows, the dictionary includes a plurality of entries, and each entry includes three fields. The header field identifies the term or phrase associated with the entry, whereas the part of speech field identifies the part of speech of the term or phrase. Lastly, the type field identifies the type of term or phrase that is used. In the example shown in Fig. 3a, the types are "item name" and "data expression word".

[0006] Fig. 3b provides a more detailed depiction of the hierarchical table model 6. This model 6 sets forth the hierarchical relationship between the respective tables. Each table specifies a number of attributes. For instance, table 14 includes the attributes of "date", "commodity code", "commodity group code", and "sales". The "commodity code" attribute is also an attribute in table 16, which is hierarchically related with table 14. Similarly, the attribute of "commodity group code" is an attribute of both table 16 and table 18. The table 14 is a higher order table than tables 16 and 18. Moreover, table 16 is a higher order table than table 18. This hierarchical table model is consistent with the relational model for data proposed by E. F. Codd in "A Relational Model of Data for Large Shared Data Banks," Communications of the ACM, June 1970, pp. 377-387.

[0007] Fig. 3c provides an illustration of the database 9. The database 9 includes table A, table B and table C. Each of the tables A, B, C includes different types of information. For example, table A contains sales information, table B includes commodity information, and table C includes commodity group information. These tables are used in conjunction to obtain information requested by the natural language query 1 (Fig. 1).

[0008] Operation of the system shown in Fig. 1 will now be described. Initially, a natural language query 1 is entered using the input unit 2. When a keyboard is used as the input unit 2, the query is entered simply by typing the query. The query 1 is then passed to the conversation control unit 3, which forwards the query to the retrieval sentence analysis unit 5. The retrieval sentence analysis unit 5 parses the query into a hierarchical structure of words or phrases that is output as the retrieval sentence analysis result 7. In processing the query, the retrieval sentence analysis unit 5 first chops the query into words or phrases. In the present example, the query is chopped into the phrases "chokoreeto rui" and "uriage". The terms "no" and "ha" are zyoshi, whose significance will be described in more detail below.

[0009] Once the query has been divided into words or phrases, vocabulary analysis is performed on the words or phrases to determine what each word or phrase in the query signifies. In performing such vocabulary analysis, the retrieval sentence analysis unit 5 references the word dictionary 4 to determine that "chokoreeto rui" (chocolates and the like) is a data expression word (see Fig. 3a). The retrieval sentence analysis unit 5 also determines that "uriage" (sales) is an attribute item
name, respectively. The word dictionary 4 indicates that both of these phrases are nouns. The dictionary 4 is not referenced for the zyoshi "ha" and "no".

[0010] Syntax and semantic analysis is then performed on the query. In particular, syntactic analysis is performed to process the syntax of the query in order to understand the role each phrase serves in the query. Semantic analysis, on the other hand, is performed to understand what is being requested by the query.

[0011] Subsequently, semantic analysis is performed to relate the meaning of the query to the database entries. The semantic analysis relies on the hierarchical table model 6 (see Fig. 3b) to ascertain that "chokoreeto rui" (chocolates and the like) is an attribute data expression word of a commodity group in table 18 (i.e., table C in Fig. 3c) and "uriage" (sales) is an item name in the table 14 (i.e., table A in Fig. 3c). Moreover, the hierarchical table model 6 (Fig. 3b) indicates that table 14 is a higher order table than table 18. Since the attribute item appearing in the low order table is a noun, and a zyoshi "no" is added thereto, it is recognized that the attribute "chokoreeto rui" in table 18 modifies the attribute "uriage" (sales), which appears in a higher order table 14. Using these results, a retrieval formula "retrieval condition: (commodity group name = chokoreeto rui), retrieval object: uriage" is obtained and is output from the retrieval sentence analysis unit 5. Subsequently, retrieval from the database 9 is performed by the retrieval processing unit 8 to obtain the desired data.

[0012] Figs. 4a, 4b and 4c show dictionaries used in a second conventional database retrieval system, as disclosed in Japanese Patent Laid-Open Publication No. 59-99539. In these dictionaries, information on column name in a file, information on data item name, and information on a file name that possesses a common column name or data name, are stored according to file names of a data file that is contained in a database. Fig. 4a represents a dictionary in which one of the database files contains the column name of a file. The dictionary also holds information regarding the order in which the column is contained in the file and additionally holds information regarding synonyms of the column name (i.e., file numbers and column attribute numbers of columns that are synonymous with the named column). Fig. 4b shows an analogous dictionary in which one of the files contains a data column name, and the dictionary stores a position at which the named column is contained in the file. Lastly, the dictionary stores information regarding synonyms of the data column name. Fig. 4c shows a dictionary holding information as to semantically identical data columns that are connected as synonyms.

[0013] Fig. 5 is the designated format for input queries for the second conventional system. This format requires that queries be entered as a number of entries, wherein each entry includes two fields: a noun field and a particle or auxiliary field. Thus, for the example query 1 (Fig. 1) used in the discussion of the first conventional system, the input query for the second conventional system would be as follows. The first noun field would be entered as "chokoreeto rui" and the corresponding particle field would be entered as "no". Further, the second noun field would be entered as "uriage" and the particle field would be entered as "ha".

[0014] In this second conventional system, queries in a natural Japanese format cannot be analyzed. Likewise, the retrieval object is determined in view of the restriction of the designated format shown in Fig. 5. A pertinent data file may, thus, be accessed only by limited terminology including synonyms recorded in the dictionaries.

[0015] In the first conventional information retrieval system described above, it is necessary to have previously constructed a hierarchical table model. Since, however, in general, it is not always possible to place the content of a database into a hierarchy, input sentences which do not fall under the defined hierarchical structure cannot be processed. Further, there is no flexibility in receiving natural language phrases or words, such as "sengetsu" (last month), which are not in the database. The system is limited solely to the phrases included in the database. Still further, no information is provided on "zyoshi" (particles). Thus, there is also the problem that the omission of a "zyoshi" cannot be detected.

[0016] In addition, when there is an ambiguous word (for example, time periods or seasons), syntactic analysis is impossible unless the definition of the ambiguous word is recorded in detail. In some cases, each interrogator must record the definition on an individual basis according to his usage of the ambiguous term.

[0017] Information retrieval is performed for each of the items recorded in a file. Thus, an answer cannot be obtained for a question in which a plurality of files are retrieved as a result of analyzing the input sentence and in which it is necessary to process such a retrieval result to obtain a final result.

[0018] The foregoing problems in the prior art are overcome by the present invention of an information retrieval system as defined in claim 1. The information retrieval system for retrieving information from a database comprising an analysis unit using a dictionary for analyzing a natural language query according to the present invention for this reason is characterized by said analysis unit being a parser for parsing said natural language query into its constituent parts to determine a syntax analyzing result as to the construction of the query; said dictionary including a column for a semantic ID which determines the semantic meaning of terms of said constituent parts in a manner which can be understood by the database; virtual tables for identifying the terms of said constituent parts in the database, each term being associated with one or more virtual tables, said virtual tables accounting for particles that modify the parts; a collating unit for preparing a database retrieval formula from the syntax analysis result by
selecting a virtual table which a term has in common with another term of the query; and a retrieval execution unit for retrieving data from the database on the basis of said database retrieval formula.

[0019] The information retrieval system may also include an additional table for converting an undetermined value phrase in the natural language query into a determined value phrase in the database based on the syntax analysis result. Still further, the information retrieval system may include a terminology dictionary for identifying entries in the virtual table that are to be used in converting phrases of the natural language query. The dictionary includes words representing times and the dictionary is used by the parser in obtaining the syntax analysis result. When the terminology dictionary is used, the system may also include a time interval definition table in the virtual table for defining dates corresponding to words representing time. Lastly, the system may include a database retrieval formula conversion unit for generating a formula in a database retrieval language from the database retrieval formula.

Fig. 1 is a block diagram of a first conventional database retrieval system illustrating the processing performed by the system;
Fig. 2 is a block diagram of a data processing system suitable for implementing the first conventional system;
Fig. 3a is a more detailed depiction of the word dictionary 4 of Fig. 1;
Fig. 3b is a more detailed depiction of the hierarchical table model 6 of Fig. 1;
Fig. 3c is a more detailed depiction of the database 9 of Fig. 1;
Figs. 4a - 4c illustrate dictionaries in a second conventional database retrieval system;
Fig. 5 illustrates the input format for queries with the second conventional database retrieval system;
Fig. 6 is a block diagram of an embodiment of the present invention illustrating the processing performed by the embodiment;
Fig. 7 is a more detailed depiction of the terminology dictionary 26 of Fig. 6;
Figs. 8a - 8c are more detailed depictions of tables held in the virtual table 28 of Fig. 6;
Fig. 9 is an illustration of a syntax tree that is output by the parser 22;
Fig. 10 is a flowchart of steps performed by the system in processing a natural language query;
Fig. 11 is a more detailed depiction of a definition table in the virtual table 28;
Fig. 12 is a depiction of an example natural language correspondence logic formula;
Fig. 13 is a depiction of the modified version of the formula of Fig. 12;
Fig. 14 is a more detailed depiction of the collating unit 30 of Fig. 6;
Fig. 15 is a depiction of a Definition Table A in the virtual table 28 of Fig. 6;
Figs. 16a and 16b are diagrams illustrating the operation of the system with a query that employs the seasonal time period;
Figs. 17a - 17c illustrate the processing of an entity table logic formula;
Fig. 18 is a depiction of a database retrieval word grammar definition table 155 that is contained in the virtual table 28 of Fig. 6;
Fig. 19 is an example of a database retrieval formula processing for the entity table logic formula of Figs. 17a - 17c;
Figs. 20a and 20b illustrate the grouping in syntactic trees of two complex queries; and
Figs. 21a and 21b depict additional virtual tables employed for the processing of the queries of Figs. 20a and 20b.

[0020] A preferred embodiment of the present invention will now be described with reference to the drawings. Fig. 6 shows the construction and flow of processing of a first preferred embodiment of the present invention, which provides a database retrieval system that responds to a natural language query 1. Like the first conventional system of Fig. 1, the system may be implemented on a data processing system as shown in Fig. 2. This first preferred embodiment includes an input unit 2, a conversation control unit 3 and a database 9 like that employed in the conventional system of Fig. 1. These components are implemented in the data processing system 10 as discussed for the first conventional system. The preferred embodiment, however, differs from the conventional system in several respects. These distinctions are highlighted below.

[0021] The first preferred embodiment also includes a parser 22 for parsing an input natural language query into its constituent parts. The parser 22 uses a grammar table 24 and a terminology dictionary 26. The grammar table 24 holds information for regulating the relation in a Japanese sentence, and the terminology dictionary 26 defines the part of speech and meaning of each word in the query 20. While the terminology dictionary 26 is similar to the conventional word dictionary 4 shown in Fig. 1, the terminology dictionary of Fig. 6 differs in that it includes a column for a semantic marker (see Fig. 7). The role of the semantic marker is described in more detail below. A column for a semantic ID (see Fig. 7) and a column for a correspondence item are also provided. The parser analyzes the input natural language query 20 to determine the subject, predicates and other parts of speech in the query.

[0022] The system of Fig. 6 differs substantially from the conventional system of Fig. 1 in that the system of Fig. 6 includes a virtual table 28. The virtual table is a natural language conversion virtual table held in memory 12 (Fig. 2), for designating which table in the database 9 is to be searched to find the data requested in the query 20.
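As a rough illustration of how a terminology dictionary and a set of virtual tables might interact, the sketch below models a dictionary entry with the columns named above (part of speech, semantic ID, semantic marker, item) and picks the virtual table that two query terms have in common. The class name, function name, field layout and storage format are assumptions made for illustration. The table numbers for "chokoreeto rui", "uriage" and "sengetsu" follow the worked example given later in the description. Values the text does not state are left as None rather than guessed, and Fig. 7 itself is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DictionaryEntry:
    """One row of a terminology dictionary (hypothetical layout)."""
    header: str                            # word or phrase as it appears in the query
    part_of_speech: Optional[str] = None
    semantic_id: Optional[str] = None      # term the database can understand
    semantic_marker: Optional[str] = None  # coarse meaning class, e.g. a date unit
    items: Tuple[int, ...] = ()            # virtual table numbers to consult


# Sample entries; anything the description does not state is left as None.
TERMINOLOGY = {
    "chokoreeto rui": DictionaryEntry("chokoreeto rui", "noun", items=(1, 3)),
    "uriage": DictionaryEntry("uriage", "noun", items=(1,)),
    "sengetsu": DictionaryEntry("sengetsu", semantic_marker="month (date)", items=(5,)),
}


def common_virtual_tables(term_a: str, term_b: str) -> list:
    """Return the virtual table numbers designated by both terms."""
    tables_b = set(TERMINOLOGY[term_b].items)
    return [t for t in TERMINOLOGY[term_a].items if t in tables_b]


if __name__ == "__main__":
    # Both terms designate Table 1, so it is the shared candidate.
    print(common_virtual_tables("chokoreeto rui", "uriage"))
```

Keeping the table numbers in the dictionary entry, rather than the database column names themselves, mirrors the indirection the description attributes to the virtual table: the dictionary only says where to look next.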
[0023] In general, there are two types of data in the database 9. There is fixed data, such as a master file for defining "object", and there is variable data, which continuously changes in accordance with "event". Variable data is also referred to as a cumulative file. Fixed data having the same characteristics are grouped to form a virtual table. Further, a virtual table is formed by adding variable data to those fixed data items which are strongly related thereto.

[0024] The virtual table 28 is composed of a number of tables (i.e., tables 1 - 8) as shown in Figs. 8a - 8c. Each one of the entries in these tables includes a field for a "surface restriction" (see Figs. 8a - 8c) and a field for a "correspondence attribute". The surface restriction field is filled with data only for variable data. The surface restriction field is used to store particles which modify each header word of the input natural language and which determine the value of the "correspondence attribute" in combination with the header word. That is, the surface restriction is an item that is provided for performing a further selection when a plurality of corresponding attributes are possible for a header word.

[0025] The correspondence attribute may designate another virtual table, a database entity table, or an operation entity table. Designation of another virtual table indicates that detailed data are stored in the other table. Further, the storage in this fashion is used in an algorithm for selecting a virtual table. Specifically, if a virtual table is designated in a correspondence attribute field, the designated virtual table is selected with priority.

[0026] The system of Fig. 6 also includes a collating unit for retrieving data from the database 9 by referencing the virtual table 28 using the analysis result that is output from the parser 22. The collating unit may be implemented in software that is executed by the CPU 11 (Fig. 2) and stored in memory 12.

[0027] The system further includes a database formula generation unit 32 for converting an entity table logic formula from the collating unit into a database retrieval formula. The database retrieval formula is used by a retrieval unit that retrieves data from the database 9.

[0028] Terms such as "no" and "ha" in the input natural language query 20 are zyoshi. In Japanese, these terms serve to identify the role served by the words that precede them. For instance, in the example natural language input query 20 shown in Fig. 6, the zyoshi "no" modifies the phrase "Chokoreeto rui" (chocolates and the like) to indicate that "Chokoreeto rui" is the object of a prepositional phrase. Similarly, the zyoshi "no" follows the word "sengetsu" to indicate that "sengetsu" is the object of a prepositional phrase. Lastly, the zyoshi "ha" modifies the term "uriage" (sales) to indicate that "uriage" is the subject of the query. The zyoshi help to construct the hierarchical model shown in Fig. 9 that is output from the parser 22.

[0029] Before discussing the operation of this system in detail, it is helpful to provide an overview of operation of the system. Initially, the natural language input query 20 (Fig. 6) is input by the input unit 2 and received by the communications controller 3. The communications controller directs the input query to the parser 22. The grammar table 24 is used by the parser 22 to examine grammatical rules that help to parse the query into an appropriate syntax tree like that shown in Fig. 9. The parser 22 also uses the terminology dictionary 26 to determine which of the tables in the virtual table 28 should be examined. Specifically, the "item" column of the terminology dictionary, as shown in Fig. 7, is examined.

[0030] The collating unit 30 (Fig. 6) then determines which of the tables in the virtual table 28 will be utilized. For the example of natural language query 20, table 1 (see Fig. 8a) is examined. The entries for the corresponding terms are examined in the table. The correspondence attribute field of each entry specifies the table in the database 9 (Fig. 6) and the entry where information regarding the term of interest may be found, another correspondence table, or an indication that the desired data is calculated as a mathematical function. The information retrieved by the collating unit 30 (i.e., the entity table logical formula) is then passed on to the database formula generation unit 32, which converts this information into a database retrieval formula for retrieving from the database. The database retrieval formula is passed from the database formula generation unit 32 to the retrieval unit 34, which retrieves the appropriate data from the database 9. The retrieved data is then output to the output device 17 (Fig. 2).

[0031] The operation of the system of Fig. 6 will now be described in detail. Initially, a natural language query 1 "Chokoreeto rui no sengetsu no uriage ha?" (Sales of chocolates and the like in the last month?) is entered using the input unit 2. The communications controller 3 passes this query to the parser 22. Retrieval order and operation order of the retrieval language are defined at the communications controller 3. The parser 22 parses the query according to known strategies for parsing Japanese queries to produce a syntax analysis result (like the syntax tree shown in Fig. 9). The parser 22 uses the grammar table 24 and the terminology dictionary 26 in performing its parsing. The grammar table 24 is a set of extended context-free grammatical rules such as outlined in "Iwanami Koza, Zyoho Kagaku 23: Kazu to Shiki to Bun no Shori", Chapter 5 "Kikai Honyaku", Iwanami Shoten.

[0032] The terminology dictionary 26 also has a format as outlined in the above described article. This format is shown in Fig. 7. To eliminate ambiguities in the meaning of a word, a semantic ID is given to each word. The semantic ID helps to associate the input term or words with a term or words that are understandable to the database 9 (Fig. 6). For example, since there is no retrieval key for "shoohin" (commodity), "shoohin mei" (commodity name) is designated as the semantic ID for
''shoohin''. The database 9 (Fig. 6) includes information syntax tree is passed to the collating unit as the syntax
regarding the commodity name. Analogously, since analysis result (see step 4O in Fig. 1 O). The syntax tree
there is no entry for ''choko rui'' (chocolates and the like) is not directly converted into a database retrieving logic
in the database, ''cho_reeto rui'' (chocolates and the formula, but rather is converted into an intermediate
like) iS deSignated aS itS SemantiC ID. _ repreSentation _own aS a virtual table logiC formula.
_oo33_ Each entry in the terminology dictionary 26 Then an aPPrOPriate table in the VirtUal table 28 (Fig_ 6)
(Fig. J) also includes a semantic marker. The semantic iS SeleCted (SteP 42 in Fig_ 1 O)_
marker is provided to connect an ambiguous word (i.e., _OO3l_ For the example query 2O of Fig. 6, the termi-
not directly defined in the virtual table) to a correspond- nology dictionary 26 (Fig. 7) is referenced. Specifically,
ence attribute. Further, the semantic marker serves to 1o the ''item'' field is examined for ''sengetsu'' (last month).
combine words that are identical under the semantic The item field points to Table 5 in the virtual table 28
restriction in the virtual table. For example, since there (Fig. 6). Thus, Table 5 (Fig. 8c) in the virtual table 28
are no such retrieval keys for ''sengetsu'' (last month) in (Fig. 6) is examined. The entry for ''sengetsu'' has a cor-
the virtual table 28 (Fig. 6), the semantic marker for this respondence attribute pointing to Ddinition Table B-21 .
term is month (date), hence, indicating that this term is 1_ Accordingly, the entry with argument 21 in Definition
an indication of date on a monthly basis. Similarly, the Table B is examined (see Fig. 1 1a). This table entry sets
term ''Kyonen (last year), ''hi'' (day) and ''toshi'' (year) are forth the method of calculation for ''sengetsu''. ''sen-
also assigned semantic markers that indicate that the getsu'' (the last month) is a value which varies according
terms refer to date. A plurality of semantic markers may to the point in time of input and, therefore, must be cal-
be allowed for a word (e.g. ''uriage'' in Fig. 7). In such _o culated.
instances, the item in the virtual table 28 (Fig. 6) that is _OO38_ In order to understand the method, it is impor-
capable of corresponding to a retrieval key of the data- tant to first understand the format in which the date is
base 9 is searched by following semantic restriction on held. The current data is an 8 decimal digit number with
the virtual table designated by the semantic marker. digits 8-5 holding the year (e.g. ''1992''), digits 4 and 3
Further, in the terminology dictionary 26, a column for __ holding the month (e.g. ''O7'', for July) and bits 2 and 1
corresponding items (e.g. the ''ITEM'' column in Fig. 7) holding the date (e.g. ''1 1''). Thus, an example format for
is provided for designating which one of the tables of the the date of July 1 1 , 1992 is ''1992O71 1 ''.
virtual table 28 (Fig. 6) should be referenced. _OO39_ If July 1 1 , 1992 is the current date, the Defini-
_OO34_ Furthermore, in the case wherein the term, for tion Table B tells the system how to calculate the last
which a terminology dictionary entry is sought, is a 3o month (i.e. June or ''O6''). First one is subtracted from
numerical value having no corresponding virtual table the month digits 4 and 3. Hence, a result of (O7-1) or O6
entry, a correspondence attribute is determined by the is obtained. Then, the system checks whether the result
modifying-modified relation thereof or a semantic is OO. In this case, the result is not zero. If the result of
marker for units of numerical values. Alternatively, an the subtraction is OO, it is an indication that the last
actual value is determined in accordance with the ddi- 3_ month was December of the previous year. Therefore,
nition of an entity table. the month digits 4 and 3 are replaced with the digit 12
_OO35_ As a result of the analysis, performed by the for December, and the year digits 8-5 (the high order
parser 22, the construction of the query is identified and digits) are decremented by one. Lastly, the day digits 1
the object of the interrogation is known. It is necessary and 2 are replaced with OO.
to conform the object of interrogation to an item pos- 4o _OO4O_ Ne_, a table in the virtual table 28 (Fig. 6) for
sessed by the database. While several methods may be ''sengetsu'' (last month) is selected. In the terminology
employed for this purpose, the most effective method is dictionary 26 (Fig. 7), a plurality of virtual tables are des-
one in which the virtual table is provided to associate ignated for ''chokoreeto rui'' (chocolates and the like).
similar meanings which are referenced as different Specifically, Tables 1 and 3 are designated. An entry in
words in the database. By providing a virtual table, alter- 4_ the terminology dictionary 28 is also examined for the
ation and1or addition of the system is easy compared to term ''uriage'' (sales). The entry for ''uriage'' (sales) des-
a method in which the retrieval object item of the data- ignates Table 1 . Given that both the entry for ''Chokore-
base is directly entered into a terminology dictionary. eto rui'' and the entry for ''uriage'' specify Table 1 of the
Further, a variety of different natural Japanese queries virtual table 28, Table 1 is selected. Once the appropri-
may be correctly processed and the queries may _o ate table in the virtual table 28 is selected, an intermedi-
employ various different modifier representations. ate representation is formed by the collating process
_OO36_ The parser 22 (Fig. 6), thus, produces a hierar- (step 44 in Fig. 1 O) performed by the collating unit 3O.
chical syntax tree like that shown in Fig. 9. This result _OO41_ The collating unit 3O (Fig. 14) internally com-
indicates that the sales (i.e. ''uriage'') are what is sought. prises_. a virtual table selection unit 6O, for selecting a
The term ''Chokoreeto rui'' (chocolate and the like) __ table in the virtual table 28 (Fig. 6); an actual value cal-
specifies the commodity group for which sales are culation1combination unit 62 (Fig. 14) for performing cal-
sought, and the term ''sengetsu'' (last month) indicates culations and combination; and an interrogative
the time frame for which the sales data is sought. This structure determining unit 64 for determining the struc-
6
1 1 E_ O 522 59_ B_ 12
ture of interrogations that are passed to the database formula generation unit 32.

[0042] The collating process involves incorporating the contents of a dictionary referenced by the input natural language query into the table of the virtual table that was selected at step 42 in Fig. 10 or by performing attribute coupling between virtual tables. In the example case, two virtual tables have been selected: Table 1 (by the entries in the terminology dictionary for "uriage" and "Chokoreeto rui") and Table 5 (by the entry for "sengetsu"). A natural language correspondence logic formula 50 is generated as shown in Fig. 12. The correspondence logic formula 50 is a table that sets forth what information is known from the query and what additional information is needed to complete the query. Specifically, it sets forth the relevant variables and any values of these variables that are known.

[0043] "Chokoreeto rui" is entered in the "shoohin gun mei" (commodity group name) in the formula 50 as "chokoreeto rui" (chocolates and the like) is a commodity group name. This is known from the first table in the virtual table 28 (Fig. 6). Further, "URI" and "date" are variables for which the values are not yet determined. Those variables represented by the same word have the same value and represent that same attribute. In this example, "URI" in the question and "URI" in "uriage hyo" are identical to each other. Note that values for those items other than the necessary items are not needed. A mark "*" indicates that no value is entered.

[0044] In step 46 of Fig. 10, a necessary virtual table is added to access the database 9 (Fig. 6). In this example, table 3 (Fig. 8b) of the virtual table 28 (Fig. 6) is selected based on correspondence attribute of "shoohin gun mei" 7 (commodity group name) in table 1 (Fig. 8a), which specifies Table 3-2. The entry in table 3 directs the user to Database Table entry 3-2 (e.g. DB 3-2). In addition, the actual value of "sengetsu" (last month) is calculated from the Definition Table B (as was discussed above). The table, thus, provided is indicated by 52 in Fig. 13. The data shown assumes that the current date is in May 1990. Hence, the last month is April 1990 or "19900400". The commodity group code serves as the attribute for connecting Table 1 and the commodity group master table, and it possesses "Code" as an undetermined variable.

[0045] This table 52 is converted into a database retrieval formula by the database formula generation unit 32 (Fig. 6) at step 48 (Fig. 10). Retrievals are performed sequentially by the retrieval unit 34 (Fig. 6) based on the retrieval formula to fill the undetermined variables in the table 52 (Fig. 13). First, the undetermined variable "Code" is determined from commodity group master table 19 (i.e., table C in Fig. 3c) to be 200, which corresponds to "chokoreeto rui" (chocolates and the like).

[0046] The system then looks to the correspondence attribute for "uriage" (sales) (see Fig. 8a), which is "fun-sum (DB1-4)". The symbol "fun" indicates that some kind of calculation is needed. With the definition table B-21, if for example the last month is April of 1990, the value for the last month is obtained from the value for the current date as an operation result "19900400". In a similar manner, fun-sum (DB1-4) is an operation for obtaining the sum of the numerical values on the sales column (column 4) in Table A of the database (Fig. 3c). The system then may access Table A to sum all the sales entries in the sales column for commodity group code 200 items during the month of April 1990.

[0047] In this manner, the value of URI is filled and the database retrieval processing is terminated. The result is then output in a predetermined format.

[0048] The query must be converted into a query set forth in a database retrieval language to retrieve data from the database. To replace the structure of the Japanese natural language query with database retrieval formulas, it is necessary to put together the restrictions and grammar possessed by the database retrieval language in the terminology definition table 26 (Fig. 6). Construction of the queries in the database retrieval language is made by referring to this terminology definition table as described above. Further, having a separate grammar definition table 24 produces the advantage that all the changes to the database retrieval language may be absorbed by the grammar definition table, even when the present invention is applied to a system using a different database retrieval language.

[0049] As described above, by using the semantic marker of a terminology dictionary and the virtual table, a database is designated and a conversion is made into a retrieval logic formula which is suitable even when an ambiguous word is included in the query or an omission occurs in the input query.

[0050] As described, in the present invention, no hierarchical table model is needed. Further, no consideration of the hierarchical relation of the database is needed. Since the virtual tables have construction which directly reflects the hierarchical relation of database, construction and alteration is easy. Further, since the surface restriction and the semantic restriction are included in the virtual table, the collating unit can designate a highly probable database file by selecting a suitable virtual table even for an ambiguous input query.

[0051] In the above described example, the term "sengetsu" (last month) was included in the natural language query. This term was an ambiguous word related to time. The system also has the capability of properly analyzing other ambiguous terms relating to time. Suppose that the Japanese input sentence is "Kotoshi no haru no uriage ha" (Sale for the spring of this year?). The parser 22 (Fig. 6) decomposes this sentence into its constituent part "uriage" (sales) and "kotoshi no haru" (the spring of this year). Further, the parser 22 knows that "kotoshi no haru" modifies "uriage". The parser 22 looks up the term "kotoshi no haru" in the terminology dictionary 26 and is directed to an appropriate table in the virtual table 28. The entry in the virtual table
directs the user to entry 3 in Definition Table A as shown in Fig. 15. This entry indicates that spring extends from 03/01 to 05/31. In this manner, the word "kotoshi no haru" (the spring of this year) contained in the syntax analysis result is replaced by "1990 nen 3 gatsu 1 nichi - 1990 nen 5 gatsu 31 nichi" (March 1 1990 - May 31 1990).

[0052] In this example, however, any combination of time words to be used must be recorded on a terminology dictionary as a single word. For example, when it is desired that "kotoshi" (this year) and "haru" (spring) be combined as "kotoshi no haru" (the spring of this year), it is necessary to previously record "kotoshi no haru" (the spring of this year) in the terminology dictionary 26 (Fig. 6). Further, since the definition of a seasonal word or the like differs from user to user, a terminology dictionary must be prepared for each user.

[0053] As such, an alternative embodiment as shown in Figs. 16a and 16b may be employed. This alternative embodiment differs from the first embodiment in that it includes: a point in time calculation unit 70, for calculating a specific point in time from the current date, a time interval definition table reference unit 80, and a combining unit 82 for adding the reference result of the time interval definition table reference unit 80 and the calculated result of a point in time. Further, a system timer 68 is provided.

[0054] Suppose that "sakunen no fuyu no uriage ha" (Sales during the winter of the last year?) is entered from the input unit 2 as the input query 66 (Fig. 16a). The parser 22 generates a syntax analysis result 72 (i.e., a syntax tree) by employing the grammar table 24 and the terminology dictionary 26. The syntax analysis result contains "sakunen" (last year) and "fuyu" (winter), which are time words. The definition of the word "sakunen" (the last year) is obtained by time calculation, and the definition of the word "fuyu" (winter) is designated to be described in the time interval definition table 84 (Fig. 16b).

[0055] The syntax analysis result 72 is passed to the collating unit 30, where the result is received by the point in time calculation unit 70. At the point in time calculation unit 70, a point in time calculation is performed with respect to the current date (e.g., "19901224") that is obtained by a system timer 68. The actual calculation method performed is selected from the definition provided in Definition Table B in Fig. 11. The definition that is chosen depends on the value in the argument column in the terminology dictionary. In this example, an 8-digit integer value indicating the year "sakunen" (last year), "19890000", is obtained from the calculation method, corresponding to the value "11" in the argument column of "sakunen" (the last year), which states, "Subtract 1 from the four high order digits and replace the four low order digits with "0000"". Subsequently, the calculated integer value is substituted for the portion of "sakunen" (the last year) in the syntax analysis result 72 to obtain a point in time calculation result 74.

[0056] The time interval definition reference unit 80 contains the actual dates corresponding to "fuyu" (winter). It obtains these dates by referring to the time interval definition table 84. Hence, as shown in Fig. 15, "fuyu" is defined as starting at "00001201" (i.e., December 1) and ending at "00010331" (i.e., March 31 of the next year). The time interval definition table reference unit 80 substitutes the retrieved value 86 for "fuyu" (winter) in the point in time calculation result 74 to obtain a time interval definition table reference result 76.

[0057] The combining unit 82 combines the actual dates corresponding to "sakunen" (the last year) and "fuyu" (winter) by addition to obtain a complete 8 digit range for dates for the interval as shown in the calculation result 78. Specifically, the year "19890000" is added to the dates of "fuyu" "00001201" - "00010331" to obtain "19891201" - "19900331". The calculation result "19891201-19900331" means "from December 1, 1989 to March 31, 1990". The calculation result 78 is then processed as discussed in the first embodiment.

[0058] By changing the definition of each time word described in the time interval definition table 84 (Fig. 16b), the user may obtain a calculation result in accordance with definition without altering the terminology dictionary 26 (Fig. 16a). That is, it is possible for users to share a terminology dictionary and manage the time interval definition table individually. This benefit of sharing a terminology dictionary is more apparent when it is appreciated that a terminology dictionary is large in size and amendment of a terminology dictionary is difficult. Moreover, if words containing many modifiers are to be defined, storage requirements are large. Hence, providing a separate terminology dictionary for every user is cumbersome.

[0059] The example input natural language queries 1 (Fig. 6) and 66 (Fig. 16a) requested sales information that could be readily reproduced by the system. The system, however, is capable of handling more sophisticated queries that require reasoning. For example, suppose that the Japanese input query is a sentence "Sengetsu no uriage yori kongetsu no uriage ga ooi tokuisaki ha" (What customer had more sales in this month than sales in the last month?). For such an input natural language query, the system produces a retrieving logic formula, also known as the entity table logic formula 14, in the form 140 shown in Fig. 17a. The formula 140 includes a result table 142 for storing the final results of the retrieved data. The result table 142 includes a location for storing the customer's name and tables for storing the total sales of this month and the total sales of last month. In addition, the entity table logic formula 140 includes a GT table, which is a table in the virtual table that performs a logical operation on parameters to determine if one parameter (the left side) is greater than the other (the right side).

[0060] The total sales of the last month table includes a pointer pointing to a last month's intermediate result table 144 that holds the results of intermediate calcula-
tions that are necessary to determine the total sales of the last month. Similarly, the total sales of this month's table points to this month's intermediate result table 146. Both of the intermediate result tables 144 and 146 seek to have information regarding the customer code and the total sales for their respective months. In order to calculate the total sales of the last month, it is necessary to determine the calculation object (i.e., what kind of information is being sought). In addition, it is necessary to determine the amount of orders that were received during the month from that customer. Accordingly, there is an additional table, the total sales of the last month's intermediate result table 148. Analogously, a total sales in this month's intermediate result table 151 that seeks similar information for this month's sale, is also provided. Hence, the amount of received order for this month and last month for the specified customer code are requested and passed to the database formula generation unit 32 which converts the logic formula into a database retrieval formula 157 using the database retrieval word grammar definition table 155. The result table and the various intermediate result tables 144, 146, 148 and 151 are passed to the database formula generation unit 32. In addition, equality tables (denoted as EQ tables) are passed to the database formula generation unit 32. Specifically, EQ Tables 3 and 4, as shown in Fig. 17b, are passed to the database formula generation unit 32. EQ Table 3 seeks to determine if the received order file date is equal to the last month date, and EQ Table 4 seeks to determine if the received order file date is equal to today's date.

[0061] The entity table logic formula 140 is processed by the database formula generation unit 32 (Fig. 17c) which uses the database retrieval word grammar definition table to process the logic formula 140. The database retrieval word grammar definition table is examined by the database formula generation unit 32 with respect to the retrieval logic formula 140. The database retrieval word definition table initially processes the result table as indicated in Fig. 18. In particular, the system is directed to select the SELECT (item) FROM (reference table) WHERE (condition). Thus, the result table is converted into a database retrieval formula of (interrogation 3) of Fig. 19. The retrieval word grammar definition table 155 has a similar entry for the intermediate result tables 144 and 146. Further, the database formula generation unit 32 investigates the executing order of the specified operations with respect to one another. In this case, since the result table 142 designates last month's intermediate result table 144 and this month's intermediate result table 146 as "left side > right side" in the GT table, it is learned that the operation of left side and right side must be performed before the GT table can be processed. In other words, it is seen that determination of the intermediate result tables must be performed first.

[0062] In this manner the execution order is determined as (interrogation 1), (interrogation 2) (there is no restriction on the executing order of these two), (interrogation 3).

[0063] The system proceeds to process each of the interrogations as indicated in Fig. 19. In particular, for interrogation 1, which is the interrogation for the last month's intermediate result table, the customer table in the database 9 (Fig. 17c) is retrieved using retrieval unit 34 to obtain the customer code information. Furthermore, the system seeks to sum the amount fields in the received order file of the database 9. In order to perform this calculation, the system sums the amount entries having the appropriate customer code and which meet the date limitations of last month. The EQ table 3 is used to ensure that the date requirements are fulfilled. In this fashion, the intermediate result table is filled in with the relevant information.

[0064] Interrogation 2 involves the processing for this month's intermediate result table. The processing is the same as interrogation 1 except that different date requirements are utilized. Specifically, the date must correspond to the limitations for this month. In this fashion, the information for this month's intermediate result table is completed.

[0065] Lastly, interrogation 3 is processed. The interrogation 3 is the interrogation for the result table. As Fig. 19 indicates, the customer table in the database customer and name are selected, as are the total sales of this last month table and the total sales of this month table. This information is retrieved from the customer table in the database 9 (Fig. 17c) and from the last month's intermediate result table 144 (Fig. 17b) and this month's intermediate result table 146. In order for the customer name to be output, the sales of this month table must be greater than the sales of last month table and the customer code of this month's intermediate result table must equal the customer table and code.

[0066] In this manner, automatic generation of database retrieval formula is possible. Operations are connected by means of pointer and a logic unit for judging executing order is provided in the database formula generator unit 32 in Fig. 6.

[0067] Further, this approach provides the additional advantage that a plurality of sequenced data retrievals are possible by way of intermediate results. The system also provides the advantage that it is possible to readily conform to a different database retrieval language by altering the grammar definition table.

[0068] Specifically, when the retrieval language is changed, the database retrieval formula for a new retrieval language may be generated and an extensive rewriting thereof is not necessary. Rather, a simple change in the description of (item), (reference table), (condition) or SELECT, FROM, WHERE of the designated item to the result table of the grammar definition table is all that is required.

[0069] For some natural Japanese queries, a complicated or plurality of processing must be performed to analyze the query. For example, there are instances
where data conforming to specific periods of specific conditions are added together. It is often desirable to be able to perform a preprocessing operation at the collating unit for comparison or grouping. Hence, such preprocessing may be incorporated into the present invention.

[0070] In order to explain such preprocessing, suppose that the input query is "Mitsubishi shooten no uriage yori uriage ga ooi tokuisaki ha" (What customer has more sales than Mitsubishi shooten?) or "(A-shooten no) kotoshi no haru kara aki made no uriage ha" (How much were the sales to (A store) from the spring to fall of this year?). Figs. 20a and 20b are helpful in explaining the structure of a syntax tree that is produced for an input query which requires a plurality of logic formula groups. First, the input sentence is broken down by the parser 22 (Fig. 6) into elements in the form of a tree structure (i.e., the syntax tree) such as the tree denoted as "HIKAKU" (comparison) in Fig. 20a and the tree denoted as "KARA MADE" (from to) in Fig. 20b. Fig. 20a shows the syntax tree for the first example query, and Fig. 20b shows the syntax tree for the second example query. Particles are detected and the elements are forcibly divided at the parser 22 (Fig. 6). In Figs. 20a and 20b "ji" refers to a word serving as a key and "fu" is a modifier. The modifier is used to refer to the surface restriction or is regarded as a special modifier in searching the virtual table.

[0071] The first example query, as shown in Fig. 20a, seeks to compare sales of two entities. As such, two tables have to be selected. If only a single table is selected, a comparison cannot be made. Two tables can be selected by dividing the syntax tree into groups. A virtual table (see Fig. 21a) corresponding to a comparison expression like the "ooi hyo" shown in Fig. 20a is provided and a virtual table logic formula for comparison is generated by indicating the relation between the two tables with the comparison virtual table. The comparison virtual table can be used for converting a word indicating a comparison meaning in any language to an expression such as [GT] (greater than). The two virtual table logic formulas are set by Group (a) in Fig. 20a.

[0072] In a similar manner, as shown in Fig. 21b, by using the virtual table constructed to have "kara made" (from ... to), "yori made" (from ... to) tables the intermediate logic formulas are determined by Group (b) as shown in Fig. 20b. It is designated at Fig. 21b to refer to the definition formula, and actual dates are determined by the operation discussed above.

[0073] Also, interrogatives may be dealt with to some extent by providing an item for surface restriction in the virtual table and by investigating the items relative to the surface restriction. For example, with respect to an input sentence "Nani wo uttaka" (What was sold?), since only a commodity name or commodity group name falls under those with the surface restriction "wo" in "uru hyo", it is possible to assume that "nani" (what) refers to one of them.

[0074] Further, by collating surface restriction, it is possible to check particle and to display an error message for an input sentence with an erroneous content. For example, with respect to a sentence "Chokoreeto ga utta shoohin ha" (What commodity sold by chocolates?), since there is no "ga" in the surface restriction of "shoohin" in "uru hyo", it is judged as an error and it is possible to display an error message "Zyoshi ga chigai masu" (Wrong "zyoshi" is used).

[0075] In the system of Fig. 1 described as a conventional example, an answer is provided in the same format at all times. That is, in answering the retrieval result, the response is made in a tabular format and not in a sentence format. In some cases, the answer in this format is difficult to view. To eliminate this disadvantage, a response format selection unit may be provided in the retrieval unit. This unit should provide at least two types of formats, i.e., a tabular format and sentence format, as the outputting format.

[0076] While the present invention has been shown with respect to preferred embodiments thereof, those skilled in the art will know of other alternative embodiments which do not depart from the spirit and scope of the invention as defined in the appended claims. For instance, the system may be adjusted to operate on natural language queries that are formulated in languages other than Japanese. Further, the system may be implemented on data processing system other than that shown in Fig. 2.

Claims

1. An information retrieval system for retrieving information from a database (9) comprising an analysis unit using a dictionary (26) for analysing a natural language query
   characterised in that

   - said dictionary associates natural language terms with their semantic meaning in terms which can be understood by the database and also associates each of these terms with one or more virtual tables (28);
   - the virtual tables associate semantic meanings with database tables;
   - said analysis unit being a parser (22) using said dictionary for parsing said natural language query into its constituent parts to determine a syntax analysing result as to the construction of the query;
   - a collating unit (30) for preparing a database retrieval formula from the syntax analysing result by selecting the database tables as indicated by the virtual tables whereby for a term which is associated with more than one virtual table the collating unit relies on the virtual table which this term has in common with another term of the query and
   - a retrieval execution unit (32, 24) for retrieving data from the data base (9) on the basis of said database retrieval formula.

2. An information retrieval system as recited in claim 1 further comprising:

   an additional table for converting an undetermined value phrase in the natural language query into a determined value phrase in the database (9) based on the syntax analysis result.

3. An information retrieval system as recited in claim 1 or 2 further comprising:

   a terminology dictionary (26) for identifying entries in the virtual table to be used in identifying the terms of said constituent parts, said dictionary including words representing time, and said terminology dictionary being used by the parser (22) in obtaining the syntax analysis result; and
   a time interval definition table (80) in the virtual table for defining dates corresponding to said words representing time.

4. An information retrieval system as recited in one of claims 1 to 3 further comprising:

   a database retrieval formula conversion unit for generating a formula in a database retrieval language from the database retrieval formula.

Patentansprüche

1. Informationsauffindungssystem zum Wiederauffinden von Informationen aus einer Datenbank (9) mit einer Analyseeinheit unter Verwendung eines Lexikons zur Analyse einer natursprachlichen Frage,
   dadurch gekennzeichnet, daß

   - das Lexikon natursprachliche Ausdrücke mit ihrer semantischen Bedeutung in Ausdrücken assoziiert, welche von der Datenbank verstanden werden können, und auch jeden dieser Ausdrücke mit einer oder mehreren virtuellen Tabellen (28) assoziiert;
   - die virtuellen Tabellen semantische Bedeutungen mit Datenbanktabellen assoziiert;
   - die Analyseeinheit ein Parser (22) ist, der das Lexikon für das Parsing der natursprachlichen Frage in ihre Bestandteile verwendet, um ein Syntaxanalyseergebnis bezüglich des Aufbaus der Frage zu bestimmen;
   - eine Kollationiereinheit (30) zum Vorbereiten einer Datenbank-Auffindungsformel aus dem Syntaxanalyseergebnis durch Auswahl der Datenbanktabellen, wie durch die virtuellen Tabellen angezeigt ist, wodurch für einen Ausdruck, der mit mehr als einer virtuellen Tabelle assoziiert ist, die Kollationiereinheit sich auf die virtuelle Tabelle stützt, welche dieser Ausdruck gemeinsam mit einem anderen Ausdruck der Frage hat, und
   - eine Wiederauffindungs-Durchführungseinheit (32, 24) zum Wiederauffinden von Daten aus der Datenbank (9) auf der Grundlage der Datenbank-Auffindungsformel.

2. Informationsauffindungssystem nach Anspruch 1, welches weiterhin aufweist:

   eine zusätzliche Tabelle zum Umwandeln eines Satzes mit unbestimmtem Wert in der natursprachlichen Frage in einem Satz mit bestimmtem Wert in der Datenbank (9) auf der Grundlage des Syntaxanalyseergebnisses.

3. Informationsauffindungssystem nach Anspruch 1 oder 2, welches weiterhin aufweist:

   ein Terminologielexikon (26) zum Identifizieren von Eintragungen in der virtuellen Tabelle, die zur Identifizierung der Ausdrücke der Bestandteile zu verwenden ist, wobei das Lexikon die Zeit darstellende Wörter enthält, und das Terminologielexikon von dem Parser (22) verwendet wird beim Erhalten des Syntaxanalyseergebnisses; und
   eine Zeitintervall-Definitionstabelle (80) in der virtuellen Tabelle zum Definieren von Zeitangaben entsprechend den die Zeit darstellenden Wörtern.

4. Informationsauffindungssystem nach einem der Ansprüche 1 bis 3, welches weiterhin aufweist:

   eine Datenbank-Auffindungsformel-Umwandlungseinheit zum Erzeugen einer Formel in einer Datenbank-Auffindungssprache aus der Datenbank-Auffindungsformel.

Revendications

1. Système de récupération d'informations pour récupérer des informations à partir d'une base de données (9), comprenant une unité d'analyse utilisant un dictionnaire (26) pour analyser une demande formulée en un langage naturel, caractérisé en ce que
   - ledit dictionnaire associe des termes du langage naturel à leur signification sémantique en des termes qui peuvent être compris par la base de données, et associent également chacun de ces termes à une ou plusieurs tables virtuelles (28);
   - les tables virtuelles associent les significations sémantiques à des tables de la base de données;
   - ladite unité d'analyse est un analyseur syntaxique (22) utilisant ledit dictionnaire pour analyser ladite demande formulée en langage naturel en ses parties constitutives pour déterminer un résultat d'analyse syntaxique concernant la construction de la demande;
   - une unité d'assemblage (30) pour préparer une formule de récupération de bases de données à partir du résultat de l'analyse syntaxique par sélection des tables de la base de données comme indiqué par les tables virtuelles, ce qui a pour effet que pour un terme, qui est associé à plus d'une table virtuelle, l'unité d'assemblage est basée sur la table virtuelle, que ce terme a en commun avec un autre terme de la demande, et
   - une unité d'exécution de récupération (32, 24) pour récupérer des données à partir de la base de données (9) sur la base de ladite formule de récupération dans la base de données.

2. Système de récupération d'informations selon la revendication 1, comprenant en outre :

   une table additionnelle pour convertir une phrase d'une valeur indéterminée dans la demande formulée en langage naturel en une phrase de valeur déterminée dans la base de données (9) sur la base des résultats de l'analyse syntaxique.

3. Système de récupération d'informations selon la revendication 1 ou 2, comprenant en outre :

   un dictionnaire de terminologie (26) pour identifier des entrées dans la table virtuelle devant être utilisées pour identifier les termes desdites parties constitutives, ledit dictionnaire comprenant des mots représentant un temps, et ledit dictionnaire de terminologie étant utilisé par l'unité d'analyse syntaxique (22) pour l'obtention du résultat d'analyse syntaxique; et
   une table (80) de définition d'intervalles de temps dans la table virtuelle pour définir des dates correspondant auxdits mots représentant un temps.

4. Système de récupération d'informations selon l'une des revendications 1 à 3, comprenant en outre :

   une unité de conversion de formule de récupération d'une base de données pour produire une formule dans un langage de récupération d'une base de données à partir de la formule de récupération dans la base de données.
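
The selection rule of claim 1 (a term associated with more than one virtual table is resolved through the table it has in common with another term of the query) can be illustrated with a short Python sketch. The table numbers follow the "chokoreeto rui" / "uriage" / "sengetsu" example of the description; the function name `resolve_tables`, the dictionary layout and the fallback for terms that share nothing are assumptions made for illustration only, not the patented implementation.

```python
from typing import Dict, List, Set

# Hypothetical terminology dictionary: term -> virtual tables it designates.
# Per the description, "chokoreeto rui" designates Tables 1 and 3,
# "uriage" designates Table 1 and "sengetsu" designates Table 5.
TERMINOLOGY: Dict[str, Set[int]] = {
    "chokoreeto rui": {1, 3},
    "uriage": {1},
    "sengetsu": {5},
}


def resolve_tables(terms: List[str], dictionary: Dict[str, Set[int]]) -> Dict[str, Set[int]]:
    """Resolve each ambiguous term to the table(s) it shares with another term."""
    resolved: Dict[str, Set[int]] = {}
    for term in terms:
        tables = dictionary[term]
        if len(tables) > 1:
            shared: Set[int] = set()
            for other in terms:
                if other != term:
                    shared |= tables & dictionary[other]
            # Assumed fallback: keep every candidate when nothing is shared.
            tables = shared or tables
        resolved[term] = set(tables)
    return resolved


result = resolve_tables(["chokoreeto rui", "uriage", "sengetsu"], TERMINOLOGY)
print(result)
```

In this sketch "chokoreeto rui" is narrowed to Table 1 because that is the table it shares with "uriage", while "sengetsu" keeps its single Table 5.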
[Fig. 1]
[Fig. 2: PRIOR ART]
[Fig. (number illegible): PRIOR ART]
[Fig. 5: PRIOR ART]
Fig. 10 (PICTURE), with elements labelled 40, 42, 44, 46, 48 and 49
Fig. 14 (PICTURE), with labels INPUT and OUTPUT