4.3.2 Making the tokens database
To build the tokens database, the GDBMCE dynamic link library must first be loaded into memory. A database file named tokens.db is then opened. This file stores the token and type for each character of the input text, which the frontend supplies through the file TextIscii.txt. Each token and type pair is saved in the database file as it is produced by the Natural Language Processor.
An index value that starts at 0 serves as the key to this database. Each new token and type pair is stored under the current index, after which the index is increased by 1. The Hindianalyser phase returns the value of the index, and the frontend passes this value to Hindiengine so that it can retrieve all the tokens and their respective types.
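The key scheme described above can be sketched as a small helper. This is only an illustration: the function name make_index_key is hypothetical and does not appear in the project, and the sketch assumes that the index returned by Hindianalyser equals the number of stored records, so that a reader would visit the keys "0" through index - 1 in order.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of GDBM or the project sources):
   writes the decimal text of `index` into `buf`, which has room for
   `cap` bytes. Returns the key length in bytes, which is the value a
   caller would place in the dsize field, or 0 if the key does not fit. */
size_t make_index_key(int index, char *buf, size_t cap)
{
    assert(buf != NULL);

    int n = snprintf(buf, cap, "%d", index);
    if (n < 0 || (size_t)n >= cap)
        return 0;               /* formatting error or truncated key */
    return (size_t)n;
}
```

A reader that receives the index value n would call make_index_key(i, ...) for i = 0 up to n - 1 and fetch one record per key, which returns the tokens in the order in which they were inserted.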
When the token and type are inserted into the database, a delimiter is added between the two. The following code shows the insertion technique:
char buffer[20];    //holds the record "<token name>|<token type>"
char keybuffer[20]; //holds the index written as a decimal string
//0 concatenated with word[i] gives the token name.
//1 is the token type in this case.
//The delimiter | is added between the token name and the token type.
sprintf(buffer, "%d%d|%d", 0, word[i], 1);
//content is the datum variable that holds the data to be inserted.
//dsize does not count the terminating null character, so it is not stored.
content.dptr = buffer;
content.dsize = strlen(buffer);
//key is the datum variable that holds the key, which is the index in this case.
sprintf(keybuffer, "%d", index);
key.dptr = keybuffer;
key.dsize = strlen(keybuffer);
//dbftokens is the handle to the tokens database.
//The function is called with GDBM_REPLACE as the argument so that the
//existing record is overwritten if the key already exists.
(*pgdbm_store)(dbftokens, key, content, GDBM_REPLACE);
//Increase the index value for the next token.
index = index + 1;
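Because each record is stored as the token name and token type joined by the delimiter, the reading side has to split the record again. The following is a minimal sketch of that step, not the project's actual retrieval code: the function name split_token_record and the constant TOKEN_DELIMITER are hypothetical. It takes the record length explicitly, since the insertion code sets dsize to strlen(buffer) and the stored data therefore has no terminating null character.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOKEN_DELIMITER '|'   /* same delimiter as used on insertion */

/* Hypothetical helper (not part of GDBM): splits a record of the form
   "<token>|<type>" that is `len` bytes long and need not be
   null-terminated. Copies the token into `token` as a C string and
   stores the numeric type in `*type`. Returns 0 on success, -1 on error. */
int split_token_record(const char *record, size_t len,
                       char *token, size_t token_cap, int *type)
{
    assert(record != NULL && token != NULL && type != NULL);

    const char *bar = memchr(record, TOKEN_DELIMITER, len);
    if (bar == NULL)
        return -1;                       /* no delimiter in the record */

    size_t token_len = (size_t)(bar - record);
    size_t type_len = len - token_len - 1;
    char type_text[16];
    if (token_len + 1 > token_cap || type_len == 0 ||
        type_len >= sizeof type_text)
        return -1;                       /* token too long or type missing */

    memcpy(token, record, token_len);
    token[token_len] = '\0';

    memcpy(type_text, bar + 1, type_len);
    type_text[type_len] = '\0';

    char *end;
    long value = strtol(type_text, &end, 10);
    if (*end != '\0')
        return -1;                       /* type is not a plain integer */

    *type = (int)value;
    return 0;
}
```

With a fetched datum named content, a call would look like split_token_record(content.dptr, content.dsize, token, sizeof token, &type), after which token holds the token name and type holds the token type.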