Using Arrays for Fast Word Lookups
As our word list (Words/Responses) grows longer, our program will gradually slow down. If we have 2,000 words in our list and the user types longer sentences, we have to search for each word of their sentence through the entire list.
For the sentence "I have a baby turtle that I named Willy", which is 9 words long, we check each word against the entire list, looking for a word that matches. That makes up to 9 x 2,000 = 18,000 comparisons.
Even though a computer is fast, all of those comparisons take time. And as our chatbot gets smarter, the vocabulary list gets longer.
If a 16-year-old knows 25,000 words, matching that vocabulary would take up to 9 x 25,000 = 225,000 comparisons for the same sentence. Computers are fast, but that is a lot of comparisons.
If a user is running your program on an older, slower cellphone, your program may seem like it never finishes.
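The arithmetic above can be checked with a small sketch. The function name worstCaseComparisons and the mouseUp handler are made up for illustration and are not part of the chatbot script. It counts the worst case, where no word matches until the very end of the list; the real loop may stop earlier when a match is found.

```livecode
-- Hypothetical helper (not part of the chatbot script): worst-case number of
-- comparisons when every word of a sentence is checked against every list entry.
function worstCaseComparisons pSentence, pListSize
   return (the number of words in pSentence) * pListSize
end worstCaseComparisons

on mouseUp
   put "I have a baby turtle that I named Willy" into tSentence
   answer worstCaseComparisons(tSentence, 2000)    -- 9 * 2000 = 18000
   answer worstCaseComparisons(tSentence, 25000)   -- 9 * 25000 = 225000
end mouseUp
```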
We need to do something...
We can make our chatbot much faster by turning our list into an array. An array works like a lookup table: instead of searching the whole list, we look up each word just once to see if it is in our list. That is only 9 lookups for our example sentence of 9 words.
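As a rough illustration of the lookup table idea, here is a tiny array built by hand. The variable name tResponses and the mouseUp handler are assumptions for this sketch only; the real chatbot fills its array from the file.

```livecode
on mouseUp
   -- build a tiny lookup table by hand (sample data only)
   put "Tell me about your pet" into tResponses["cat"]
   put "Tell me about your pet" into tResponses["dog"]

   -- one lookup per word, with no loop through the whole list
   answer tResponses["dog"]

   -- a word that is not in the array gives back empty
   if tResponses["turtle"] is empty then
      answer "No response stored for turtle"
   end if
end mouseUp
```

The second test matters for the chatbot: a missing word simply returns empty, so the existing "if theResponse is not empty" check keeps working.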
Changing Your Program to Use an Array (put your list of words/responses into an array instead of a simple list)
1. openCard changes
In the openCard handler, find this code:
OLD CODE:
on openCard
   put specialFolderPath("Desktop") & "/" & "Chatbot.txt" into myFile
   put URL ( "file:" & myFile ) into myList
   set itemDelimiter to tab
   put "Hello, Let's talk" & return into field "log"
   select before line 1 of field "input" // put the cursor at the start of the field
end openCard
Replace the line "set itemDelimiter to tab" with: split myList by return and tab
NEW CODE:
on openCard
   put specialFolderPath("Desktop") & "/" & "Chatbot.txt" into myFile
   put URL ( "file:" & myFile ) into myList
   split myList by return and tab // split each line into a "key" and its "value"
   put "Hello, Let's talk" & return into field "log"
   select before line 1 of field "input" // put the cursor at the start of the field
end openCard
The split command separates each line into a "KEY" (the lookup word) and its "VALUE" (the response).
Our data looks like this after the split:
KEY	VALUE
cat	Tell me about your pet
dog	Tell me about your pet
- It works somewhat like an English dictionary: you look up a word (the key) to find its entry (the value).
- In Python it is called a dictionary. In Java, it is called a HashMap.
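To see what split produces without touching the real file, the following sketch builds two sample lines in a variable and splits them. The variable names tList and tKeys and the mouseUp handler are assumptions used only for this demonstration.

```livecode
on mouseUp
   -- two sample lines in the same format as the file: word, tab, response
   put "cat" & tab & "Tell me about your pet" & return & \
         "dog" & tab & "Tell me about your pet" into tList

   -- return separates the entries, tab separates each key from its value
   split tList by return and tab

   -- the keys come back one per line, in no guaranteed order, so sort them
   put the keys of tList into tKeys
   sort lines of tKeys

   answer tKeys & return & tList["cat"]
end mouseUp
```

After the split, tList is no longer plain text. It is an array, so its contents are read with a key in square brackets.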
2. getResponse changes in the card script
In the getResponse handler, find this code:
OLD CODE:
on getResponse
   put field "input" into theInput // save what the user typed
   put theInput & return after field "log" // put it into the log and go to next line
   put empty into theResponse // clear out the response variable
   global x
   repeat for each word w in theInput // check each word in the user's sentence
      put findWordInList(w) into theResponse // try to find the word in our list
      if theResponse is not empty then // if we returned a response
         exit repeat // ...we can exit the loop, we found a word
      end if
   end repeat
   if theResponse is empty then // if there is nothing in theResponse, then
      .... // ...keep looking
Replace the line "put findWordInList(w) into theResponse" with: put myList[w] into theResponse
Note: arrays use square brackets [], not parentheses ().
NEW CODE:
on getResponse
   put field "input" into theInput // save what the user typed
   put theInput & return after field "log" // put it into the log and go to next line
   put empty into theResponse // clear out the response variable
   global x
   repeat for each word w in theInput // check each word in the user's sentence
      put myList[w] into theResponse // look up the word in the array
      if theResponse is not empty then // if we returned a response
         exit repeat // ...we can exit the loop, we found a word
      end if
   end repeat
   if theResponse is empty then // if it came up empty,
      .... // then keep looking
Now we just look up the word right there.
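One thing to check: getResponse now reads myList directly, so myList has to be visible to both handlers. The handlers shown here do not include a declaration, so how your script shares myList is an assumption on my part. If the lookup always comes back empty, a stripped-down sketch like this one shows one possible arrangement, using a script-local variable and sample data in place of the file:

```livecode
local myList   -- declared outside any handler, so every handler in this script shares it

on openCard
   -- sample data instead of the file, for illustration only
   put "cat" & tab & "Tell me about your pet" into myList
   split myList by return and tab
end openCard

on getResponse
   put myList["cat"] into theResponse   -- works because myList is script-local
   answer theResponse
end getResponse
```

Declaring "global myList" inside each handler that uses it, in the same way the handler declares "global x", would be another option.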
We do not need the findWordInList function anymore. You can delete that function or just leave it there. It just will no longer be used.
That's it. You are done!