BigData - Finding and Loading the Data
Where do we find the data? Many government sites, as well as private sites, offer their data for download.
I have listed many that I have found here: BigData - Resources and Links
For this exercise, we will use data from the Social Security Administration (SSA.gov). I have provided the names for the years 1946 through 2002 in the Files section below ("yob1946.txt" to "yob2002.txt", where "yob" stands for "year of birth"). If you want years outside that range, download them yourself from the SSA website.
Note: These files do not contain every name for those years, only the names given to 5 or more people (i.e., if only you and 3 other people have the same name, it was not included). That is still a lot of names (almost 4 million names).
1. Create the Application - "Names"
a. Open up LiveCode and create a new mainstack, and size it to be the full height of the screen. Click on "Object" on the EditBar, select "Stack Inspector", and name it "Names".
b. Save the stack. Click on "File" on the EditBar, select "Save As", and save it as "Names.livecode" on your computer.
2. Add a field to hold the names
a. Add a text entry field, size it to the card, name it "names" and check the Vertical Scrollbar check box.
3. Add a Button to read in the file of names
a. Add a button and call it "Load File". We will program it to allow us to select different files from our computer. Add this script to it:
on mouseUp
   answer file "Select a Text file" // open up a file-selection dialog
   if it is empty then exit mouseUp // if user clicks "Cancel", do nothing
   put it into x // save the name of the file they selected
   put url ("file:" & x) into field "names" // load the file into our text field
end mouseUp
4. Load the Data
a. Download one of the data files below for the year in which you were born, e.g. "yob1999.txt".
b. Go into "Run" mode, click on the button and load that file into your program.
It should look like this:
You can see three fields on each line. This is called a CSV format file; the letters stand for Comma Separated Values. Every field/item/value is separated from the next with a comma. The three values are the name, the gender and the number of people with that name.
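As a small sketch of how LiveCode sees one of these lines: the default itemDelimiter is a comma, so each comma-separated value can be picked out as an item. The sample line and the variable names (tLine, tName, tGender, tCount) are made up for illustration and are not taken from the data files.

```livecode
on mouseUp
   // hypothetical sample line in the same name,gender,count layout
   put "Samantha,F,123" into tLine
   put item 1 of tLine into tName // the name: Samantha
   put item 2 of tLine into tGender // the gender: F
   put item 3 of tLine into tCount // the number of people: 123
   // && joins two pieces of text with a space between them
   answer tName && "(" & tGender & ")" && "was given to" && tCount && "babies"
end mouseUp
```

Putting this on a spare button and clicking it in "Run" mode should show a dialog built from the three items.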
Notice that the names are not in alphabetical order.
We can open the "Message Box" on the EditBar and try some commands.
a. Find the name Sam by typing: Find "Sam" in field "names"
We get this: (the name "Samantha" has a box around the letters "Sam")
That helps, but it would be nice if all the names were in order. Let's put them in alphabetical order by typing this in the Message Box: Sort field "names"
That looks better. Now let's do the find again: Find "Sam" in field "names"
That looks good, because we now have all the names beginning with "Sam" together. Looking at the data, for this year there were
You can experiment with other commands in the Message Box for now, but we will want to put those commands in our program to run automatically. Let's add the sort command to our "Load File" button:
on mouseUp
   answer file "Select a Text file" // open up a file-selection dialog
   if it is empty then exit mouseUp // if user clicks "Cancel", do nothing
   put it into x // save the name of the file they selected
   put url ("file:" & x) into field "names" // load the file into our text field
   sort field "names" // put the names in alphabetical order
end mouseUp
Now, when we load a new year, it will be sorted for us.
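The sort command can also order the lines by one of the comma-separated values instead of by the whole line. A minimal sketch to try in the Message Box, one command at a time, assuming the field is still named "names" and holds the name, gender and count lines described above:

```livecode
// most popular names first: sort on the count (item 3), treated as a number
sort lines of field "names" descending numeric by item 3 of each

// back to alphabetical order, this time using only the name (item 1)
sort lines of field "names" ascending text by item 1 of each
```

The word "numeric" matters in the first command: without it the counts are compared as text, so "9" would sort ahead of "10".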
For You To Do: (answer the questions, then make the changes to your program)
Questions to Investigate:
- Does case matter? Try doing a find on "Sam", then a find on "sam".
- What happens when you do "Find Willy" twice? Why?
- Look up "find" in the LiveCode Dictionary. Is there a way to reset the find command?
Adding a Look-up Field
Now, add another text field, "name2find", and a button, "Find Name".
On the button, put the code to get the name the user types into that field and look for that name in the list of names
e.g.
on mouseUp
   put field "name2find" into x
   find x in field "names"
end mouseUp
Notes:
- You can use whatever names you want for the fields; you do not have to use mine.
- Do you need to put a "find empty" at the beginning to reset the find command? Try it without it and see.
- A nice feature to add: check the result after the find. If the result is "not found", then the find command did not find the name. Put up an answer box to tell the user that.
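One possible way to combine the notes above is sketched below. It assumes the field names used in this section ("name2find" and "names"); the wording of the message is made up, and whether the "find empty" line is really needed is still worth testing for yourself.

```livecode
on mouseUp
   find empty // reset the find command so the search starts from the top
   put field "name2find" into x
   if x is empty then exit mouseUp // nothing typed, so nothing to look for
   find x in field "names"
   // when nothing matches, find puts "Not found" into the result
   if the result is "not found" then
      answer "Sorry, the name" && x && "is not in the list."
   end if
end mouseUp
```

The comparison works in either case because LiveCode ignores upper and lower case when comparing text unless the caseSensitive property is set to true.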
You can load any text file into a field. For fun, load some other text file and you can see its contents inside the field.
Try an HTML page; you can see all the HTML tags and text.
New Code for "Load File" button
on mouseUp
   answer file "Select a Text file"
   if it is empty then exit mouseUp
   put it into x
   put url ("file:" & x) into y
   put empty into field "girls"
   put empty into field "boys"
   repeat for each line l in y
      if item 2 of l = "M" then
         put l & return after b1
      else
         put l & return after g1
      end if
   end repeat
   sort lines of b1 descending numeric by item 3 of each
   sort lines of g1 descending numeric by item 3 of each
   put b1 into field "boys"
   put g1 into field "girls"
end mouseUp
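New code for Lookup Name
put field "myname" into x
set the itemDelimiter to space
// after a find, the foundLine returns text such as: line 5 of field 2
put item 2 of foundline() into field "num" // so item 2 is the line number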
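As a hedged sketch of how these lines might sit inside a complete handler: the version below assumes the lookup searches the field "boys" (the fragment above does not say which field is searched) and that the list is sorted with the most popular name on line 1, so the line number doubles as the name's rank. The "not found" handling is an addition for illustration.

```livecode
on mouseUp
   put field "myname" into x
   find empty // start the search from the top
   find x in field "boys" // assumption: search the boys list
   if the result is "not found" then
      put "not found" into field "num"
      exit mouseUp
   end if
   // the foundLine returns text such as: line 5 of field 2
   set the itemDelimiter to space
   put item 2 of the foundLine into field "num" // item 2 is the line number
end mouseUp
```

Because a plain find matches the start of a word, looking up "Sam" can stop on a longer name such as "Samuel" if that name comes first in the list, so check the boxed text before trusting the number.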
cyril.pruszko@pgcps.org, Feb 28, 2017, 7:51 AM
cyril.pruszko@pgcps.org, Nov 29, 2016, 7:49 AM