Hone Your Powers Part 2 (Parsing) [Jan. 26, 2012, 7:59 p.m.]
The data that the Twitter API call returns to your script will be formatted as JSON. In particular, it will be a list in which each item is a dictionary (and each dictionary can itself contain nested sub-lists and sub-dictionaries!).
It sounds complicated, but processing JSON is really as simple as using for loops to iterate over the contents of the lists and dictionaries (Ruby calls dictionaries "hashes"). Good thing we already have some code for that from our previous task ;).
Here is a test piece of JSON for you. It lists a couple of (made-up) courses with some data about each course:
[
  {
    "name": "Saving the World 101",
    "start-date": "Jan 01 2012 12:00:00 +0000 2012",
    "participants": 25
  },
  {
    "name": "Advanced Superhero Costume Design",
    "start-date": "Feb 02 2012 12:00:00 +0000 2012",
    "participants": 12
  }
]
When your program first receives the JSON data, it will be formatted as a string. It won't look quite as pretty, but rather more like so:
'[{ "name":"Saving the World 101","start-date":"Jan 01 2012 12:00:00 +0000 2012","participants":25},{"name":"Advanced Superhero Costume Design","start-date": "Feb 02 2012 12:00:00 +0000 2012","participants" : 12}]'
JSON is formatted as a string because every language has a string data type. Using a string means that JSON-formatted data is interoperable with any language. Most languages have a library or module that you can use to work with JSON. Those libraries convert the JSON-formatted string into the native data types of that language.
To process this bit of JSON, you will need to create a variable and set it equal to the JSON string above. For example, in both Ruby and Python this would look like (note the single quotes surrounding the string):
js = '[{ "name":"Saving the World 101","start-date":"Jan 01 2012 12:00:00 +0000 2012","participants":25},{"name":"Advanced Superhero Costume Design","start-date": "Feb 02 2012 12:00:00 +0000 2012","participants" : 12}]'
Find the JSON-specific function in your language of choice that takes a string as its argument and returns language-specific data objects like lists and hashes. Here's a hint to get you started: in Python, tell the interpreter you want to use the JSON module by typing import json. In Ruby, you will want to type require 'json' (including the single quotes around the word json).
Once you find the function you want to use (a quick Google search should help), use the print functionality in your language's interpreter to convince yourself that it's working: print out the raw JSON string, use the JSON function call to make the conversion, and then print out the converted data object if it's not automatically printed out for you.
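If you get stuck, here is one possible way to do the check in Python. This is a minimal sketch, assuming Python 3 and the standard-library json module; the variable name courses is just a choice made for this example. It prints the type before and after the conversion so you can see the string turn into a list.

```python
import json

# The raw JSON string from the example above (still just text at this point).
js = '[{ "name":"Saving the World 101","start-date":"Jan 01 2012 12:00:00 +0000 2012","participants":25},{"name":"Advanced Superhero Costume Design","start-date": "Feb 02 2012 12:00:00 +0000 2012","participants" : 12}]'

# Before the conversion: one long string.
print(type(js))
print(js)

# json.loads ("load string") parses JSON text into Python lists and dictionaries.
courses = json.loads(js)

# After the conversion: a list whose items are dictionaries.
print(type(courses))
print(courses)
```

Ruby has an equivalent call in its json library; try to find it yourself before peeking at the documentation.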
Make a note below of what function you used. Was the documentation clear or confusing? What help documentation or website did you find that helped you decide which function was appropriate?
Now we get to the fun part. We're going to use the for loop from the previous task to iterate over our new data object and print out only the course names. This will require declaring a variable to store the converted JSON data object in, and then iterating over that variable with your for loop.
Each time through the for loop you will have to handle one list item. As we know from inspection of the JSON string above, the JSON-converted object is a list, and each list "item" is itself actually a hash (dictionary)! But don't let that trip you up: inside the for loop, you can access and manipulate the dictionary just like any other dictionary.
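To see that each list item really is an ordinary dictionary, you could try something like the following. It is only a sketch in Python 3 using the standard json module; the names courses, course, and total_participants are made up for this example. It looks at the keys of each dictionary and adds up the participant counts.

```python
import json

js = '[{ "name":"Saving the World 101","start-date":"Jan 01 2012 12:00:00 +0000 2012","participants":25},{"name":"Advanced Superhero Costume Design","start-date": "Feb 02 2012 12:00:00 +0000 2012","participants" : 12}]'

# Convert the JSON string into a Python list of dictionaries.
courses = json.loads(js)

total_participants = 0
for course in courses:
    # Each item handed to us by the loop is a plain dictionary.
    print(type(course))
    print(sorted(course.keys()))
    # Values keep their JSON types: "participants" is a number, not a string.
    total_participants += course["participants"]

print("Total participants:", total_participants)
```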
How do you retrieve only the course name and print it out? Can you print out only the course names that contain the word "Superhero"? (Hint: you will need an if statement.)
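Give it a try on your own first. If you want something to compare against, here is one possible Python 3 solution, offered as a sketch rather than the only right answer. The list names all_names and superhero_names are just illustrative; they are there so the results can be inspected after the loop.

```python
import json

js = '[{ "name":"Saving the World 101","start-date":"Jan 01 2012 12:00:00 +0000 2012","participants":25},{"name":"Advanced Superhero Costume Design","start-date": "Feb 02 2012 12:00:00 +0000 2012","participants" : 12}]'

courses = json.loads(js)

all_names = []
superhero_names = []

for course in courses:
    # Look up the value stored under the "name" key of this dictionary.
    name = course["name"]
    all_names.append(name)
    print(name)

    # "in" checks whether one string appears inside another (case-sensitive).
    if "Superhero" in name:
        superhero_names.append(name)

print("Courses mentioning Superhero:", superhero_names)
```

Note that the check is case-sensitive, so a course called "superhero basics" would not match unless you compare lowercased strings.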
Keep playing with the JSON data until you are comfortable working with it. Can you extend the JSON example above by adding another list as a data item within each course? How about other data types? Post your ideas, tweaks, and solutions below.
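As a starting point for your own experiments, here is one way the example could be extended. Everything added here is invented for illustration: the "instructors" list, the "open" flag, the "fee" value, and the instructor names are not part of the original data. The sketch assumes Python 3 and shows a nested list plus JSON's true, false, and null values after conversion.

```python
import json

# Triple quotes let the JSON span several lines inside a Python string.
js_extended = '''
[
  {
    "name": "Saving the World 101",
    "participants": 25,
    "instructors": ["Ada", "Grace"],
    "open": true,
    "fee": null
  },
  {
    "name": "Advanced Superhero Costume Design",
    "participants": 12,
    "instructors": ["Edna"],
    "open": false,
    "fee": 9.5
  }
]
'''

courses = json.loads(js_extended)

for course in courses:
    print(course["name"])
    # The nested JSON list becomes a Python list, so a second for loop works.
    for instructor in course["instructors"]:
        print("  taught by", instructor)
    # JSON true/false become True/False, and null becomes None.
    print("  open:", course["open"], "| fee:", course["fee"])
```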