Hone Your Powers Part 2 (Parsing)


Working with JSON-formatted data

The data that the Twitter API call returns to your script will be formatted in JSON. In particular, it will be a list in which each item is a dictionary (and each dictionary can itself contain nested sub-lists and sub-dictionaries!).

It sounds complicated but processing JSON is really as simple as using for loops and iterating over the contents of the lists and hashes. Good thing we already have some code for that from our previous task ;).

Here is a test piece of JSON for you. It lists a couple of (made-up) courses with some data about each course:

[
    {
      "name"  : "Saving the World 101",
      "start-date": "Jan 01 2012 12:00:00 +0000 2012",
      "participants"  : 25
    },
    {
      "name"  : "Advanced Superhero Costume Design",
      "start-date": "Feb 02 2012 12:00:00 +0000 2012",
      "participants"  : 12
    }
]

When your program first receives the JSON data, it will be formatted as a string. It won't look quite as pretty, but rather more like so:

'[{ "name":"Saving the World 101","start-date":"Jan 01 2012 12:00:00 +0000 2012","participants":25},{"name":"Advanced Superhero Costume Design","start-date": "Feb 02 2012 12:00:00 +0000 2012","participants" : 12}]'

JSON is formatted as a string because every language has a string data type. Using a string means that JSON-formatted data is inter-operable with any language. Each language has a library or module that you can use to work with JSON. Those libraries convert the JSON-formatted string into native data types of that language. 

To process this bit of JSON, you will need to create a variable and set it equal to the JSON string above. For example, in both Ruby and Python this would look like (note the single quotes surrounding the string):

js = '[{ "name":"Saving the World 101","start-date":"Jan 01 2012 12:00:00 +0000 2012","participants":25},{"name":"Advanced Superhero Costume Design","start-date": "Feb 02 2012 12:00:00 +0000 2012","participants" : 12}]'

Find the JSON-specific function call in your language of choice that takes a string as its argument and returns language-specific data objects like lists and hashes. Here's a hint to get you started: in Python, tell the interpreter you want to use the JSON module by typing import json. In Ruby, you will want to type require 'json' (including the single quotes around the word json).

Once you find the function you want to use (a quick google search should help), make use of the print functionality in your language's interpreter to convince yourself (and ensure) that it's working: print out the raw JSON string, use the JSON function call to make the conversion, and then print out the converted data object if it's not automatically printed out for you. 
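
If you get stuck, here is one way the check described above could look in Python 3. Treat it as a sketch rather than the only answer: it assumes the standard-library json module and its json.loads function, and the variable name courses is simply our own choice.

```python
import json

# The raw JSON string from above (note the single quotes on the outside).
js = '[{ "name":"Saving the World 101","start-date":"Jan 01 2012 12:00:00 +0000 2012","participants":25},{"name":"Advanced Superhero Costume Design","start-date": "Feb 02 2012 12:00:00 +0000 2012","participants" : 12}]'

# Before the conversion: one long string.
print(js)
print(type(js))

# json.loads ("load string") parses JSON text into native Python objects.
courses = json.loads(js)

# After the conversion: a list whose items are dictionaries.
print(courses)
print(type(courses))
print(type(courses[0]))
```

Printing the type before and after is a quick way to convince yourself that the conversion really happened.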

Make a note below of what function you used. Was the documentation clear or confusing? What help documentation or website did you find that helped you decide which function was appropriate?

Now we get to the fun part. We're going to use the for loop from the previous task to iterate over our new data object and print out only the course names. This will require declaring a variable to store the json data object in, and iterating over that variable with your for loop.

Each time through the for loop you will have to handle one list item. As we know from inspection of the JSON string above, the JSON-converted object is a list, and each list "item" is itself actually a hash (dictionary)! But don't let that trip you up: inside the for loop, you can access and manipulate the dictionary just like any other dictionary.
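
To make "just like any other dictionary" concrete, here is a small Python 3 sketch. It assumes json.loads was used for the conversion; the names courses and total and the extra "full" key are made up for this illustration and are not part of the task.

```python
import json

js = '[{ "name":"Saving the World 101","start-date":"Jan 01 2012 12:00:00 +0000 2012","participants":25},{"name":"Advanced Superhero Costume Design","start-date": "Feb 02 2012 12:00:00 +0000 2012","participants" : 12}]'
courses = json.loads(js)

total = 0
for course in courses:
    # `course` is an ordinary dictionary, so the usual dictionary tools work.
    print(sorted(course.keys()))

    # Read a value by its key.
    total += course["participants"]

    # Add a new key (hypothetical, only to show that the dictionary can be changed).
    course["full"] = course["participants"] >= 20

print(total)
```

Nothing here is JSON-specific any more: once the string has been converted, you are working with plain lists and dictionaries.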

How do you retrieve only the course name and print it out? Can you print out only the course names that contain the word "Superhero"? (Hint: you will need an if statement.)
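
Try it yourself first. If you want something to compare against, one possible Python 3 solution is sketched below. It assumes json.loads for the conversion, and the list superhero_names is our own addition so the result can be inspected afterwards.

```python
import json

js = '[{ "name":"Saving the World 101","start-date":"Jan 01 2012 12:00:00 +0000 2012","participants":25},{"name":"Advanced Superhero Costume Design","start-date": "Feb 02 2012 12:00:00 +0000 2012","participants" : 12}]'
courses = json.loads(js)

# Print every course name: look up the "name" key in each dictionary.
for course in courses:
    print(course["name"])

# Print only the names that contain the word "Superhero".
superhero_names = []
for course in courses:
    if "Superhero" in course["name"]:
        superhero_names.append(course["name"])
        print(course["name"])
```

The `in` operator does a case-sensitive substring check, so "superhero" in lower case would not match here.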

Keep playing with the JSON data until you are comfortable working with it. Can you extend the JSON example above by adding another list as a data item within each course? How about other data types? Post your ideas, tweaks, and solutions below.   
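
As a starting point for your own experiments, here is one possible extension sketched in Python 3. The fields "instructors", "online", and "room" and all of their values are invented for this example; the sketch assumes json.loads, which maps JSON true/false to True/False and null to None.

```python
import json

# Extended (made-up) course data: a nested list, booleans, and a null value.
js = """[
  {"name": "Saving the World 101", "participants": 25,
   "instructors": ["Ada", "Grace"], "online": true, "room": null},
  {"name": "Advanced Superhero Costume Design", "participants": 12,
   "instructors": ["Edna"], "online": false, "room": "B12"}
]"""

courses = json.loads(js)

for course in courses:
    print(course["name"])
    # The nested list is a normal Python list, so it gets its own for loop.
    for instructor in course["instructors"]:
        print("  taught by", instructor)
    print("  online:", course["online"], "| room:", course["room"])
```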

Task Discussion


  • Eenvincible said:

    Hello,

    I just completed this level of the challenge and I had lots of fun parsing json data returned after making a twitter api call. I thought it would be more helpful and easy for others to benefit from it and that is why I documented everything on my blog which, without a reason to offend or break any forum rule, I would like to share a link here:

    http://simpledeveloper.wordpress.com/2013/01/04/hone-your-powers-part-2-json/

    Thank you for reading this, and I hope you had as much fun as I did! See you around, and let us keep learning.

    on Jan. 4, 2013, 12:30 a.m.
  • govindreddy said:

    note

    I use Python 2.7. Import json. I used the dumps and loads functions. The json documentation is helpful. I got what I needed on Stack Overflow.

    code

    import json
    # json.dumps turns the Python list into a JSON string, so s is a string
    s = json.dumps([{ "name":"Saving the World 101","start-date":"Jan 01 2012 12:00:00 +0000 2012","participants":25},{"name":"Advanced Superhero Costume Design","start-date": "Feb 02 2012 12:00:00 +0000 2012","participants" : 12}])
    # json.loads turns the string back into a Python list of dictionaries
    li = json.loads(s)
    # loop over the list
    for d in li:
        # loop over each key/value pair of the dictionary
        for key, value in d.items():
            # some values in the dictionary are ints, so convert with str(value) before the substring check
            if 'Superhero' in str(value):
                print "Course %s:- %s" % (key, value)

    on Dec. 5, 2012, 3:22 a.m.
  • trbuh said:

    i used json.loads to decode the string. thus:

    formatedString = json.loads(inputString)

    to play around i added a lecture room to every course via a for loop:

    for entry in formatedString :
        lrName = raw_input('Enter lecture room: ')
        entry[u'\room']=lrName



    to my shame i have to admit, that i did not find an explanation (json-definition?) why it is required to write entry[u'\room'] in order to get 'room' as a key.

    further i wasn't able to come up with a nicer input-prompt (in short time), such as "enter lecture room for course [CourseName]". this is due to the fact that saving the course name (e.g. courseString = entry['name'] ) returned u'Saving the World 101' and not the desired output such as print entry['name'] which returns Saving the World 101.

    i'm glad on any comments/help regarding my two "problems".
    thanks in advance!

    on Sept. 26, 2012, 4:10 p.m.

    Eenvincible said:

    Hi,

    When I was running python code tests I had the same problem: 'u\something' and I found the solution: just use the encode('utf-8') method on that object.

    on Jan. 3, 2013, 10:14 p.m. in reply to trbuh
  • Alvaro said:

     

    Hi! 

    Here you can find the code I made in Python: http://pythonfiddle.com/hone-your-powers-part-two

    I found some helpful info here: http://docs.python.org/library/json.html

    It's quite simple! =)

    on Aug. 25, 2012, 7:44 p.m.
  • Jos said:

    Keeping with the node.js theme, this is quite easily done because there is a JSON object within the v8 engine that does all the heavy lifting for you. No need to import anything! The source code if you are curious about it can be found here.

    The following simple command can parse the resulting data for you:

    var data_in_JSON = JSON.parse(data);

    To access the data you can simply get the first position of the array and ask for the 'name' attribute:

    data_in_JSON[0].name

    This is so much fun!

    on April 11, 2012, 4:23 p.m.