Channel: I does Javascript! » Useful Functions

Serializing Objects in Javascript

It's worth noting that this post is rather old at this point. I don't use the function listed in this post anymore and haven't for a long time. If you are using it and it's working for you, great! But as Ron points out in the comments section, there are a few issues with regard to strings and special characters. I recommend following the link he posts if your needs merit a very robust version of JSON serialization. I'm leaving this post up though, as I think it's helpful for folks who want to understand the general concept of recursion and serialization.

Recently, in a personal project I'm working on, I came across a need to be able to represent any Javascript object as a string. This isn't a problem since just about every object in Javascript can be represented with JSON (Javascript Object Notation). Every modern browser can parse JSON for you easily enough through eval(..), and Gecko-based browsers even have the ability to reverse the process ("uneval" if you will) and give you back a string representation of an object through a call to .toSource().

If you need this ability in any other browser though, you're gonna have to write it yourself. I needed this ability, so I wrote it (and posted it here for your enjoyment!)

Gecko-based browsers, .toSource():

Gecko-based browsers provide a handy function, .toSource(), that you can call on any object in your Javascript code to get back a JSON-like representation of that object.

function Cat(name, age)
{
   this.Name = name;
   this.Age = age;
   this.Speak = function() { alert('Meow!'); };
}

var garfield = new Cat('Garfield', 5);
alert(garfield.toSource());

/* garfield.toSource() yields:
({Name:"Garfield", Age:5, Speak:(function () {alert("Meow!");})})
*/

 

Pretty simple, right? You have an object; you want a string. Just invoke the object's .toSource() function.

Serializing objects in other browsers:

Serializing an object manually (as is required by non-Gecko-based browsers) requires a bit of recursion. Simple types like numbers, booleans, and even functions are trivial to represent as strings. Objects, though, are more complicated because they can contain simple types or custom objects (which would need to be serialized themselves). Those "inner" objects could, in turn, contain more custom objects, which would also need to be serialized, and this pattern could (theoretically) go on forever.

In practice, of course, this pattern will (had better) come to an end, and we can leverage that fact to write a recursive function that returns a string representation (in a JSON-like format) of a given object.

The serialize(..) function:

First the code, then the explanation.

function serialize(_obj)
{
   // Let Gecko browsers do this the easy way (arguments objects excepted).
   // The null/undefined guard keeps the property lookups from throwing.
   if (_obj !== null && typeof _obj !== 'undefined' &&
       typeof _obj.toSource !== 'undefined' && typeof _obj.callee === 'undefined')
   {
      return _obj.toSource();
   }

   // Other browsers must do it the hard way
   switch (typeof _obj)
   {
      // numbers, booleans, and functions are trivial:
      // their default string conversion gives us exactly what we want
      case 'number':
      case 'boolean':
      case 'function':
         return String(_obj);

      // strings need to be wrapped in quotes
      // (quotes and special characters inside the string are NOT escaped)
      case 'string':
         return '\'' + _obj + '\'';

      case 'object':
         // typeof null is 'object', but null has no properties to walk
         if (_obj === null) { return 'null'; }

         var str;
         if (_obj.constructor === Array || typeof _obj.callee !== 'undefined')
         {
            str = '[';
            var i, len = _obj.length;
            for (i = 0; i < len; i++)
            {
               // a comma goes before every element except the first,
               // which also keeps empty arrays from breaking
               str += (i > 0 ? ',' : '') + serialize(_obj[i]);
            }
            str += ']';
         }
         else
         {
            str = '{';
            var key;
            for (key in _obj) { str += key + ':' + serialize(_obj[key]) + ','; }
            // strip the trailing comma left by the loop
            str = str.replace(/,$/, '') + '}';
         }
         return str;

      default:
         return 'UNKNOWN';
   }
}

Explaining a recursive function can be difficult, but I'll give it a shot:

The function accepts just one parameter: the object (_obj) to be serialized. If you'll remember, I mentioned previously that simple types (string, boolean, number, etc.) were trivial because they all have an obvious string representation already. Complex types, though, are more difficult because they can be made up of additional complex types, which in turn could be made up of additional complex types (and so on).

Of course, this pattern will eventually end; ultimately, everything is made up of simple types that have a string representation. The trick is figuring out how to traverse through this maze of "types within types." Recursion (simply stated: a function that calls itself) is perhaps the easiest way to solve this "types within types" problem.

Recursive functions always have a termination case: something that causes the function to stop calling itself. Otherwise, the function would recurse forever (or, in practice, until the browser runs out of stack space). In our function, there are actually four different cases in which serialize(..) doesn't need to call itself:

  1. typeof _obj is 'number'
  2. typeof _obj is 'boolean'
  3. typeof _obj is 'function'
  4. typeof _obj is 'string'

If any of the above four conditions is met, returning a string representation is trivial, so we simply do it.
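Here's a rough sketch of just those four termination cases in isolation. The helper name serializeSimple is made up for illustration and isn't part of the function above; it returns null for anything that would need recursion.

```javascript
// Hypothetical helper: handles only the non-recursive cases.
function serializeSimple(value) {
   switch (typeof value) {
      case 'number':
      case 'boolean':
      case 'function':
         // String() applies the value's default string conversion
         return String(value);
      case 'string':
         // strings get wrapped in quotes (no escaping in this sketch)
         return '\'' + value + '\'';
      default:
         // not a simple type; a full serializer would recurse here
         return null;
   }
}

var results = [
   serializeSimple(42),       // "42"
   serializeSimple(true),     // "true"
   serializeSimple('hello')   // "'hello'"
];
```

Each call returns immediately without calling itself, which is what makes these the cases where the recursion bottoms out.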

You'll notice that strings are treated separately from the other three types. You'd think strings would be the most trivial case, but there is one thing we must do before returning the string representation of _obj when it is of type string: wrap it in quotes. We need to do this because JSON expects quoted strings (strict JSON requires double quotes; the single quotes used here are fine for eval(..), which treats the result as a Javascript literal), and if we ever want to be able to eval(..) the result of a serialize(..) call, we'll need these quotes.
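Simply wrapping a string in quotes breaks as soon as the string itself contains a quote, a backslash, or a line break. As a minimal sketch of one way to handle that (quoteString is a hypothetical helper, not part of the function above, and it is not an exhaustive escaper):

```javascript
// Hypothetical helper: wraps a string in single quotes and escapes
// the characters that would otherwise break the resulting literal.
function quoteString(s) {
   return '\'' + s
      .replace(/\\/g, '\\\\')   // escape backslashes first
      .replace(/'/g, '\\\'')    // then single quotes
      .replace(/\n/g, '\\n')    // newlines...
      .replace(/\r/g, '\\r')    // ...and carriage returns
      + '\'';
}

var quoted = quoteString("it's");   // the 7 characters: 'it\'s'
```

Backslashes have to be escaped before anything else; otherwise the backslashes added by the later replacements would get doubled up.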

The only other case to deal with is when _obj is of type object. Within this case, though, there are two "sub-cases" we need to deal with. The first is when _obj is an Array, or when it has a .callee property (more on that later). The second is, well, anything else.

Basic Object Types

Your everyday, run-of-the-mill object in Javascript can be represented as JSON with:

{ key1: val1, key2: val2, ... }

Here the keys are strings and the vals can be any simple type, or some custom object you've dreamed up. The logic I've used is to simply loop through a given object's keys and build a string of comma-delimited, serialized key/value pairs that are wrapped in { and }.

Notice I said serialized key/value pairs. Here, our function is calling itself as it builds the object representation. This ensures that any objects within the object being serialized will also be serialized. If we didn't do this, we'd end up with a lot of strings that looked (something) like this:

{ key1: [object Object], key2: [object Object], etc... }

And that's clearly not what we want. We want those inner objects to be serialized as well, and that's what the recursive nature of our function will take care of for us.
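To see the problem concretely, here's a small sketch of the naive, non-recursive approach. The pet object is made up for illustration.

```javascript
var pet = { name: 'Garfield', owner: { name: 'Jon' } };

// Naive approach: concatenate each value without serializing it first.
var parts = [];
for (var key in pet) {
   // skip anything inherited through the prototype chain
   if (Object.prototype.hasOwnProperty.call(pet, key)) {
      parts.push(key + ':' + pet[key]);
   }
}
var naive = '{' + parts.join(',') + '}';
// naive is "{name:Garfield,owner:[object Object]}"
```

The inner owner object has been flattened to [object Object], so its contents are lost. Calling the serializer on each value instead of concatenating it directly is what avoids this.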

Arrays

When _obj is not just any object but, more specifically, an Array, we have a better way of representing it as a string:

[ val1, val2, val3, ... ]

The logic I used here is to simply iterate through the array, building a comma-delimited list of serialized values wrapped in [ and ]. Arrays in Javascript already have a .toString() function, but we can't use it here; if the Array contains objects, then the result of the Array's .toString() ends up looking like this (with no surrounding brackets, either):

[object Object],[object Object]

Again, not what we want, so we need to make sure we recursively serialize all of the elements in the array.
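As a rough sketch of the array branch on its own, here's a trimmed-down serializer that only knows about arrays, strings, and other primitives. The name serializeList is hypothetical, and Array.isArray is assumed to be available (it is in any ES5 environment).

```javascript
// Hypothetical, trimmed-down serializer for illustration only.
function serializeList(value) {
   if (Array.isArray(value)) {
      var parts = [];
      for (var i = 0; i < value.length; i++) {
         parts.push(serializeList(value[i]));   // recurse into each element
      }
      // join() inserts commas only between elements
      return '[' + parts.join(',') + ']';
   }
   if (typeof value === 'string') {
      return '\'' + value + '\'';
   }
   return String(value);
}

var nested = serializeList([1, 'two', [3, [4]], []]);
// nested is "[1,'two',[3,[4]],[]]"
```

Collecting the pieces and joining them avoids any special handling for the last element, and an empty array naturally comes out as [].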

Arguments (the .callee "gotcha")

There's one bit of code I haven't discussed yet, and it deals with the (possible) .callee property of the passed-in _obj. It turns out that .toSource() (the native function used by Gecko-based browsers) doesn't do anything very useful when called on an arguments object.

The arguments object is an array-like (but not an Array) object that is automatically available within the scope of every function. Unfortunately, no matter what is contained within that arguments object, calling .toSource() on it will always return "({})".

In order to be able to serialize an arguments object then, we need some way to detect it and then treat it like an array. The .callee property is a good choice because arguments objects have it, but other objects (to the best of my knowledge) do not.
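One caveat with the .callee check: in strict-mode code, reading arguments.callee throws a TypeError. As a sketch of an alternative detection that sidesteps this (isArguments is a hypothetical helper, not what the function above uses), the internal class name reported by Object.prototype.toString can be checked instead:

```javascript
// Hypothetical helper: detects arguments objects without touching .callee.
function isArguments(value) {
   return Object.prototype.toString.call(value) === '[object Arguments]';
}

function probe() {
   // arguments is available inside every non-arrow function
   return isArguments(arguments);
}

var checks = [probe(1, 2, 3), isArguments([1, 2, 3]), isArguments({ length: 0 })];
// checks is [true, false, false]
```

Real arrays report [object Array] and plain objects report [object Object], so neither is mistaken for an arguments object.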

In the case of the serialize(..) function, I decided to use the native .toSource() function whenever it is available, unless the object is an arguments object, in which case I send Gecko-based browsers down the same path as all other browsers for serialization.

Finally, just for good measure, I've added a default case which returns the string UNKNOWN to handle a situation where no other case applies. Of course, we probably don't want UNKNOWN showing up in our serialized strings, but it probably won't do much (immediate) harm if it ever does show up, and its presence would be a helpful indicator that there is some case not being handled (that probably needs to be).

Conclusion:

What would you ever use a function like this for? Well, in my case, I wanted to use a Javascript object as the key for a hash, but if I tried to do that, Javascript would just represent all of my (different) objects as the same string: [object Object], which wouldn't do me any good. By serializing the object, I can then use that serialized representation as a key in the hash.
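Here's a minimal sketch of that hash-key idea. To keep it short and self-contained, the built-in JSON.stringify stands in for serialize(..); the cache, remember, and lookup names are made up for illustration.

```javascript
// JSON.stringify is a stand-in for serialize(..) in this sketch.
var cache = {};

function remember(keyObject, value) {
   cache[JSON.stringify(keyObject)] = value;
}

function lookup(keyObject) {
   return cache[JSON.stringify(keyObject)];
}

remember({ x: 1, y: 2 }, 'first point');
remember({ x: 3, y: 4 }, 'second point');

var found = lookup({ x: 1, y: 2 });        // 'first point'
var collapsed = String({ x: 1, y: 2 });    // '[object Object]'
```

One thing to watch with this approach: the key depends on the order in which properties were added, so { x: 1, y: 2 } and { y: 2, x: 1 } produce different keys even though they hold the same data.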

It's worked well for my needs so far, but I haven't tested it a great deal. If you find any bugs or shortcomings, please let me know.

As always, comments are welcome.

