July 2009: new features

Introduction

Although the JSON format is very stable and version 1.0 of the wxJSON library implements the whole JSON specification, I have received some feature requests from wxJSON users, so I am about to release a few new versions of the library. The first is a workaround that lets the library build correctly on MinGW; the second is an extension of the JSON syntax.

The third version will not add new features, only improvements related to speed. Compared to other JSON implementations, wxJSON is very slow in parsing JSON streams: some simple optimisations should speed it up.

Version 1.1

As is well known, the wxJSON library is a specialized JSON implementation for wxWidgets and it depends only on the wxBase library. No other external library is needed; in particular, wxJSON does not depend on std::string or on STL containers.

To store arrays and the key/value pairs of objects, the library uses the wxWidgets container implementations wxObjArray and wxHashMap. These work fine on nearly all platforms, but some systems cannot compile the library when it uses the wxWidgets container classes. One of them is MinGW, which fails to compile the wxHashMap class because of a redefinition of the container (for the details of this failure see bug ID 2807075 in the wxCode bug tracking system: http://sourceforge.net/tracker/?func=detail&aid=2807075&group_id=51305&atid=462816).

The solution for those systems is to use STL instead of the wxWidgets containers. I tried this solution and it seems to work properly (thanks to Andrejs Cainikovs, who actually tried the solution and wrote to me that it worked).

Besides the compilation problems, a wxWidgets application may also use STL for its own purposes, in which case STL is already available to both the application and wxWidgets. So why not use it for wxJSON, too?

This release adds support for using STL containers for the array (std::vector) and object (std::map) JSON types instead of the wxWidgets containers. This feature is disabled by default; to enable it you have to uncomment the line that contains:

  wxJSON_USE_STL

in the include/wx/json_defs.h file.

Version 1.2

All of you already know that JSON is a text-based data interchange format, but the word data refers mainly to program variables and not to other kinds of data. JSON is not suitable for exchanging binary data such as audio, video, images or complex documents, nor should it be, because it was not developed for that purpose. The goals of JSON are to be simple, human-readable, fast and compact.

On the other hand, there may be particular situations in which a program has to store and/or transmit a small amount of binary data such as a small GIF image (for example a logo), a one- or two-second sound or a tiny memory buffer.

In these situations JSON is very limited: you cannot use JSON strings to store binary data because they are converted to UTF-8, so the only possible solution is to use an array of numbers. For example, if we want to store a simple GIF image we can write something like the following:

  {
    "image" :
    {
      "type"   : "gif",
      "width"  : 160,
      "height" : 160,
      "data"   : [ 32, 160, 255, 47, 89, 47, 123, 85, ... ]
    }
  }

The above may be a solution, but it consumes a lot of space because every byte in the buffer needs roughly 3-4 characters. In addition, the program has to convert the buffer into an array of INTs when writing the JSON text and convert it back when reading the stream.
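To make the space cost of the array-of-numbers approach concrete, here is a minimal sketch in plain standard C++. It does not use wxJSON at all, and the function name BytesToJsonArray is made up for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of wxJSON): serialise a byte buffer
// as the text of a JSON array of integers, e.g. "[ 32, 160, 255 ]".
std::string BytesToJsonArray(const std::vector<std::uint8_t>& bytes)
{
    std::string out = "[ ";
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i > 0) {
            out += ", ";   // separator between consecutive elements
        }
        // Each byte becomes one to three decimal digits.
        out += std::to_string(static_cast<unsigned>(bytes[i]));
    }
    out += " ]";
    return out;
}
```

Comparing the size of the returned text with the size of the input buffer shows the expansion: every byte costs its decimal digits plus the separator.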

wxJSON introduces another JSON data type as an extension of the JSON syntax: the binary buffer type. In order to maintain the text-based format, the binary buffer is encoded as a string of two hexadecimal digits for every byte and it is enclosed in single quotes. The above example will look like the following:

  {
    "image" :
    {
      "type"   : "gif",
      "width"  : 160,
      "height" : 160,
      "data"   : '20A0FF2F592F7B55...'
    }
  }
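The encoding itself (two hexadecimal digits per byte, enclosed in single quotes) can be sketched in plain standard C++ as follows. This is only an illustration of the text format, not the code used inside wxJSON; the names EncodeMemBuffer, DecodeMemBuffer and HexDigitValue are hypothetical, and the choice of uppercase digits simply follows the example above.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encode every byte as two uppercase hexadecimal
// digits and wrap the result in single quotes.
std::string EncodeMemBuffer(const std::vector<std::uint8_t>& bytes)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string out = "'";
    for (std::uint8_t b : bytes) {
        out += digits[b >> 4];     // high nibble
        out += digits[b & 0x0F];   // low nibble
    }
    out += "'";
    return out;
}

// Return the value of one hex digit, or -1 if c is not a hex digit.
static int HexDigitValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Hypothetical helper: decode text of the form 'HHHH...' back into bytes.
// Returns false (and leaves bytes empty) if the text is malformed.
bool DecodeMemBuffer(const std::string& text, std::vector<std::uint8_t>& bytes)
{
    bytes.clear();
    if (text.size() < 2 || text.front() != '\'' || text.back() != '\'') {
        return false;              // missing single quotes
    }
    if ((text.size() - 2) % 2 != 0) {
        return false;              // odd number of hex digits
    }
    for (std::size_t i = 1; i + 2 < text.size(); i += 2) {
        const int hi = HexDigitValue(text[i]);
        const int lo = HexDigitValue(text[i + 1]);
        if (hi < 0 || lo < 0) {
            bytes.clear();
            return false;          // non-hex character in the payload
        }
        bytes.push_back(static_cast<std::uint8_t>((hi << 4) | lo));
    }
    return true;
}
```

With this format every byte costs exactly two characters, compared with the digits plus separator needed by the array of numbers.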

The reader will store such a type in a wxMemoryBuffer object and the wxJSONValue class will have functions to return it:

        bool            wxJSONValue::IsMemBuffer();
        wxMemoryBuffer* wxJSONValue::AsMemBuffer();

A value of that type can be stored in a wxJSONValue by constructing the object from, or assigning to it, a wxMemoryBuffer object or a void pointer together with the buffer size:

        wxJSONValue( const wxMemoryBuffer& mem );
        wxJSONValue( const void* mem, size_t size );

As the memory buffer is not valid JSON text, you have to use a special wxJSONReader flag in order to handle it; otherwise an error will be reported.

By default, the wxJSONWriter will write a wxJSONValue that contains a binary buffer type as an array of INTs thus producing valid JSON text output. If you want to write the special binary buffer type which will be recognized by the reader, you have to use a special writer's flag when constructing it. Note that other JSON implementations will fail to read such a text.

Version 1.3

Compared to other JSON implementations, wxJSON is very slow when it reads UTF-8 streams. I ran some speed tests (see samples/test15.cpp) for reading from / writing to wxString and streams in both ANSI and Unicode builds. The JSON value to be read / written is an array of 10,000 elements each of which contains 4 key/value pairs.
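For readers who want to run a comparable measurement on their own code, the shape of such a test can be sketched in plain standard C++. This is not the code of samples/test15.cpp: the key names, the values and the helper names BuildTestDocument and MeasureSeconds are assumptions made only for this sketch.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical generator: build JSON text for an array of `count` objects,
// each holding four key/value pairs. Keys and values are invented here.
std::string BuildTestDocument(std::size_t count)
{
    std::string out = "[\n";
    for (std::size_t i = 0; i < count; ++i) {
        const std::string n = std::to_string(i);
        out += "  { \"id\" : " + n +
               ", \"name\" : \"item" + n + "\"" +
               ", \"flag\" : " + (i % 2 == 0 ? "true" : "false") +
               ", \"ratio\" : 0.5 }";
        out += (i + 1 < count) ? ",\n" : "\n";   // no comma after the last element
    }
    out += "]\n";
    return out;
}

// Run `work` once and return the elapsed wall-clock time in seconds.
template <typename Func>
double MeasureSeconds(Func work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Usage idea: time whatever read or write call is under test, e.g.
//   const std::string text = BuildTestDocument(10000);
//   double seconds = MeasureSeconds([&] { /* parse or write `text` here */ });
```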

If the JSON text has to be read from / written to a wxString object, the test application took 1.4 and 1.6 seconds in ANSI and Unicode mode, respectively (not really a lot of time, I think). ANSI seems to be faster than Unicode, probably because in ANSI builds wxString objects store strings as one-byte characters.

Using streams, the write process took 2.4 seconds in ANSI builds and 1.9 seconds in Unicode. Now Unicode seems faster, I think because in ANSI builds characters go through a double conversion:

In Unicode builds only the second conversion takes place so Unicode is faster than ANSI.
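The cost difference can be pictured with a small sketch in plain standard C++. It assumes that the two steps are a conversion from the locale charset to Unicode code points followed by a conversion from Unicode to UTF-8, and it assumes ISO-8859-1 as the locale charset because that mapping is trivial. Neither function is wxJSON or wxWidgets code; the names are hypothetical.

```cpp
#include <string>

// Step 1 (assumed to be needed in ANSI builds only): locale charset to
// Unicode code points. ISO-8859-1 is assumed, where every byte maps to
// the code point with the same numeric value.
std::u32string Latin1ToUnicode(const std::string& ansi)
{
    std::u32string out;
    for (unsigned char c : ansi) {
        out += static_cast<char32_t>(c);
    }
    return out;
}

// Step 2 (assumed to be needed in both builds): Unicode code points to
// UTF-8 bytes. Surrogates and out-of-range values are not validated.
std::string UnicodeToUtf8(const std::u32string& text)
{
    std::string out;
    for (char32_t cp : text) {
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Under these assumptions an ANSI build has to call both functions for every string it writes, while a Unicode build already holds Unicode text and only needs the second one.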

In order to speed things up when using UTF-8 streams, there are several optimisations that could be used in the wxJSON writer:

For speeding-up the wxJSON parser, the following tricks can be used:

In ANSI builds, when reading a string or a comment, the parser checks whether each character can be converted to the locale-dependent charset. If it cannot, the Unicode escape sequence is stored in the ANSI string instead. The conversion therefore has to be performed character by character in ANSI builds, and I do not think there is a trick to avoid it.
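The per-character check described above can be illustrated with a short sketch in plain standard C++. It again assumes ISO-8859-1 as the locale charset and takes already-decoded Unicode code points as input. The function names, the uppercase hex digits and the use of surrogate pairs above U+FFFF are assumptions of this sketch, not a description of the wxJSON parser.

```cpp
#include <cstdio>
#include <string>

// Append the escape sequence \uXXXX for one 16-bit code unit.
static void AppendUnicodeEscape(std::string& out, unsigned unit)
{
    char buf[8];
    std::snprintf(buf, sizeof buf, "\\u%04X", unit);
    out += buf;
}

// Hypothetical helper: convert code points to an ISO-8859-1 string,
// storing an escape sequence for every character that has no
// representation in that charset.
std::string UnicodeToAnsiWithEscapes(const std::u32string& text)
{
    std::string out;
    for (char32_t cp : text) {
        if (cp <= 0xFF) {
            out += static_cast<char>(cp);            // representable: store the byte
        } else if (cp <= 0xFFFF) {
            AppendUnicodeEscape(out, static_cast<unsigned>(cp));
        } else {
            // Outside the BMP: write a UTF-16 surrogate pair.
            const unsigned v = static_cast<unsigned>(cp) - 0x10000;
            AppendUnicodeEscape(out, 0xD800 + (v >> 10));
            AppendUnicodeEscape(out, 0xDC00 + (v & 0x3FF));
        }
    }
    return out;
}
```

The point of the sketch is that the decision is taken for each character individually, which is why the work cannot be handed to a single bulk conversion call.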

Use of std::string and wxJSON

Another speed issue when reading from / writing to UTF-8 streams is that some wxJSON users use the library for JSON data but store their strings in the std::string class rather than in wxString. I do not know why they do so, but the problem in using wxJSON this way is that on some platforms (maybe Windows?) the std::string class stores strings internally in UTF-8 format.

As a consequence, when they want to write to UTF-8 streams they have to convert their strings to wxString (which uses UCS-2 or UCS-4 encoding) and then back to UTF-8: clearly, this unnecessary double conversion slows things down.

I am very sorry for this issue, but wxJSON was written to make JSON data easily accessible from wxWidgets. For those users my only tip is to use another JSON implementation that natively uses std::string for storing strings. For example, they can take a look at jsoncpp (http://sourceforge.net/projects/jsoncpp/), from which the wxJSON interface is heavily derived.


Generated on Thu Oct 22 18:15:09 2009 for wxJSON by  doxygen 1.5.5