
NAME
cjson: makekeys, JSON2Token, Token2JSON - fast JSON tokenizer

SYNOPSIS
include "cjson.m";
        cjson: CJSON;
        JSON2Token, Token2JSON, END_OBJ, UNK_KEY, EMPTY_KEY: import cjson;
cjson = load CJSON CJSON->PATH;

makekeys: fn(a: array of string): ref Keys;

JSON2Token: adt{
        new:    fn(a: array of byte): ref JSON2Token;
        obj:    fn(t: self ref JSON2Token);
        arr:    fn(t: self ref JSON2Token);
        close:  fn(t: self ref JSON2Token): int;
        getkey: fn(t: self ref JSON2Token, k: ref Keys): int;
        getnull:fn(t: self ref JSON2Token): int;
        getbool:fn(t: self ref JSON2Token): int;
        gets:   fn(t: self ref JSON2Token): string;
        getr:   fn(t: self ref JSON2Token): real;
        getn:   fn(t: self ref JSON2Token): int;
        skip:   fn(t: self ref JSON2Token);
        gettype:fn(t: self ref JSON2Token): int;
        end:    fn(t: self ref JSON2Token);
};

Token2JSON: adt{
        new:    fn(sizehint: int): ref Token2JSON;
        obj:    fn(j: self ref Token2JSON): ref Token2JSON;
        arr:    fn(j: self ref Token2JSON): ref Token2JSON;
        close:  fn(j: self ref Token2JSON): ref Token2JSON;
        key:    fn(j: self ref Token2JSON, k: ref Keys, id: int): ref Token2JSON;
        str:    fn(j: self ref Token2JSON, s: string): ref Token2JSON;
        num:    fn(j: self ref Token2JSON, n: int): ref Token2JSON;
        bignum: fn(j: self ref Token2JSON, n: big): ref Token2JSON;
        realnum:fn(j: self ref Token2JSON, n: real): ref Token2JSON;
        bool:   fn(j: self ref Token2JSON, n: int): ref Token2JSON;
        null:   fn(j: self ref Token2JSON): ref Token2JSON;
        encode: fn(j: self ref Token2JSON): array of byte;
};

DESCRIPTION
This module provides a faster (by about 5 times) and easy-to-use alternative to json(2).

To parse and generate as fast as possible, it doesn't validate input strictly, and so may accept incorrectly formed JSON. It is also unable to return unknown object key names: all possible object key names must be precompiled using makekeys and provided to getkey while parsing or to key while generating.

makekeys returns the compiled form of the known object keys in a. You may compile keys separately for each type of object, or compile all possible keys of all possible objects at once. The returned value is needed to call getkey and key. Usually makekeys is called only once, when your application initializes. It raises an exception on duplicate keys in a.
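
As a sketch of the per-object approach, assuming the declarations from the synopsis are in effect and cjson is already loaded, keys for two hypothetical object types could be compiled once at start-up. The key names, the U_* and A_* constants and the initkeys function are illustrative only and not part of the module.

```limbo
# Hypothetical key ids; each constant is the index of its name in the array.
U_NAME, U_AGE, U_ADDR: con iota;
A_CITY, A_ZIP: con iota;

userkeys, addrkeys: ref CJSON->Keys;

# Call once during application initialization, after cjson is loaded.
initkeys()
{
        # One compiled key set per object type.
        userkeys = cjson->makekeys(array[] of {
                U_NAME  => "name",
                U_AGE   => "age",
                U_ADDR  => "addr",
        });
        addrkeys = cjson->makekeys(array[] of {
                A_CITY  => "city",
                A_ZIP   => "zip",
        });
}
```

Each key set is then passed to getkey or key while handling objects of the matching type.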

Parsing JSON (JSON2Token)
new creates and returns a new JSON2Token, which is then used to parse JSON from a (which may contain any number of complete JSON values).

obj ensures the next token is { and skips it. The maximum depth of open objects/arrays is currently limited to 16. It raises an exception if the next token isn't { .

arr ensures the next token is [ and skips it. The maximum depth of open objects/arrays is currently limited to 16. It raises an exception if the next token isn't [ .

close, after an opening obj, ensures the next token is } , skips it and returns 1. It raises an exception if the next token isn't } .

close, after an opening arr, checks whether the next token is ] ; if so, it skips it and returns 1, otherwise it does nothing and returns 0. Call close and check the returned value when you don't know how many elements are left in the array.
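
The array-reading idiom this implies can be sketched as a small helper. It assumes the declarations from the synopsis are in effect; the helper name getstrings is hypothetical and not part of the module.

```limbo
# Hypothetical helper: read a JSON array of strings into a list,
# preserving the order of the elements.
getstrings(t: ref JSON2Token): list of string
{
        rev: list of string;
        t.arr();                        # raises unless the next token is [
        while(!t.close())               # close returns 1 once ] has been skipped
                rev = t.gets() :: rev;  # prepending builds the list in reverse

        # Reverse the list to restore the original element order.
        l: list of string;
        for(; rev != nil; rev = tl rev)
                l = hd rev :: l;
        return l;
}
```

Because close is tested before each element is read, an empty array yields an empty list.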

getkey ensures the next token is an object key. It parses the key and matches it against the list of known keys in k (returned by makekeys). If a match is found, it returns the index of that key in a (the array passed to makekeys). For an empty key it returns the EMPTY_KEY constant. If no match is found, it returns the UNK_KEY constant. If there are no more keys in this object, it returns the END_OBJ constant. It raises an exception if the next token isn't an object key or } .

getnull checks whether the next token is null ; if so, it skips it and returns 1, otherwise it does nothing and returns 0. Call getnull when the next value may be either defined or null.

getbool ensures the next token is true or false and skips it. It returns 1 if the token was true , and 0 otherwise. It raises an exception if the next token isn't true or false .

gets ensures the next token is a string and returns it (unquoted). It raises an exception if the next token isn't a string.

getr ensures the next token is a number and returns it (as real). It raises an exception if the next token isn't a number.

getn ensures the next token is a number and returns it (as int). If the token is a real number rather than an integer, getn leaves the unparsed tail of that token in the input, which will most likely break parsing of the next token. It raises an exception if the next token isn't a number.

skip skips the next token, including complex tokens such as objects or arrays. Call it to skip the values of UNK_KEY keys. It raises an exception if it is unable to skip the token.

gettype checks the type of the next token and returns one of these values: 0 if there are no more tokens, -1 for a bad token, '"' for string, '0' for number, 'n' for null, 't' for true, 'f' for false, '{' for opening object, '}' for closing object, '[' for opening array, ']' for closing array.

end ensures there are no more tokens available. It raises an exception if more tokens are available.
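
Since most parsing methods report problems by raising an exception, callers will usually want to wrap parsing in an exception block. The following is a minimal sketch, assuming the declarations from the synopsis are in effect; the function name parseint is hypothetical, and because this page does not list the exception strings, the handler catches everything with the "*" pattern.

```limbo
# Hypothetical wrapper: parse a buffer that should hold exactly one integer.
# Returns (value, nil) on success, or (0, exception text) on failure.
parseint(buf: array of byte): (int, string)
{
        n := 0;
        err: string;
        {
                t := JSON2Token.new(buf);
                n = t.getn();   # raises if the next token isn't a number
                t.end();        # raises if more tokens follow the number
        } exception e {
        "*" =>
                n = 0;
                err = e;        # e holds the text of the raised exception
        }
        return (n, err);
}
```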

Generating JSON (Token2JSON)
new creates and returns a new Token2JSON, which is then used to build JSON from tokens added by calling the other methods; the JSON is finally generated by encode. The sizehint helps choose the initial size of the buffer that stores the JSON. If it is smaller than the generated JSON requires, the buffer grows automatically as needed, but this may slow down JSON generation.

All methods which append tokens return the Token2JSON, so calls can be chained one after another (see EXAMPLES).

obj appends { . arr appends [ . close appends either } or ] to close the current obj or arr.

key appends the key name followed by : ; it needs k (returned by makekeys) and the key's id to find the key name.

str appends the quoted string s. num, bignum and realnum append the number n.

bool appends true if n != 0, or false otherwise. null appends null .

encode returns the current JSON. It raises an exception if it detects incomplete JSON (an arr or obj that is not closed).
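
Nested structures are built by pairing every obj or arr with its own close. As a sketch, assuming the declarations from the synopsis are in effect, an array of objects could be generated as follows. The P_* constants and the function name encodepeople are hypothetical, and keys is assumed to come from makekeys(array[] of {P_NAME => "name", P_AGE => "age"}).

```limbo
# Hypothetical key ids matching the array passed to makekeys.
P_NAME, P_AGE: con iota;

# Encode a list of (name, age) tuples as a JSON array of objects.
encodepeople(keys: ref CJSON->Keys, people: list of (string, int)): array of byte
{
        j := Token2JSON.new(256).arr();         # open the outer array
        for(; people != nil; people = tl people){
                (name, age) := hd people;
                j.obj()                         # open one object per element
                        .key(keys, P_NAME).str(name)
                        .key(keys, P_AGE).num(age)
                        .close();               # close that object
        }
        return j.close().encode();              # close the array, then encode
}
```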

EXAMPLES
Struct: adt{
        str: string;
        r:   real;
        opt: int;
        arr: list of int;
};

K_STR, K_REAL, K_OPT, K_ARR: con iota;

        keys := cjson->makekeys(array[] of {
                K_STR   => "str",
                K_REAL  => "real",
                K_OPT   => "opt",
                K_ARR   => "arr",
        });

        struct := ref Struct;

### Parsing

        t := JSON2Token.new(array of byte "{\"real\": -2.3e2, \"arr\":[10,20]}");

        t.obj();
OBJ:    for(;;) case t.getkey(keys) {
        END_OBJ =>      break OBJ;
        UNK_KEY =>      t.skip();
        K_STR =>        struct.str = t.gets();
        K_REAL =>       struct.r = t.getr();
        K_OPT =>        if(!t.getnull())
                                struct.opt = t.getn();
        K_ARR =>        t.arr();
                        while(!t.close())
                                struct.arr = t.getn() :: struct.arr;
        }

### Generating

        json := Token2JSON.new(128)
                .obj()
                .key(keys, K_STR)       .str(struct.str)
                .key(keys, K_REAL)      .realnum(struct.r)
                .key(keys, K_OPT)       .num(struct.opt)
                .key(keys, K_ARR)       .arr();
        for(l := struct.arr; l != nil; l = tl l)
                json                    .num(hd l);
        json                            .close()
                                        .close();

        text := string json.encode();





BUGS
These bugs are intentional, to increase speed:

true/false/null are detected by their first letter

numbers with leading zeroes are allowed

structure isn't fully validated: {"a":"b",} and 1, are accepted as valid JSON

CJSON(2) Rev:  Tue Mar 31 02:42:38 GMT 2015