NOTE: This is my first parser and I pushed it to GitHub just to showcase the project. It is not production-ready in any sense.
This is a small project showcasing how JSON syntax is parsed and how a parser can turn JSON syntax into an object that JavaScript can read.
The parser was built based on this document.
This parser can parse the following types:
string
number
object
array
true (in the parser this is called bool)
false (in the parser this is called bool)
null
The stages will be showcased based on the following JSON syntax:
{
"key": "value"
}
Tokenization is the first process executed here. Tokenization splits the content (a string) into individual characters, which are then labeled with different identifiers. The list of identifiers is:
WHITESPACE
OBJECT_START
OBJECT_END
STRING_START
STRING_CONTENT
STRING_END
COLON
NUMBER
COMMA
DOT
ARRAY_START
ARRAY_END
BOOL
UNKNOWN
NULL
Each character gets labeled with one of these identifiers. In this case the tokenization output would be:
[
{
"start": 0,
"end": 1,
"raw": "{",
"identifier": "OBJECT_START"
},
{
"start": 1,
"end": 2,
"raw": "\r",
"identifier": "WHITESPACE",
"child": true
},
...WHITESPACE
{
"start": 7,
"end": 8,
"raw": "\"",
"identifier": "STRING_START",
"child": true
},
...STRING_CONTENT
{
"start": 10,
"end": 11,
"raw": "y",
"identifier": "STRING_CONTENT",
"child": true
},
{
"start": 11,
"end": 12,
"raw": "\"",
"identifier": "STRING_END",
"child": true
},
{
"start": 12,
"end": 13,
"raw": ":",
"identifier": "COLON",
"child": true
},
{
"start": 13,
"end": 14,
"raw": " ",
"identifier": "WHITESPACE",
"child": true
},
{
"start": 14,
"end": 15,
"raw": "\"",
"identifier": "STRING_START",
"child": true
},
...STRING_CONTENT
{
"start": 19,
"end": 20,
"raw": "e",
"identifier": "STRING_CONTENT",
"child": true
},
{
"start": 20,
"end": 21,
"raw": "\"",
"identifier": "STRING_END",
"child": true
},
{
"start": 21,
"end": 22,
"raw": "\r",
"identifier": "WHITESPACE",
"child": true
},
{
"start": 22,
"end": 23,
"raw": "\n",
"identifier": "WHITESPACE",
"child": true
},
{
"start": 23,
"end": 24,
"raw": "}",
"identifier": "OBJECT_END"
}
]
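Based on the output above, a single token has roughly the following shape. This is a TypeScript sketch inferred from the example output, not necessarily the exact declaration used in the project:

```ts
// Identifier names as listed above.
type Identifier =
  | "WHITESPACE" | "OBJECT_START" | "OBJECT_END"
  | "STRING_START" | "STRING_CONTENT" | "STRING_END"
  | "COLON" | "NUMBER" | "COMMA" | "DOT"
  | "ARRAY_START" | "ARRAY_END" | "BOOL" | "UNKNOWN" | "NULL";

// Shape of a token as it appears in the tokenizer output above.
interface Token {
  start: number;          // index of the character in the input string
  end: number;            // index right after the character
  raw: string;            // the raw character taken from the input
  identifier: Identifier; // what kind of character this is
  child?: boolean;        // appears on tokens nested inside the top-level value in the example
}
```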
Lexing is the second process and it is fed the Token[] type generated by the tokenization process. Lexing returns the Entity[] type, which contains different entities. An entity in this case represents a group of connected tokens that together produce one type. For example, if STRING_START, STRING_CONTENT and STRING_END are registered as tokens, the resulting entity will be of the string type.
Entities can be of type:
string
object
number
array
bool
null
whitespace
colon
comma
unknown
In this case the lexer output would be:
[
{
"type": "object",
"children": [
{
"type": "whitespace",
"value": "\r\n "
},
{
"type": "whitespace",
"value": "\n "
},
{
"type": "whitespace",
"value": " "
},
{
"type": "whitespace",
"value": " "
},
{
"type": "whitespace",
"value": " "
},
{
"type": "whitespace",
"value": " "
},
{
"type": "string",
"value": "key"
},
{
"type": "colon",
"value": ":"
},
{
"type": "whitespace",
"value": " "
},
{
"type": "string",
"value": "value"
},
{
"type": "whitespace",
"value": "\r\n"
},
{
"type": "whitespace",
"value": "\n"
}
]
}
]
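Based on the output above, an entity has roughly the following shape. Again, this is a TypeScript sketch inferred from the example output, not necessarily the project's exact declaration:

```ts
// Entity types as listed above.
type EntityType =
  | "string" | "object" | "number" | "array" | "bool"
  | "null" | "whitespace" | "colon" | "comma" | "unknown";

// Shape of an entity as it appears in the lexer output above.
interface Entity {
  type: EntityType;
  value?: string;      // raw text for leaf entities such as string and whitespace
  children?: Entity[]; // nested entities for parent types such as object and array
}
```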
AST generation is the third process. It is fed the Entity[] type and generates the ASTNode[] type that can finally be recognized by the final process.
An AST node can be of type:
object
string
array
number
bool (true | false)
null
A node can also have different optional data fields, like:
children (only for parent-based types, in this case array and object)
key
value
The type field, on the other hand, is required and always defined.
In this case the AST output would be:
[
{
"type": "object",
"children": [
{
"type": "string",
"value": "value",
"key": "key"
}
]
}
]
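Putting the node types and the optional fields together, an AST node looks roughly like this (a TypeScript sketch inferred from the example output above):

```ts
type ASTNodeType = "object" | "string" | "array" | "number" | "bool" | "null";

// Shape of an AST node as it appears in the AST output above.
interface ASTNode {
  type: ASTNodeType;    // required, always defined
  key?: string;         // property name when the node is an object member
  value?: string;       // raw value for leaf nodes
  children?: ASTNode[]; // only for parent-based types (object and array)
}
```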
The fourth and final process parses the ASTNode[] type and converts the node types to native JavaScript types.
The output is:
{ key: 'value' }
You can now access the key field inside this object.
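To illustrate what this last step does, here is a minimal sketch of how an ASTNode[] (using the ASTNode shape sketched above) could be turned into native JavaScript values. This is only an illustration of the idea, not the project's actual implementation:

```ts
// Recursively convert one AST node into a native JavaScript value.
function nodeToValue(node: ASTNode): unknown {
  switch (node.type) {
    case "object": {
      const obj: Record<string, unknown> = {};
      for (const child of node.children ?? []) {
        if (child.key !== undefined) obj[child.key] = nodeToValue(child);
      }
      return obj;
    }
    case "array":
      return (node.children ?? []).map(nodeToValue);
    case "string":
      return node.value;
    case "number":
      return Number(node.value);
    case "bool":
      return node.value === "true";
    case "null":
    default:
      return null;
  }
}

// For the AST above, nodeToValue(ast[0]) produces { key: 'value' }.
```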
First install all packages using npm i and run npm start.
Parses JSON from a file.
Args:
- path -> string - Defines the path of the JSON file
Parses JSON from a string.
Args:
- content -> string - Defines a string containing JSON syntax
To access the different processes you can import the Tokenizer, Lexer, AST and Parser classes. Each of them has a parse method (except AST, which has construct) that accepts the result of the previous process as its argument.
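A minimal sketch of chaining the four stages by hand follows. The import path and the way the classes are instantiated are assumptions, so check the project source for the exact usage:

```ts
import { Tokenizer, Lexer, AST, Parser } from "./src"; // import path is an assumption

const input = '{ "key": "value" }';

// Whether the classes need `new` (or expose static methods) is an assumption.
const tokens = new Tokenizer().parse(input);  // Token[]
const entities = new Lexer().parse(tokens);   // Entity[]
const nodes = new AST().construct(entities);  // ASTNode[]
const result = new Parser().parse(nodes);

console.log(result); // { key: 'value' }
```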
To visualize every process better, use JSON.stringify(<object>, null, 2).
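For example, to pretty-print the token list produced in the sketch above:

```ts
console.log(JSON.stringify(tokens, null, 2));
```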