Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid escaping forward slash #57

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

schnell18
Copy link

Forward slash may be escaped according to the JSON specification. In practice, escaped forward slash is not interpreted properly by most applications that consume JSON. For instance, it's very common to embed http(s) link in the JSON file. Using cjson will breaks these applications.
So please consider accept this pull request to make cjson a good fit for these applications.

Thanks in advance!

- add unit test to verify forward slash is not escaped
@MorseWayne
Copy link

this is really helpful to solve the foward slash escape problem, i used your code and compiled my cjson.so, now everything is ok!

@ktalebian
Copy link

@mpx can this pull request be merged? This is regarding this issue #66.

@mpx
Copy link
Owner

mpx commented Jun 8, 2021

Can you please provide some example decoders that fail with an escaped slash? Or maybe significant APIs? Just curious - I'd like to understand the impact better.

Conversely, is anyone aware of a need to escape / in some cases? It appears escaping slash is not particularly common in other JSON encoders. Perhaps Lua CJSON is an outlier supporting escaped forward slash?

If so, it might be better to stop escaping forward slash. If there is a clearly better choice we should do that.

@tai-x
Copy link

tai-x commented Oct 8, 2021

@mpx Here some examples:

    local t = {
      some_secure_token = 'P9JUJs/li1tZAAzjlveIBLROlLMkQrzxMzvZmBP/tDE='
    }
    print(require('inspect').inspect(cjson.encode(t)))

This prints:

'{"token":"P9JUJs\\/li1tZAAzjlveIBLROlLMkQrzxMzvZmBP\\/tDE="}'

In #66 @ktalebian mentioned something similar with an url like:

cjson.encode({url =  "https://google.com"})

Resulting in:

{"url":"https:\/\/google.com"}

@mpx
Copy link
Owner

mpx commented Oct 9, 2021

@tai-x Yes, forward slashes are currently escaped - this is valid JSON.

I'm particularly interested to understand:

  1. What breaks? Eg, Which decoders do not handle escaped forward slashes? Which APIs break when foward slashes are escaped?
  2. Is anyone aware of situations where something might break if forward slashes were never escaped?

In theory, either escaping or not escaping should be fine in all cases according to the spec. I originally escaped forward slash since it seemed to be the potentially safer choice (but I have no evidence this is true). Apparently escaping is uncommon?

In practice, this bug reports that escaping can break some applications. I'd like to confirm what specifically breaks (details are missing from this bug).

I'd prefer to remove forward slash escapes if escaping causes significant breakage somewhere and leaving it unescaped is generally fine. CJSON is already too configurable in places.

@tai-x
Copy link

tai-x commented Oct 11, 2021

Since its valid JSON it probably doesn't break anything. Nonetheless I think when you make a library call like this one (cjson.encode), you would expect as less modification as possible (escape only when the language or so needs it). Escaping the forward slash is not necessary and it is modifying the field. If it's not necessary, what do you gain with mangling the value.

So it's not a encode/decode issue (that's probably fine). It's an encode/then use the result issue. For my particular case, for instance, I'm running a unit test where I have a table that will be compared against a JSON api response. Encoding that table to compare both results on json can't be performed "as is" since forward slashes in the values get escaped, so the assertion fails. I'm manually generating the json now with regular string concatenation, from that table.

If #76 was reported, someone put on a PR (#57), and there are stack overflow questions about this, it means there are many persons seeing this issue, because very few people interact on this tools.

@kljensen
Copy link

I agree with @tai-x , since escaping forward slashes is permitted but not required, it is preferable to not do it. I believe also that the most common behavior is to not escape forward slashes. (Maybe the escaping of forward slashes is most useful when json is embedded in HTML script tags?) Of course, @mpx has a valid point---maybe users of lua-cjson are already depending upon this behavior and, if so, changing could add a headache.

@tai-x
Copy link

tai-x commented Oct 12, 2021

@kljensen Backwards compatibility is a very important concern, we should assume that this would be a breaking change for someone. To deal with that, I guess MAJOR in semantic versioning should express that this is a breaking change.

@humanoid2050
Copy link

I concur with this suggestion. Even if the current behavior of escaping forward slash works correctly in most (or even all) instances, it's a bit odd to suddenly see backslashes that were not present in your original data before encoding.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants