Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #221 usage of new Function(...) for CSP #223

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

ajainarayanan
Copy link

@ajainarayanan ajainarayanan commented Feb 24, 2019

Context:

  • Strict CSP policy doesn't allow the use of new Function as it could cause security vulnerabilities (it is similar to doing eval)

Fixes:

  • As part of using avsc in a strict csp environment I have converted all the usages of new Function to just plain function.

Note:

  • The implementation of RecordType.prototype._update still has an issue. One unit test fails for it.
  • I am not entirely aware of the performance degradation because of these changes. If there is a benchmark that already exists I could run the same with my change to measure how slow the implementation is.
  • I didn't really introduce a flag to use the older version vs the new one for CSP based environments. Thought lesser code and lesser branches. If we need to have both available as choice I would vote for building separate files (may be avsc.csp.min.js?). I will let @mtth decide.
    The reason I didn't go with both implementations co-existing is because people who do not care about either environments shouldn't be forced to ship code handling environments they are not interested in.

Fixes #221

Copy link
Owner

@mtth mtth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, @ajainarayanan.

I am not entirely aware of the performance degradation because of these changes. If there is a benchmark that already exists I could run the same with my change to measure how slow the implementation is.

You can use npm run perf; it runs a small script which benchmarks decoding, encoding, and validation for a few sample schemas. Running it with the current changes on a Linux machine (node 10.13), we're seeing ~2x to ~10x slowdowns:

Before

fromBuffer toBuffer isValid (ops/sec)
15,059,992 1,688,823 51,950,334 ArrayInt.avsc
1,615,493 818,550 30,913,164 ArrayString.avsc
4,907,738 1,448,752 42,524,391 Bytes.avsc
326,697 308,362 755,749 Cake.avsc
1,350,643 811,312 11,572,852 Coupon.avsc
21,100,681 1,802,179 47,665,039 Double.avsc
18,265,202 1,732,785 24,690,402 Enum.avsc
1,128,710 728,342 1,716,808 HistoryItem.avsc
1,726,531 941,075 21,684,065 Human.avsc
25,850,704 1,828,579 47,363,765 Int.avsc
13,946,536 1,711,799 29,267,844 Long.avsc
122,902 111,099 323,444 PciEvent.avsc
8,177,425 1,560,454 9,259,604 Person.avsc
4,009,543 1,176,586 19,475,759 ProtobufTest.avsc
5,902,378 1,248,164 44,461,277 String.avsc
70,933 46,667 193,406 Tile.avsc
6,378,196 1,379,425 9,844,583 Union.avsc
679,422 516,834 1,212,194 User.avsc

After

fromBuffer toBuffer isValid (ops/sec)
5,324,210 1,545,379 6,640,302 ArrayInt.avsc
1,186,656 790,020 5,198,795 ArrayString.avsc
2,820,988 1,304,473 6,121,860 Bytes.avsc
147,498 161,153 0 Cake.avsc
517,400 460,578 398,507 Coupon.avsc
3,056,212 1,431,762 4,730,325 Double.avsc
4,768,575 1,487,743 4,080,362 Enum.avsc
377,993 452,471 306,100 HistoryItem.avsc
709,535 584,251 627,045 Human.avsc
5,098,550 1,441,949 4,241,981 Int.avsc
4,410,719 1,370,996 4,320,919 Long.avsc
62,715 71,660 0 PciEvent.avsc
1,620,208 1,057,403 1,050,400 Person.avsc
892,477 702,414 606,041 ProtobufTest.avsc
2,703,178 1,131,277 4,580,292 String.avsc
26,971 26,781 29,592 Tile.avsc
2,745,252 1,139,151 2,747,374 Union.avsc
198,284 252,656 142,919 User.avsc

Given the magnitude of the performance hit, we won't be able to default to this implementation. Unless you are able to bring the difference under ~5%, we will want to make it optional.

lib/types.js Outdated Show resolved Hide resolved
function ConstructorFunction() {
return function Branch$(val) {
if (~name.indexOf('.')) {
this[`${name}`] = val;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avsc currently runs on all versions of node since 0.11, it would be good to keep compatibility the same. Can you remove all features not supported by ECMAScript 5 (template literals, let, etc.)?

Copy link
Author

@ajainarayanan ajainarayanan Feb 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mtth It is easier to remove the usage of template literals and other es6 features but one of the core parts of the refactor is the use of spread operator to pass in the arguments without any function signature change. If not for it we will be forced to change the signature which I was not sure would be better.

For instance, in the existing implementation we do something like,

 body += '  return new ' + uname + '(' + innerArgs.join() + ');\n};';

this will eventually resolve to something like,

return new Person(arg1, arg2, arg3);

However when we want to flatten it out into a function we would not able to refactor it as,

innerArgs = [arg1, arg2, arg3];
return new self._constructor(innerArgs)

we need the spread operator to maintain the same implementation like,

innerArgs = [arg1, arg2, arg3]
return new self._constructor(...innerArgs);

Let me know if it makes sense. I am open to other ideas as well if this something that can be solved with js that is compatible with Node 0.11

I went with ES6 implementation since our primary usage is in modern browsers.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you be open to the idea of introducing a build step which will compile and build files that are compatible for Node 0.11? Just throwing it out to see if thats a possibility that we are open to.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can emulate the spread operator by applying on bind:

function Foo(a, b) { this.a = a; this.b = b; } // Example constructor.
var args = [undefined, 1, 2]; // The dynamic array of arguments to apply.
var BoundFoo = Foo.bind.apply(Foo, args); // Bind the array to the constructor.
var foo = new BoundFoo(); // Foo { a: 1, b: 2 }

Introducing a build step sounds acceptable if it doesn't introduce too much complexity. It should be done separately from this PR though.

function ConstructorFunction(outerArgs) {
const constructorName = self._getConstructorName();
var innerFunction = ({
[constructorName]: function(...innerArgs) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why nest the function inside an object here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so that the function can get a dynamic name. This is again only available in ES2015. An example of it would be something like this,

var fun1 = function() { console.log('fun1'); }
var funname = 'fun2';
var dfun = ({
  [funname]: function() { console.log('dynamic function'); }
})[funname];

dfuninstance = new dfun()
dynamic function
dfun {}
dfun.name
"fun2"

The aim was to have the constructor function with proper name when being invoked.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, pretty cool. Let's extract the logic to a function and document it to help future readers.

@ajainarayanan ajainarayanan changed the title Fixes usage of new Function(...) for CSP Fixes #221 usage of new Function(...) for CSP Feb 24, 2019
@ajainarayanan
Copy link
Author

@mtth Thanks for the feedback. Have replied inline. Regarding the performance I am not sure if I will have time in the next 3 weeks as its a little busy at work. Will try my best to see how I can optimize.

Having said that I would definitely vote for making this optional as a secondary build (or a separate node package). The primary purpose of this change is to support environments which enforce a strict CSP policy which is generally common in enterprise environments.

@fmmoret
Copy link

fmmoret commented Aug 16, 2022

Just to add to other parties maybe evaluating this branch -- it does have some minor bugs due to array indexing and might have some broken this references. The test suite is good but not comprehensive. I think this impl flops on mixtures of fields with default values and non-default values and probably other edge cases. Good POC.

@Bessonov
Copy link

@fmmoret Can you please provide more details on how to test for the issues you mentioned? I don't have time at the moment, but I hope to have time next year to pick up this MR.

@fmmoret
Copy link

fmmoret commented Nov 17, 2023

I am so far away now that I don't have a great idea. I was using it to move around postgres results and
probably some other data sources that had nested schemas. Postgres results would have had lots of nullable types and basically all versions of number types that Avro supports.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support strict CSP enabled environments
4 participants