Table of Contents generated with DocToc
- Data Types
read Numbered properties: fast elements
- most objects in heap are 4-byte aligned
- according to spec all numbers in JS are 64-bit floating doubles
- v8 passes around 32-bit numbers to represent all values for improved efficiency
- bottom bit reserved as tag to signify if value is a SMI (small integer) or a pointer to an object
| object pointer | 1 |
or
| 31-bit-signed integer (SMI) | 0 |
- numbers bigger than 31 bits are boxed
- stored inside an object referenced via a pointer
- adds extra overhead (at a minimum an extra lookup)
- prefer SMIs for numeric values whenever possible
+-------------------+
| Object | +----> +------------------+ +----> +------------------+
|-------------------| | | FixedArray | | | FixedArray |
| Map | | |------------------| | |------------------|
|-------------------| | | Map | | | Map |
| Extra Properties |----+ |------------------| | |------------------|
|-------------------| | Length | | | Length |
| Elements |------+ |------------------| | |------------------|
|-------------------| | | Property "poo" | | | Property "0" |
| Property "foo" | | |------------------| | |------------------|
|-------------------| | | Property "baz" | | | Property "1" |
| Property "bar" | | +__________________+ | +__________________+
+___________________+ | |
| |
| |
+-----------------------------+
- above shows most common optimized representation
- most objects contain all their properties in single block of memory
"foo", "bar"
- all blocks have a
Map
property describing their structure - named properties that don't fit are stored in overflow array
"poo", "baz"
- numbered properties are stored in a separate contiguous array
"1", "2"
read Some surprising properties of properties
- object is a collection of properties aka key-value pairs
- property names are always strings
- any name used as property name that is not a string is stringified via
.toString()
, even numbers, so1
becomes"1"
- Arrays in JavaScript are just objects with magic
length
property
- hash table used for difficult objects
- aka objects in dictionary mode
- accessing hash table property is much slower than accessing a field at a known offset
- if non-symbol string is used to access a property it is uniquified first
- v8 hash tables are large arrays containing keys and values
- initially all keys and values are
undefined
- on key-vaule pair insertion the key's hash code is computed
- low bits of hash code are used as initial insertion index
- if that slot is taken the hash table attempts to insert it at next index (modulo length) and so on
- computing hash code and comparing keys for equality is commonly a fast operation
- still slow to execute these non-trivial routines on every property read/write
- v8 avoids hash table representation whenever possible
read Fast, in-object properties | read
- v8 describes the structure of objects using maps used to create hidden classes and match data types
- resembles a table of descriptors with one entry for each property
- map contains info about size of the object
- map contains info about pointers to constructors and prototypes
- objects with same structure share same map
- objects created by the same constructor and have the same set of properties assigned in the same order
- have regular logical structure and therefore regular structure in memory
- share same map
- adding new property is handled via transition descriptor
- use existing map
- transition descriptor points at other map
function Point () {
// Map M0
// "x": Transition to M1 at offset 12
this.x = x;
// Map M1
// "x": Field at offset 12
// "y": Transition to M2 at offset 16
this.y = y;
// Map M2
// "x": Field at offset 12
// "y": Field at offset 16
}
Point
starts out without any fields withM0
this.x =x
-> map pointer set toM1
and valuex
is stored at offset12
and"x" Transition
descriptor added toM0
this.y =y
-> map pointer set toM2
and valuey
is stored at offset16
and"y" Transition
descriptor added toM1
var p = new Point();
// Map M2
// "x": Field at offset 12
// "y": Field at offset 16
// "z": Transition at offset 20
p.z = z;
// Map M3
// "x": Field at offset 12
// "y": Field at offset 16
// "z": Field at offset 20
- assigning
z
later- create
M3
, a copy ofM2
- add
Transition
descriptor toM2
- add
Field
descriptor toM3
- create
function Point(x, y, reverse) {
// Map M0
// "x": Transition to M1 at offset 12ak
// "y": Transition to M2 at offset 12
if (reverse) {
// variation 1
// Map M1
// "x": Field at offset 12
// "y": Transition to M4 at offset 16
this.x = x;
// Map M4
// "x": Field at offset 12
// "y": Field at offset 16
this.y = y;
} else {
// variation 2
// Map M2
// "y": Field at offset 12
// "x": Transition to M5 at offset 16
this.y = x;
// Map M5
// "y": Field at offset 12
// "x": Field at offset 16
this.x = y;
}
}
- both variations share
M0
which has two transitions - not all
Point
s share same map - in worse cases v8 drops object into dictionary mode in order to prevent huge number of maps to be allocated
- when assigning random properties to objects from same constructor in random order
- when deleting properties
read In-object slack tracking
- objects allocated by a constructor are given enough memory for 32 fast properties to be stored
- after certain number of objects (8) were allocated from same constructor
- v8 traverses transition tree from initial map to determine size of largest of these initial objects
- new objects of same type are allocated with exact amount of memory to store max number of properties
- initial objects are resized (down)
read Methods and prototypes
function Point () {
// Map M0
// "x": Transition to M1 at offset 12
this.x = x;
// Map M1
// "x": Field at offset 12
// "y": Transition to M2 at offset 16
this.y = y;
// Map M2
// "x": Field at offset 12
// "y": Field at offset 16
// "distance": Transition to M3 <pointDistance>
this.distance = pointDistance;
// Map M3
// "x": Field at offset 12
// "y": Field at offset 16
// "distance": Constant_Function <pointDistance>
}
function pointDistance(p) { /* calculates distance */ }
- properties pointing to
Function
s are handled viaconstant functions
descriptor constant_function
descriptor indicates that value of property is stored with descriptor itself rather than in the object- pointers to functions are directly embedded into optimized code
- if
distance
is reassigned, a new map has to be created since theTransition
breaks
function Point(x, y) {
this.x = x;
this.y = y;
}
Point.prototype.pointDistance = function () { /* calculates distance */ }
- v8 represents prototype methods using
constant_function
descriptors - calling prototype methods maybe a tiny bit slower due to overhead of the following:
- check receiver's map (as with own properties)
- check maps of prototype chain (extra step)
- the above lookup overhead won't make measurable performance difference and shouldn't impact how you write code
read Numbered properties: fast elements
- numbered properties are treated and ordered differently than others since any object can behave like an array
- element === any property whose key is non-negative integer
- v8 stores elements separate from named properties in an elements kind field (see structure diagram)
- if object drops into dictionary mode for elements, access to named properties remains fast and vice versa
- maps don't need transitions to maps that are identical except for element kinds
- most elements are fast elements which are stored in a contiguous array
read Numbered properties: fast elements
[read]((https://v8.dev/blog/fast-properties)
- v8 has two methods for storing arrays, fast elements and dictionary elements
- compact keysets
- linear storage buffer
- contiguous (non-sparse)
0
based- smaller than
100K
elements see this test
- v8 makes distinction whether the elements backing store is packed or has holes
- holes in a backing store are created by deleting an indexed element
- missing properties are marked with special hole value to keep Array functions performant
- however missing properties cause expensive lookups on prototype chain
- in the past reading beyond the length of an array made it holey, however this has been fixed and is no longer a problem (TODO: resource)
- fast elements kinds in order of increasing generality:
- fast SMIs (small integers)
- fast doubles (Doubles stored in unboxed representation)
- fast values (strings or other objects)
+--------------------+
| PACKED_SMI_ELEMENT |---+
+--------------------+ | +------------------------+
| +--->| PACKED_DOUBLE_ELEMENTS |---+
↓ +------------------------+ | +-------------------+
+--------------------+ | +--->| PACKED_ELEMENTS |
| HOLEY_SMI_ELEMENTS |---+ ↓ +-------------------+
+--------------------+ | +------------------------+ |
+--->| HOLEY_DOUBLE_ELEMENTS |---+ ↓
+------------------------+ | +-------------------+
+--->| HOLEY_ELEMENTS |
+-------------------+
- can only transition downwards through the lattice
- more specific elements kinds enable more fine-grained optimizations
- hash table storage
- slow access
- sparse
- large
- Array's hidden class tracks element types
- if all doubles, array is unboxed aka upgraded to fast doubles
- wrapped objects layed out in linear buffer of doubles
- each element slot is 64-bit to hold a double
- SMIs that are currently in Array are converted to doubles
- very efficient access
- storing requires no allocation as is the case for boxed doubles
- causes hidden class change
- requires expensive copy-and-convert operation
- careless array manipulation may cause overhead due to boxing/unboxing watch | slide
- difference is in semantics of indexed properties
- v8 uses unboxed backing stores for such typed arrays
- gets 64-bit allocated for each element
- once array is marked as holey it is holey forever
- don't pre-allocate large arrays (
>=100K
elements), instead grow as needed, to avoid them being considered sparse - do pre-allocate small arrays to correct size to avoid allocations due to resizing
- avoid createing holes, and thus don't delete elements
- don't load uninitialized or deleted elements watch | slide
- use literal initializer for Arrays with mixed values
- don't store non-numeric values in numeric arrays
- causes boxing and efficient code that was generated for manipulating values can no longer be used
- use typed arrays whenever possible especially when performing mathematical operations on an array of numbers
- copying an array, you should avoid copying from the back (higher indices to lower indices) because this will almost certainly trigger dictionary mode
- avoid elements kind transitions, i.e. edge case of adding
-0, NaN, Infinity
to a SMI array as they are represented as doubles
- string representation and how it maps to each bit
map |len |hash|characters
0123|4567|8901|23.........
- contain only data (no pointers)
- content not tagged
- immutable except for hashcode field which is lazily computed (at most once)