find the previous version of this document at crankshaft/data-types.md
Table of Contents generated with DocToc
- Efficiently Representing Values and Tagging
- Objects
- Arrays
- Strings
- Resources
read Numbered properties: fast elements
- most objects in heap are 4-byte aligned
- according to spec all numbers in JS are 64-bit floating doubles
- V8 passes around 32-bit numbers to represent all values for improved efficiency
- bottom bit reserved as tag to signify if value is a Smi (small integer) or a pointer to an object
| object pointer | 1 |
or
| 31-bit-signed integer (Smi) | 0 |
- numbers bigger than 31 bits are boxed
- stored inside an object referenced via a pointer
- adds extra overhead (at a minimum an extra lookup)
- on 64 bit architectures Smis are
32-bit-signed
instead of the31-bit-signed
on 32 bit architectures
- prefer Smis for numeric values whenever possible
+-------------------+
| Object | +----> +------------------+ +----> +------------------+
|-------------------| | | FixedArray | | | FixedArray |
| Map | | |------------------| | |------------------|
|-------------------| | | Map | | | Map |
| Extra Properties |----+ |------------------| | |------------------|
|-------------------| | Length | | | Length |
| Elements |------+ |------------------| | |------------------|
|-------------------| | | Property "poo" | | | Property "0" |
| Property "foo" | | |------------------| | |------------------|
|-------------------| | | Property "baz" | | | Property "1" |
| Property "bar" | | +__________________+ | +__________________+
+___________________+ | |
| |
| |
+-----------------------------+
- above shows most common optimized representation
- most objects contain all their properties in single block of memory
"foo", "bar"
- all blocks have a
Map
property describing their structure - named properties that don't fit are stored in overflow array
"poo", "baz"
- numbered properties are stored in a separate contiguous array
"1", "2"
read Some surprising properties of properties
- object is a collection of properties aka key-value pairs
- property names are always strings
- any name used as property name that is not a string is stringified via
.toString()
, even numbers, so1
becomes"1"
- Arrays in JavaScript are just objects with magic
length
property
- hash table used for difficult objects
- aka objects in dictionary mode
- accessing hash table property is much slower than accessing a field at a known offset
- if non-symbol string is used to access a property it is uniquified first
- V8 hash tables are large arrays containing keys and values
- initially all keys and values are
undefined
- on key-vaule pair insertion the key's hash code is computed
- computing hash code and comparing keys for equality is commonly a fast operation
- still slow to execute these non-trivial routines on every property read/write
- data structures such as Map, Set, WeakSet and WeakMap use hash tables under the hood
- a hash function returns a hash code for given keys which is used to map them to a location in the hash table
- hash code is a random number (independent of object value) and thus needs to be stored
- storing the hash code as private symbol on the object, like was done previously, resulted in
a variety of performance problems
- led to slow megamorphic IC lookups of the hash code and
- triggered hidden class transition in the key on storing the hash code
- performance issues were fixed (~500% improvement for Maps and Sets) by hiding the hashcode
and storing it in unused memory space that is connected to the JSObject
- if properties backing store is empty: directly stored in the offset of JSObject
- if properties backing store is array: stored in extranous 21 bits of 31 bits to store array length
- if properties backing store is dictionary: increase dictionary size by 1 word to store hashcode in dedicated slot at the beginning of the dictionary
read Fast, in-object properties | read
- V8 describes the structure of objects using maps used to create hidden classes and match data types
- resembles a table of descriptors with one entry for each property
- map contains info about size of the object
- map contains info about pointers to constructors and prototypes
- objects with same structure share same map
- objects created by the same constructor and have the same set of properties assigned in the same order
- have regular logical structure and therefore regular structure in memory
- share same map
- adding new property is handled via transition descriptor
- use existing map
- transition descriptor points at other map
class Point {
constructor(x, y) {
// Map M0
// "x": Transition to M1 at offset 12
this.x = x
// Map M1
// "x": Field at offset 12
// "y": Transition to M2 at offset 16
this.y = y
// Map M2
// "x": Field at offset 12
// "y": Field at offset 16
}
}
Point
starts out without any fields withM0
this.x =x
-> map pointer set toM1
and valuex
is stored at offset12
and"x" Transition
descriptor added toM0
this.y =y
-> map pointer set toM2
and valuey
is stored at offset16
and"y" Transition
descriptor added toM1
var p = new Point(1, 2)
// Map M2
// "x": Field at offset 12
// "y": Field at offset 16
// "z": Transition at offset 20
p.z = z
// Map M3
// "x": Field at offset 12
// "y": Field at offset 16
// "z": Field at offset 20
- assigning
z
later- create
M3
, a copy ofM2
- add
Transition
descriptor toM2
- add
Field
descriptor toM3
- create
class Point {
constructor(x, y, reverse) {
// Map M0
// "x": Transition to M1 at offset 12ak
// "y": Transition to M2 at offset 12
if (reverse) {
// variation 1
// Map M1
// "x": Field at offset 12
// "y": Transition to M4 at offset 16
this.x = x
// Map M4
// "x": Field at offset 12
// "y": Field at offset 16
this.y = y
} else {
// variation 2
// Map M2
// "y": Field at offset 12
// "x": Transition to M5 at offset 16
this.y = x
// Map M5
// "y": Field at offset 12
// "x": Field at offset 16
this.x = y
}
}
}
- both variations share
M0
which has two transitions - not all
Point
s share same map - in worse cases V8 drops object into dictionary mode in order to prevent huge number of maps to be allocated
- when assigning random properties to objects from same constructor in random order
- when deleting properties
read In-object slack tracking
- objects allocated by a constructor are given enough memory for 32 fast properties to be stored
- after certain number of objects (8) were allocated from same constructor
- V8 traverses transition tree from initial map to determine size of largest of these initial objects
- new objects of same type are allocated with exact amount of memory to store max number of properties
- initial objects are resized (down)
read Methods and prototypes
function pointDistance(p) { /* calculates distance */ }
class Point {
constructor(x, y) {
// Map M0
// "x": Transition to M1 at offset 12
this.x = x
// Map M1
// "x": Field at offset 12
// "y": Transition to M2 at offset 16
this.y = y
// Map M2
// "x": Field at offset 12
// "y": Field at offset 16
// "distance": Transition to M3 <pointDistance>
this.distance = pointDistance
// Map M3
// "x": Field at offset 12
// "y": Field at offset 16
// "distance": Constant_Function <pointDistance>
}
}
- properties pointing to
Function
s are handled viaconstant functions
descriptor constant_function
descriptor indicates that value of property is stored with descriptor itself rather than in the object- pointers to functions are directly embedded into optimized code
- if
distance
is reassigned, a new map has to be created since theTransition
breaks
class Point {
constructor(x, y) {
this.x = x
this.y = y
}
pointDistance() { /* calculates distance */ }
}
- V8 represents prototype methods (aka class methods) using
constant_function
descriptors - calling prototype methods maybe a tiny bit slower due to overhead of the following:
- check receiver's map (as with own properties)
- check maps of prototype chain (extra step)
- the above lookup overhead won't make measurable performance difference and shouldn't impact how you write code
read Numbered properties: fast elements
- numbered properties are treated and ordered differently than others since any object can behave like an array
- element === any property whose key is non-negative integer
- V8 stores elements separate from named properties in an elements kind field (see structure diagram)
- if object drops into dictionary mode for elements, access to named properties remains fast and vice versa
- maps don't need transitions to maps that are identical except for element kinds
- most elements are fast elements which are stored in a contiguous array
read Numbered properties: fast elements
- V8 has two methods for storing arrays, fast elements and dictionary elements
- compact keysets
- linear storage buffer
- contiguous (non-sparse)
0
based- smaller than 64K
- hash table storage
- slow access
- sparse
- large
- V8 makes distinction whether the elements backing store is packed or has holes
- holes in a backing store are created by deleting an indexed element
- missing properties are marked with special hole value to keep Array functions performant
- however missing properties cause expensive lookups on prototype chain
- fast elements kinds in order of increasing generality:
- fast Smis (small integers)
- fast doubles (Doubles stored in unboxed representation)
- fast values (strings or other objects)
+--------------------+
| PACKED_SMI_ELEMENT |---+
+--------------------+ | +------------------------+
| +--->| PACKED_DOUBLE_ELEMENTS |---+
↓ +------------------------+ | +-------------------+
+--------------------+ | +--->| PACKED_ELEMENTS |
| HOLEY_SMI_ELEMENTS |---+ ↓ +-------------------+
+--------------------+ | +------------------------+ |
+--->| HOLEY_DOUBLE_ELEMENTS |---+ ↓
+------------------------+ | +-------------------+
+--->| HOLEY_ELEMENTS |
+-------------------+
- Array's hidden class tracks element types
- if all doubles, array is unboxed aka upgraded to fast doubles
- wrapped objects layed out in linear buffer of doubles
- each element slot is 64-bit to hold a double
- Smis that are currently in Array are converted to doubles
- very efficient access
- storing requires no allocation as is the case for boxed doubles
- causes hidden class change
- requires expensive copy-and-convert operation
- careless array manipulation may cause overhead due to boxing/unboxing watch | slide
- difference is in semantics of indexed properties
- V8 uses unboxed backing stores for such typed arrays
- gets 64-bit allocated for each element
- once array is marked as holey it is holey forever
- don't pre-allocate large arrays (
>64K
), instead grow as needed, to avoid them being considered sparse - do pre-allocate small arrays to correct size to avoid allocations due to resizing
- avoid creating holes, and thus don't delete elements
- don't load uninitialized or deleted elements watch | slide
- use literal initializer for Arrays with mixed values
- don't store non-numeric values in numeric arrays
- causes boxing and efficient code that was generated for manipulating values can no longer be used
- use typed arrays whenever possible especially when performing mathematical operations on an array of numbers
- when copying an array, you should avoid copying from the back (higher indices to lower indices) because this will almost certainly trigger dictionary mode
- avoid elements kind transitions, i.e. edge case of adding
-0, NaN, Infinity
to a Smi array as they are represented as doubles
- string representation and how it maps to each bit
map |len |hash|characters
0123|4567|8901|23.........
- contain only data (no pointers)
- content not tagged
- immutable except for hashcode field which is lazily computed (at most once)
- video: breaking the javascript speed limit with V8 | slides
- tour of V8: garbage collection - 2013
- tour of V8: object representation - 2013
- V8-design
- Fast Properties in V8 - 2017
- “Elements kinds” in V8 - 2017
- video: V8 internals for JavaScript developers - 2018
- slides: V8 internals for JavaScript developers - 2018