Highly customizable abstraction for generating SPSS-like tabulations, cross-tabulations (also known as crosstabs or contingency tables), and layered cross-tabulations in PHP. These are useful in showing the relationship between two or more categorical variables.
- PHP 8.2 or higher
Ensure that the intl and bcmath extensions are installed for, respectively, international number formatting and better mathematical precision.
Run the following to install this library:
$ composer require cliffordvickrey/crosstabs
Here, we generate a crosstab that shows which browsers a website's visitors use, broken down by client operating system.
<?php
// some data. If "n" is omitted, each row is treated as a single case
$rawData = [
['Device Type' => 'Desktop', 'Browser' => 'Chrome', 'Platform' => 'Linux', 'n' => '256'],
['Device Type' => 'Tablet', 'Browser' => 'Safari', 'Platform' => 'iOS', 'n' => '6'],
['Device Type' => 'Desktop', 'Browser' => 'Chrome', 'Platform' => 'MacOSX', 'n' => '227'],
['Device Type' => 'Desktop', 'Browser' => 'IE', 'Platform' => 'Windows', 'n' => '35'],
['Device Type' => 'Desktop', 'Browser' => 'Chrome', 'Platform' => 'Windows', 'n' => '221'],
['Device Type' => 'Desktop', 'Browser' => 'Firefox', 'Platform' => 'MacOSX', 'n' => '38'],
['Device Type' => 'Mobile Device', 'Browser' => 'Safari', 'Platform' => 'iOS', 'n' => '21'],
['Device Type' => 'Desktop', 'Browser' => 'Netscape', 'Platform' => 'Windows', 'n' => '21'],
['Device Type' => 'Desktop', 'Browser' => 'Safari', 'Platform' => 'MacOSX', 'n' => '38'],
['Device Type' => 'Mobile Device', 'Browser' => 'Edge', 'Platform' => 'iOS', 'n' => '72'],
['Device Type' => 'Desktop', 'Browser' => 'Safari', 'Platform' => 'Windows', 'n' => '15'],
['Device Type' => 'Desktop', 'Browser' => 'Firefox', 'Platform' => 'Linux', 'n' => '27'],
['Device Type' => 'Desktop', 'Browser' => 'Firefox', 'Platform' => 'Windows', 'n' => '12'],
['Device Type' => 'Desktop', 'Browser' => 'Edge', 'Platform' => 'Windows', 'n' => '11']
];
// the builder does exactly what it says. Set a bunch of options and call the "build" method
$builder = new \CliffordVickrey\Crosstabs\CrosstabBuilder();
$builder->setRawData($rawData);
$builder->setTitle('Browser Usage by Platform');
$builder->setColVariableName('Browser');
$builder->setRowVariableName('Platform');
$builder->setShowPercent(true);
$builder->setPercentType(\CliffordVickrey\Crosstabs\Options\CrosstabPercentType::Column);
$crosstab = $builder->build();
// display the crosstab as HTML (see example output below)
echo $crosstab->write();
// if you use a Bootstrap layout and want a table with all the fancy utility classes, etc., you can override the default
// writer like so:
echo $crosstab->write(writer: new \CliffordVickrey\Crosstabs\Writer\CrosstabBootstrapHtmlWriter());
// some inferential stats. Degrees of freedom are equal to the number of columns (minus 1) multiplied by the number of
// rows (minus 1). Chi-squared is a test statistic, comparing actual values with ones we'd expect if no relationship
// existed between the row and column variables
var_dump($crosstab->getDegreesOfFreedom()); // 15
var_dump($crosstab->getChiSquared()); // 598.35 (clearly significant!)
// now: let's add a third dimension: device type
$builder->setTitle('Browser Usage by Platform by Device Type');
$builder->addLayer('Device Type');
// percentages will be of columns within each layer category; great for visualizing the effects of control variables
$builder->setPercentType(\CliffordVickrey\Crosstabs\Options\CrosstabPercentType::ColumnWithinLayer);
$crosstab = $builder->build();
echo $crosstab->write();
// want a simpler display? Let's just show a frequency distribution of browsers
$builder = new \CliffordVickrey\Crosstabs\CrosstabBuilder();
$builder->setRawData($rawData);
$builder->setTitle('Browser Usage');
$builder->setRowVariableName('Browser');
$builder->setShowPercent(true);
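// build the new table first; otherwise $crosstab would still hold the layered crosstab from above
$crosstab = $builder->build();
echo $crosstab->write();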
The builder is used to configure and create the desired table. When configuring, you're always going to want to set rawData and rowVariableName. In most cases, you'll also want to set colVariableName (if you're visualizing two or more categorical variables).
Builds the table. Throws \CliffordVickrey\Crosstabs\Exception\CrosstabInvalidArgumentException when the options are invalid
addLayer(CrosstabVariable|array|string $layer, ?string $description = null, ?array $categories = []): void
Adds a layer variable to the crosstab
Adds multiple layer variables
Sets the label of the column variable. If none is set, the name is used
Sets the name of the column variable in the raw data
Explicitly defines the categories of the column variable in the raw data; otherwise, they are inferred. Useful for relabeling/recoding categorical values
Sets the key in the source data representing the number of cases in a row. If this information is absent, each row will be treated as a single case. Defaults to "n"
Sets the key in the source data representing row weight. If this information is absent, each row will be weighted equally. Defaults to "weight"
Sets multiple layer variables
Sets the locale used for number formatting. Defaults to "en_US". See the intl extension documentation
Sets the scale used for floating point math. Defaults to 16, roughly the precision of floats in most builds of PHP
Sets the label to use for the expected frequency table cell. Defaults to "Frequency (Expected)"
Sets the label to use for the expected percentage table cell. Defaults to "% (Expected)"
Sets the label to use for the frequency table cell. Defaults to "Frequency"
Sets the label to use for NULL values in the table. Defaults to "-"
Sets the label to use for empty tables. Defaults to "There is no data to display"
Sets the label to use for percentage cells. Defaults to "%"
Sets the label to use for total cells. Defaults to "Total"
Sets the label to use for weighted expected frequency cells. Defaults to "Expected Frequency (Weighted)"
Sets the label to use for weighted expected percentage cells. Defaults to "Expected % (Weighted)"
Sets the label to use for weighted frequency cells. Defaults to "Frequency (Weighted)"
Sets the label to use for weighted percentage cells. Defaults to "% (Weighted)"
Sets the percent type (row, column, total, etc.). See the \CliffordVickrey\Crosstabs\Options\CrosstabPercentType enum for a list of allowable options. Defaults to CrosstabPercentType::Total
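To make the difference between percent types concrete, here is a minimal sketch in plain PHP that does not use the library at all. The percentages() helper and its 'row', 'column', and 'total' string arguments are hypothetical names chosen for illustration; they only mirror the idea of choosing a different denominator for each cell.

```php
<?php
// Hypothetical helper, NOT part of the library: computes cell percentages for
// a table of counts keyed as $counts[row][col]. $type picks the denominator.
function percentages(array $counts, string $type): array
{
    $rowTotals = array_map('array_sum', $counts); // keys (row names) are preserved
    $colTotals = [];
    foreach ($counts as $cells) {
        foreach ($cells as $col => $n) {
            $colTotals[$col] = ($colTotals[$col] ?? 0) + $n;
        }
    }
    $grandTotal = array_sum($rowTotals);

    $result = [];
    foreach ($counts as $row => $cells) {
        foreach ($cells as $col => $n) {
            $base = match ($type) {
                'row' => $rowTotals[$row],    // share of the cell's row
                'column' => $colTotals[$col], // share of the cell's column
                default => $grandTotal,       // share of all cases
            };
            $result[$row][$col] = $base > 0 ? round(100 * $n / $base, 2) : null;
        }
    }

    return $result;
}

$counts = [
    'Linux' => ['Chrome' => 256, 'Firefox' => 27],
    'MacOSX' => ['Chrome' => 227, 'Firefox' => 38],
];

print_r(percentages($counts, 'row'));
print_r(percentages($counts, 'column'));
print_r(percentages($counts, 'total'));
```

Row percentages sum to 100 across each row, column percentages sum to 100 down each column, and total percentages sum to 100 over the whole table (up to rounding).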
The raw data to tabulate. Should be an iterable of iterables (rows). Rows are cast to arrays. The "n" and "weight" keys in each array are optionally used in computation, should you need to represent more than one case per row or capture survey weighting
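As an illustration of the "n" convention described above, the following plain-PHP sketch sums case counts per category and treats rows without an "n" key as a single case. The tabulate() function is a hypothetical stand-in written for this example, not the library's implementation, and it leaves weighting out entirely.

```php
<?php
// Hypothetical helper, NOT part of the library: sums case counts per category
// of one variable. Rows without the count key contribute a single case.
function tabulate(iterable $rows, string $variable, string $nKey = 'n'): array
{
    $frequencies = [];
    foreach ($rows as $row) {
        // rows may be arrays or other iterables; normalize to an array
        $row = is_array($row) ? $row : iterator_to_array($row);
        $category = (string) $row[$variable];
        $cases = isset($row[$nKey]) ? (float) $row[$nKey] : 1.0;
        $frequencies[$category] = ($frequencies[$category] ?? 0.0) + $cases;
    }

    return $frequencies;
}

// A generator is an iterable too, so rows could be streamed from a file or query.
$rows = (function () {
    yield ['Browser' => 'Chrome', 'n' => '256'];
    yield ['Browser' => 'Chrome', 'n' => '227'];
    yield new ArrayIterator(['Browser' => 'Safari']); // no "n": counts as one case
})();

$frequencies = tabulate($rows, 'Browser');
print_r($frequencies);
```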
Sets the label of the row variable. If none is set, the name is used
Sets the name of the row variable in the raw data
Explicitly defines the categories of the row variable in the raw data; otherwise, they are inferred. Useful for relabeling/recoding categorical values
Sets the scale of formatted decimal values in the table. Defaults to 2
Sets the scale of formatted percentage values in the table. Defaults to 2
Sets whether to display expected frequencies (i.e., the values we'd expect if no relationship existed between X and Y) in the table. Defaults to FALSE
Sets whether to display expected percentages (i.e., the values we'd expect if no relationship existed between X and Y) in the table. Defaults to FALSE
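For readers unfamiliar with expected frequencies, the standard textbook calculation is (row total × column total) / grand total for each cell. The sketch below shows that formula in plain PHP; expectedFrequencies() is a hypothetical helper written for this example, and the library's internals may differ (for instance, in how they handle precision).

```php
<?php
// Hypothetical helper, NOT part of the library: expected counts under
// independence for observed counts keyed as $observed[row][col].
function expectedFrequencies(array $observed): array
{
    $rowTotals = array_map('array_sum', $observed);
    $colTotals = [];
    foreach ($observed as $cells) {
        foreach ($cells as $col => $n) {
            $colTotals[$col] = ($colTotals[$col] ?? 0) + $n;
        }
    }
    $grandTotal = array_sum($rowTotals);
    if ($grandTotal == 0) {
        return []; // nothing to compute for an empty table
    }

    $expected = [];
    foreach ($observed as $row => $cells) {
        foreach ($cells as $col => $n) {
            $expected[$row][$col] = $rowTotals[$row] * $colTotals[$col] / $grandTotal;
        }
    }

    return $expected;
}

// Illustrative counts only (not taken from the example data above)
$observed = [
    'Windows' => ['Chrome' => 30, 'Firefox' => 10],
    'MacOSX' => ['Chrome' => 20, 'Firefox' => 40],
];

// Windows row total is 40 and each column total is 50 out of 100 cases,
// so each Windows cell has an expected count of 40 * 50 / 100 = 20.
print_r(expectedFrequencies($observed));
```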
Sets whether to display frequencies in the table. Defaults to TRUE
Sets whether to display percentages in the table. Defaults to FALSE
Sets whether to display weighted expected frequencies (i.e., the values we'd expect if no relationship existed between X and Y) in the table. Defaults to FALSE
Sets whether to display weighted expected percentages (i.e., the values we'd expect if no relationship existed between X and Y) in the table. Defaults to FALSE
Sets whether to display weighted frequencies in the table. Defaults to FALSE
Sets whether to display weighted percentages in the table. Defaults to FALSE
Sets an optional title to appear in the table header. Defaults to NULL (i.e., display no title)
Encapsulates the data and presentation elements of a crosstab. Implements \Traversable; traversal will return row objects (CliffordVickrey\Crosstabs\Crosstab\CrosstabRow), which themselves provide cell objects (CliffordVickrey\Crosstabs\Crosstab\CrosstabCell) when traversed. These cells contain the table's presentation data, whereas the matrix (exposed by a getter) contains the tabulated data.
Returns a cell at specified Cartesian coordinates. If none exists, returns NULL
Gets the number of independent values used to compute a chi-squared test statistic, etc. The more degrees of freedom there are, the larger the test statistic must be to achieve significance. Formula is (rowCount - 1) * (colCount - 1)
Gets the chi-squared test statistic. For a given number of degrees of freedom, the higher the value, the stronger the evidence of a relationship between the row and column variables in the population
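The two statistics above can be sketched in plain PHP using the textbook Pearson formula: the sum over all cells of (observed - expected)² / expected. The chiSquaredSketch() function is a hypothetical helper for illustration only; it assumes a rectangular table with every column present in every row and uses native floats, whereas the library may compute the value differently.

```php
<?php
// Hypothetical helper, NOT part of the library: Pearson's chi-squared and
// degrees of freedom for observed counts keyed as $observed[row][col].
// Assumes a rectangular table (every row has the same columns).
function chiSquaredSketch(array $observed): array
{
    $rowTotals = array_map('array_sum', $observed);
    $colTotals = [];
    foreach ($observed as $cells) {
        foreach ($cells as $col => $n) {
            $colTotals[$col] = ($colTotals[$col] ?? 0) + $n;
        }
    }
    $grandTotal = array_sum($rowTotals);

    $chiSquared = 0.0;
    if ($grandTotal > 0) {
        foreach ($observed as $row => $cells) {
            foreach ($cells as $col => $n) {
                // count we'd expect if rows and columns were unrelated
                $expected = $rowTotals[$row] * $colTotals[$col] / $grandTotal;
                if ($expected > 0) {
                    $chiSquared += ($n - $expected) ** 2 / $expected;
                }
            }
        }
    }

    return [
        'chiSquared' => $chiSquared,
        'degreesOfFreedom' => (count($rowTotals) - 1) * (count($colTotals) - 1),
    ];
}

// Illustrative counts only (not taken from the example data above)
$stats = chiSquaredSketch([
    'Windows' => ['Chrome' => 30, 'Firefox' => 10],
    'MacOSX' => ['Chrome' => 20, 'Firefox' => 40],
]);

var_dump($stats['degreesOfFreedom']); // int(1)
var_dump(round($stats['chiSquared'], 2)); // float(16.67)
```

Here the expected counts are 20 for both Windows cells and 30 for both MacOSX cells, so the statistic is 5 + 5 + 3.33 + 3.33 ≈ 16.67 with (2 - 1) * (2 - 1) = 1 degree of freedom.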
Gets a rectangular matrix of value objects, representing data within the crosstab
Convenience method for writing a crosstab to a string. If no writer is provided, the default HTML writer (CliffordVickrey\Crosstabs\Writer\CrosstabHtmlWriter) will be used. See the class constants of the method for output options. Returns the string output
Convenience method for writing a crosstab to a file. If no writer is provided, the default HTML writer (CliffordVickrey\Crosstabs\Writer\CrosstabHtmlWriter) will be used. See the class constants of the method for output options. If no filename is provided, a temporary file will be created. Returns the filename written to