Thanks for publishing this. I must say I already find the current API rather complex and confusing, and some of the proposed changes would make it even more overwhelming... I've been thinking about how to simplify the API towards a few general rules so here's my two cents:
Examples
It seems to me the following examples cover the desired functionality in a simple way (but maybe I don't have the full picture):

Simple data
df = (a=1:3, b=rand(3), c=["x", "y", "z"])
data(df) * mapping(:a, :b, color=:c)
Short form:
mapdata(1:3, rand(3), color=["x", "y", "z"])

Pre-grouped
df = (a=1:5, b=1:5, c=rand(5), d=rand(5))
data(df) * mapping([:a, :b], [:c, :d], color=dims(1))
Short form:
mapdata([1:5, 1:5], [rand(5), rand(5)], color=dims(1))

With group names
data(df) * mapping([:a, :b], [:c, :d], color=dims(1) => ["A", "B"])
Short form:
mapdata([1:5, 1:5], [rand(5), rand(5)], color=dims(1) => ["A", "B"])

With more dimensions
df = (a=-2:2, b=1:5, c=rand(5), d=rand(5), e=rand(5), f=rand(5))
data(df) * mapping([:a :b; :a :b], [:c :d; :e :f], marker=dims(1), color=dims(2))
Short form:
mapdata([[-2:2] [1:5]; [-2:2] [1:5]], fill(rand(5), 2, 2), marker=dims(1), color=dims(2))

Wide data and mixed data
This is just a special case of pre-grouped data where some parameters have no groups:
df = (a=1:5, b=rand(5), c=rand(5), d=["a", "b", "c", "d", "e"])
data(df) * mapping(:a, [:b, :c], color=:d)
Short form:
mapdata(1:5, [rand(5), rand(5)], color=["a", "b", "c", "d", "e"])

More complex
df = (a=1:5, b=rand(5), c=rand(5), d=["a", "b", "c", "d", "e"], e=rand(1:5, 5))
data(df) * mapping(:a, [:b, :c], color=[:d, :e])
Short form:
mapdata(1:5, [rand(5), rand(5)], color=[["a", "b", "c", "d", "e"], rand(1:5, 5)])

Matrix data
Here it probably makes sense to support only the short form, mapping the data directly without going through a "table":
visual(Heatmap) * mapdata(1:10, 1:10, rand(10, 10))

Matrix columns
No specific support for this; it can be done easily enough as a case of pre-grouped data:
A = rand(10, 10)
visual(Scatter) * mapdata(1:10, [eachcol(A)...])
Regarding case 1 and 2: yes that's right. In case 1, Note that in my mind the cases were divided differently:
Column identifiers represent vectors, so in this sense it's consistent: the Ambiguities:
Mapdata vs mapping: indeed! I hadn't thought of using the "no data" information, I think that would work fine. Much nicer than introducing
Regarding #328, another option would be to allow
data(df) * mapping(:x, :y, color=:s, markersize=dims(0) => [20,30])
I guess this could prove useful in other contexts too... And maybe more useful than
# Marker size proportional to element index:
data(df) * mapping(:x, :y, color=:s, markersize=dims(0))
data(df) * mapping(:x, :y, color=:s, markersize=dims(0) => sqrt)
Edit: maybe this
At the moment the data format is slightly inconsistent. The following two options are supported.
1. data(df) * mapping(:col1, :col2) (long format), or data(df) * mapping(:col1, [:col2, :col3]) (wide format)
2. mapping([rand(10) for _ in 1:3], [rand(10) for _ in 1:3], color=["a", "b", "c"]) (pregrouped format)

Naturally, this is inconsistent: whether data is in a dataset or in a set of variables should not affect whether it is considered "pregrouped" (each entry is a trace in the plot) or not.

However, the above approach has a key advantage: long and wide format can be supported simultaneously, as AlgebraOfGraphics can figure it out based on whether a mapping has a single symbol or a list of symbols. The two approaches can already be combined, with e.g. mapping(:col1, [:col2, :col3], color = :col4). This format (which we can call columnwise) is also useful when data is in separate variables, and is the default in Plots.jl (where one can do scatter(rand(10), rand(10, 5)) to plot 5 traces).

Proposal
Distinguish between three possible formats (going from data to plot):
1. raw: visual(Heatmap) * mapping(1:10, 1:10, rand(10, 10))
2. columnwise: visual(Scatter) * mapping(rand(10), rand(10, 5)) gives 5 traces
3. pregrouped: visual(Scatter) * mapping([rand(10) for _ in 1:5], [rand(10) for _ in 1:5], color=["a", "b", "c", "d", "e"]) gives 5 traces

Conceptually, the difference is how one slices the data to get a trace. The option raw slices along all directions (the whole data is a unique trace), the option columnwise slices along the first direction (each column is a trace), the option pregrouped slices along no directions (each entry is a trace). So a possible API could be
where possible values of dims are
All() for raw (using and reexporting DataAPI.All)
1 for columnwise (IMO, should be the default)
() for pregrouped
mapslices and eachslice follow different conventions in Base.

In this scenario, having variables in a dataset or separated has no effect: the columns are simply extracted, then the slicing context is applied.
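The three slicing contexts can be sketched in plain Julia. This is only an illustration under stated assumptions: AllDims (a stand-in for DataAPI.All) and traces are hypothetical names, not AlgebraOfGraphics functions. The last lines show the differing dims conventions of mapslices and eachslice in Base.

```julia
# Hypothetical sketch: `AllDims` stands in for DataAPI.All, and `traces`
# is an illustrative helper, not an AlgebraOfGraphics function.
struct AllDims end

# raw: the whole array is a single trace
traces(x::AbstractArray, ::AllDims) = [x]

# columnwise (dims = 1): a vector is one trace, each matrix column is a trace
traces(x::AbstractVector, dims::Int) = [x]
function traces(x::AbstractMatrix, dims::Int)
    dims == 1 || error("only dims = 1 is sketched here")
    return collect(eachcol(x))
end

# pregrouped (dims = ()): each entry of the container is a trace
traces(x::AbstractArray, ::Tuple{}) = vec(collect(x))

A = rand(10, 5)
n_raw = length(traces(A, AllDims()))                        # one trace
n_columnwise = length(traces(A, 1))                         # one trace per column
n_pregrouped = length(traces([rand(10) for _ in 1:5], ()))  # one trace per entry

# Base conventions differ: with dims=1, mapslices passes columns to the
# function, whereas eachslice iterates over rows.
M = [1 3 5; 2 4 6]
column_sums = mapslices(sum, M, dims=1)
row_slices = collect(eachslice(M, dims=1))
```

With the columnwise default, dims = 1 here follows the mapslices reading ("each column is a trace"), which is the opposite of what eachslice(M, dims=1) returns.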
Most things seem possible with a reasonably easy syntax with the column-wise default, other than plots where it's not possible / convenient to pass data as vectors (e.g. surface), in which case one should explicitly declare the raw context.

Remark
The raw context may not be necessary, as it can be achieved via the pregrouped context by wrapping each entry in fill. In that case, a simpler API may be data([df=nothing]; [pregrouped=false]).
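A minimal sketch of the fill trick, using only Base functions: fill with no dimensions creates a zero-dimensional array, so a whole matrix becomes a single "entry" when the container is read in the pregrouped sense. The variable names are illustrative.

```julia
M = rand(10, 10)

# fill(M) is a 0-dimensional array whose only entry is the matrix M
wrapped = fill(M)

# Reading the container entry by entry (the pregrouped convention)
# yields exactly one trace, the whole matrix
entries = vec(wrapped)
```

Here entries has length 1 and entries[1] is M itself, which is what the raw context would have produced.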
Naming and styling "slices"
At the moment, this is also inconsistent: for the pregrouped context, the user simply passes the list of names, e.g.
whereas for the columnwise context one currently uses a dims helper function, e.g.
This feels a bit inconsistent and is annoying in practice. Unfortunately, passing color=[:col2, :col3] wouldn't work, as AoG would try to apply the data extraction / slicing mechanism to it.

A possibility to unify both approaches would be to have an escape or just function to signal "ignore this item when applying the data extraction / slicing pipeline". So the two examples above, with the new proposal, would be
Note that in the first case escape is not really necessary, as data(dims=()) does not in practice cause any data extraction / slicing.
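To make the escape idea concrete, here is a rough sketch of how such a wrapper could bypass column extraction. Escaped, escape, and extract are hypothetical names chosen for illustration; this is not how AlgebraOfGraphics implements it.

```julia
# Hypothetical sketch: `Escaped`, `escape`, and `extract` are illustrative
# names, not AlgebraOfGraphics functions.
struct Escaped{T}
    value::T
end
escape(x) = Escaped(x)

# A column identifier is looked up in the table...
extract(table, name::Symbol) = getproperty(table, name)
# ...a list of identifiers is looked up entry by entry...
extract(table, names::AbstractArray{Symbol}) = map(n -> extract(table, n), names)
# ...and an escaped item is returned untouched.
extract(table, item::Escaped) = item.value

df = (col1 = 1:3, col2 = [0.1, 0.2, 0.3], col3 = [0.4, 0.5, 0.6])

columns = extract(df, [:col2, :col3])         # the two data columns
names = extract(df, escape([:col2, :col3]))   # the symbols themselves
```

Dispatching on a wrapper type keeps the extraction rules for symbols unchanged, so only items the user explicitly escapes skip the pipeline.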