- See the list of solved tasks at a glance (similar to a JUnit HTML report)
  - including coverage count
  - or encountered errors
  - and any additional metrics we (will) collect (processing time, character count, ...)
  - with the possibility to click through and see the actual result (i.e. what the model generated)
- Diff/compare two model results
  - on a list basis (i.e. which tasks each model was able to solve, and where coverage, character count, ... differ)
  - on a task basis (i.e. diffing the model results 1:1)
  - maybe also the ability to switch between results from different runs (e.g. compare the `plain.go` result from model A's run 2 against the `plain.go` result from model B's run 4)
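
To make the list-basis comparison more concrete, here is a rough sketch in Go of how two runs could be lined up per task. Everything in it is an assumption for illustration only: the `TaskResult` and `TaskDiff` types, the `CompareRuns` function, the field names, and the `other.go` task do not exist in the code base, and the real result format may look quite different.

```go
package main

import (
	"fmt"
	"sort"
)

// TaskResult is a hypothetical record of one model's outcome for one task
// in one run. The fields mirror the metrics mentioned in this issue.
type TaskResult struct {
	Task       string // task file, e.g. "plain.go"
	Solved     bool   // whether the model solved the task
	Coverage   int    // coverage count
	Characters int    // character count of the generated output
}

// TaskDiff pairs the results of two runs for the same task.
// A or B is nil when that run has no result for the task.
type TaskDiff struct {
	Task string
	A, B *TaskResult
}

// Differs reports whether the two runs disagree on this task.
func (d TaskDiff) Differs() bool {
	if d.A == nil || d.B == nil {
		return d.A != d.B
	}
	return *d.A != *d.B
}

// CompareRuns lines up two runs task by task, sorted by task name.
func CompareRuns(a, b []TaskResult) []TaskDiff {
	byTask := map[string]*TaskDiff{}
	for i := range a {
		byTask[a[i].Task] = &TaskDiff{Task: a[i].Task, A: &a[i]}
	}
	for i := range b {
		d, ok := byTask[b[i].Task]
		if !ok {
			d = &TaskDiff{Task: b[i].Task}
			byTask[b[i].Task] = d
		}
		d.B = &b[i]
	}

	diffs := make([]TaskDiff, 0, len(byTask))
	for _, d := range byTask {
		diffs = append(diffs, *d)
	}
	sort.Slice(diffs, func(i, j int) bool { return diffs[i].Task < diffs[j].Task })
	return diffs
}

func main() {
	// Made-up sample data; "other.go" is a placeholder task name.
	modelA := []TaskResult{
		{Task: "plain.go", Solved: true, Coverage: 1, Characters: 120},
		{Task: "other.go", Solved: false},
	}
	modelB := []TaskResult{
		{Task: "plain.go", Solved: true, Coverage: 1, Characters: 95},
	}

	for _, d := range CompareRuns(modelA, modelB) {
		marker := " "
		if d.Differs() {
			marker = "!"
		}
		fmt.Printf("%s %-10s A=%v B=%v\n", marker, d.Task, d.A, d.B)
	}
}
```

The same paired structure could feed an HTML report: one row per task, one column per model/run, with differing rows highlighted and each cell linking to the generated output for the 1:1 diff.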