Implementing benchmarking metrics within status-mobile app #19047
Before anything, thanks for the initiative to start this discussion @siddarthkay! You touch on many interesting topics; I'll go over them lightly to share my perspective.
We could tell shadow-cljs to run the dev build with a bunch of extra optimizations to approximate what runs in prod builds. Best of all, I think, would be to decouple the collection of data from the device of origin. Ideally, the developer should decide whether they want to observe a dev build, an emulator, a real device, etc., trading precision for convenience while understanding the limitations of each environment.
The step of generating reports and making sense of time-series data is already solved in the industry, and I would certainly try not to reinvent this wheel. Existing tools will allow us to report p95, p99, medians, and standard deviations, generate dashboards, and so on. This is all nice because it means we can descope this effort.
Using a limited mobile device to view metrics is not ideal to me. Some metrics can be okay on-device, like UI FPS or CPU usage, but once we embed the concept of time series, mobile devices just don't fit the bill when the developer needs to understand why something is not performing well. Shaking the device would be a nice feature, but we could ignore this and focus on the developer side of things. We will be the main consumers and generators of those metrics, actively seeking improvements. For example, we could have Prometheus pulling data from the device(s) and use Grafana to create dashboards, which would let us see results in near real-time (a minimal sketch of a scrapeable endpoint follows). Therefore, I think it's viable to start without any UI in the app, or rather just a way to start and stop data collection, as you suggested.
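To make the Prometheus idea concrete, here's a minimal sketch of what the device side could look like. Everything in it is an assumption for illustration: the `MetricsRegistry` name, port 9100, and a raw `ServerSocket` standing in for a proper HTTP server. The point is only that a scrapeable endpoint can be tiny (Kotlin, since we'd likely end up in native-module territory anyway):

```kotlin
import java.net.ServerSocket
import java.util.concurrent.ConcurrentHashMap
import kotlin.concurrent.thread

// Hypothetical registry: the app's measurement code writes gauges here.
object MetricsRegistry {
    private val gauges = ConcurrentHashMap<String, Double>()

    fun set(name: String, value: Double) {
        gauges[name] = value
    }

    // Render all gauges in the Prometheus text exposition format.
    fun scrape(): String =
        gauges.entries.joinToString(separator = "\n", postfix = "\n") { (k, v) -> "$k $v" }
}

// Tiny scrape endpoint; on Android this would also need the INTERNET permission.
fun startMetricsServer(port: Int = 9100) = thread(isDaemon = true) {
    ServerSocket(port).use { server ->
        while (true) {
            server.accept().use { socket ->
                val body = MetricsRegistry.scrape().toByteArray()
                val header = "HTTP/1.1 200 OK\r\n" +
                    "Content-Type: text/plain; version=0.0.4\r\n" +
                    "Content-Length: ${body.size}\r\n\r\n"
                socket.getOutputStream().apply {
                    write(header.toByteArray())
                    write(body)
                    flush()
                }
            }
        }
    }
}
```

A Prometheus instance on the same network would then scrape `http://<device-ip>:9100/` at a fixed interval, and the Grafana side is pure configuration.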
TTFI is very important indeed, but a single data point for TTFI is kind of unreliable, so the concept of time is somewhat essential. I believe our in-house solution should always assume data could be consumed as a time series or aggregated into one point.
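As a toy illustration of why aggregation matters, assume we've collected TTFI from several runs; the helpers below are hypothetical and use a naive nearest-rank percentile:

```kotlin
// Naive percentile over a sorted list (illustrative, not production statistics).
fun percentile(sorted: List<Double>, p: Double): Double =
    sorted[((sorted.size - 1) * p).toInt()]

fun summarize(ttfiMillis: List<Double>): String {
    val sorted = ttfiMillis.sorted()
    return "n=${sorted.size}" +
        " mean=%.1fms".format(sorted.average()) +
        " p50=${percentile(sorted, 0.50)}ms" +
        " p95=${percentile(sorted, 0.95)}ms"
}

fun main() {
    // Six hypothetical TTFI runs; the 1204 ms outlier would dominate a
    // single-sample measurement but is put in context by the aggregate.
    val runs = listOf(812.0, 790.0, 1204.0, 835.0, 798.0, 901.0)
    println(summarize(runs))
}
```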
This is a bit nuanced to me. Profiling and benchmarking in non-prod builds is useful to give a sense of relative performance. Obviously, every result should be taken with a grain of salt and not as absolute truth, even when measuring in production (e.g. real devices have plenty of other things running simultaneously, and every user is different). I agree with you that results can be misleading, but it's mostly the dev's responsibility to know this, so I still think it's valuable to invest time in making tools like Flashlight and the Hermes profiler easier for everybody to use. Both tools worked well on my system, albeit Hermes was a bit inconvenient.
Problem
We need a common language and some useful benchmarking metrics so that we can measure performance across different user journeys on different platforms under various conditions.
Existing solutions
Performance Monitor -> built into React Native's dev menu
ref -> https://reactnative.dev/docs/debugging?js-debugger=new-debugger#performance-monitor
Provides the following key metrics: RAM usage, JavaScript heap size, number of views, UI FPS, and JS FPS.
This is a good tool, but its one limitation is that it only provides information on debug builds, and we are interested in those metrics on release builds.
The app goes through various compile-time optimisations, so what is a problem in a debug build may not be a problem in a release build.
Hence I think it's not fair to rely on these metrics from debug builds, since they may be misleading.
Hermes Profiler -> a profiler by the React Native team
ref -> https://reactnative.dev/docs/profile-hermes
The Hermes Profiler provides a good waterfall view, but we need to connect the device to a laptop to extract the traces, and it is sometimes hard to set up and measure with. This workflow is often not easy for everyone.
Flashlight library -> a standalone library to measure the performance of Android apps
ref -> https://github.com/bamlab/flashlight
A good thing about this library is that it tracks important metrics on release builds of the app.
The only limitation is that it uses an external C debugger to profile performance on an emulator (or a device connected via a cable).
The problem with emulators is that they differ subtly from real devices and still do not show the complete picture. The problem with connected devices is simply that it is (in my opinion) not an optimal UX for measuring performance, but it is still a doable approach in the absence of other solutions.
Proposal
We build an in-house solution, with the help of Android and iOS native modules, to measure and log the performance metrics we care about.
We would do this by providing a profiler toggle in settings, plus a floating button on all screens where we could turn it off. We only want to profile certain user journeys to measure the app's performance in those situations, not have profiling turned on by default. That said, it would be cool if we could turn it on at build time via a flag and turn it off in settings, so that we can measure onboarding performance. A sketch of what the Android side of such a module could look like follows.
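As a rough illustration (not a design commitment), the Android half could be a standard React Native native module. The package, the module name `PerfMetrics`, and its methods are hypothetical; only the `ReactContextBaseJavaModule` / `@ReactMethod` plumbing is real React Native API:

```kotlin
package im.status.perf // hypothetical package, for illustration only

import com.facebook.react.bridge.Promise
import com.facebook.react.bridge.ReactApplicationContext
import com.facebook.react.bridge.ReactContextBaseJavaModule
import com.facebook.react.bridge.ReactMethod

class PerfMetricsModule(reactContext: ReactApplicationContext) :
    ReactContextBaseJavaModule(reactContext) {

    @Volatile
    private var collecting = false

    override fun getName() = "PerfMetrics"

    // Wired to the settings toggle / floating button on the JS side.
    @ReactMethod
    fun startProfiling() {
        collecting = true
    }

    @ReactMethod
    fun stopProfiling() {
        collecting = false
    }

    // Hand collected data back to JS, e.g. to render in a basic UI or
    // to attach to the logs shared via the shake gesture.
    @ReactMethod
    fun dumpMetrics(promise: Promise) {
        promise.resolve("{}") // placeholder JSON payload
    }
}
```

The iOS counterpart would mirror this with `RCT_EXPORT_MODULE`, and the ClojureScript side would call these methods from the settings toggle and the floating button.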
We would need a basic UI in the app to view these metrics, and the ability to share them as part of our "shake device to share logs" feature so that they are easily exportable.
Since performance metrics are time-series data, we want to store these values with a timestamp and also have the ability to average each series out at the end. A sketch of that storage shape follows.
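A minimal sketch of the storage shape, assuming hypothetical `Sample` and `MetricSeries` types:

```kotlin
// One timestamped measurement.
data class Sample(val epochMillis: Long, val value: Double)

// A named series that can be read back raw (as a time series) or
// collapsed into one number ("average it out in the end").
class MetricSeries(val name: String) {
    private val samples = mutableListOf<Sample>()

    @Synchronized
    fun record(value: Double) {
        samples += Sample(System.currentTimeMillis(), value)
    }

    // Time-series view, e.g. for export alongside shared logs.
    @Synchronized
    fun asSeries(): List<Sample> = samples.toList()

    // Aggregated view; returns NaN if nothing was recorded.
    @Synchronized
    fun average(): Double = samples.map { it.value }.average()
}
```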
There are also other metrics we are interested in which are not time-series related.