Releasing code is my thing… I used[1] to ship Mac OS X updates at Apple, Firefox updates at Mozilla, the web frontend at Facebook, and many mobile apps at Facebook. Because of those experiences I’ve seen a lot of what works and what doesn’t, which platforms do things better, and tweaks that could improve the lives of developers shipping software.
Lately I’ve been giving talks about how to ship mobile apps at scale at conferences and startups[2]. While many of the experiences and challenges for popular apps like Facebook are unique, it is becoming clear that a lot of the experience shipping apps is universal. I decided to write down some of the current frustrations and suggested changes, starting with how I would change the App Store. It got so long that I broke it into a series of blog posts.
Note: I am sure there are other issues that can be fixed in the App Store, particularly related to discovery and monetization. I know nothing about those, as apps I worked on had brand recognition with millions of users and were free.
Many of these suggestions I have already given directly to the TestFlight / iTunes Connect team[3], though some are new. I’ve ordered my suggestions from what I think would be the highest impact to lowest impact. Previous suggestions were:
🐜 Better support for bug and crash reporting
When you look at one-star ratings for an app, they frequently contain transient crashes or major bugs:
Users have no obvious outlet when a bug is encountered except leaving a review in the App Store[4]. Often these bugs do not affect every user of the app, yet the resulting review and rating are there for everyone to see and consider. Star ratings and reviews may inform app developers of an actual problem but are next to useless for tracking down what the issue is.
Savvy developers like Facebook build very sophisticated[5] in-app bug reporting flows to get more actionable reports, but many smaller developers do not. Even with the best custom in-app bug reporting flow, only a small fraction of your users will actually find and use it. Of course, with startup crashes and out-of-memory app evictions your in-app bug reporting code may never even be called.
Apple’s dashboard for app quality does not cut it for most app developers. While it has gotten better recently, it is still vastly underpowered compared to many third-party[6] or in-house[7] systems. Some reasons:
- Apple’s crash reporting systems treat the app like a black box. Third-party systems like Crashlytics integrate an SDK into the app so that the app can annotate reports and tie the backend session to the frontend session. This app-specific information is often more useful than the crash stack for debugging issues and is something Apple’s systems will never be able to do.
- Counting “unique” issues is hard using just stack traces. Coming up with a signature that groups individual crashes into logical buckets (that is, saying X crash appeared Y number of times) is very hard to get right. The state of the art often involves regular expressions and manual whitelisting and blacklisting. At Facebook we had people work on bucketing using machine learning, but even then it wasn’t exact. There needs to be a feedback loop for engineers to say “these stacks are incorrectly grouped together” and “these stacks the system thinks are unique are all the same issue”. Apple’s system does not have this feedback loop.
- No official API. I am not going to sift through user reports in Apple’s web UI. I want my machines to do the work and I want to import it into my data warehouse to query as I see fit.
- Non-crash bug reporting is archaic. If a user has a non-crash problem using the app, the only supported options are to send an email to the support address or write a negative review.
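To make the bucketing point concrete, here is a minimal Python sketch of signature-based grouping with the kind of engineer feedback loop described above. Everything in it is an assumption for illustration: the `signature` function, the `Bucketer` class, the ignored-frame patterns, and the idea of "splitting" a bucket by looking one frame deeper are not how Apple, Facebook, or any third-party service actually does it.

```python
import re
from collections import Counter

# Hypothetical frames treated as noise when building a signature.
IGNORED_FRAMES = [re.compile(r"^objc_msgSend"), re.compile(r"^libsystem_")]


def signature(frames, depth):
    """Join the top `depth` meaningful frames into a bucket signature."""
    kept = []
    for frame in frames:
        # Drop the trailing "+ offset" so the same frame matches across builds.
        name = re.sub(r"\s*\+\s*\d+$", "", frame).strip()
        if any(pattern.search(name) for pattern in IGNORED_FRAMES):
            continue
        kept.append(name)
        if len(kept) == depth:
            break
    return " | ".join(kept)


class Bucketer:
    def __init__(self, depth=2):
        self.depth = depth
        self.aliases = {}     # signature -> canonical signature ("same issue")
        self.too_coarse = set()  # signatures flagged as "incorrectly grouped"
        self.counts = Counter()  # not re-bucketed retroactively in this sketch

    def bucket_for(self, frames):
        sig = signature(frames, self.depth)
        if sig in self.too_coarse:
            # Engineers said this bucket mixes issues: use one more frame.
            sig = signature(frames, self.depth + 1)
        return self.aliases.get(sig, sig)

    def record(self, frames):
        bucket = self.bucket_for(frames)
        self.counts[bucket] += 1
        return bucket

    def mark_same(self, sig, canonical):
        """Feedback: stacks the system thinks are unique are one issue."""
        self.aliases[sig] = canonical

    def mark_too_coarse(self, sig):
        """Feedback: these stacks are incorrectly grouped together."""
        self.too_coarse.add(sig)
```

Even this toy version shows why the feedback loop matters: the default depth is a guess, and only a human looking at the grouped stacks can say whether the guess was too coarse or too fine for a particular crash.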
There are some really nice things about Apple handling crash and bug reporting of course:
- Visibility of all crashes/out-of-memory/hangs. Because Apple’s crash reporter is the OS, it is active and reporting even when the app is in a very bad state. Third-party SDKs are blind to many issues–particularly startup crashes and hangs/deadlocks–because the app is monitoring itself and error reporting therefore depends on the app behaving to some extent. While many blind spots can be mitigated with enough engineering work (a watchdog thread looking for hangs in the main thread, dropping cookies before and after memory warnings to see if the OS evicted you, writing a cookie on clean shutdown to catch startup crashes, etc.) they are not 100% solutions and every app developer / third-party service needs to implement them.[8]
- Visibility into OS state. Apple’s crash reporter knows many things about the environment an app is running in which may help you debug issues. For example, the OS knows how many other apps are running, if the user is on cell or wifi, if they just started the phone, if a call came in, if a user just uninstalled and reinstalled an app, and so on. Again, while many of these can be deduced they are not 100% solutions and every app developer / third-party service needs to implement them.
- No increase in app size. Because Apple’s reporting is at the OS level there is no code to integrate in your app–it just works. With third-party SDKs each app developer needs to include them in their app bundle. For users this makes every app package bigger, wasting bandwidth and space on their device. Because SDKs aren’t shared across apps[9], users are taking that size penalty for every app they have–even if the SDKs are identical. Think about how many copies of SDKs like Crashlytics you have on your phone right now!
- Smaller developers get quality reporting for free. By default app developers don’t need to know how to manage a crash and bug reporting web service or pay for one. This is a huge benefit to new developers and smaller ones just starting out and really makes building quality apps more accessible by default.
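The "cookie" mitigations mentioned in the first bullet above can be sketched roughly as follows. This is an illustrative Python model, not production iOS code: the `SessionMarker` class, its method names, and the JSON marker file are all assumptions, and a real app would wire the equivalent calls into its launch, memory-warning, and termination callbacks.

```python
import json
import os


class SessionMarker:
    """Guess how the previous run ended from a marker file left on disk."""

    def __init__(self, path):
        self.path = path  # hypothetical location inside the app's sandbox

    def _write(self, state):
        with open(self.path, "w") as f:
            json.dump(state, f)

    def _update(self, **changes):
        with open(self.path) as f:
            state = json.load(f)
        state.update(changes)
        self._write(state)

    def begin_session(self):
        """Call first thing at launch; returns a guess about the last run."""
        previous = None  # None means the last run shut down cleanly
        if os.path.exists(self.path):
            with open(self.path) as f:
                state = json.load(f)
            if not state["finished_launch"]:
                previous = "startup_crash"
            elif state["memory_warning"]:
                previous = "possible_oom_eviction"
            else:
                previous = "unclean_exit"
        self._write({"finished_launch": False, "memory_warning": False})
        return previous

    def finished_launch(self):
        self._update(finished_launch=True)

    def memory_warning(self, active):
        # Set to True when a warning arrives, False once it has been handled.
        self._update(memory_warning=active)

    def clean_shutdown(self):
        os.remove(self.path)
```

Note that every result is a heuristic. A marker left behind with `memory_warning` set only suggests an eviction, which is exactly why these are not 100% solutions.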
I believe Apple can create a system that has the best of both worlds.
Generally there are two classes of issues people can encounter. I have suggestions on how to improve both.
💬 Proposal for Interactive Bug Reporting
Users need some way to interactively report issues, upload screenshots, and attach app-specific logs when they encounter a bug in an app. As I mentioned before, Facebook probably has the best system in the business here[10], but it isn’t very discoverable[11]. Most apps do not even have a bug reporting flow and the ones that do implement their UI differently. How is a motivated end-user supposed to know how to report bugs?
I suggest a consistent interface in the OS for users to report bugs to app developers. An obvious and discoverable option would be to add a “Report Problem” button in the app switcher:
Additionally, Apple should ship a revamped Feedback Assistant app in the production OS. It would let you provide feedback on any installed app instead of just Apple’s beta:
Even though it isn’t discoverable, the individual app settings in the Settings app should also have an entry to report an issue just in case a user stumbles into there.
The App Store app should also try to get people to report problems using the new flow instead of writing negative app reviews for transient bugs. This could be accomplished by putting a “Report Problem” button on the app listing page as well as prompting if the user enters a one-star rating:
Upon activating the bug reporting flow:
- The OS calls into the app. The app may choose to override in case the OS-supplied flow is insufficient for whatever reason.
- The user is presented with a list of product areas. The list is generated by calling into a method in the app.
- The user can take screenshots or videos of the app. Just like Facebook’s flow. Apple can actually do this better than anything in-app because the OS owns the window server and doesn’t have to resort to tricks to get the active window content.
- The user can annotate the screenshots or videos. This is important not only for pointing out the issue but also to support people blanking out something they feel is sensitive.
- The OS calls into the app for logs. The app returns opaque key/value app-specific data it wants to add to the report (like a user ID, etc.).
- The OS sends the report to Apple. This does not include the app-specific key/values to maintain confidentiality.
- The OS sends the report to the app developer or third-party service. This would be optionally controlled by a CFUserReportURL entry in the app’s Info.plist.
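The last two steps of the flow above, where Apple gets the report without the app-specific values and the developer gets everything, can be sketched like this. It is a rough Python model under stated assumptions: `BugReport`, its fields, and the helper functions are hypothetical names, and only the `CFUserReportURL` key comes from the proposal itself.

```python
from dataclasses import dataclass, field


@dataclass
class BugReport:
    bundle_id: str
    product_area: str      # picked by the user from the app-supplied list
    description: str
    attachments: list = field(default_factory=list)  # screenshots / videos
    app_data: dict = field(default_factory=dict)     # opaque app key/values


def payload_for_apple(report):
    """Everything except the app-specific key/values."""
    return {
        "bundle_id": report.bundle_id,
        "product_area": report.product_area,
        "description": report.description,
        "attachments": list(report.attachments),
    }


def payload_for_developer(report):
    """The Apple payload plus the opaque app-specific key/values."""
    payload = payload_for_apple(report)
    payload["app_data"] = dict(report.app_data)
    return payload


def destinations(report, info_plist):
    """Return (destination, payload) pairs for one submitted report."""
    result = [("apple", payload_for_apple(report))]
    url = info_plist.get("CFUserReportURL")  # optional, per the proposal
    if url:
        result.append((url, payload_for_developer(report)))
    return result
```

Keeping the two payload builders separate makes the confidentiality rule easy to audit: the app-specific data never appears in the code path that builds Apple's copy.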
A system like this will increase app quality by giving higher-quality feedback to app developers sooner, make App Store customer reviews more focused on app content, and address the helplessness people feel when they encounter a bug in an app.
Of course, not all bugs will need user information to be actionable. Crashes, hangs, and out-of-memory evictions generally don’t need information from the user and can be handled automatically.
💣 Proposal for Crash/Hang/Memory Reporting
Apps would opt-in to the enhanced reporter via the Info.plist:
<!-- Tell the OS where to send automated reports -->
<!-- Or do it for individual types -->
If CFProblemReport is set, all issues are sent to that URL. The developer can optionally choose to send different reports to different URLs. The URLs must be HTTPS as they may contain private information.
Apple would maintain a set of documented standard interpolation values that could optionally be used in the URL (OS version, app version, device type, device locale, etc.):
If there are no plist entries set, the current reporting behavior is used. This maintains the easy onboarding for new developers and those who don’t need a third-party service.
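As a rough illustration of how URL interpolation and the HTTPS requirement might behave, here is a small Python sketch. The `{name}` placeholder syntax, the value names in `STANDARD_VALUES`, and the `expand_report_url` function are all assumptions made for this example; the proposal does not define a concrete syntax.

```python
import re
from urllib.parse import quote, urlsplit

# Hypothetical names for the documented standard interpolation values.
STANDARD_VALUES = {"os_version", "app_version", "device_type", "device_locale"}
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def expand_report_url(template, values):
    """Fill `{name}` placeholders in a report URL and enforce HTTPS."""

    def substitute(match):
        name = match.group(1)
        if name not in STANDARD_VALUES:
            raise ValueError(f"unknown interpolation value: {name}")
        # Percent-encode so device-supplied values cannot alter the URL shape.
        return quote(str(values[name]), safe="")

    url = PLACEHOLDER.sub(substitute, template)
    if urlsplit(url).scheme != "https":
        raise ValueError("report URLs must use https")
    return url
```

With a template such as `https://reports.example.com/crash?os={os_version}&device={device_type}`, a backend could route or reject a report from the URL alone, before reading the body.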
When a crash/hang/out-of-memory occurs:
- The OS sends a generic report to Apple.
  - This preserves the current Apple system behavior and protects against a third-party web service going down or losing reports.
- The OS calls into the app.
  - The crash reporter calls into a pre-determined method.
  - The app returns opaque key/value combinations back to the OS[12]. This allows app developers to annotate their issue reports with valuable app-specific information.
  - There should also be some method of returning headers in the same way, so that an app can specify things like an API key (using the Authorization header) and a third-party web service can determine if it wants to handle the request without looking at the whole payload (which may be large).
  - If the app doesn’t return an affirmative response in a documented time frame, Apple’s crash reporter will note that calling into the app failed, how it failed (it crashed, took too long, or the app itself returned a negative response) and move on.
- The OS generates an app-specific report.
  - The crash reporter takes the generic Apple report and adds the key/values and headers from above.
- The OS sends an app-specific report to the developer-specified URL.
  - Apple’s crash reporter handles sending the issue report to the specified URL in the Info.plist. It does nice things like retrying if sending fails and can optionally support batch uploads to cut down on HTTP requests.
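The flow above can be modeled in a few lines of Python. This is only a sketch under assumptions: the function names, the `{"values": ..., "headers": ...}` return shape for the app callback, and the status strings are invented here, and a thread with a timeout stands in for what would really be a call across a process boundary.

```python
import concurrent.futures
import time


def collect_app_annotations(callback, timeout_seconds=1.0):
    """Call the app's hook without letting it block or break reporting."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(callback)
    try:
        result = future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        status, result = "timed_out", None
    except Exception:
        status, result = "crashed", None
    else:
        # Anything other than a dict counts as the app declining.
        status = "ok" if isinstance(result, dict) else "declined"
    finally:
        executor.shutdown(wait=False)
    return status, result


def build_app_report(generic_report, callback, timeout_seconds=1.0):
    """Copy the generic report and add app key/values and headers."""
    status, annotations = collect_app_annotations(callback, timeout_seconds)
    report = dict(generic_report)
    report["app_callback_status"] = status  # records how the call failed
    headers = {}
    if status == "ok":
        report["app_data"] = annotations.get("values", {})
        headers = annotations.get("headers", {})
    return report, headers


def send_with_retries(send, report, headers, attempts=3, delay_seconds=0.0):
    """Try `send` up to `attempts` times, backing off between failures."""
    for attempt in range(attempts):
        try:
            send(report, headers)
            return True
        except OSError:
            if attempt < attempts - 1:
                time.sleep(delay_seconds * (2 ** attempt))
    return False
```

The important design point is that the generic report is built first and never depends on the app: whatever the callback does, the worst outcome is a report with a failure status and no annotations.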
I believe a system like this would make the whole ecosystem better off. No more needing to integrate libraries like Breakpad and PLCrashReporter just to get stack traces. Developers have visibility into 100% of issues including hangs and startup crashes. Reports are sent to your own systems with Apple’s as a backup. Smaller developers still have everything handled for them. Apps are smaller due to not needing code to handle this sort of thing.
Next up, I’ll talk about App Store ratings.
[1] I am now doing my own stealth startup in this space.
[2] If you want me to talk at yours, email me at my first name at my last name.com or contact me on Twitter @LegNeato.
[3] If anyone at Apple needs more context or wants confidential information, people in the Program Office know how to get in touch with me.
[4] They could send an email to the developer via their contact email but users don’t and developers generally don’t monitor it closely.
[5] It can take screenshots, videos, give end-to-end server and client logging and playback, etc. Facebook’s “Flytrap” bug reporting analysis is also insanely good and leverages their expertise in ML…I wish they’d talk about it more publicly.
[6] Like Crashlytics
[7] The Googles and Facebooks of the world know how to build these the best. Apple can’t build web services to save their life. I know, I worked there and have many friends that still do.
[8] Facebook has built all these things. It takes a lot of work to get it right.
[9] Static linking and sandboxing is not so great for sharing code.
[10] Choose “Report a Problem” from the bookmarks. It has feature categories, screenshots with annotations, and even videos!
[11] For employees it is activated with a “rage shake”.
[12] Note that the app could encrypt these if they don’t trust Apple with the values.