Releasing code is my thing… I used[1] to ship Mac OS X updates at Apple, Firefox updates at Mozilla, the web frontend at Facebook, and many mobile apps at Facebook. Because of those experiences I’ve seen a lot of what works and what doesn’t, which platforms do things better, and tweaks that could improve the lives of developers shipping software.
Lately I’ve been giving talks about how to ship mobile apps at scale at conferences and startups[2]. While many of the experiences and challenges for popular apps like Facebook are unique, it is becoming clear that a lot of the experience of shipping apps is universal. I decided to write down some of the current frustrations and suggested changes, starting with how I would change the App Store. It got so long that I broke it into a series of blog posts.
Note: I am sure there are other issues that can be fixed in the App Store, particularly related to discovery and monetization. I know nothing about those, as the apps I worked on had brand recognition with millions of users and were free.
Many of these suggestions I have already given directly to the TestFlight / iTunes Connect team[3], though some are new. I’ve ordered my suggestions from what I think would be the highest impact to lowest impact. Previous suggestions were:
🐜 Better support for bug and crash reporting
When you look at one-star ratings for an app, they frequently contain transient crashes or major bugs:
Users have no obvious outlet when a bug is encountered except leaving a review in the App Store[4]. Often these bugs do not affect every user of the app, yet the resulting review and rating are there for everyone to see and consider. Star ratings and reviews may inform app developers of an actual problem but are next to useless for tracking down what the issue is.
Savvy developers like Facebook build very sophisticated[5] in-app bug reporting flows to get more actionable reports, but many smaller developers do not. Even with the best custom in-app bug reporting flow, only a small fraction of your users will actually find and use it. Of course, with startup crashes and out-of-memory app evictions your in-app bug reporting code may never even be called.
Apple’s dashboard for app quality does not cut it for most app developers. While it has gotten better recently, it is still vastly underpowered compared to many third-party[6] or in-house[7] systems. Some reasons:
- Apple’s crash reporting systems treat the app like a black box. Third-party systems like Crashlytics integrate an SDK into the app so that the app can annotate reports and tie the backend session to the frontend session. This app-specific information is often more useful than the crash stack for debugging issues and is something Apple’s systems will never be able to do.
- Counting “unique” issues is hard using just stack traces. Coming up with a signature that groups individual crashes into logical buckets (that is, saying X crash appeared Y number of times) is very hard to get right. The state-of-the-art often involves regular expressions and manual whitelisting and blacklisting. At Facebook we had people work on bucketing using machine learning, but even then it wasn’t exact. There needs to be a feedback loop for engineers to say “these stacks are incorrectly grouped together” and “these stacks the system thinks are unique are all the same issue”. Apple’s system does not have this feedback loop.
- No official API. I am not going to sift through user reports in Apple’s web UI. I want my machines to do the work and I want to import it into my data warehouse to query as I see fit.
- Non-crash bug reporting is archaic. If a user has a non-crash problem using the app, the only supported options are to send an email to the support address or write a negative review.
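To make the bucketing problem concrete, here is a minimal sketch in Python of signature-based grouping with a manual feedback loop. Everything in it is an assumption for illustration: the ignore patterns, the frame format, the default depth of three frames, and the `MERGE_OVERRIDES` / `SPLIT_DEPTH` tables are hypothetical and do not describe how Apple, Facebook, or Crashlytics actually bucket crashes.

```python
import re
from collections import Counter

# Hypothetical, hand-maintained list of frames that say little about the cause.
IGNORED_FRAME_PATTERNS = [
    re.compile(r"^objc_msgSend$"),
    re.compile(r"^(abort|__pthread_kill|pthread_kill)$"),
]

# Engineer feedback, stored as plain tables in this sketch.
MERGE_OVERRIDES = {}  # signature -> canonical signature ("these are the same issue")
SPLIT_DEPTH = {}      # signature -> deeper frame count ("these are grouped wrongly")


def normalize_frame(frame):
    """Strip details that vary between otherwise identical crashes."""
    frame = re.sub(r"0x[0-9a-fA-F]+", "", frame)  # raw addresses
    frame = re.sub(r"\s*\+\s*\d+$", "", frame)     # byte offsets such as " + 132"
    return frame.strip()


def signature(stack, depth=3):
    """Build a bucket signature from the first few meaningful frames."""
    frames = [normalize_frame(f) for f in stack]
    useful = [
        f for f in frames
        if f and not any(p.search(f) for p in IGNORED_FRAME_PATTERNS)
    ]
    sig = " | ".join(useful[:depth])
    if sig in SPLIT_DEPTH:
        # An engineer said this bucket is too coarse: look at more frames.
        sig = " | ".join(useful[:SPLIT_DEPTH[sig]])
    return MERGE_OVERRIDES.get(sig, sig)


def bucket(stacks):
    """Count how many crashes fall into each signature."""
    return Counter(signature(s) for s in stacks)
```

The two override tables are the feedback loop: merging maps one signature onto another, and splitting asks for more frames before a signature is final. A real system would need far more than this, which is rather the point of the complaint above.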
There are some really nice things about Apple handling crash and bug reporting of course:
- Visibility of all crashes/out-of-memory/hangs. Because Apple’s crash reporter is the OS, it is active and reporting even when the app is in a very bad state. Third-party SDKs are blind to many issues, particularly startup crashes and hangs/deadlocks, because the app is monitoring itself and error reporting therefore depends on the app behaving to some extent. While many blind spots can be mitigated with enough engineering work (a watchdog thread looking for hangs in the main thread, dropping cookies before and after memory warnings to see if the OS evicted you, writing a cookie on clean shutdown to catch startup crashes, etc.) they are not 100% solutions and every app developer / third-party service needs to implement them.[8]
- Visibility into OS state. Apple’s crash reporter knows many things about the environment an app is running in which may help you debug issues. For example, the OS knows how many other apps are running, if the user is on cellular or Wi-Fi, if they just started the phone, if a call came in, if a user just uninstalled and reinstalled an app, and so on. Again, while many of these can be deduced they are not 100% solutions and every app developer / third-party service needs to implement them.
- No increase in app size. Because Apple’s reporting is at the OS level there is no code to integrate in your app; it just works. With third-party SDKs each app developer needs to include them in their app bundle. For users this makes every app package bigger, wasting bandwidth and space on their device. Because SDKs aren’t shared across apps[9], users are taking that size penalty for every app they have, even if the SDKs are identical. Think about how many copies of SDKs like Crashlytics you have on your phone right now!
- Smaller developers get quality reporting for free. By default app developers don’t need to know how to manage a crash and bug reporting web service or pay for one. This is a huge benefit to new developers and smaller ones just starting out and really makes building quality apps more accessible by default.
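The “cookie” mitigations mentioned in the first point above can be sketched roughly as follows. This is a language-agnostic illustration written in Python, not iOS code: the `SessionMarker` class, the flag names, and the outcome labels are all hypothetical, and the classification is a heuristic guess about the previous session rather than ground truth.

```python
import json
import os


class SessionMarker:
    """Tracks how the previous session ended using a small state file."""

    def __init__(self, path):
        self.path = path

    def _read(self):
        with open(self.path) as f:
            return json.load(f)

    def _write(self, state):
        with open(self.path, "w") as f:
            json.dump(state, f)

    def previous_outcome(self):
        """Guess how the last session ended. Call before start_session()."""
        if not os.path.exists(self.path):
            return "first_launch"
        state = self._read()
        if state.get("clean_exit"):
            return "clean"
        if state.get("memory_warning"):
            # Died while a memory warning was outstanding: likely evicted.
            return "probable_oom"
        if not state.get("startup_finished"):
            return "probable_startup_crash"
        return "probable_crash_or_hang"

    def start_session(self):
        """Write the cookie as early in launch as possible."""
        self._write({"clean_exit": False,
                     "memory_warning": False,
                     "startup_finished": False})

    def update(self, **flags):
        """Flip flags as the app reaches milestones or receives warnings."""
        state = self._read()
        state.update(flags)
        self._write(state)
```

An app would call `previous_outcome()` at launch, then `start_session()`, then `update(startup_finished=True)` once the UI is up, `update(memory_warning=True)` when a warning arrives (and `False` once it has been handled), and `update(clean_exit=True)` on a normal shutdown. Even then it only sees what the app lived long enough to write down, which is why the OS is in a better position to do this.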
I believe Apple can create a system that has the best of both worlds.
Generally there are two classes of issues people can encounter. I have suggestions on how to improve both.
💬 Proposal for Interactive Bug Reporting
Users need some way to interactively report issues, upload screenshots, and attach app-specific logs when they encounter a bug in an app. As I mentioned before, Facebook probably has the best system in the business here[10], but it isn’t very discoverable[11]. Most apps do not even have a bug reporting flow and the ones that do each implement their UI differently. How is a motivated end-user supposed to know how to report bugs?
I suggest a consistent interface in the OS for users to report bugs to app developers. An obvious and discoverable option would be to add a “Report Problem” button in the app switcher:
Additionally, Apple should ship a revamped Feedback Assistant app in the production OS. It would let you provide feedback on any installed app instead of just Apple’s beta:
Even though it isn’t discoverable, the individual app settings in the Settings app should also have an entry to report an issue just in case a user stumbles into there.
The App Store app should also try to get people to report problems using the new flow instead of writing negative app reviews for transient bugs. This could be accomplished by putting a “Report Problem” button on the app listing page as well as prompting if the user enters a 1 star rating:
Upon activating the bug reporting flow:
- The OS calls into the app. The app may choose to override in case the OS-supplied flow is insufficient for whatever reason.
- The user is presented with a list of product areas. The list is generated by calling into a method in the app.
- The user can take screenshots or videos of the app. Just like Facebook’s flow. Apple can actually do this better than anything in-app because the OS owns the window server and doesn’t have to resort to tricks to get the active window content.
- The user can annotate the screenshots or videos. This is important not only for pointing out the issue but also to support people blanking out something they feel is sensitive.
- The OS calls into the app for logs. The app returns opaque key/value app-specific data it wants to add to the report (like a user id, etc).
- The OS sends the report to Apple. This does not include the app-specific key/values to maintain confidentiality.
- The OS sends the report to the app developer or third-party service. This would be optionally controlled by a CFUserReportURL entry in the app’s Info.plist.
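The last two steps are where confidentiality matters, so here is a small sketch of how the two copies of a report could differ. It is written in Python purely for illustration. The `UserReport` fields, the payload shape, and the `"apple"` destination label are hypothetical; only the `CFUserReportURL` key comes from the proposal itself.

```python
from dataclasses import dataclass, field


@dataclass
class UserReport:
    bundle_id: str
    product_area: str
    description: str
    attachments: list = field(default_factory=list)  # annotated screenshots / videos
    app_values: dict = field(default_factory=dict)   # opaque key/values from the app


def payload_for_apple(report):
    """Generic copy: deliberately leaves out the app-specific key/values."""
    return {
        "bundle_id": report.bundle_id,
        "product_area": report.product_area,
        "description": report.description,
        "attachments": list(report.attachments),
    }


def payload_for_developer(report):
    """Developer copy: the generic report plus the app's own key/values."""
    payload = payload_for_apple(report)
    payload["app_values"] = dict(report.app_values)
    return payload


def deliveries(report, info_plist):
    """Return (destination, payload) pairs for one user report."""
    out = [("apple", payload_for_apple(report))]
    url = info_plist.get("CFUserReportURL")
    if url:  # the developer copy is opt-in
        out.append((url, payload_for_developer(report)))
    return out
```

With no `CFUserReportURL` entry the list contains only the Apple copy, which matches the idea that developers who do nothing still get the default behavior.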
A system like this will increase app quality by giving higher-quality feedback to app developers sooner, make App Store customer reviews more focused on app content, and address the helplessness people feel when they encounter a bug in an app.
Of course, not all bugs will need user information to be actionable. Crashes, hangs, and out-of-memory evictions generally don’t need information from the user and can be handled automatically.
💣 Proposal for Crash/Hang/Memory Reporting
Apps would opt-in to the enhanced reporter via the Info.plist:
<!-- Tell the OS where to send automated reports -->
<key>CFProblemReportURL</key>
<string>...</string>
<!-- Or do it for individual types -->
<key>CFUserReportURL</key>
<string>...</string>
<key>CFCrashReportURL</key>
<string>...</string>
<key>CFSpinReportURL</key>
<string>...</string>
<key>CFOutOfMemoryReportURL</key>
<string>...</string>
If CFProblemReportURL is set, all issues are sent to that URL. The developer can optionally choose to send different reports to different URLs. The URLs must be HTTPS as they may contain private information.
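As a rough sketch of how the OS might pick a destination, the Python below resolves a URL for a given report type. The precedence is my assumption, since the proposal does not spell it out: here a type-specific key wins when present and `CFProblemReportURL` acts as the catch-all. The short type names (`"crash"`, `"spin"`, and so on) and the function name are hypothetical.

```python
from urllib.parse import urlsplit

# Proposed Info.plist keys, indexed by a hypothetical short report type.
TYPE_KEYS = {
    "user": "CFUserReportURL",
    "crash": "CFCrashReportURL",
    "spin": "CFSpinReportURL",
    "oom": "CFOutOfMemoryReportURL",
}
CATCH_ALL_KEY = "CFProblemReportURL"


def report_url(info_plist, report_type):
    """Return the developer URL for a report type, or None if not opted in.

    Assumption: a type-specific key overrides the catch-all key.
    """
    url = info_plist.get(TYPE_KEYS[report_type]) or info_plist.get(CATCH_ALL_KEY)
    if url is None:
        return None  # no entries: keep the current Apple-only behavior
    url = url.strip()
    if urlsplit(url).scheme.lower() != "https":
        raise ValueError("report URLs must use https: " + url)
    return url
```

Rejecting non-HTTPS URLs outright is one option. The OS could equally ignore the entry and fall back to the default behavior.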
Apple would maintain a set of documented standard interpolation values that could optionally be used in the URL (OS version, app version, device type, device locale, etc):
<key>CFProblemReportURL</key>
<string>https://www.example.com/myapp/%CF_BUNDLE_VERSION%/%OS_VERSION%/%LOCALE%</string>
If there are no plist entries set, the current reporting behavior is used. This maintains the easy onboarding for new developers and those who don’t need a third-party service.
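The interpolation itself could be as simple as the Python sketch below. The `%NAME%` syntax and the three placeholder names come from the example URL above. The function name, the choice to percent-encode substituted values, and the choice to leave unknown placeholders untouched are my assumptions.

```python
import re
from urllib.parse import quote

# Matches placeholders such as %OS_VERSION%; the naming rule is an assumption.
PLACEHOLDER = re.compile(r"%([A-Z][A-Z0-9_]*)%")


def interpolate(template, values):
    """Fill documented placeholders in a report URL template."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            return match.group(0)  # unknown placeholder: leave it as written
        # Percent-encode so a value cannot inject extra path segments.
        return quote(str(values[name]), safe="")
    return PLACEHOLDER.sub(substitute, template.strip())
```

For example, filling the template above with a bundle version of `4.2.1`, an OS version of `9.3`, and a locale of `en_US` would give `https://www.example.com/myapp/4.2.1/9.3/en_US`.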
When a crash/hang/out-of-memory occurs:
- The OS sends a generic report to Apple.
  - This preserves the current Apple system behavior and protects against a third-party web service going down or losing reports.
- The OS calls into the app.
  - The crash reporter calls into a pre-determined method.
  - The app returns opaque key/value combinations back to the OS[12]. This allows app developers to annotate their issue reports with valuable app-specific information.
  - There should also be some method of returning headers in the same way, so that an app can specify things like an API key (using the Authorization header) and a third-party web service can determine if it wants to handle the request without looking at the whole payload (which may be large).
  - If the app doesn’t return an affirmative response in a documented time frame, Apple’s crash reporter will note that calling into the app failed, how it failed (it crashed, took too long, or the app itself returned a negative response) and move on.
- The OS generates an app-specific report.
  - The crash reporter takes the generic Apple report, adds the key/values and headers from above.
- The OS sends an app-specific report to the developer-specified URL.
  - Apple’s crash reporter handles sending the issue report to the specified URL in the Info.plist. It does nice things like retries if it fails and can optionally support batch uploads to cut down on HTTP requests.
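The steps above can be sketched in Python as follows. This is an illustration of the control flow only, under several assumptions: the callback returns a `(values, headers)` pair or `None` to decline, the status strings are made up, `send` is a hypothetical transport function that returns `True` on success, and the retry policy (three attempts with doubling delays) is arbitrary. A real OS-level reporter would isolate the app callback far more strictly than a thread with a timeout.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def collect_annotations(callback, timeout_s=1.0):
    """Call into the app and return (status, values, headers)."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(callback)
    try:
        result = future.result(timeout=timeout_s)
    except FutureTimeout:
        return "timed_out", {}, {}
    except Exception:
        return "crashed", {}, {}
    finally:
        pool.shutdown(wait=False)  # do not block on a stuck callback
    if result is None:
        return "declined", {}, {}
    values, headers = result
    return "ok", dict(values), dict(headers)


def build_app_report(generic_report, callback, timeout_s=1.0):
    """Take the generic report and add the app's key/values and headers."""
    status, values, headers = collect_annotations(callback, timeout_s)
    body = dict(generic_report)
    body["app_callback_status"] = status  # records how the call-in went
    body["app_values"] = values
    return body, headers


def send_with_retries(send, body, headers, attempts=3,
                      base_delay_s=0.5, sleep=time.sleep):
    """Try to deliver the report, backing off between failed attempts."""
    for attempt in range(attempts):
        if send(body, headers):
            return True
        if attempt < attempts - 1:
            sleep(base_delay_s * 2 ** attempt)
    return False
```

The generic report to Apple is sent before any of this runs, so a misbehaving callback or an unreachable developer endpoint costs nothing beyond the app-specific copy.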
I believe a system like this would make the whole ecosystem better off. No more needing to integrate libraries like Breakpad and PLCrashReporter just to get stack traces. Developers have visibility into 100% of issues including hangs and startup crashes. Reports are sent to your own systems with Apple’s as a backup. Smaller developers still have everything handled for them. Apps are smaller due to not needing code to handle this sort of thing.
Next up, I’ll talk about App Store ratings.
[1] I am now doing my own stealth startup in this space.
[2] If you want me to talk at yours, email me at my first name at my last name.com or contact me on Twitter @LegNeato.
[3] If anyone at Apple needs more context or wants confidential information, people in the Program Office know how to get in touch with me.
[4] They could send an email to the developer via their contact email but users don’t and developers generally don’t monitor it closely.
[5] It can take screenshots, videos, give end-to-end server and client logging and playback, etc. Facebook’s “Flytrap” bug reporting analysis is also insanely good and leverages their expertise in ML… I wish they’d talk about it more publicly.
[6] Like Crashlytics.
[7] The Googles and Facebooks of the world know how to build these the best. Apple can’t build web services to save their life. I know, I worked there and have many friends that still do.
[8] Facebook has built all these things. It takes a lot of work to get it right.
[9] Static linking and sandboxing are not so great for sharing code.
[10] Choose “Report a Problem” from the bookmarks. It has feature categories, screenshots with annotations, and even videos!
[11] For employees it is activated with a “rage shake”.
[12] Note that the app could encrypt these if it doesn’t trust Apple with the values.