A Data Scientist’s Guide to Communicating Results
So your model is finally done running, you’ve tweaked and optimized all of the hyperparameters you could to obtain the best results, and you’re ready to present your findings. Now what?
One of the most important skills for data scientists to have is being able to clearly communicate results so different stakeholders can understand. Since data projects are collaborative across functions and data science results are often incorporated into a larger final project, the true impact of a data scientist’s work depends on how well others can understand their insights to take further action.
Here at Comet.ml, we strive to make this process of communicating both results and the steps leading up to those results easier 👍🏼
In this post, we’ll explore:
- Who might be in your audience
- How to effectively structure your presentation and results based on your audience
- Common errors with communication to watch out for
- How Comet.ml helps with communication
Who is your audience?
Throughout every project, you’ll encounter people with varying levels of technical expertise, buy-in, and business goals. When you present to these different stakeholders, make sure you keep an eye on how your work ties into their role and decisions.
Here are three common audiences:
- Your team manager: probably the first line of review for any work you do or show to other stakeholders. Your manager may or may not be technical, but they certainly will be communicating with other teams/stakeholders.
- Line-of-business (LOB) stakeholders: this could be a product manager, business analyst, or a VP of customer support. Data science is amazing because it enables cross-functional work — just remain aware of how your insights or recommendations influence other teams’ workflows.
- Data engineers/engineering team: don’t forget the team that’s working to deploy champion models! Just because these are more technical stakeholders doesn’t necessarily mean they should not have any business context in the information — often times
Making a targeted presentation
Once you understand your audience, you can begin tailoring an effective and targeted presentation.
With non-technical stakeholders, avoid highly technical terms (e.g. the hyperparameters of your TensorFlow model) and instead frame the machine learning problem in the same terms in which business decisions are made: marginal cost and benefit. Your engineering or DevOps team, however, will need details such as how long the model takes to train and GPU/CPU metrics during training.
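As a rough illustration of framing results in cost and benefit terms, here is a minimal sketch that turns a classifier's confusion-matrix counts into an estimated net benefit. The counts, the dollar values, and the names `ConfusionCounts` and `expected_net_benefit` are all hypothetical and only for illustration; in practice you would agree on the per-outcome values with your business stakeholders.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Outcome counts for a binary classifier on a validation set."""
    true_positives: int
    false_positives: int
    true_negatives: int
    false_negatives: int


def expected_net_benefit(
    counts: ConfusionCounts,
    benefit_per_true_positive: float,
    cost_per_false_positive: float,
    cost_per_false_negative: float,
) -> float:
    """Translate model outcomes into a single business figure.

    Assumes true negatives carry neither a cost nor a benefit,
    which is a simplification you may need to revisit.
    """
    gains = counts.true_positives * benefit_per_true_positive
    losses = (
        counts.false_positives * cost_per_false_positive
        + counts.false_negatives * cost_per_false_negative
    )
    return gains - losses


# Hypothetical numbers, purely for illustration.
counts = ConfusionCounts(
    true_positives=80,
    false_positives=30,
    true_negatives=870,
    false_negatives=20,
)
net = expected_net_benefit(
    counts,
    benefit_per_true_positive=50.0,
    cost_per_false_positive=5.0,
    cost_per_false_negative=40.0,
)
print(f"Estimated net benefit on the validation set: ${net:,.2f}")
```

A single figure like this is usually easier for a non-technical audience to act on than precision and recall, as long as you state the assumptions behind each dollar value.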
The most important thing to recognize is that this should not be the only time these results are communicated. Frequent communication and feedback will help alleviate pressure on the final presentation, increase buy-in for your work, and help ease business stakeholders into technical details.
Here’s a useful starting framework you can use:
- Your understanding of the business problem
- How to measure business impact — what business metrics do your model results align to?
- What data is/was available — if appropriate, reference what data it would be helpful to collect
- The initial solution hypothesis
- The solution/model — use examples and visualizations
- The business impact of the solution and clear action items for stakeholders
Hungry for more? We also recommend reading these great posts that include tips on communication: (1) “Aspiring Data Scientists? Master these fundamentals” from Peter Gleeson and (2) “The Data Science Process: What a data scientist actually does day-to-day” from Raj Bandyopadhyay.
Common errors to watch out for:
While there are a number of statistical and technical errors you can make during your analysis, we’ll focus on some common communication errors you might run into:
- Omitting/glossing over any key assumptions made during the analysis
- Recycling the same presentation for different audiences
- Showing visualizations like charts and tables without reiterating the main idea
- Saving all insights until a final presentation instead of making the process piecemeal and iterative
- Saving the findings until the end of the presentation — make sure to include an executive summary and recommendations at the beginning of the presentation
- Not having back-up materials of different technical levels — an appendix with supporting details is useful for both the actual presentation itself and context if the presentation is shared
- Not asking for feedback after the presentation (either verbally or via email): everyone has a different way of absorbing information, so if you need to adjust, the easiest way to find that out is to ask for feedback!
How Comet.ml helps communicate results
At Comet.ml, we help data scientists and machine learning engineers automatically track their datasets, code, experiments, and results, creating efficiency, visibility, and reproducibility.
With Comet.ml, you can visualize and track all your model results. This is especially helpful for long-running experiments, since you can track the results live. Comet.ml also allows you to:
- Share experiment results via Slack or email
- Add rich documentation of your work in context (with images!)
- Visualize and save sampled predictions such as images and figures
- Match each experiment with a dataset hash
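To make the idea of a dataset hash concrete, here is a minimal sketch using only Python's standard library. It is not Comet.ml's implementation, which this post does not describe; the function name `dataset_fingerprint` and the sample rows are hypothetical. The point is simply that a hash gives every version of a dataset a short, shareable identifier, so anyone reading your results can tell which data produced them.

```python
import hashlib
from typing import Iterable


def dataset_fingerprint(rows: Iterable[str]) -> str:
    """Return a SHA-256 hex digest of the rows, in the order given.

    Illustrative only: a real pipeline would also need to decide
    how to treat row order, column order, and file encoding.
    """
    digest = hashlib.sha256()
    for row in rows:
        digest.update(row.encode("utf-8"))
        digest.update(b"\n")  # separator so row boundaries affect the hash
    return digest.hexdigest()


# Two hypothetical versions of a training set that differ in one label.
train_v1 = ["id,income,defaulted", "1,52000,0", "2,31000,1"]
train_v2 = ["id,income,defaulted", "1,52000,0", "2,31000,0"]

fp1 = dataset_fingerprint(train_v1)
fp2 = dataset_fingerprint(train_v2)

print("v1:", fp1[:12])
print("v2:", fp2[:12])
print("Same data?", fp1 == fp2)
```

Quoting the fingerprint next to each reported result makes it easy to answer "which data was this trained on?" when a stakeholder asks later.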
Want to see an example? Follow along as our team competes in Kaggle’s Home Credit Default Risk competition in our public project!
Just like you need to iterate on your models to improve their performance, you shouldn’t expect to have perfect presentation skills from the jump. Improving your communication skills requires adapting to your audiences and practicing over time! 👏🏼