import React from "react";
import { SimpleContentPage } from "../../components";
import { Tabs, Tab } from "carbon-components-react";
import { Link } from "react-router-dom";
import "./CommGuide.scss";

export const CommGuide = () => (
  <SimpleContentPage
    key="CommGuide"
    title="Guidance on Communicating Uncertainty"
  >
    <Tabs type="container" light>
      <Tab label="Overview">
        <h2>Overview: What to consider when choosing communication methods</h2>
        <p>
          Communicating UQ means presenting the uncertainty estimates to
          stakeholders, assuming you have chosen the right UQ algorithm to
          generate the right type of uncertainty estimates (see our{" "}
          <Link to="/resources/guidance">
            guidance on choosing UQ algorithms
          </Link>
          ). This is a crucial step because even high-quality uncertainty
          estimates can be misunderstood if people have difficulty or biases
          in interpreting the numbers or statistics. This guide introduces
          some key considerations for communicating UQ, along with example
          methods. In practice, you should test with your target users to
          make sure the chosen UQ communication method is understood
          correctly.
        </p>
        <p>
          Let's start with a few key questions that should guide your choice of
          UQ communication methods. You can navigate to the tabs above to see
          example methods for communicating the UQ of regression and
          classification models.
        </p>
        <h3>What is the form of the UQ?</h3>
        <p>
          The first step is to identify the form of the UQ to be communicated,
          i.e., whether it is a single confidence score, or a range or
          distribution of possible predicted outcomes. In current ML tasks, the
          former is how uncertainty estimates of a classification model often
          appear, and the latter is how uncertainty estimates of a regression
          model often appear.
        </p>
        <p>
          Note that here we focus on the form rather than the source of UQ. For
          example, for a regression model, UQ from different sources, whether
          data uncertainty, model uncertainty, or overall predictive
          uncertainty, can all be communicated as ranges of possible outcomes,
          but users may perceive them or act on them differently.
        </p>
        <h3>Communicating UQ of a single instance or a group of instances?</h3>
        <p>
          The next question is whether you would like to present the UQ of a
          single instance or a group of instances. End users are often
          interested in the uncertainty estimate of a particular prediction in
          order to assess and act on the prediction. In this guide we focus on
          different UQ communication methods in the case of a single instance.
        </p>
        <p>
          Sometimes a user or a data scientist might be interested in
          investigating the UQ of a group of instances, for example instances
          grouped by feature values, or in what kinds of instances get high or
          low uncertainty estimates from the model. As shown in our{" "}
          <Link to="/demo/2/decision">interactive demo</Link>, this can often be
          achieved either by plotting the uncertainty estimates of instances
          against different feature values, or by presenting the uncertainty
          estimates aggregated over the groups of interest (e.g., mean
          uncertainty estimates).
        </p>

        <h3>How precise does the UQ need to be?</h3>
        <p>
          UQ can be shown with different levels of precision. For example,
          describing the UQ of a classification model using pre-defined
          categories, such as low/medium/high confidence, is less precise than
          showing numerical values of confidence. Providing a numerical range
          for UQ of a regression model is less precise than showing the
          probability distribution of possible values.
        </p>
        <p>
          In general, high-precision communication methods can help ML experts
          understand the full scale of the model's uncertainty, while
          low-precision methods are often preferred by lay people. Again, it is
          necessary to design and test the communication methods with your
          target stakeholders to identify the appropriate precision level.
        </p>
        <h3>What medium to use?</h3>
        <p>
          Another question, which often correlates with the choice of precision
          level, is what medium to use to communicate UQ: verbal, numerical, or
          visual expressions. Verbal expressions often pair well with
          low-precision communication methods, allowing stakeholders to consume
          the UQ information easily. Visualization is often a good way to
          communicate high-precision, probabilistic UQ information. Sometimes
          the choice of medium is predetermined by the interface or user
          workflow, and you have to design the content accordingly.
        </p>
      </Tab>
      <Tab label="Communicate for Regression" id="longTab">
        <h2>
          Communicating UQ as a range or distribution of possible outcomes for
          regression models
        </h2>
        <h3>Prediction interval range</h3>
        <p>
          For regression models, people often choose to communicate the range
          within which the predicted outcome is expected to fall with a
          pre-specified probability, known as the confidence level (e.g., 95%).
          You may have seen a prediction interval or error bar used to
          communicate this range visually. It is easy to understand and does
          not require readers to reason about probabilities directly, and is
          thus often preferred when communicating UQ to lay people or audiences
          with relatively low numeracy.
        </p>
        <p>
          Depending on the interfaces and user workflow, it is not always
          desirable to present a visual plot. You can also communicate the
          prediction interval range in a verbal expression, such as below.
        </p>
        <div>
          <div className="price" style={{ width: "30em" }}>
            <h2>Recommended price:</h2>
            <h2 style={{ color: "darkgreen" }}>478K</h2>
            <h5>
              margin of error with 95% confidence:{" "}
              <span style={{ color: "darkgreen" }}>+/- 20K</span>.
            </h5>
          </div>
          <p style={{ fontStyle: "italic" }}>
            Communicating the prediction interval in a verbal form
          </p>
        </div>
        <p>
          The downside of an interval range is that stakeholders don't get to
          see the distribution details, so they could wrongly assume that
          possible outcomes are equally likely across the interval range, which
          is often not the case. Also, not everyone is familiar with the concept
          of a prediction interval or understands that it should be read with a
          specified confidence level, so additional explanations may be
          necessary.
        </p>
        <p>
          Another benefit of a visual interval range is that it is simple
          enough to be combined with other visual encodings to communicate the
          UQ for a group of predictions. For example, it is common to show an
          uncertainty band on a line chart for a group of predictions. The plot
          below, from our <Link to="/demo/2/improve">interactive demo</Link>,
          illustrates the prediction intervals for instances with different
          values for the feature “number of rooms (RM)”.
        </p>
        <figure>
          <img
            src="/imgs/uncertainty_before_adding_feature.png"
            alt="Communicating the prediction interval for a group of predictions with uncertainty bands"
          ></img>
          <figcaption className="caption">
            Communicating the prediction interval for a group of predictions
            with uncertainty bands
          </figcaption>
        </figure>
        <h3>Visualizing the distribution</h3>
        <p>
          Sometimes you may want to show detailed UQ information of how the
          possible predicted outcomes are distributed. This would help
          stakeholders understand that the possible outcomes are not equally
          distributed across the range and therefore better assess the
          uncertainty. There are a few popular methods to visualize a
          probability distribution, including{" "}
          <a
            href="https://uq360.readthedocs.io/en/latest/regression_metrics.html#uq360.metrics.regression_metrics.plot_uncertainty_distribution"
            target="_blank"
            rel="noopener noreferrer"
          >
            density plot
          </a>
          , violin plot and gradient plot. You can often get them from standard
          visualization libraries. It is worth noting that uncertainty estimates
          do not always follow a symmetrical Gaussian distribution centered
          around the predicted value. In that case, it is even more important to
          show the distribution details to alert the stakeholders that the
          uncertainty estimates are skewed or have multiple peaks.
        </p>
        <figure>
          <img
            src="/imgs/high_uncertainty_density.png"
            alt="Density plot to visualize the distribution of UQ"
          ></img>
          <figcaption className="caption">
            Density plot to visualize the distribution of UQ
          </figcaption>
        </figure>
        <p>
          A useful variation on density plots, developed in recent{" "}
          <a
            href="http://dx.doi.org/10.1145/2858036.2858558"
            target="_blank"
            rel="noopener noreferrer"
          >
            human-computer interaction research
          </a>
          , is the quantile dot plot. A critical drawback of a traditional
          density plot is that people, especially those not trained to read
          data visualizations, may have trouble correctly assessing the
          relative likelihood of different possible values. A quantile dot plot
          mitigates this problem by using stacked dots to visually represent the
          approximate frequencies of different values. For example, in the plot
          below, it is easy to see that 12 out of 20 times the prices fall below
          480K.
        </p>
        <figure>
          <img
            src="/imgs/high_uncertainty_quantile.png"
            alt="Quantile dot plot"
          ></img>
          <figcaption className="caption">Quantile dot plot</figcaption>
        </figure>
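        <p>
          As a rough illustration of how the dots in such a plot can be
          derived, the minimal JavaScript sketch below picks equally spaced
          quantiles from samples of the predicted outcome. The function names
          and the default of 20 dots are assumptions made for this example;
          they are not part of UQ360.
        </p>
        <pre>
          <code>{`// Hypothetical helper (not a UQ360 function): choose nDots values
// that each stand for an equal share of the predictive distribution.
function quantileDots(samples, nDots = 20) {
  const sorted = [...samples].sort((a, b) => a - b);
  const dots = [];
  for (let i = 0; i < nDots; i++) {
    // Midpoint of the i-th equal-probability slice of the distribution.
    const p = (i + 0.5) / nDots;
    const index = Math.min(sorted.length - 1, Math.floor(p * sorted.length));
    dots.push(sorted[index]);
  }
  return dots;
}

// With 20 dots, each dot stands for a 1 in 20 chance, so counting the
// dots below a value gives a frequency such as "12 out of 20".
function countBelow(dots, threshold) {
  return dots.filter((d) => d < threshold).length;
}`}</code>
        </pre>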

        <p>
          When communicating the uncertainty of a group of data, it is possible
          to overlay these visual distributions on other visual encodings. For
          example, a fan chart is a common way to encode gradient plots on a
          line chart for a group of predictions.
        </p>
        <figure>
          <img src="/imgs/fanchart.png" alt="An example of fan charts"></img>
          <figcaption className="caption">Fan chart</figcaption>
        </figure>
        <p>
          For regression tasks, the UQ360 Python library provides functions to
          generate numerical ranges, visual intervals, density plots, and
          quantile dot plots. You can see examples in the{" "}
          <Link to="/resources/tutorials">tutorials</Link>.
        </p>
      </Tab>

      <Tab label="Communicate for Classification" id="longTab">
        <h2>
          Communicating UQ as a confidence score for classification models
        </h2>
        <p>
          In the case of a classification model, the uncertainty estimate of a
          prediction is typically a single score. You have a few choices to
          communicate the score.
        </p>
        <h3>Ordered categories</h3>
        <p>
          For lay people, a numerical confidence score is often hardly
          actionable. One may have trouble judging, say, whether an 85%
          confidence level is good enough to rely on the model's prediction. The
          answer depends heavily on the task domain and the stakes. Using a
          pre-defined, ordered categorization, such as high, moderate, low and
          very low confidence, could efficiently guide people in making such a
          judgment.
        </p>
        <p>
          The main downside of using ordered categories is a lack of precision.
          Also, the mapping from a confidence score to a category, i.e., the
          threshold criteria, should be carefully designed based on the task and
          the target stakeholders, taking into account the stakes of the actions
          that follow from accepting or rejecting a model prediction, and their
          risk tolerance. For example, a 90% confidence score could be seen as
          high confidence for a movie recommender, but might not be so for
          medical imaging AI used for disease diagnosis.
        </p>
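        <p>
          The minimal JavaScript sketch below illustrates such task-specific
          thresholds. All threshold values and names here are hypothetical and
          chosen only for illustration; in practice they should come from
          testing with your target stakeholders.
        </p>
        <pre>
          <code>{`// Hypothetical thresholds for illustration only, ordered from high to low.
const MOVIE_THRESHOLDS = [
  { min: 0.9, label: "high" },
  { min: 0.7, label: "moderate" },
  { min: 0.5, label: "low" },
];
const MEDICAL_THRESHOLDS = [
  { min: 0.99, label: "high" },
  { min: 0.9, label: "moderate" },
  { min: 0.75, label: "low" },
];

// Return the label of the first band whose lower bound the score meets.
function confidenceCategory(score, thresholds, fallback = "very low") {
  const band = thresholds.find((t) => score >= t.min);
  return band ? band.label : fallback;
}`}</code>
        </pre>
        <p>
          With these assumed thresholds, a score of 0.9 maps to high confidence
          for the movie recommender but only to moderate confidence for the
          medical task.
        </p>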
        <h3>Confidence score</h3>
        <p>
          Sometimes it is better to present the actual confidence score, if the
          target stakeholders are interested in the precise values or comparing
          the uncertainty estimates of different predictions.
        </p>
        <p>
          With some loss of precision, lay people are often better at
          interpreting a verbal frequency (e.g., the model believes there is a
          9 in 10 chance the prediction is...) than a numerical score.
        </p>
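        <p>
          A minimal JavaScript sketch of this conversion follows, using a
          hypothetical function name. Rounding to a whole count is where the
          precision is lost.
        </p>
        <pre>
          <code>{`// Hypothetical helper: express a confidence score in [0, 1] as "N in D".
function toVerbalFrequency(score, denominator = 10) {
  const count = Math.round(score * denominator);
  return count + " in " + denominator;
}`}</code>
        </pre>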
        <p>
          If it is important to emphasize the existence of uncertainty, you can
          also use a pie chart or icon array to communicate the probabilistic
          nature of the prediction, like below.
        </p>
        <figure>
          <img src="/imgs/iconarray.png" alt="Icon array"></img>
          <figcaption className="caption">Icon array</figcaption>
        </figure>
        <p>
          For classification tasks, the UQ360 Python library provides functions
          to generate confidence scores for relevant UQ algorithms.
        </p>
      </Tab>
    </Tabs>
  </SimpleContentPage>
);
