Supplemental - Design Rationale


Because the research group has very rich datasets and there are many possible directions for our tool, it is important for us to focus on solving the researchers' most pressing pain point. At the same time, the target users of this tool are the legislators, courts, and civil rights groups whom the MGGG group aims to educate and inform, so the design must be simple and effective, without overly complicated interaction techniques that overload the user. After several rounds of brainstorming and discussion among ourselves and with the researchers, we found that the researchers' most urgent need stems from the struggle to understand how the distributions of the different evaluation metrics interact with each other, i.e. the tradeoffs between different constraints.

We use simulated data of 100,000 districting plans generated by a Markov chain sampling technique (provided by MGGG). The project focuses on the states of Virginia and Pennsylvania to show that the relationships between the metrics of interest can differ drastically from state to state, and that modeling therefore needs to be done at the state level. For this phase, we focus only on Virginia and use voting data from the 2017 Attorney General election to estimate the expected House of Representatives results (assuming voters in all eleven districts would vote for the same party as they did in that election). Based on these expected results, we then compute six evaluation metrics for gerrymandering.
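The core computation behind those expected results can be sketched as follows: a districting plan assigns each precinct to one of the eleven districts, and precinct-level votes are aggregated per district to get the expected outcome under that plan. All names and numbers below are synthetic placeholders, not the actual MGGG data or pipeline.

```python
import numpy as np

# Synthetic precinct-level vote counts (assumption: the real data comes
# from the 2017 Virginia Attorney General election)
rng = np.random.default_rng(0)
n_precincts, n_districts = 200, 11
dem_votes = rng.integers(100, 1000, n_precincts)
rep_votes = rng.integers(100, 1000, n_precincts)

# One districting plan = an assignment of each precinct to a district
plan = rng.integers(0, n_districts, n_precincts)

# Aggregate expected votes per district under this plan
dem_by_district = np.bincount(plan, weights=dem_votes, minlength=n_districts)
rep_by_district = np.bincount(plan, weights=rep_votes, minlength=n_districts)

# Expected seat count for one party under the "same-party vote" assumption
expected_seats = int((dem_by_district > rep_by_district).sum())
```

Repeating this aggregation for each of the 100,000 sampled plans yields the per-plan values from which the six metric distributions are built.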

When the user selects a range of interest on one metric, the data are filtered and the distributions of the other five metrics update accordingly, so the user can directly see how constraining one metric affects the others. We also applied transformations (log scale, scaling factors for percentages, etc.) to the metric scales to make the histograms effective.
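The filtering interaction amounts to brushing a range on one metric and recomputing the histograms of the remaining metrics over the filtered subset. A minimal sketch with pandas, using placeholder metric names and synthetic values rather than the tool's actual six metrics:

```python
import numpy as np
import pandas as pd

# Toy ensemble of plans with two metrics (names are placeholders)
rng = np.random.default_rng(1)
plans = pd.DataFrame({
    "compactness": rng.uniform(0.2, 0.8, 10_000),
    "efficiency_gap": rng.normal(0.0, 0.08, 10_000),
})

# Brushing a range on one metric filters the ensemble of plans
selected = plans[plans["compactness"].between(0.5, 0.7)]

# The other metrics' histograms are recomputed on the filtered subset
counts, edges = np.histogram(selected["efficiency_gap"], bins=30)
```

In the actual tool the recomputed counts would be pushed back into the linked histogram views each time the brush moves.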

Some lessons we would like the user to get out of this tool:

  1. Help the user better understand the interactions between different metrics.
  2. Use the tool to observe the impact of changes in certain redistricting criteria on partisan and population balance.
  3. Looking at any particular metric or districting plan by itself is meaningless - this is why we use distributions of metrics across large ensembles of districting plans. What could be meaningful is how much of an outlier a plan is compared to the distribution of plans.
  4. You cannot compare metrics across different states because states differ geographically and demographically by nature.

Alternative options we considered but decided not to go for:

  1. Some geospatial visualization based on a map:

    This was vetoed by the researchers because it is hard to show distributions (different possibilities) on a map. We also realized that this would not help us accomplish any of the goals above. If anything, it further fixates the notion of considering one particular plan instead of looking at the distribution of plans.

  2. A visualization that covers the entire country including all states:

    As we learned from some EDA and research, these metrics and how they interact differ drastically from state to state. So we decided to focus on two states and illustrate those differences instead.

  3. Giving the user the freedom to draw maps and see the resulting metrics:

    This would be a fun tool to have but would not accomplish our goals, because 1) users would not see the distribution of each metric or the feasible region, so they would have nothing to compare their map against; 2) it would be very time-consuming while offering little help to the research group.