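An assignment for Georgia Tech's Data and Visual Analytics course. This one uses the D3.js library to draw seven kinds of charts from data, in seven different scenarios. The workload was huge; it took me about a week, working on and off.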
Q1. Designing a good table. Visualizing data with Tableau
Imagine you are a data scientist working with the United Nations High
Commissioner for Refugees (UNHCR) and need to perform the following tasks to
aid UNHCR’s understanding of persons of concern.
Table
Create a table to display the details of the refugees (Total Population) in
the year 2005 from the data provided in unhcr_persons_of_concern.csv. You can
use any tool (e.g., Excel, HTML) to create the table. Keep suggestions from
class in mind when designing your table (see lectures slides for what to and
what not to do, but you are not limited to the techniques described). Describe
your reason for choosing the techniques you use in explanation.txt in no more
than 50 words.
Tableau
Visualize the demographic attributes (age, sex, country of origin, asylum
seeking country) in the file unhcr_popstats_demographics.csv (in the folder
Q1) for any given year in one chart. Tableau is a popular InfoViz tool and the
company has provided us with student licenses. Go to this link and select “Get
Started”. On the form, enter your Georgia Tech email address for “Business
email” and “Georgia Institute of Technology” for “Organization”. The Desktop
Key for activation is available in TSquare Resources as “Tableau Desktop
Key”. This key is for your use in this course only. Do not share the key with
anyone.
Provide a rationale for your design choices in this step in the file
explanation.txt in no more than 50 words.
Q2. Force-directed graph layout
You will experiment with many aspects of D3 for graph visualization. To help
you get started, we have provided the graph.html file (in the folder Q2).
Adding node labels
Modify the graph.html to show labels to the right of each node in the graph.
If a node is dragged, its label must also move with the node. (You are welcome
to split graph.html into graph.html, graph.js and graph.css.)
Coloring links
Color the links based on the “value” field in the links array. Assign the
following colors:
If the value of the edge is >= 1.5 : assign Blue color to the link.
If the value of the edge is < 1.5 : assign Green color to the link.
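The colouring rule above can be sketched as a small pure function. This is only a minimal sketch: the sample links are made up for illustration, and only the "value" field is taken from the assignment text.

```javascript
// Threshold from the assignment text: value >= 1.5 is blue, anything lower is green.
function linkColor(link) {
  return link.value >= 1.5 ? "blue" : "green";
}

// Hypothetical sample links; the real ones come from the links array in graph.html.
const sampleLinks = [
  { source: 0, target: 1, value: 1.5 },
  { source: 1, target: 2, value: 0.7 },
];

const colors = sampleLinks.map(linkColor);
console.log(colors);
```

In a D3 page a function like this would typically be passed as the stroke style of the link selection, so that each line is coloured from its own datum.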
Scaling node sizes
- Adjust the radius of each node in the graph based on the degree of the node.
- In explanation.txt, using no more than 40 words, discuss which metric (possible metrics: scaling the radii linearly, scale the radii by the square root of the degree, etc.) you have used and explain why you think it is a good choice.
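One possible way to scale node sizes is sketched below, using the square root of the degree so that the circle's area, rather than its radius, grows linearly with the degree. It assumes that each link stores its endpoints as numeric node indices in `source` and `target`, which is the usual convention before the force layout runs; `baseRadius` is a hypothetical parameter.

```javascript
// Count how many links touch each node.
// Assumes link.source and link.target are numeric node indices.
function computeDegrees(nodeCount, links) {
  const degrees = new Array(nodeCount).fill(0);
  for (const link of links) {
    degrees[link.source] += 1;
    degrees[link.target] += 1;
  }
  return degrees;
}

// Square-root scaling: area grows linearly with degree.
// Isolated nodes (degree 0) still get the base radius so they stay visible.
function radiusForDegree(degree, baseRadius = 4) {
  return baseRadius * Math.sqrt(Math.max(degree, 1));
}

// Hypothetical three-node chain: 0 - 1 - 2
const degrees = computeDegrees(3, [
  { source: 0, target: 1 },
  { source: 1, target: 2 },
]);
console.log(degrees.map((d) => radiusForDegree(d)));
```

Linear scaling would be a one-line change (`baseRadius * degree`), but it makes high-degree nodes dominate the picture much faster.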
Pinning nodes (fixing node positions)
- Modify the html so that when you double click a node, it pins the node’s position such that it will not be modified by the graph layout algorithm (note: pinned nodes can still be dragged around by the user but they will remain at their positions otherwise).
- Mark pinned nodes so that they are visually distinguishable from unpinned nodes, e.g., pinned nodes shown with a different color, or border thickness, or visually annotated with a “star” (*), etc.
- Double clicking a pinned node should unpin (unfreeze) its position and unmark it.
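The pin/unpin toggle can be sketched independently of the drawing code. Which property the layout honours depends on the D3 version: `fixed` in v3, `fx`/`fy` in v4 and later, so this sketch sets both. The `pinned` flag and the stroke colours are assumptions introduced for illustration.

```javascript
// Toggle the pinned state of a node datum (intended for a double-click handler).
function togglePin(node) {
  node.pinned = !node.pinned; // hypothetical flag, used only for styling
  if (node.pinned) {
    // D3 v4+: a non-null fx/fy freezes the node at that position.
    node.fx = node.x;
    node.fy = node.y;
  } else {
    node.fx = null;
    node.fy = null;
  }
  // D3 v3: the force layout skips nodes whose `fixed` property is truthy.
  node.fixed = node.pinned;
  return node;
}

// Visual mark for pinned nodes; the colours are arbitrary choices.
function nodeStroke(node) {
  return node.pinned ? "red" : "white";
}

const sampleNode = { x: 10, y: 20 };
togglePin(sampleNode); // pinned at (10, 20)
console.log(sampleNode.fx, sampleNode.fy, nodeStroke(sampleNode));
```

Calling `togglePin` a second time on the same node unpins it and restores the unmarked style, which covers the last bullet above.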
Q3. Visualizing scatter plots
Use the dataset provided in the file iris.tsv (in the folder Q3) to create a
scatterplot.
Features/ Attributes in the dataset:
- Sepal length in cm
- Sepal width in cm
- Petal length in cm
- Petal width in cm
- Class: Iris Setosa, Iris Versicolor, Iris Virginica
Creating scatter plots
- Create two scatter plots, one for each feature combination specified below. In the scatter plots, visualize the different classes using different symbols (circle for setosa, square for versicolor and triangle for virginica) and add a legend showing how symbols map to the classes
- Features 1 and 2
- Features 3 and 4
- In explanation.txt, using no more than 40 words, discuss which plot is better at separating the classes and why.
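The class-to-symbol mapping and the legend can share one lookup table, as in the sketch below. The exact class strings in iris.tsv are not shown here, so the lookup matches on the species name case-insensitively; that matching rule is an assumption.

```javascript
// Symbol assignment from the assignment text.
const SYMBOL_BY_SPECIES = {
  setosa: "circle",
  versicolor: "square",
  virginica: "triangle",
};

// Find the symbol for a class label such as "Iris Setosa" or "Iris-setosa".
function symbolForClass(className) {
  const lower = className.toLowerCase();
  const species = Object.keys(SYMBOL_BY_SPECIES).find((s) => lower.includes(s));
  if (!species) {
    throw new Error("Unknown iris class: " + className);
  }
  return SYMBOL_BY_SPECIES[species];
}

// One legend row per class, built from the same table so plot and legend cannot drift apart.
function legendEntries() {
  return Object.entries(SYMBOL_BY_SPECIES).map(([species, symbol]) => ({
    label: "Iris " + species,
    symbol,
  }));
}

console.log(symbolForClass("Iris Versicolor"), legendEntries());
```

The symbol names would still need translating to whatever symbol types the D3 version in use provides.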
Scatter plots should be placed one after the other in an html page as shown in
the reference below. Please note that your design need not be identical to the
given reference.
Based on the scatter plot created for features 1 and 2 (Sepal Length vs Sepal
Width), create new plots for the following questions:
- Scaling symbol sizes. Set the size of each symbol in the plot to be proportional to the square root of the length parameter. Create a new plot for this part.
- Axis Scales in D3. Create two plots for this part to try out two axis scales in D3, one for using the square root scale (applied to both axes) and another for using the log scale (also applied to both axes). Explain in no more than 40 words which scale works best for this dataset in explanation.txt.
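To see what the two axis scales do to the data, here is a dependency-free sketch of the mapping that a square-root scale and a log scale perform. It is not D3's own implementation, only the same arithmetic: transform the value, then interpolate linearly between the range endpoints. The domains and pixel ranges below are made-up numbers.

```javascript
// Build a scale that applies `transform` before interpolating domain -> range.
function makeScale(transform, domain, range) {
  const t0 = transform(domain[0]);
  const t1 = transform(domain[1]);
  return (x) => {
    const fraction = (transform(x) - t0) / (t1 - t0);
    return range[0] + fraction * (range[1] - range[0]);
  };
}

// Square-root scale: compresses large values mildly.
const sqrtScale = makeScale(Math.sqrt, [0, 4], [0, 100]);

// Log scale: the domain must be strictly positive, because log(0) is -Infinity.
const logScale = makeScale(Math.log, [1, 100], [0, 200]);

// Symbol size proportional to the square root of a length value.
// `factor` is a hypothetical constant chosen to make symbols readable.
function symbolSize(length, factor = 20) {
  return factor * Math.sqrt(length);
}

console.log(sqrtScale(1), logScale(10), symbolSize(4));
```

Under the square-root scale the value 1 lands halfway along a 0 to 4 axis, and under the log scale 10 lands halfway between 1 and 100, which is why both scales spread out small values and squeeze large ones.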
Q4. Visualizing heat map
Use the dataset provided in hourly_heatmap.json (in the folder Q4) that
describes glucose readings over time, and visualize it using D3 heatmaps. To
get started, refer to the heatmap example here.
- Plot the glucose readings against the time of the day (Hint: Use the glucose readings as a “z” parameter in the given example)
- Now use the file day_heatmap.json (in the folder Q4) to plot the glucose readings against the day of the week on the heatmap. Use the day names instead of numbers as the tick labels on the axis, e.g., day 1 being Monday.
- A pattern should emerge from the visualizations. Explain the pattern and why it occurs, using no more than 40 words in explanation.txt.
Please note that there will be two heat maps, one for part i and the other for
part ii. Place them one after the other on an html page (the one for part i
goes first).
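Two small helpers cover the parts of the heat map that do not depend on the drawing code: turning day numbers into tick labels, and assigning a reading to a colour bucket. The sketch assumes days are numbered 1 to 7 with 1 being Monday, as stated above; the bucket count and the glucose range used in the example call are made-up values.

```javascript
const DAY_NAMES = [
  "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday",
];

// Tick label for a day number, with day 1 being Monday.
function dayName(dayNumber) {
  if (!Number.isInteger(dayNumber) || dayNumber < 1 || dayNumber > 7) {
    throw new Error("Day number must be an integer from 1 to 7: " + dayNumber);
  }
  return DAY_NAMES[dayNumber - 1];
}

// Split [min, max] into equal-width buckets and return the bucket index of `value`.
// The index can then be used to look up a colour in a palette of `bucketCount` colours.
function colorBucket(value, min, max, bucketCount) {
  const fraction = (value - min) / (max - min);
  const index = Math.floor(fraction * bucketCount);
  // Clamp so that value === max falls into the last bucket, not past it.
  return Math.min(bucketCount - 1, Math.max(0, index));
}

console.log(dayName(1), colorBucket(120, 60, 260, 5));
```

Equal-width buckets are only one option; a quantile-based split gives more even colour usage when the readings are skewed.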
Q5. Sankey Chart
Formula One racing is a championship sport in which race drivers represent
teams to compete for points over several races (also called Grand Prix) in a
season. The team with the most points at the end of a season wins the
prestigious Formula One World Constructors’ Championship award. You will
visualize the flow of points for the races held in this season up to September
2016. The drivers win points according to their final standing in each race,
which finally get added to their respective team’s total.
- Create a Sankey Chart using the datasets provided (races.csv and teams.csv) in the Q5 folder. The chart should visualize the flow of points in the order:
race → driver → team
You may refer to this example to create the chart (sankey.js is provided in
the lib folder). You can keep the blocks’ vertical positions static. Your
chart should look like the example Sankey Chart for the 2015 season as shown
in Figure 5.
Hint: For this part, you will have to read in the csv files and combine the
data into a format that can be passed to the sankey library. To accomplish
this, you may find the following JavaScript functions useful: d3.nest(),
array.filter(), array.map()
- Use the d3-tip library to add tooltips as shown in Figure 5 (you can make your own visual style choices using css properties).
- From the visualization you have created, determine the following:
- Which team has the best current standing?
- Which driver has the most points currently?
- Which driver won the Monaco Grand Prix?
- Which two drivers switched their teams mid-season?
Put your answers in observations.txt.
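The data-combining step mentioned in the hint might look like the sketch below. The column names are assumptions, since the real headers of races.csv and teams.csv are not shown here: race rows are taken to be `{race, driver, points}` and team rows `{driver, team, points}`. The output shape, a `nodes` array of `{name}` objects plus `links` whose `source` and `target` are indices into it, follows the classic Sankey example data.

```javascript
// Combine race -> driver and driver -> team rows into one nodes/links structure.
function buildSankeyGraph(raceRows, teamRows) {
  const nodes = [];
  const indexByName = new Map();
  const links = [];

  // Return the index of the node called `name`, creating the node on first use.
  function nodeIndex(name) {
    if (!indexByName.has(name)) {
      indexByName.set(name, nodes.length);
      nodes.push({ name });
    }
    return indexByName.get(name);
  }

  // Skip zero-point flows so they create neither links nor orphan nodes.
  function addLink(sourceName, targetName, points) {
    const value = Number(points); // CSV fields arrive as strings
    if (value > 0) {
      links.push({ source: nodeIndex(sourceName), target: nodeIndex(targetName), value });
    }
  }

  raceRows.forEach((row) => addLink(row.race, row.driver, row.points));
  teamRows.forEach((row) => addLink(row.driver, row.team, row.points));
  return { nodes, links };
}

// Hypothetical rows, not taken from the real dataset.
const graph = buildSankeyGraph(
  [
    { race: "Race 1", driver: "Driver A", points: "25" },
    { race: "Race 1", driver: "Driver B", points: "18" },
  ],
  [
    { driver: "Driver A", team: "Team X", points: "25" },
    { driver: "Driver B", team: "Team X", points: "18" },
  ]
);
console.log(graph.nodes.length, graph.links.length);
```

Nodes are keyed by name alone, so this sketch would merge a race and a team that happened to share a name; prefixing names by column would avoid that.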
Q6. Interactive visualization
Mr. Fluke runs a small company named FooBar. His company manufactures eight
products around the year. He wants you to create an interactive visualization
report using D3 so that he can see the total revenue generated per product
type and the revenue breakdown across product types for the four quarters in
2015. Use the dataset provided in the Q6 folder. Integrate the dataset
provided in dataset.txt directly in an array variable in the script.
- Create a horizontal bar chart with its vertical axis denoting the product names and its horizontal axis denoting the total revenue. Each bar should have the total revenue amount in dollars labelled inside it. See Figure 6 for an example.
- Create a legend with three columns.
- Column 1: quarter labels: Q1, Q2, Q3, Q4
- Column 2: initialized with each quarter’s total revenue (e.g., Q1’s value is initialized as the sum of all products’ revenues in Q1)
- Column 3: presents the percentage share of each value in Column 2
- While hovering over any bar, the second and third columns in the legend should update to show the revenue generated (in value and percentage share, respectively) for each quarter of the selected product. For example, when hovering over Product C’s bar, the second and third columns in the legend should update to show Product C’s revenues in the four quarters and those revenues’ percentage shares. See Figure 6 for an example.
Note:
- The vertical axis of the chart should use product names as labels.
- On hovering over any horizontal bar, the color of the bar should change. You can use any color that is visually distinct from the regular bars.
- The legend should reset to the initial values on mouseout (i.e., when the mouse leaves a bar).
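The numbers behind the bar chart and the legend can be computed with two small functions, sketched below. The row shape `{product, Q1, Q2, Q3, Q4}` is an assumption about how dataset.txt might be embedded as an array, and the revenue figures are made up.

```javascript
const QUARTERS = ["Q1", "Q2", "Q3", "Q4"];

// Total revenue of one product: the length of its bar.
function productTotal(row) {
  return QUARTERS.reduce((sum, quarter) => sum + row[quarter], 0);
}

// Legend rows (label, revenue, percentage share) for any set of product rows.
// Pass all rows for the initial legend, or a single-row array for the hovered product.
function legendRows(rows) {
  const revenues = QUARTERS.map((quarter) =>
    rows.reduce((sum, row) => sum + row[quarter], 0)
  );
  const total = revenues.reduce((sum, value) => sum + value, 0);
  return QUARTERS.map((quarter, i) => ({
    quarter,
    revenue: revenues[i],
    share: total === 0 ? 0 : (revenues[i] / total) * 100,
  }));
}

// Hypothetical data, not taken from dataset.txt.
const products = [
  { product: "Product A", Q1: 10, Q2: 20, Q3: 30, Q4: 40 },
  { product: "Product B", Q1: 10, Q2: 0, Q3: 0, Q4: 0 },
];

const initialLegend = legendRows(products);       // state on load and on mouseout
const hoverLegend = legendRows([products[0]]);    // state while hovering Product A
console.log(productTotal(products[0]), initialLegend, hoverLegend);
```

Because the same function produces both legend states, the mouseout handler only has to re-render with the full array to reset the legend.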
Q7. Visualizing college scorecard data
This is a free-form question. We want you to apply the D3 knowledge that you
have gained to assist decision making for a real-world problem: help students
make college decisions.
Using D3, construct a visualization using the college scorecard dataset
(located in the Q7 folder) which contains statistics about colleges (e.g.,
affordability, value).
Create one large visualization or multiple small ones using the entire dataset
or a subset of it. If you want, you may also use the Bootstrap library, which
is a popular framework used in frontend development, to organize your
dashboard. We recommend Bootstrap because many student teams in previous
semesters had good experiences using it for their projects. Place the
Bootstrap library files in the lib/bootstrap folder. The visualization does
not need to support any interactions.
- Points will be awarded for usability, functionality, and creativity.
- Summarize your main ideas behind the visualization in explanation.txt in no more than 50 words.