Creating an Advanced 3D Scatter Plot with Python and Plotly

Plotly’s 3D scatter plots let us visualize complex, multi-dimensional data interactively.

We’ll build a detailed 3D scatter plot with color-coded categories, variable marker sizes, custom hover labels, and titled axes, all intended to make the data easier to read and explore.

We’ll use synthetic data with three coordinates (X, Y, and Z), one per axis of the 3D space, plus two additional features that determine each point’s color and size.

Step-by-Step Explanation and Code

  1. Generate Data:
    We’ll create synthetic data using NumPy to represent our 3D points, with values for each axis (X, Y, Z) and two additional variables (category for color and size for marker size).

  2. Set Up the Plotly 3D Scatter Plot:
    Using Plotly’s go.Scatter3d, we’ll customize our 3D plot to visualize:

    • Point Colors: Categorical values will be represented by different colors.
    • Point Sizes: A continuous variable will determine the size of each point.
    • Axis Titles: Labels for each axis to provide clear context.
  3. Customize Layout:
    We’ll adjust the layout to set the plot title, the scene background, and the figure dimensions.

Here’s the Python code:

import plotly.graph_objects as go
import numpy as np

# Generate synthetic data
np.random.seed(42)
n_points = 200
x = np.random.uniform(0, 100, n_points)  # X values
y = np.random.uniform(0, 100, n_points)  # Y values
z = np.random.uniform(0, 100, n_points)  # Z values
category = np.random.choice(['Category A', 'Category B', 'Category C'], n_points)  # Color by category
size = np.random.uniform(5, 20, n_points)  # Size variable for marker size

# Map categories to colors
color_map = {'Category A': 'red', 'Category B': 'blue', 'Category C': 'green'}
colors = [color_map[cat] for cat in category]

# Create the 3D scatter plot
fig = go.Figure(
    data=[
        go.Scatter3d(
            x=x,
            y=y,
            z=z,
            mode='markers',
            marker=dict(
                size=size,
                color=colors,
                opacity=0.8,
                line=dict(width=1, color='DarkSlateGrey')
            ),
            # One hover string per point; hoverinfo='text' shows only this string
            text=[f"X: {x_val:.2f}, Y: {y_val:.2f}, Z: {z_val:.2f}, Size: {size_val:.2f}"
                  for x_val, y_val, z_val, size_val in zip(x, y, z, size)],
            hoverinfo='text'
        )
    ]
)

# Customize the layout
fig.update_layout(
    title="3D Scatter Plot with Color and Size Encoding",
    scene=dict(
        xaxis=dict(title="X Axis (e.g., Feature 1)"),
        yaxis=dict(title="Y Axis (e.g., Feature 2)"),
        zaxis=dict(title="Z Axis (e.g., Feature 3)"),
        bgcolor="rgba(240,240,240,0.95)"
    ),
    width=800,
    height=600,
    showlegend=False
)

fig.show()
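
The size array in the code above is already drawn from a marker-friendly range (5 to 20). With real data, the size variable is often on an arbitrary scale, so it may need rescaling before it is passed to marker.size. The following is a minimal sketch of one way to do that with a linear min-max rescale. The helper name scale_marker_sizes, its default bounds, and the sample numbers are assumptions made for illustration and are not part of the original example.

```python
import numpy as np


def scale_marker_sizes(values, min_size=5.0, max_size=20.0):
    """Linearly rescale values into [min_size, max_size] (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # All values are identical: fall back to the midpoint size.
        return np.full(values.shape, (min_size + max_size) / 2.0)
    return min_size + (values - lo) / (hi - lo) * (max_size - min_size)


# Made-up feature values on an arbitrary scale, for illustration only.
raw_feature = np.array([120.0, 450.0, 980.0, 15.0])
marker_sizes = scale_marker_sizes(raw_feature)
```

The resulting array could then be passed as marker=dict(size=marker_sizes, ...) in place of size. A linear rescale keeps the ordering of the values; whether it is the right mapping depends on the data.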

Detailed Explanation

  1. Data Generation:

    • We use NumPy’s np.random.uniform to generate random values between 0 and 100 for the X, Y, and Z coordinates.
      We also define category (drawn at random from Category A, Category B, and Category C) and size (drawn uniformly between 5 and 20), which sets each marker’s size.
    • We assign colors using color_map, where each category has a different color (e.g., red, blue, green).
  2. Creating the Scatter Plot:

    • Scatter3d: go.Scatter3d is the Plotly trace type used to create 3D scatter plots.
      The x, y, and z arguments receive the respective coordinate arrays.
    • Marker Customization:
      • size=size adjusts the size of each marker based on the size variable.
      • color=colors assigns colors based on the category.
      • opacity=0.8 provides transparency, making overlapping points more distinguishable.
      • line=dict(width=1, color='DarkSlateGrey') adds a border to each marker, improving visibility.
    • Hover Information: The text list holds one formatted string per point with its X, Y, Z, and size values, and hoverinfo='text' shows only that string when hovering over a point.
  3. Layout Customization:

    • title provides a clear title for the plot.
    • Axis Titles: Each axis is labeled to describe the corresponding feature.
    • Background Color: scene.bgcolor is set to a light grey, which enhances the visibility of the colored points.
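    • Dimensions: We specify width and height (800 by 600 pixels) for consistent display.
    • Legend: showlegend=False hides the legend, which would not be informative here because all points belong to a single trace.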

Interpretation

This interactive 3D scatter plot allows us to observe:

  • Distribution: The spread of points across the three dimensions can reveal clusters and possible outliers. The synthetic data here is uniformly random, so it shows no real structure, but real data often will.
  • Category Comparison: Colors represent different categories, allowing us to compare distributions between them.
  • Point Emphasis: Size variation highlights differences in another variable, helping us to see patterns across categories or regions of the 3D space.
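
Visual comparison can be backed up with a quick numeric check. The sketch below regenerates the same synthetic arrays and computes, for each category, the point count, the centroid, and the mean marker size. The summary dictionary and its keys are names chosen for this illustration.

```python
import numpy as np

# Same synthetic setup (and same order of random draws) as the main example.
np.random.seed(42)
n_points = 200
x = np.random.uniform(0, 100, n_points)
y = np.random.uniform(0, 100, n_points)
z = np.random.uniform(0, 100, n_points)
category = np.random.choice(['Category A', 'Category B', 'Category C'], n_points)
size = np.random.uniform(5, 20, n_points)

summary = {}
for name in np.unique(category):
    mask = category == name  # points that belong to this category
    summary[str(name)] = {
        'count': int(mask.sum()),
        'centroid': (x[mask].mean(), y[mask].mean(), z[mask].mean()),
        'mean_size': size[mask].mean(),
    }

for name, stats in summary.items():
    cx, cy, cz = stats['centroid']
    print(f"{name}: n={stats['count']}, "
          f"centroid=({cx:.1f}, {cy:.1f}, {cz:.1f}), "
          f"mean size={stats['mean_size']:.1f}")
```

With uniformly random data the centroids should all sit near the middle of the cube; larger differences between categories in real data would be worth a closer look in the plot.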

Output

The result is a 3D scatter plot with a different color for each category, marker sizes that reflect a separate feature, and titled axes.

This interactive view lets you rotate, zoom, and hover over points for detailed insights.

Conclusion

The 3D scatter plot with Plotly provides a powerful way to explore complex, multi-variable relationships in data.

By encoding multiple variables in position, color, and size, we can convey a large amount of information in a single interactive plot that is well suited to data science, research, and presentations.