#### How an evening with AI transformed my approach to visualizing data

Photo by Simon Abrams on Unsplash

As the sun began to dim and the city lights came to life, the inevitability of a late night in the office settled in. I found myself in a race against time. A crucial sales presentation was less than a day away, and success hinged on an unfulfilled request: an insightful data visualization that could succinctly convey the underlying methodology of our new TV attribution model.

I needed something complex enough to show off, but simple enough to convey most of the information without explanation. Not an easy task with a ticking clock. Thankfully, a beacon of artificial intelligence, my trusty new assistant ChatGPT, sat ready to simplify the task at hand. Its main mission: to help me translate data that was no more than a concept in my head into an eloquent visual narrative, capable of captivating even the most data-averse minds in the room.

This might be a little high drama, but I *was* in a bit of a crunch. A sales deck needed an overly simple example, which I couldn't readily find in our database, so made-up example data it was.

I knew in my head exactly what I wanted to display: a web traffic baseline, a spike in web responses following a TV ad, and some way to convey, in little to no words, that in this example we would attribute the difference between the baseline and the spike.

Matplotlib charts are solid in an academic paper but can require an overly deft hand to produce something both complex and truly flashy. I often turn to Power BI for the out-of-the-box looks and quick charts, but it isn't ideal when you need to formulate something from scratch with no existing data.

I'd been working with Plotly for a while, which seemed like a happy medium. However, I wasn't confident enough with the package to put my thoughts into code without spending precious minutes combing through documentation and Stack Overflow. So, with time short, I started my first prompt.

I'm creating a visualization to display web traffic response following a TV spot airing. Let's start by representing web traffic as a per minute count of site visits over the course of 30 minutes. The first ten minutes need to be a random number between 0 and 4. The second ten minute period needs to be random numbers between 4 and 20, with the numbers distributed in what looks like a right skewed histogram. The third ten minute period is the same as the first ten minute period. Please create that data and chart it for me using a line graph in Plotly.

The prompt returned the following, with my only edits in the update_layout function:

    import numpy as np
    import plotly.graph_objects as go

    # Set seed for reproducibility
    np.random.seed(42)

    # Create arrays with the specified distributions
    first_ten = np.random.uniform(low=0, high=4, size=10)
    last_ten = np.random.uniform(low=0, high=4, size=10)
    middle_ten = np.random.gamma(shape=2, scale=2, size=10) + 4  # shift the gamma distribution

    # Combine the arrays
    measurements = np.concatenate((first_ten, middle_ten, last_ten))

    # Create x values (minutes 1 through 30)
    x_values = np.arange(1, 31)

    # Create the line chart
    fig = go.Figure(
        data=go.Scatter(
            x=x_values,
            y=measurements,
            mode='lines+markers')
    )

    fig.update_layout(
        title='Ad Influence on Web Traffic',
        xaxis_title='Time',
        yaxis_title='Visits per Minute',
        width=1000,
        height=600
    )

    fig.show()

Chart as generated by the prior code block

I was already impressed here, as I wasn't expecting the right-skewed histogram lookalike request to be handled so well. Maybe I'd even be finished before the after-work session down the block was over.
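One caveat is worth a quick sketch: the shifted gamma draw in the code above has no upper limit, so a value above 20 is possible, if rare with these parameters. A minimal standard-library version that enforces the requested 4 to 20 range by redrawing out-of-range values could look like the following. The function name `skewed_visits` and its defaults are illustrative assumptions, not part of the original session.

```python
import random


def skewed_visits(n, low=4.0, high=20.0, shape=2.0, scale=2.0, seed=42):
    """Draw n right-skewed values in [low, high] from a shifted gamma distribution.

    Values above `high` are redrawn (rejection sampling) instead of clipped,
    so the upper bound does not collect an artificial pile of identical values.
    """
    rng = random.Random(seed)  # local generator keeps the draw reproducible
    values = []
    while len(values) < n:
        candidate = low + rng.gammavariate(shape, scale)
        if candidate <= high:
            values.append(candidate)
    return values


spike = skewed_visits(10)
print(min(spike) >= 4.0, max(spike) <= 20.0)
```

For a made-up sales example the difference is cosmetic, but the redraw keeps the data consistent with what the prompt actually asked for.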

Next, I needed something to represent the web traffic baseline. In reality, this could be as simple as a rolling median, but asking for that here wouldn't work with such a short series. Instead, I figured the median of the first and third sets of ten observations would probably convey the right idea.
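To see why a rolling median struggles on a series this short, here is a small sketch using only the standard library. The visit counts, the window of 5, and the helper names `rolling_median` and `pooled_median` are hypothetical, chosen only to illustrate the two options.

```python
from statistics import median


def rolling_median(values, window):
    """Trailing rolling median; positions without a full window yield None."""
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            result.append(None)
        else:
            result.append(median(values[i + 1 - window : i + 1]))
    return result


def pooled_median(*segments):
    """Single median over several quiet segments pooled together."""
    pooled = [v for segment in segments for v in segment]
    return median(pooled)


# Hypothetical per-minute visit counts: quiet, spike, quiet
quiet_before = [1, 3, 2, 0, 4]
spike = [12, 9, 7, 6, 5]
quiet_after = [2, 1, 3, 2, 0]
traffic = quiet_before + spike + quiet_after

print(rolling_median(traffic, window=5))
print(pooled_median(quiet_before, quiet_after))
```

With these made-up numbers the trailing median climbs to 7 while the spike is inside the window, whereas the pooled median of the two quiet periods stays at 2, which is the flat baseline this chart needs.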

Please calculate the combined median of the two arrays first_ten and last_ten. Plot the calculated median as a constant horizontal line.

    import numpy as np
    import plotly.graph_objects as go

    # Set seed for reproducibility
    np.random.seed(42)

    # Create arrays with the specified distributions
    first_ten = np.random.uniform(low=0, high=4, size=10)
    last_ten = np.random.uniform(low=0, high=4, size=10)
    middle_ten = np.random.gamma(shape=2, scale=2, size=10) + 4  # shift the gamma distribution

    # Combine the arrays
    measurements = np.concatenate((first_ten, middle_ten, last_ten))

    # Create x values (minutes 1 through 30)
    x_values = np.arange(1, 31)

    # Calculate combined median of the two quiet periods
    combined_median = np.median(np.concatenate((first_ten, last_ten)))

    # Create the line chart
    fig = go.Figure(
        data=go.Scatter(
            x=x_values,
            y=measurements,
            mode='lines+markers')
    )

    # Add median line
    fig.add_trace(
        go.Scatter(
            x=x_values,
            y=[combined_median]*len(x_values),
            mode='lines',
            name='Baseline',
            line=dict(color='firebrick', width=2, dash='dash'))
    )

    fig.update_layout(
        title='Ad Influence on Web Traffic',
        xaxis_title='Time',
        yaxis_title='Visits per Minute',
        width=1000,
        height=600
    )

    fig.show()

I then just changed the name of the median line trace from "Median" to "Baseline".

Chart from the above code block, which added the baseline constant line

Now the idea of "lift" comes through if you're familiar with marketing models. For the untrained eye, however, this needs some more focused elements; the less that needs explaining, the more effective a visual is. The main things missing at this point were a chart element that implies the ad caused the spike in web traffic, and a way to convey the volume of traffic attributed to the ad.
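The attributed volume itself is simple arithmetic: the visits above baseline, summed over the minutes that follow the ad. A minimal sketch, assuming a flat baseline and made-up counts (the helper `attributed_lift` and its arguments are illustrative, not taken from the attribution model itself):

```python
def attributed_lift(traffic, baseline, start, end):
    """Sum of visits above baseline for minutes start..end (1-indexed, inclusive).

    Minutes where traffic dips below the baseline contribute zero rather
    than subtracting from the total.
    """
    window = traffic[start - 1 : end]
    return sum(max(0.0, visits - baseline) for visits in window)


# Hypothetical per-minute visits; the ad airs at minute 4
traffic = [2, 1, 3, 12, 9, 7, 5, 2, 1, 2]
baseline = 2
lift = attributed_lift(traffic, baseline, start=4, end=7)
print(lift)
```

Here minutes 4 through 7 sit 10, 7, 5, and 3 visits above the baseline, so 25 visits would be credited to the ad. The chart only has to make that area visible.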

I started with the causality element.

At minute 10 in the data, a TV spot airs and drives a spike in web traffic. What kind of chart element would you suggest to convey not just that the ad ran that minute, but that the ad is the causal factor behind the spike?

The response suggested a vertical line and annotation and provided the code update.

    import numpy as np
    import plotly.graph_objects as go

    # Set seed for reproducibility
    np.random.seed(42)

    # Create arrays with the specified distributions
    first_ten = np.random.uniform(low=0, high=4, size=10)
    last_ten = np.random.uniform(low=0, high=4, size=10)
    middle_ten = np.random.gamma(shape=2, scale=2, size=10) + 4  # shift the gamma distribution

    # Combine the arrays
    measurements = np.concatenate((first_ten, middle_ten, last_ten))

    # Create x values (minutes 1 through 30)
    x_values = np.arange(1, 31)

    # Calculate combined median of the two quiet periods
    combined_median = np.median(np.concatenate((first_ten, last_ten)))

    # Create the line chart
    fig = go.Figure()

    # Add scatter plot
    fig.add_trace(
        go.Scatter(
            x=x_values,
            y=measurements,
            mode='lines+markers',
            name='Web Traffic')
    )

    # Add median line
    fig.add_trace(
        go.Scatter(
            x=x_values,
            y=[combined_median]*len(x_values),
            mode='lines',
            name='Baseline',
            line=dict(color='firebrick', width=2, dash='dash'))
    )

    # Add vertical line at minute 10
    fig.add_shape(
        type='line',
        x0=10, y0=0,
        x1=10, y1=max(measurements),
        line=dict(
            color="Purple",
            width=3,
            dash="dot",
        )
    )

    # Add annotation to indicate TV spot
    fig.add_annotation(
        x=10,
        y=max(measurements),
        text="TV ad aired",
        showarrow=True,
        arrowhead=1,
        ax=-100,
        ay=-25,
        font=dict(
            size=16,
            color="Black"
        ),
    )

    fig.update_layout(
        title='Ad Influence on Web Traffic',
        xaxis_title='Time',
        yaxis_title='Visits per Minute',
        width=1000,
        height=600
    )

    fig.show()

I changed only the font size of the annotation.

Chart from the above code block, which added the TV ad aired constant line and annotation

Now we had a clearly defined cause and point of occurrence for the spike. The chart still needed something to denote the volume of traffic attributed. I already knew what I wanted for this: to shade the area between the baseline and web traffic for the duration of the "lift window" (ten minutes). I went back to my AI colleague to implement this.

I want to display the volume of web traffic above baseline that is attributed to the ad. Please add green shading between the baseline line and web traffic per minute line. Only shade the area between the 10th and 20th minute of the data. Also add a legend item for the shaded area.

    import numpy as np
    import plotly.graph_objects as go

    # Set seed for reproducibility
    np.random.seed(42)

    first_ten = np.random.uniform(low=0, high=4, size=10)
    last_ten = np.random.uniform(low=0, high=4, size=10)
    middle_ten = np.random.gamma(shape=2, scale=2, size=10) + 4  # shift the gamma distribution

    # Combine the arrays
    measurements = np.concatenate((first_ten, middle_ten, last_ten))

    # Create x values (minutes 1 through 30)
    x_values = np.arange(1, 31)

    # Calculate combined median of the two quiet periods
    combined_median = np.median(np.concatenate((first_ten, last_ten)))

    # Create the line chart
    fig = go.Figure()

    # Add scatter plot
    fig.add_trace(
        go.Scatter(
            x=x_values,
            y=measurements,
            mode='lines+markers',
            name='Web Traffic')
    )

    # Add median line
    fig.add_trace(
        go.Scatter(
            x=x_values,
            y=[combined_median]*len(x_values),
            mode='lines',
            name='Baseline',
            line=dict(color='firebrick', width=2, dash='dash'))
    )

    # Add shaded region for the influence of the ad (minutes 10 through 20):
    # left to right along the baseline, then right to left along the traffic line
    fig.add_trace(
        go.Scatter(
            x=list(x_values[9:20]) + list(x_values[9:20])[::-1],
            y=list([combined_median]*11) + list(measurements[9:20])[::-1],
            fill='toself',
            fillcolor='rgba(0,200,0,0.2)',
            line=dict(width=0),
            hoverinfo="skip",
            showlegend=True,
            name='Ad Impact')
    )

    # Add vertical line at minute 10
    fig.add_shape(
        type='line',
        x0=10, y0=0,
        x1=10, y1=max(measurements),
        line=dict(
            color="Purple",
            width=3,
            dash="dot",
        )
    )

    # Add annotation to indicate TV spot
    fig.add_annotation(
        x=10,
        y=max(measurements),
        text="TV ad aired",
        showarrow=True,
        arrowhead=1,
        ax=-100,
        ay=-25,
        font=dict(
            size=16,
            color="Black"
        ),
    )

    fig.update_layout(
        title='Ad Influence on Web Traffic',
        xaxis_title='Time',
        yaxis_title='Visits per Minute',
        width=1000,
        height=600
    )

    fig.show()

Chart from the above code block, which added the shading to represent ad impact
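The x and y lists in the shaded trace are the least obvious part of that code. With `fill='toself'`, Plotly closes the trace into a polygon, so the points need to run along the baseline from left to right and then back along the traffic line from right to left. A small sketch of that ordering, using a hypothetical helper `band_polygon` and toy values:

```python
def band_polygon(x, lower, upper):
    """Return (xs, ys) tracing a closed band between two lines.

    The path runs along `lower` from left to right, then back along
    `upper` from right to left, which is the point order a
    fill='toself' trace needs to enclose the area between them.
    """
    if not (len(x) == len(lower) == len(upper)):
        raise ValueError("x, lower and upper must have the same length")
    xs = list(x) + list(x)[::-1]
    ys = list(lower) + list(upper)[::-1]
    return xs, ys


# Toy values: three minutes of a flat baseline and the traffic above it
minutes = [10, 11, 12]
baseline = [2.0, 2.0, 2.0]
traffic = [3.0, 9.0, 6.0]

xs, ys = band_polygon(minutes, baseline, traffic)
print(xs)  # [10, 11, 12, 12, 11, 10]
print(ys)  # [2.0, 2.0, 2.0, 6.0, 9.0, 3.0]
```

The resulting lists could be passed as `x` and `y` to a `go.Scatter` trace with `fill='toself'`, mirroring the slices used in the generated code.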

I felt like this was getting to a good place, but I still wasn't happy with the "clarity" of what I was trying to convey. I made some additions based on the code I'd already been given.

I added a constant vertical line and annotation to denote the lift window cutoff, and changed the colors of the ad aired and window cutoff lines to green and red, respectively. I also changed the baseline constant line to purple and styled it as dotted to better match the other constant lines.

I then wanted one final element: shading for the baseline region. Wanting to tap the immaculate performance so far, I gave this one to my assistant.

Please add purple shading to the entire baseline region. Add an annotation saying "Baseline" at minute 15, underneath the baseline constant horizontal line.

Note here that I haven't referenced any coordinates or specific numbers for the location of the shading.

    import numpy as np
    import plotly.graph_objects as go

    # Set seed for reproducibility
    np.random.seed(42)

    first_ten = np.random.uniform(low=0, high=4, size=10)
    last_ten = np.random.uniform(low=0, high=4, size=10)
    middle_ten = np.random.gamma(shape=2, scale=2, size=10) + 4  # shift the gamma distribution

    # Combine the arrays
    measurements = np.concatenate((first_ten, middle_ten, last_ten))

    # Create x values (minutes 1 through 30)
    x_values = np.arange(1, 31)

    # Calculate combined median of the two quiet periods
    combined_median = np.median(np.concatenate((first_ten, last_ten)))

    # Create the line chart
    fig = go.Figure()

    # Add baseline region (from zero up to the median, across all 30 minutes)
    fig.add_trace(
        go.Scatter(
            x=list(x_values) + list(x_values)[::-1],
            y=[combined_median]*len(x_values) + [0]*len(x_values),
            fill='toself',
            fillcolor='rgba(128, 0, 128, 0.2)',
            line=dict(width=0),
            hoverinfo="skip",
            showlegend=True,
            name='Baseline')
    )

    # Add scatter plot
    fig.add_trace(
        go.Scatter(
            x=x_values,
            y=measurements,
            mode='lines+markers',
            name='Web Traffic',
            line=dict(color='#636EFA'))
    )

    # Add median line
    fig.add_trace(
        go.Scatter(
            x=x_values,
            y=[combined_median]*len(x_values),
            mode='lines',
            name='Median',
            line=dict(color='Purple', width=2, dash='dash'))
    )

    # Add shaded region for the influence of the ad (minutes 10 through 20)
    fig.add_trace(
        go.Scatter(
            x=list(x_values[9:20]) + list(x_values[9:20])[::-1],
            y=list([combined_median]*11) + list(measurements[9:20])[::-1],
            fill='toself',
            fillcolor='rgba(0,200,0,0.2)',
            line=dict(width=0),
            hoverinfo="skip",
            showlegend=True,
            name='Ad Impact')
    )

    # Add vertical line at minute 10
    fig.add_shape(
        type='line',
        x0=10, y0=0,
        x1=10, y1=max(measurements),
        line=dict(
            color="Green",
            width=3,
            dash="dot",
        )
    )

    # Add annotation to indicate TV spot
    fig.add_annotation(
        x=10,
        y=max(measurements),
        text="TV ad aired",
        showarrow=True,
        arrowhead=1,
        ax=-100,
        ay=-25,
        font=dict(
            size=16,
            color="Green"
        ),
    )

    # Add vertical line at minute 20
    fig.add_shape(
        type='line',
        x0=20, y0=0,
        x1=20, y1=max(measurements),
        line=dict(
            color="Red",
            width=3,
            dash="dot",
        )
    )

    # Add annotation to indicate lift window cutoff
    fig.add_annotation(
        x=20,
        y=max(measurements),
        text="Lift window cutoff",
        showarrow=True,
        arrowhead=1,
        ax=100,
        ay=-25,
        font=dict(
            size=16,
            color="Red"
        ),
    )

    # Add annotation for lift
    fig.add_annotation(
        x=15,
        y=combined_median + (max(measurements[9:20])-combined_median)/4,
        text="Lift due to Ad",
        showarrow=False,
        font=dict(
            size=18,
            color="Green"
        ),
    )

    # Add annotation for baseline
    fig.add_annotation(
        x=15,
        y=combined_median/2,
        text="Baseline",
        showarrow=False,
        font=dict(
            size=18,
            color="Purple"
        ),
    )

    fig.update_layout(
        title='Ad Influence on Web Traffic',
        xaxis_title='Time',
        yaxis_title='Visits per Minute',
        width=1000,
        height=600
    )

    fig.show()

Chart from the above code block, which added baseline shading, a cutoff line, and more annotations

The way it handled the baseline shading request without being given specific values was top-tier, and at this point, I was beyond impressed. Happy with the visualization, I dropped the chart into the presentation, fired off a Teams message, and with a fond look at the ChatGPT tab, closed my laptop. As I was packing up, a notification from the after-work group chat flashed across my phone screen.

Saw the presentation message, thought you said you'd be going late? We'll have a pint waiting for you.

Cheers, ChatGPT.

The Ultimate Visualization Assistant was originally published in Towards Data Science on Medium.