Basics of SQL (In Plain English)

Introduction

This is for you if you have zero to beginner-level knowledge in the world of Data, specifically SQL.

Perhaps you have come across that term (SQL) multiple times and been a bit curious. However, like many things on this earth, it came with a lot of baggage: maybe you watched a video or came across an article and started to hear and see strange things.

Things like PostgreSQL, TSQL, RDBMS, etc. And you thought to yourself: this is another one of those programming languages reserved for those with glasses and energy drinks. Well, the purpose of this is to “defang” SQL, to bridge the gap from zero knowledge to a basic understanding, and hopefully to pique your interest to study it a bit further.

SQL is actually not too difficult to pick up. If you have done any sort of data manipulation or used formulas in Microsoft Excel, then you already have enough expertise to use SQL. In fact, you have already been using some SQL functions without realizing it.

For this lesson, we’ll focus on the basic SQL words (actually called SQL clauses) used for retrieving data, i.e. SELECT, FROM, WHERE, ORDER BY, and some basic functions, e.g. COUNT.

We won’t be covering joins in this lesson.

Resources

The following links and files would be helpful if you would like to follow along.

BSQEx1_BasicSalesData.xlsx

BSQEx2_BasicStoreData.xlsx

About Me

  • BSc. Electrical Engineering
  • ~5 years of experience in various data-related roles.
  • Interests include Reading, Running, Music (Bass Guitar), Teaching.

What is SQL

SQL stands for Structured Query Language. As the name implies, it is a language, and languages are used for communication, to exchange ideas and information from one mind to another.

SQL is the language for large datasets: it is used to talk to large amounts of data, to retrieve it, change it, and grant or withhold access to it.

Why are there sometimes different prefixes to SQL?

(e.g. TSQL, PostgreSQL, Oracle SQL, MySQL, etc.)

Well, those different prefixes are used to identify different “dialects” of SQL. Let us stick with the language analogy a bit further.

Take the English language: there are general rules of grammar, punctuation, and sentence construction that govern the language across the world. However, in different parts of the world there are some differences in how the language is spoken and used, e.g. British English, American English, Australian English, etc. The general rules are the same, and if you learn English in any one of those countries, you can generally understand or pick up the English spoken in another country quite easily.

Similarly, SQL is like the English language. Various organizations (like countries) adapt the language in a way that is comfortable to them. Hence Microsoft’s dialect of the SQL language is TSQL, Oracle’s is Oracle SQL, etc. They are very similar, but just like you would say “chips” and “boot” in England and “fries” and “trunk” in the USA, in TSQL you would say “SELECT TOP …..” while in PostgreSQL you would say “LIMIT”. In both cases you are communicating the same thing: get me only this number of rows of data (say the first 1000 rows).
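For example, to ask for only the first 1000 rows of a table, the two dialects phrase the same request slightly differently (my_table here is just a placeholder name):

-- TSQL (Microsoft SQL Server)
SELECT TOP 1000 *
FROM my_table;

-- PostgreSQL
SELECT *
FROM my_table
LIMIT 1000;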

SQL Platforms or RDBMS

Have you ever seen a job posting and seen different tools and technologies?

For example, some jobs might require you to be an “expert in Google Big Query” e.g. https://www.linkedin.com/jobs/search/?currentJobId=3537938588&f_E=2%2C3&geoId=101174742&keywords=Data Analyst&location=Canada&refresh=true&start=50

Some might require you to have experience with ‘Microsoft SQL Server’ e.g. https://www.linkedin.com/jobs/search/?currentJobId=3544447417&f_E=2%2C3&geoId=101174742&keywords=Data Analyst&location=Canada&refresh=true&start=25

These are various platforms. Do not be overwhelmed by the number of them out there, or think you need to have used them all when applying for jobs in the data field. A good way to think about this is to see SQL as email technology, and the different platforms as the different email providers out there, e.g. Gmail, Outlook, Yahoo Mail, etc. There are some differences in their interfaces, but the basic concept is the same. Once you use one, you can typically use another without too much of a learning curve.

Similarly, once you use one SQL platform, the interface of another might be slightly different, but have the confidence that you can pick it up quickly and feel free to radiate this confidence in your interviews.

Your First SQL Query (Bridging the Gap from Excel to SQL)

It’s easier than you think. Keep in mind that we’ve spoken about SQL as the language used to talk to large datasets, and if you think about it, data is everywhere. Actually, most of what you will be doing with SQL (if you choose to further your learning) is made up of things you have been doing already.

There are different aspects of the SQL language; here, however, we will focus on the aspects that deal with retrieving data from a table.

For these examples, we’ll journey from the familiar to the unfamiliar, from the known to the unknown.

Example 1:

Feel free to download this very basic Excel file and follow along

BasicSalesData.xlsx

Here is a simple Excel Table (Let us assume this table is called product_Table).

SQL is like Excel on steroids: you can have multiple tables with millions of rows, and you can link and relate those tables to each other.

If you have this table in a database and want to see the full table using a SQL Statement, it is as simple as what you see in the screenshot below.
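In text form, that statement is simply:

SELECT *
FROM product_Table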

Those two lines of text are essentially saying: get me all the columns from that table (product_Table).

Hiding and Showing Columns is essentially a SQL SELECT statement.

If you have ever had an Excel file with some columns you don’t need and you hide those columns, that is essentially a SQL SELECT statement.

For example: if you hide the “Online” column in our Excel Table, that is the equivalent of choosing the specific columns you want to return from your table.
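In SQL, instead of hiding a column, you simply don’t ask for it. Assuming the table’s other columns are named Product and Sales (the exact column names in your file may differ), the equivalent statement would be:

SELECT Product, Sales
FROM product_Table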

Have you ever sorted your data in Excel? That is an ORDER BY clause

What if we wanted to see the products in our table from the one with the highest sales to the lowest? In Excel, you can sort with a few clicks.

That sorting is done with one line in SQL.
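Using the same assumed column names as above, that one line is the ORDER BY clause at the end of the query:

SELECT Product, Sales, Online
FROM product_Table
ORDER BY Sales DESC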

For more advanced Excel users: Something like a VLOOKUP is pretty similar to a JOIN in SQL.

Have you ever filtered an Excel file? That is a WHERE clause in SQL

Imagine you wanted to see just your online orders, so you filtered that column.

That is a simple WHERE clause in SQL
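Assuming the Online column holds values like 'Yes' and 'No' (your file may store this differently), the query would look like:

SELECT *
FROM product_Table
WHERE Online = 'Yes'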

From this, I hope you can see that Excel and SQL are cousins. And hopefully, now you’re more familiar with SQL (the more intimidating cousin who, it turns out, was not as scary as it seemed).

Example 2:

Feel free to download this very basic Excel file and follow along:

BSQEx2_BasicStoreData.xlsx

The AI facial recognition software that was newly installed at the local grocery store just identified the last few people that walked in. For some reason, you were given this data and asked to help the manager answer some questions for the chance to win a full tank of gas.

Specifically, the manager would like to:

  1. Print out the names of the customers; she wants to know whether those in the store at the moment are her regulars or whether there are any new folks.
  2. Print out the name of the oldest person in the store at that moment; she is in a good mood and would like to offer said person a “lifetime” 10 percent discount.
  3. Determine how many male shoppers are in the store; they might need more help 😀

Name      Gender   Age
Abigail   F        20
Bond      M        45
Carissa   F        40
David     M        19
Eleanor   F        67
Femi      M        36
Gisele    F        29
Harold    M        12
Imani     F        15

Solution:

Let’s assume the table has a name (Table name = Customers), for easy identification.

Excel

These sound like simple enough tasks, and a fair percentage of people are familiar with Excel. If you were to do these tasks in Excel, here is what they would look like:

  1. Hide the Gender and Age columns in the table. Print out the only visible column (i.e. the Name column).
  2. There are many ways to go about this in Excel, but one way would be to activate the filters at the top of each column, sort Age from highest to lowest, and then filter the Age column by checking the highest number in that column. This returns the rows for the customers with that age (in this case, 67).
  3. For the number of male shoppers, you might filter the Gender column by M and count the number of rows. For a larger dataset, you might use the COUNT formula in Excel. NB: in Excel, make sure to copy out the values for ‘M’ to a different set of cells after filtering, and do the count on the numerical column. NB yet again: you can also use a COUNTIF formula to do this without the longer process outlined above, but the aim is to keep this as simple as possible.

SQL

In SQL you can do the exact same things displayed above; however, you do them using typed commands as opposed to pointing and clicking in a graphical interface.

In SQL, the results could look like the following:

  1. The line of code below does exactly the same thing as the Excel example. It looks at the table and returns only the columns you need.

SELECT Name
FROM Customers

  2. The SQL query below returns the same result for the second question. It gets the maximum age, filters the table down to the rows that contain that age, and then retrieves the columns specified in the SELECT statement. (Note that an aggregate like MAX cannot be used directly in a WHERE clause, so it goes in a subquery.)

SELECT Name, Gender, Age
FROM Customers
WHERE Age = (SELECT MAX(Age) FROM Customers)

  3. To return the number of male customers, you could use a query like so:

SELECT COUNT(*)
FROM Customers
WHERE Gender = 'M'

Exercise

You have been hired by a billionaire who loves tea. He aims to find some of the best tea shops in California and buy them, and he gives you the dataset below to work with.

GDAsf_boba_tea_shop_data – San Francisco Boba Tea Shop Location Info (1).csv

Answer the following questions using Excel and SQL.

  1. How many tea shops are in the dataset?
  2. How many tea shops have a rating of 4.5 or greater?
  3. How many tea shops have monthly sales greater than $10,000? Sort them from the highest-selling to the lowest-selling shop.
  4. How many tea shops with a rating lower than 3 have average sales greater than $8,000?
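If you get stuck on the SQL side, here is a rough sketch of what the queries could look like, assuming the CSV has been loaded into a table called tea_shops with columns named rating, monthly_sales and avg_sales (these names are assumptions; check the actual column names in the file and adjust accordingly):

-- 1. How many tea shops are in the dataset?
SELECT COUNT(*)
FROM tea_shops

-- 2. How many have a rating of 4.5 or greater?
SELECT COUNT(*)
FROM tea_shops
WHERE rating >= 4.5

-- 3. Shops with monthly sales greater than $10,000, sorted from highest to lowest
SELECT *
FROM tea_shops
WHERE monthly_sales > 10000
ORDER BY monthly_sales DESC

-- 4. Shops with a rating lower than 3 and average sales greater than $8,000
SELECT COUNT(*)
FROM tea_shops
WHERE rating < 3
  AND avg_sales > 8000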

Data Analysis Perspective

Now that you have answered those questions: it is good practice in industry not just to learn how to use tools, but to know how the use of those tools connects to business value. Essentially, how does what I am doing with these tools help the company save money, make money, save time, improve efficiency, etc.?

The data analysis process from Google (https://www.coursera.org/learn/foundations-data/lecture/JjA1f/six-phases-of-data-analysis) gives me a good framework to aspire to when handling data requests in my day to day job.

Ask

Ask questions about what the person is trying to achieve: what decisions will be made based on the information? Ask questions about the data: where did it come from? Is it complete? Do I have enough data to answer the requester’s question?

Prepare/Process

e.g.

  • Create a data backup
  • Find duplicates (a SQL sketch of this check follows the list)
  • Missing data – look for nulls
  • Ensure data types are consistent – no strings in number columns
  • Split or concatenate columns as needed
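A couple of these checks, finding duplicates and looking for nulls, might look like this in SQL (the table and column names below are placeholders, not from any of the files above):

-- Find rows that share the same value in a key column
SELECT customer_id, COUNT(*) AS occurrences
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 1

-- Look for missing (null) values in a column
SELECT *
FROM orders
WHERE order_date IS NULL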

Analyze

Do the sorting, calculations, and manipulations needed to answer the questions. What story is the data telling? Does it agree with or contradict your initial intuition?

Share

Present the data in a good-looking, clean visualization.

Act

Beyond pretty visuals, provide recommendations based on the results of your analysis.

Are Rich Kids Smarter than Low Income Kids? A Data Story

TLDR: How your kids spend their summers can impact the rest of their lives.

Intro

Chapter 9 of Malcolm Gladwell’s book Outliers presents an interesting problem: kids from upper-class neighborhoods were increasingly getting higher scores on reading comprehension tests than those from lower-class neighborhoods.

This article retells how the root cause of this problem was identified, and I use it as a case study for the importance of Data Visualization in information consumption and decision making.
We’ll journey through 3 datasets to find out the root cause of this problem.

Dataset 1

The question asked in the title of this article is highlighted in the table below.

This table shows the average math and reading test scores (on the California Achievement Test) of kids from the first grade to fifth grade broken down by socio-economic class.

Here is the information contained in this table:

  • From first grade there is a discrepancy in the scores between the kids from upper class backgrounds and those from more humble backgrounds.
    This difference continues to widen, from 31 points in first grade to 73 points by fifth grade.

Why is this? Was it because the kids from upper-class neighborhoods were inherently smarter? Or was it because the school system was failing the poorer kids?

Dataset 2

This table shows the difference in test scores from the start of the school year to the end of the school year.

Here is the information contained in this table:

  • There is not much difference in the scores across socioeconomic class. In actuality, when you look at the total increase in test scores from the start of the school year to the end of the school year, the kids from the poorer backgrounds seem to improve more during the school year (189) than the kids from the upper-class backgrounds (184). If that is the case, why does the gap in the test scores keep widening from 1st grade to 5th grade?

The answer lies in Dataset 3.

Dataset 3

This table shows what happens during the summer holidays.

Here is the information contained here:

  • Over the summer, between the end of the previous school year and the start of the next school year, the upper class kids have substantial increases in test scores over the other kids.

This can be attributed to summer classes, extracurricular activities undertaken during the summer etc. Activities kids from poorer neighborhoods typically can’t afford or are not encouraged to participate in.

Let’s Visualize those Tables

Now let’s look at these 3 datasets in a different way: same information contained within, different way of consuming it.

Dataset 1

Information Contained in Dataset:

  • From first grade there is a discrepancy in the scores between the kids from upper class backgrounds and those from more humble backgrounds.
    This difference continues to widen, from 31 points in first grade to 73 points by fifth grade.

How much easier is it to see the widening difference between the average scores of the upper class kids vs the lower class kids from grade to grade?

Dataset 2

Information Contained in Dataset:

This dataset shows the difference in test scores from the start of the school year to the end of the school year.

Here is the information contained in this dataset:

  • There is not much difference in the scores across socioeconomic class. In actuality, when you look at the total increase in test scores from the start of the school year to the end of the school year, the kids from the poorer backgrounds seem to improve more during the school year (189) than the kids from the upper-class backgrounds (184). If that is the case, why does the gap in the test scores keep widening from 1st grade to 5th grade?

How much easier is it to see how closely the scores track from the start to the end of the school year?

Dataset 3

Here is the information contained here:

  • Over the summer, between the end of the previous school year and the start of the next school year, the upper class kids have substantial increases in test scores over the other kids.

How much easier is it to see the gap in improvement between the upper-class kids and the other kids over the summer?

Takeaways

  1. How kids spend their summers and free time can greatly impact the rest of their lives, including their ability to compete with their peers on the tests that determine the quality of the schools they go to. In scenarios where money is an issue, organizations like Black Boys Code, Black Girls Code, Canada Learning Code, local libraries, etc. provide free and low-cost learning. All that is needed is the time and attention.
  2. It is easier to tell stories with data visualizations as aids. Humans process pictures much more easily than text and tabular data; a combination of the two is usually more compelling. Consider this at your next board meeting or work presentation.

References

  • Malcolm Gladwell, Outliers, Chapter 9 (“Marita’s Bargain”)
  • Karl L. Alexander et al., “Schools, Achievement, and Inequality: A Seasonal Perspective,” Educational Evaluation and Policy Analysis 23, no. 2 (Summer 2001): 171–191

Why Visualize Data? What’s the Point?

The Financial and Organizational Benefits of Data Visualization

Are you a people leader in a non-tech industry, or a non-tech department within your organization (e.g. Supply Chain, Automotive, Manufacturing)? Or do you work for a data team that reports to such an organization or department? If so, you might find that despite the spend on tools like Power BI, Tableau, SAP, etc., there is still a prevalence of siloed work, duplicated effort, and different Excel sheets for different departments (and in some cases for different individuals). This often leads to inefficiency and fragmentation all round.

Now, if this problem has come to your attention and you are looking for a fix, a good place to start is to understand why data visualization is important in the first place. What is the need for it?

There is a popular acronym in change management called ADKAR, the “A” standing for Awareness. For any change to occur, there must first be awareness of the need for change. This article aims to raise awareness of the need for data visualization in your organization.

So let’s get to work. Why visualize data? What is the point?

  1. Money: This is a bold claim, but it will be put forward regardless: if you are not visualizing data in your organization, you are leaving money on the table. This is even more pronounced if your organization is already paying for data visualization tools. As a basic example, Tableau (one of the most popular data visualization tools out there) is priced at 75 USD per Creator, 45 USD per Explorer, and 15 USD per Viewer, all per month. A previous organization I worked for had 10K-plus employees. Enterprise pricing is usually negotiated differently, but going by this per-user model, 10,000 employees at 15 USD per month works out to roughly 1.8 million USD per year if all employees needed view-only access. If that much money is being spent on software that is not being used frequently, money is being left on the table.
    There is also the opportunity cost of the money you could have saved with a well-made data visualization. Here are two examples. In a previous role supporting a manufacturing plant, a simple pivot table displaying which parts were rejected for quality issues, with the ability to filter by rejection reason, fostered conversations about returning defective materials to suppliers and led to about $1 million in cost savings. In another role, a visualization that gave a quick view of third-party vs company-owned services in a certain area of the business provided a path to transferring third-party services to company-owned ones, saving upwards of $10 million. These two are good examples of the possible gains when data visualization is done right.
  2. Human Nature: The work of Stephen Few will be referenced extensively to illustrate this. Stephen Few has done a lot of work in the field of human perception and cognition and how it applies to data visualization. One basic principle he typically espouses is that humans are naturally strong visual pattern recognizers. Humans have interacted with the physical world for a long time, and as such are built to easily recognize and quickly process certain attributes of the objects we see (e.g. length, position, size, shape, and color). When we take advantage of this and represent data using those attributes, it is typically easier to process than reading text off a table. Here is an example he typically uses to illustrate this: take a look at the table below. Something like this (i.e. a table) is usually good for “lookups”, e.g. what were the sales for the month of June?
    However, when making business decisions, you typically need to answer more complex questions e.g.
    – Have sales been rising or declining over the past year? (the year represented in the table)
    – What month had the highest sales? Does this repeat annually?
    – Are there any months with particularly low sales?



A table showing international and domestic sales.

Now, pause for a moment (no peeking).


Try to answer those questions with the table above.

Now, try to answer those same questions with the graph below.

Line graph showing sales

More than likely, you’ll notice that the answers are immediately obvious, with less mental processing and panning around on your end.

In the words of Mr. Few

“Good data visualization takes the burden of effort off the brain and puts it on the eyes”

When the pattern recognition abilities of the human visual system are harnessed, your business decision-making becomes much faster and more efficient as a result.

What’s Next?

With this in mind, your next board meeting could involve a simple bar chart or line chart of the quarterly numbers being presented. It would definitely lend more oomph to your presentation, and it could even be done in Excel. However, if you have a data visualization tool like Tableau or Power BI, you might want to embark on a data visualization project, especially if this information is something you use to make decisions and refer to on a recurring cadence.

NB: This is not to say data visualization is a silver bullet that will solve all your organizational issues and end world hunger. There is a lot that goes into getting it right: data visualization (when done right) is usually at the tail end of a lot of work, e.g. establishing consensus on a central source of truth (a database, a master Excel sheet everyone agrees upon, an ERP system like Salesforce or Epicor), data generation, data ingestion, data storage, data transformation, interviewing/requirements gathering, etc. All of these culminate in the benefits of the visualization at the end of the process.

Trust you found this enlightening. If so, exercise your first amendment right and share this with a colleague, let us know what stood out to you in the comments, and reach out if you would like a training session in your organization to boost interest in and adoption of data visualization.

How to Make Dashboards People Actually Use 2

From the same studios that brought you “How to Make Dashboards People actually Use Pt 1” comes this riveting tale of suspense, intrigue, frustration, betrayal and triumph. The long awaited SQL (sequel) (pun intended) is finally here….

Alright, enough drama. The previous post spoke about a problem most business teams/operators have with data teams, and that most data teams have with business teams. A dashboard, automation, or other data-related request is sent to the data team. The data team builds it, but more often than not these things happen:

  • The business teams receive a different outcome than what they expected, and the solution ends up not being used.
  • There is infinite scope creep: “Can we view it this way?”, “Can we add this view?” Six months to a year later, you have a dashboard with 20 views and no real value extracted from all the work that has been put in.

This typically results in a return to silos (everyone having their individual excel sheets), and a frustrated and demoralized data team.

This is bad for a number of reasons, the primary issue being one of “waste”. The lack of adoption and repeat usage of created dashboards means:

  • All the hours of work that the data team and business team invested in the dashboard initiative go down the drain.
  • Money is being left on the table. Organizations pay a lot for data visualization software (e.g. Tableau pricing is 75 USD per Creator, 45 USD per Explorer and 15 USD per Viewer, all per month). The organization I worked for had 10K-plus employees. Not sure what enterprise pricing would cost, but going by this per-user model, you are looking at roughly $1.8 million per year if all employees needed view access.
  • Lost opportunities: when done right, proper data analysis and visualization can uncover insights which could save your organization millions of dollars (I have seen this done in Excel and in tools like Tableau).

So, how can organizations stop leaving money on the table? How can the lack of adoption and repeat usage of dashboards be addressed? What can be done to ensure dashboard projects do not end up in the abyss, in the sea of forgotten fellows, in the outer darkness?

In addition to Collaborative Prototyping (see the previous post, How to Make Dashboards People Actually Use 1), another fundamental change that greatly increased adoption, interest and repeat usage at my previous organization was an “Agile Mode of Delivery”:

Agile Mode of Delivery:

Here is some context to preface the definition of this phrase. There is a popular saying: “Culture eats strategy for breakfast”. This proves true: no matter how good a plan or strategy is, the recurring habits and accepted behaviors of a group determine the outputs or results that ensue.

Depending on what industry, department, or business unit the dashboard or data product is being built for, there are certain mindsets and habits (culture) that hold sway. E.g. in Supply Chain, most leaders are used to capital-intensive projects (things like warehouse builds or renovations, transportation fleet purchases or revamps, etc.).

These projects adopt the mindset of traditional waterfall project management methodology, i.e. tight dependencies (one thing has to happen before another; there is no room for much concurrency). However, data teams primarily work with software, which favors an Agile project management approach. Software success is usually largely dependent on user preferences, which can change on a dime.

In most projects that follow traditional waterfall project management methodology, e.g. civil engineering projects, safety is prioritized and customer needs and satisfaction are secondary. In the majority of software projects, agility and adaptability are prioritized.

This causes a conflict: when data teams interface with a business or team with this fundamentally different mindset, there can be tension.

What was observed in this scenario was that the dashboard build process was being run with a traditional waterfall approach. The requirements were given, the data team went off and built for weeks or months, and then delivered the dashboard at the end of that period.

From the time the work begins, all you have is a discovery or requirements document, and then at the end of the period a solution is presented. For the time frame in between, the team doing the build is working in isolation.

With the pace at which the world changes today, the needs documented when the project was initiated could have completely changed in the weeks or months the development work was ongoing.

The solution:

The solution was to borrow one of the fundamental values of Agile development (working software over comprehensive documentation). The goal is to deliver a working version as early as possible and then build incrementally on top of that (see image below).

In the traditional approach, at every stage from requirements, to design, to implementation, to verification, all the end user has are written updates, e.g. discovery documents, email updates, etc.

In the agile/incremental delivery approach, at every stage, there is a tangible product (no matter how basic) the user can interact with. This product is then incrementally improved until the final product is delivered.


Here is a specific example of how this is different from the previous approach in the context two dashboard build projects I was involved in.

Waterfall Approach (at least in the department I had worked for):

  • Dashboard builds were done mostly in isolation; between discovery and delivery there was a huge gap without any direct communication.
  • During the build stage, there were minimal meetings with business users/end users. Sometimes weeks would go by before any correspondence, which was usually a verbal or written update.

Agile Approach:

  • From the first week after discovery, there was a tangible product. In week 2 there was a PowerPoint prototype; by week 3, View 1 of the dashboard prototype had been built with a subset of the data while the rest of the historical data was being sourced.
  • Weekly touchpoints were set up, sometimes as short as 15 minutes per week. Each week a visual or set of visuals was presented, and the visual prototypes were made available online for the users to interact with.

These small changes made a huge difference. Here are some of the major benefits I noticed from this approach:

  • It built trust between the data team and the business. The frequency and the honesty of the communication made a lot of difference in this regard. On some weeks when there was no progress from the previous week, that was communicated, and the reason for the lack of forward movement was communicated as well.
  • It gave the business a sense of ownership; the end users felt like they were part of the building process. They saw the dashboard grow and evolve from a prototype, to a single visualization, to the multiple views on the dashboard. By the end of this process it was their baby; they watched it grow into a full-blown adult (they grow up so fast 😪)

At the end of the day, in combination with the visual prototype (which ensured that both the business and the data team knew what the finished product would look like), the interest, adoption and repeat usage of said dashboard at the end of this project was such as the world has never seen (ok, a tad exaggerated; such as the department had never seen).

For future dashboard builds, consider delivering something tangible as early as possible in the process (a visual prototype, an actual visualization in the dashboarding tool, etc.). Give the end users something to use and play with as soon as possible and build on top of that.

Leave a comment on the results you see.

Have a productive day ✌🏿

Side note: traditional is not a bad word, and neither is waterfall. Waterfall methodology is better in certain contexts. E.g. in civil engineering, when building a house, a skyscraper, or a bridge, you don’t want to be agile. You don’t want a mini-bridge or a mini-house that you then build on top of. Everything must be done in sequence: surveying, soil analysis, foundation, walls, roof, etc. There is no way around this. Keep this in mind based on the context.


How to Make Dashboards People Actually Use 1

It was late 2022.

I had just transitioned from a BI Solution Architect role (it took a while to understand what this role actually meant, but essentially it was tech sales for a data visualization tool; think of a Tableau, PowerBI, or Sigma competitor) to a Data Analyst role.

One of the biggest, most unexpected shocks I got in the Data Analyst role was this: I couldn’t wrap my head around the concept of building a dashboard that would not eventually get used.

You see, as a Solution Architect/tech sales guy for a dashboard creation tool, every dashboard we created was utilized in one way or another. We created customized dashboards for product demonstrations to prospective buyers (the dashboard was sure to be used for the presentation). We also created dashboards for proofs of concept with a prospective buyer’s data, to see how well our tool integrated with their internal tools and how well it handled the use cases they needed data visualization for (e.g. internal data analysis, embedding in a product of their own, etc.).

It was rare that a dashboard created to aid sales efforts would go unused (unless the prospect was a no-show for the sales call).

So, you can imagine the total shock and horror when, as a freshly minted Data Analyst, someone reaches out to the team and says: “Hey, we need to be able to see how much product we have across all our locations in a single unified view.” Without hesitation, I jumped right into it: wrote the SQL query, verified the correctness of the data, and created an interactive Tableau dashboard for the user to look up what they needed. Said dashboard was presented, the requester remarked “Oh, this is wonderful”. And then crickets…..

Weeks later, the dashboard’s usage metrics showed that it had stopped being accessed a few days after the initial delivery.

At so many organizations, dashboards are created that end up in the sea of lost souls, never again to be used or resurrected, never again to see the light of day (ok, this is getting a bit dramatic, but you get the point).

How can this lack of adoption and lack of repeat usage be addressed?

After all, organizations pay a lot for data visualization software (e.g. Tableau pricing is 75 USD per Creator, 45 USD per Explorer and 15 USD per Viewer, all per month). The organization I worked for had 10K-plus employees.

Not sure what enterprise pricing would cost, but going by this per-user model, 10,000 employees at 15 USD per month works out to roughly $1.8 million per year if all employees needed view-only access. If that much money is being spent on software that is not being used frequently, money is being left on the table.

This doesn’t include the opportunity cost: when done right, proper data analysis and visualization can uncover insights that could save your organization millions of dollars (I have seen this done in Excel and in tools like Tableau). So, to rephrase the question above: how can organizations stop leaving money on the table?

Fast forward to Mid 2024.

Another dashboard project is at the tail end of being wrapped up. The requester is very appreciative and brings members of his team in to take a look at the dashboard. The members of the team who do not yet have access to the data visualization tool request it so they can begin using said dashboard. The requester expresses his intention to socialize the dashboard with his superiors in the future.

Someone from an adjacent team starts using the dashboard prototype in one of his external meetings and assigns a direct report to begin using the dashboard to verify the data. The direct report reaches out because he found some discrepancies in the data being displayed, and they were promptly corrected.

An external consultant reaches out to get a walkthrough of the dashboard, as it might be applicable to the work he is currently doing for the organization. And so on.

What brought about this massive jump in adoption, interest and repeat usage?

This series of posts walks through the two fundamental changes in the dashboard build process that made this happen:

  1. Collaborative Visual Prototyping (Applying Product Thinking to Dashboard Creation)

In the past, dashboard creation projects would go like this:
Discovery > Data Collection/Preparation > Build Dashboard > Present Dashboard > Implement Feedback > Rinse/Repeat.

For this particular dashboard project, after Discovery (asking questions to further explore the end user request), prototyping was done before any additional legwork went into the dashboard build.

Discovery > Collaborative Prototyping > Data Collection/Preparation > Build Dashboard > Present Dashboard > Implement Feedback > Rinse/Repeat

For Definition of Terms:

What is meant by Collaborative visual prototyping is this:

A visual prototype of the dashboard is created (the tool you use should depend on the tech-savviness of your stakeholders/end users). Initially Figma was used, but it seemed too complex for the end users; eventually PowerPoint was settled upon.

The tool you use for this is a secondary issue (it could be PowerPoint, it could be Figma, it could be a paper sketch). The primary focus should be translating the written document into a visual prototype before building begins.

The primary concept here was borrowed from product thinking: the idea of creating a visual prototype and getting feedback on that prototype before actual development work begins.

The idea is this: it is easier to modify the prototype and make changes upfront than to get feedback after time and effort has been invested in building the final solution. Making changes at that point will be more costly and time-consuming.

The prototype can easily be changed and modified. A fully built solution, however, might take more time.

This prototype is created in collaboration with the end users/requester of the dashboard (hence the “collaborative”).

Get as much feedback as possible on the prototype, every visual, every dashboard screen, etc., until the end user(s)/stakeholder(s) are satisfied with it. Once there is agreement on the visual prototype and the stakeholders agree that this is what is needed, the build process can begin.

A caveat: this is more upfront work. There were more meetings at the start of the initiative than in the typical process. However, it saves a lot of time and prevents infinite scope creep down the line.

The visual prototype defines the output in a graphical, visual language, far more clearly than a written document would.

The next post will discuss the 2nd fundamental change that ensured this dashboard did not end up in the abyss.

In the meantime, for future dashboard projects, consider a visual prototype: share the prototype with the audience, get feedback on it, and iterate until there is an agreed-upon prototype; then let the build begin. At that point, the “Definition of Done” (what it looks like when the project is finished) is clearly defined, and there will be a more pleasant experience for everyone.

The second fundamental change will be shared in part 2 of this article.

In the meantime. Have a productive day ✌🏿