This activity will take 20 minutes. You should have these materials on hand:
- Giant gridded piece of paper. The bigger and more colorful, the better! You can order cheap rolls of colored bulletin board paper on Amazon.
- Large, thick markers
- Tape and a flat wall on which to hang your Paper Spreadsheet
Before you can analyze and tell stories with data, you have to understand what data are and how to clean them. Build a paper spreadsheet together to introduce non-technical newcomers to concepts like “data,” “datasets,” “data types,” and “clean data.” This activity also introduces participants to ethics and privacy concerns in data, and it serves as an icebreaker at the beginning of a session to introduce participants to one another.
Kick off the Activity
Prior to the activity, create your paper spreadsheet on a large piece of gridded paper. Create at least five columns to capture different forms of personal information. It’s best to include columns for each major data type, for example:
- ‘First name’ for qualitative data
- ‘Hometown’ for geographic data
- ‘Color of Your Shirt’ for categorical data
- ‘# Siblings you Have’ for quantitative data
- ‘Day & Month of your Birthday’ for temporal data
- ‘Describe any experience you have with data’ for open text as data
You should choose other questions that are more relevant for your audience, but make sure that you are not collecting sensitive personal information or anything that could be embarrassing. No really, we mean it! It is very easy to accidentally ask a question that is revealing in an inappropriate way.
Post the spreadsheet on the wall before the session begins. As participants arrive, ask them to fill out a row about themselves, but don't pressure anyone who prefers to opt out.
Allow enough time for all participants to fill in the spreadsheet. When you're ready to start, invite a couple of volunteers to introduce themselves using the information they wrote down. These should be short intros that give a little more backstory to what they wrote down; just one or two minutes each.
Lead a Conversation
Lead your group in a conversation about the “dataset” they have created. Highlight that data are systematically collected observations about the world. At the same time, collecting data always reduces the world to something simpler, so a dataset is not the final word and may not contain everything you need to answer a given question. You can get at this by asking your group a couple of questions related to the Paper Spreadsheet:
- Do the colors people wrote down accurately capture all the texture and nuance of their fabric? For example, if anyone wore a striped shirt, what did they write in the 'shirt color' column?
- Do these questions capture enough about each person to support a meaningful analysis?
- What is missing that might help us use this dataset?
Also highlight the different types of data that the Paper Spreadsheet captures. You can ask things like:
- People often think of data as just numbers, but what other kinds of data have we captured here? Is there temporal data? Geographic data?
- Numbers are easy to add up and average; how else could we organize data that isn't numbers? For instance, we could plot dates on a timeline, or locations on a map.
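For facilitators who later type up the Paper Spreadsheet, here is a minimal Python sketch of how different data types invite different kinds of organizing. The rows are invented placeholders, not real participant data, and the field names (`first_name`, `shirt_color`, `siblings`, `birthday`) are assumptions modeled on the example columns above.

```python
from collections import Counter
from statistics import mean

# Hypothetical rows, invented for illustration only.
# "birthday" is stored as a (month, day) pair.
rows = [
    {"first_name": "Ana", "shirt_color": "blue", "siblings": 2, "birthday": (3, 14)},
    {"first_name": "Ben", "shirt_color": "red", "siblings": 0, "birthday": (11, 2)},
    {"first_name": "Chi", "shirt_color": "blue", "siblings": 1, "birthday": (1, 30)},
]

# Quantitative data can be averaged.
average_siblings = mean(row["siblings"] for row in rows)

# Categorical data can be counted by category.
color_counts = Counter(row["shirt_color"] for row in rows)

# Temporal data can be put in order, like points on a timeline.
timeline = [
    row["first_name"]
    for row in sorted(rows, key=lambda row: row["birthday"])
]

print(average_siblings)
print(color_counts)
print(timeline)
```

The same three rows support an average, a tally, and an ordering, depending on which column you look at.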
Next, introduce what needs to be done to clean and tell stories with this data. Ask participants questions like:
- If we wanted to map out people's hometowns, how could we do that? For example, are there inconsistencies in how people wrote down their hometown? And did everyone interpret what 'hometown' meant in the same way?
- What kind of patterns do you see in the consistencies or inconsistencies in the data collected? Did everyone fill in every column? Did anyone *not* put themselves on the paper spreadsheet?
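If the group wants to see what "cleaning" might look like once the hometown column is typed up, the following is one possible sketch in Python. The entries and the `ALIASES` lookup table are hypothetical; in practice the group would decide together which spellings count as the same place, and that decision is itself part of the conversation.

```python
from collections import Counter

# Hypothetical handwritten entries, invented for illustration.
# None stands for a cell that was left blank.
raw_hometowns = ["Boston", "boston, MA", " Boston ", "NYC", "New York City", None]

# Hypothetical lookup table mapping variant spellings to one agreed form.
ALIASES = {
    "boston": "Boston",
    "boston, ma": "Boston",
    "nyc": "New York City",
    "new york city": "New York City",
}

def clean_hometown(value):
    """Return a standardized hometown, or None if the cell was blank."""
    if value is None or not value.strip():
        return None
    # Collapse extra whitespace and ignore capitalization before the lookup.
    key = " ".join(value.split()).lower()
    # Entries not in the table are kept as written, minus stray spaces.
    return ALIASES.get(key, value.strip())

cleaned = [clean_hometown(value) for value in raw_hometowns]
counts = Counter(value for value in cleaned if value is not None)
missing = cleaned.count(None)

print(counts)
print("blank cells:", missing)
```

Counting the blank cells separately, rather than silently dropping them, keeps the question of who did not answer visible in the cleaned data.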
Finally, discuss issues around data privacy, consent, and ethics. Ask participants things like:
- Were there any questions that you felt uncomfortable answering?
- Who usually has the power to determine what is captured in data?
- Who decides how those questions are asked and how the answers are interpreted?
Review What You Have Learned
Review the key points of the conversation you just had. Remind the participants that data are systematic observations of the world, and a dataset is the collection of those observations. Data are a helpful reduction of the world, but it’s important to keep in mind that they do not capture everything. Data can capture different types of information (quantitative, qualitative, temporal, geographic, etc.), and the type of data affects the kinds of exploration and pattern seeking that you can do. Data often need to be cleaned and standardized. Data storytelling and analysis involve looking for patterns, making comparisons, finding outliers, and then testing that knowledge in dialogue with others. Finally, the way data are collected and used can raise questions of privacy, consent, and ethics, which must be taken into account when analyzing and presenting them.