Titanic Passengers.csv
This is a manifest of passengers on the doomed Titanic voyage. It was downloaded from the Kaggle Machine Learning Challenge in 2014 by Catherine D'Ignazio.
The file has 891 rows of data in 12 columns.
Here's some metadata about each column.
PassengerId
- This column is full of numbers
- The smallest number is 1.0
- The biggest number is 891.0
- The total is 397386.0
- The average is 446.0
- The median is 446.0
- The standard deviation is 257.21
- There are 891 unique values
value | frequency |
---|---|
1 - 90 | 89 |
90 - 179 | 89 |
179 - 268 | 89 |
268 - 357 | 89 |
357 - 446 | 89 |
446 - 535 | 89 |
535 - 624 | 89 |
624 - 713 | 89 |
713 - 802 | 89 |
802 - 891 | 89 |
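The summary figures for this column can be reproduced with Python's standard library. This is a minimal sketch, assuming the column holds every whole number from 1 to 891 exactly once (which fits the smallest value, biggest value, and unique-value count above) and assuming the reported standard deviation is the population form. How the tool itself computes these numbers is not stated here.

```python
import statistics

# Assumption: the column is the integers 1..891, each appearing once.
values = list(range(1, 892))

total = sum(values)
average = statistics.mean(values)
median = statistics.median(values)
# Population standard deviation (divides by n rather than n - 1).
std_dev = statistics.pstdev(values)

print(f"total: {total}")
print(f"average: {average}")
print(f"median: {median}")
print(f"standard deviation: {std_dev:.2f}")
print(f"unique values: {len(set(values))}")
```

Under these assumptions the population form rounds to 257.21, the figure listed above; the sample form (`statistics.stdev`) would come out slightly larger.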
Name
- This column is full of text
- The longest string has 82 characters
- There are 891 unique values
value | frequency |
---|---|
mr | 521 |
miss | 182 |
mrs | 129 |
william | 64 |
john | 44 |
master | 40 |
henry | 34 |
george | 24 |
james | 24 |
charles | 24 |
thomas | 21 |
mary | 20 |
edward | 18 |
anna | 17 |
joseph | 16 |
johan | 15 |
frederick | 15 |
elizabeth | 15 |
samuel | 13 |
richard | 13 |
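The table above appears to count individual lowercase words rather than whole cell values, since every value in the column is unique. A rough sketch of that kind of word count follows. The three names are an illustrative sample only, the `word_counts` helper is hypothetical, and the tool's real tokenizing rules may differ.

```python
import re
from collections import Counter

# Illustrative sample only: names in "Surname, Title. Given names" form.
names = [
    "Braund, Mr. Owen Harris",
    "Heikkinen, Miss. Laina",
    "Allen, Mr. William Henry",
]

def word_counts(strings):
    """Count lowercase alphabetic words across all strings (hypothetical helper)."""
    counts = Counter()
    for text in strings:
        # Lowercase first so "Mr" and "mr" are counted together.
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts

counts = word_counts(names)
for word, frequency in counts.most_common(3):
    print(f"{word} | {frequency} |")
```

Counting words this way is what lets titles such as "mr" and "miss" rise to the top even though no two full names are the same.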
Age
- This column is full of numbers
- The smallest number is 0.42
- The biggest number is 80.0
- The total is 21205.17
- The average is 29.7
- The median is 28.0
- The standard deviation is 14.52
- There are 177 rows of missing data
- There are 88 unique values
value | frequency |
---|---|
0 - 8 | 54 |
8 - 16 | 46 |
16 - 24 | 177 |
24 - 32 | 169 |
32 - 40 | 118 |
40 - 48 | 70 |
48 - 56 | 45 |
56 - 64 | 24 |
64 - 72 | 9 |
72 - 80 | 1 |
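Missing rows affect how the average should be read. The arithmetic below uses only the figures listed above and suggests that the average is taken over the rows that actually hold a number, not over all 891 rows. This is a quick check, not a description of the tool's internals.

```python
# Figures taken from the column summary above.
total = 21205.17
rows = 891
missing = 177

present = rows - missing       # rows that actually hold a number
average = total / present      # mean over non-missing rows only
naive_average = total / rows   # mean if missing rows were counted as zero

print(f"non-missing rows: {present}")
print(f"average over non-missing rows: {average:.1f}")
print(f"average over all rows: {naive_average:.1f}")
```

Only the first average rounds to the 29.7 reported above, so the 177 missing rows look to be excluded rather than treated as zero.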
Ticket
- This column is full of numbers
- The smallest number is 693.0
- The biggest number is 3101298.0
- The total is 172070561.0
- The average is 260318.55
- The median is 3101265.0
- The standard deviation is 471252.39
- There are 514 unique values
value | frequency |
---|---|
693 - 310754 | 389 |
310754 - 620814 | 256 |
620814 - 930874 | 0 |
930874 - 1240935 | 0 |
1240935 - 1550996 | 0 |
1550996 - 1861056 | 0 |
1861056 - 2171116 | 0 |
2171116 - 2481177 | 0 |
2481177 - 2791238 | 0 |
2791238 - 3101298 | 15 |
Fare
- This column is full of numbers
- The smallest number is 0.0
- The biggest number is 512.3292
- The total is 28693.95
- The average is 32.2
- The median is 14.45
- The standard deviation is 49.67
- There are 248 unique values
value | frequency |
---|---|
0 - 51 | 732 |
51 - 102 | 106 |
102 - 154 | 31 |
154 - 205 | 2 |
205 - 256 | 11 |
256 - 307 | 6 |
307 - 359 | 0 |
359 - 410 | 0 |
410 - 461 | 0 |
461 - 512 | 0 |
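The value ranges in the table above are consistent with ten equal-width bins between the smallest and biggest numbers, with the edges rounded to whole numbers for display. The sketch below rebuilds the labels on that assumption. The `bin_labels` helper is hypothetical, and it does not attempt to reproduce the frequency counts, which depend on the underlying rows.

```python
def bin_labels(smallest, biggest, bins=10):
    """Return 'low - high' labels for equal-width bins (hypothetical helper)."""
    width = (biggest - smallest) / bins
    # bins + 1 edges mark the boundaries of the bins.
    edges = [smallest + i * width for i in range(bins + 1)]
    # Round each edge to a whole number for display only.
    return [f"{round(lo)} - {round(hi)}" for lo, hi in zip(edges, edges[1:])]

labels = bin_labels(0.0, 512.3292)
for label in labels:
    print(label)
```

Because the bins are equal in width, a few very large values stretch the range and leave most rows in the first bin, which is why the upper bins are empty.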
Cabin
- This column is full of text
- The most frequent values in this column are:
- B96 B98 (4)
- C23 C25 C27 (4)
- G6 (4)
- C22 C26 (3)
- D (3)
- The longest string has 15 characters
- There are 687 rows of missing data
- There are 147 unique values
value | frequency |
---|---|
B96 B98 | 4 |
C23 C25 C27 | 4 |
G6 | 4 |
C22 C26 | 3 |
D | 3 |
Other | 186 |
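The "Other" row can be checked against the rest of the summary. This small sketch assumes "Other" means every non-missing row whose value is not one of the five most frequent; all figures come from the lists above.

```python
# Figures taken from the column summary above.
rows = 891
missing = 687
top_values = {
    "B96 B98": 4,
    "C23 C25 C27": 4,
    "G6": 4,
    "C22 C26": 3,
    "D": 3,
}

filled = rows - missing                     # rows that have a value at all
other = filled - sum(top_values.values())   # filled rows outside the top five

print(f"rows with a value: {filled}")
print(f"Other: {other}")
```

The result agrees with the 186 shown in the table, so most of the filled rows hold values that appear only once or twice.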
What do I do next?
Understanding the data in your CSV file is the first step in analyzing it for stories. Looking at individual columns can help you identify questions that might be fun to ask about your data. For instance, is it surprising that "0.0" is the most frequent value in the Parch column? Does it make any sense to compare the Ticket column to the PassengerId column? Are there any other datasets you could find to ask interesting questions about the Parch column?
Asking these kinds of questions helps you understand the data you have and what stories you can find in it. Check out our activity guide for more help on asking questions of datasets.
Try these other tools to do more full-fledged analysis: