Python Pandas DataFrames Quiz

Python

A 35-question quiz covering Pandas DataFrame operations, including column manipulation, row filtering, data import, statistics, missing values, and data types.

35 Questions
~70 minutes
1

Question 1

You have a DataFrame `df` with columns 'Name', 'Age', and 'City'. How do you select just the 'Name' column as a Series?

python

import pandas as pd
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30], 'City': ['NY', 'LA']}
df = pd.DataFrame(data)
                
A
df['Name']
B
df[['Name']]
C
df.select('Name')
D
df.column('Name')
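
For reference, a minimal sketch of how single and double brackets differ, reusing the sample data from this question. The variable names `as_series` and `as_frame` are illustrative and not part of the quiz.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30], 'City': ['NY', 'LA']}
df = pd.DataFrame(data)

# Single brackets with one label return that column as a Series.
as_series = df['Name']

# A list of labels returns a DataFrame, even when the list has one entry.
as_frame = df[['Name']]

print(type(as_series).__name__)
print(type(as_frame).__name__)
```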
2

Question 2

What is the correct syntax to add a new column 'Bonus' to an existing DataFrame `df`, where every value is 100?

A
df.add_column('Bonus', 100)
B
df['Bonus'] = 100
C
df.Bonus = 100
D
df.insert('Bonus', 100)
3

Question 3

How do you rename the column 'Age' to 'Years' in place?

python

import pandas as pd
df = pd.DataFrame({'Age': [20, 30]})
# Rename command
                
A
df.columns['Age'] = 'Years'
B
df.rename(columns={'Age': 'Years'}, inplace=True)
C
df.replace('Age', 'Years')
D
df.set_names({'Age': 'Years'})
4

Question 4

Which command drops the column 'City' from the DataFrame `df`?

A
df.drop('City', axis=1)
B
df.drop('City', axis=0)
C
df.remove('City')
D
del df.City
5

Question 5

You want to create a new column 'Total' that is the sum of columns 'A' and 'B'. Which syntax is most efficient and readable?

python

import pandas as pd
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
                
A
df['Total'] = df['A'] + df['B']
B
df['Total'] = df.sum(axis=1)
C
for i in df.index: df.at[i, 'Total'] = df.at[i, 'A'] + df.at[i, 'B']
D
df['Total'] = df.apply(lambda x: x['A'] + x['B'], axis=1)
6

Question 6

How do you select multiple columns 'Name' and 'City' to create a new DataFrame?

A
df['Name', 'City']
B
df[['Name', 'City']]
C
df.get(['Name', 'City'])
D
df.select(['Name', 'City'])
7

Question 7

How do you filter rows where the 'Age' column is greater than 30?

python

import pandas as pd
df = pd.DataFrame({'Name': ['A', 'B', 'C'], 'Age': [25, 35, 40]})
                
A
df[df['Age'] > 30]
B
df.filter('Age' > 30)
C
df.query(Age > 30)
D
df.where('Age' > 30)
8

Question 8

What is the difference between `loc` and `iloc`?

A
`loc` is label-based; `iloc` is integer position-based.
B
`loc` is integer position-based; `iloc` is label-based.
C
`loc` is for columns; `iloc` is for rows.
D
They are identical aliases.
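
The distinction is easiest to see on a DataFrame whose index is not the default 0, 1, 2. The sketch below uses a made-up frame with string row labels; all names in it are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Age': [25, 35, 40]}, index=['a', 'b', 'c'])

# loc looks rows and columns up by label.
by_label = df.loc['b', 'Age']

# iloc looks them up by integer position (second row, first column).
by_position = df.iloc[1, 0]

# Label slices include the end label; position slices exclude the end position.
label_slice = df.loc['a':'b']
position_slice = df.iloc[0:1]

print(by_label, by_position, len(label_slice), len(position_slice))
```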
9

Question 9

Which command selects the first 5 rows of the DataFrame?

A
df.head(5)
B
df.iloc[:5]
C
df[:5]
D
All of the above
10

Question 10

You want to select rows where 'Age' > 25 AND 'City' is 'London'. Which syntax is correct?

A
df[df['Age'] > 25 and df['City'] == 'London']
B
df[(df['Age'] > 25) & (df['City'] == 'London')]
C
df[df['Age'] > 25 && df['City'] == 'London']
D
df.select(Age > 25, City == 'London')
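
A small sketch of combining two conditions, assuming a made-up three-row frame. It also shows what happens when Python's `and` is applied to two Series: pandas cannot reduce a whole Series to a single True or False, so it raises a `ValueError`.

```python
import pandas as pd

df = pd.DataFrame({'Age': [22, 31, 40],
                   'City': ['London', 'London', 'Paris']})

# Element-wise AND needs the & operator, with each comparison in parentheses
# because & binds more tightly than > and ==.
mask = (df['Age'] > 25) & (df['City'] == 'London')
result = df[mask]

# The keyword `and` asks for the truth value of an entire Series.
error_message = None
try:
    df[df['Age'] > 25 and df['City'] == 'London']
except ValueError as exc:
    error_message = str(exc)

print(result)
print(error_message)
```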
11

Question 11

What does `df.iloc[0, 1]` return?

A
The value at the first row and second column.
B
The value at the first column and second row.
C
The first two rows.
D
The index of the first row.
12

Question 12

How do you select rows where the 'Status' column is EITHER 'Active' OR 'Pending'?

A
df[df['Status'].isin(['Active', 'Pending'])]
B
df[df['Status'] == 'Active' or 'Pending']
C
df.query('Status' in ['Active', 'Pending'])
D
df.filter(like=['Active', 'Pending'])
13

Question 13

Which function is used to load data from a CSV file into a DataFrame?

A
pd.load_csv()
B
pd.read_csv()
C
pd.import_csv()
D
pd.csv_to_df()
14

Question 14

You have a CSV file without a header row. How do you read it so the first row isn't treated as columns?

python

# data.csv contains:
# 1, 2, 3
# 4, 5, 6
                
A
pd.read_csv('data.csv', header=None)
B
pd.read_csv('data.csv', skip_header=True)
C
pd.read_csv('data.csv', names=False)
D
pd.read_csv('data.csv', header=-1)
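
To try this without creating a file, the sketch below feeds the same two rows through `io.StringIO`, which stands in for `data.csv` here. It compares the default behaviour with `header=None`.

```python
import io
import pandas as pd

csv_text = "1,2,3\n4,5,6\n"  # stand-in for the contents of data.csv

# Default: the first line is consumed as column names, leaving one data row.
with_default = pd.read_csv(io.StringIO(csv_text))

# header=None: every line is data, and columns are numbered 0, 1, 2.
without_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(with_default.shape)
print(without_header.shape)
```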
15

Question 15

How can you read only a specific subset of columns (e.g., 'Name' and 'ID') from a large CSV?

A
pd.read_csv('file.csv', columns=['Name', 'ID'])
B
pd.read_csv('file.csv', usecols=['Name', 'ID'])
C
pd.read_csv('file.csv', select=['Name', 'ID'])
D
pd.read_csv('file.csv').get(['Name', 'ID'])
16

Question 16

Which parameter in `read_csv` handles parsing dates automatically?

A
parse_dates
B
convert_dates
C
date_parser
D
dates=True
17

Question 17

What does `index_col=0` do when reading a CSV?

A
It drops the first column.
B
It uses the first column of the CSV as the DataFrame index (row labels).
C
It renames the index to 0.
D
It adds a new index column starting at 0.
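
A brief sketch of `index_col=0` together with `parse_dates`, using an in-memory CSV. The column names `ID`, `Name`, and `Joined` are invented for this illustration.

```python
import io
import pandas as pd

csv_text = "ID,Name,Joined\n1,Alice,2021-01-05\n2,Bob,2022-03-10\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    index_col=0,             # first CSV column becomes the row labels
    parse_dates=['Joined'],  # convert this column to datetime values
)

print(df.index.name)
print(list(df.columns))
print(df['Joined'].dtype)
```

The `ID` column no longer appears among the regular columns because it has moved into the index.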
18

Question 18

Can Pandas read data directly from a JSON string?

A
No, only from files.
B
Yes, using pd.read_json() (recent versions expect the string wrapped in io.StringIO)
C
Yes, but you must use the json library first.
D
Only if it is formatted as a CSV.
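
A minimal sketch of reading JSON text held in a string. Wrapping the string in `io.StringIO` avoids the deprecation warning that newer pandas releases emit for literal JSON input; the sample records are made up.

```python
import io
import pandas as pd

json_text = '[{"Name": "Alice", "Age": 25}, {"Name": "Bob", "Age": 30}]'

# A list of records becomes one row per record.
df = pd.read_json(io.StringIO(json_text))

print(df)
```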
19

Question 19

Which method gives a quick summary of statistics (count, mean, std, min, max) for numeric columns?

A
df.info()
B
df.stats()
C
df.describe()
D
df.summary()
20

Question 20

How do you calculate the mean of the 'Salary' column?

A
df['Salary'].mean()
B
mean(df['Salary'])
C
df.mean('Salary')
D
df['Salary'].avg()
21

Question 21

What does `df.corr()` compute?

A
The correction factor for errors.
B
The pairwise correlation of columns.
C
The correspondence between rows.
D
The covariance matrix.
22

Question 22

You want to find the unique values in the 'Department' column. Which method should you use?

A
df['Department'].distinct()
B
df['Department'].unique()
C
df['Department'].values()
D
df['Department'].count()
23

Question 23

How do you count the occurrences of each unique value in a column?

A
df['Col'].value_counts()
B
df['Col'].count_values()
C
df.groupby('Col').count()
D
Both A and C
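
The two approaches overlap but are not identical, which the sketch below tries to show on a made-up frame. `value_counts()` returns one count per value, while `groupby(...).count()` returns the number of non-null entries in each of the other columns; `groupby(...).size()` is the closer equivalent.

```python
import pandas as pd

df = pd.DataFrame({'Col': ['x', 'y', 'x', 'x'],
                   'Other': [1.0, None, 3.0, 4.0]})

# One count per distinct value, sorted from most to least frequent.
counts = df['Col'].value_counts()

# Number of rows in each group.
sizes = df.groupby('Col').size()

# Non-null entries per remaining column within each group.
non_null = df.groupby('Col').count()

print(counts)
print(sizes)
print(non_null)
```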
24

Question 24

What is the result of `df.max(axis=1)`?

A
The maximum value in each column.
B
The maximum value in each row.
C
The maximum value in the entire DataFrame.
D
It raises an error.
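
A short sketch of how the `axis` argument changes the direction of a reduction, using a small made-up frame.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 5], 'B': [3, 4]})

# Default axis=0 reduces down the rows: one result per column.
col_max = df.max()

# axis=1 reduces across the columns: one result per row.
row_max = df.max(axis=1)

print(col_max.tolist())
print(row_max.tolist())
```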
25

Question 25

Which method returns a boolean DataFrame indicating where values are missing (NaN)?

A
df.missing()
B
df.isnull() or df.isna()
C
df.empty()
D
df.nan()
26

Question 26

How do you drop all rows that contain at least one missing value?

A
df.dropna()
B
df.drop_nan()
C
df.fillna(0)
D
df.remove_nulls()
27

Question 27

You want to replace all NaN values in the 'Age' column with the average age. Which code works?

python

mean_age = df['Age'].mean()
                
A
df['Age'].replace(np.nan, mean_age)
B
df['Age'] = df['Age'].fillna(mean_age)
C
df['Age'] = mean_age
D
df.fillna('Age', mean_age)
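
A minimal sketch of filling gaps with the column mean, on made-up ages. It assigns the result back to the column, a form that behaves the same across pandas versions, whereas calling `fillna(..., inplace=True)` on a selected column triggers chained-assignment warnings in recent releases.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Age': [20.0, np.nan, 40.0]})

# mean() skips NaN by default, so this averages 20 and 40.
mean_age = df['Age'].mean()

# fillna returns a new Series; assign it back to update the DataFrame.
df['Age'] = df['Age'].fillna(mean_age)

print(df['Age'].tolist())
```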
28

Question 28

What does `df.dropna(how='all')` do?

A
Drops rows where ALL values are missing.
B
Drops rows where ANY value is missing.
C
Drops all rows.
D
Drops columns where all values are missing.
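
The sketch below contrasts the default `how='any'` with `how='all'` on a made-up frame in which one row is partly missing and one is entirely missing.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1.0, np.nan, np.nan],
                   'B': [2.0, 5.0, np.nan]})

# Default how='any': a row is dropped if it has at least one NaN.
any_dropped = df.dropna()

# how='all': a row is dropped only if every value in it is NaN.
all_dropped = df.dropna(how='all')

print(any_dropped.index.tolist())
print(all_dropped.index.tolist())
```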
29

Question 29

Can Pandas distinguish between missing numeric data (NaN) and missing object data (None)?

A
No, it treats both as missing values, and None is converted to NaN in numeric columns.
B
Yes, they are completely different types.
C
Pandas does not support None.
D
It converts everything to 0.
30

Question 30

What happens if you calculate the sum of a column containing NaNs?

A
The result is NaN.
B
It raises an error.
C
It skips NaNs by default (skipna=True), which for a sum is the same as treating them as 0.
D
It stops at the first NaN.
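
A small sketch of the `skipna` behaviour on a made-up Series, showing both the default and the stricter setting.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 2.0])

# Default skipna=True: missing values are ignored.
default_total = s.sum()

# skipna=False: any NaN makes the whole result NaN.
strict_total = s.sum(skipna=False)

print(default_total)
print(strict_total)
```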
31

Question 31

How do you check the data types of all columns in a DataFrame?

A
df.types
B
df.dtypes
C
df.info(types=True)
D
type(df)
32

Question 32

You have a column 'Price' with string values like '$100'. How do you convert it to numeric?

python

import pandas as pd
df = pd.DataFrame({'Price': ['$100', '$200']})
                
A
df['Price'].astype(int)
B
df['Price'].str.replace('$', '', regex=False).astype(int)
C
pd.to_numeric(df['Price'])
D
df['Price'].convert_objects(convert_numeric=True)
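
One way to do the conversion, sketched with the sample data from this question. Passing `regex=False` makes the `$` a literal character in every pandas version, since `$` is otherwise a regular-expression anchor; `pd.to_numeric` is used here as an alternative to `astype(int)`.

```python
import pandas as pd

df = pd.DataFrame({'Price': ['$100', '$200']})

# Strip the currency symbol as a literal character, not a regex pattern.
cleaned = df['Price'].str.replace('$', '', regex=False)

# Convert the remaining digit strings to numbers.
df['Price'] = pd.to_numeric(cleaned)

print(df['Price'].tolist())
```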
33

Question 33

What is the dtype 'object' usually used for in Pandas?

A
Complex numbers.
B
Strings or mixed data types.
C
Binary data.
D
Optimized integers.
34

Question 34

How do you convert a column 'ID' from float to integer?

python

import pandas as pd
df = pd.DataFrame({'ID': [1.0, 2.0, 3.0]})
                
A
df['ID'] = df['ID'].astype(int)
B
df['ID'].to_int()
C
df['ID'].format('int')
D
It happens automatically.
35

Question 35

Why might a column of integers become floats automatically?

A
If the numbers are too large.
B
If you introduce a NaN value into the column.
C
If you rename the column.
D
It never happens.
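
A sketch of the upcast, using `reindex` as one convenient way to introduce a missing value into a made-up integer Series. It also shows the nullable `Int64` dtype, which can hold missing values without switching to float.

```python
import pandas as pd

s = pd.Series([1, 2, 3])
before = s.dtype

# Label 3 does not exist in s, so reindex fills that slot with NaN.
with_gap = s.reindex([0, 1, 2, 3])
after = with_gap.dtype

# The nullable extension dtype keeps integers and stores the gap as pd.NA.
nullable = pd.Series([1, 2, None], dtype='Int64')

print(before, after, nullable.dtype)
```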
