Softenant Technologies
Data Cleaning MCQs (25) — Answers at End

1) What is Data Cleaning?

A. Creating new data
B. Removing errors and improving data quality
C. Encrypting data
D. Uploading data to cloud

2) Missing values are also called?

A. Null/NA values
B. Unique values
C. Primary keys
D. Index values

3) Which is a common way to handle missing values?

A. Imputation (mean/median/mode)
B. Increase duplicates
C. Convert to images
D. Disable dataset

4) Removing duplicate rows helps to?

A. Reduce data quality
B. Improve accuracy
C. Increase noise
D. Change data type

5) Outliers are?

A. Typical values
B. Extreme/unusual values
C. Duplicate values
D. Null values

6) Which method helps detect outliers visually?

A. Box plot
B. Pie chart
C. Word document
D. Image filter

7) Standardization means?

A. Always rescale values to the 0 to 1 range
B. Rescale data to mean 0 and standard deviation 1
C. Replace text with numbers only
D. Remove columns

8) Normalization usually means?

A. Scaling to a fixed range like 0 to 1
B. Only removing duplicates
C. Only removing nulls
D. Only converting datatypes

9) Which is an example of data type conversion?

A. “25” (text) → 25 (number)
B. 25 → “apple”
C. Image → PDF only
D. Text → video

10) Trimming is used to remove?

A. Spaces before/after text
B. Numbers
C. Dates
D. Rows

11) Handling inconsistent categories means?

A. Making labels consistent (e.g., mapping Male, M, and male to one label)
B. Adding more random labels
C. Encrypting labels
D. Removing dataset

12) Data validation checks?

A. Correctness and allowed values
B. Only file size
C. Only colors
D. Only font style

13) Removing irrelevant columns helps to?

A. Reduce noise
B. Increase errors
C. Increase file size
D. Reduce accuracy

14) Fixing wrong or inconsistent date formats is an example of?

A. Data formatting
B. Data encryption
C. Data backup
D. Data duplication

15) What is a common issue in text data?

A. Typos/spelling mistakes
B. CPU overheating
C. Internet speed
D. Screen resolution

16) Removing special characters is part of?

A. Text preprocessing
B. Cloud hosting
C. Networking
D. Hardware setup

17) What is data profiling?

A. Understanding data quality/statistics
B. Designing posters
C. Creating passwords
D. Installing software

18) Which is an example of a range check?

A. Age must be 0 to 120
B. Age must be a color
C. Age must be a file
D. Age must be a photo

19) Which is an example of a consistency check?

A. Checking that the city belongs to the stated state
B. Watching videos
C. Changing wallpapers
D. Playing games

20) Data deduplication means?

A. Removing duplicates
B. Adding duplicates
C. Encrypting duplicates
D. Hiding duplicates

21) Which is a common Excel feature for data cleaning?

A. Remove Duplicates
B. Paint
C. Notepad only
D. Camera

22) In Power BI, data cleaning is mostly done in?

A. Power Query
B. DAX only
C. Dashboard view only
D. Report export

23) In Python, data cleaning commonly uses?

A. Pandas
B. MS Paint
C. Calculator
D. Windows Media Player

24) Why do we clean data before analysis?

A. To improve accuracy and reliability
B. To reduce internet cost
C. To increase errors
D. To avoid visualization

25) Best practice in data cleaning?

A. Keep a backup of the original raw data
B. Delete raw data immediately
C. Never validate data
D. Always ignore missing values

Answer Key

1) B

2) A

3) A
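
Why A: imputation fills each gap with a typical value taken from the observed data. A minimal sketch in plain Python, assuming a made-up list of ages where None marks a missing value (the variable names are illustrative only):

```python
from statistics import mean, median, mode

# Hypothetical column of ages; None marks a missing value.
ages = [25, None, 31, 40, None, 31]

# Candidate fill values are computed from the observed entries only.
observed = [a for a in ages if a is not None]
fill_mean = mean(observed)
fill_median = median(observed)
fill_mode = mode(observed)

# Here the median is used as the fill value; mean or mode work the same way.
imputed = [a if a is not None else fill_median for a in ages]
print(fill_mean, fill_median, fill_mode)
print(imputed)
```

The median is often preferred when the column contains extreme values, because a few very large or very small entries pull the mean much more than the median.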

4) B
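
Why B: a duplicated row is counted twice in every total, count, and average. The sketch below assumes a hypothetical list of (order id, amount) rows in which one row was loaded twice:

```python
# Hypothetical order rows; the third row is an accidental copy of the first.
orders = [
    ("ORD-1", 250.0),
    ("ORD-2", 100.0),
    ("ORD-1", 250.0),
]

total_with_duplicates = sum(amount for _, amount in orders)

# dict.fromkeys keeps the first occurrence of each row and preserves order.
unique_orders = list(dict.fromkeys(orders))
total_clean = sum(amount for _, amount in unique_orders)

print(total_with_duplicates, total_clean)
```

This only removes rows that are identical in every field. Near-duplicates (for example the same order with a differently spelled customer name) need a matching rule of their own.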

5) B

6) A
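
Why A: a box plot draws the quartiles and marks points far outside them. A common convention, assumed here, flags values more than 1.5 times the interquartile range beyond the first or third quartile. The data is made up, and plotting tools may compute quartiles slightly differently:

```python
from statistics import quantiles

# Hypothetical measurements with one suspicious value.
values = [12, 14, 15, 15, 16, 17, 18, 19, 95]

# quantiles(..., n=4) returns the three quartile cut points.
q1, q2, q3 = quantiles(values, n=4)
iqr = q3 - q1

# Fences under the assumed 1.5 * IQR convention.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [v for v in values if v < lower_fence or v > upper_fence]
print(outliers)
```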

7) B
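
Why B: standardization subtracts the mean and divides by the standard deviation, so the result has mean 0 and standard deviation 1. A small sketch with made-up scores, assuming the population standard deviation (the sample version, statistics.stdev, is also widely used):

```python
from statistics import mean, pstdev

# Hypothetical scores.
scores = [10, 20, 30, 40, 50]

mu = mean(scores)
sigma = pstdev(scores)  # population standard deviation (an assumption)

# z-score: how many standard deviations each value lies from the mean.
z_scores = [(x - mu) / sigma for x in scores]
print([round(z, 3) for z in z_scores])
```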

8) A
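
Why A: min-max normalization maps the smallest value to 0 and the largest to 1. A minimal sketch with made-up values; min_max_scale is a hypothetical helper name:

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum becomes 0 and the maximum 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so there is no range to scale by.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_scale([10, 20, 30, 40, 50])
print(scaled)
```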

9) A
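
Why A: numbers stored as text cannot be summed or compared numerically until they are converted. The sketch below assumes that unparseable entries should become None instead of stopping the run; to_int is a hypothetical helper:

```python
def to_int(text):
    """Convert a text value to int, or return None if it is not a whole number."""
    try:
        return int(text.strip())
    except ValueError:
        return None

# Hypothetical raw column read from a file as text.
raw = ["25", " 40 ", "n/a", "31"]
converted = [to_int(v) for v in raw]
print(converted)
```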

10) A

11) A
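
Why A: variants such as "Male", "M", and " male" describe the same category and should map to one label. A minimal sketch, assuming a hand-written mapping table (GENDER_MAP and clean_label are illustrative names) and that unrecognized labels are marked "Unknown":

```python
# Hypothetical mapping from lower-cased variants to one canonical label.
GENDER_MAP = {"m": "Male", "male": "Male", "f": "Female", "female": "Female"}

def clean_label(raw):
    """Trim spaces, ignore case, and map the label to its canonical form."""
    key = raw.strip().lower()
    return GENDER_MAP.get(key, "Unknown")

labels = ["Male", "M", " male", "FEMALE", "f", "x"]
cleaned = [clean_label(v) for v in labels]
print(cleaned)
```

The strip() call is the trimming step from question 10: without it, " male" would not match any key in the table.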

12) A

13) A

14) A
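
Why A: the same date written in different layouts must be brought to one format before sorting or filtering. The sketch below assumes three known input layouts, all day-first where ambiguous, and converts them to ISO format (YYYY-MM-DD); KNOWN_FORMATS and to_iso are hypothetical names:

```python
from datetime import datetime

# Assumed input layouts; a real dataset needs its own list.
KNOWN_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y"]

def to_iso(text):
    """Return the date as YYYY-MM-DD, or None if no known layout matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None

raw_dates = ["2024-03-15", "15/03/2024", "15.03.2024", "March 15th"]
iso_dates = [to_iso(d) for d in raw_dates]
print(iso_dates)
```

A value such as 03/04/2024 is ambiguous between day-first and month-first conventions, so the source of the data has to settle which one applies.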

15) A

16) A
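
Why A: stripping punctuation and symbols is a typical text preprocessing step. A minimal sketch, assuming that only ASCII letters, digits, and spaces should be kept (that rule would also remove accented letters, so it is not suitable for every language):

```python
import re

def remove_special_characters(text):
    """Keep ASCII letters, digits, and spaces; collapse repeated spaces."""
    kept = re.sub(r"[^A-Za-z0-9 ]+", "", text)
    return " ".join(kept.split())

cleaned_text = remove_special_characters("Hello!!  World @2024 #data")
print(cleaned_text)
```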

17) A
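
Why A: profiling summarizes each column (row count, missing values, distinct values, basic statistics) before any cleaning starts. A small sketch over made-up records; profile is a hypothetical helper:

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
rows = [
    {"age": 25, "city": "Hyderabad"},
    {"age": None, "city": "Pune"},
    {"age": 40, "city": "Hyderabad"},
    {"age": 31, "city": None},
]

def profile(rows, column):
    """Summarize one column: size, missing count, distinct count, basic stats."""
    values = [r[column] for r in rows]
    present = [v for v in values if v is not None]
    summary = {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }
    # Numeric statistics only make sense for numeric columns.
    if present and all(isinstance(v, (int, float)) for v in present):
        summary.update(min=min(present), max=max(present), mean=mean(present))
    return summary

age_profile = profile(rows, "age")
city_profile = profile(rows, "city")
print(age_profile)
print(city_profile)
```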

18) A
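
Why A: a range check rejects values outside the allowed limits. A minimal sketch using the 0 to 120 limits from the question and made-up ages:

```python
MIN_AGE, MAX_AGE = 0, 120  # limits taken from the question

# Hypothetical ages to validate.
ages = [34, -2, 120, 150, 0]

valid = [a for a in ages if MIN_AGE <= a <= MAX_AGE]
invalid = [a for a in ages if not (MIN_AGE <= a <= MAX_AGE)]
print(valid, invalid)
```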

19) A
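
Why A: a consistency check compares related fields against each other. The sketch below assumes a small reference table of cities and their states (CITY_TO_STATE is an illustrative name) and flags records where the two fields disagree:

```python
# Small assumed reference table; a real check would use a complete one.
CITY_TO_STATE = {
    "Hyderabad": "Telangana",
    "Pune": "Maharashtra",
    "Chennai": "Tamil Nadu",
}

# Hypothetical records to check.
records = [
    {"city": "Hyderabad", "state": "Telangana"},
    {"city": "Pune", "state": "Telangana"},  # inconsistent pair
    {"city": "Chennai", "state": "Tamil Nadu"},
]

# Cities missing from the reference table are flagged as well,
# because .get() returns None for them.
mismatches = [r for r in records if CITY_TO_STATE.get(r["city"]) != r["state"]]
print(mismatches)
```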

20) A

21) A

22) A

23) A

24) A

25) A