NaN is a floating-point value in Python that stands for “Not a Number”. It is usually treated as a “missing value” or “undefined value” in a pandas DataFrame. Programmers often want to remove all the missing values from a dataset or replace the “NaN” values with a defined value before data analysis. To do that, they first need to check for NaN values in the dataset and then handle them.

This article explains how a programmer can check for NaN values. To search for “NaN” values in a dataset in Python, the user can implement the approaches listed below:

  • How to Check For NaN Values in Python? 
  • Check for NaN Values in Pandas DataFrame
  • Search for NaN Values in Numpy 
  • Using the “cmath” library to Check for NaN Values 
  • Using a User-Defined Function to Check for NaN Values
  • Bonus Tip: NaN Vs. None in Python

How to Check for NaN Values in Python? 

The need to check for “NaN” values in Python arises when the programmer wants to clean a dataset of irrelevant features and missing values. Missing values can distort computed statistics and lead to misleading results. Identifying and handling “NaN” values helps the coder visualize the data more reliably and draw useful insights from it.

Python offers different functions and libraries to check for NaN values in a dataset. In the following sections, we will apply these functions and libraries in different use cases.

Check for NaN Values in Pandas DataFrame

Users often encounter NaN values while working with a Pandas DataFrame. Pandas gives programmers flexible ways to search for all the missing values in a DataFrame. Doing so helps them avoid skewed statistics and errors in later processing, such as model training, that arise from missing values in the dataset. To check for NaN values in a Pandas DataFrame, follow the instructions below:

Importing Dataset to the Python Script

First, import the dataset of “CSV” format into the Python script using the “read_csv()” function in Pandas. Here’s how you can do this:

#import pandas library to the Python script
#use pnd as a shorthand
import pandas as pnd

#reading dataset with pandas library
df = pnd.read_csv('tested.csv')

#head() displays the first 5 rows from the dataset on the console
df.head()

To load and read the “CSV” file with the Pandas library, follow the instructions below:

  • First, import the Pandas library, which is needed for reading “CSV” files in the Python script. Use the “import” keyword for this purpose.
  • In the above example, “pandas” is imported under the shorthand “pnd” to keep the code concise.
  • Next, load the “CSV” file from your path using the Pandas “read_csv()” function. Wrap the name of the selected “CSV” file in quotation marks and pass it within the parentheses (). Doing this converts the dataset to a Pandas DataFrame.
  • To display the first five rows of the DataFrame, use the “head()” method.

The above example code demonstrates loading and reading the “CSV” file into the Python script using the “read_csv()” function. Displaying the parsed DataFrame reveals the presence of “NaN” values in this particular dataset.
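Since the “tested.csv” file may not be available to every reader, the same loading step can be sketched with a small in-memory CSV. The column names and values below are made up for illustration and are not taken from the original dataset; the “pd” alias is also just a choice for this sketch:

```python
import io

import pandas as pd

# Hypothetical CSV content; empty fields are parsed as NaN by read_csv()
csv_text = """PassengerId,Age,Cabin
1,34.5,
2,47.0,B45
3,,
"""

# read_csv() accepts a file path or any file-like object
sample_df = pd.read_csv(io.StringIO(csv_text))

# head() returns the first five rows by default (only three exist here)
print(sample_df.head())
```

In this sketch, the empty “Cabin” and “Age” fields show up as “NaN” in the printed DataFrame, which mirrors what happens with missing entries in a real CSV file.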

Output

The output presents the first five rows because the “head()” function is used in the above code. In the output, look at the “Cabin” column, which contains a “NaN” value:

However, the other missing values in the dataset are not visible here, and they are difficult to spot by eye in a large dataset. To search for “NaN” values across the entire dataset and to get their total count, proceed to the approaches below.

Using the “isna()” Function to Check NaN Values

To check for “NaN” values in Python, use the pandas “isna()” function. It returns boolean values (True, False) in an object of the same shape as its input, such as a “Series” or “DataFrame”. To check for NaN values, pass the pandas DataFrame within its parentheses () as an argument:

import pandas as pd

#using isna() function that returns output as boolean  
check_for_nan= pd.isna(df)

#head() function will display the output of first 5 rows
NAN_values=check_for_nan.head()

#printing the results
print(f"\nCheck for Nan in a dataset using pandas: \n{(NAN_values)}")

Output

The output shows that the elements of the dataset that contain NaN values are represented as “True”, and all other elements as “False”:
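For readers who want to try this without the CSV file, here is a minimal sketch on a hand-built DataFrame. The column names and values are hypothetical. It also shows the method form, “df.isna()”, which gives the same result as “pd.isna(df)”:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value per column
small_df = pd.DataFrame({
    "Age": [22.0, np.nan, 35.0],
    "Cabin": ["C85", "E46", None],
})

# DataFrame.isna() is the method form of pd.isna(df)
mask = small_df.isna()

# Each cell is True where the original value is missing
print(mask)
```

Both “np.nan” and “None” are reported as missing here, so the mask marks the second “Age” entry and the third “Cabin” entry as “True”.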

Using the “isna()” Along With “sum()” Function to Check and Calculate NaN Values

The “isna()” function can be combined with the “sum()” function to count the NaN values in each column of the given dataset. Here is a practical demonstration of this concept:

#import pandas library to the Python Script
#use pnd as shorthand
import pandas as pnd

#reading dataset with pandas library
df = pnd.read_csv('tested.csv')

#using isna() function along with sum() function 
check_for_nan= pnd.isna(df).sum()

#printing the results using f-string format
print(f"Check for Nan in a dataset using pandas: \n{(check_for_nan)}")

In the above code:

  • After importing the dataset as done in the above sections, call the pandas “isna()” function and chain the “sum()” method onto its result using the dot (.) operator.
  • Remember to pass the DataFrame as an argument within the parentheses () of the “isna()” function.
  • After that, print the output in the preferred format using an “f-string” literal with the “print()” function.

Output  

The output below shows the count of “NaN” values for each column in the dataset, obtained using the “sum()” function with “isna()”:
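As a small, self-contained sketch of the same idea, the per-column counts can be computed on a hand-built DataFrame and then filtered to show only the columns that actually contain missing values. The data and variable names below are assumptions made for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical data: two NaN values in "Age", one in "Fare", none in "Sex"
df_demo = pd.DataFrame({
    "Age": [22.0, np.nan, np.nan, 41.0],
    "Fare": [7.25, 8.05, np.nan, 13.0],
    "Sex": ["male", "female", "female", "male"],
})

# True counts as 1, so sum() yields the number of NaN values per column
nan_per_column = df_demo.isna().sum()

# Keep only the columns that have at least one NaN value
only_missing = nan_per_column[nan_per_column > 0]

print(only_missing)
```

Filtering the counts this way can make the result easier to read when a dataset has many columns and only a few of them contain missing values.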

To Check the Total Count of NaN Values in a Pandas DataFrame

To get the total count of NaN values in the whole dataset, the programmer can chain the “sum()” function twice after the “isna()” function as follows:

#import pandas library to the Python script
#use pnd as a shorthand
import pandas as pnd

#reading dataset with pandas library
df = pnd.read_csv('tested.csv')

#using isna() and two chained sum() functions
check_for_nan = pnd.isna(df).sum().sum()

#printing results
print(f"Check for Nan in a dataset using pandas: \n{(check_for_nan)}")

In the above code:

  • When the programmer is interested in the total count of “NaN” values in a dataset, they can call the “sum()” function twice in a chain.
  • The first “sum()” function returns the count of NaN values for each column in the DataFrame. The second “sum()” function adds up those per-column counts and returns the total count of missing values in the entire dataset.

Output

The output shows the total count of “NaN” values in the entire dataset, obtained by chaining “isna().sum().sum()” with the dot (.) notation:
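The chained call can be checked on a small example where the expected total is easy to count by hand. The sketch below uses hypothetical data and also shows one possible alternative, which converts the boolean mask to a NumPy array and sums it in a single step:

```python
import numpy as np
import pandas as pd

# Hypothetical data with three NaN values in total
df_total = pd.DataFrame({
    "Age": [22.0, np.nan, np.nan],
    "Fare": [7.25, np.nan, 13.0],
})

# First sum(): NaN count per column; second sum(): total over all columns
total_nan = int(df_total.isna().sum().sum())

# Alternative: sum the whole boolean mask as a NumPy array
total_nan_alt = int(df_total.isna().to_numpy().sum())

print(total_nan, total_nan_alt)  # 3 3
```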

Using the “isna()” Along With “sum()” and “any()” Functions to Check NaN Values

To check whether the DataFrame contains any NaN value at all, use the “any()” function in conjunction with the “sum()” and “isna()” functions, chained with the dot (.) operator:

#import pandas library to the Python script
#use pnd as shorthand
import pandas as pnd

#reading dataset with pandas library
df = pnd.read_csv('tested.csv')

#using isna(), sum() and any() functions
#any() returns True if at least one per-column count is non-zero
check_for_nan = pnd.isna(df).sum().any()
print(f"Check for Nan in a dataset using pandas: \n{(check_for_nan)}")

In the above code:

  • The required module is imported.
  • The desired dataset is read using the read_csv() function.
  • The DataFrame is passed as an argument to the isna() function, and the sum() and any() functions are chained onto the result to determine whether the dataset contains any NaN values.

Output

The output is a single boolean value. “True” indicates that the particular pandas DataFrame has “NaN” values:
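A commonly used variation skips the count and chains “any()” twice instead, once per column and once across the columns. The sketch below compares a DataFrame that has a missing value with one that does not; both DataFrames are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical frames: one with a missing value, one without
with_nan = pd.DataFrame({"a": [1.0, np.nan], "b": [3.0, 4.0]})
without_nan = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})

# any() per column, then any() across the columns
has_nan = bool(with_nan.isna().any().any())
no_nan = bool(without_nan.isna().any().any())

# Per-column view: names of the columns that contain at least one NaN
columns_with_nan = with_nan.columns[with_nan.isna().any()].tolist()

print(has_nan, no_nan, columns_with_nan)  # True False ['a']
```

The per-column form can be handy when the next step is to decide which columns to clean or drop.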

Search for NaN Values in Numpy

NumPy is a library for handling arrays, matrices, and vectors in Python. If an array contains a missing value, computations on it (such as a sum or a mean) return “NaN”, and a model trained on it may fail or give unreliable results. In such scenarios, the programmer needs to check whether the array is free from “NaN” values:

#import the numpy library in the Python script for handling arrays
import numpy as np

# Create a Numpy array with 'NAN' values
array = np.array([1, 2, np.nan, 4, 5])

# check how many 'NaN' values are present in the array
print(np.isnan(array).sum())

To check for NaN values in NumPy, follow the instructions below:

  • Import the numpy library into the Python script using the “import” keyword. Import it under the shorthand name “np” to keep the script concise.
  • Then, create the array by passing a list of values to the array() function as an argument.
  • Call the “isnan()” function with the created array variable as its argument. It returns a boolean array that is “True” wherever the value is NaN.
  • After that, chain the sum() function onto the result of isnan() using the dot (.) operator. Because “True” counts as 1, the “sum()” function returns the total count of NaN values as an output.
  • Print the output using the “print()” function. You can use an f-string to present the output in a desired format.

The above code demonstrates that “isnan().sum()” can be used to check for and count “NaN” values in NumPy.

Output

The output depicts that the array contains only one “NaN” value:  
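Beyond counting, the boolean mask returned by “isnan()” can also be used to locate the NaN values or to filter them out. The following is a minimal sketch with a hypothetical array; the variable names are illustrative only:

```python
import numpy as np

# Hypothetical array with two NaN values
values = np.array([1.0, np.nan, 3.0, np.nan, 5.0])

# Boolean mask: True where the element is NaN
nan_mask = np.isnan(values)

# Count of NaN values
nan_count = int(nan_mask.sum())

# Indices at which the NaN values occur
nan_positions = np.flatnonzero(nan_mask).tolist()

# Copy of the array without the NaN values
clean_values = values[~nan_mask]

print(nan_count, nan_positions, clean_values)
```

Note that “np.isnan()” expects numeric input, so it may raise an error on arrays of strings or generic Python objects.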

Using “cmath” Library to Check for NaN Values 

As discussed earlier, “NaN” is a numeric value of class “float”. Sometimes data is fetched to perform mathematical computations for model validation. In such scenarios, a missing or undefined value (such as the result of zero divided by zero, or another unrepresentable value) in the dataset or array may lead to wrong validation results. To get the desired results, check for “NaN” values before training a model:

#import cmath library in the Python script for computation
import cmath

#create a float variable that holds a NaN value
real_num = float("nan")

#if..else statement to check for a NaN value
#isnan() returns True if the number is NaN and False otherwise
if cmath.isnan(real_num):
    print("NAN value is detected")
else:
    print("NAN value is not detected")

To check for NaN values with the “cmath” library, follow the straightforward instructions below:

  • Import the “math” or “cmath” library within a program using the “import” keyword.
  • To check for a NaN value, call the isnan() function from the “cmath” library using the dot (.) operator and pass the value as an argument.
  • Use the if…else statement to branch on the result of the isnan() function for the defined value, and print the result in the form of a string.

The above instructions demonstrate that to check for “NaN” values with “cmath” in Python, pass the numeric value to the “isnan()” function and test its result with the if…else statement.

Output

The if statement executes and prints the corresponding message on the output:
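Since the instructions mention both “math” and “cmath”, a short comparison may help. The sketch below assumes only the standard library: “math.isnan()” works on real numbers, while “cmath.isnan()” also accepts complex numbers and reports “True” if either the real or the imaginary part is NaN. The variable names are illustrative:

```python
import cmath
import math

real_nan = float("nan")
complex_nan = complex(float("nan"), 1.0)  # NaN in the real part
plain_complex = complex(1.0, 2.0)         # no NaN in either part

# math.isnan() handles real numbers
real_check = math.isnan(real_nan)

# cmath.isnan() handles real and complex numbers
cmath_real_check = cmath.isnan(real_nan)
cmath_complex_check = cmath.isnan(complex_nan)
cmath_plain_check = cmath.isnan(plain_complex)

print(real_check, cmath_real_check, cmath_complex_check, cmath_plain_check)
# True True True False
```

For ordinary float data, “math.isnan()” is usually sufficient; “cmath.isnan()” becomes relevant when complex numbers are involved.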

Using a User-Defined Function to Check for NaN Values

To check for NaN values using a user-defined function, call it by its name and pass the argument within parentheses () explicitly. The function below relies on the fact that NaN is the only float value that is not equal to itself, so the comparison in the “return” statement yields “True” only when a NaN value is passed:

#define a function using 'def' keyword
def is_NaN_(df):
    #NaN is not equal to itself, so this returns the output as a boolean
    return df != df

#create a float variable that holds a NaN value
check_for_nan = float("nan")

#passing an argument to function that will check that the variable contains nan values
check_for_nan_values = is_NaN_(check_for_nan)
print(check_for_nan_values)

For demonstration, follow the instructions below:

  • Define a function using the “def” keyword, with one parameter within its parentheses.
  • The return statement compares the passed value with itself using the “!=” operator. The comparison is “True” only for NaN.
  • Then, construct a NaN value of float type with float("nan") and save it to a variable.
  • Invoke the function by passing the created variable as an argument. The value should be of float type to check for “NaN” values.
  • Print the results.

The example code demonstrates the use of a function defined with “def” and a return statement to check for NaN values in Python.

Output

The output is returned as a boolean value:
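To see how the self-comparison behaves on other kinds of values, the sketch below applies a similar function to a few hypothetical samples. The function name “is_nan” and the sample list are chosen for illustration:

```python
def is_nan(value):
    # NaN is the only float value that is not equal to itself
    return value != value


# Hypothetical samples of different types
samples = [float("nan"), 1.5, 0, "text", None]

# Apply the check to each sample
results = [is_nan(sample) for sample in samples]

print(results)  # [True, False, False, False, False]
```

This approach is meant for single scalar values. If a NumPy array or a pandas object is passed, the comparison is applied element by element and returns an array-like result rather than a single boolean.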

Bonus Tip: NaN Vs. None in Python

“NaN” is a placeholder for missing numeric values. In Python, it is a value of the float data type that represents an undefined or unrepresentable result (such as 0/0 in floating-point arithmetic), as shown in the code example below:

import numpy as np
type(np.nan)

The “NaN” is of a float type that belongs to the class “float”.

Output

“None” is used to define the null object in Python. It is not a numeric value; it represents an empty, non-existent value in data. “None” belongs to the “NoneType” class in Python:

type(None)

Output
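The difference between the two also shows up in comparisons. As a minimal sketch, assuming NumPy and pandas are installed: “NaN” is not equal to itself, “None” is, and pandas treats both as missing values. The variable names are illustrative:

```python
import numpy as np
import pandas as pd

# NaN never compares equal, not even to itself
nan_equals_nan = np.nan == np.nan

# None is a singleton object, so it is checked with "is"
none_is_none = None is None

# pandas reports both NaN and None as missing
nan_is_missing = pd.isna(np.nan)
none_is_missing = pd.isna(None)

print(nan_equals_nan, none_is_none, nan_is_missing, none_is_missing)
# False True True True
```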

This article is all about checking for “NaN” values in Python.

Conclusion

To check for NaN values in Python, the user can utilize isna() to handle DataFrames in Pandas, isnan() when dealing with mathematical problems or arrays in NumPy, or an explicitly defined function. “NaN” values can be tricky to identify in a large dataset, and leaving them unhandled can lead to inaccurate results. This article has demonstrated approaches to check for “NaN” values in Python.