Data Analysis and Visualization – Set 2:

Storing a DataFrame in CSV Format:

Pandas provides the function to_csv('filename', index=True|False) for writing a DataFrame to a CSV file. The filename is the name of the CSV file we want to create, and index controls whether the DataFrame's row index is written to the file. With index=False, the index is not written. The default value of index is True, in which case the index is written as the first column of the file.
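As a minimal sketch of the effect of the index argument, the snippet below writes the same DataFrame with and without its index. The sample data, the variable names, and the file name scores.csv are assumptions made only for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"name": ["Asha", "Ben"], "score": [91, 78]})

# With no path, to_csv returns the CSV text as a string.
# index=True (the default): the row index is written as the first column.
with_index = df.to_csv()

# index=False: only the DataFrame columns are written.
without_index = df.to_csv(index=False)

print(with_index)
print(without_index)

# With a path, to_csv writes the file to disk (here, a temporary folder).
path = os.path.join(tempfile.gettempdir(), "scores.csv")
df.to_csv(path, index=False)

# Reading the file back gives the original columns.
read_back = pd.read_csv(path)
print(read_back)
```

When the index is just the default 0, 1, 2, ... numbering, index=False usually gives a cleaner file, because reading it back does not produce an extra unnamed column.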


Handling Missing Data:

The data analysis phase includes handling missing data in the dataset, and Pandas provides the dropna and fillna methods for this purpose.

  • Dropping missing data:

With axis = 0, dropna drops the rows that contain missing values; with axis = 1, it drops the columns that contain missing values.
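The difference between the two axis settings of dropna can be sketched as follows. The DataFrame and its column names are made up for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in column A and one in column C.
df = pd.DataFrame({
    "A": [1.0, 2.0, np.nan],
    "B": [4.0, 5.0, 6.0],
    "C": [7.0, np.nan, 9.0],
})

# axis=0 (the default): drop every row that contains at least one NaN.
rows_dropped = df.dropna(axis=0)

# axis=1: drop every column that contains at least one NaN.
cols_dropped = df.dropna(axis=1)

print(rows_dropped)  # only row 0 has no missing value
print(cols_dropped)  # only column B has no missing value
```

dropna returns a new DataFrame, so df itself still holds all three rows and columns after both calls.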

  • Filling the missing values:

To replace NaN values with, for example, the mean or mode of the data, use fillna. It can replace all NaN values in a particular column or in the whole DataFrame.
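A short sketch of both cases, filling a single column and filling the whole DataFrame, is given below. The data and column names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values in every column.
df = pd.DataFrame({
    "age": [20.0, np.nan, 40.0],
    "height": [150.0, 160.0, np.nan],
    "city": ["Pune", "Pune", None],
})

# One column, filled with its mean.
age_filled = df["age"].fillna(df["age"].mean())

# One column, filled with its mode (most frequent value).
# mode() returns a Series, so take its first entry.
city_filled = df["city"].fillna(df["city"].mode()[0])

# Whole DataFrame: every numeric column is filled with its own mean.
# The non-numeric column "city" is left unchanged by this call.
all_filled = df.fillna(df.mean(numeric_only=True))

print(age_filled)
print(city_filled)
print(all_filled)
```

The mean only makes sense for numeric columns, which is why the text column is filled with the mode instead.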


  • Groupby Method (Aggregation):

The groupby method groups data together along rows or columns so that an aggregate function can be applied to each group when analyzing the data. The groups can be defined with a mapper (a dict or key function; the given function is applied to each group and the result is returned as a Series) or with one or more columns.
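Both ways of defining groups, by a column and by a mapper, can be sketched as below. The data, the column names, and the group labels in the dict are hypothetical.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame({
    "team": ["A", "B", "A", "B"],
    "points": [10, 20, 30, 40],
})

# Grouping by a column: rows with the same "team" value form one group,
# and mean() is the aggregate function applied to each group.
mean_by_team = df.groupby("team")["points"].mean()

# Grouping by a mapper: the dict assigns each index label to a group name.
mapper = {0: "first_half", 1: "first_half", 2: "second_half", 3: "second_half"}
sum_by_half = df["points"].groupby(mapper).sum()

print(mean_by_team)
print(sum_by_half)
```

In both cases the result is a Series indexed by the group labels. Other aggregate functions such as sum(), count(), min(), and max() can be used in the same way.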


To learn more about visualisation in Python for data science, you can check this and this as well. These blogs will help you understand the visualisation aspect of data science better and more easily.
