Data Analysis and Visualization – Set 1:

Python is well suited to data analysis because of its ecosystem of data-centric packages. Pandas is one such package, and it makes importing and analyzing data easy. Here we use Pandas to analyze the Country Data.csv file, one of the UN public data sets available from the website ‘statweb.stanford.edu’.

Installation of Pandas:

A common way to install pandas is with pip:

pip install pandas

Creation of DataFrame in Pandas:

A DataFrame can be created by passing multiple Series objects, each built with pd.Series, to the DataFrame class. Here two Series objects are passed in: s1 becomes the first row and s2 the second row.
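The original code listing did not survive, so the following is only a minimal sketch of the idea described above. The names s1 and s2 come from the text; the values stored in them are hypothetical.

```python
import pandas as pd

# Two Series with hypothetical values (the article's own values are not known).
s1 = pd.Series([1, 2])
s2 = pd.Series(["Ashish", "Sid"])

# Passing a list of Series to the DataFrame class turns each Series into one row:
# s1 is the first row and s2 is the second row.
df = pd.DataFrame([s1, s2])

print(df)
```

Because each Series becomes a row, the index of the Series (here the default 0 and 1) ends up as the column labels of the DataFrame.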


Importing Data with Pandas:

The first step is reading the data. The data is stored in a csv file, in which each row is separated by a new line and each column by a comma. To work with the data in Python, the csv file is read into a Pandas DataFrame. A DataFrame represents tabular data and, like a csv file, has rows and columns.
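As a rough illustration of this step, the sketch below reads csv text with pd.read_csv. To stay self-contained it uses a small in-memory table instead of the real file; the column names and values are hypothetical and are not taken from Country Data.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for a csv file: a header line, then one line per row,
# with commas separating the columns.
csv_text = """Country,Region,Population
Aland,North,1200
Borduria,East,3400
Cascara,South,560
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first rows of the table
print(df.shape)    # (number of rows, number of columns)
```

For the article's file, one would pass the path of the saved csv file to pd.read_csv instead of the in-memory text, then inspect df.shape to see the number of rows and columns.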

OUTPUT:


29, 10

Indexing DataFrames with Pandas:

The pandas.DataFrame.iloc indexer is used for indexing by position. It retrieves any number of rows and columns by their integer positions.
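A short sketch of position-based indexing with iloc follows. The table is hypothetical and exists only to make the example runnable.

```python
import pandas as pd

# Hypothetical table used only for illustration.
df = pd.DataFrame({
    "Country": ["Aland", "Borduria", "Cascara", "Deltora"],
    "GDP": [10.5, 20.0, 7.25, 15.0],
    "Population": [1200, 3400, 560, 980],
})

first_two_rows = df.iloc[0:2, :]   # rows at positions 0 and 1, all columns
last_two_cols = df.iloc[:, 1:3]    # all rows, columns at positions 1 and 2
single_value = df.iloc[2, 1]       # row position 2, column position 1

print(first_two_rows)
print(last_two_cols)
print(single_value)
```

As with ordinary Python slices, the end position of an iloc slice is excluded.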


Indexing using Labels in Python:

We can index with labels using the pandas.DataFrame.loc indexer, which selects rows and columns by label instead of by position.

A label-based selection of the first few rows is not much different from df.iloc[0:5,:]. This is because row labels can take any value, but here the row labels match the positions exactly. Column labels, on the other hand, make it much easier to pick out the data we want.
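The comparison between loc and iloc can be sketched as follows, again on a hypothetical table. One difference worth noting is that a loc slice includes its end label, while an iloc slice excludes its end position.

```python
import pandas as pd

# Hypothetical table used only for illustration.
df = pd.DataFrame({
    "Country": ["Aland", "Borduria", "Cascara", "Deltora"],
    "GDP": [10.5, 20.0, 7.25, 15.0],
})

# With the default index, row labels equal row positions.
by_label = df.loc[0:2, :]       # labels 0 through 2, end label included
by_position = df.iloc[0:3, :]   # positions 0, 1 and 2, end position excluded
print(by_label.equals(by_position))

# Column labels let us select data by name rather than by position.
gdp = df.loc[:, "GDP"]

# Row labels can be arbitrary values, for example country names.
indexed = df.set_index("Country")
borduria_gdp = indexed.loc["Borduria", "GDP"]
print(borduria_gdp)
```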


DataFrame Math with Pandas:

Pandas provides statistical functions, such as the mean, median, and standard deviation, that can be computed over the columns of a DataFrame.
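The sketch below shows a few of these statistical functions on a hypothetical numeric table; the column names and values are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical numeric table used only for illustration.
df = pd.DataFrame({
    "GDP": [10.0, 20.0, 30.0, 40.0],
    "Population": [100, 200, 300, 400],
})

col_means = df.mean()              # mean of each column
gdp_median = df["GDP"].median()    # median of a single column
gdp_std = df["GDP"].std()          # sample standard deviation (ddof=1)
summary = df.describe()            # count, mean, std, min, quartiles, max

print(col_means)
print(gdp_median, gdp_std)
print(summary)
```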


Pandas plotting:

In this section, we use the standard convention for referencing the matplotlib API (import matplotlib.pyplot as plt), which is the basis for creating plots in pandas.
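A minimal plotting sketch is given below. It assumes matplotlib is installed alongside pandas, uses a hypothetical table, and selects the non-interactive Agg backend so that it also runs on a machine without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical table used only for illustration.
df = pd.DataFrame({
    "Country": ["Aland", "Borduria", "Cascara", "Deltora"],
    "GDP": [10.5, 20.0, 7.25, 15.0],
})

# DataFrame.plot draws with matplotlib and returns the matplotlib Axes object.
ax = df.plot(x="Country", y="GDP", kind="bar", legend=False)
ax.set_ylabel("GDP (hypothetical units)")
ax.set_title("GDP by country")
plt.tight_layout()

# In an interactive session, plt.show() displays the figure;
# plt.savefig with a file name of your choice writes it to disk.
```

Since plot returns an ordinary matplotlib Axes, any further styling can be done through the usual matplotlib calls.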


