A Complete Beginner-Level Python Course to Learn Data Science and Machine Learning

Day 5: Pandas

Muhammad Umair
Python in Plain English


It is the fifth day of our journey to learn all the Python we need for Machine Learning and Data Science. All of my new code builds on the code from the earlier parts of this series. You can find them below.

Day 05

Pandas Series

Definition

A pandas Series is a one-dimensional array that can hold any data type, much like the arrays we saw in the NumPy module. The only difference is that each value in a Series is associated with an index label. By default the index runs from 0 to n-1, or you can specify your own labels.

A pandas Series is comparable to a single column in an Excel sheet. As depicted in the picture below, the Name, Age, and Designation columns each represent a Series.
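To make the definition concrete, here is a minimal sketch of the default index next to a custom one. The ages and the labels s1 to s3 are made up for illustration and are not taken from the picture.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
ages = [20, 22, 20]

# Without an explicit index, pandas labels the values 0 to n-1.
default_index = pd.Series(ages)

# With an explicit index, each value gets the label we choose.
custom_index = pd.Series(ages, index=["s1", "s2", "s3"])

print(list(default_index.index))  # [0, 1, 2]
print(list(custom_index.index))   # ['s1', 's2', 's3']
print(custom_index["s2"])         # 22
```

The values are identical in both Series; only the labels used to reach them differ.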

Purpose

A Series is a one-dimensional labeled array capable of holding data of any type (integer, string, float, Python objects, etc.). The axis labels are collectively called the index.
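As a small sketch of what "any type" means in practice, the snippet below builds three Series from made-up values and checks which dtype pandas picks for each. The variable names are only for illustration.

```python
import pandas as pd

# Hypothetical values, chosen only to show different dtypes.
ints = pd.Series([1, 2, 3])
floats = pd.Series([1.5, 2.5, 3.5])
mixed = pd.Series([1, "two", 3.0])  # numbers and a string together

# Whole numbers get an integer dtype and decimals a float dtype.
print(pd.api.types.is_integer_dtype(ints))   # True
print(pd.api.types.is_float_dtype(floats))   # True

# Mixed Python objects fall back to the generic object dtype.
print(mixed.dtype == object)                 # True
```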

Importance

Series is a very important data structure in pandas. It is like a column in an Excel sheet or a database table: it stores our data in a structured form so that the data remains categorized and readable.

Series are used in many ML algorithms to hold different types of data and to perform different operations on that data.

Strengths

It lets us store data in a labeled, column-like form, which makes the data more readable and easier to use.

Some algorithms built into Python libraries require the data in Series format, because data in a Series is easy to manipulate.

It is very easy to index a Series and get the relevant subset of information, even when it contains a large amount of data.

It is easy to copy a Series, make changes to the copy, and update it.
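The last two points can be sketched in a few lines. This is only an illustration with made-up numbers; it uses loc for label-based selection and copy for duplicating a Series.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
s = pd.Series([10, 20, 30, 40, 50], index=["a", "b", "c", "d", "e"])

# Pick a subset by label; label slices include both endpoints.
subset = s.loc["b":"d"]

# copy() gives an independent Series, so edits leave the original alone.
updated = s.copy()
updated.loc["c"] = 99

print(list(subset.index))                # ['b', 'c', 'd']
print(s.loc["c"], updated.loc["c"])      # 30 99
```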

Weakness

I would not say it has a real weakness, but there are some functions that allow manipulation of the Series data alongside other functions.

Example 01

Task

Create a pandas Series, inspect its index and values, and use slicing to get some data out of it.

Code

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Five random values drawn from the standard normal distribution
np.random.randn(5)

# A Series with custom string labels as its index
s = pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])
s
s.index   # Index(['a', 'b', 'c', 'd', 'e'], dtype='object')
s.values

# Without an index argument, the labels default to 0..n-1
pd.Series(np.random.randn(5))

# studentsDict and studentsResult are not defined in this part;
# they are assumed to come from the earlier parts of this series
studentsSeries = pd.Series(studentsDict)
studentsSeries
studensMarks = pd.Series(studentsResult)
studensMarks

# A set is unordered, so pandas raises a TypeError here
pd.Series({3.2, 3.06, 0.29, 0.36, 1}, index=['s1', 's2', 's3', 's4', 's5'])

# First element of the first record, then the first three records
studensMarks[0][0]
studensMarks[0:3]

Output

array([ 0.58100127, -0.08284663, -1.10031443, -1.3100632 ,  1.690669  ])

a    1.363077
b   -0.957584
c    0.192168
d   -0.441061
e   -0.904848
dtype: float64

array([ 1.36307667, -0.95758415,  0.19216809, -0.44106068, -0.90484761])

0    0.363901
1   -0.013482
2    1.105250
3    0.300175
4    0.867615
dtype: float64

Student1    {'name': 'Muhammad Umair', 'age': 20, 'Departm...
Student3    {'name': 'Muhammad Abdullah Tahir', 'age': 20,...
Student2    {'name': 'Hamdan Ijaz', 'age': 22, 'Department...
dtype: object

0    (Muhammad Umair, 3.06)
1    (Hamdad Ijaz, 2.8)
2    (Muhammad Abdullah Tahir, 2.7)
dtype: object

TypeError                                 Traceback (most recent call last)
<ipython-input-168-57b600e94cf5> in <module>
----> 1 pd.Series({3.2,3.06,0.29,0.36,1},index=['s1','s2','s3','s4','s5'])

~/anaconda3/lib/python3.8/site-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    272                 pass
    273             elif isinstance(data, (set, frozenset)):
--> 274                 raise TypeError(f"'{type(data).__name__}' type is unordered")
    275             elif isinstance(data, ABCSparseArray):
    276                 # handle sparse passed here (and force conversion)

TypeError: 'set' type is unordered

'Muhammad Umair'

0    (Muhammad Umair, 3.06)
1    (Hamdad Ijaz, 2.8)
2    (Muhammad Abdullah Tahir, 2.7)
dtype: object
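The TypeError in the output comes from passing a set, which has no order, so pandas cannot pair its values with the labels. A minimal sketch of one way around it, assuming the intent was to store those five marks under the labels s1 to s5, is to pass a list instead:

```python
import pandas as pd

labels = ["s1", "s2", "s3", "s4", "s5"]

# A list keeps its order, so each value can be paired with a label.
marks = pd.Series([3.2, 3.06, 0.29, 0.36, 1], index=labels)

# Passing a set with the same values is rejected with a TypeError.
try:
    pd.Series({3.2, 3.06, 0.29, 0.36, 1}, index=labels)
    set_rejected = False
except TypeError as err:
    set_rejected = True
    print(err)

print(marks["s2"])  # 3.06
```

A dict that maps each label to its mark would work as well, since a dict carries its own labels.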

Example 02

Task

Get data out of a Series based on a condition.

Code

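# Keep only the values greater than the median
s[s > s.median()]

# Select by position; iloc makes the positional lookup explicit
s.iloc[[4, 3, 1]]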

Output

a    1.363077
c    0.192168
dtype: float64

e   -0.904848
d   -0.441061
b   -0.957584
dtype: float64
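To see what happens behind the condition, the sketch below rebuilds a Series with the same values as the output above and splits the selection into steps. The names mask, by_position, and by_label are only for illustration.

```python
import pandas as pd

# Same values as the Series s shown in the output above.
s = pd.Series(
    [1.363077, -0.957584, 0.192168, -0.441061, -0.904848],
    index=["a", "b", "c", "d", "e"],
)

# The comparison produces a boolean Series, one True/False per label.
mask = s > s.median()
above_median = s[mask]

# The same three rows, selected once by position and once by label.
by_position = s.iloc[[4, 3, 1]]
by_label = s.loc[["e", "d", "b"]]

print(list(above_median.index))      # ['a', 'c']
print(by_position.equals(by_label))  # True
```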

Example 03

Task

Apply the exp function to the Series, then read and assign values by label.

Code

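# Element-wise exponential; the index labels are kept
np.exp(s)

# Read a value by label, then overwrite another one
s['a']
s['e'] = 12.
s

# Membership tests check the index labels
'e' in s

# 'f' is not in the index, so this lookup raises an error
s['f']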

Output

a    3.908199
b    0.383819
c    1.211874
d    0.643354
e    0.404604
dtype: float64

1.3630766691037222

a     1.363077
b    -0.957584
c     0.192168
d    -0.441061
e    12.000000
dtype: float64

True

TypeError                                 Traceback (most recent call last)
~/anaconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
   4410         try:
-> 4411             return libindex.get_value_at(s, key)
   4412         except IndexError:
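The traceback appears because the label 'f' is not in the index. A minimal sketch of guarding against that, using a shortened Series with three of the values from above, could look like this:

```python
import numpy as np
import pandas as pd

# Shortened copy of the data above, for illustration only.
s = pd.Series([1.363077, -0.957584, 0.192168], index=["a", "b", "c"])

# NumPy functions work element by element and keep the index labels.
exp_s = np.exp(s)

# Assigning to an existing label overwrites that value.
s["c"] = 12.0

# Check membership first, or catch the KeyError a missing label raises.
has_f = "f" in s
try:
    value = s["f"]
except KeyError:
    value = None

print(list(exp_s.index))  # ['a', 'b', 'c']
print(has_f, value)       # False None
```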

Example 04

Task

Get data out of a Series using the get function, and apply the +, *, and ** operators.

Code

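# get returns None for a missing label instead of raising an error
s.get('f')

# A second argument sets the value returned for a missing label
s.get('f', np.nan)

# Arithmetic is applied element by element
s + s
s * 2
s ** 2

# A new Series with a name
s = pd.Series(np.random.randn(5), name='something')
s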

Output

nan

a     2.726153
b    -1.915168
c     0.384336
d    -0.882121
e    24.000000
dtype: float64

a     2.726153
b    -1.915168
c     0.384336
d    -0.882121
e    24.000000
dtype: float64

a      1.857978
b      0.916967
c      0.036929
d      0.194535
e    144.000000
dtype: float64

0    1.278812
1   -0.416320
2    1.495156
3    0.313534
4   -1.240909
Name: something, dtype: float64
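The following sketch repeats the same ideas on a tiny made-up Series, so the results are easy to check by hand. The values and variable names are only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical values, small enough to verify by hand.
s = pd.Series([1.0, -2.0, 3.0], index=["a", "b", "c"])

# get never raises for a missing label; it returns None or the default.
missing = s.get("f")
fallback = s.get("f", np.nan)

# Operators act on every element and keep the labels.
doubled = s + s
squared = s ** 2

print(missing, fallback)         # None nan
print(doubled.equals(s * 2))     # True
print(list(squared))             # [1.0, 4.0, 9.0]
```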

Example 05

Task

Read the name attribute of the Series and use the rename function to get a copy with a different name.

Code

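# The name given when the Series was created
s.name

# rename returns a new Series; s keeps its original name
s2 = s.rename("different")
s2.name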

Output

'something'
'different'
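One detail worth seeing in code is that rename leaves the original Series untouched. The sketch below uses made-up values, and the in-place alternative at the end is an addition that the example above does not cover.

```python
import pandas as pd

# Hypothetical values; only the name matters here.
s = pd.Series([1, 2, 3], name="something")

# rename returns a new Series and leaves s as it was.
s2 = s.rename("different")
original_name = s.name

# Assigning to the name attribute changes the Series itself.
s.name = "renamed in place"

print(original_name)  # something
print(s2.name)        # different
print(s.name)         # renamed in place
```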


MERN Stack Developer | Software Engineer| Frontend & Backend Developer | Javascript, React JS, Express JS, Node JS, MongoDB, SQL, and Python