Pandas Tutorial #2 – Introduction to Series

In this part of the Pandas tutorials, we will learn about the basics of Pandas Series.

What is a Pandas Series?

Pandas provides a one-dimensional data structure called Series. It is a one-dimensional labeled array that can store elements of different data types, and each value in the Series has a label associated with it.

[Figure: Pandas Series object layout]

When a Series is displayed, the right-hand column shows the Series object's actual values, and the left-hand column shows the index label associated with each value.

We can access a value in a Series either by its label, much like a key in a hashmap, or by its index position.
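
As a quick illustration of both access styles, here is a minimal sketch. The sample names and labels are made up for this example; how the Series itself is created is covered in the sections that follow. It assumes label-based access through [] or .loc and position-based access through .iloc.

```python
import pandas as pd

# Hypothetical sample data: three names labeled 'a', 'b', 'c'
names = pd.Series(['Mark', 'Rita', 'Vicki'], index=['a', 'b', 'c'])

# Access by label, similar to a key lookup in a hashmap
by_label = names['b']

# .loc is the explicit label-based accessor
by_loc = names.loc['b']

# .iloc is the explicit position-based accessor (0 is the first element)
by_position = names.iloc[0]

print(by_label, by_loc, by_position)  # Rita Rita Mark
```

Using .loc and .iloc makes it explicit whether a lookup is by label or by position, which helps when the labels themselves are integers.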

Think of a Series as a column in an Excel sheet. In Excel, each cell in a column has a row label associated with it; similarly, each value in a Series has a label associated with it.

We can create a Series object from a list, a tuple, or a NumPy array. Let’s see some examples.

Create a Pandas Series object from List

First, we need to import the pandas module.

import pandas as pd

Here, pd is an alias for pandas. You can choose any other name for the alias, but pd is the de facto standard, and you will find it in most source code that uses pandas.

Pandas provides the Series() constructor, which accepts a sequence as an argument and returns a Series object containing the given elements. For example, we can pass a list to it and get a Series object like this:

import pandas as pd

# Create a Series object from a list
names = pd.Series(['Mark', 'Rita', 'Vicki', 'Justin', 'John', 'Michal'])

# Display the Pandas series object
print(names)

Output:

0      Mark
1      Rita
2     Vicki
3    Justin
4      John
5    Michal
dtype: object

It created a Series object initialized with all the values from the list and with default index labels. By default, the index labels are integers starting from 0, as in the above example.
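
A tuple can be passed in the same way as a list. The following is a minimal sketch with made-up city names, showing that the default index labels are again the positions starting from 0.

```python
import pandas as pd

# A tuple is accepted just like a list (sample values are hypothetical)
cities = pd.Series(('Tokyo', 'Paris', 'Delhi'))

# The default index labels are the integers 0, 1, 2
index_labels = list(cities.index)
print(index_labels)  # [0, 1, 2]

# With the default index, label 1 refers to the second value
print(cities[1])  # Paris
```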

What if we want custom index labels in the Series object? Pass the label names as the index parameter of Series(). For example,

import pandas as pd

# Create a Series object from a list
names = pd.Series(  ['Mark', 'Rita', 'Vicki', 'Justin', 'John', 'Michal'],
                    index = ['a', 'b', 'c', 'd', 'e', 'f'])

# Display the Pandas series object
print(names)

Output:

a      Mark
b      Rita
c     Vicki
d    Justin
e      John
f    Michal
dtype: object

It returned a Series object whose index labels are custom string values. In this Series object, each value has a custom label, i.e.,

  • Value ‘Mark’ has an index label ‘a’
  • Value ‘Rita’ has an index label ‘b’
  • Value ‘Vicki’ has an index label ‘c’
  • Value ‘Justin’ has an index label ‘d’
  • Value ‘John’ has an index label ‘e’
  • Value ‘Michal’ has an index label ‘f’

Later we will see how to access Series values using these label names. But before that, let’s see some other ways to create a Pandas Series object.

Create a Pandas Series object from NumPy Array

We can pass a numpy array to the Series() function to get a Series object,

import pandas as pd
import numpy as np

# Array of numbers
values = np.array([100, 200, 300, 400, 500, 600])

# Create a Series object from a NumPy Array
seriesObj = pd.Series(  values,
                        index = ['a', 'b', 'c', 'd', 'e', 'f'])

# Display the Pandas series object
print(seriesObj)

Output:

a    100
b    200
c    300
d    400
e    500
f    600
dtype: int32

Here, we created a Series object where values are of integer type and labels are of string type.

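When we printed the Series object, the last line of the output showed the data type of the elements, i.e., int32. Pandas deduced the data type of the values automatically while creating the Series object. Here it took the data type of the NumPy array, whose default integer width depends on the platform and NumPy version, so you may see int64 instead. If we want, we can also pass a different data type through the dtype argument while creating a Series object. For example,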

import pandas as pd
import numpy as np

# Array of numbers
values = np.array([100, 200, 300, 400, 500, 600])

# Create a Series object from a NumPy Array
seriesObj = pd.Series(  values,
                        index = ['a', 'b', 'c', 'd', 'e', 'f'],
                        dtype = float)

# Display the Pandas Series object
print(seriesObj) 

Output:

a    100.0
b    200.0
c    300.0
d    400.0
e    500.0
f    600.0
dtype: float64

Here the data type of the values in the Series object is float64 instead of an integer type. To check the data type of a Series object, use its dtype property. For example,

import pandas as pd

# Create a Series object of integers
seriesObj = pd.Series([100, 200, 300, 400, 500, 600])

# Display the data type of values in the Series
print(seriesObj.dtype) 

Output:

int64

Important Point:

Always pass the same number of values and index labels while creating a Series object from a list or array; otherwise, it will raise a ValueError. Let’s see an example,

import pandas as pd

# Create a Series object from a list
names = pd.Series(  ['Mark', 'Rita', 'Vicki', 'Justin', 'John', 'Michal'],
                    index = ['a', 'b', 'c'])

print(names)

Error

ValueError: Length of values (6) does not match length of index (3)

It raised the ValueError because the number of index labels (3) does not match the number of values (6).
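
If the values and labels come from different sources, one option is to catch the error instead of letting the program stop. This is only a sketch of that idea; the variable names values, labels, and message are chosen for this example.

```python
import pandas as pd

values = ['Mark', 'Rita', 'Vicki', 'Justin', 'John', 'Michal']
labels = ['a', 'b', 'c']  # deliberately shorter than values

try:
    names = pd.Series(values, index=labels)
except ValueError as err:
    # The lengths differ, so no Series is created
    names = None
    message = str(err)
    print('Could not create Series:', message)
```

Alternatively, comparing len(values) with len(labels) before calling Series() avoids the exception altogether.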

Create a Pandas Series object from a Dictionary

In Python, a dictionary stores data as key-value pairs. To create a Series object from a dictionary, just pass the dictionary object to Series(). It will return a Series object with the following data,

  • All the keys from the dictionary will be used as index labels for the Series object.
  • All the values from the dictionary will be used as values for the Series object.

For example,

import pandas as pd

dictObj = { 'a': 'Mark',
            'b': 'Rita',
            'c': 'Vicki',
            'd': 'Justin',
            'e': 'John',
            'f': 'Michal'}

# Create a Series object from a dictionary
names = pd.Series(dictObj)

# Display the Pandas series object
print(names)

Output:

a      Mark
b      Rita
c     Vicki
d    Justin
e      John
f    Michal
dtype: object

Here, the keys from the dictionary became the index labels, and the values from the dictionary became the values of the Series object.
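
One detail worth knowing, shown here as a small sketch rather than a full treatment: when a dictionary is combined with the index parameter, Pandas looks up each requested label among the dictionary keys instead of checking lengths. Keys that are not requested are dropped, and labels that are not keys get a missing value (NaN). The label 'z' below is made up for this example.

```python
import pandas as pd

dictObj = {'a': 'Mark', 'b': 'Rita', 'c': 'Vicki'}

# Request labels 'a', 'c' and 'z'; 'z' is not a key in the dictionary
names = pd.Series(dictObj, index=['a', 'c', 'z'])

print(names)

# 'b' was not requested, so it is not part of the Series
print('b' in names.index)  # False

# 'z' has no matching key, so its value is missing
print(pd.isna(names['z']))  # True
```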

Creating Series object with mixed data type values

A Series object can contain values of different data types. For example,

import pandas as pd

# Create a Series object with mixed data type values
seriesObj = pd.Series(['Mark', 100, 'Tokyo', 89.22])

print(seriesObj) 

Output:

0     Mark
1      100
2    Tokyo
3    89.22
dtype: object

This Series object contains values of string, integer, and float data types. Because the elements are of different data types, the Series uses the generic object data type.

What is the Object data type?

The object data type means that the Series keeps references to values stored elsewhere in memory rather than storing the values themselves. If a Series contains only elements that fit in equal-sized memory slots, such as only integers or only floats, then its data type is that specific type, for example int64 or float64. But if a Series contains different-sized strings or elements of mixed data types, then the dtype is object.
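
The difference can be seen by comparing the dtype of a few small Series. This is a minimal sketch with made-up values; the exact integer and float widths printed may depend on your platform and Pandas version, so they are not shown here.

```python
import pandas as pd

ints = pd.Series([100, 200, 300])        # only integers
floats = pd.Series([1.5, 2.5, 3.5])      # only floats
mixed = pd.Series(['Mark', 100, 89.22])  # mixed types

print(ints.dtype)    # an integer dtype
print(floats.dtype)  # a floating-point dtype
print(mixed.dtype)   # object

# In an object Series, each element keeps its own Python type
element_types = [type(value).__name__ for value in mixed]
print(element_types)  # ['str', 'int', 'float']
```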

Summary

In this article, we learned about the basics of Series in Pandas and how to create a Series object from a list, NumPy array, or dictionary.
