An Introduction to Python Data Structures — Hash-map, Tree, Graph

In this article, we will cover non-linear data structures in Python such as Tree, Hash-map, and Graph.

Nandit Shah
Python in Plain English

--

Prerequisite

You should have basic knowledge of the Python programming language.

The entire code for this article is available in my GitHub repo.

In the first part of this article, we covered the basics of data structures and linear data structures such as Stack, Queue, Linked-List, Arrays.

Now in this article, we will cover non-linear data structures such as Tree, Hash-map, and Graph.

What are Non-linear Data Structures?

A data structure in which data is not stored in a linear format like arrays or stack. This type of data structure is known as non-linear data structures.

Non-linear data structure is a recursive type of data structure.

In non-linear data structures, we cannot traverse the data structure in a linear format; there are multiple levels for storing the data.

The implementation of non-linear data structures is a little bit more complex than linear data structures.

There are various types of non-linear data structuresbut we will cover —

  1. Hash-map
  2. Tree
  3. Graph

Non-Linear Data Structures

1. Hash-Map —

Basically, Hash-map is a key-value pair type of data structure in which the key is calculated using the Hash function and then that data is stored as key (calculated by the Hash function).

Now, first we have to understand what is a hash function. So the hash function is the function that will calculate a key using any mathematical function so that we can store data at the key position. Here, the mathematical function could be anything. There is no fixed function for finding the key using a hash function.

But the main disadvantage of this hash function is that the hash function will return the same key for two different data. So there will be a collision at that key position. This is the major disadvantage of hash functions.

There are various ways to handle this collision. We will handle it by creating a list at that key if the hash function returns the same key for different data. And then append data to the list of that particular key.

Hash tables are generally implemented using arrays.

Hash-Map

Basic operations you can perform on Hash-map

  1. Insert — Insert Item into the hash-map
  2. Delete — Delete item from the hash-map
  3. Get Item — get item from hash-map using keys

Implementation of Hash-map using Arrays —

class Hashmap:
def __init__(self):
self.MAX = 100
self.arr = [[] for i in range(self.MAX)]
def get_hash(self,key):
hash_key = 0
for char in key:
hash_key += ord(char)
return hash_key % 100
def add(self,key,value):
hash_key = self.get_hash(key)
found = False

for index,element in enumerate(self.arr[hash_key]):
if element[0] == key and len(element) == 2:
self.arr[hash_key][index] = (key,value)
found = True
break
if not found:
self.arr[hash_key].append((key,value))

def get(self,key):
hash_key = self.get_hash(key)
for element in self.arr[hash_key]:
if element[0] == key:
return element[1]

def __delitem__(self,key):
hash_key = self.get_hash(key)
for index,element in enumerate(self.arr[hash_key]):
if element[0] == key:
del self.arr[hash_key][index]

def print_hm(self):
values = [i for i in self.arr if i != []]
return values

Code explanation —

  • First of all, we have to create a class named Hashmapand in that class, we have to initialize a list called self.arr of size self.MAX. And in that self.arr , we will have an empty nested list (List in list, so if self.MAX is 10 there will be 10 empty lists in self.arr).
  • Now, we will implement the most important method in Hash-map data structures which is get_hash. This method will take the key as an argument and calculate and return hash_key using ord(this function will return numeric value for given string key) function. So using this method, all the hash_key gets calculated.
  • Now the next method is add which is used to add the item to the hash map. This method will take data and key as a parameter. First, this method will calculate the hash value(or hash key) for that given key parameter using get_hash function. After that, we will traverse all the lists to find that hash_key position and check if there are any elements added at that hash_key before this data. And if there is an element at the given hash_key list, then we will append new data as a form of tuple in that list. This tuple will include key and data. And if there is no data at that hash_key, then we will simply add that data to that hash_key list.
  • Now, the next method is get which is used to fetch items from the hash map. This method will take key as a parameter. First, this method will calculate the hash value(or hash_key) for that given key parameter using get_hash function. After that, we will traverse through the list of a given hash_key position and check if there are any elements in that list and if there is (key, data) tuple available, then it will return data for the given key parameter.
  • Next is __delitem__. This method will take the key as a parameter and just like all other methods, it will calculate hash_key using get_hash method. After that, it will traverse through the list at given hash_key and find the tuple with the given parameter key and delete that item.
  • And the print_hm method is self-explanatory so no need to describe that.

2. Tree—

Tree data structure is a hierarchical type of data structure in which there is a main root node which has two fields — data and children. In the data field, it stores data and in children field, it stores its children nodes.

So mainly, a tree has two main parts — root node and sub-tree.

As the name suggests, trees have the same structure as a real tree but only upside-down.

Trees are a very broad data structure and I cannot describe everything about them on this blog.

There are many types of Trees —

  1. General Tree (this is the tree which parent node can have as many children as we want to add).
  2. Binary Tree (in this type of tree a parent node can only have 2 or less than 2 child).
  3. Binary Search Tree (this tree is the same as binary tree; the only difference is the left child node of the parent can have a value less than its parent and the right child node can have a value greater than its parent).

There are many other Trees such as Red-black tree, AVL tree, N-ary Tree, etc.

Tree Data Structure

Basic operations you can perform on a Tree —

  1. Add Child (Add the child to the node)
  2. Delete Child (Delete child from the node)
  3. Print (Print whole Tree)
  4. Get Level (Get the level of the node)
class TreeNode:
def __init__(self,data):
self.data = data
self.parent = None
self.children = []
# For Adding child to tree
def add_child(self,child):
child.parent = self
self.children.append(child)
# For Getting level to any node
def get_level(self):
level = 0
p=self.parent
while p:
p=p.parent
level+=1
return level
# For Printinf whole treedef print_tree(self):
spaces = " " * self.get_level() * 3
prefix = spaces+"|---" if self.parent else ''
print(prefix+self.data)
if self.children:
for child in self.children:
child.print_tree()
# For Deleting child from tree
def delete_child(self,child):
if child in self.children:
self.children.remove(child)
print(f"{child.data} Removed from f{child.parent.data}")
else:
print(f'There is no {child.data} in Tree')
#Example for making code explanation Easy
def electronincs_tree():
root = TreeNode('Electronics')

hp = TreeNode('HP')
mac = TreeNode('Mac-Book')
dell = TreeNode('Dell')
laptops = TreeNode('Laptops')
laptops.add_child(hp)
laptops.add_child(mac)
laptops.add_child(dell)

motorola = TreeNode('Motorola')
apple = TreeNode('Apple')
nokia = TreeNode('Nokia')
mobiles = TreeNode('Mobiles')
mobiles.add_child(motorola)
mobiles.add_child(apple)
mobiles.add_child(nokia)

reconnect = TreeNode('Reconnect')
lg = TreeNode('LG')
samsung = TreeNode('Samsung')
tvs = TreeNode('TVs')
tvs.add_child(reconnect)
tvs.add_child(lg)
tvs.add_child(samsung)

root.add_child(laptops)
root.add_child(mobiles)
root.add_child(tvs)

root.print_tree()

tvs.delete_child(lg)
root.print_tree()

electronincs_tree()

Code explanation —

  • So in this code explanation, I have taken used an example as otherwise it would be very difficult to explain using only code.
  • So firstly, we will create a class for the tree node. In this, we will initialize three variables self.data, self.parent, self.children using the __init__ method. Here, self.data will store data of that tree node. self.parent will store the reference to the parent of that tree node and self.children will store the list of all the child tree nodes of the current tree node.
  • Now, the first method that we will implement is add_child which has one parameter called child. This method will simply add a new child to the parent tree node (for example, in electronincs_tree method, first we have created a few instances of the tree node and then we are adding that tree node to their parents using the add_child method).
  • The second method we will implement is delete_child which also has one parameter called child. Firstly, this method will check if that parent Tree Node has any child that we want to delete; and if yes, then it will remove it from the self.children list. And if there is no child that we want to delete, then it will print that “there is no child like this in the self.children list” (for example, in the electronincs_tree method, we are removing lg tree node from tvs parent tree node).
  • Now we will implement get_level method. This method does not take any parameters. This method will return the level of tree node by which we are calling this method. So what this method will do is create one variable called level and give value to 0 and then we will create a while loop and loop through all the parent of the given tree node and if tree node has a parent tree node, then level value will be increased by 1. And loop will terminate when we reach the top of the tree, that is the root tree node of the tree. And then, we will return that level.
  • Now the last method is the print_tree method. This method simply just prints all the node hierarchicies so we can visualize the whole tree. This method will decide hierarchy by the get_level method. And then assign the appropriate prefix to data to print them in hierarchy as per the level of the tree node.

3. Graph—

Graph is a non-linear data structure that consists of vertex and edge. Here the vertex is also called a node. In Graph data structure, vertex or node contains data or some value to be stored. And edges connect those vertices. And by combining both one or more vertex and zero or more edges, we get the graph data structure.

The graph and tree data structures are the most common and favourite data structures asked in interview questions.

There are so many problems that we can solve using the Graph data structure.

For example:

  1. Finding the shortest path between point A to point B.
  2. Finding the connection between two users on social media (Facebook, LinkedIn).
  3. Finding the best route in computer networks.
  4. Finding the best route between two cites or places.

There are many more problems that can be solved using the graph data structure.

Also, there are many types of graphs like Null Graph, Infinite Graph, but we can not cover everything about graphs in this article so here is the link to learn more about it.

Graph Data Srtucture

Code explanation —

  • Now in this code, we have created a class called Graph which takes edges as parameters as a list of tuples. Here the routes variable is an example of edges variable which has a list of tuples in which the first value of the tuple represents the start of the edge and the second value represents the end of that particular edge ( (“Mumbai”, “Paris”) Mumbai is the start of that edge and Paris is the end of that edge; and all the cities are represented as nodes).
  • Now in the __init__ method of this class, we will create a graph_dict variable which is a representation of the graph as a dictionary. So, in this method, we will loop through all the tuples of the edges list and create the graph accordingly. It will check if any other entry of the start of the edge is added to graph_dict variable or not, and if it is added, then it will append end value to graph_dict[start]. And if the value of start is added for the first time to graph_dict, then it will create a new key-value pair.
  • Now the second method we have implemented in the Graph class is the get_paths method which takes three arguments as parameters — start ,end , paths. Here, start represents the start of the path from where we want to find the path and end represents the end of the path to which path should terminate or end. This method will find all the paths from start to end.
  • First, this method will check if start and end variables are the same and if so, then it will simply return start variable as a path. This method will also check if start node in graph_dict’s keys list or not. And if start node is not in graph_dict’s keys list, then it means that there is no path from start to end . So, at that time, this method will simply return an empty list. Otherwise, this method will create an empty list called paths . In the for loop section of this method, we will go through all the adjacent nodes of the start node and on that adjacent node, we will use recursion to find the path from the adjacent node to the end node and after that, append it to the main path variable. And it will append that path to the paths list and at the end of the method, we will return the paths list that contains all the paths from start to end.

Conclusion

So here in this article, we have covered the basics of data structures and non-liners data structures.

In case you missed the first part of this article, here is the link.

Hope you guys have enjoyed reading and learning.

If you liked this article, feel free to leave a clap and share it with others so that they can learn from it too.

More content at plainenglish.io

--

--