Machine Learning: Popular Libraries and Frameworks (Part 1)

An overview of industry-standard frameworks, modules, and libraries used by Machine Learning practitioners.

Best Nyah
Python in Plain English
6 min read · Nov 15, 2021


Machine Learning is one of the broadest fields in computer science, and it hinges on the application of many complex algorithms grounded in mathematics and statistics.

Coding an algorithm from scratch for every Machine Learning problem we come across can be burdensome, and a real pain in the neck for newcomers to the field. That is why libraries and tools are recommended: they are usually more efficient and better tested, having been used by experts across many industrial applications for a long time.

They also help beginners gain insight into how popular algorithms work in real-life industrial applications.

In this series, we will look at a sample of popular libraries that are an indispensable part of a Machine Learning practitioner's toolkit for doing research and writing complex programs without a lot of redundant code.

We’ll focus on the Machine Learning libraries unique to the Python programming language.


Why Python?

Python’s popularity keeps growing, and it has displaced other languages in many parts of the industry. Its simplicity has attracted many developers to build a vast number of libraries for Machine Learning and Data Science, and for this reason Python has become one of the most preferred programming languages for machine learning. Other important reasons for Python’s popularity include:

  • Python’s syntax is simple and high-level compared to languages such as Java, C, and C++, so machine learning practitioners can focus on the algorithms and model workflow rather than on complex language syntax.
  • Solutions can often be written in fewer lines of code.
  • Python is popularly known as a beginner’s language because of its simplicity.
  • Python has a vast collection of libraries for numerous applications.
  • Python is portable: it runs on all major operating systems and is used across many application areas.

With that said, it should be clear why this series focuses on Python-based libraries and tools for Machine Learning. Some of the most popular ones are:

1. NumPy


The NumPy library is very important for machine learning and data science. It is one of the most widely used mathematical and scientific computing libraries for Python. It was created by Travis Oliphant, with version 1.0 released in 2006, and is now maintained by the NumPy community. The library saves Python developers a lot of time on scientific computations involving large matrix-based calculations, which are an integral part of machine learning. This speed is possible because NumPy arrays are implemented in the C programming language.
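To make the idea of matrix-based calculations concrete, here is a minimal sketch. The two small matrices are made-up values chosen for illustration; real workloads would be far larger.

```python
import numpy as np

# Two small example matrices (illustrative values only).
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])

# Matrix product, computed in compiled code with no Python-level loops.
product = a @ b

# Element-wise arithmetic is applied to every entry at once.
scaled = a * 2.0 + 1.0

print(product)
print(scaled)
```

Both operations work on whole arrays at once, which is why explicit Python loops are rarely needed with NumPy.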

One of NumPy’s most distinctive features is its array interface, which is widely used to represent images, sound waves, and many other raw binary streams as N-dimensional arrays of numbers.
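As a rough illustration of that idea, the sketch below builds a tiny image and a short sound wave as arrays. The image size, the sample rate, and the 440 Hz tone are assumed values picked for the example, not anything NumPy prescribes.

```python
import numpy as np

# A hypothetical 4x6 RGB image: height x width x colour channels.
image = np.zeros((4, 6, 3), dtype=np.uint8)
image[:, :, 0] = 255  # fill the red channel of every pixel

# One second of a 440 Hz sine wave sampled at 8000 Hz (assumed values).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
wave = np.sin(2 * np.pi * 440 * t)

print(image.ndim, image.shape)  # a 3-dimensional array
print(wave.ndim, wave.shape)    # a 1-dimensional array
```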

Install NumPy:

The only prerequisite for installing NumPy is Python (Python 3 is recommended). NumPy can be installed with pip (the standard package manager used to install and manage Python packages) or with conda from the Anaconda distribution.

A previous article of mine gives an overview of the Anaconda distribution and walks through installing it on Linux.

Install NumPy with pip:

    pip install numpy

Install NumPy with conda:

    conda install numpy

For more details on NumPy, check out the documentation.

2. Pandas


The popular pandas library is a go-to library for dealing with large volumes of tabular data. It is widely used for data analysis, with fast, flexible, and expressive data structures designed to work with both “relational” and “labeled” data. It is an open-source Python package built on top of the NumPy library, which provides support for multi-dimensional arrays.

As one of the most popular data-wrangling packages for Python, pandas works well with many other machine learning and data science modules in the Python ecosystem.

Here are some of the key features of the pandas library:

  • Handling of data
  • Alignment and indexing
  • Handling missing data
  • Cleaning up data
  • Input and output tools
  • Multiple file formats supported
  • Merging and joining of datasets
  • Extensive time series functionality
  • Optimized performance
  • Visualization
  • Grouping of data
  • Mathematical operations on the data
  • Python support
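A few of the features listed above (handling missing data, merging, and grouping) can be sketched in a handful of lines. The store and region tables below are invented for illustration.

```python
import pandas as pd

# Hypothetical sales records; None marks a missing value.
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "units": [10.0, None, 7.0, 5.0],
})
regions = pd.DataFrame({"store": ["A", "B"], "region": ["North", "South"]})

# Handling missing data: replace the missing entry with 0.
sales["units"] = sales["units"].fillna(0)

# Merging: join the two tables on their shared "store" column.
merged = sales.merge(regions, on="store")

# Grouping: total the units sold per region.
totals = merged.groupby("region")["units"].sum()
print(totals)
```

Filling missing values with 0 is just one possible choice here; depending on the data, dropping the rows or filling with a mean may be more appropriate.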

Install Pandas:

Install pandas with pip:

    pip install pandas

Install pandas with conda:

    conda install pandas

For more details, check out the pandas documentation.

3. Scikit-Learn


The scikit-learn library, previously known as scikits.learn, started as a Google Summer of Code project by French data scientist David Cournapeau.

scikit-learn is an open-source, commercially usable Python library built on some popular libraries you might already be familiar with: NumPy, SciPy, and matplotlib. It is a simple and efficient tool for data mining and data analysis. It provides many supervised and unsupervised learning algorithms for building machine learning models, including statistical modeling. It also provides functionality for dimensionality reduction, feature selection, feature extraction, and ensemble techniques, as well as built-in datasets.
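As a small sketch of the built-in datasets and dimensionality reduction mentioned above, the example below loads the Iris dataset that ships with scikit-learn and reduces it with PCA. The choice of two components is an assumption made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Iris is bundled with scikit-learn: 150 samples with 4 features each.
X, y = load_iris(return_X_y=True)

# Project the 4 features down to 2 principal components.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X.shape, X_reduced.shape)
print(pca.explained_variance_ratio_)
```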

The functionality of the scikit-learn library includes:

  • Regression, including Linear Regression
  • Classification, including Logistic Regression and K-Nearest Neighbors
  • Clustering, including K-Means (with K-Means++ initialization)
  • Preprocessing, including Min-Max scaling and label encoding
  • Data Splitting
  • Model selection
  • Bagging
  • Model Boosting
  • Principal Component Analysis (PCA)
  • Feature Extraction
  • Scaling, Standardization, and Normalization
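Several items from the list above fit together in a typical workflow. The following is a minimal sketch, assuming the bundled Iris dataset, a 25% test split, and a K-Nearest Neighbors classifier with 5 neighbors; none of these choices are prescribed by scikit-learn.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

X, y = load_iris(return_X_y=True)

# Data splitting: hold out 25% of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing (min-max scaling) chained with a K-Nearest Neighbors classifier.
model = make_pipeline(MinMaxScaler(), KNeighborsClassifier(n_neighbors=5))
model.fit(X_train, y_train)

# Mean accuracy on the held-out test set.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Putting the scaler inside the pipeline means it is fitted on the training data only, which avoids leaking information from the test set.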

Install Scikit-learn:

Install scikit-learn with pip:

    pip install scikit-learn

Install scikit-learn with conda:

    conda install scikit-learn

For more details, check out the scikit-learn documentation.

4. matplotlib


Matplotlib is an open-source plotting library introduced by John Hunter in 2002 that supports many plot types. It is a visualization library for 2D plots of arrays in Python, able to generate line plots, histograms, bar charts, box plots, and other types of charts with just a few lines of code. It also provides an object-oriented API for embedding plots in applications built with Python GUI toolkits such as Tkinter and PyQt.

Here are some of the key features of the matplotlib library:

  • It is used as a data visualization library for the Python programming language.
  • It provides one of the simplest and most common ways to plot data in Python.
  • It provides tools for creating publication-quality plots and figures in a variety of export formats and in various environments across platforms.
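Here is a minimal sketch of the object-oriented API and of exporting a figure to a file. The file name example_plot.png is a placeholder, and the non-interactive "Agg" backend is selected only so the script also runs on machines without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no display required
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

# Object-oriented API: one figure holding two side-by-side axes.
fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

ax_line.plot(x, np.sin(x), label="sin(x)")
ax_line.set_title("Line plot")
ax_line.legend()

ax_hist.hist(np.sin(x), bins=10)
ax_hist.set_title("Histogram")

fig.tight_layout()
# Placeholder file name; the extension selects the export format.
fig.savefig("example_plot.png", dpi=150)
```

Changing the extension to .pdf or .svg should produce a vector version of the same figure, which is often preferred for publications.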

Install matplotlib:

Install matplotlib with pip:

    pip install matplotlib

Install matplotlib with conda:

    conda install matplotlib

For more details, check out the matplotlib documentation.

Thanks for reading ❤️

Please feel free to leave your comments and ideas on the post.

If you found this post helpful, leave a few claps 👏 below to show your support for the author!
