A hospital tracker app using Node.js and Python

An application that tracks the number of active COVID-19 cases in India state-wise and provides an estimate of the availability of hospital beds, using Node.js and Python.

Published in

Python in Plain English

10 min read · Dec 28, 2020

We all know how important projects are to consolidate the conceptual theories one has learnt. Sometimes, however, one simply wants to build side projects for enjoyment, for nothing equals the joy obtained by converting an idea into a functioning application.

Here’s one such idea I had: a dashboard that lists the number of COVID-19 cases in each state of India and, alongside it, an approximate number of hospital beds available in that state.

Let’s see how we can make such an application.

What we will be using

A server to serve the pages, specifically Node.js. We won’t use express.js.
Python’s BeautifulSoup library for parsing HTML.
Regex to extract meaningful information from a huge jumble of HTML.
Custom code to serve as a simplistic templating engine.
Basic HTML and CSS to create a frontend.

An overview of the different components is as shown

Part 1. Obtaining the HTML pages

We first create a JSON that’ll hold the number of current active cases per state. We could get the data directly using a suitable API. Or, to practice HTML parsing, we could pick a website that lists this data and fetch its HTML. To fetch the HTML, we could use Python’s built-in urllib.request module or the third-party requests module. But what if the page content is injected by JavaScript after the page loads (a website built with React, for instance)? In such cases, you’ll see the page clearly in a browser but will not get the injected content using either of these two modules.
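For a page whose content is already present in the server’s response, a plain fetch might look like the sketch below. The helper name fetch_html and the 10-second timeout are my own illustrative choices, not part of the original project.

```python
from urllib.request import urlopen


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Return the raw HTML the server sends for url (no JavaScript is run)."""
    with urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

For a JavaScript-rendered site, a call like this would return only the initial markup, without the content the scripts add later, which is exactly the limitation described above.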

Here’s where Selenium comes in, specifically WebDriver.

Browser vendors provide endpoints that can be used to automate browser testing. WebDriver uses these to load the page through the browser itself and return it for our use. You can read more about it here. To set up the web driver, we just use the following imports and statements:

from selenium import webdriver
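from configs import config as cfg
# executable_path matches the Selenium version used at the time of writing; newer releases take a Service object instead
driver=webdriver.Chrome(executable_path=cfg.secret_info["executable_path_chrome_driver"])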
driver.get("URL TO REQUIRED SITE")

The webdriver for Chrome needs to be downloaded and the location of the file is specified as a value of the key executable_path_chrome_driver in our configs file. Next, we issue a GET request to whatever is the URL of the site we are trying to access. The required HTML, inline CSS and scripts are obtained as

dummyContent=driver.page_source

However, things aren’t always so simple, nor do they always go exactly as the docs or tutorials say. Many times the above code returned nothing from the site. If you’re an experienced developer, you’ve most likely worked out the bug by now, but I was naive and took longer. Remember that the content of the page we are fetching is injected by JavaScript. The problem was that Selenium returned as soon as the root div was added. But that div doesn’t have the active cases information; that information is in a table which hasn’t been added yet! We need to make Selenium wait until the required element is present.

from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
try:
    # The page is injected by JS, so wait for the table contents to load. Otherwise Selenium returns after only div id=root is loaded.
    # The state-name class belongs to the table contents, so its presence shows the data has been rendered.
    element_present = EC.presence_of_element_located((By.CSS_SELECTOR, '.state-name'))
    WebDriverWait(driver, 10).until(element_present)
    print("APP CLASS ELEMENT FOUND")
except TimeoutException:
    print("Timed out waiting for page to load")

Check out the examples for making Selenium wait here. The above code waits for up to 10 seconds for the element to appear and times out otherwise.

We have a horrid looking mess now on our hands that we are going to parse using the BeautifulSoup library of Python.

from bs4 import BeautifulSoup
import re
soup = BeautifulSoup(dummyContent, 'lxml')
step1=soup.find_all("div", class_="state-name")
all_rows=soup.find_all("div", class_="row")

We use the library’s inbuilt methods to get all divs with a class of state-name and all divs with class of row. These values will change when the website structure is updated. Careful examination of the structure will guide us as to what regular expressions to follow to get each piece of information. For example, state names such as:

example = '<div class="state-name">STATENAME</div>'

The regex to extract that is:

stateName=re.findall(r'>[a-zA-Z\s]+<', example)

This gives ['>STATENAME<'], a list containing one match. To get the name itself, take that match and remove its first and last characters: stateName[0][1:-1]. Similarly, we can obtain the active cases from the HTML code. Finally, we create an array of Python dictionaries where each dictionary has the schema:
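The extraction step can be tried in isolation with a short sketch. The sample markup and the state name in it are made up for illustration, and the capture-group variant is an alternative I am suggesting rather than what the project uses.

```python
import re

# Hypothetical fragment mimicking the scraped markup.
example = '<div class="state-name">Tamil Nadu</div>'

# findall returns a list of every match, delimiters included.
matches = re.findall(r'>[a-zA-Z\s]+<', example)
state_name = matches[0][1:-1]  # drop the leading '>' and trailing '<'

# A capture group returns only the text inside the parentheses,
# so no slicing is needed afterwards.
captured = re.findall(r'>([a-zA-Z\s]+)<', example)
```

Both approaches end up with the same name; the capture group simply moves the trimming into the pattern.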

{
   "state":
   "activeCases":
   "totalBeds":
   "bedsLeft":
}

The code for this:

for i in range(0, len(activeCases)-1):
    data_dict_term={}
    data_dict_term["state"]=states[i][0][1:-1]
    data_dict_term["activeCases"]=activeCases[i]
    if(beds[data_dict_term["state"]] == "-1"):
        data_dict_term["totalBeds"]="Not known"
        data_dict_term["bedsLeft"]="Not known"
    else:
        data_dict_term["totalBeds"]=beds[data_dict_term["state"]]
        data_dict_term["bedsLeft"]=int(data_dict_term["totalBeds"]) - int(data_dict_term["activeCases"])
    data_dict_array.append(data_dict_term)
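To see the same logic run end to end, here is a self-contained sketch. The function name build_records and the sample numbers are invented for illustration. Unlike the loop above, it visits every state/case pair and stores totalBeds as an integer, so treat it as a variation rather than the project’s exact code.

```python
def build_records(states, active_cases, beds):
    """Pair each state with its active cases and estimate the beds left."""
    records = []
    for state, cases in zip(states, active_cases):
        total = beds.get(state, "-1")  # "-1" marks an unknown bed count
        if total == "-1":
            total_beds = beds_left = "Not known"
        else:
            total_beds = int(total)
            beds_left = total_beds - int(cases)
        records.append({
            "state": state,
            "activeCases": cases,
            "totalBeds": total_beds,
            "bedsLeft": beds_left,
        })
    return records


# Made-up numbers purely for illustration.
sample = build_records(
    ["Alpha", "Beta"],
    ["1200", "300"],
    {"Alpha": "5000", "Beta": "-1"},
)
```

Here the first record gets bedsLeft computed as 5000 - 1200, while the second falls back to "Not known" because its bed count is marked as missing.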

The number of hospital beds per state was manually obtained from https://www.kaggle.com/sudalairajkumar/covid19-in-india, which in turn got the data from https://pib.gov.in/PressReleasePage.aspx?PRID=1539877. Values are subject to change and should be treated as an approximation. Now the only thing left to do is to save the results for later inspection.

import json
#print("Length of datadict: " , len(data_dict_array))
json_string=json.dumps(data_dict_array)
print(json_string)#Printing the data is needed so that JavaScript can access the value directly using Newprocess.stdout.on
with open('./data/dataFromScraping.json', 'w+') as f:
    json.dump(data_dict_array, f)

Why we print it will become apparent later. When we execute the script we just assembled, the required JSON with the above schema is created and stored.

Part 2. Setting up the server

What I wanted was a website that performs the parsing and shows the output within the webpage itself. So let’s create a server for this in Node.js. Of course, I could have just used Django or some other pure Python framework for this, but since I had already done that in a similar side project, I wanted to get Node.js and Python to work together. Setting up a Node.js server using express.js is super easy. Without express, you need to write a lot more verbose code, but it’s good practice for beginners like me.

const http = require('http');
const url = require('url');
const path = require('path');
const fs = require('fs');
const mimeTypes = {
 "html": "text/html",
 "jpeg": "image/jpeg",
 "jpg": "image/jpeg",
 "png": "image/png",
 "js": "text/javascript",
 "css": "text/css",
 "json": "application/json",
 "webp": "image/webp"
};

The above is all the imports we need, along with the MIME types. These are the types of files that may be requested from the server, so we map each file extension to the appropriate type. When the site asks the server to load image files, for instance, we need to set the Content-type field to image/png, image/jpeg or image/webp.

http.createServer(function(req, res){
 try
 {
  //url.parse takes in a URL as argument and returns an object; each part of the URL is now a property of the returned object, like host, pathname etc.
  var uri = url.parse(req.url).pathname;
  //Use decodeURI as unescape is deprecated (source: Mozilla docs). It resolves escape sequences, so decodeURI('%C4%87') becomes "ć".
  var fileName = path.join(process.cwd(), decodeURI(uri));
  console.log("File name ", fileName)
  console.log("path.extname", path.extname(fileName).split(".")) //Array like ['', 'html']
  var mimeType = mimeTypes[path.extname(fileName).split(".")[1]];
  console.log(mimeType)
  if(uri=='/main.html')
  {
     //do main stuff here
  }
  else
  {
   //Execute this for requests for all other files such as image files.
   res.writeHead(200, {'Content-type': mimeType});
   var fileStream = fs.createReadStream(fileName);
   fileStream.pipe(res);
  }
 }
 catch(Exception)
 {
  //Log the error and answer with a 500 instead of leaving the request hanging.
  console.error(Exception)
  res.writeHead(500);
  res.end();
 }
}).listen(1337);

The above code is self-explanatory. For any request for a file other than main.html, the MIME type is obtained from the file extension and the file stream is piped to the response so the browser can display it. When main.html is requested, the server uses the child_process module to spawn a new process. This runs the Python script that creates the JSON. The script’s print statement transfers the output back to Node.js, where it is received through the stdout.on('data', callback) function. This data is then converted to a string.

Newprocess.stdout.on('data', function(data_from_python) {
    //data_from_python is whatever Python prints to the console during processing, so any prints caused by exceptions also show up here and cause a JSON parse error later on
    const insert_data_in_template = data_from_python.toString()//The data from Python: a JSON string holding an array of objects
    fs.readFile('main.html', 'utf-8', (err, data) => {
        const final_output = utils.augment_template(data, insert_data_in_template)//Custom code to inject data_from_python into the main.html file
        res.writeHead(200, { 'Content-type': 'text/html' })
        res.write(final_output)
        //res.end() signals to the server that all headers and the body have been sent, so the message is complete. res.end() MUST BE CALLED at the end of each response.
        return res.end()
    })
})
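As the comment in the handler notes, any stray print from the Python script ends up in the same stream and breaks JSON parsing. One way to reduce that risk on the Python side is to send diagnostics to stderr and reserve stdout for the JSON alone. This is a minimal sketch under the assumption that the Node.js side listens only to stdout, as in the handler above; the function name emit_result is hypothetical.

```python
import json
import sys


def emit_result(records, out=sys.stdout, log=sys.stderr):
    """Write diagnostics to log and exactly one JSON document to out."""
    # Progress messages go to stderr, so a reader of stdout never sees them.
    print(f"Scraped {len(records)} records", file=log)
    json.dump(records, out)
    out.write("\n")
```

Calling emit_result(data_dict_array) at the end of the scraping script would then leave stdout holding nothing but the JSON string.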

We create an HTML file that will be injected with the data from Python to create the required output. Since this is not a frontend project and my UI skills are pretty basic, the following is all I could come up with

<!DOCTYPE html>
<html>
    <head>
        <link rel="stylesheet" href="./styles/stylesForMain.css">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
    </head>
    <body>
            <h1 style="color: white;">COVID 19 HOSPITAL BED TRACKER</h1>
            <div style="position: relative; top: 20vh;">
            <div class="flex-container">
                @<div class="flex-item card-layout">
                    <div id="statename"><h2>State: { </h2></div>
                    <div id="text-container">
                    <p>Active COVID19 cases: { </p>
                    <p>Total hospital beds: { </p>
                    <p>Remaining beds: { </p></div>
                </div>@
            </div>
            </div>
                
    </body>
</html>

A simple card layout, one card for each state. Generally, injecting data dynamically into HTML, known as templating, is easily done using ejs and express. However, it’s a no-express zone here, so we write our own custom templating code in JavaScript. The HTML file is read like any other file using fs.readFile. The entire file contents are placed in memory and the augment_template function is called. It first extracts the template, which I define as “everything between the first @ and the next @”. Next, { is defined as “the place where data is to be put”. Note that these are all my own definitions. The data we got from Python is parsed into an array of objects. Each element of the array creates one HTML card: each value of the object is injected at one of the 4 { markers, in order. This is achieved by simple substring manipulations, as shown below.

const augment_template = (data, insert_data_in_template) =>
{
 /*
 Input: data, which is the HTML code as one large string, and insert_data_in_template, which is a string containing the data to be entered.
 Output: returns the HTML code, replacing each { in the template with the data that is supposed to be placed there for dynamic display.
 */
 //step1 extracts the template, which is everything between @ and @ in main.html. The template is just the div with card-layout that's going to be repeated.
 const step1 = get_start_and_end_indices(data, "@")
 let extracted_template = data.substring(step1[0], step1[1])
 //Keep a copy of the template, as the working copy is overwritten with new data in each iteration of the for loop.
 const extracted_template_copy = extracted_template
 const data_to_be_output = JSON.parse(insert_data_in_template)//The data passed was a string
 const part_one = data.substring(0, step1[0]-1)//All HTML code before the template enclosed in @@
 const part_two = data.substring(step1[1]+1)//All HTML code after the closing @
 let final_output = ""
 for (const item in data_to_be_output)
 {
  for (const key in data_to_be_output[item])
  {
   //The opening curly brace { marks the spot where data is to be inserted. It must have a space before and after it.
   const step2 = get_start_and_end_indices(extracted_template, "{")
   const replace_brackets_with_this = data_to_be_output[item][key]
   extracted_template = extracted_template.substring(0, step2[0]-1) + replace_brackets_with_this + extracted_template.substring(step2[0]+1)
  }
  final_output = final_output + extracted_template
  extracted_template = extracted_template_copy
 }
 final_output = part_one + final_output + part_two
 return final_output
}
const get_start_and_end_indices = (data, symbol) =>
{
 /*
 Input: data, which is a string, and symbol, whose index is to be found.
 Output: returns a list of 2 items; item[0] is the first index at which symbol is present + 1, item[1] is the last index at which symbol is present. Symbol is usually the @ or { symbol.
 */
 return [ data.indexOf(symbol)+1, data.lastIndexOf(symbol)]
}
module.exports = {
 "augment_template": augment_template,
}

The rules are: { must have one space before and after it, and there can’t be any @ in the HTML file itself, such as in keyframes or media queries. Hence I had to move all of that into external CSS files.

Putting it all together

We have reached a point now where we got our Python script for parsing, a server without express.js and our own custom templating (sort of, at least). Here’s what my final application looked like

We hence know that a state has N active cases and M total hospital beds, so in the worst case, where every active case needs a bed, M-N beds would be left for others. Note that some of these M-N beds are occupied by other patients too, so M-N is only an upper bound on the actual number of free hospital beds.

Conclusion

If you’ve read this far, you’ve experienced the journey of creating a simple hospital bed tracker app using Node.js and Python. Thank you so much for sparing a few minutes of your busy day to read this. I hope you enjoyed the article. Feel free to point out any bugs which may have crept in despite my best efforts, and to share feedback, best practices or anything else that you feel is appropriate for the occasion. Take care, stay safe and happy learning!