Excel Selenium Element Test

During web scraping sometimes an expected element is not present. This recently happened during a web scraping project when the client input data was not correct causing the page to load with an error and didn’t have the expected data to scrape. The web scraping was being done in Excel using Selenium and Chrome browser. The error handling can be done a couple of different ways. I used the approach of assigning a webelement object to the expected element and then test if the object is Nothing.

Element Test Code

' class that has distance in miles
Set miClass = driver.FindElementByClass("mi", timeout:=0, Raise:=False)
If Not miClass Is Nothing Then
    {Element Present}
Else
    { Element not Present, Error Handling}

This project required loading almost 20,000 web pages. Sometimes the web page url failed to load. Although not directly tied to testing for the present of an element this error case still needed to be handled. For this case Excel VBA error handing is used. Multiple attempts are made to load the url. If the web page loads then the “attempt loop” is exited early. If the maximum number of attempts is reached, then error handling is used.

In an Excel VBA, On Error is used to control the error handing. Before making the web page loading attempt On Error GoTo and Exit For are used to manage the multiple attempts.

Web Page Loading Error handling

' attempt to get website
On Error GoTo tryAgain    ' if error GoTo tryAgain
For try = 1 To Attempts   ' number of attempts to try (const)
    ' get website
    driver.Get url        ' attempt to load web page, if error GoTo tryAgain
    Exit For              ' web page loaded exit For
tryAgain:
Next try                  ' next attempt
On Error GoTo 0           ' reset Error handling to stop on error
                    
If (try <> (Attempts + 1)) Then   ' Test if maximum attempts were reached
    {Page Loaded, Do Stuff}
Else
    {Page Didn't Load, Error Handling Stuff}

Calculating Moon Phase

Recently I published a Fitbit app that calculates the Moon phase and then displays the proper phase image and illumination value. The user can select a different date or simply increment the date using buttons. Some feedback I received was that the app was fast and they wanted to understand why?

First, the app performs all calculations on the watch. No phone resources are need No GPS. No Internet. The calculations are based on a dateNow date object (this object is settable by the user though the interface that is not part of this discussion).

The calculation is based on an article I found by SubsySTEMs while performing an Internet search. Their paper broke the Moon phase calculation into a series of steps, which was easy to follow and translated directly to JavaScript. The output of their calculation is the number of days since the last New Moon. From there the Moon phase is determined. There are eight Moon phases (in order from New):

  • New
  • Waning Crescent
  • Third Quarter
  • Waning Gibbous
  • Full
  • Waxing Gibbous
  • First Quarter
  • Waxing Crescent

Using the days since the last New Moon you can find the phase but beware, each phase isn’t just 1/8 of 29.53 days. Shown below is the JavaScript that implements the attached paper.

function calMoonPhase() {
  // calculate moon phase for date
  let month = dateNow.getMonth() + 1;
  let year = dateNow.getFullYear();
  if(month < 3) {
    year = year - 1;
    month = month + 12;
  }

  let a = Math.floor(year/100);
  let b = Math.floor(a/4);
  let c = 2 - a + b;
  let d = dateNow.getDate();
  let e = Math.floor(365.25*(year+4716));
  let f = Math.floor(30.6001 * (month + 1));
  let jd = c + d + e + f - 1524.5;
  let daysSinceNew = jd - 2451549.5;
  let newMoons = daysSinceNew / 29.53;

  <...Determine Moon phase...>
  <...Determine image to show...>
  <...Calculate illumination...>


}

Excel Selenium Web Scraping Error 33

I’ve done Chrome web scraping in both Python and Excel using Selenium on a Windows 10 platform. I have also web scraped in Python using BeatifulSoup4. A new client had a requirement to find the distance between two zip codes. The client had created an Excel worksheet with a matrix of zip codes (vertical) and cities (horizontal). I found a web site that calculates the distance between two zip codes in a web page. The zip codes are part of the web page URL I converted the cities to city airport zip codes so I just needed to read rows and columns to create the URL and the scrape the results. Easy right?

Error 33

I implemented the VBA code and got a runtime error 33 when opening the Chrome browser. I checked other Excel scrapers that I had written and they all failed. So something was wrong with my Excel environment. An Internet search indicated that the version of Chrome and the chromedriver were not the same. I remembered that I had created a directory for Selenium webdrivers and added it to the system path. I updated the driver to match the Chrome version and I still got the same runtime error. I then tried one of my Python scrapers and they worked! So my problem was with Excel. I continued my search and couldn’t find a resolution until….

Excel Selenium Environment

To setup the Excel environment you need to add SeleniumBasic library, which I already had and the version hadn’t changed since 2016. To add the library to your VBA you need to select the Selenium Type Libary (editor Tools > References…). As I re-read the installation instructions I found that the driver needs to be placed in the same directory as the SeleniumBasic library, which for me was not the webdriver directory I had already updated. That was the difference. Excel did not use the system path for the webdriver.

Once I updated the chromdriver in the SeleniumBasic directory very thing started working again. Watch out for Chrome updates. For my environment I need to update the driver in two locations – the system path directory (Python) and the SeleniumBasic directory (Excel). A good description on setting up Excel to work with Selenium can be found at myOnlineTrainingHub.

Sunrise/Sunset Fitbit App Released

Anyware, LLC Fitbit app for displaying the sunrise and sunset times were approved and published on the Fitbit Gallery.

The Sunrise/Sunset app displays the sunrise and sunset times for the users current location and time settings. The initial date displayed is the current date. The date display is a button that allows the user to select a different date for the sunrise and sunset data or the user can push the increment button to advance to the next date. This app uses the phone’s GPS and network connection to obtain the sunrise and sunset data. No user data is saved and only the user’s current location is transmitted to an API.

RTC DS3231 with Ardunio

Most embedded systems require some kind of real time clock for different timing events. Recently a client wanted a chicken coop controller. Central to the implementation is an Adafruit DS3231 Precision RTC Breakout. This board uses Maxim’s DS3231 Integrated Real Time Clock.

The DS3231 is an extremely accurate real-time clock with an integrated temperature compensated crystal oscillator and crystal. This device supports battery backup to maintain timekeeping when main power is interrupted. This device maintains seconds, minutes, hours, day, date, month, and year information that includes leap year corrections. The DS3231 also has two programmable time of day alarms that trigger an interrupt output when the specified time occurs. The DS3231 interface is I2C.

The chicken coop controller included an Ardunio UNO, Adafruit DS3231 precision real-time clock (RTC), Pololu 5V 1A regulator, 2-channel relay module, AC light dimmer (forward phase) module, 12V linear actuator, and a 12V battery. The Ardunio sketch was to:

  • At sunrise retract the linear actuator for 3 minutes.
  • 1-hour after sunrise turn on lights at 100% brightness
  • Leave lights at 100% brightness for 12.5 hours, then gradually dim over a 1-hour period.
  • At sunset extent the linear actuator for 3 minutes.

Looking at the requirements, all events except the dimming occur on the minute boundary (sunrise, sunset, actuator time, lights on full time). The DS3231 has two alarms. One alarm works down to the second and the other alarm works down to the minute. The need for an alarm to occur on seconds boundary is driven by the light dimming over a 1-hour period. The light dimming uses the 8-bit PWM control. Using all of the PWM control states (256) over 1-hour period has the PWM value updating every 14 seconds (255 down to 0). Since light dimming needs an alarm to occur every 14 seconds, alarm 1 was selected for light control and alarm 2 for actuator control.

The next alarm time is determined by adding the seconds, minutes, hours, and days to a time. For example if the next alarm is to occur in 3 minutes from now, 3 minutes is added to the current time. An add time function was developed to support this sketch. The time is added to the appropriate parameter and then rollover values are tested and updated. When adjusting months the code continues to loop through rollovers until all adjustments are made.

Function to Add time

// function to add time to input time
void addTime(ts *t, int secs, int mins, int hours, int days) {
  boolean adjustMonths = true;
  
  // add time then adjust for rollovers
  t->sec += secs;
  t->min += mins;
  t->hour += hours;
  t->mday += days; 

  // adjust date/time based on rollovers
  if (t->sec >= 60) {
    t->min += t->sec / 60;
    t->sec %= 60;
  }
  if (t->min >= 60) {
    t->hour += t->min / 60;
    t->min %= 60;
  }
  if(t->hour >= 24) {
    t->mday = t->hour / 24;
    t->hour %= 24;
  }

  // adjust months
  while(adjustMonths) {
    switch(t->mon) {
      case 1:
      case 3:
      case 5:
      case 7:
      case 8:
      case 10:
      case 12:
        if (t->mday > 31) {
          t->mday -= 31;
          t->mon += 1;
          if (t->mon > 12) {
            t->mon = 1;
            t->year += 1;
          }
        }
        else
          adjustMonths = false;
        break;
      case 4:
      case 6:
      case 9:
      case 11:
        if (t->mday > 30) {
          t->mday -= 30;
          t->mon += 1;
        }
        else
          adjustMonths = false;
      case 2:
        if (t->year % 4 == 0) {   // leap year, don't worry about % 400
          if (t->mday > 29) {
            t->mday -= 29;
            t->mon += 1;
          }
          else
            adjustMonths = false;
        }
        else
          if (t->mday > 28) {
            t->mday -= 28;
            t->mon += 1;
          }
          else
            adjustMonths = false;
        break;
    }
  }
}

In setting the alarm the developer must enable all values (seconds, minutes, hours, date) up to the desired match (i.e. if alarm is on a hour value, then seconds, and minutes must also be enabled).

When the time matches the enabled alarm parameters then the DS3231 INT/ pin goes active low. Since the I2C Adruino library uses interrupts to communicate, this sketch polls this DS3231 pin instead of using an interrupt. During the polling each alarm is checked and the proper flag(s) is(are) set.

One final item with the DS3231 is the control register used to enable/disable alarms and set other parameters. This register is not readable. If the alarms or other controls are enabled at different times in the code, then a “shadow” register is needed to maintain the current control register state. For example is you are enabling alarm 2 you don’t know the current state of alarm 1 unless you maintain an internal variable in your code.

Finally the library select to support the DS3231 was ds3231FS by Petre Rodan. This library provided an interface to setting the alarms.

Fitbit Apps Published

Anyware, LLC Fitbit apps for Poker Hand ranks and Moon Phases were approved and published on the Fitbit Gallery.

Poker Hand app is designed as a simple player reminder of poker hand ranks while sitting at the table. The app is a simple scrolling list that displays poker hand ranks from highest to lowest. This app uses no user data.

The moon phases app displays the current moon phase and illumination percentage for the given date. The initial date displayed is the current date. The date display is a button that allows the user to select a different date for the moon phase, or the user can push the increment button to advance to the next date. This app uses no external resources and no user data is saved or transmitted.

The Node Method

In the next update to my book, AC/DC Principles and Applications, I plan to add a small section on the Node Method to Chapter Nine – Complex Network Analysis Techniques. Discussed in this chapter are Kirchoff’s voltage and current laws, superposition, Thevenin’s and Norton’s theorems. This post is an advance installment of the update.

The Node Method is a circuit analysis approach that uses the circuit element properties with Kirchoff’s voltage and current laws. This method helps to reduce analysis complexity as seen in the example.

Each element connection point is a circuit node. In using the Node Method, each node voltage is label with one node being selected as ground or 0 volts. Each node also has the current labeled showing the currents entering and leaving each node.

After all the node voltages and currents are labeled, KCL is written for each node. Based on Ohm’s Law, the current through a resistor is the voltage across the resistor divided by the resistance. The voltage across a resistor is the voltage difference between the terminals.

Using the set of KCL equations, solve for the missing information.

Node Method Steps

  1. Select a reference node to be 0 volts (ground). A good choice for this reference node is one that has the most connections.
  2. Label voltages and currents.
  3. Write KCL equations for each node that has an unknown voltage.
  4. Solve for missing values using the equations developed in step 3.

Example

Using Figure 9-3, what is the voltage drop for a 2Ω load resistor that is connected between a 12V closed-loop circuit (loop A) with a resistance of 4Ω and a 6V closed-loop circuit (loop B) with a resistance of 3Ω?

Write KCL
IL = I1 + I2
e/2Ω = (12V – e)/4Ω + (6V – e)/3Ω

Solve for node voltage e
6e/12Ω = 3(12V – e)/12Ω + 4(6V – e)/12Ω
6e/12Ω = (60V – 7e)/12Ω
13e/12Ω = 60V/12Ω
e = 60V/13
e = 4.6V
IL = 4.6V/2Ω
IL = 2.3A

This solution was simpler than creating the KVL for each loop, solving for I1, I2, and then IL, and then finding the node voltage e.

Combining PDF Files using Python

I was surprise how easy it was to combine separate PDFs into a single file using Python. I was scanning a large document and my scanner could not handle the entire document. So I had to perform multiple scans that created multiple PDF files. I was looking for free PDF combining software when I decided to search for Python code.

I found sample code on Stack Overflow that uses PyPDF2 library. The code isn’t fancy and uses listdir to get all PDF files in a common directory. I added the input directory but didn’t program a user interface so the directory files names need to sort in the order you want to merge. I had 17 files so I renamed them to 01.pdf, 02.pdf, … 10.pdf, 11.pdf, …, 16.pdf, 17.pdf.

The code below was developed under Python 3.8.2 on Windows 10 platform using the PyPDF2 library. To install the PyPDF2 library use pip install PyPDF2.

Code

import os
import sys                          # system interface (argvs)
from PyPDF2 import PdfFileMerger    # pdf library

def mergePDFs(pdfdir):
    if os.path.exists(pdfdir) :
        x = [a for a in os.listdir(pdfdir) if a.endswith('.pdf')]
    
        merger = PdfFileMerger()
        for pdf in x:
            merger.append(open(pdfdir + '\\' + pdf, 'rb'))
    
        if os.path.isfile(pdfdir + '\\' + 'result.pdf') is not True:
            with open(pdfdir + '\\' + 'result.pdf', 'wb') as fout:
                merger.write(fout)
        else:
            print("Output file result.pdf exists. Exit without saving.")

    else:
        print('Directory " + pdfdir + " does not exist.')
        
if __name__ == "__main__": 
    # get pdf input directory
    pdfdir = ''
    if len(sys.argv) == 2:
        pdfdir = sys.argv[1]
        
    if pdfdir == '':
        pdfdir = input('Input pdf input file directory: ')
    
    # call  function 
    mergePDFs(pdfdir) 

I like using command line arguments so the “main” startup function checks for an input argument that is the PDF directory for input and output. If the argument is missing then prompt the user.

A list x is created using os.listdir and files ending with .pdf. The list x only includes file names so the path is added when performing the append method. Once all the files have been appended then result file is written.

Error checking is performed to ensure the input directory exists and the output file is not present.

The next code improvement is to add a user interface where files are selected and ordered for merging.

Announcing Anyware, LLC

On May 7, 2020 Anyware, LLC, a Maryland limited liability company, was formed as a single member business. Anyware, LLC was created to provide cost effective Computer Engineering hardware, software, and firmware development services for a range of applications and platforms in today’s gig-based economy.

Anyware, LLC is also a producer of consumer based app products. With the formation of the LLC, four apps are scheduled for release on the Fitbit App Gallery for the Fitbit VersaTM, Versa 2TM and IonicTM fitness tracker watches. These apps include:

  • Simple poker hand rankings list
  • Moon phase and illumination for any date
  • Sunrise and sunset for any date
  • Current preliminary air quality index (AQI) from closest monitoring station

Mr. Paul Shultz, Managing Member, has over 35-years of design engineering, product development, manufacturing, technical management, and executive leadership experience with commercial and military customers.