Web crawling is a useful way to gather data from websites. Some websites make it harder, though, by requiring session-based authentication or other security measures.

To overcome these obstacles, you can automate the session setup with a user account and Selenium, a web automation tool. This approach slows the crawler down, but that is acceptable for my personal projects, where speed is not a crucial factor.

All of my web-related projects are hosted in-house on Raspberry Pis. These small, efficient devices are always up and running, making them perfect for side projects.

Although my Raspberry Pi comes with a pre-installed Chromium browser, I had trouble finding a matching version of Chromedriver, which Selenium needs in order to control the browser and execute automated tasks. With some research, I found builds that are compatible with my setup.

First, make sure that Chromium is installed on your Raspberry Pi. The following commands check for it, install it if needed, and print its version:

which chromium-browser
# The output should be a path to the Chromium binary

# If it is not installed
sudo apt install chromium-browser -y

# Check the version
chromium-browser --version

# Output example: Chromium 120.0.6099.102 Built on Debian, running on Debian 11

Chromedriver has to match the version of Chromium installed on your Raspberry Pi. At the time of writing, my Chromium version is 120.0.6099.102.

After researching and testing various sources, I found the Electron Releases page on GitHub to be the most up-to-date and reliable source for Chromedriver. The official Chromedriver downloads do not cover Linux on ARM, whereas Electron bundles Chromium and publishes a matching Chromedriver build for ARM with each release.

On this page, search for the version number of your chromium-browser and look for a release whose notes mention a Chromium version close to it. In my case, I found a release for 120.0.6099.199. A slight difference in the last part of the version number is acceptable, and the Chromedriver should still work on your system. If it doesn’t, try a neighbouring release.

Electron Releases on GitHub: https://github.com/electron/electron/releases
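The "close enough" check can be written down as a small helper. This is a minimal Python sketch, assuming that a driver is a reasonable candidate when the first three parts of the version number agree and only the last part differs. That rule is a heuristic of mine rather than a guarantee, and the function names are made up for illustration.

```python
import re


def parse_version(text):
    """Extract the first dotted four-part version number from a string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_close_match(browser_text, driver_text):
    """Return True when major, minor and build agree (heuristic, see above)."""
    return parse_version(browser_text)[:3] == parse_version(driver_text)[:3]


browser = "Chromium 120.0.6099.102 Built on Debian, running on Debian 11"
print(is_close_match(browser, "120.0.6099.199"))  # True
print(is_close_match(browser, "116.0.5845.228"))  # False
```

The first call mirrors my own case (102 versus 199 in the last part); the second uses an arbitrary older version string to show a mismatch.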

Once you have found a matching release, you only need to download the driver for your operating system and architecture. In my case, I use the linux-arm64 driver because I have the 64-bit Raspberry Pi OS installed.

curl -o webdriver.zip -L "https://github.com/electron/electron/releases/download/v28.1.2/chromedriver-v28.1.2-linux-arm64.zip"
unzip webdriver.zip -d webdriver

sudo mv webdriver/chromedriver /usr/bin/
sudo ln -s /usr/bin/chromedriver /usr/lib/chromium-browser/chromedriver

# Clean up; the downloaded files belong to your user, so sudo is not needed
rm -rf webdriver webdriver.zip
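If you set this up on more than one Pi, the download URL can be built instead of copied by hand. Below is a minimal Python sketch, assuming that Electron names the chromedriver asset after the release tag and that the architecture suffixes are linux-arm64, linux-armv7l and linux-x64. The mapping from platform.machine() values and the function name are my own assumptions, so check the asset list of the release you picked.

```python
import platform

# Assumed mapping from platform.machine() values to the suffixes used in
# Electron's chromedriver asset names; extend it for other machines.
ARCH_SUFFIX = {
    "aarch64": "linux-arm64",
    "arm64": "linux-arm64",
    "armv7l": "linux-armv7l",
    "x86_64": "linux-x64",
}


def chromedriver_url(electron_version, machine=None):
    """Build the download URL for a given Electron release and machine type."""
    machine = machine or platform.machine()
    try:
        suffix = ARCH_SUFFIX[machine]
    except KeyError:
        raise ValueError(f"no known chromedriver build for {machine!r}") from None
    return (
        "https://github.com/electron/electron/releases/download/"
        f"v{electron_version}/chromedriver-v{electron_version}-{suffix}.zip"
    )


print(chromedriver_url("28.1.2", machine="aarch64"))
# https://github.com/electron/electron/releases/download/v28.1.2/chromedriver-v28.1.2-linux-arm64.zip
```

Leaving out the machine argument uses the value reported by the running system, which is "aarch64" on a 64-bit Raspberry Pi OS.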

To verify that the Chromedriver is correctly installed, you can run the following command:

chromedriver --version
# The output should display the version of Chromedriver installed

With the Chromedriver successfully set up on your Raspberry Pi, you can now use Selenium to automate web crawling tasks and interact with websites that require session-based authentication or other security measures.
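To give an idea of what the session setup can look like, here is a minimal sketch in Python, assuming Selenium 4 is installed (pip install selenium). The login URL, the form field names "username" and "password", the submit button selector and the browser path are all placeholders that you need to adapt to the site and system you are working with; only the chromedriver path comes from the steps above.

```python
def build_driver(driver_path="/usr/bin/chromedriver",
                 browser_path="/usr/bin/chromium-browser"):
    """Start headless Chromium through the chromedriver installed above."""
    # Imported here so the cookie helper below also works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    options.binary_location = browser_path  # assumed path, see `which chromium-browser`
    options.add_argument("--headless=new")  # no display needed on a headless Pi
    return webdriver.Chrome(service=Service(driver_path), options=options)


def login(driver, login_url, username, password):
    """Fill in a login form and return the session cookies.

    The field names and the button selector are hypothetical.
    """
    from selenium.webdriver.common.by import By

    driver.get(login_url)
    driver.find_element(By.NAME, "username").send_keys(username)
    driver.find_element(By.NAME, "password").send_keys(password)
    driver.find_element(By.CSS_SELECTOR, "button[type=submit]").click()
    return driver.get_cookies()


def cookies_to_dict(selenium_cookies):
    """Convert Selenium's list of cookie dicts into a name -> value mapping."""
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


def main():
    driver = build_driver()
    try:
        cookies = login(driver, "https://example.com/login", "user", "secret")
        print(cookies_to_dict(cookies))
    finally:
        driver.quit()  # always close the browser, even if the login fails
```

Calling main() opens the browser, logs in and prints the cookies. The resulting dictionary can then be handed to a faster HTTP client for the actual crawling, so Selenium is only needed for the login step.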