Wednesday, 26 June 2019

Browser Automation in Google Cloud with Puppeteer Sharp

Recently I was working on a web browser automation tool that needed to be hosted on Google Cloud Compute Engine. Currently it is a .NET Core console app that uses the Puppeteer Sharp library to drive headless Chrome. The Compute Engine instance is a VM running Debian Linux with no graphical interface.

After deploying the .NET console app to the Debian VM and running it, I got the error ‘libX11-xcb.so.1: cannot open shared object file: No such file or directory’. The VM image does not ship the shared libraries that Chrome depends on. To fix this, install the following packages:
sudo apt install -y gconf-service libasound2 libatk1.0-0 libc6 libcairo2 libcups2 libdbus-1-3 libexpat1 libfontconfig1 libgcc1 libgconf-2-4 libgdk-pixbuf2.0-0 libglib2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libpangocairo-1.0-0 libstdc++6 libx11-6 libx11-xcb1 libxcb1 libxcomposite1 libxcursor1 libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 libxrender1 libxss1 libxtst6 ca-certificates fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils wget
The initial error went away, but Chrome then refused to start, with a message saying that it needs to run without the sandbox by specifying '--no-sandbox' in the arguments list. In my setup, headless Chrome would start on the Compute Engine VM only with the sandbox disabled.
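
In Puppeteer Sharp the flag goes into the launch options. Here is a minimal sketch of what that can look like, assuming a 1.x release of the library (where `BrowserFetcher.DefaultRevision` is available). The `BuildLaunchOptions` helper and the example.com URL are placeholders of mine, not taken from the actual app:

```csharp
using System;
using System.Threading.Tasks;
using PuppeteerSharp;

public static class Program
{
    // Hypothetical helper: builds launch options with the Chrome sandbox disabled.
    public static LaunchOptions BuildLaunchOptions()
    {
        return new LaunchOptions
        {
            Headless = true,
            // Passed through to the Chrome process as command-line switches.
            Args = new[] { "--no-sandbox" }
        };
    }

    public static async Task Main()
    {
        // Download the Chromium build that this library version expects.
        await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultRevision);

        var browser = await Puppeteer.LaunchAsync(BuildLaunchOptions());
        try
        {
            var page = await browser.NewPageAsync();
            await page.GoToAsync("https://example.com"); // placeholder URL
            Console.WriteLine(await page.GetTitleAsync());
        }
        finally
        {
            // Always shut the browser down so no Chrome process is left behind on the VM.
            await browser.CloseAsync();
        }
    }
}
```

Keep in mind that '--no-sandbox' removes a layer of isolation between Chrome and the host, so it is best reserved for pages you trust.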

For more detail on the issues and solutions discussed, see the references below.

References: