Wednesday, 26 June 2019

Browser Automation in Google Cloud with Puppeteer Sharp

Recently I was working on a web browser automation tool that needed to be hosted on Google Cloud Compute Engine. Currently it is a .NET Core console app that uses the Puppeteer Sharp library to drive a headless Chrome browser. The Compute Engine instance is a VM running Debian Linux with no graphical interface.

After deploying the .NET console app to the Debian VM and running it, I got the error ‘ cannot open shared object file: No such file or directory’. This means Chrome could not find a shared library it depends on. To fix it, install the following packages:
sudo apt install -y gconf-service libasound2 libatk1.0-0 libc6 libcairo2 libcups2 libdbus-1-3 libexpat1 libfontconfig1 libgcc1 libgconf-2-4 libgdk-pixbuf2.0-0 libglib2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libpangocairo-1.0-0 libstdc++6 libx11-6 libx11-xcb1 libxcb1 libxcomposite1 libxcursor1 libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 libxrender1 libxss1 libxtst6 ca-certificates fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils wget
The initial error went away, but I was then presented with a message saying that Chrome needs to run outside the sandbox, which is done by specifying '--no-sandbox' in the arguments list. On the Google Cloud Compute Engine VM, headless Chrome would run only with the sandbox disabled.
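For illustration, here is a minimal sketch of how the '--no-sandbox' argument can be passed when launching the browser from Puppeteer Sharp. It is not the code of my actual tool: the target URL is a placeholder, and the BrowserFetcher download call follows the Puppeteer Sharp API as it looked around the time of writing, so newer releases of the library may need a different download call.

```csharp
using System;
using System.Threading.Tasks;
using PuppeteerSharp;

class Program
{
    static async Task Main()
    {
        // Download the Chromium build that matches this Puppeteer Sharp version.
        // Assumption: 2019-era API; later releases changed this call.
        await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultRevision);

        var options = new LaunchOptions
        {
            Headless = true,
            // Disable the Chrome sandbox, as the error message asks for.
            Args = new[] { "--no-sandbox" }
        };

        var browser = await Puppeteer.LaunchAsync(options);
        try
        {
            var page = await browser.NewPageAsync();
            // Placeholder URL, used here only to confirm that the browser starts.
            await page.GoToAsync("https://example.com");
            Console.WriteLine(await page.GetTitleAsync());
        }
        finally
        {
            // Always close the browser so no Chrome process is left running on the VM.
            await browser.CloseAsync();
        }
    }
}
```

Keep in mind that disabling the sandbox removes a layer of protection, so it is best suited to automation that only visits pages you trust.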

For more detailed information about the issues and solutions discussed here, check out the references below.
