Someone reached out and asked if I could adapt my Find Duplicates Apps Script to identify duplicates across all of the sheets in a Google Spreadsheet. Because the requirement is to examine a single column across the sheets, I removed the creation of a temp sheet and used an object to track cell contents and location.
This example also contains logic to identify similar URLs by ignoring “https://”, “http://”, leading “www”, and trailing “/” and query parameters.
The code is here on github. You can add it to a Google Spreadsheet by choosing Tools – Script editor and pasting the code in code.gs. Edit the constants at the beginning that indicate which column to examine and how many header rows to ignore. Save and close the Script editor. Reload the spreadsheet, and it will ask for authorization. You will probably need to reload the spreadsheet one more time before you can run the script.
Here’s a brief video demonstration.
If you’re still using my “Insert and Delete Cells” Add-on for Google Sheets, you’re now seeing: “401. That’s an error”, “Error: deleted_client”, and “The OAuth client was deleted”.
I shut down the “Insert and Delete Cells” Google Sheets Add-on yesterday because the functionality has been built into Google Sheets for a few years now.
You will find the Delete functions in the Edit menu and the Insert functions in the Insert menu.
Sorry for the abrupt shut-down of the Add-on. I was trying to figure out a graceful way to remove the Add-on from the store, and instead made an irrevocable shut down choice.
I’m glad the Add-on was useful to people while it lasted!
Zapier has published a free ebook called The Ultimate Guide to Google Sheets, and it mentions my “Insert and Delete Cells” Sheets Add-on!
The Google Spreadsheets API allows you to send simple queries to retrieve subsets of the rows in your sheet – it’s as easy as appending the query to the end of the URL.
For example, if I have a column called “Quantity” and want rows with Quantity greater than 9, I would append ?sq=quantity>9 to the end of the URL and issue the GET request (don’t forget that column names are always converted to lowercase and spaces are removed).
The other method is to add the sq query parameter to the Guzzle $httpClient. I’ve illustrated this in my example code.
See this page for more info.
It turns out that my Google Sheets API instructions don’t work right away on Windows. People running WAMP or XAMPP have been reporting “cURL error 60: SSL certificate problem: unable to get local issuer certificate”.
For some reason, cURL on Windows doesn’t come with a certificate bundle, so it can’t verify Google’s certificate to establish a TLS/SSL connection. There are many articles out there making it sound really simple — just
- download a cacert.pem file
- add a line to your php.ini file: “curl.cainfo = path to cacert file“
But it took me hours to work out the details. First, the cacert.pem file that most people point you to (at curl.haxx.se/docs/caextract.html) didn’t work for me. After a while, I found that the cacert.pem from Shane Stebner at flwebsites.biz/posts/how-fix-curl-error-60-ssl-issue made all the difference. I haven’t figured out why.
Second, note that there are at least two php.info files on your system: one for running php from the command line (mine is C:\wamp\bin\php\php5.5.12\php.ini), and another for when php is called by Apache (for me, that’s C:\wamp\bin\apache\apache2.4.9\bin\php.ini). Since I’m running php from the command line, I only edited the first one.
I saved the cacert.pem file in c:\wamp and added this to my php.ini: curl.cainfo = c:\wamp\cacert.pem. It didn’t matter which kind of slashes I used.
Hope this helps a few people start accessing Google Sheets with PHP!
Here’s how to get up to speed with the Google Sheets cell-based feed and PHP.
My previous posts have illustrated the use of Google’s PHP API client to interact with Google Sheets using list-based feeds. This method is ideal for reading from and writing to spreadsheets that have a single header row, but isn’t useful for more complicated sheets.
The cell-based feed allows individual cells to be read and written. I’ve written some example PHP code that implements the cell-based feed methods to help get you up to speed a little quicker. Google documentation is here.
To start, follow the steps in this post to set up your service account, share a spreadsheet, and install the API client. Then you should be able to adapt my example code to interact with your spreadsheet.
Please let me know if you have any comments or questions!
AKA: Sheets API access with PHP part 3
Continuing my series on using PHP to manipulate Google Spreadsheets, I’ve added a section to my code that reads the XML representation of a spreadsheet and then parses out the rows and columns. Note that whenever you’re working with list-based feeds, we’re assuming that the first row contains column headings.
Using SimpleXML, it wasn’t difficult to parse out the cell data. The biggest hurdle was learning about XML namespaces — at first, I thought that the XML had tags like “gd:etag” or “gsx:quantity” and had to learn that gd and gsx were namespaces. In SimpleXML, several of the methods take a namespace as a parameter. You can specify the prefix or namespace URL as described in the doc for the children method here.
I think it’s also possible to get the spreadsheet data as json, but I haven’t tried that yet.
Yesterday I updated my example PHP code for manipulating Google Spreadsheets with the Google-provided PHP API client. If you tried getting it to work recently and were frustrated, take another look. Google made major changes to the client that required changing my code.
My previous Sheets API post demonstrated how to add rows to Google Sheets with PHP. Some readers were interested in editing or deleting rows.
Google has a very informative reference here that describes the list-based feed for manipulating sheets. All of my scripts have been using this method, which requires the sheet to have a single header row (I’d been lucky to use that format without knowing the requirement). I use the example HTTP code from Google’s reference pages to give me guidance for my PHP.
I’ve added a file to the gist demonstrating how to edit a row. Here’s what you need to know:
- When you get the sheet list feed (Section 3 of the first file in the gist), you need to add a header (“GData-Version” => “3.0”) to the GET request so that the Etag is returned for each line.
- Once you have a list of entries and find the one that you want to edit, you need to grab its Etag, Id, and the edit URL (see lines 1, 2, and 9 in the image above, and click on the image to see the whole entry).
- Now you just need to issue a PUT request to the edit URL and include the XML with the Etag, Id, and updated sheet data.
Hope this helps a few people!
People who use my Insert and Delete Cell Add-on for Google Sheets have been reporting this error. I was unable to recreate the error until someone shared a spreadsheet with me and I found the culprit: merged cells.
What’s going on?
When you insert cells and shift the existing contents down, my Add-on selects the range of cells from your selection to the bottom of the sheet and moves it down. But if there are merged cells below your selection, the new range doesn’t make sense and an error is generated. The same thing happens if you try to delete and shift up, or if you try to shift left or right with merged cells to the right of your selection.
Why not catch the error and generate a friendly error message?
I’ve been trying, but this error isn’t triggering the try – catch section of my code. There also isn’t a way to detect merged cells without trying some strange tricks that would take a lot more time than I have. I’ve opened a bug report with Google about this.
This error seems to be Google’s version of “We’re not sure what happened, and hope that if the customer tries again it might go away”.