A decade ago, part of a job was to help automate creation of a report that normally involved a human scanning through dozens of spreadsheets each year, copying and pasting numbers and generating graphs and tables, manually.The place where I worked predominantly used Perl. (It was a science laboratory.) My solution was to use OpenOffice's GUI to bulk-convert the Microsoft Excel spreadsheets (binary XLS files) to whichever file format OpenOffice used at the time that was actually a collection of XML in a zipped directory. (Was that .sdc?)
From there, I wrote some Perl CGI scripts to automatically generate most of the tables and graphics into HTML form, in some fashion that looked nice when imported into OpenOffice. (Really?!) The person who was responsible for the report, who normally took a few weeks to do, said that it would save him so much of the time, that it would take him now a couple days to look at the data and write conclusions and maybe do some cleaning. He marveled at automation and how quickly he could be replaced. The automation revolution in action.
Today I used LibreOffice to convert about a hundred spreadsheets into CSV files with this one command:
find . -name "*.xls" | while read line; do
(cd "$(dirname "$line")" &&
libreoffice --headless --convert-to csv "$(basename "$line")");
I am not sure that OpenOffice could do headless conversion back then. I was probably overwhelmed by all the potential everywhere to discover the functionality if it exists. This would have helped free a few hours of my life.Automation, a revolution.
Converting to CSV only does the 1st sheet tab. You apparently can't specify a different sheet to be the active sheet. I also had to convert some other old spreadsheets for easy parsing (mostly just grep'ing through them to find data). In that case, I used --convert-to html which was beautiful, even if one of the resultant HTML files was 138MB large.