38209 XLSX Workbook

Read from and write to XLSX format files without any automation or export with support for full cell formatting in Excel

Note: this project has moved to GitHub: https://github.com/ggreen86/XLXS-Workbook-Class

Release 18 - Fixed some bugs and added new methods. This release also attempts to reduce the time to save a XLSX file. Restructuring of the internal cursors and changes to cell formatting were performed to reduce the time to create a new XLSX file. The cell formatting process has been changed to now define a style for a cell to be formatted with. These methods are listed in the Release Notes and more explained in the Documentation. The older cell formatting methods are not removed for backward compatibility; however, these methods are now depreciated and will not be further enhanced.

Release 19 - Fixed bugs in two methods referring to the event name; see Release Notes for more information. I did not update the Documentation (see the event Method in the class for additional documentation).

I have begun to expand the read capability of this class. Currently if an xlsx document contains images, text blocks or other content, these objects are lost when the workbook is re-saved. I am working on the ability to retain these objects; however, I am not currently working on the ability to add new objects (just retain what is in an existing xlsx document). I decided to release now the code in its current state with the additions for retaining objects removed. This is to allow for the new functionality added to be used and tested by the community.


38236

Project Manager: Greg Green

38362 Latest release of XLSX Workbook

Last edited Apr 26 at 6:50 PM by DougHennig, version 9

Comments

gagreen1214 Mar 14 at 4:08 PM 
All--

The release today is for the reading of double-byte characters in strings (reading in an existing spreadsheet). The xml is stored in UTF-8 and needs to be converted to double-byte for the handling of character sets that include characters above the ASCII 128 range. This will hopefully correct the text display for international users; additionally this corrects the display of symbols such as the registered trademark or copyright.

Greg

gagreen1214 Mar 13 at 9:53 PM 
All--

Sorry for the 'flurry' of releases today. I have been 'stomping' on the bugs today and found another one that came up in converting a string with an embedded double-quote that has to be assigned to a node property. Since the node property is surrounded by double-quotes, any embedded quotes will break it. I ran into this. This is now corrected.

Greg

rkaye Mar 13 at 9:40 PM 
So I didn't get very far with a refactor on GetNextID. (Arrays aren't very OOPy.) Also, by this time of the day I'm down a good 40 IQ points... ;-) But I will keep reporting back here if I can come up with other meaningful optimizations.

rkaye Mar 13 at 8:06 PM 
No worries. I clearly relish my anonymity.

I should have been more specific. The class should filter out control type characters, with the exception of TAB, LF & CR, because Excel will refuse to open if those characters are in the data. It's relatively easy to use a regex to do that and my experience has been that as fast as the Fox can be with string manipulations, a proper C library will be much faster.

I do understand your concern about using 3rd party libraries. Maybe there's a relatively easy way to allow a 3rd party library to be plugged in for the purposes of cleaning strings.

gagreen1214 Mar 13 at 7:27 PM 
Richard--

I didn't 'match' you to the previous person... thanks for update.

I changed the old code for creating an xml string and enhanced it using DOM object (i.e., CREATEOBJECT('MSXML2.DOMDocument')). I use the method createTextNode to create the node object and then use the property xml to retrieve the xml string. I am not sure of just stripping all non-printing lower control characters as I believe most will be ignored; but I think one of these is used for multiple lines in a cell (when pressing ALT-Enter in Excel within the cell text entry). The way I implemented MSXML2.DOMDocument is if not present, then this will fail (caught in a try-catch) and then I revert to old code method.

Also, I hesitate to use an external library such as Craig's regex library as this then creates a dependency to an external library. The goal in this class is to be mostly self-contained.

Greg

rkaye Mar 13 at 6:57 PM 
It's Richard. Check your R13 release notes. :-)

Don't take my word for it. Let's get some of the other folks here who have been participating in the optimization discussion to try it out.

I like what you did with adding the messaging methods. Doing something similar for logging would allow more people to track performance. I've added some very simple logging code in my implementation that keeps track of when the important parts of the SaveTableToWorkbook method starts and finishes and dumps out a string to a text file. If you want it I'm happy to send it along to you.

I would also bet that implementing a regex library like Craig Boyd's, to clean non-printing lower ASCII characters from the data would be faster than the CHRTRAN stuff just added. But to be sure, you'd have to test it. :-)

gagreen1214 Mar 13 at 6:45 PM 
kraye--

I will make the changes to change SEEK(.., ..., ...) to INDEXSEEK(..., True, ..., ...); I don't see the harm in doing this; in fact it is apparently better in your testing. Getting ~5% better performance for large files can be a significant amount of time.

BTW - what is your first name?

Greg

rkaye Mar 13 at 6:37 PM 
Hi Greg,

The ability to control moving the record pointer is indeed the functional difference between INDEXSEEK and SEEK. But it's also not necessary to call it twice. The key here is when looking for potential optimizations, I substituted the equivalent INDEXSEEK() for SEEK() in the GetCellRecord method and it was ~4-5% faster with my test dataset. This implies that the underlying VFP core code for INDEXSEEK has been optimized a bit more than SEEK.

Having said that, I have not yet seen similar performance bumps for the other places where SEEK is being used but I also haven't generated coverage logs for this stuff.

gagreen1214 Mar 13 at 6:19 PM 
kraye--

I have not used INDEXSEEK() before (in fact I was unaware of it). Reading the documentation on it, it seems the only difference between it and SEEK() is the moving of the record. The second parameter in INDEXSEEK() allows for false which does not move the record but allows for a true value returned when the record is found. The documentation further states if record is not moved, then a second call is required with true as second parameter to have the record pointer moved. Since I use SEEK() to position to a record to return some values it would not be helpful here (would now need two calls); however, if the only use of SEEK() is to check for existence of a record, then INDEXSEEK() with false would be preferred.

I will check for places to use INDEXSEEK() rather than SEEK(). Thank you for the tip.

Greg

gagreen1214 Mar 13 at 6:01 PM 
kraye--

Yes it should be 'xl_ndxcolors'; but this miss-typing occurs in GetNextId(), SetLastId(), and ReadStylesXML(). It would always have returned a value of zero from the AddIndexColor() method as this miss-typing did not include the extra 'i'.

Alas, another quick release...

rkaye Mar 13 at 5:30 PM 
Does this line (~72) in GetNextID contain a typo?

CASE tcCursor = "xl_indxcolors"

Based on what I see elsewhere in the code the alias name used is "xl_ndxcolors".

rkaye Mar 13 at 5:08 PM 
Yep. I read it and followed along. :-)

I also noticed that the GetNextID method was used for multiple cursors and expressed my uncertainty about Brett's solution in an earlier comment. Since the code inside each case is nearly identical, I'm wondering if that can be further optimized by passing in the array property to be updated and thereby eliminate the DO CASE. I'll let you know what I find.

One other small optimization that I have been trying out, and which does seem to get me back a few more seconds in my control dataset, is to use INDEXSEEK() in the GetCellRecord method instead of SEEK().

If you don't want to talk about m dots, how about tabs vs spaces... :-)

gagreen1214 Mar 13 at 5:00 PM 
rkaye & BrettHudson--

In your implementation of AddStringValue() where you are using a single property for the counter may cause problems if you have multiple workbooks created in the class. The counter has to be specific to a workbook; your implementation is global to the class. I am not sure what this will do to the internal xml's that are generated for each workbook. So caution here and test. The current release now has a property for managing the various counters which is an array for managing multiple workbooks.

Greg

gagreen1214 Mar 13 at 4:50 PM 
rkaye--

My code uses a property for the counter. But it is expanded to for other counters as well and uses a method to determine which property to increment. The cursor name is passed as a parameter to this method which uses a CASE statement to determine the property to increment.

I guess the overhead is in the method call and CASE lookup. I also don't use the m.var notation due to style preference (I know the consequences and do not want to enter into a debate...) which also adds overhead for large rows.

rkaye Mar 13 at 2:53 PM 
Thanks for the quick update, Greg.

Here are the results for the first test:

03/13/2017 10:47 AM EXPORTPICKER
Total time to export: 99.007
Total rows: 3452
Total cells: 113916

This is using the same dataset as earlier. I have not done multiple passes yet but the first results are about 10% slower than Brett's change to use a straight memvar based counter.

gagreen1214 Mar 13 at 1:53 PM 
All--

I apologize for missing this bug. I had renamed the event name and missed updating the value in the methods to save from a grid and a table. This has now been fixed.

Greg

rkaye Mar 13 at 12:52 PM 
Trying to test and got this message. I will work around.

---------------------------
*ERROR*
---------------------------
Error: 1734
LineNo: 72
Message: Property OnShowMessage is not found.
Procedure: savetabletoworkbook
Details: OnShowMessage
StackLevel: 4
LineContents: RAISEEVENT(this, "OnShowMessage", 0, lnRowCount, 1, 1)

rkaye Mar 13 at 11:57 AM 
Hi Greg,

I'll try out the new release and report back here.

gagreen1214 Mar 12 at 1:34 AM 
Hello All--

Reading through your comments I wonder how Release 18 compares. I have changed the way GetXMLString creates the XML string and how a XML string is converted to a string so I would like to know how this works. Also, what is the save time for a large spreadsheet? How does it compare between R17 and R18?

DanGoodwin Mar 8 at 4:52 PM 
Hi Brett,

I just noticed that myself. The first RETURN should be RETURN nvl(tcstring,'') -- Duh! (I deleted the first post so nobody uses it)

making it:

I've had some problems with ASC characters below 32 as well as above 128. Excel seems to have it's own way of encoding those. Here's my GetXMLString with m. and a bailout at the top for when CHARTRAN of ASC 32-128 returns empty. Testing on a 100,000 street addresses it cut the time from 90 sec to 6 sec.

LPARAMETERS tcString
Local lcChar, lcString, lcXMLString, lnChar, lnNdx

DO case
case isnull(m.tcString)
RETURN ""
case EMPTY(RTRIM(m.tcString));
OR EMPTY(CHRTRAN(UPPER(m.tcString)," ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890-',./|\?;:[]{}_=+)(*^%$#@!~",''))
RETURN m.tcString
otherwise
IF EMPTY(m.tcString)
RETURN ""
ENDIF
m.lcString = RTRIM(m.tcString)
m.lcString = STRTRAN(m.lcString, CHR(38), '&')
m.lcString = STRTRAN(m.lcString, '>', '>')
m.lcString = STRTRAN(m.lcString, '<', '&lt;')
m.lcString = STRTRAN(m.lcString, '"', '&quot;')

IF this.CodePage = 0
m.lcXMLString = ""
FOR m.lnNdx=1 TO LEN(m.lcString)
m.lcChar = SUBSTR(m.lcString, m.lnNdx, 1)
m.lnChar = ASC(m.lcChar)
m.lcXMLString = m.lcXMLString + ICASE(m.lnchar<32,'_x00'+STRCONV(m.lcChar ,15 )+'_',;
BETWEEN(m.lnChar,32, 128), m.lcChar,;
"&#" + TRANSFORM(m.lnChar) + ";")
ENDFOR
ELSE
m.lcXMLString = m.lcString
ENDIF
RETURN m.lcXMLString
endcase

BrettHudson Mar 7 at 10:35 PM 
Hi Dan,

I may be doing it wrong but I tried the new code for the GETXMLSTRING but ended up with an XLSX file that had no headers (row 1 had all blank value cells) and the data in the remaining rows and cells had blank or zero data. It would appear that if a value in a field of the table being loaded up was a straight character field and did not have any 'special characters' in it, it was returned back as blank. If the field had a date or a numeric value, or one of the characters that are specifically tested for, such as the ampersand, it looked all OK. It is something to do with the CHRTRAN statement. If I comment out that test, all works well and the row headers are correctly written into row 1 and 'straight' character fields are also correctly loaded up.

rkaye Mar 6 at 1:49 PM 
The other thing I'm just thinking through a bit more is that Greg's original method has a cursor parameter so that implies more than one counter could be necessary.

rkaye Mar 6 at 1:27 PM 
You're working with larger datasets than I am for the most part, at least as far as number of columns is concerned.

Yes, the XML is very finicky about what it considers illegal characters. In an ideal world there would be a method in the library that uses regex to sanitize but at the moment I'm handling it in preprocessing before invoking the SaveTableToWorkbook method.

BrettHudson Mar 6 at 2:58 AM 
Good work rkaye,

I have been testing with files that have over 70,000 rows and about 250 columns. The thing I noticed is that in the older version it would become progressively slower and slower but with the amended code it keeps plowing through the records at pretty much the same speed as it is no longer doing the SQL select for each cell. However, it still takes quite some time to get through a big file so hopefully there may be some other similar areas that can be enhanced for speed.

PS> I also updated the code in the GETXMLSTRING method as I came across some data in my testing table that was invalid and it error'd out. I put in code to check for unreadable ASC() characters :

LPARAMETERS tcstring
LOCAL lcstring, lcxmlstring, lnndx, lcchar
IF LEFT(tcstring, 1) != " " .AND. EMPTY(tcstring)
RETURN ""
ENDIF

lcstring = STRTRAN(tcstring, CHR(38), '&amp;')
lcstring = STRTRAN(tcstring, '&', '&amp;')
lcstring = STRTRAN(lcstring, '>', '&gt;')
lcstring = STRTRAN(lcstring, '<', '&lt;')
lcstring = STRTRAN(lcstring, '"', '&quot;')

IF THIS.CODEPAGE = 0
lcxmlstring = ""
FOR lnndx=1 TO LEN(lcstring)
lcchar = SUBSTR(lcstring, lnndx, 1)
lnchar = ASC(lcchar)
IF lnchar < 32 &&& BCH unreadable/unprintable characters can cause errors within Excel so reset to space
lnchar = 32
lcchar = ' '
ENDIF
lcxmlstring = lcxmlstring + IIF(lnchar < 128, lcchar, "&#" + TRANSFORM(lnchar) + ";")
ENDFOR
ELSE
lcxmlstring = lcstring
ENDIF
RETURN lcxmlstring

rkaye Mar 3 at 10:03 PM 
Hi Brett,

I implemented your code and it does seem to help. I've got some benchmarking code integrated for tracking how long these exports take and got these results for an export that has about 3500 rows of data with around 30 columns:

03/03/2017 04:22 PM EXPORTPICKER
Total time to export: 219.518
Total cells: 113916
03/03/2017 04:27 PM EXPORTPICKER
Total time to export: 91.943
Total cells: 113916

The time went from almost 4 minutes to a minute and a half.

rkaye Mar 1 at 2:01 PM 
Hi Brett,

I will try that out when I have a few spare cycles and report back here. I also primarily use SaveTableToWorkbook. One enhancement I can think of immediately is the counter should be reset after the export is complete so that multiple exports from the same instance of the object will always start from 0.

BrettHudson Mar 1 at 12:05 AM 
Hi Rkaye and Greg et al,

I have been able to get some speed enhancement by a slight adjustment to some of the code as follows (see below). Note that I am using this code principally for the SaveTableToWorkBook method and therefore am only really producing single page worksheets. The changes detailed below haven't been fully tested against other methods, or having multiple worksheets in the same XLSX file, so there may be some issues that I am not aware of so the following code updates are somewhat 'buyer beware'. Hopefully others with better VFP skills than I will be able to review it. Anyway, the following can be a start to see if there are other enhancements that can be made or if there are issues with the changes I have outlined below.

1. I setup a new variable called LNNEXTID and initialised this to 0 in the OnInit method as follows:
*-*
*-* TODO: Add your code here
*-*
THIS.lnnextid = 0 &&bch

2. I amended the code in the AddStringValue method as follows by commenting out the call to the GenNextID method and replaced it with just a straight update from the new LNNEXTID counter variable which is incremented by 1 AFTER each call. Note that its initial value is 0:

LPARAMETERS tnwb, tcstring, tlinline
LOCAL lcstringvalue, lnstringid, llpreservespace
IF tlinline
llpreservespace = IIF(LEFT(tcstring, 1) = " ", true, false)
lcstringvalue = THIS.getxmlstring(tcstring)
*!* lnstringid = THIS.getnextid(tnwb, "xl_strings") &&& BCH
lnstringid = THIS.lnnextid &&& BCH
THIS.lnnextid = THIS.lnnextid + 1 &&& BCH
INSERT INTO xl_strings (ID, workbook, stringvalue, STRING, presvspace, FORMATTED) VALUES (lnstringid, tnwb, lcstringvalue, tcstring, llpreservespace, true)
ELSE
IF SEEK(BINTOC(tnwb)+PADR(UPPER(LEFT(tcstring, 180)), 180, " "), "xl_strings", "stringndx") .AND. xl_strings.FORMATTED = false
lnstringid = xl_strings.ID
ELSE
llpreservespace = IIF(LEFT(tcstring, 1) = " ", true, false)
lcstringvalue = THIS.getxmlstring(tcstring)
*!* lnstringid = THIS.getnextid(tnwb, "xl_strings") &&& BCH
lnstringid = THIS.lnnextid &&& BCH
THIS.lnnextid = THIS.lnnextid + 1 &&& BCH
INSERT INTO xl_strings (ID, workbook, stringvalue, STRING, presvspace, FORMATTED) VALUES (lnstringid, tnwb, lcstringvalue, tcstring, llpreservespace, false)
ENDIF
ENDIF
RETURN lnstringid

rkaye Feb 27 at 3:52 PM 
Hi Brett,

Greg has indicated to me in the past that he hasn't really had a need to use this with larger data sets so he has not devoted his resources to optimizations; instead concentrating on features. Every now and then when I need to take a break from what I'm doing, I take a run through the code to see if I can find some optimizations as I've also noticed some pretty significant performance issues with larger record sets. I haven't found any magic bullets yet. It would be good for the project (and all of us using it) to have more eyes on possible ways to optimize.

BrettHudson Feb 27 at 2:13 AM 
Hi Greg,

Just further to my earlier comments about the speed with the SaveTableToWorkBook method, I have also noticed that the more records that are to be exported from a table, the slower it progressively becomes. For example the first 1000 records to be loaded may take 20 seconds but say the 10,000 to 11,000 records may take a 10 or so minutes. I am wondering whether there is an index on a table that is not optimised so that Rushmore optimisation is not coming into play when performing SCAN's etc.

MarianoZ Feb 24 at 11:54 AM 
Hello Greg, I started to use XLSX Workbook, and like Brett, I've also issues saving, kind of, big dbf files (50000 records).
Other thing I've noted is that, if I filter the table (set filter to) and then use SaveTableToWorkBook, the xlsx file, contains blank rows before and after the filtered data. It's like the xlsx file contained all the records, but only shows the filtered ones.
Thanks a lot for your work!!

BrettHudson Feb 24 at 3:18 AM 
Hi Greg,

Again, a fantastic piece of work. I have been using it for a little while now but have had some issues when trying to export large VFP tables using the SaveTableToWorkBook method. We have some tables we are trying to copy out to Excel where the number of records in the table is greater than the old XLS row limit of 65536. We use the SaveTableToWorkBook method to copy out tables with large numbers of records but the process can be very slow. It can take up to or over a hour to export the details of some of the larger files. I was wondering whether there was some scope within the code to speed things up a bit. Typically when exporting a table, say as distinct from exporting a grid, all of the data in any given field will have all of the same attributes, although of course the actual data itself could be different. I was wondering whether there was any way to have some kind of auto column setup so that rather than having to setup styles and formatting on a cell by cell basis, if using a table as input, it could set this up using a column format? I'm not sure how but maybe somebody out there knows of a way.

gagreen1214 Feb 18 at 3:42 AM 
DanGoodWin--

I missed your point on which CASE statement (the code has been changed in this method but not this CASE section). It is interesting that you have found a situation where a decimal is not present in the Date-Time format code for Excel. The date-time is encoded with whole number part being the date and the decimal part being the time (hence date-time). Without the decimal it would just be a date format which typically has a different formatting code applied to the internal tables. But I will now apply your changes to the code. Thank you for finding this and being persistent to point it out.

DanGoodwin Feb 17 at 4:00 PM 
Hi Greg,

Starting from line 124 in ReadCellValueFormat, I have:

CASE INLIST(t_cellxfs.numFmtId, 22, 29, 30, 31, 32, 33, 34) && Date-time format
lnNdx = ATC(".", loCellFormat.CellVal)
lcWholeNbr = LEFT(loCellFormat.CellVal, lnNdx-1)
lcDecNbr = SUBSTR(loCellFormat.CellVal, lnNdx+1)
loCellFormat.CellVal = lcWholeNbr + "." + PADR(lcDecNbr, 17, "0")
loCellFormat.DataType = DATA_TYPE_DATETIME
loCellFormat.NumDec = -1

Unless I'm looking at an old version, it looks like there's a case statement on the Date format, but not the DateTime.

rkaye Feb 16 at 5:08 PM 
I've been using SaveTableToWorkbook quite successfully without any problems, Andrew. Perhaps you can post some detail on what these issues are?

gagreen1214 Feb 16 at 2:14 PM 
All--

I have been very busy over the last number of weeks in a customer escalation with my job. Hopefully in a couple of weeks I will be able to devote time to the changes being made in the current code for Release 18.

Greg

tiredoftry Feb 13 at 2:45 PM 
very nice project Greg. Are you planning a release soon as noted in your December comment and if so, does it resolve issues with the 'SaveTableToWorkbook' when a table is already in use?

@Akselsoft, can you share the issues you found when using this function? I am about to implement in this manner and would be grateful with your findings so I can avoid/address the same issues.

Akselsoft Feb 13 at 8:48 AM 
Great project - had a few issues with the SaveTableToWorkbook() - when using a table that's already open. But much better way for creating Excel files.
Great trick using the Shell.Application to save the actual Excel files.

DanGoodwin Dec 22, 2016 at 5:36 PM 
Thanks for sharing this great tool.

I beleive I found a bug in ReadCellValueFormat(). If a datetime cell value doesn't have a decimal in it, the method returns the integer as the decimal. So "42709" becomes ".42709000" instead of "42709.0000". The line lnNdx = ATC(".", loCellFormat.CellVal) returns 0, which causes the next lines' LEFT and SUBSTR to incorrectly parse the string. Suggested fix is to change the next two lines:

lcWholeNbr = IIF(lnNdx=0, loCellFormat.CellVal, LEFT(loCellFormat.CellVal, lnNdx-1))
lcDecNbr = IIF(lnNdx=0,"", SUBSTR(loCellFormat.CellVal, lnNdx+1))

gagreen1214 Dec 12, 2016 at 1:19 AM 
Please send me an email with your code changes and I will incorporate. I have some other minor bug fixes and getting ready for another release - so I will wait for your code to incorporate. Thank you.

As for the method SetColumnBestFit() I have not worked on it in a while now. Not sure when I will...

Greg

tfederer_ipap Dec 6, 2016 at 4:58 PM 
The SaveTableToWorkbook function does not export rows in the correct order when an index is set. I was able to fix this by using a row counter variable instead of RECNO() when exporting.

I also added a wait window with the export progress so users can tell the export is still runnning when exporting large tables.

I would like to submit these code changes to the project. How can I do that?

tfederer_ipap Dec 5, 2016 at 4:04 PM 
The documentation for the method SetColumnBestFit says "this method is not yet fully working and is not
currently saved in the sheet". Is this feature currently being worked on? Will this be in the next release?

gagreen1214 Sep 30, 2016 at 12:38 AM 
Matt--

This is now fixed in Release 17. Thank you for sending a sample file to test with.

Greg

mattslay Aug 28, 2016 at 11:56 AM 
Hi Greg - I'm experimenting with this library for use in my app. When I opened a file, set some cell values, and then saved it, all of the cell coloring are stripped out and any cell border lines are removed.

I sent you an email with screenshot. Also attached the file in you need it to study the situation.

gagreen1214 Jul 17, 2016 at 2:27 AM 
tamayok--

There is a method to read an XLSX file into the cursors -- see OpenXlsxWorkbook() method; there are other methods for returning the cell values, formulas, and formatting.

Greg

tamayok Jul 15, 2016 at 3:42 AM 
Just became aware of this GREAT tool... THANKS!

I second the request to have a method exactly like requested by: oscarrd - May 14

I really need the inverse to your: 'tabletoexcel.prg' ... ExcelToCursor()


what names for fields? .T. if the first record is the fieldname or .F. if a standard 'Column1', 'Column2'... if the name provided fails DBF stds, assign 'Column1', 'Column2'..

what columns? all populated ones

what field size? the longest required by the data in the cell or 255-56; it is not perfect but at least it is something.

DougHennig Jul 1, 2016 at 6:01 PM 
Hi Greg.

I can see the issue with the grid: I was creating one on the fly rather than using one in a form as a quick-n-dirty way to get a spreadsheet from a cursor with a little bit of formatting.

Doug

gagreen1214 Jul 1, 2016 at 5:02 PM 
Doug--

I have incorporated your comments into the code base except for #5 at this point. There was a problem with having a private datasession separate from the grid. I had another user have a problem with the grid losing the cursorsource setting when switching datasessions. So there is the need to define the working cursors in the same datasession as the grid. I also agree with your basic comment in #5 so I am looking at how to accommodate both needs. I also elected not to set the General field type value to "" in the code location that you gave; instead I added this to the SetCellValue() method (DO CASE on VARTYPE) so that it would be applicable for any use of General field types.

Greg

DougHennig Jun 30, 2016 at 10:22 PM 
Hi Greg.

I have some other suggestions:

1. If you have a sheet named "Test1" and want to add a sheet named "Test", the SEEK in AddSheet can find the Test1 record if EXACT is set off. I changed it to this:

*** DH 2016-06-30: use PADR on seek so no partial name match
***IF SEEK(BINTOC(tnWB), "c_workbooks", "workbook") .AND. !SEEK(BINTOC(tnWB)+UPPER(tcSheetName), "c_sheets", "shname")
IF SEEK(BINTOC(tnWB), "c_workbooks", "workbook") .AND. !SEEK(BINTOC(tnWB)+UPPER(padr(tcSheetName, len(c_sheets.shname))), "c_sheets", "shname")

2. SaveTableToWorkbook: accept an optional tcSheetName parameter:

*** DH 2016-06-30: accept optional tcSheetName parameter
***LPARAMETERS tcAlias, tcWBName, tlFreeze, tlSaveWB
LPARAMETERS tcAlias, tcWBName, tlFreeze, tlSaveWB, tcSheetName

Use the parameter if passed:

*** DH 2016-06-30: use either the alias or tcSheetName as the sheet name
*** lnSh = this.AddSheet(lnWB, lcAlias)
local lcSheetName
if vartype(tcSheetName) = 'C' and not empty(tcSheetName)
lcSheetName = tcSheetName
else
lcSheetName = lcAlias
endif vartype(tcSheetName) = 'C' ...
lnSh = this.AddSheet(lnWB, lcSheetName)
*** DH 2016-06-30: end of new code

3. Same with SaveGridToWorkbook:

*** DH 2016-06-30: accept optional tcSheetName parameter
***LPARAMETERS toGrid, tcWBName, tlFreeze, tlSaveWB
LPARAMETERS toGrid, tcWBName, tlFreeze, tlSaveWB, tcSheetName

Use the parameter if passed:

*** DH 2016-06-30: use either the workbook name or tcSheetName as the sheet name
*** lnSh = this.AddSheet(lnWB, JUSTSTEM(tcWBName))
local lcSheetName
if vartype(tcSheetName) = 'C' and not empty(tcSheetName)
lcSheetName = tcSheetName
else
lcSheetName = JUSTSTEM(tcWBName)
endif vartype(tcSheetName) = 'C' ...
lnSh = this.AddSheet(lnWB, lcSheetName)
*** DH 2016-06-30: end of new code

4. Have SaveGridToWorkbook call GetWorkbook if CreateWorkbook returns 0 like SaveTableToWorkbook does:

lnWB = this.CreateWorkbook(tcWBName)
*** DH 2016-06-30: if workbook exists, get reference to it
IF lnWB = 0
lnWB = this.GetWorkbook(tcWBName)
ENDIF
*** DH 2016-06-30: end of new code

5. Both SaveTableToWorkbook and SaveGridToWorkbook call CreateWorkingCursors which prevents them from adding a new sheet to an existing workbook if OpenXLSWorkbook is called first. I commented out the code calling CreateWorkingCursors and DestroyWorkingCursors in both methods since they aren't needed (Init creates the cursors):

*** DH 2016-06-30: don't call CreateWorkingCursors: they're already open and that prevents
*** calling OpenXlsxWorkbook first to append to existing workbook
***this.AllowChangeDE = False
***IF this.UsePrivateDE
*** this.CreateWorkingCursors()
***ENDIF

and:

*** DH 2016-06-30: don't call DestroyWorkingCursors.
***IF this.UsePrivateDE
*** this.DestroyWorkingCursors()
***ENDIF

6. If the grid HeaderHeight is 0, don't output column headers in SaveGridToWorkbook:

FOR lnCol=1 TO toGrid.ColumnCount
*** DH 2016-06-30: added next line so only output headings if HeaderHeight > 0
if toGrid.HeaderHeight > 0
this.SetCellValue(lnWB, lnSh, lnRow, lnCol, toGrid.Columns(lnCol).Header1.Caption)
*** DH 2016-06-30: added ENDIF
endif toGrid.HeaderHeight > 0
this.SetColumnWidth(lnWB, lnSh, lnCol, this.ConvertPixelsToExcelUnits(toGrid.Columns(lnCol).Width))
ENDFOR
*** DH 2016-06-30: if no headers, start in first row
if toGrid.HeaderHeight = 0
lnRow = 0
endif toGrid.HeaderHeight = 0
*** DH 2016-06-30: end of new code

Doug

gagreen1214 Jun 21, 2016 at 1:36 PM 
Doug--

Thank you for this addition -- I will incorporate as soon as I can and will post an update.

Greg

DougHennig Jun 20, 2016 at 5:39 PM 
Hi Greg.

A small bug in SaveGridToWorkbook: it should ignore General fields to avoid an error (not a common thing, I realize):

FOR lnCol=1 TO toGrid.ColumnCount
lcField = toGrid.Columns(lnCol).ControlSource
*** 2016-06-20 DH: use blank for General fields
if type(lcField) = 'G'
this.SetCellValue(lnWB, lnSh, lnRow, lnCol, '')
else
*** 2016-06-20 DH: end of new code
this.SetCellValue(lnWB, lnSh, lnRow, lnCol, &lcField)
*** 2016-06-20 DH: added ENDIF
endif
ENDFOR

Doug

BennyThomas Jun 10, 2016 at 10:08 PM 
Hi Greg,

I downloaded the latest class. Now I am getting error on wordwrap.

Error 13 occurred @ line 7 of VFPXWORKBOOKXLSX.SETCELLWORDWRAP.
Message: Alias not found
Error 13 occurred @ line 9 of VFPXWORKBOOKXLSX.SETCELLWORDWRAP.
Message: Alias 'C_CELLS' is not found.
With the previous class (Workbookxlsx) there was no problem.

rega10 Jun 1, 2016 at 9:15 AM 
Hi Greg, great work!

Is there a way to deactivate the gridlines similar to DisplayGridlines = False when using automation?
Thank you!
RG

John_Harbour May 27, 2016 at 3:05 PM 
Hi, Greg

Just some thoughts on the design of a ToCursor() method.

It should take two parameters, the name (alias) of the cursor and the number of the sheet to be imported.

The fields names could be F1, F2, ... F255. Make each a char(254) and populate with the headers and data, converted to char where necessary.

The user programmer can then do
select left(f1, 20) as foo, val(f2) as bar, etc. into MyCursor
to create another cursor with the data converted as she wants it.

The cursor you create should be READWRITE to allow rows to be deleted, values changed etc.

Hope this is useful.
Regards

John

John_Harbour May 26, 2016 at 2:02 PM 
Hi,
Thanks for this. It's very useful. I found a couple of bugs today. Methods SetCellFont() and GetCellDatatype() contain two calls to SetDefaultDataSessionId(), rather than one to SetClassDataSession() and another to SetDefaultDataSessionId().

Regards
John Harbour

PaladinTodd May 17, 2016 at 2:01 PM 
I don't see any distribution notes for this. Does a client need anything beyond the source code contained in the ZIP? No Dlls? No ODBC install?

gagreen1214 May 16, 2016 at 1:56 AM 
All--

I have updated the class to include support for British Pound and Euro currencies -- thanks to another on this site who did most of the ground work for this. I also corrected some bugs in the class using a private datasession for its cursors. I and another person have been using these updates for some time to try to make sure the bugs are worked out before my posting.

I believe this is now stable and want to release this.

As for a ToCursor() method... I would have questions: what columns? what names for fields? I have not included this type of method as I am not sure how I would 'decide' the field names. There is a method for reading the spreadsheet into the internal cursors. Once read in, there are methods to determine the cell values, number of rows and number of columns. From this you could write your own implementation of storing to a cursor or table.

oscarrd May 14, 2016 at 4:06 PM 
Hello,
it's there a way to import and XLSX to a cursor?
I have read documentation and test program from the folder but I have not found a way to do this

Something like:
lcFile="myExcelfile.xlsx"
loExcel = NEWOBJECT("VFPxWorkbookXLSX", "VFPxWorkbookXLSX.vcx")
loExcel.OpenXlsxWorkbook(lcFile)
loExcel.SaveToCursor("MyCursor")

select MyCursor


Thanks

rkaye Apr 4, 2016 at 4:40 PM 
Hi Brett,

I just noticed that there is a release 12 available which came out in February. (The version I've been working with is v10.) It is marked as a beta but might be worth checking out.

rkaye Apr 1, 2016 at 6:18 PM 
It also doesn't seem to be very fond of currency symbols other than $.

rkaye Apr 1, 2016 at 2:43 PM 
Looks like I'm running into similar issues with SET POINT. There is some discussion in the comments here about a possible solution being implemented. Was this ever done?

TIA

rkaye Mar 31, 2016 at 9:02 PM 
Hi Brett,

Sounds like a good reason to use the save from grid. At the moment I don't have that requirement but if I ever do I will report back here.

I just added regular old Excel automation code to open the spreadsheet after it gets created. With the appropriate prompts and exception handling, of course. :-)

BrettHudson Mar 28, 2016 at 11:03 PM 
Thanks Rkaye for reviewing this. Again, I think that this is a great routine.

The issue we are having appears to be related with saving off and restoring the data sessions. I note that if I comment out the code in the SetClassDataSession and SetDefaultDataSession procedures in the class that I do then have access to all the control source settings in the passed grid object. However, if I comment out the code in these procedures, I then get messages asking if I would like to overwrite the working XML files that are setup in the WriteSupportXMLS procedure (such as CORE.XML). There appears to be some issue with saving off and then resetting the default data sessions.

PS. Like you, we are placing the class directly on the form itself and then calling if from a button on the form.

PPS. It would also be nice if there were some procedure type call to actually cause the XLSX file that was created, be it from the grid or the table, to be activated and displayed to the screen. At present (unless I am missing something), it creates the XLSX field but doesn't actually access it.

BTW. The reason we are using the grid to extract the data from, rather than just a table, is that we have developed our grid processing so that the end users can dynamically amend the grid layouts, (the header caption, the individual fields, setup new fields etc) and as we want the XLSX workbook to reflect the details of the grid as displayed on the screen rather than the underlying cursor, we need to export the details of the grid.

rkaye Mar 25, 2016 at 1:23 PM 
Hi Brett,

I've just finished ripping out all my old copy to type xl? and replacing with this class. I have not implemented the savegridtoworkbook method so can't offer any specific advice on that. I may poke at it a bit but I've found the SaveTableToWorkbook method to be so simple to use I haven't bothered with the grid method. Here are a few things I've noted (consolidating my previous comments into one for easier reading).

Excel is sensitive to the presence of lower ASCII control characters in the XML. You will need to sanitize your data. I found the Microsoft Open XML SDK 2.5 Productivity Tool for Microsoft Office to be invaluable for troubleshooting the XML created by the class.

My standard implementation has to been to drop the class on a form as opposed to creating in method code, etc. I found during my testing that it was helpful to reset things before each call to create a workbook. There doesn't appear to be an explicit reset method in the class but I found that deleteallworkbooks does the trick. But, if you do that it also wipes out all the working cursors. The answer I found was to trigger the assign method on the codepage property.

The createworkingcursors method does not save and restore the current workarea. As I've primarily implemented the class by using the SaveTableToWorkbook method this bit me on a few occasions. I've updated my local copy of the class to do that.

Kudos again to the folks who put this class together! Any plans to tackle the DOCX file format? :-)

BrettHudson Mar 24, 2016 at 4:36 AM 
This looks to be great code and I am very keen to get it up and running.

However, I am having a couple of issues trying to get it working when exporting the contents of a grid using the 'SaveGridtoWorkbook' procedure. (It may be something with the setup of our forms but I am not sure).

We have our own locally developed class that programmatically generates the grids as the form itself is generated at run time. (We drop our local grid class onto the form at design time). I find that if I place a copy of the XLSX workbook class directly onto my form, and I then execute the following code (passing the name of the local grid on the form):

thisform.VFPXWORKBOOKXLSX1.savegridtoworkbook(mylocalform.grid, "My Local Grid Workbook")

that the code in the 'SaveGridToWorkbook' procedure is unable to 'see' the COLUMNS[i].CONTROLSOURCE field. They are all being processed as blank.
(I.E. the lcfield = togrid.COLUMNS(lncol).CONTROLSOURCElcfield line of code resolves to blank). Interestingly, the code that sets up the workbook headers works in that the correct 'captions' are being written to the worksheet column headings.

Not sure why this is happening but would appreciate any suggestions as to what may the issue.

Thanks, Brett

rkaye Mar 8, 2016 at 12:11 PM 
I'm going to handle the illegal characters (ASCII values below 32) by removing them via regex. (I use Craig Boyd's FLL for that.) My gut is that this should be part of the getxmlstring method which currently handles transforming &, <, >, and ' but for now I'm doing this before passing the data over to VFPXWorkbookXLSX.

rkaye Mar 7, 2016 at 6:16 PM 
In my research I found something called the Open XML 2.5 Productivity Tool for Microsoft Office and it's proving invaluable in figuring out what is making Excel unhappy. For example, if it doesn't seem to like ampersands in property values. (i.e. setting CompanyName to "My Company & Son" makes Excel complain when opening the file).

It also appears to dislike other characters such as the one referenced below (the fun continues):
---------------------------
Open XML SDK 2.5 Productivity Tool for Microsoft Office
---------------------------
Cannot open the file: Part /xl/sharedStrings.xml: '', hexadecimal value 0x18, is an invalid character. Line 1373, position 2368.

rkaye Mar 4, 2016 at 9:45 PM 
One more comment before I put this away for the weekend. I'm going to guess that this implementation is for an earlier version of the OpenXML spec ( > 2.5 ) and there's something that needs to be updated to avoid Excel 2013 thinking the files created by the class are corrupt. If any work is still being done on this library it would be good to know before I start duplicating efforts.

rkaye Mar 4, 2016 at 12:53 PM 
Some more feedback. I'm finding that Excel 2013 detects what it thinks is file corruption. Allowing the repair to proceed will let the file be opened. My best guess at this point is it's unhappy about something in the metadata XML but unfortunately Excel's log doesn't provide any useful details.

rkaye Mar 2, 2016 at 5:13 PM 
Excellent work in this library! And I love the documentation!

I recently had to refactor an old form that was using COPY TO...TYPE XLS and decided to replace with this. For the most part things have been easy to understand and implementing is a snap. I added an instance to a form and went from there. I have a couple of thoughts to share. First, I think that integrating this into an EXE requires some refactoring on the error handling. I'd rather have my application level handler kick in instead of throwing up messageboxes with Microsoft Visual Foxpro captions. Second, whilst trying to troubleshoot some issues I was having with repeated calls to create a file, I thought that it was possible some properties needed to be reset to default values after each invocation of SaveTableAsWorkbook. In that process I put a call to deleteallworkbooks at the beginning of my method. I'm not sure if you would classify this as a bug but the destroyallworkbooks method closes the c_strings cursor. According to the comments I saw in the code, the creation of this particular cursor was moved out of the createcursors method into the codepage_assign method. This meant that the first time it worked fine and subsequent passes returned alias not found errors. Once I realized what was happening it was easy enough to handle but I thought I would mention it.

Again, really good work in this class and I tip my hat to you for making it so easy to use.

jbrand Feb 19, 2016 at 11:48 AM 
The problem with the decimal point also exist in other methodes. For example in SetCellValue() you used lcCellValue = TRANSFORM(lnWholePart + lnFractPart). If set point is set to "," then Excel ignores the fractional part. Or some lines later you look with ATC("." explicit for a point, so this also wont work. I think possible problems can occur in all methodes where you write xml-contents with transform etc.
Perhaps it could be a solution, if you put the whole class in a separate datasession. Then you can setup your environment independent from the users settings.

gagreen1214 Feb 11, 2016 at 1:16 PM 
Thank you for the tip on the decimal point -- I did find that as the potential problem but trying to solve when writing the numeric formats. Will add this to the beginning of the save method and restore back to user default value.

jbrand Feb 11, 2016 at 8:29 AM 
First of all, many thanks for this great product. I have the same problem as Tom (hbgmail) posted on Dec 1, 2015. I think I have found the problem. It is the decimal point character. In most languages this is the period. If you change this setting via "set point to" for example to a comma (Germany), then all numeric values in the xml-files are printed with comma and Excel dosn't work with that. As a workarround I put set point to "." bevore SaveWorkbook() and directly after set point to ",".

DarrenL_DSS Jan 26, 2016 at 3:41 PM 
Many thanks for a great product, got me out of a real hole recently! One enhancement that would be really great to implement if possible is the Excel equivalent loWS.UsedRange.EntireColumn.Autofit. I know there is a column best fit option but this does not work correctly especially when the first few rows are merged cells for headings etc.

excelvbaisfun Dec 28, 2015 at 8:43 PM 
Hi Greg,

Thanks for the follow up. Sorry for my own delay. I wasn't sure if this forum would email me a reply. Anyways, great work and thank you so much for trying to find a method for this amazing task.

This is fantastic!

Yours,

Dan

gagreen1214 Dec 7, 2015 at 11:28 PM 
Dan--

I have the code in development for reading an XLSX file in development and have not yet released it on this forum. The code is partially working and still has some work to do in determining the cell types correctly.

Greg

excelvbaisfun Dec 7, 2015 at 8:03 PM 
This is really cool! I noticed the summary up top says you can Read XLSX files. I'm trying to use an XLSX file for importing as if it were an XLS file, or at least to form a cursor from the XLSX data, but I can't figure out how to create a cursor so that I can import data from that cursor or directly from the XLSX if necessary.

The old way is:
APPEND FROM (m.lcInFile) TYPE XLS

but how can we open an XLSX file and append our cursor from the XLSX file? Basically, how do we read this or create a useable cursor from it?

Thanks!!!
Dan

gagreen1214 Dec 6, 2015 at 2:28 PM 
hbgmail--

Please send me an example XML file with a PRG file that will reproduce the error and I will troubleshoot the problem. Send to anduril58 at hotmail dot com. Thank you.

Greg

hbgmail Dec 1, 2015 at 7:29 AM 
Hello,
I tried the not modified code on window7 pro with excel 2013. It produces a file but when I try to open it I always get "file must be repaired. sheet2.xml part with xml error.error on load line1 col0, sheet3.xml line1 col0, sheet4 line1, col 595"
If I agree , i can see it.
Does someone have the same effect ? Any ideas ?

I remember the same problems with an early version of Vilhelms Exel tools. He said it was something with the Euro sign.

Thanks a lot in advance
tom

gagreen1214 Oct 30, 2015 at 5:53 PM 
Cortiel--

I just tried the code:

this.SetRowHeight(lnWb,lnSh1, 6, 25)

In the method Demo() of the class for sheet 1 and the row height is set without any errors. I will need your XLSX file to be able to determine the cause of the problem. Also, please indicate which sheet number you are trying to set this value for the row height.

Greg

gagreen1214 Oct 30, 2015 at 5:08 PM 
Cortiel--

Please send me the XLSX file that you created with the oExcel.SetRowHeight(lnWb,lnSheet, 6, 25) statement so that I can debug the problem at anduril58 at gmail dot com. Thank you.

Greg

cortiel Oct 30, 2015 at 4:16 PM 
also, a method to add header and footers would be nice

P

cortiel Oct 30, 2015 at 4:15 PM 
everyting works great except when I add this line:

?oExcel.SetRowHeight(lnWb,lnSheet, 6, 25)


Excel claims the workbook is corrupted and does not open it :-(

gagreen1214 Oct 26, 2015 at 8:01 PM 
Thank you for the contribution -- will add this to the class (however, the reference to the r:id has to be dynamically determined as this applies to the sheet number - I will update to get this correct). The Margins can be set by the method SetSheetMargins() - see the Documentation() method for more details.

cortiel Oct 26, 2015 at 4:19 PM 
i added a property called pcLandscape to the object
and at the bottom of WriteSheetXMLs
if added a ifelse:

IF NOT this.pcLandscape
FWRITE(lhFile, '"/></worksheet>')
ELSE
FWRITE(lhFile, '"/><pageSetup orientation="landscape" r:id="rId1"/></worksheet>')
ENDIF

cortiel Oct 26, 2015 at 1:16 PM 
how do I change the layout to landscape and the margins to 0
other than that it's great!