your OOXML file (“.docx”, “.xlsx”, “.vsdx”, …) and its “modified” timestamp

VSDX is not listed as an OOXML-conformant file format, but for our purpose here we can treat it like one.

Your “.docx” (or “.xlsx”) file is a ZIP file with a docProps/core.xml inside:

$ unzip -l YOUR.docx
…
… docProps/core.xml
…

This is a convenient way to extract docProps/core.xml to STDOUT:

$ unzip -p YOUR.docx docProps/core.xml
…

This is how to get the XML reformatted using xmlstarlet (its executable is called xml here; some distributions install it as xmlstarlet instead):

$ unzip -p YOUR.docx docProps/core.xml | xml fo

This command line lists the element paths in the file, which you can use as XPath expressions:

$ unzip -p YOUR.docx docProps/core.xml | xml el
…
cp:coreProperties/dcterms:modified
…

How to extract “modified” to STDOUT?

$ unzip -p YOUR.docx docProps/core.xml | xml sel --template --value-of cp:coreProperties/dcterms:modified

And how to extract the timestamp with nothing but its decimal digits?

$ unzip -p YOUR.docx docProps/core.xml | xml sel --template --value-of cp:coreProperties/dcterms:modified | tr -d ':TZ-'
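
If you need this more than once, the steps above can be wrapped in two small shell functions. This is only a sketch: the names ooxml_modified and digits_only are made up for illustration, and it assumes that unzip and xmlstarlet are installed and that your shell supports "local" (bash and dash do). The lookup via "command -v" is there because the xmlstarlet executable is called xml on some systems and xmlstarlet on others.

```shell
# Hypothetical helper: remove the separators from a timestamp such as
# 2019-03-07T09:15:42Z, using the same tr call as above.
digits_only() {
    tr -d ':TZ-'
}

# Hypothetical helper: print the "modified" timestamp of an OOXML file.
# Assumes unzip and xmlstarlet are installed.
ooxml_modified() {
    local xml_cmd
    # Use whichever name the xmlstarlet executable has on this system.
    xml_cmd=$(command -v xml || command -v xmlstarlet) || {
        echo "xmlstarlet not found" >&2
        return 1
    }
    unzip -p "$1" docProps/core.xml |
        "$xml_cmd" sel --template --value-of cp:coreProperties/dcterms:modified
}
```

After pasting the functions into your shell (or sourcing them from a file), the last pipeline shrinks to:

$ ooxml_modified YOUR.docx | digits_only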

