Category: XML
-
converting a Jenkins CI job’s config.xml to several flat files (.properties, .sh, .bat, …)
Over time Jenkins jobs can grow into something “a little confusing”, in other words: like cancer.
The Jenkins developers were thoughtful enough to provide an API to all the data structures that Jenkins and its jobs operate on, so we are able to export an entire Jenkins job as XML. You certainly do not want to edit a shell script or a Windows batch script encapsulated within this XML. You are certainly not the first one to need an export facility for this, and a couple of approaches have been developed over time. I am trying to collect them here for you and myself. In the beginning I found only Ken Dreyer’s tool, and only after I had started developing something myself. Maybe NIH applies …
- https://github.com/ktdreyer/jenkins-job-wrecker – Ken Dreyer’s jenkins-job-wrecker – converts Jenkins job XML to JJB YAML
Translate Jenkins XML jobs to YAML. The YAML can then be fed into Jenkins Job Builder.
Have a lot of Jenkins jobs that were crafted by hand over the years? This tool allows you to convert your Jenkins jobs to JJB quickly and accurately.
Initially I was, and I still am, only interested in the “project/builders” build steps:
- project/builders/hudson.tasks.BatchFile/command
- project/builders/hudson.tasks.Shell/command
- project/builders/EnvInjectBuilder/info/propertiesContent
- project/builders/EnvInjectBuilder/info/propertiesFilePath
I am extracting the bits and pieces with XPath expressions, using XMLStarlet within shell scripts. Every build step goes into its own file, with a name derived from the step’s ordinal number within the Jenkins job. Have a look at this example:
- 00–___.properties
- 01–___.propertiesFilePath
- 02–___.sh
- 03–___.bat
These “raw names” push you to think about more reasonable names that will remind you of the files’ meaning and content from then on.
I called my script “jenkins_config2files.sh”. I am going to upload it to my GitHub account within a couple of days.
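The extraction described above can also be sketched in Python with the standard library's ElementTree instead of XMLStarlet. This is only an illustration, not the jenkins_config2files.sh script itself: the sample config.xml, the function name `extract_build_steps` and the exact file name pattern are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real config.xml comes from the Jenkins job export.
SAMPLE_CONFIG = """\
<project>
  <builders>
    <EnvInjectBuilder>
      <info><propertiesContent>FOO=bar</propertiesContent></info>
    </EnvInjectBuilder>
    <EnvInjectBuilder>
      <info><propertiesFilePath>build.properties</propertiesFilePath></info>
    </EnvInjectBuilder>
    <hudson.tasks.Shell><command>make all</command></hudson.tasks.Shell>
    <hudson.tasks.BatchFile><command>dir</command></hudson.tasks.BatchFile>
  </builders>
</project>
"""

# Build step tag -> list of (path relative to the step, file extension).
STEP_PATHS = {
    "hudson.tasks.Shell": [("command", "sh")],
    "hudson.tasks.BatchFile": [("command", "bat")],
    "EnvInjectBuilder": [
        ("info/propertiesContent", "properties"),
        ("info/propertiesFilePath", "propertiesFilePath"),
    ],
}


def extract_build_steps(config_xml):
    """Return {file name: content} for the supported build steps."""
    root = ET.fromstring(config_xml)  # root is the <project> element
    files = {}
    builders = root.find("builders")
    if builders is None:
        return files
    # The ordinal number is the step's position among all build steps.
    for ordinal, step in enumerate(builders):
        for path, extension in STEP_PATHS.get(step.tag, []):
            text = step.findtext(path)
            if text is not None:
                # Assumed name pattern, modelled on the listing above.
                files["%02d--___.%s" % (ordinal, extension)] = text
    return files


if __name__ == "__main__":
    for name, content in extract_build_steps(SAMPLE_CONFIG).items():
        print(name, "->", content)
```

Walking the children of “builders” in document order keeps the ordinal numbers aligned with the order of the build steps in the job, whatever their types are.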
-
once you get familiar with XPath and XMLStarlet, you use them even for rather “ordinary tasks”
- http://xmlstar.sourceforge.net
- https://www.cygwin.com – “Get that Linux feeling – on Windows“
- https://cygwin.com/cgi-bin2/package-grep.cgi?grep=xmlstarlet
- http://www.finkproject.org – “The Fink project wants to bring the full world of Unix Open Source software to Darwin and Mac OS X. …“
- http://pdb.finkproject.org/pdb/package.php/xmlstarlet
Areas where you will want to use XPath expressions and xmlstarlet to extract details:
- HTML web pages – if only more of them were proper XHTML! But xmlstarlet can even help you “tidy” improper HTML
- RSS feeds and esp. podcasts – listing all the MP3 files referred to within the feed
- various XML files …
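As an illustration of the podcast case, here is a minimal sketch in Python (ElementTree rather than xmlstarlet). The feed content, the example.org URLs and the function name `list_mp3_urls` are made up for this example, and it assumes that the MP3 files are announced as `enclosure` elements of type “audio/mpeg”.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed, for illustration only.
SAMPLE_FEED = """\
<rss version="2.0">
  <channel>
    <title>Example podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://example.org/ep1.mp3" type="audio/mpeg" length="1"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="http://example.org/ep2.mp3" type="audio/mpeg" length="1"/>
    </item>
    <item>
      <title>Show notes</title>
      <enclosure url="http://example.org/notes.pdf" type="application/pdf" length="1"/>
    </item>
  </channel>
</rss>
"""


def list_mp3_urls(feed_xml):
    """Return the URLs of all audio/mpeg enclosures, in feed order."""
    root = ET.fromstring(feed_xml)  # root is the <rss> element
    # The path is relative to <rss>; the predicate filters on the type attribute.
    enclosures = root.findall("channel/item/enclosure[@type='audio/mpeg']")
    return [enclosure.get("url") for enclosure in enclosures]


if __name__ == "__main__":
    for url in list_mp3_urls(SAMPLE_FEED):
        print(url)
```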
-
using XPath on non-XML HTML – how to tidy dirty HTML?
Scraping HTML using XPath is far nicer than low-level text processing. But how do you proceed if your XPath tool cannot deal with the HTML, because it is neither valid XHTML nor otherwise well-formed XML?
My XPath tool is XMLStarlet. It can also help to reformat HTML, so that XPath expressions can be applied. I pipe “dirty HTML” through this command line:
$ xmlstarlet fo --html --recover 2>/dev/null
-
Q: how to get updates from web pages w/o RSS feed? A: XPath + cron or Jenkins job
- sadly, even now in 2016 a lot of web pages are not valid XHTML, but getting them fairly conformant is not that expensive: use “xmlstarlet fo --html --recover”
- get the (cron or) Jenkins job to save the current page content in the job’s workspace
- let the Jenkins job compare the current to the last state …
- … and message you through XMPP, if there’s a change
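The “save and compare” steps could look roughly like this in Python. It is a sketch under assumptions: the function name `update_and_compare` and the file name are invented for this example, and fetching the page, tidying it and sending the XMPP message are deliberately left out.

```python
import tempfile
from pathlib import Path


def update_and_compare(content: bytes, saved_copy: Path) -> bool:
    """Save content as the new state; return True if it differs from the old one.

    On the very first run there is nothing to compare with, so the result
    is False.
    """
    previous = saved_copy.read_bytes() if saved_copy.exists() else None
    saved_copy.write_bytes(content)
    return previous is not None and previous != content


if __name__ == "__main__":
    # A temporary directory stands in for the Jenkins job's workspace.
    workspace = Path(tempfile.mkdtemp())
    saved = workspace / "page.html"
    for page in (b"<p>old</p>", b"<p>old</p>", b"<p>new</p>"):
        print(update_and_compare(page, saved))
```

In a Jenkins job the return value would decide whether a notification gets sent.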
-
Jenkins: how to authenticate as a scripted client?
To make scripted clients (such as wget) invoke operations that require authorization (such as scheduling a build), use HTTP BASIC authentication to specify the user name and the API token. This is often more convenient than emulating the form-based authentication.
The article quoted above mentions “buildToken”, but I don’t need it at all.
The article quoted above has a section on wget. It recommends using “--auth-no-challenge” and also “--secure-protocol=TLSv1”, but a simple “wget --http-user=user --http-password=apiToken” works for me. The article explains where to find the user’s apiToken (→ within the Jenkins user’s own configuration).
I also successfully tried “curl --user user:apiToken”. wget’s “--auth-no-challenge” corresponds to curl’s “--basic”. (But I will only apply them once I am stuck without them.)
My Jenkins URLs are usually …/api/xml ones, and I use XMLStarlet for the XPath-style extractions. My command lines look like this:
$ wget --quiet --output-document - --http-user=user --http-password=apiToken .../api/xml | xmlstarlet ...
$ curl --silent --user user:apiToken .../api/xml | xmlstarlet ...
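The same kind of request can be prepared from Python with the standard library. This is a minimal sketch, not something taken from the Jenkins article: the host jenkins.example.org, the job name and the function name `build_request` are placeholders. The Authorization header is set explicitly, so the credentials go out with the first request instead of after a challenge from the server.

```python
import base64
import urllib.request


def build_request(url, user, api_token):
    """Build a request that carries HTTP Basic credentials from the start."""
    credentials = "%s:%s" % (user, api_token)
    encoded = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + encoded)
    return request


if __name__ == "__main__":
    # Placeholder URL and credentials; replace them with your own.
    request = build_request(
        "https://jenkins.example.org/job/demo/api/xml", "user", "apiToken"
    )
    print(request.get_header("Authorization"))
    # To actually send the request:
    # with urllib.request.urlopen(request) as response:
    #     xml_bytes = response.read()
```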
-
Python and XPath
- https://en.wikipedia.org/wiki/XPath#Implementations has a section on Python
- http://shop.oreilly.com/product/9780596001285.do — O’Reilly Media book: Python & XML — published in 2003, you cannot use the “4DOM” samples any more — I discussed it a little in this article (on this blog): http://wp.me/p4qjMw-1Ac
- https://docs.python.org/3/library/xml.etree.elementtree.html#xpath-support
- https://docs.python.org/2/library/xml.etree.elementtree.html#xpath-support
- the Python 2 and Python 3 samples are practically the same; you can use identical code (apart from the “print” and “format” stuff)
I had done some XPath using XMLStarlet in a shell script, and I quite like it.
Feeling “safe enough” with XPath, I managed to deal with the Python XPath pitfall(s), and I quite like my first XPath script in Python. And of course my Python script has a nicer command-line interface than my shell script.
My Python XPath pitfall in brief: you cannot (at least that is my impression) start an XPath expression at the document root. The expression is evaluated relative to the root element, so simply omit the leading “/” and the root element’s name, and you are fine.
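A small sketch of this pitfall with xml.etree.ElementTree, using a made-up document and made-up names (`SAMPLE`, `absolute_path_fails`): an absolute path on the root element is rejected, a path that repeats the root element’s name finds nothing, and the relative path works.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, for illustration only.
SAMPLE = "<project><builders><step>make all</step></builders></project>"

root = ET.fromstring(SAMPLE)  # root is the <project> element itself


def absolute_path_fails():
    """Return True if ElementTree rejects an absolute path on an element."""
    try:
        root.findall("/project/builders/step")
    except SyntaxError:
        return True
    return False


# Repeating the root's name looks for a <project> child of <project>.
with_root_name = root.findall("project/builders/step")

# Relative to the root element: this is the form that works.
relative_hits = [step.text for step in root.findall("builders/step")]

if __name__ == "__main__":
    print("absolute path rejected:", absolute_path_fails())
    print("with root name:", with_root_name)
    print("relative path:", relative_hits)
```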
-
XMLStarlet – a command-line utility to deal with XML documents
- https://en.wikipedia.org/wiki/XMLStarlet
- https://de.wikipedia.org/wiki/XMLStarlet
- http://xmlstar.sourceforge.net
- http://xmlstar.sourceforge.net/docs.php
- http://xmlstar.sourceforge.net/doc/UG/xmlstarlet-ug.html – User’s Guide
- http://xmlstar.sourceforge.net/doc/UG/xmlstarlet-ug.html#idm47077139502176 – the User’s Guide section on “Other XmlStarlet Resources” (with a few broken links)
- http://xmlstar.sourceforge.net/doc/xmlstarlet.txt – yet another document called “User’s Guide” but with more rather educative examples
- http://www.ibm.com/developerworks/library/x-starlet
- http://www.heise.de/ct/inhalt/15/14/172 — behind a paywall; my PDF copy lives on my archive at: Computers/Data_Formats/Markup_Languages/XML/Addressing_and_Querying/XPath/
- http://www.freesoftwaremagazine.com/articles/xml_starlet
CAVEAT: “xmlstarlet sel --template --value-of XPATH” prints each value on its own line, but the last entry comes without a trailing newline character. If you pipe xmlstarlet’s output into “while read e; do ...; done”, the last entry won’t get processed by the loop. xmlstarlet has an option for this: “--nl” (it means: “finish each printed match with a new line”). So if you write your command line like this: “xmlstarlet sel --template --value-of XPATH --nl”, everything will be fine.
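The effect of the missing trailing newline can be reproduced without xmlstarlet. The following sketch drives a POSIX “sh” from Python and uses printf as a stand-in for xmlstarlet’s output; the helper name `run_loop` and the sample values are made up for this example.

```python
import subprocess

# The loop from the caveat; brackets make each processed entry visible.
LOOP = 'while read e; do echo "[$e]"; done'


def run_loop(printf_format):
    """Pipe printf output into the while-read loop; return what the loop printed."""
    script = "printf '%s' | %s" % (printf_format, LOOP)
    completed = subprocess.run(
        ["sh", "-c", script], capture_output=True, text=True, check=True
    )
    return completed.stdout


if __name__ == "__main__":
    # Without a trailing newline (like plain --value-of): "two" gets lost.
    print(repr(run_loop(r"one\ntwo")))
    # With a trailing newline (like --value-of ... --nl): both entries arrive.
    print(repr(run_loop(r"one\ntwo\n")))
```

“read” does store the last, unterminated entry in the variable, but it returns a non-zero status at end of input, so the loop body is not run for it.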
It took me a while to find out that the last entry does not get processed, and then how to fix this.
-
the GNU packages that I need most seriously on my “finkified” Macs
- https://wiki.jochen.hayek.name/w?title=The_GNU_packages_that_I_need_most_seriously_on_my_finkified_Macs – this is where this text should actually live and get maintained
CAVEAT!!! Read carefully beforehand:
I.e. install
- Xcode Tools/Developer Tools,
- Xcode Command Line Tools,
- and X11 !
This message tells you that the XQuartz X11 distribution is not installed (installing it is not something that fink can do for you):
Can’t resolve dependency “x11-dev” for package “poppler46-shlibs-0.26.2-3” (no matching packages/versions found)
$ fink install coreutils-default grep xmlstarlet pwgen saxon wget
$ fink install image-exiftool-pm # extracts date+time from JPEG files
$ fink install xpdf # which includes pdfinfo; extracts date+time from PDF files
$ fink install ghostscript # needed by emacs doc-view (?!?)
Note 2019-10-21: I failed to install wget, because it depends on gpgme11, which did not compile. (Solved before 2020-05-27.)
-
updating or rather re-installing “fink” on OS X “El Capitan”
Updating did not work at all. I needed a fresh installation from finkproject.org.
And then:
$ fink install coreutils xmlstarlet pwgen
$ fink install xpdf
…
Nota bene: you should always upgrade fink before upgrading OS X.