listing a Jenkins CI folder recursively

https://github.com/JochenHayek/misc – please find my script here! First I wrote this as a shell script … calling XMLStarlet or Saxon-HE (for XPath) and curl (for retrieving the Jenkins details as XML files). $ …/jenkins_find_jobs.sh https://integration.wikimedia.org/ci folder … freeStyleProject … matrixProject … The listing is quite helpful for documentation purposes. I was really proud of my little achievement…
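The recursive walk can be sketched in Python. This is not the shell script from the repository above, just a minimal illustration of the idea, and it rests on assumptions: that every Jenkins item answers at "<item URL>/api/xml", that the root element of the answer names the item type (folder, freeStyleProject, matrixProject, …), and that contained items show up as "job" children carrying a "url" element. The names fetch_xml and list_jobs are made up for this sketch.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen


def fetch_xml(url):
    """Retrieve the remote API XML for one Jenkins item (assumed at <url>/api/xml)."""
    with urlopen(url.rstrip("/") + "/api/xml") as response:
        return response.read()


def list_jobs(url, fetch=fetch_xml, depth=0):
    """Yield (depth, item type, url) for the item at url and everything below it."""
    root = ET.fromstring(fetch(url))
    # Assumption: the root element name is the item type.
    yield depth, root.tag, url
    # Assumption: contained items appear as <job><url>...</url></job> children.
    for job in root.findall("job"):
        job_url = job.findtext("url")
        if job_url:
            yield from list_jobs(job_url, fetch, depth + 1)
```

Calling list_jobs("https://integration.wikimedia.org/ci") and printing each tuple indented by its depth would give a listing similar in spirit to the one above. The fetch parameter is only there so that the walk can be tried against canned XML without touching a real server.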

an XQuery recipe: generating lots of documents in a single XQuery run …

http://www.gnu.org/software/qexo/XQ-Gen-XML.html – search there for “Generate all the HTML output files”! … by putting them in a single large XML object – then use a post-processor to split this into separate files. (Alright, this isn’t really a true “single XQuery run” approach. But it is close enough.) With Saxon-HE there is no way to write to separate text…
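Such a post-processor could look roughly like this in Python. The wrapper layout is purely an assumption for the sake of the example: a "documents" root whose "document" children each carry an "href" attribute with the target file name and exactly one child element, the document itself. The Qexo page and the original post may well use a different layout.

```python
import os
import xml.etree.ElementTree as ET


def split_documents(combined_xml, out_dir):
    """Write the single child of every <document href="..."> to its own file."""
    written = []
    root = ET.fromstring(combined_xml)
    for doc in root.findall("document"):
        # basename() keeps a stray href from escaping out_dir.
        name = os.path.basename(doc.get("href", ""))
        if not name or len(doc) != 1:
            continue  # skip wrappers without a file name or without a single root
        payload = doc[0]
        payload.tail = None  # drop whitespace that followed the payload in the wrapper
        ET.ElementTree(payload).write(os.path.join(out_dir, name), encoding="utf-8")
        written.append(name)
    return written
```

The function returns the names of the files it wrote, so a calling script can report or verify them.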

Xidel – yet another HTML/XML/JSON data extraction tool

Xidel is a command line tool to download html/xml pages and extract data from them using CSS 3 selectors, XPath 3 expressions or pattern-matching templates. http://www.videlibri.de/xidel.html https://en.wikipedia.org/wiki/XQuery – I am “watching” the changes on this article, and somebody just added Xidel; that’s how I came across Xidel. Cygwin’s and Fink’s repositories do not have Xidel, but Xidel’s…

once you get familiar with XPath and XMLStarlet, you use them for rather “ordinary tasks”

http://xmlstar.sourceforge.net https://www.cygwin.com – “Get that Linux feeling – on Windows” https://cygwin.com/cgi-bin2/package-grep.cgi?grep=xmlstarlet http://www.finkproject.org – “The Fink project wants to bring the full world of Unix Open Source software to Darwin and Mac OS X. …” http://pdb.finkproject.org/pdb/package.php/xmlstarlet Areas where you will want to use XPath expressions and xmlstarlet to extract details: HTML web pages –…

using XPath on non-XML HTML – how to tidy dirty HTML?

Scraping HTML with XPath is far nicer than low-level text processing. But how do you proceed if your XPath tool cannot deal with the HTML because it is not XHTML-conformant, i.e. not well-formed XML? My XPath tool is XMLStarlet: https://en.wikipedia.org/wiki/XMLStarlet It can also help reformat HTML so that XPath expressions can be applied.…

Q: how to get updates from web pages w/o RSS feed? A: XPath + cron or Jenkins job

Sadly, even now in 2016 a lot of web pages are not XHTML-conformant, but getting them fairly conformant is not that expensive: use “xmlstarlet fo --html --recover”; get the (cron or) Jenkins job to save the current page content in the job’s workspace; let the Jenkins job compare the current to the last…
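The compare-with-the-last-run step can be sketched in a few lines of Python. This assumes the page has already been fetched, tidied and reduced to the interesting text (for example with xmlstarlet, as described above); the function name page_changed and the snapshot file are made up for illustration and are not taken from the original job.

```python
import os


def page_changed(current, snapshot_path):
    """Return True if current differs from the saved snapshot, then save current."""
    previous = None
    if os.path.exists(snapshot_path):
        with open(snapshot_path, encoding="utf-8") as handle:
            previous = handle.read()
    # Always store the current content so the next run compares against it.
    with open(snapshot_path, "w", encoding="utf-8") as handle:
        handle.write(current)
    # The very first run has no baseline, so it does not count as a change.
    return previous is not None and previous != current
```

A cron or Jenkins job could call this once per run and send a notification (or fail the build) whenever it returns True.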

Jenkins: how to authenticate as a scripted client?

https://wiki.jenkins-ci.org/display/JENKINS/Authenticating+scripted+clients To make scripted clients (such as wget) invoke operations that require authorization (such as scheduling a build), use HTTP BASIC authentication to specify the user name and the API token. This is often more convenient than emulating the form-based authentication. The article quoted above mentions “buildToken”, but I don’t need it at all. The…
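For a scripted client written in Python rather than wget, HTTP BASIC authentication with user name and API token might be set up as follows. This is a sketch using only the standard library; the helper name authenticated_request is hypothetical, and the user name, token and URL are whatever your own Jenkins instance provides.

```python
import base64
from urllib.request import Request


def authenticated_request(url, user, api_token, data=None):
    """Build a request that carries user name and API token as HTTP BASIC credentials."""
    credentials = f"{user}:{api_token}".encode("utf-8")
    header = "Basic " + base64.b64encode(credentials).decode("ascii")
    # With data=None this is a GET; passing bytes (even b"") makes it a POST.
    return Request(url, data=data, headers={"Authorization": header})
```

The resulting object can be handed to urllib.request.urlopen. Sending the header up front, instead of waiting for a 401 challenge, matches what "wget --auth-no-challenge" does.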

Python and XPath

https://en.wikipedia.org/wiki/XPath#Implementations has a section on Python http://shop.oreilly.com/product/9780596001285.do – O’Reilly Media book: Python & XML – published in 2003, you cannot use the “4DOM” samples any more – I discussed it a little in this article (on this blog): http://wp.me/p4qjMw-1Ac https://docs.python.org/3/library/xml.etree.elementtree.html#xpath-support https://docs.python.org/2/library/xml.etree.elementtree.html#xpath-support The Python 2 and Python 3 samples are practically the same; you can use identical code (apart from the “print” and…
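To give an idea of what the ElementTree XPath support looks like, here is a small Python 3 example. The XML sample is invented for this illustration; note that ElementTree only implements a subset of XPath, so more demanding expressions need another library.

```python
import xml.etree.ElementTree as ET

# Made-up sample data, loosely modelled on a job listing.
SAMPLE = """<jobs>
  <job kind="freeStyleProject"><name>build</name></job>
  <job kind="matrixProject"><name>test-matrix</name></job>
  <job kind="freeStyleProject"><name>deploy</name></job>
</jobs>"""

root = ET.fromstring(SAMPLE)

# An attribute predicate selects only the jobs of one kind.
free_style = root.findall("./job[@kind='freeStyleProject']")
names = [job.findtext("name") for job in free_style]
print(names)  # ['build', 'deploy']
```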

XMLStarlet – a command-line utility to deal with XML documents

https://en.wikipedia.org/wiki/XMLStarlet https://de.wikipedia.org/wiki/XMLStarlet http://xmlstar.sourceforge.net http://xmlstar.sourceforge.net/docs.php http://xmlstar.sourceforge.net/doc/UG/xmlstarlet-ug.html – User’s Guide http://xmlstar.sourceforge.net/doc/UG/xmlstarlet-ug.html#idm47077139502176 – the User’s Guide section on “Other XmlStarlet Resources” (with a few broken links) http://xmlstar.sourceforge.net/doc/xmlstarlet.txt – yet another document called “User’s Guide”, but with more and rather instructive examples http://www.ibm.com/developerworks/library/x-starlet http://www.heise.de/ct/inhalt/15/14/172 – behind a paywall; my PDF copy lives in my archive at: Computers/Data_Formats/Markup_Languages/XML/Addressing_and_Querying/XPath/ http://www.freesoftwaremagazine.com/articles/xml_starlet CAVEAT: “xmlstarlet sel --template --value-of XPATH” lists all…

the GNU packages that I need most seriously on my “finkified” Macs

https://wiki.jochen.hayek.name/w?title=The_GNU_packages_that_I_need_most_seriously_on_my_finkified_Macs – this is where this text should actually live and be maintained. CAVEAT!!! Read carefully beforehand: http://finkproject.org/download/ http://finkproject.org/download/srcdist.php I.e. install Xcode Tools/Developer Tools, Xcode Command Line Tools, and X11! This message tells you that the XQuartz X11 distribution is not installed (it is not something that Fink can do for you):…