add another blog post, and infrastructure to support numbering lines on code snippets.

4 years ago · 784fbcba25
--- a/NOTES.md
+++ b/NOTES.md
@@ -55,3 +55,28 @@ via <var>.<markname>.

 Dependency tracking isn't great.  Modifying the `meta.yaml` file will not take effect until
 the files are touched/updated.  Make sure you do a `touch *` after updating the `meta.yaml` file.

 Plugins to look at:
 ImageSizerPlugin -- adds image sizes to html (could be expanded to include hires versions and/or generate normal res versions)
 ImageThumbnailsPlugin -- creates thumbnails
 JPEGOptimPlugin -- optimize jpegs
 OptiPNGPlugin -- optimize PNGs
 TaggerPlugin -- tagger plugin

 FlattenerPlugin -- this might be useful, but also useful to add in deploy paths
 PaginatorPlugin -- makes pages
 TextlinksPlugin -- converts text into links, that is [[/fname]] -> link and [[!!/media]] to a media_url
 UrlCleanerPlugin -- removes index.html and more
 GitDatesPlugin -- automatically get post dates from git

 Figure out how to limit column width to a max size.

 The order of the plugins DOES matter!!  I switched MetaPlugin to be after AutoExtendPlugin, and the
 yaml header appeared in the blog post.

 fenced_code.py is an updated version that splits the code block into multiple code lines.  This
 allows some css magic to add line numbers before them without the numbers getting copy/pasted.
 Installed via:
 ```
 cp fenced_code.py p/lib/python3.8/site-packages/markdown/extensions/
 ```
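
A minimal sketch of the per-line splitting idea, not the actual contents of
fenced_code.py: each source line gets its own `<code>` element inside one
`<pre>`, which is the shape the `pre.showlines code` CSS counter rules expect.
The function name `render_numbered_block` is hypothetical.

```python
import html


def render_numbered_block(source: str, css_class: str = "showlines") -> str:
    """Wrap every source line in its own <code> element inside one <pre>."""
    lines = source.rstrip("\n").split("\n")
    wrapped = [f"<code>{html.escape(line)}</code>" for line in lines]
    # Newlines stay between the <code> elements so <pre> still breaks lines.
    return f'<pre class="{css_class}">' + "\n".join(wrapped) + "</pre>"


print(render_numbered_block("echo a\nif x < 1; then\n"))
```

Because the numbers would come from CSS generated content rather than from
the markup, they are not part of the text that gets selected and copied.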
--- a/content/2015/05/xml-schema-validation-for-command-line.html
+++ b/content/2015/05/xml-schema-validation-for-command-line.html
@@ -0,0 +1,106 @@
 ---
 title: XML Schema Validation for the command line
 description: >
  XML Schema Validation for the command line
 created: !!timestamp '2015-05-07'
 time: 2:17 PM
 tags:
  - xml
  - schema
 ---

 It turns out that unless you use a full-fledged XML editor, validating
 your XML document against a schema is difficult.  Most tools require you
 to specify a single schema file.  If you have an XML document that
 contains more than one name space, this doesn't work too well, as often
 each name space is in a separate schema file.

 The XML document has xmlns attributes which use a URI as the identifier.
 These URIs only identify the name space; they are not URLs, so they cannot
 be used to fetch the schema.  In fact, differences in case make for
 different name spaces even in the "host" part of the URI, though that is
 not the case with URLs.  In order for validators to find the schema, the
 attribute [xsi:schemaLocation](http://www.w3.org/TR/xmlschema-1/#schema-loc)
 is used to map the name space URIs to the URLs of the schema.

 The `xsi:schemaLocation` mapping is very simple.  It is simply a white
 space delimited list of URI/URL pairs.  None of the command line tools
 that I tried use this attribute to make the schema validation simple.
 This includes [xmllint](http://xmlsoft.org/xmllint.html) which uses
 the libxml2 library.  I also tried to use the Java XML library
 Xerces, but was unable to get it to work.  Xerces did not provide a
 simple command line utility, and I couldn't figure out the correct java
 command line to invoke the validator class.
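
To make the URI/URL pairing concrete, here is a minimal Python sketch that
reads `xsi:schemaLocation` from the root element and splits it into pairs.
It is an illustration only and not part of the script in this post; the
function name `schema_locations` and the example.com URLs are made up.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def schema_locations(xml_text: str) -> List[Tuple[str, str]]:
    """Return (name space URI, schema URL) pairs from the root element."""
    root = ET.fromstring(xml_text)
    # ElementTree spells namespaced attribute names as "{namespace}local".
    tokens = root.get(f"{{{XSI_NS}}}schemaLocation", "").split()
    # Tokens alternate URI, URL, URI, URL, ...; an unpaired tail is dropped.
    return list(zip(tokens[0::2], tokens[1::2]))


doc = """<a:root xmlns:a="http://example.com/a"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://example.com/a http://example.com/a.xsd
                        http://example.com/b http://example.com/b.xsd"/>"""
for namespace, url in schema_locations(doc):
    print(namespace, "->", url)
```

`str.split()` with no argument splits on any run of white space, so spaces,
tabs and line breaks in the attribute value are all treated the same way.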

 My coworker, [Patrick](http://fivetwentysix.com/), found the blog entry,
 [Nokogiri XML schema validation with multiple schema files](http://avinmathew.com/nokogiri-xml-schema-validation-with-multiple-schema-files/),
 which talks about using `xs:import` to have a single schema file support
 multiple name spaces.  With this, we realized that we could finally get
 our XML document verified.

 As I know shell scripting well, I decided to write a script to automate
 creating a unified schema and validating a document.  The tools don't cache
 the schema documents, so the schemas must be fetched each time you want
 to validate the XML document.  We did attempt to write the schema files
 to disk, and reuse those, *but* there are issues in that some schemas
 reference other resources in them.  If the schema is not retrieved from
 the web, these internal resources are not retrieved either, causing errors
 when validating some XML documents.

 With a little bit of help from `xsltproc` to extract `xsi:schemaLocation`,
 it wasn't too hard to generate the schema document and provide it to
 `xmllint`.

 The code ([xmlval.sh](http://www.funkthat.com/~jmg/xmlval.sh)):

 ``` { .shell .showlines }
 #!/bin/sh -

 cat <<EOF |
 <?xml version="1.0"?>
 <xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 >

 <xsl:output method="text"/>
 <xsl:template match="/">
  <xsl:value-of select="/*/@xsi:schemaLocation"/>
 </xsl:template>

 </xsl:stylesheet>
 EOF
    xsltproc - "$1" |
    sed -e 's/  */\
 /g' |
    sed -e '/^$/d' |
    (echo '<?xml version="1.0" encoding="UTF-8"?>'
     echo '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:nospace="nospace" targetNamespace="http://www.example.com/nospace">'
     while :; do
        if ! read a; then
            break
        fi
        if ! read b; then
            break
        fi
        echo '<xs:import namespace="'"$a"'" schemaLocation="'"$b"'"/>'
    done
    echo '</xs:schema>') |
    xmllint --noout --schema - "$1"
 ```


 Though the script looks complicated, it is a straightforward pipeline:

 1. Lines 3-16 provide the XSLT document to `xsltproc` on line 17 to
   extract the `xsi:schemaLocation` attribute.
 1. Lines 18-20 replace runs of spaces with new lines and delete any
   blank lines.  It should probably also handle tabs, but none of the
   documents that I have had tabs.  After this, the odd
   lines contain the URI of the name space, and the even lines
   contain the URL for the schema.
 1. Lines 21 and 22 are the header for the new schema document.
 1. Lines 23-31 pull in these line pairs and create the necessary
   `xs:import` lines.
 1. Line 32 provides the closing element for the schema document.
 1. Line 33 gives the schema document to `xmllint` for validation.
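
For readers who find the shell quoting hard to follow, the schema-generating
steps above can be sketched in Python. This is a hedged illustration of the
same idea, not a replacement for the script: the function name
`wrapper_schema` is hypothetical, and it takes the URI/URL pairs as an
argument instead of reading them from `xsltproc`.

```python
from typing import Iterable, Tuple
from xml.sax.saxutils import quoteattr

XS_NS = "http://www.w3.org/2001/XMLSchema"


def wrapper_schema(pairs: Iterable[Tuple[str, str]]) -> str:
    """Build a schema that only imports one schema per (URI, URL) pair."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<xs:schema xmlns:xs="{XS_NS}" xmlns:nospace="nospace" '
        'targetNamespace="http://www.example.com/nospace">',
    ]
    for namespace, location in pairs:
        # quoteattr adds the surrounding quotes and escapes &, < and >.
        lines.append(
            f"<xs:import namespace={quoteattr(namespace)} "
            f"schemaLocation={quoteattr(location)}/>"
        )
    lines.append("</xs:schema>")
    return "\n".join(lines) + "\n"


print(wrapper_schema([("http://example.com/a", "http://example.com/a.xsd")]))
```

The returned text plays the role of the generated schema document that the
script feeds to `xmllint --noout --schema -` on its last line.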
--- a/content/2015/meta.yaml
+++ b/content/2015/meta.yaml
@@ -0,0 +1,3 @@
 extends: blog.j2
 default_block: post
 listable: true
--- a/content/media/css/custom.css
+++ b/content/media/css/custom.css
@@ -7,3 +7,19 @@ ul.tags  li {
    padding-right: .5em;
    display: inline-block;
 }

 pre.showlines {
  white-space: pre-wrap;
 }
 pre.showlines::before {
  counter-reset: listing;
 }
 pre.showlines code {
  counter-increment: listing;
 }
 pre.showlines code::before {
  content: counter(listing) ". ";
  display: inline-block;
  width: 3em;         /* now works */
  text-align: right;  /* now works */
 }
--- a/requirements.txt
+++ b/requirements.txt
@@ -1 +1,2 @@
 git+git://github.com/jmgurney/hyde.git@c8a8aafe081ce7bf9ea90b7e260914522e546210
 -e git+https://www.funkthat.com/gitea/jmg/hyde.git@aa02fc981079e243a91242eb9e90eaa272f26b59#egg=hyde
 markdown==3.3.4