Help:Word format

From semantic-mediawiki.org
Word format
Outputs the result in Microsoft Office Word file format (doc/docx).
Further Information
Provided by: Semantic Result Formats
Added:
Removed:
Requirements: MW 1.21+ with the "PhpOffice/PhpWord" library, or MW 1.22+ with the "PhpWord" library (handled by Composer)
Format name: word
Enabled by default: yes
Authors: Wolfgang Fahl
Categories: export
Group:

Note: As of 2015-07-22 this format is proposed as an enhancement request and is only available via a fork on GitHub. See issue 114 for details.

The result format word is used to format query results as a Word file.

If the required PHPWord library is installed, this format is automatically available (SRF ≥ 1.9.1).

Parameters

  • templatefile - the name of a Word (docx) file containing ${needle} placeholders. The file is looked up automatically in the File: namespace.

Example

Note: This result format is not available on this wiki. Thus an example output cannot be provided.

Query

We'd like to get a table of cities that have a population of more than 1 million people, sorted by population:

{{#ask:
 [[Category:City]] [[population::>1000000]]
 |?population
 |sort=population
}}
                Population
San Diego       1,307,402
Munich          1,353,186
Warsaw          1,720,398
Paris           2,234,105
Rome            2,761,477
Berlin          3,520,061
Sydney          4,575,532
London          7,825,200
New York City   8,175,000; 8,337,000
Tokyo           8,967,665

Same query for Word format

{{#ask:
 [[Category:City]] [[population::>1000000]]
 |?population
 |sort=population
 |searchlabel=Download result as Word file
 |templatefile=GermanCities.docx
 |format=word
}}

Preparing a Template File

The template file needs to contain ${needle} placeholders where the field results are to be inserted; for example, ${population} would hold the population result.
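
To see which placeholders a template actually contains, a small helper along the following lines may be useful. This is only a sketch: the function name list_needles is hypothetical and not part of SRF or PHPWord, and it assumes the XML of the main document part is supplied on standard input.

```shell
#!/bin/bash
# list_needles is a hypothetical helper, not part of SRF or PHPWord.
# It reads Word XML on standard input and prints each distinct
# ${needle} placeholder that is stored as one unbroken string.
list_needles() {
  grep -o '\${[^}<]*}' | sort -u
}

# Example with inline XML instead of a real template file:
printf '%s\n' '<w:t>${city}</w:t><w:t>${population}</w:t>' | list_needles
```

For a real template the XML could be piped in with, for example, unzip -p GermanCities.docx word/document.xml | list_needles. The names printed should correspond to the printouts of the query, such as population.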

Caveats

Unfortunately, when Microsoft Word saves a file, extra characters might get inserted into the placeholders. See this issue on stackoverflow.com.

To avoid this you might want to:

  • switch off correction mode (which might add red markup)
  • use cut & paste in a mode that does not carry over formatting

You might want to check that the ${…} needles were not spoiled in the resulting docx XML format. You can check this by unzipping the docx file and looking into the word/document.xml file.

A tool like xmlstarlet can help with this.

Here are a few lines of bash script as an example:

unzip -o GermanCities.docx
for keyword in population 
do
 xmlstarlet fo word/document.xml | grep -- "$keyword"
done

The result should look like:

  <w:t>${population}</w:t>
…
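
The manual check above can also be automated to some degree. The following sketch reports placeholders that are visible in the document text but are split across several XML elements in the file. It assumes that word/document.xml has already been extracted; spoiled_needles is a hypothetical helper name and not part of SRF or PHPWord.

```shell
#!/bin/bash
# spoiled_needles is a hypothetical helper, not part of SRF or PHPWord.
# Usage: spoiled_needles path/to/word/document.xml
spoiled_needles() {
  needle_xml="$1"
  # Strip all tags so that placeholders split over several runs are rejoined.
  tr -d '\r\n' < "$needle_xml" | sed -e 's/<[^>]*>//g' |
    grep -o '\${[^}]*}' | sort -u |
    while IFS= read -r needle
    do
      # A placeholder that is missing from the raw XML was split by Word.
      if ! grep -qF -- "$needle" "$needle_xml"
      then
        printf '%s\n' "$needle"
      fi
    done
}

# Demo with a hand-written fragment in which ${population} is split.
demo_xml="$(mktemp)"
printf '%s' '<w:r><w:t>${</w:t></w:r><w:r><w:t>population}</w:t></w:r><w:r><w:t>${city}</w:t></w:r>' > "$demo_xml"
spoiled_needles "$demo_xml"
rm -f "$demo_xml"
```

In the demo only ${population} is reported, because ${city} is stored as one unbroken string. Any placeholder reported this way would need to be retyped or pasted again in the template.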

Packaged as a script named "caveat", this looks like:

#!/bin/bash
#   Copyright (C) 2015 BITPlan GmbH
#   wf 2015-09-29
#   check that a word template is ok for being used with the
#   SMW word result format
#   see http://semantic-mediawiki.org/wiki/Help:Word_format
 
#
# show usage
#
usage() {
  echo "usage: $0 wordtemplatefile keywords"
  exit 1
}
 
# check command line parameters - there must be at least two
if [ $# -lt 2 ]
then
  usage
fi
 
file="$1"
shift
# all remaining arguments are keywords (space separated)
keywords="$*"
if [ ! -f "$file" ]
then
  echo "$file does not exist" 1>&2
  exit 1
else
  # extract the template into the current directory
  unzip -o "$file" > /dev/null
  for keyword in $keywords
  do
    xmlstarlet fo word/document.xml | grep -- "$keyword"
  done
fi

Installation

This section describes how to install the required PHPWord library with Composer, which is the recommended method for MW 1.22+. Either enter the following on your command line:

composer require phpoffice/phpword dev-master

or add the following as the last line of the "require" section in your "composer.json" file:

"phpoffice/phpword": "dev-master"

Note: Replace "dev-master" in this example with the version you want to install.

Patching TemplateProcessor.php for Image handling

If you'd like to insert images into your Word file, you might want to patch the TemplateProcessor.php file of PhpOffice/PhpWord as shown in the rcsdiff output below.

The SRF_Word format will automatically detect that the method searchImageId is available and will use it.

neso:PhpWord wf$ rcsdiff TemplateProcessor.php 
===================================================================
RCS file: RCS/TemplateProcessor.php,v
retrieving revision 1.1
diff -r1.1 TemplateProcessor.php
61a62,68
>     
>     /**
>      * Content of document rels (in XML format) of the temporary document.
>      *
>      * @var string
>      */
>     private $temporaryDocumentRels; 
101a109
>         $this->temporaryDocumentRels = $this->zipClass->getFromName('word/_rels/document.xml.rels');
508a517,583
>     // 
>     // Image handling
>     // see http://stackoverflow.com/questions/24018003/how-to-add-set-images-on-phpoffice-phpword-template
>     // 
>  
>     /**
>      * Set a new image
>      *
>      * @param string $search
>      * @param string $replace
>      */
>  
>     public function setImageValue($search, $replace){
>         // Sanity check
>         if (!file_exists($replace))
>         {
>             return;
>         }
>  
>         // Delete current image
>         $this->zipClass->deleteName('word/media/' . $search);
>  
>         // Add a new one
>         $this->zipClass->addFile($replace, 'word/media/' . $search);
>     }
>  
>     /**
>      * Search for the labeled image's rId
>      *
>      * @param string $search
>      */
>  
>     public function searchImageId($search){
>         if (substr($search, 0, 2) !== '${' && substr($search, -1) !== '}') {
>             $search = '${' . $search . '}';
>         }
>         $tagPos = strpos($this->tempDocumentMainPart, $search);
>         $rIdStart = strpos($this->tempDocumentMainPart, 'r:embed="',$tagPos)+9;    
>         $rId=strstr(substr($this->tempDocumentMainPart, $rIdStart),'"', true);
>         return $rId;
>     }
>  
>     /**
>      * Get img filename with it's rId
>      *
>      * @param string $rId
>      */
>  
>     public function getImgFileName($rId){
>         $tagPos = strpos($this->temporaryDocumentRels, $rId);
>         $fileNameStart = strpos($this->temporaryDocumentRels, 'Target="media/',$tagPos)+14;
>         $fileName=strstr(substr($this->temporaryDocumentRels, $fileNameStart),'"', true);
>         return $fileName;
>     }
>  
>     /**
>      * set the image with the given searchAlt alternate text
>      * @param searchAlt - the alternate text to search for
>      * @param replace - the image filename to replace the image with that is found
>      */
>     public function setImageValueAlt($searchAlt, $replace){
>     	$_rid=$this->searchImageId($searchAlt);
>     	$_imagefile=$this->getImgFileName($_rid);
>       $this->setImageValue($_imagefile,$replace);
>     }
>     
>


This page is a DRAFT!
The content of this page is incomplete and might contain errors.