Language Shifting in EPUB 3
I was recently listening to Episode 11 of The Digital Publishing Podcast with Simon Mellins and Simon Holt. They were speaking with Laura Brady about her work improving accessibility in publishing workflows and outputs, and mentioned in passing the 'language shift' feature of EPUB screen readers, which was new to me.
I read about it and wondered how I would handle it in an EPUB project in the SEED.html app.
SEED.html app workflow
To recap, the SEED.html app implements a text transform pipeline that is customized per EPUB project and evaluated on every change to a chapter's plain text source files. The generated xhtml is rendered in the app so the author can immediately see the combined effect of source, css, transform and Reading System script changes, without needing to package and push to a device.
Here's a screenshot in which you can partially see the transformText script (center), the markdown source editor (focused, below) and the rendered preview (right) of the chapter titled 'Language shift'.
In the project for exploring Language Shift I chose the markdown-it library and its plugin markdown-it-attrs (which provides a concise way to add class/id attributes to inline or block elements). This is my go-to setup for new projects at the moment.
Here's a summary of the data flow that runs on each change to one of the source files.
plain text (markdown, css, js)
|
transformText(markdown-it + markdown-it-attrs) -> html fragment
|
transformDom(custom dom manipulation) -> xhtml document
|
xhtml preview (combining css, js and xhtml) -> browser render
|
EPUB -> device, reading system testing
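As a rough sketch of how the first two stages chain together, the snippet below composes them as plain functions. Only the names transformText and transformDom come from this project; buildChapter and demoStages are hypothetical, and the stand-in stage bodies are placeholders rather than the markdown-it based code the app actually runs.

```javascript
// Hypothetical composition of the two per-project stages named above.
// Both stages are passed in, because their real bodies are project-specific.
function buildChapter(source, stages) {
  const fragment = stages.transformText(source); // plain text -> html fragment
  return stages.transformDom(fragment);          // html fragment -> xhtml document
}

// Stand-in stages for illustration only (assumptions, not the app's code).
const demoStages = {
  // Wraps the text in a paragraph instead of running a markdown parser.
  transformText: (text) => `<p>${text}</p>`,
  // Wraps the fragment in a minimal xhtml document shell.
  transformDom: (fragment) =>
    `<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">` +
    `<body>${fragment}</body></html>`,
};

const xhtml = buildChapter("Hello", demoStages);
```

Keeping the stages as swappable functions mirrors the idea that each EPUB project can customize its own pipeline.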
My typical but evolving workflow in the SEED.html app:
- Create sample markup - package EPUB, deploy - check that it renders the way I want,
- Review what's in the markup, for conciseness and author readability,
- Propose a minimal markdown-style representation of the plain text version.
The Language Shift Feature
The video shows what I had working for step 1: the Books app on an iPad, with VoiceOver reading an English text that contains some French dialogue, to illustrate the language shift as it is spoken.
Of note: the French-speaking voice is different from the English voice. I'm using the Karen (Premium) Australian option for VoiceOver. I don't know how the French voice is selected, but the 'Detect Languages' option in VoiceOver settings is enabled on this iPad.
Step 2: Review the markup
The xhtml chapter markup that enables this looks like this:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<body>
<h2>Language shift</h2>
<p><span xml:lang="fr" lang="fr">“Bien sûr”</span>, he replied.</p>
</body>
</html>
The recommended practice is to include both the namespaced xml:lang attribute and the regular lang attribute.
But that means there's redundant information being specified here: the 'fr' value appears twice and must be identical. I'd like to specify it only once in the plain text.
So all I want to see is a span around the "Bien sûr" text, specifying the two-letter language code 'fr'.
So, what's the minimal markdown formatting to make this happen?
Step 3: Minimal code proposal
Markdown uses underscore delimiters for emphasis. Semantically that seems reasonable and compact (though ultimately I'm going to replace the <em> with a <span>). Then the markdown-it-attrs plugin allows putting a classname onto the inline em element.
So in markdown my proposal is to write this:
_"Bien sûr"_{.lang-fr}, he replied.
(This is just one solution, using a particular markdown library. It's not pretty, but I think it is pretty easy to understand if you're already working in markdown and know that the output is xhtml for EPUB.)
After markdown transformation (refer to transformText() in the screenshot above) this is the generated html fragment:
<p><em class="lang-fr">“Bien sûr”</em>, he replied.</p>
But we're not done. The editor still runs some custom javascript, transformDom(), which produces a full xhtml document, before the actual chapter xhtml is saved. In that script the em element is replaced with a span that has the same innerHTML plus the xml:lang and lang attributes.
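To make that replacement step concrete, here is a minimal sketch. It is not the project's actual transformDom(): it works on the html string with a regular expression instead of the DOM, and the helper names langFromClass and emToLangSpan are hypothetical. It assumes the em carries only a class attribute, contains no nested em, and that any other classes on it can be dropped.

```javascript
// Hypothetical helper: pull a language code out of a class list,
// e.g. "lang-fr" -> "fr", "note lang-pt-BR" -> "pt-BR", "note" -> null.
function langFromClass(className) {
  const match =
    /(?:^|\s)lang-([A-Za-z]{2,3}(?:-[A-Za-z0-9]+)*)(?=\s|$)/.exec(className);
  return match ? match[1] : null;
}

// Simplified, string-based stand-in for the DOM manipulation step.
// Replaces <em class="lang-xx">...</em> with a span carrying both
// xml:lang and lang, so the code is written only once in the source.
function emToLangSpan(html) {
  return html.replace(
    /<em class="([^"]*)">([\s\S]*?)<\/em>/g,
    (whole, className, inner) => {
      const code = langFromClass(className);
      if (!code) return whole; // leave ordinary emphasis untouched
      return `<span xml:lang="${code}" lang="${code}">${inner}</span>`;
    }
  );
}

const fragment = '<p><em class="lang-fr">“Bien sûr”</em>, he replied.</p>';
const shifted = emToLangSpan(fragment);
```

A DOM-based version would do the same thing with element queries and attribute setters. Applied to the fragment above, the result is: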
<p><span xml:lang="fr" lang="fr">“Bien sûr”</span>, he replied.</p>
What's the point?
The larger story here is an observation that the SEED.html app architecture dramatically expands the developer pool for driving innovation in the EPUB space.
Users of the app don't need to wait for a software upgrade in order to support the moving target of best practice in accessibility. A large, existing pool of web designers and developers can contribute their expertise in regular old javascript, css and the dom to move the needle on ebook innovation.
This authoring simplicity, paired with Reading Systems that are built on the web browser stack, makes me very excited for the future of digital books and reading.