@algomaster99

algomaster99/Report.md Secret

Created Aug 24, 2020
GSoC 2020 report

Cicero Word Add-in

A Word plugin that lets users interact with Accord Project templates directly in MS Word.

Why?

< Add photo of the storyboard > Until now, working in the Accord Project ecosystem meant writing "code" even to draft clauses or contracts from smart contract templates, which restricted this functionality to users familiar with coding environments. We needed to make the process as non-technical as possible so that non-technical people could use it too. Drafting is mostly done by lawyers, who are usually not technically proficient at writing or executing code, nor are they expected to be.

Our solution relied on one critical assumption: most lawyers use MS Word to draft contracts. Taking this to be accurate, we decided to integrate the Accord Project ecosystem with MS Word, and the best way to do so was to create a Word plugin, more formally called a Word Add-in.

Ideas

After stating the assumptions, a lot of design thinking activities were conducted to prioritize the features of the application. The extensive deliberations boiled down to the following features: < photo of storyboard >

  1. User has a contract and a clause template already made by an associate
  2. User adds a template to contract
  3. User can now edit variables, export to markdown/CTA
  4. User can share with others, or upload to the platform

Thus, the final mockup was designed, keeping in mind the ideas mentioned above.

The deliberations in the design thinking workshop have been summarised in detail here.

Coding begins!

Once the community bonding period was over, and since we had a clear vision of the project, we got on with its implementation. The following tech stack was used to develop the project:

  1. Word Add-in APIs - methods to interact with the MS Word document. These APIs are specifically for MS Word.
  2. Common APIs - methods to interact with the MS Word ecosystem (for example, UI, dialogs, and client settings). These APIs are common across multiple types of Office applications.
  3. React - frontend library to render the add-in components.
  4. Webpack - build the project to deploy it on the add-in interface.

Although the tech stack may seem quite familiar, add-in development is still in its early stages. The community of developers involved is therefore quite small, which brought us a lot of challenges.

Thus, to encourage the open-source culture, this GSoC report will also serve as an informal case study. Hopefully, the problems and their solutions listed here can answer many of your questions about Word add-in development.

Add-in internals

Before diving deeper into the technicalities, let us first try to understand the structure of the add-in.

All add-ins are sideloaded inside a pane in MS Office products using a manifest.xml. These panes are essentially browsers, which implies that add-ins are similar to web applications. Interaction with this pane triggers the Word API, which in turn drives how the text appears in the MS Word document.
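
The flow described above (a click in the pane that triggers the Word API) can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the helper name `insertClauseText` and the button id `insert-clause` are hypothetical, while `Office.onReady`, `Word.run`, `insertParagraph` and `context.sync` are part of the Office JavaScript API.

```javascript
// Hypothetical helper: appends a paragraph to the end of the document body.
// `context` is the Word.RequestContext that Word.run passes to its callback.
async function insertClauseText(context, text) {
  // Commands are only queued here; nothing reaches Word until sync() runs.
  const paragraph = context.document.body.insertParagraph(text, "End");
  await context.sync();
  return paragraph;
}

// Wire the helper to a button, but only when running inside the add-in pane.
if (typeof Office !== "undefined" && typeof Word !== "undefined") {
  Office.onReady(() => {
    // "insert-clause" is an assumed element id in the pane's HTML.
    document.getElementById("insert-clause").onclick = () =>
      Word.run((context) =>
        insertClauseText(context, "Acceptance of Delivery.")
      );
  });
}
```

Passing the request context into the helper, rather than calling `Word.run` inside it, keeps the helper usable with a stand-in context outside Word.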

< photo to show a click >

Setting up the development environment

The first step in developing a software project is to configure a development environment. We followed this technical blog post to set up boilerplate code, and then we customized it according to our requirements.

However, we faced a lot of compatibility issues because one needs the right combination of Microsoft Windows version and Office version. Otherwise, the browser used for rendering the add-in is Internet Explorer, which doesn't support the latest JavaScript features.

Hence, we made the necessary changes to shift to Microsoft Edge (which uses V8), which supports all the features we might need. The setup instructions have been documented here.
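
To check which browser engine is rendering the pane, a small user-agent test can help. The sketch below is an illustration only, and the function name `isInternetExplorerWebview` is hypothetical; it relies on the fact that Internet Explorer user-agent strings contain `Trident/` (IE 11) or `MSIE ` (older versions).

```javascript
// Returns true when the user-agent string looks like Internet Explorer.
function isInternetExplorerWebview(userAgent) {
  return /Trident\/|MSIE /.test(userAgent);
}

// Warn during development if the pane fell back to Internet Explorer.
if (
  typeof navigator !== "undefined" &&
  isInternetExplorerWebview(navigator.userAgent)
) {
  console.warn("Add-in is running in Internet Explorer; modern JS may fail.");
}
```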

MS Word concepts

OOXML (Office Open XML)

Office Open XML is a zipped, XML-based file format for representing spreadsheets, charts, presentations and word processing documents. Source: http://officeopenxml.com/.

The docx format is an OOXML package that declares its contents to be a "word processing document".

To render the clause text in the document, we used an API called insertOoxml after converting the CiceroMark to OOXML. For example, the headings in the clauses or sample texts were processed like this:

CiceroMark
{
  "$class": "org.accordproject.commonmark.Heading",
  "level": "2",
  "nodes": [
    {
      "$class": "org.accordproject.commonmark.Text",
      "text": "Acceptance of Delivery."
    }
  ]
}

converted to

OOXML
<w:p>
  <w:pPr>
    <w:pStyle w:val="Heading2"/>
  </w:pPr>
  <w:r>
    <w:rPr>
      <w:sz w:val="40"/>
    </w:rPr>
    <w:t xml:space="preserve">Acceptance of Delivery.</w:t>
  </w:r>
</w:p>

and the following was rendered to the Word document: < photo of the example >
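
The conversion above can be sketched as a small function. This is a simplified illustration under stated assumptions, not the project's converter: `headingToOoxml`, `wrapInFlatOpc` and `insertHeading` are hypothetical names, and the default size of 40 is simply taken from the example. The package wrapper follows the flat OPC format that `insertOoxml` accepts.

```javascript
// Escape characters that are not allowed in XML text nodes.
function escapeXml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Convert a CiceroMark Heading node into a WordprocessingML paragraph.
// `sizeHalfPoints` defaults to the value used in the example (assumption).
function headingToOoxml(node, sizeHalfPoints = 40) {
  const text = node.nodes
    .filter((child) => child.$class === "org.accordproject.commonmark.Text")
    .map((child) => escapeXml(child.text))
    .join("");
  return (
    `<w:p><w:pPr><w:pStyle w:val="Heading${node.level}"/></w:pPr>` +
    `<w:r><w:rPr><w:sz w:val="${sizeHalfPoints}"/></w:rPr>` +
    `<w:t xml:space="preserve">${text}</w:t></w:r></w:p>`
  );
}

// Wrap body markup in a minimal flat OPC package for insertOoxml.
function wrapInFlatOpc(bodyXml) {
  return (
    '<pkg:package xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">' +
    '<pkg:part pkg:name="/_rels/.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml">' +
    '<pkg:xmlData><Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">' +
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>' +
    "</Relationships></pkg:xmlData></pkg:part>" +
    '<pkg:part pkg:name="/word/document.xml" pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml">' +
    '<pkg:xmlData><w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">' +
    `<w:body>${bodyXml}</w:body>` +
    "</w:document></pkg:xmlData></pkg:part></pkg:package>"
  );
}

// Insert the heading at the end of the document.
// `context` is the Word.RequestContext provided by Word.run.
async function insertHeading(context, node) {
  context.document.body.insertOoxml(wrapInFlatOpc(headingToOoxml(node)), "End");
  await context.sync();
}
```

Inside the add-in this would be called as `Word.run((context) => insertHeading(context, headingNode))`.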

Note: the "w:val" of w:sz isn't the font size in points. It is measured in half-points, so divide it by 2 to get the actual font size (for example, w:val="40" renders at 20 pt).

You might notice that font sizes and some other formatting settings in Word Office Open XML markup look like they're double the actual size. That's because paragraph and line spacing, as well some section formatting properties shown in the preceding markup, are specified in twips (one-twentieth of a point). Reference: https://docs.microsoft.com/en-us/office/dev/add-ins/word/create-better-add-ins-for-word-with-office-open-xml
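
As a quick worked example of these units: font sizes (w:sz) are given in half-points, while spacing values are given in twips. The helper names below are hypothetical and only illustrate the arithmetic.

```javascript
// w:sz values are in half-points: 2 half-points = 1 point.
function halfPointsToPoints(halfPoints) {
  return halfPoints / 2;
}

// Spacing values are in twips: 20 twips = 1 point.
function twipsToPoints(twips) {
  return twips / 20;
}

console.log(halfPointsToPoints(40)); // heading example: w:sz w:val="40"
console.log(twipsToPoints(240)); // a spacing value of 240 twips
```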

Content controls

Since smart contracts are not entirely plain text, we needed a way to distinguish the variables from the surrounding text. We used an MS Word feature called content control.

< photos showing the difference >

The content control is inserted whenever the code encounters org.accordproject.ciceromark.Variable in the CiceroMark. For example, the following variable entity,

{
  "$class": "org.accordproject.ciceromark.Variable",
  "value": "\"Attachment X\"",
  "name": "attachment",
  "elementType": "String"
}

is converted to:

<w:sdt>
    <w:sdtPr>
        <w:rPr>
            <w:color w:val="000000" />
            <w:sz w:val="24" />
            <w:highlight w:val="green" />
        </w:rPr>
        <w:alias w:val="Attachment1 | String" />
        <w:tag w:val="attachment" />
        <w:id w:val="-1316484226" />
        <w15:webExtensionLinked />
    </w:sdtPr>
    <w:sdtContent>
        <w:r>
            <w:rPr>
                <w:color w:val="000000" />
                <w:sz w:val="24" />
                <w:highlight w:val="green" />
            </w:rPr>
            <w:t>"Attachment X"</w:t>
        </w:r>
    </w:sdtContent>
</w:sdt>

  1. <w:alias w:val="Attachment1 | String" /> - stores the type of the variable and a unique identifier used to attach listeners.
  2. <w:tag w:val="attachment" /> - stores the name of the variable.
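
A rough sketch of how such a content control could be generated from a Variable node is shown here. It is an illustration under assumptions, not the project's implementation: `variableToSdt` is a hypothetical name, the alias format (capitalised name plus a counter) is inferred from the single example above, and w:id and w15:webExtensionLinked are left out for brevity.

```javascript
// Escape text for use inside an XML element.
function escapeXmlText(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Escape text for use inside a double-quoted XML attribute.
function escapeXmlAttr(text) {
  return escapeXmlText(text).replace(/"/g, "&quot;");
}

// Run properties shared by the control and its content (from the example).
const RUN_PROPERTIES =
  '<w:rPr><w:color w:val="000000"/><w:sz w:val="24"/>' +
  '<w:highlight w:val="green"/></w:rPr>';

// Build a content control (w:sdt) for a CiceroMark Variable node.
// `counter` distinguishes repeated variables with the same name (assumption).
function variableToSdt(variable, counter) {
  const title =
    variable.name.charAt(0).toUpperCase() + variable.name.slice(1) + counter;
  const alias = `${title} | ${variable.elementType}`;
  return (
    "<w:sdt><w:sdtPr>" +
    RUN_PROPERTIES +
    `<w:alias w:val="${escapeXmlAttr(alias)}"/>` +
    `<w:tag w:val="${escapeXmlAttr(variable.name)}"/>` +
    "</w:sdtPr><w:sdtContent><w:r>" +
    RUN_PROPERTIES +
    `<w:t>${escapeXmlText(variable.value)}</w:t>` +
    "</w:r></w:sdtContent></w:sdt>"
  );
}
```

Keeping the alias unique per control while the tag is shared is what lets listeners target one control and still find all variables with the same name.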

< a photo showing the content control >

Bindings

This feature of the API allowed us to establish a dynamic interaction between the variables. It provided a way to attach listeners to the content controls so that each variable's value could follow the values of other variables with the same name.

Note: name is defined by this xml tag - <w:tag w:val="attachment" />.
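
The idea can be sketched roughly as follows. This is not the project's code: `attachVariableListener` and `propagateValue` are hypothetical helpers, and the Office object is passed in as a parameter purely so the functions can be exercised outside Word. The calls themselves (`addFromNamedItemAsync`, `addHandlerAsync`, `getByTag`, `insertText`) are part of the Office JavaScript API, and a content control is addressed as a named item by its title (the alias).

```javascript
// Bind to the content control titled `title` and listen for changes.
// `office` is the global Office object inside an add-in.
function attachVariableListener(office, title, bindingId, onChange) {
  return new Promise((resolve, reject) => {
    office.context.document.bindings.addFromNamedItemAsync(
      title,
      office.BindingType.Text,
      { id: bindingId },
      (result) => {
        if (result.status === office.AsyncResultStatus.Failed) {
          reject(new Error(result.error.message));
          return;
        }
        const binding = result.value;
        binding.addHandlerAsync(
          office.EventType.BindingDataChanged,
          onChange,
          () => resolve(binding)
        );
      }
    );
  });
}

// Copy `newValue` into every content control tagged `tag`.
// `context` is the Word.RequestContext provided by Word.run.
async function propagateValue(context, tag, newValue) {
  const controls = context.document.contentControls.getByTag(tag);
  controls.load("text");
  await context.sync();
  let updated = 0;
  for (const control of controls.items) {
    // Skipping unchanged controls avoids re-triggering their listeners.
    if (control.text !== newValue) {
      control.insertText(newValue, "Replace");
      updated += 1;
    }
  }
  await context.sync();
  return updated;
}
```

In the add-in, the change handler would read the new text and then call `Word.run((context) => propagateValue(context, "attachment", newText))`.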

< a GIF of variable change >

The code snippet here defines how the listeners are attached. On close inspection of the code, one might object that this recursive function is prone to infinite loops. That is correct, and we are well aware of it.

The reason for such a precarious piece of code is the slow syncing between the add-in and MS Word. If we create a variable by inserting <w:sdt></w:sdt> (a structured document tag) and then immediately try to attach the listeners, the API function is not able to detect the newly inserted entity. But since we are sure that a variable has been inserted, there is negligible scope for this recursive loop to malfunction.

A possible solution was to use JavaScript's setTimeout function, but we weren't sure how long synchronization would take, as it varies from PC to PC.
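
One way to keep the retry behaviour while ruling out an infinite loop is to cap the number of attempts and wait briefly between them. The sketch below is a suggestion only, not what the project does: `retryWithDelay` is a hypothetical helper, and the default attempt count and delay are arbitrary values that would need tuning.

```javascript
// Run `task` until it succeeds or `maxAttempts` is reached.
// `task` is an async function that throws while the target is not ready.
async function retryWithDelay(task, { maxAttempts = 5, delayMs = 200 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Give Word time to register the newly inserted content control.
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Surface the last failure instead of looping forever.
  throw lastError;
}
```

Because the delay is applied per attempt, a slow machine simply uses more attempts, so no single timeout has to be guessed correctly.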

Status of the project

The project was started from scratch, and a total of 75 commits have been pushed to the project. It has reached a stable state wherein we can consider deploying it on the Microsoft marketplace. However, the scope for adding features in the app is very high. As more functionalities are added, we will likely make the Accord Project technology accessible to a broader community.

The future goals of the project have been delineated in the presentation, which is linked under Resources. Consider contributing to the project, as we are very contributor-friendly! :)

More than just an internship

I want to thank the whole Accord Project community for always helping me understand the entire ecosystem. Despite their many other commitments, community members were responsive, which helped me get my doubts cleared as soon as possible.

It was a great summer working on this project, and now I am quite familiar with add-in development. I will continue to contribute by writing code or by encouraging others to do so, reviewing their PRs and explaining the concepts involved in the project to them. GSoC is not the end.

Resources

Refer to the following links to make yourself more familiar with the project. They involve small demos of the project, which might give you a deeper insight.

  1. Link to the presentation of the project.
  2. Link to the video. The presentation begins at approximately 00:37:35.

Links to learn more about OOXML.

  1. Blog listing out things one can do with OOXML.
  2. Anatomy of OOXML.

Tutorials to understand the interaction between Microsoft Word Add-in JavaScript API and MS Word.

  1. Word JavaScript add-in APIs
  2. Hands-on tutorial