Create a gist now

Instantly share code, notes, and snippets.

What would you like to do?
VB Word-to-Markdown converter

from http://tips.naivist.net/2006/02/02/word_to_markdown_converter/ Word to Markdown converter

Some smart people who didn’t want to teach users how to code in HTML invented Textile markup language

Someone more lazy invented MarkDown syntax which is much easier to learn. MarkDown is really nice for small content management systems, blogging engines etc.

For instance, to mark text as bold, you write the “bold text” like “bold text”. Other rules can be found in the syntax page.

However, when you have a MS Word document with several pages of bolds, italics and lists, you don’t really want to re-code all the markup.

This is why I wrote a simple macro for converting the most trivial things automatically. This macro does:

Replace bold and italics
Replace headings (marked heading1-6)
Replace numbered and bulleted lists

It’s very buggy, I believe it hangs on larger documents, however I’m NOT stating it’s a stable release anyway! :-) Experimental use only, recode and reuse it as you like, post a comment if you’ve found a better solution.

Click here to download the code.

To use the code:

open WinWord,
press Alt+F11 to open the VBA editor,
right click the first project in the project browser
choose insert->module
paste the code from the file
close macro editor
go tools>macro>macros; run the macro named MarkDown

This entry was posted on Thursday, February 2nd, 2006 at 06:33 and is filed under Datorlietas.

'*** A simple MsWord->Markdown replacement macro by Kriss Rauhvargers, 2006.02.02.
'*** This tool does NOT implement all the markup specified in MarkDown definition by John Gruber, only
'*** the most simple things. These are:
'*** 1) Replaces all non-list paragraphs to ^p paragraph so MarkDown knows it is a stand-alone paragraph
'*** 2) Converts tables to text. In fact, tables get lost.
'*** 3) Adds a single indent to all indented paragraphs
'*** 4) Replaces all the text in italics to _text_
'*** 5) Replaces all the text in bold to **text**
'*** 6) Replaces Heading1-6 to #..#Heading (Heading numbering gets lost)
'*** 7) Replaces bulleted lists with ^p * listitem ^p* listitem2...
'*** 8) Replaces numbered lists with ^p 1. listitem ^p2. listitem2...
'*** Feel free to use and redistribute this code
Sub MarkDown()
Dim bReplace As Boolean
Dim i As Integer
Dim oPara As Paragraph
'remove formatting from paragraph sign so that we dont get **blablabla^p** but rather **blablabla**^p
Call RemoveBoldEnters
For i = Selection.Document.Tables.Count To 1 Step -1
Call Selection.Document.Tables(i).ConvertToText
Next
'simple text indent + extra paragraphs for non-numbered paragraphs
For i = Selection.Document.Paragraphs.Count To 1 Step -1
Set oPara = Selection.Document.Paragraphs(i)
If oPara.Range.ListFormat.ListType = wdListNoNumbering Then
If oPara.LeftIndent > 0 Then
oPara.Range.InsertBefore (">")
End If
oPara.Range.InsertBefore (vbCrLf)
End If
Next
'italic -> _italic_
Selection.HomeKey Unit:=wdStory
bReplace = ReplaceOneItalic 'first replacement
While bReplace 'other replacements
bReplace = ReplaceOneItalic
Wend
'bold-> **bold**
Selection.HomeKey Unit:=wdStory
bReplace = ReplaceOneBold 'first replacement
While bReplace
bReplace = ReplaceOneBold 'other replacements
Wend
'Heading -> ##heading
For i = 1 To 6 'heading1 to heading6
Selection.HomeKey Unit:=wdStory
bReplace = ReplaceH(i) 'first replacement
While bReplace
bReplace = ReplaceH(i) 'other replacements
Wend
Next
Call ReplaceLists
Selection.HomeKey Unit:=wdStory
End Sub
'***************************************************************
' Function to replace bold with _bold_, only the first occurance
' Returns true if any occurance found, false otherwise
' Originally recorded by WinWord macro recorder, probably contains
' quite a lot of useless code
'***************************************************************
Function ReplaceOneBold() As Boolean
Dim bReturn As Boolean
Selection.Find.ClearFormatting
With Selection.Find
.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Font.Bold = True
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
bReturn = False
While Selection.Find.Execute = True
bReturn = True
Selection.Text = "**" & Selection.Text & "**"
Selection.Font.Bold = False
Selection.Find.Execute
Wend
ReplaceOneBold = bReturn
End Function
'*******************************************************************
' Function to replace italic with _italic_, only the first occurance
' Returns true if any occurance found, false otherwise
' Originally recorded by WinWord macro recorder, probably contains
' quite a lot of useless code
'********************************************************************
Function ReplaceOneItalic() As Boolean
Dim bReturn As Boolean
Selection.Find.ClearFormatting
With Selection.Find
.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Font.Italic = True
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
bReturn = False
While Selection.Find.Execute = True
bReturn = True
Selection.Text = "_" & Selection.Text & "_"
Selection.Font.Italic = False
Selection.Find.Execute
Wend
ReplaceOneItalic = bReturn
End Function
'*********************************************************************
' Function to replace headingX with #heading, only the first occurance
' Returns true if any occurance found, false otherwise
' Originally recorded by WinWord macro recorder, probably contains
' quite a lot of useless code
'*********************************************************************
Function ReplaceH(ByVal ipNumber As Integer) As Boolean
Dim sReplacement As String
Select Case ipNumber
Case 1: sReplacement = "#"
Case 2: sReplacement = "##"
Case 3: sReplacement = "###"
Case 4: sReplacement = "####"
Case 5: sReplacement = "#####"
Case 6: sReplacement = "######"
End Select
Selection.Find.ClearFormatting
Selection.Find.Style = ActiveDocument.Styles("Heading " & ipNumber)
With Selection.Find
.Text = ""
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
bReturn = False
While Selection.Find.Execute = True
bReturn = True
Selection.Range.InsertBefore (vbCrLf & sReplacement & " ")
Selection.Style = ActiveDocument.Styles("Normal")
Selection.Find.Execute
Wend
ReplaceH = bReturn
End Function
'***************************************************************
' A fix-up for paragraph marks that ar are bold or italic
'***************************************************************
Sub RemoveBoldEnters()
Selection.HomeKey Unit:=wdStory
Selection.Find.ClearFormatting
Selection.Find.Font.Italic = True
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Font.Bold = False
Selection.Find.Replacement.Font.Italic = False
With Selection.Find
.Text = "^p"
.Replacement.Text = "^p"
.Forward = True
.Wrap = wdFindContinue
.Format = True
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.HomeKey Unit:=wdStory
Selection.Find.ClearFormatting
Selection.Find.Font.Bold = True
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Font.Bold = False
Selection.Find.Replacement.Font.Italic = False
With Selection.Find
.Text = "^p"
.Replacement.Text = "^p"
.Forward = True
.Wrap = wdFindContinue
.Format = True
End With
Selection.Find.Execute Replace:=wdReplaceAll
End Sub
'***************************************************************
' Function to replace bold with _bold_, only the first occurance
' Returns true if any occurance found, false otherwise
' Originally recorded by WinWord macro recorder, probably contains
' quite a lot of useless code
'***************************************************************
Sub ReplaceLists()
Dim i As Integer
Dim j As Integer
Dim Para As Paragraph
Selection.HomeKey Unit:=wdStory
'iterate through all the lists in the document
For i = Selection.Document.Lists.Count To 1 Step -1
'check each paragraph in the list
For j = Selection.Document.Lists(i).ListParagraphs.Count To 1 Step -1
Set Para = Selection.Document.Lists(i).ListParagraphs(j)
'if it's a bulleted list
If Para.Range.ListFormat.ListType = wdListBullet Then
Para.Range.InsertBefore (ListIndent(Para.Range.ListFormat.ListLevelNumber, "*"))
'if it's a numbered list
ElseIf Para.Range.ListFormat.ListType = wdListSimpleNumbering Or _
wdListMixedNumbering Or _
wdListListNumOnly Then
Para.Range.InsertBefore (Para.Range.ListFormat.ListValue & ". ")
End If
Next j
'inserts paragraph marks before and after, removes the list itself
Selection.Document.Lists(i).Range.InsertParagraphBefore
Selection.Document.Lists(i).Range.InsertParagraphAfter
Selection.Document.Lists(i).RemoveNumbers
Next i
End Sub
'***********************************************************
' Returns the MarkDown indent text
'***********************************************************
Function ListIndent(ByVal ipNumber As Integer, ByVal spChar As String) As String
Dim i As Integer
For i = 1 To ipNumber - 1
ListIndent = ListIndent & " "
Next
ListIndent = ListIndent & spChar & " "
End Function
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment