starbuck93/readme.md

## readme.md

      
The purpose of these sed commands is to strip the extra markup from my Google Takeout export of a "new" Google Site, so that I can eventually import the pages into Outline Wiki.
Each command first reads the whole file into the pattern space in a loop (`:a;N;$!ba`). The first command then matches everything from the beginning of the file (`^.*`) through the substring `</header>` and replaces it with nothing. Because `.*` is greedy, the match runs through the last `</header>` in the file. The second command does the same for everything from the first `<script` to the end of the file (`<script.*$`).
Once I import the files into Outline Wiki, I'll re-link any pictures and rebuild the sidebar structure from Google Sites. Removing all the extra HTML first makes this much faster.
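
To see what the two expressions do before pointing them at a real export, here is a minimal sketch that runs them against a throwaway file. The sample HTML, the `workdir` directory, and the `SED` variable are made up for illustration. The script uses `gsed` when it is installed (the name Homebrew gives GNU sed on macOS) and otherwise assumes the system `sed` is GNU sed.

```shell
# Use gsed where it exists; otherwise assume plain sed is GNU sed.
SED=$(command -v gsed || command -v sed)

# Hypothetical sample page standing in for one file of a Google Sites export.
workdir=$(mktemp -d)
cat > "$workdir/sample.html" <<'EOF'
<html><head><title>Page</title></head>
<body><header>
<nav>Site navigation</nav>
</header>
<p>Actual page content.</p>
<script>var tracking = 1;</script>
</body></html>
EOF

# Same two expressions as site-cleanup.sh, applied to the sample file only.
# 1) Drop everything from the start of the file through the last </header>.
"$SED" -i ':a;N;$!ba;s:^.*</header>::' "$workdir/sample.html"
# 2) Drop everything from the first <script to the end of the file.
"$SED" -i ':a;N;$!ba;s:<script.*$::' "$workdir/sample.html"

# Show what is left of the page.
cat "$workdir/sample.html"
```

With this sample, only the `<p>Actual page content.</p>` line should survive, surrounded by blank lines left over from the removed blocks.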

  
## site-cleanup.sh
gsed -i ':a;N;$!ba;s:^.*</header>::' *.html

gsed -i ':a;N;$!ba;s:<script.*$::' *.html
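
A possible variant, not part of the original script: GNU sed's `N` command prints the pattern space and exits when there is no next line, so a file consisting of a single line would pass through the loop above unchanged. The sketch below assumes GNU sed's `-z` option, which treats the whole file as one record and makes the loop unnecessary, and uses `-i.bak` to keep a backup of each original. The `demo` directory and the one-line sample file are hypothetical.

```shell
# Prefer gsed if installed; otherwise assume the system sed is GNU sed (-z is a GNU feature).
SED=$(command -v gsed || command -v sed)

# Hypothetical one-line export, the case where the N-based loop would quit early.
demo=$(mktemp -d)
printf '%s\n' '<header>menu</header><p>Body</p><script>x()</script>' > "$demo/one-line.html"

# -z reads the file as a single record, so no :a;N;$!ba loop is needed.
# -i.bak keeps the untouched original next to the edited file as one-line.html.bak.
"$SED" -z -i.bak -e 's:^.*</header>::' -e 's:<script.*$::' "$demo/one-line.html"

# Show the cleaned page.
cat "$demo/one-line.html"
```

The backup files can be deleted once the cleaned pages have been checked.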