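I'm writing a small tool that fully converts a Medium post to Markdown: given an article URL, it scrapes the content, downloads the images, and converts everything to Markdown.
It works by extracting the front-end JSON source of the Medium article. The source contains the data for every paragraph, so the tool walks through the paragraphs one by one and converts each to Markdown.
The JSON source describes a paragraph's inline styles in the following format: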
"Paragraph": {
"text": "code in text, and link in text, and ZhgChgLi, and bold, and I, only i",
"markups": [
{
"type": "CODE",
"start": 5,
"end": 7
},
{
"start": 18,
"end": 22,
"href": "http://zhgchg.li",
"type": "LINK"
},
{
"type": "STRONG",
"start": 50,
"end": 63
},
{
"type": "EM",
"start": 55,
"end": 69
}
]
}
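In other words, for the text "code in text, and link in text, and ZhgChgLi, and bold, and I, only i" (offsets are zero-based and the end offset is exclusive):
- Characters 5 to 7 are marked as Code (wrapped as `Text`)
- Characters 18 to 22 are marked as a link (wrapped as [Text](URL))
- Characters 50 to 63 are marked as Strong (wrapped as **Text**)
- Characters 55 to 69 are marked as Emphasis (wrapped as _Text_)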
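To double-check how the offsets map onto the sample text, here is a small Python sketch. It treats each markup as a half-open range [start, end), which is what the sample data suggests; `covered_text` is just an illustrative helper name, not something from the tool.

```python
text = "code in text, and link in text, and ZhgChgLi, and bold, and I, only i"
markups = [
    {"type": "CODE", "start": 5, "end": 7},
    {"type": "LINK", "start": 18, "end": 22, "href": "http://zhgchg.li"},
    {"type": "STRONG", "start": 50, "end": 63},
    {"type": "EM", "start": 55, "end": 69},
]

def covered_text(text, markup):
    """Return the substring a markup covers, assuming a half-open [start, end) range."""
    return text[markup["start"]:markup["end"]]

# Map each markup type to the exact characters it covers.
covered = {m["type"]: covered_text(text, m) for m in markups}
```

Note that the STRONG range ends with a space and the EM range starts with one, which matters once the markers are inserted.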
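Ranges 5-7 and 18-22 are easy to handle in this example because they do not overlap anything. Ranges 50-63 and 55-69, however, overlap, and Markdown cannot express them in the interleaved form below:
code `in` text, and [link](http://zhgchg.li) in text, and ZhgChgLi, and **bold,_ and I, **only i_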
The correct combination looks like this:
code `in` text, and [link](http://zhgchg.li) in text, and ZhgChgLi, and **bold,_ and I, _**_only i_
That is: 50-55 is STRONG, 55-63 is STRONG + EM, and 63-69 is EM.
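One way to arrive at that 50-55 / 55-63 / 63-69 breakdown is to cut the text at every markup boundary and record which types cover each piece. The following is only a rough Python sketch under the same half-open-range assumption; `split_segments` is a made-up name.

```python
def split_segments(text, markups):
    """Cut text at every markup boundary.

    Returns (start, end, active_types) tuples, where active_types lists the
    markup types that fully cover that piece.
    """
    cuts = {0, len(text)}
    for m in markups:
        cuts.update((m["start"], m["end"]))
    points = sorted(cuts)
    segments = []
    for lo, hi in zip(points, points[1:]):
        active = [m["type"] for m in markups if m["start"] <= lo and hi <= m["end"]]
        segments.append((lo, hi, active))
    return segments

text = "code in text, and link in text, and ZhgChgLi, and bold, and I, only i"
markups = [
    {"type": "CODE", "start": 5, "end": 7},
    {"type": "LINK", "start": 18, "end": 22, "href": "http://zhgchg.li"},
    {"type": "STRONG", "start": 50, "end": 63},
    {"type": "EM", "start": 55, "end": 69},
]
segments = split_segments(text, markups)
```

The segment list only says which styles are active where. Deciding which markers to close and reopen at each boundary is a separate step.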
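Also note that the opening and closing strings of a wrapper have to be tracked separately. Strong just happens to use ** for both; for a Link the opening is [ and the closing is ](URL).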
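A possible way to keep the head and tail apart is to look them up through two small functions. This is a Python sketch with hypothetical names (`opener`, `closer`, `wrap`). Treating the type "A" the same as "LINK" is my assumption, based only on the second sample carrying an href.

```python
SYMMETRIC = {"CODE": "`", "STRONG": "**", "EM": "_"}
LINK_TYPES = ("LINK", "A")  # assumption: both types describe a hyperlink

def opener(markup):
    """Head token for a markup dict."""
    if markup["type"] in LINK_TYPES:
        return "["
    return SYMMETRIC[markup["type"]]

def closer(markup):
    """Tail token; only links differ from their head token."""
    if markup["type"] in LINK_TYPES:
        return "](" + markup["href"] + ")"
    return SYMMETRIC[markup["type"]]

def wrap(markup, inner):
    """Surround already-rendered inner text with the markup's head and tail."""
    return opener(markup) + inner + closer(markup)

link = {"type": "LINK", "start": 18, "end": 22, "href": "http://zhgchg.li"}
strong = {"type": "STRONG", "start": 50, "end": 63}
```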
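Anyone who proposes a working algorithm will be credited in the README once the project is finished and open-sourced, listed as providing algorithm support.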
"Paragraph": {
"text": "iCloud Private Relay is an iCloud+ service that prevents networks and servers from monitoring a person’s activity across the internet. Discover how your app can participate in this transition to a more secure and private internet: We’ll show you how to prepare your apps, servers, and networks to work with iCloud Private Relay.",
"markups": [
{
"type": "A",
"start": 24,
"end": 201,
"href": "https://medium.com/"
},
{
"type": "STRONG",
"start": 48,
"end": 65
},
{
"type": "STRONG",
"start": 125,
"end": 147
},
{
"type": "EM",
"start": 27,
"end": 133
}
]
}
result:
iCloud Private Relay is [an _iCloud+ service that _**_prevents networks_**_ and servers from monitoring a person’s activity across the _**_internet_. Discover how** your app can participate in this transition to a more](https://medium.com) secure and private internet: We’ll show you how to prepare your apps, servers, and networks to work with iCloud Private Relay.
iCloud Private Relay is an _iCloud+ service that prevents networks and servers from monitoring a person’s activity across the _internet. Discover how your app can participate in this transition to a more secure and private internet: We’ll show you how to prepare your apps, servers, and networks to work with iCloud Private Relay.
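I don't quite follow the correct combination you describe. I'd imagine the correct combination works like this:
because of the overlap, EM gets split and applied separately to the text inside STRONG and to the text outside it.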
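A note first: I found that if the first character inside ** or _ is a space, the syntax also breaks, so the markers may need to be shifted when they are applied.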
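The shift could be done by pushing leading and trailing spaces outside the markers before wrapping. Here is a minimal Python sketch; `wrap_trimmed` is a hypothetical helper, it only handles plain spaces, and it leaves an all-space piece unstyled.

```python
def wrap_trimmed(segment, opener, closer):
    """Wrap segment in opener/closer, keeping edge spaces outside the markers."""
    core = segment.strip(" ")
    if not core:
        return segment  # nothing but spaces: emit as-is, without markers
    lead = segment[: len(segment) - len(segment.lstrip(" "))]
    trail = segment[len(segment.rstrip(" ")):]
    return lead + opener + core + closer + trail

shifted = wrap_trimmed(" and I, ", "_", "_")
```

Here `shifted` keeps one space on each side of the markers instead of inside them.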
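As for the algorithm, I wonder whether the text should be converted into some kind of decorator form at the point where styles are applied, so that styles are added layer by layer from the outside in and everything is output together at the end.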
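To make the layer-by-layer idea concrete, here is one possible reading of it as a recursive Python sketch, not a finished design. It picks the earliest-starting markup as the outer layer, clips every overlapping markup into an inside part and an after part, and recurses. All names are hypothetical, offsets are assumed to be half-open, and anything carrying an href is treated as a link.

```python
SYMMETRIC = {"CODE": "`", "STRONG": "**", "EM": "_"}

def tokens(markup):
    """Return (head, tail) for a markup; anything with an href is treated as a link."""
    if "href" in markup:
        return "[", "](" + markup["href"] + ")"
    token = SYMMETRIC[markup["type"]]
    return token, token

def render(text, markups, lo=0, hi=None):
    """Render text[lo:hi]; every markup is assumed to lie within [lo, hi)."""
    hi = len(text) if hi is None else hi
    if not markups:
        return text[lo:hi]
    # Outer layer: earliest start, longest span on ties.
    outer = min(markups, key=lambda m: (m["start"], -m["end"]))
    inside, after = [], []
    for m in markups:
        if m is outer:
            continue
        if m["start"] < outer["end"]:  # part that falls inside the outer layer
            inside.append({**m, "end": min(m["end"], outer["end"])})
        if m["end"] > outer["end"]:    # part that sticks out past the outer layer
            after.append({**m, "start": max(m["start"], outer["end"])})
    head, tail = tokens(outer)
    return (text[lo:outer["start"]] + head
            + render(text, inside, outer["start"], outer["end"]) + tail
            + render(text, after, outer["end"], hi))

text = "code in text, and link in text, and ZhgChgLi, and bold, and I, only i"
markups = [
    {"type": "CODE", "start": 5, "end": 7},
    {"type": "LINK", "start": 18, "end": 22, "href": "http://zhgchg.li"},
    {"type": "STRONG", "start": 50, "end": 63},
    {"type": "EM", "start": 55, "end": 69},
]
markdown = render(text, markups)
```

On the first sample this ends with `**bold,_ and I, _**_only i_`, because EM is split at offset 63 where STRONG closes. It does not yet move spaces out of the markers, and a link that crosses another markup's end would be emitted as two links.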