rossmounce/gist:9f514d330ac2092200c7

## gistfile1.txt
I know I'm doing all types of wrong here:

Source HTML file here: http://mdpi.com/1420-3049/19/4/5150/htm

I want the text for the dc.source:

Molecules 2014, Vol. 19, Pages 5150-5162

I'm using Beautiful Soup, so it's probably best to do it with that, but it should also be doable with a regex. I can do this in bash no problem!

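import re

originalurl = 'http://mdpi.com/1420-3049/19/4/5150/htm'  # the source URL given above
hand = open('1420-3049.19.4.5150.htm')
for ling in hand:
    ling = ling.rstrip()
    if re.search('name="dc.source"', ling):
        # meant to peel the tag off; strip() only trims these characters at the ends
        bibinfo = ling.strip('<').strip('>')
        print(bibinfo + " " + originalurl)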

output:

<meta name="dc.source" content="Molecules 2014, Vol. 19, Pages 5150-5162" http://mdpi.com/1420-3049/19/4/5150/htm

#NotWhatIWanted / nor expected
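
A possible fix, sketched under a few assumptions: `str.strip()` only removes the listed characters from the two ends of a string, so the loop above hands back nearly the whole tag. Capturing the `content` attribute with a regex group gets just the citation text. The `SAMPLE` line, the `DC_SOURCE` pattern, and the `extract_dc_source` helper are illustrative names, not part of the original script, and the pattern assumes `name=` comes before `content=` with double quotes, as in the output shown above.

```python
import re

# Sample line based on the output shown above, with the tag closed.
# Assumption: the real file may lay the tag out differently.
SAMPLE = '<meta name="dc.source" content="Molecules 2014, Vol. 19, Pages 5150-5162" />'

# Hypothetical pattern: expects name= before content= and double-quoted values.
DC_SOURCE = re.compile(r'<meta\s+name="dc\.source"\s+content="([^"]*)"')


def extract_dc_source(html):
    """Return the content of the first dc.source meta tag, or None if absent."""
    match = DC_SOURCE.search(html)
    return match.group(1) if match else None


originalurl = 'http://mdpi.com/1420-3049/19/4/5150/htm'
bibinfo = extract_dc_source(SAMPLE)
print(bibinfo + " " + originalurl)
# prints: Molecules 2014, Vol. 19, Pages 5150-5162 http://mdpi.com/1420-3049/19/4/5150/htm
```

To run it on the saved page, pass `open('1420-3049.19.4.5150.htm').read()` instead of `SAMPLE`. With Beautiful Soup, something like `soup.find('meta', attrs={'name': 'dc.source'})['content']` should give the same string and would not depend on attribute order or quoting.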