The LOG entry in this XML file gives me a 'Wide character in subroutine entry' error when I try to print it. I'm using XML::Node to parse the file. I get the same error message when I call DBI's quote function on the parsed text. I have no idea how to deal with this.
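For context, this message usually means a function that expects bytes was handed a character string containing a code point above 255. The sketch below is not the DBI code path from this gist; it reproduces the same message with the core Digest::MD5 module, on the assumption that the parsed LOG text holds U+2019 (the curly apostrophe in "project’s"). The sample string is made up for illustration.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Encode qw(encode);

# A decoded (character) string containing U+2019 RIGHT SINGLE QUOTATION MARK,
# the same character that appears in "project’s" in the LOG text.
my $text = "The Trio project\x{2019}s goal";

# md5_hex() works on bytes; a code point above 255 makes it croak.
my $ok    = eval { md5_hex($text); 1 };
my $error = $ok ? '' : $@;
print "error: $error";

# Encoding the characters to UTF-8 bytes first avoids the error.
my $digest = md5_hex(encode('UTF-8', $text));
print "digest: $digest\n";
```

The same pattern applies to any byte-oriented routine: either encode explicitly before the call, or configure the layer (file handle, database driver) to do the encoding itself.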
<?xml version='1.0' encoding='UTF-8'?>
<UPDATES Version="1.4.0.0">
<UPDATE>
<DATE Year="2020" Month="7" Day="21"/>
<TIME Timezone="UTC" Hour="17" Minute="15" Second="7"/>
<OS Repo="ports" Id="FreeBSD" Branch="master"/>
<LOG>- Add devel/py-trio: Friendly Python library for async concurrency and I/O
The Trio project’s goal is to produce a production-quality,
permissively licensed, async/await-native I/O library for Python.
Like all async libraries, its main purpose is to help you write
programs that do multiple things at the same time with parallelized
I/O. A web spider that wants to fetch lots of pages in parallel, a
web server that needs to juggle lots of downloads and websocket
connections at the same time, a process supervisor monitoring
multiple subprocesses… that sort of thing. Compared to other
libraries, Trio attempts to distinguish itself with an obsessive
focus on usability and correctness. Concurrency is complicated; we
try to make it easy to get things right.
WWW: https://pypi.org/project/trio/</LOG>
<PEOPLE>
<UPDATER Handle="amdmi3 &lt;amdmi3@FreeBSD.org&gt;"/>
</PEOPLE>
<COMMIT Hash="24204ab75f8f780d312b56745536862be7120adc" HashShort="24204a" Subject="- Add devel/py-trio: Friendly Python library for async concurrency and I/O" EncodingLoses="false" Repository="ports"/>
<FILES>
<FILE Action="Modify" Path="devel/Makefile"/>
<FILE Action="Add" Path="devel/py-trio/Makefile"/>
<FILE Action="Add" Path="devel/py-trio/distinfo"/>
<FILE Action="Add" Path="devel/py-trio/pkg-descr"/>
</FILES>
</UPDATE>
</UPDATES>
re https://perldoc.perl.org/Encode.html#UTF-8-vs.-utf8-vs.-UTF8
print encode('utf-8', $this->{description});
- Add devel/py-trio: Friendly Python library for async concurrency and I/O
The Trio project’s goal is to produce a production-quality,
permissively licensed, async/await-native I/O library for Python.
Like all async libraries, its main purpose is to help you write
programs that do multiple things at the same time with parallelized
I/O. A web spider that wants to fetch lots of pages in parallel, a
web server that needs to juggle lots of downloads and websocket
connections at the same time, a process supervisor monitoring
multiple subprocesses… that sort of thing. Compared to other
libraries, Trio attempts to distinguish itself with an obsessive
focus on usability and correctness. Concurrency is complicated; we
try to make it easy to get things right.
WWW: https://pypi.org/project/trio/
print encode('utf8', $this->{description});
- Add devel/py-trio: Friendly Python library for async concurrency and I/O
The Trio project’s goal is to produce a production-quality,
permissively licensed, async/await-native I/O library for Python.
Like all async libraries, its main purpose is to help you write
programs that do multiple things at the same time with parallelized
I/O. A web spider that wants to fetch lots of pages in parallel, a
web server that needs to juggle lots of downloads and websocket
connections at the same time, a process supervisor monitoring
multiple subprocesses… that sort of thing. Compared to other
libraries, Trio attempts to distinguish itself with an obsessive
focus on usability and correctness. Concurrency is complicated; we
try to make it easy to get things right.
WWW: https://pypi.org/project/trio/
print encode('UTF8', $this->{description});
- Add devel/py-trio: Friendly Python library for async concurrency and I/O
The Trio project’s goal is to produce a production-quality,
permissively licensed, async/await-native I/O library for Python.
Like all async libraries, its main purpose is to help you write
programs that do multiple things at the same time with parallelized
I/O. A web spider that wants to fetch lots of pages in parallel, a
web server that needs to juggle lots of downloads and websocket
connections at the same time, a process supervisor monitoring
multiple subprocesses… that sort of thing. Compared to other
libraries, Trio attempts to distinguish itself with an obsessive
focus on usability and correctness. Concurrency is complicated; we
try to make it easy to get things right.
WWW: https://pypi.org/project/trio/
print encode('iso-8859-1', $this->{description});
- Add devel/py-trio: Friendly Python library for async concurrency and I/O
The Trio project?s goal is to produce a production-quality,
permissively licensed, async/await-native I/O library for Python.
Like all async libraries, its main purpose is to help you write
programs that do multiple things at the same time with parallelized
I/O. A web spider that wants to fetch lots of pages in parallel, a
web server that needs to juggle lots of downloads and websocket
connections at the same time, a process supervisor monitoring
multiple subprocesses? that sort of thing. Compared to other
libraries, Trio attempts to distinguish itself with an obsessive
focus on usability and correctness. Concurrency is complicated; we
try to make it easy to get things right.
WWW: https://pypi.org/project/trio/
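The four runs above differ only in the last one: the UTF-8 variants keep ’ and …, while iso-8859-1 cannot represent them and substitutes "?". A minimal sketch of that difference, using a short made-up string instead of the full LOG text:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Two non-Latin-1 characters: U+2019 (curly apostrophe) and U+2026 (ellipsis).
my $text = "project\x{2019}s subprocesses\x{2026}";

# With the default CHECK argument, encode() replaces characters the target
# encoding cannot represent with that encoding's substitution character.
my $utf8   = encode('UTF-8',      $text);
my $latin1 = encode('iso-8859-1', $text);

printf "characters:    %d\n", length $text;
printf "UTF-8 bytes:   %d\n", length $utf8;
printf "Latin-1 bytes: %d (%s)\n", length($latin1), $latin1;
```

Each of the two characters takes three bytes in UTF-8, which is why the UTF-8 result is longer than the character count, and the Latin-1 result loses information.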
my @all_encodings = Encode->encodings(":all");
foreach (@all_encodings) {
print "$_\n";
}
AdobeStandardEncoding
AdobeSymbol
AdobeZdingbat
ascii
ascii-ctrl
big5-eten
big5-hkscs
cp1006
cp1026
cp1047
cp1250
cp1251
cp1252
cp1253
cp1254
cp1255
cp1256
cp1257
cp1258
cp37
cp424
cp437
cp500
cp737
cp775
cp850
cp852
cp855
cp856
cp857
cp858
cp860
cp861
cp862
cp863
cp864
cp865
cp866
cp869
cp874
cp875
cp932
cp936
cp949
cp950
dingbats
euc-cn
euc-jp
euc-kr
gb12345-raw
gb2312-raw
gsm0338
hp-roman8
hz
iso-2022-jp
iso-2022-jp-1
iso-2022-kr
iso-8859-1
iso-8859-10
iso-8859-11
iso-8859-13
iso-8859-14
iso-8859-15
iso-8859-16
iso-8859-2
iso-8859-3
iso-8859-4
iso-8859-5
iso-8859-6
iso-8859-7
iso-8859-8
iso-8859-9
iso-ir-165
jis0201-raw
jis0208-raw
jis0212-raw
johab
koi8-f
koi8-r
koi8-u
ksc5601-raw
MacArabic
MacCentralEurRoman
MacChineseSimp
MacChineseTrad
MacCroatian
MacCyrillic
MacDingbats
MacFarsi
MacGreek
MacHebrew
MacIcelandic
MacJapanese
MacKorean
MacRoman
MacRomanian
MacRumanian
MacSami
MacSymbol
MacThai
MacTurkish
MacUkrainian
MIME-B
MIME-Header
MIME-Header-ISO_2022_JP
MIME-Q
nextstep
null
posix-bc
shiftjis
symbol
UCS-2BE
UCS-2LE
UTF-16
UTF-16BE
UTF-16LE
UTF-32
UTF-32BE
UTF-32LE
UTF-7
utf-8-strict
utf8
viscii
sql is insert into commit_log (id, message_id, message_date, message_subject, date_added, commit_date,
committer, description, system_id, svn_revision, repo_id, encoding_losses) values (
809517,
'db28bdd0b43bf9aef2d664afa8e6fb76f9864f74',
'2020/07/24 00:12:51 UTC',
'Give maintainership to a user of these ports',
'now()',
'2020/07/24 00:12:51 UTC',
'swills',
'Give maintainership to a user of these ports',
1,
'db28bdd0b43bf9aef2d664afa8e6fb76f9864f74',
(SELECT id FROM repo WHERE name = 'ports' and repository = 'git'),
0::boolean)
$sql = "insert into commit_log (id, message_id, message_date, message_subject, date_added, commit_date,
committer, description, system_id, svn_revision, repo_id, encoding_losses) values (
?,
?,
?,
?,
now(),
?,
?,
?,
?,
?,
(SELECT id FROM repo WHERE name = ? and repository = ?),
?::boolean)";
print "sql is $sql\n";
# $sth = $this->{dbh}->prepare($sql);
# $dbh->{pg_enable_utf8} = 1;
# Note: do() returns the number of affected rows (undef on error), not a statement handle.
$sth = $dbh->do($sql, undef, $this->{id},
$this->{message_id},
$this->{message_date},
$this->{message_subject},
$this->{commit_date},
$this->{committer},
encode('UTF-8', $this->{description}),
$this->{system_id},
$this->{revision},
$this->{repo}, $this->{repository},
$this->{encoding_losses});
@dlangille
Author

Is there anything odd in the LOG entry? How can I convert it to something XML::Node might be happy with?

BTW, I also have the code which generates this XML, so we can perhaps massage the data going in. Thanks.
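One way to check whether the LOG text contains anything "odd" is to list every character outside the ASCII range. The sketch below assumes the text has already been decoded into Perl characters (as an XML parser normally delivers it); the `$log` string is a shortened stand-in for the real entry.

```perl
use strict;
use warnings;
use charnames ();

# Shortened stand-in for the decoded LOG text.
my $log = "The Trio project\x{2019}s goal, multiple subprocesses\x{2026} that sort of thing.";

# Collect every non-ASCII character with its code point, name, and offset.
my @found;
while ($log =~ /([^\x00-\x7F])/g) {
    my $cp = ord $1;
    push @found, sprintf 'U+%04X %s at offset %d',
        $cp, charnames::viacode($cp), pos($log) - 1;
}

print "$_\n" for @found;
```

For this entry the only candidates are the curly apostrophe and the ellipsis, both valid in UTF-8 XML, so the input itself does not need massaging.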

@dlangille
Author

The fix: add ";client_encoding=UTF8" to the connection string.

[dan@devgit-ingress01:~/modules] $ svn di database.pm
Index: database.pm
===================================================================
--- database.pm	(revision 5364)
+++ database.pm	(working copy)
@@ -59,7 +59,7 @@
          $password = $FreshPorts::Config::password_listening;
      }
        
-	my $dbh_pg = DBI->connect('DBI:Pg:dbname=' . $FreshPorts::Config::dbname . ';host=' . $FreshPorts::Config::host . ';sslmode=' . $sslmode, $user, $password);
+	my $dbh_pg = DBI->connect('DBI:Pg:dbname=' . $FreshPorts::Config::dbname . ';host=' . $FreshPorts::Config::host . ';sslmode=' . $sslmode . ';client_encoding=UTF8', $user, $password);
 	if ($dbh_pg->{Active}) {
 		$dbh_pg->{AutoCommit} = 0;
 
[dan@devgit-ingress01:~/modules] $ 

Thank you @ilmari
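As a rough sketch of the same change outside database.pm, the connection string can be assembled from its parts with client_encoding appended. The `build_pg_dsn` helper, the database name, and the host below are hypothetical and only illustrate the resulting DSN format; nothing here connects to a server.

```perl
use strict;
use warnings;

# Hypothetical helper: assembles a DBD::Pg DSN and always appends
# client_encoding=UTF8, mirroring the change in the diff above.
sub build_pg_dsn {
    my (%opt) = @_;
    my @parts = map  { "$_=$opt{$_}" }
                grep { defined $opt{$_} } qw(dbname host sslmode);
    push @parts, 'client_encoding=UTF8';
    return 'DBI:Pg:' . join ';', @parts;
}

# Placeholder values, not the real FreshPorts configuration.
my $dsn = build_pg_dsn(
    dbname  => 'example_db',
    host    => 'db.example.org',
    sslmode => 'require',
);
print "$dsn\n";

# With DBI and DBD::Pg installed, the handle would be opened as in the diff:
#   my $dbh = DBI->connect($dsn, $user, $password);
# and the setting could be confirmed with:
#   my ($enc) = $dbh->selectrow_array('SHOW client_encoding');
```

With the client encoding set on the connection, the explicit encode('UTF-8', ...) around the description bind value may no longer be needed, though that depends on the DBD::Pg version and its pg_enable_utf8 setting.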
