Discussion:
Proper fix for Windows Unicode issues, and a couple more things.
Gregorio Litenstein
2018-06-28 17:51:16 UTC
Hey, I’m one of the developers of Performous (a cross-platform karaoke game);
we’ve been using portaudio for a while on Unix/Mac/Windows.

I’d initially written a really long post with an issue report, but then I
realized a) it was, after pulling most of my hair out, an issue on our side
after all, and b) I sent it to the wrong address, so I guess it never actually
made it onto the list. Anyway… I’m back with a couple more real issues and
at least one fix.

First, we were facing issues with the display of unicode text in Windows;
essentially the same issue reported here:
https://lists.columbia.edu/pipermail/portaudio/2016-December/000961.html

I took a look at the hostapi implementations and noticed that only some
changed their behavior depending on whether UNICODE was defined or not,
while others always used CP_UTF8.

Initially I thought my issue might have been related to that, so I did some
testing and figured out that in recent versions of Windows, defining
UNICODE (i.e. having everything use CP_UTF8) made the text uniformly
garbled unless I checked the new setting “Use Unicode UTF-8 worldwide” or
something like that. And by contrast, if that setting was off, CP_ACP
properly rendered the text.

It appears that what this setting actually does is set the active codepage
to UTF-8 (65001).

With this in mind, I created a patch modifying the behavior so that, instead
of checking for the definition of UNICODE or _UNICODE, portaudio checks (at
runtime) the current codepage using GetACP(): if it’s 65001 it uses
CP_UTF8; if not, it uses CP_ACP. I tested it on a laptop running Windows 10
Single Language Spanish and the text rendered appropriately both with the
setting turned on and with the setting turned off.
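
The runtime check described above could look roughly like the following minimal sketch. `pa_select_codepage` and `pa_current_codepage` are hypothetical names used only for illustration; they are not taken from PortAudio or from the patch itself. `GetACP()`, `CP_ACP` and `CP_UTF8` are the standard Win32 names, and the fallback `#define`s are there only so the selection logic also compiles on non-Windows hosts.

```c
#ifdef _WIN32
#include <windows.h>
#else
/* Same numeric values as the Windows SDK; assumed here for off-Windows builds. */
#define CP_ACP  0
#define CP_UTF8 65001
#endif

/* Hypothetical helper: given the process's active code page, choose the
   code page to use when converting wide device names to narrow strings. */
unsigned int pa_select_codepage(unsigned int activeCodePage)
{
    /* 65001 means Windows has been switched to UTF-8 as the ANSI code page. */
    return (activeCodePage == CP_UTF8) ? CP_UTF8 : CP_ACP;
}

#ifdef _WIN32
/* Hypothetical wrapper: query the active code page at runtime. */
unsigned int pa_current_codepage(void)
{
    return pa_select_codepage(GetACP());
}
#endif
```

The result would then be passed as the `CodePage` argument of `WideCharToMultiByte()` when converting device names, instead of a value fixed at compile time by `UNICODE`.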

I will attach the diff file here but TBH I have no idea whether it is
possible to attach files to these lists, so you can also get it from
Dropbox below. I’m not opening a ticket/submitting a PR because Assembla is
paid.

https://www.dropbox.com/s/le6zyr1zjv6mank/pa_patch.diff?dl=1

Now, on to the “couple more things”… it’s actually just one thing. The
GetVersion() function is (still) giving erroneous results. If compiling
portaudio using mingw-w64 (and thus not using WinRT), WASAPI gets the
Windows version using:

    dwVersion = fnGetVersion();

    // Get the Windows version
    dwMajorVersion = (DWORD)(LOBYTE(LOWORD(dwVersion)));
    dwMinorVersion = (DWORD)(HIBYTE(LOWORD(dwVersion)));

    switch (dwMajorVersion)
    {
    case 0:
    case 1:
    case 2:
    case 3:
    case 4:
    case 5:
        break; // skip lower
    case 6:
        switch (dwMinorVersion)
        {
        case 0: version = WINDOWS_VISTA_SERVER2008; break;
        case 1: version = WINDOWS_7_SERVER2008R2; break;
        case 2: version = WINDOWS_8_SERVER2012; break;
        case 3: version = WINDOWS_8_1_SERVER2012R2; break;
        default: version = WINDOWS_FUTURE; break;
        }
        break;
    case 10:
        switch (dwMinorVersion)
        {
        case 0: version = WINDOWS_10_SERVER2016; break;
        default: version = WINDOWS_FUTURE; break;
        }
        break;
    default:
        version = WINDOWS_FUTURE;
        break;
    }

However, in my tests Windows 10 (with the latest updates) returns the same
value as Windows 8 (i.e. dwMajorVersion=6, dwMinorVersion=2), and thus Win10
computers might end up using IAudioClient2.
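
This matches Microsoft's documented behavior for GetVersion(): an application whose manifest does not declare Windows 8.1 or Windows 10 compatibility is reported the Windows 8 value (6.2). A minimal sketch of the decoding, written as a pure function so the misclassification can be reproduced without a Windows machine. The enum and `classify_version` are stand-ins invented for this example (including `WINDOWS_UNKNOWN`, which the original snippet does not have); they only mirror the switch quoted above.

```c
#include <stdint.h>

/* Local stand-in for the version enum used in the quoted snippet. */
typedef enum {
    WINDOWS_UNKNOWN,            /* assumption: value left for major < 6 */
    WINDOWS_VISTA_SERVER2008,
    WINDOWS_7_SERVER2008R2,
    WINDOWS_8_SERVER2012,
    WINDOWS_8_1_SERVER2012R2,
    WINDOWS_10_SERVER2016,
    WINDOWS_FUTURE
} SketchWindowsVersion;

/* Decode a GetVersion()-style value: the low byte of the low word is the
   major version, the high byte of the low word is the minor version. */
SketchWindowsVersion classify_version(uint32_t dwVersion)
{
    unsigned major = dwVersion & 0xFFu;
    unsigned minor = (dwVersion >> 8) & 0xFFu;

    if (major < 6)
        return WINDOWS_UNKNOWN;
    if (major == 6) {
        switch (minor) {
        case 0:  return WINDOWS_VISTA_SERVER2008;
        case 1:  return WINDOWS_7_SERVER2008R2;
        case 2:  return WINDOWS_8_SERVER2012;
        case 3:  return WINDOWS_8_1_SERVER2012R2;
        default: return WINDOWS_FUTURE;
        }
    }
    if (major == 10)
        return (minor == 0) ? WINDOWS_10_SERVER2016 : WINDOWS_FUTURE;
    return WINDOWS_FUTURE;
}
```

With a reported version of 6.2, the function returns `WINDOWS_8_SERVER2012` no matter which OS is actually running, which is the situation described above. Declaring Windows 10 compatibility in the application manifest is one documented way to get the real value; whether that is practical for a library like PortAudio is a separate question.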

All the best,

Gregorio.

P.S. Are you planning on a new stable release anytime soon?
R0b0t1
2018-06-28 18:44:31 UTC
Hello,

If Portaudio is using CP_UTF8 then it should be changed so that it
does not. UTF-8 support in Windows is horribly broken to the point of
being considered nonexistent. While programs *should* be compiled with
the UNICODE and _UNICODE macros, and *should* use the XxxW (wchar_t)
APIs, they most likely would be best served by using UTF-8 internally
as appropriate.

I recommend anyone following along read http://utf8everywhere.org/. I
realize you, the patch author, may not have the time to remove CP_UTF8
from Portaudio entirely.

My experience has indicated much the same as the above article
suggests, with the caveat that programs being compiled for Windows
only can usually use wchar_t everywhere. If you need to import or
export data generated by the OS, though, you should convert it from
the native codepage to UTF-8, which is why the author recommends the
things they do.
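
As an illustration of converting at the OS boundary, here is a minimal, portable sketch that turns UTF-16 (what the XxxW APIs return) into UTF-8. On Windows the same job is normally done by WideCharToMultiByte() with CP_UTF8; `utf16_to_utf8` is a hypothetical helper written for this example, not part of PortAudio or the Win32 API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode srcLen UTF-16 code units as NUL-terminated UTF-8.
   Returns the number of bytes written (excluding the NUL), or (size_t)-1 if
   dst is too small. Unpaired surrogates are replaced with U+FFFD. */
size_t utf16_to_utf8(const uint16_t *src, size_t srcLen, char *dst, size_t dstCap)
{
    size_t out = 0;

    for (size_t i = 0; i < srcLen; ++i) {
        uint32_t cp = src[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {          /* high surrogate */
            if (i + 1 < srcLen && src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
                ++i;                                 /* consumed the low surrogate */
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {   /* stray low surrogate */
            cp = 0xFFFD;
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (out + need + 1 > dstCap)                 /* keep room for the NUL */
            return (size_t)-1;

        if (need == 1) {
            dst[out++] = (char)cp;
        } else if (need == 2) {
            dst[out++] = (char)(0xC0 | (cp >> 6));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[out++] = (char)(0xE0 | (cp >> 12));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[out++] = (char)(0xF0 | (cp >> 18));
            dst[out++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        }
    }

    if (out + 1 > dstCap)
        return (size_t)-1;
    dst[out] = '\0';
    return out;
}
```

The conversion never looks at the active code page, so the output is the same regardless of how Windows is configured.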

Cheers,
R0b0t1
Gregorio Litenstein
2018-06-28 19:19:54 UTC
I agree that ideally one should use UTF-8 everywhere, and I think that
Microsoft may want to start pushing in that direction eventually (as
suggested by the addition of this setting). My patch provides a better
alternative (from the end user's point of view) than what portaudio is
currently doing.

I only touch Windows when I absolutely have to, so I wouldn't even know where
to begin in order to correct the root issue, but at least this works.


_______________________________________________
Portaudio mailing list
https://lists.columbia.edu/mailman/listinfo/portaudio
Ross Bencina
2018-06-29 03:07:21 UTC
Hello Gregorio,

If I understand correctly, your patch disables UTF-8 when UNICODE is not
defined, is that correct?

Ross.
Gregorio Litenstein
2018-06-29 19:07:42 UTC
This got lost in the void accidentally:

Not really. It completely eliminates UNICODE from the equation.

Currently, the DirectSound hostapi does what you describe, while WASAPI and
WDM-KS use CP_UTF8 always. My patch makes it so all three use ANSI
codepages UNLESS Windows has been configured to try to use UTF8 even for
apps that aren’t entirely Unicode compliant (which is a new setting, marked
as Beta in Windows 10 build 1803).

In doing this, localized device names should be rendered correctly
regardless of the codepage being used by Windows.

Either way, as far as I can tell, the only real impact of this patch is how
device names are displayed.
Ross Bencina
2018-06-30 05:55:15 UTC
Hi Gregorio,

Thanks for the clarification. However I don't fully understand what
you're saying.

Are you saying that without this patch, if Windows is configured in a
certain way, PortAudio does not correctly translate device names
returned by the OS into UTF8?

As a general principle for any patch, it should not matter at all how
the executable or Windows is configured: PortAudio *MUST* always
return UTF8 strings, because

(1) this is what we have previously agreed

and

(2) if we don't do this, applications that use PortAudio are no longer
portable because they can't rely on PortAudio strings being utf8.


Finally, the reason that we switched to requiring utf8 everywhere was
that there were bugs where device names with special characters were not
displayed correctly. And we agreed that the same encoding needs to be
used everywhere.

Ross.
R0b0t1
2018-06-30 06:22:34 UTC
On Sat, Jun 30, 2018 at 12:55 AM, Ross Bencina
Post by Ross Bencina
Hi Gregorio,
Thanks for the clarification. However I don't fully understand what you're
saying.
Are you saying that without this patch, if Windows is configured in a
certain way, PortAudio does not correctly translate device names returned by
the OS into UTF8?
As a general principle for any patch, it should not matter at all how the
executable or how Windows is configured, PortAudio *MUST* always return UTF8
strings, because
(1) this is what we have previously agreed
I agree exposing a UTF-8 interface is best, but it is necessary to
interact with Windows using UTF-16. Anything else is broken.
Consequently _UNICODE and UNICODE have to be defined - CP_UTF8 isn't a
real codepage, it only exists for use with MultiByteToWideChar and
WideCharToMultiByte.
Post by Ross Bencina
and
(2) if we don't do this, applications that use PortAudio are no longer
portable because they can't rely on PortAudio strings being utf8.
ANSI fits within UTF-8 so it should work.
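
This holds for the 7-bit ASCII range, which is byte-for-byte identical in UTF-8; bytes above 0x7F from an ANSI code page generally are not valid UTF-8 on their own. A minimal sketch of a well-formedness check makes the difference visible. `is_valid_utf8` is a hypothetical helper written for this example, and the device names in the usage note are made up.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the len bytes at s are well-formed
   UTF-8 (no overlong forms, no surrogates, nothing above U+10FFFF), else 0. */
int is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char b = s[i];
        size_t extra;          /* number of continuation bytes expected */
        unsigned long cp, min; /* decoded code point and smallest legal value */

        if (b < 0x80) { ++i; continue; }                     /* plain ASCII */
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min = 0x10000; }
        else return 0;         /* stray continuation byte or invalid lead byte */

        if (len - i <= extra)
            return 0;          /* sequence truncated at end of input */

        for (size_t k = 1; k <= extra; ++k) {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;      /* expected a 10xxxxxx continuation byte */
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;

        i += extra + 1;
    }
    return 1;
}
```

For example, the ASCII name `Speakers` passes, and so does `Micrófono` encoded as UTF-8 (`ó` = C3 B3), but the same name encoded in Windows-1252 (`ó` = F3) is rejected, so an application expecting UTF-8 from PortAudio could not rely on it.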
Post by Ross Bencina
Finally, the reason that we switched to requiring utf8 everywhere was that
there were bugs where device names with special characters were not
displayed correctly. And we agreed that the same encoding needs to be used
everywhere.
Ross.
I hope to not confuse the conversation, but the issue isn't with
PortAudio, per se - the issue is with Windows. *Not "using" UTF-8* is
the best way to solve these problems as far as the OS is concerned.
This may mean the bug you encountered can't be fixed. If a device
doesn't provide an ASCII or limited-symbol name then there are
potentially OS configurations which can't display the name.

The problem lies in that using any Windows string function with UTF-8
data when not in the UTF-8 codepage can potentially corrupt the data.
This is unfortunate considering that the UTF-8 codepage is not really
implemented and corrupts data anyway.

The ANSI codepages also corrupt data, so they cannot be used to
facilitate OS-mediated UTF-8 transfer. The solution is to pass around
UTF-8 and then require users to convert it to their code page when
interacting with the OS.

Cheers,
R0b0t1
Gregorio Litenstein
2018-06-30 06:30:41 UTC
Permalink
As portaudio is now, broken names are displayed for some (default)
configurations. My patch should fix that. I can probably send you some
practical examples tomorrow so it's clearer.
--
Gregorio Litenstein Goldzweig
From: R0b0t1 <***@gmail.com>
Date: June 30, 2018 at 02:22:34
To: portaudio list <***@lists.columbia.edu>
CC: Gregorio Litenstein <***@gmail.com>
Subject: Re: [Portaudio] Proper fix for Windows Unicode issues, and a
couple more things.

On Sat, Jun 30, 2018 at 12:55 AM, Ross Bencina
Post by Ross Bencina
Hi Gregorio,
Thanks for the clarification. However I don't fully understand what you're
saying.
Are you saying that without this patch, if Windows is configured in a
certain way, PortAudio does not correctly translate device names returned by
the OS into UTF8?
As a general principle for any patch, it should not matter at all how the
executable or how Windows is configured, PortAudio *MUST* always return UTF8
strings, because
(1) this is what we have previously agreed
I agree exposing a UTF-8 interface is best, but it is necessary to
interact with Windows using UTF-16. Anything else is broken.
Consequently _UNICODE and UNICODE have to be defined - CP_UTF8 isn't a
real codepage, it only exists for use with MultiByteToWideChar and
WideCharToMultiByte.
and
(2) if we don't do this, applications that use PortAudio are no longer
portable because they can't rely on PortAudio strings being utf8.
ANSI fits within UTF-8 so it should work.
Finally, the reason that we switched to requiring utf8 everywhere was that
there were bugs where device names with special characters were not
displayed correctly. And we agreed that the same encoding needs to be used
everywhere.
Ross.
I hope to not confuse the conversation, but the issue isn't with
PortAudio, per se - the issue is with Windows. *Not "using" UTF-8* is
the best way to solve these problems as far as the OS is concerned.
This may mean the bug you encountered can't be fixed. If a device
doesn't provide an ASCII or limited-symbol name then there are
potentially OS configurations which can't display the name.
The problem lies in that using any Windows string function with UTF-8
data when not in the UTF-8 codepage can potentially corrupt the data.
This is unfortunate considering that the UTF-8 codepage is not really
implemented and corrupts data anyway.
The ANSI codepages also corrupt data, so they cannot be used to
facilitate OS-mediated UTF-8 transfer. The solution is to pass around
UTF-8 and then require users to convert it to their code page when
interacting with the OS.
Cheers,
R0b0t1
Not really. It completely eliminates UNICODE from the equation.
Currently, the DirectSound hostapi does what you describe, while WASAPI
and WDM-KS use CP_UTF8 always. My patch makes it so all three use ANSI
codepages UNLESS Windows has been configured to try to use UTF8 even for
apps that aren’t entirely Unicode compliant (which is a new setting, marked
as Beta in Windows 10 build 1803).
In doing this, localized device names should be rendered correctly
regardless of the codepage being used by Windows.
Either way, as far as I can tell, the only real impact of this patch is
how device names are displayed.
--
Gregorio Litenstein Goldzweig
Date: June 28, 2018 at 23:07:21
Subject: Re: [Portaudio] Proper fix for Windows Unicode issues, and a
couple more things.
Hello Gregorio,
If I understand correctly, your patch disables UTF-8 when UNICODE is not
defined, is that correct?
Ross.
I agree that ideally one should use utf8 everywhere and I think that
Microsoft may want to start pushing in that direction eventually (as
suggested by the addition of this setting). My patch provides a better
alternative (from the end user's point of view) than what portaudio is
currently doing.
I only touch Windows when I absolutely have to, so I wouldn't even know
where to begin in order to correct the root issue, but at least this works.
Ross Bencina
2018-07-02 05:55:05 UTC
Permalink
Hi R0b0t1,

That's a very helpful perspective, some of the details are coming back
to me now. I don't entirely agree with your conclusions -- perhaps I am
misguided, please could you correct me...
Post by R0b0t1
I agree exposing a UTF-8 interface is best, but it is necessary to
interact with Windows using UTF-16. Anything else is broken.
I understand this much. Where relevant, PortAudio needs to use the
unicode versions of any Win32 APIs that accept or return strings.
Post by R0b0t1
Consequently _UNICODE and UNICODE have to be defined
No, we just need to always explicitly call the API functions that have
the W suffix. This may not currently be the case everywhere but it is
certainly the approach I have taken when implementing UTF-8 translations
in the PortAudio code that I worked on.
Post by R0b0t1
- CP_UTF8 isn't a
real codepage, it only exists for use with MultiByteToWideChar and
WideCharToMultiByte.
Post by Ross Bencina
and
(2) if we don't do this, applications that use PortAudio are no longer
portable because they can't rely on PortAudio strings being utf8.
ANSI fits within UTF-8 so it should work.
I'm not sure what you're getting at here. The whole point of requiring
PA to return UTF-8 strings is so that drivers with non-ASCII names can
be displayed correctly and consistently. This problem is real, and it is
why we decided to stipulate that PortAudio always returns UTF-8 strings.
Post by R0b0t1
Post by Ross Bencina
Finally, the reason that we switched to requiring utf8 everywhere was that
there were bugs where device names with special characters were not
displayed correctly. And we agreed that the same encoding needs to be used
everywhere.
Ross.
I hope to not confuse the conversation, but the issue isn't with
PortAudio, per se - the issue is with Windows. *Not "using" UTF-8* is
the best way to solve these problems as far as the OS is concerned.
This may mean the bug you encountered can't be fixed. If a device
doesn't provide an ASCII or limited-symbol name then there are
potentially OS configurations which can't display the name.
I'm not sure how this is relevant. If Windows can't display a utf-8
string, that's a problem for the client application trying to "display"
the utf-8 name, not for PortAudio.

Also, could you clarify which OS configurations these might be? My
knowledge is limited, so I'm not aware of any. Note that we have tacitly
given up supporting anything older than Windows XP.
Post by R0b0t1
The problem lies in that using any Windows string function with UTF-8
data when not in the UTF-8 codepage can potentially corrupt the data.
This is unfortunate considering that the UTF-8 codepage is not really
implemented and corrupts data anyway.
Absolutely agree with that. That's why PortAudio should (and does)
translate from UTF-8 to UTF-16 and then call the "W" suffix Windows
APIs. PortAudio should never be passing UTF-8 char* buffers to "A"
suffix Windows APIs. I think we both agree on that.
Post by R0b0t1
The ANSI codepages also corrupt data, so they cannot be used to
facilitate OS-mediated UTF-8 transfer. The solution is to pass around
UTF-8 and then require users to convert it to their code page when
interacting with the OS.
Erm, isn't the solution to pass around UTF-8 and then require PA to
convert it to UTF-16 and pass it to the UTF-16 Windows APIs? Am I
missing something here? If we use the wide-string APIs, does what you're
referring to as "their code page" have any relevance?

Ross.
Robert Bielik
2018-07-02 06:21:56 UTC
Permalink
Hi all,
Post by Ross Bencina
Erm, isn't the solution to pass around UTF-8 and then require PA to
convert it to UTF-16 and pass it to the UTF-16 Windows APIs? Am I
missing something here? If we use the wide-string APIs, does what you're
referring to as "their code page" have any relevance?
I agree, the correct way to handle this is to convert between UTF-8 and wide char and use the
W suffixed Windows API. UTF-8 is just an encoding.
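To make the "UTF-8 is just an encoding" point concrete, here is a minimal sketch of the wide-char to UTF-8 direction, i.e. what would happen to a device name obtained from a W-suffixed call before it is handed to the application. On Windows this step would normally be a call to WideCharToMultiByte with CP_UTF8; the hand-written encoder below is only a portable stand-in so the example builds anywhere. `sketch_utf16_to_utf8` is a hypothetical name, not PortAudio code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode a NUL-terminated UTF-16 string as UTF-8.
   Returns the number of bytes written (excluding the terminator), or -1
   if dst is too small or src contains an unpaired surrogate. The contents
   of dst are unspecified on failure. */
static int sketch_utf16_to_utf8(const uint16_t *src, char *dst, size_t dst_size)
{
    unsigned char *d = (unsigned char *)dst;
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    if (dst_size == 0)
        return -1;

    while (*src != 0) {
        uint32_t cp = *src++;
        size_t need;

        if (cp >= 0xD800 && cp <= 0xDBFF) {           /* high surrogate */
            if (*src < 0xDC00 || *src > 0xDFFF)
                return -1;                            /* missing low half */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*src++ - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return -1;                                /* stray low surrogate */
        }

        need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (out + need >= dst_size)
            return -1;                                /* keep room for '\0' */

        if (need == 1) {
            d[out++] = (unsigned char)cp;
        } else if (need == 2) {
            d[out++] = (unsigned char)(0xC0 | (cp >> 6));
            d[out++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            d[out++] = (unsigned char)(0xE0 | (cp >> 12));
            d[out++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            d[out++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            d[out++] = (unsigned char)(0xF0 | (cp >> 18));
            d[out++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            d[out++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            d[out++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    d[out] = '\0';
    return (int)out;
}
```

The result does not depend on the system code page at all, which is the reason for going through the wide-string APIs in the first place.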

My 2 cents

Regards,
/Rob
R0b0t1
2018-07-07 02:44:35 UTC
Permalink
On Mon, Jul 2, 2018 at 12:55 AM, Ross Bencina
Post by Ross Bencina
Hi R0b0t1,
That's a very helpful perspective, some of the details are coming back to me
now. I don't entirely agree with your conclusions -- perhaps I am misguided,
please could you correct me...
Thanks for listening. I don't mean to pass myself off as an expert. In
fact, I confused myself going back over this thread to write this
reply. Hopefully that's just Windows.
Post by Ross Bencina
Post by R0b0t1
I agree exposing a UTF-8 interface is best, but it is necessary to
interact with Windows using UTF-16. Anything else is broken.
I understand this much. Where relevant, PortAudio needs to use the unicode
versions of any Win32 APIs that accept or return strings.
Post by R0b0t1
Consequently _UNICODE and UNICODE have to be defined
No, we just need to always explicitly call the API functions that have the W
suffix. This may not currently be the case everywhere but it is certainly
the approach I have taken when implementing UTF-8 translations in the
PortAudio code that I worked on.
True, you should call the XxxW APIs directly. Setting _UNICODE and
UNICODE is just a safety measure. If someone accidentally calls the
Xxx macro the code won't compile.
Post by Ross Bencina
Post by R0b0t1
- CP_UTF8 isn't a
real codepage, it only exists for use with MultiByteToWideChar and
WideCharToMultiByte.
Post by Ross Bencina
and
(2) if we don't do this, applications that use PortAudio are no longer
portable because they can't rely on PortAudio strings being utf8.
ANSI fits within UTF-8 so it should work.
I'm not sure what you're getting at here. The whole point of requiring PA to
return UTF-8 strings is so that drivers with non-ASCII names can be displayed
correctly and consistently. This problem is real, and it is why we decided to
stipulate that PortAudio always returns UTF-8 strings.
So - Windows won't properly display UTF-8 embedded within ASCII, and
some APIs will mangle the data. But you should be able to return
whatever, assuming it is UTF-8. It may not be, but if it is, it will be
passed to the calling program properly.
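Since the open question here is whether a returned name really is UTF-8, a client application could check that before trying to display it. The following is a rough sketch of such a check, not something PortAudio provides: `sketch_is_valid_utf8` is a hypothetical helper that rejects stray or missing continuation bytes, overlong forms, surrogates and values above U+10FFFF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the NUL-terminated string s is
   well-formed UTF-8, 0 otherwise. */
static int sketch_is_valid_utf8(const char *s)
{
    /* Smallest code point that legitimately needs 0..3 continuation bytes;
       anything below that is an overlong encoding. */
    static const uint32_t min_cp[] = { 0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = (const unsigned char *)s;

    assert(s != NULL);
    while (*p != 0) {
        uint32_t cp;
        int extra;

        if (*p < 0x80) {                       /* plain ASCII byte */
            p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) {      /* lead byte of 2-byte form */
            cp = *p & 0x1F; extra = 1;
        } else if ((*p & 0xF0) == 0xE0) {      /* lead byte of 3-byte form */
            cp = *p & 0x0F; extra = 2;
        } else if ((*p & 0xF8) == 0xF0) {      /* lead byte of 4-byte form */
            cp = *p & 0x07; extra = 3;
        } else {
            return 0;                          /* stray continuation or bad lead */
        }
        p++;

        for (int i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)
                return 0;                      /* also catches early NUL */
            cp = (cp << 6) | (uint32_t)(*p & 0x3F);
        }

        if (cp < min_cp[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;                          /* overlong, too large, surrogate */
    }
    return 1;
}
```

A name that was converted with an ANSI code page instead of UTF-8, such as "Micrófono" with the single byte 0xF3 for the accented letter, fails this check, which gives the application a chance to fall back instead of showing garbled text.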

Why would it not be? Well, the point of this patch seems to be to fix
problems with CP_UTF8 by doing something else. That something else
also may not work. As it is there seems to be no guaranteed way to
make it work, but this (as far as OP knows) is more likely to make
more things work.

I apologize as I may have missed the point on this one.
Post by Ross Bencina
Post by R0b0t1
Post by Ross Bencina
Finally, the reason that we switched to requiring utf8 everywhere was that
there were bugs where device names with special characters were not
displayed correctly. And we agreed that the same encoding needs to be used
everywhere.
Ross.
I hope to not confuse the conversation, but the issue isn't with
PortAudio, per se - the issue is with Windows. *Not "using" UTF-8* is
the best way to solve these problems as far as the OS is concerned.
This may mean the bug you encountered can't be fixed. If a device
doesn't provide an ASCII or limited-symbol name then there are
potentially OS configurations which can't display the name.
I'm not sure how this is relevant. If Windows can't display a utf-8 string,
that's a problem for the client application trying to "display" the utf-8
name, not for PortAudio.
Also, could you clarify which OS configurations these might be? My knowledge
is limited, so I'm not aware of any. Note that we have tacitly given up
supporting anything older than Windows XP.
Sorry, the problem is more that UTF-8 isn't implemented properly in
Windows because nothing really uses it.

Some of the examples I am aware of are programmer misuse, i.e.
encoding things twice. Others are harder to figure out, like the
issues the patch submitter was experiencing. On closer inspection it
looks like PortAudio is doing basically everything the right way.
Post by Ross Bencina
Post by R0b0t1
The problem lies in that using any Windows string function with UTF-8
data when not in the UTF-8 codepage can potentially corrupt the data.
This is unfortunate considering that the UTF-8 codepage is not really
implemented and corrupts data anyway.
Absolutely agree with that. That's why PortAudio should (and does) translate
from UTF-8 to UTF-16 and then call the "W" suffix Windows APIs. PortAudio
should never be passing UTF-8 char* buffers to "A" suffix Windows APIs. I
think we both agree on that.
Post by R0b0t1
The ANSI codepages also corrupt data, so they cannot be used to
facilitate OS-mediated UTF-8 transfer. The solution is to pass around
UTF-8 and then require users to convert it to their code page when
interacting with the OS.
Erm, isn't the solution to pass around UTF-8 and then require PA to
convert it to UTF-16 and pass it to the UTF-16 Windows APIs? Am I missing
something here? If we use the wide-string APIs, does what you're referring to
as "their code page" have any relevance?
Yes, that is the solution. No, sorry - UTF-16 was meant to remove
codepages. Their codepage and locale (but mostly their locale) can
matter in what is returned from the API call.

I was recently trying to do some driver development. I *think* there
is a requirement that all devices at minimum provide Latin character
strings to work around this misbehavior, but I can't find that page
again. I am not sure how to preferentially request that translation if
it exists.

R0b0t1.
Post by Ross Bencina
Post by R0b0t1
Cheers,
R0b0t1
Post by Ross Bencina
Post by Gregorio Litenstein
Not really. It completely eliminates UNICODE from the equation.
Currently, the DirectSound hostapi does what you describe, while WASAPI
and WDM-KS use CP_UTF8 always. My patch makes it so all three use ANSI
codepages UNLESS Windows has been configured to try to use UTF8 even for
apps that aren’t entirely Unicode compliant (which is a new setting, marked
as Beta in Windows 10 build 1803).
In doing this, localized device names should be rendered correctly
regardless of the codepage being used by Windows.
Either way, as far as I can tell, the only real impact of this patch is
on how device names are displayed.
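The runtime check described in this message boils down to a one-line decision. The sketch below is not the patch itself: the helper and macro names are hypothetical, and the two constants simply mirror the Win32 values of CP_ACP and CP_UTF8 so that the example compiles without the Windows headers. In real code the argument would come from GetACP().

```c
#include <assert.h>

/* Local stand-ins for the Win32 constants (assumption: same values as
 * CP_ACP and CP_UTF8 in the Windows headers). */
#define SKETCH_CP_ACP  0u       /* "use the system ANSI code page" */
#define SKETCH_CP_UTF8 65001u   /* UTF-8 */

/* Hypothetical helper: given the active code page (as GetACP() would
 * report it), pick the code page used when converting device names. */
unsigned select_name_codepage(unsigned active_codepage)
{
    return active_codepage == SKETCH_CP_UTF8 ? SKETCH_CP_UTF8
                                             : SKETCH_CP_ACP;
}
```

With the beta UTF-8 setting enabled the active code page is 65001 and UTF-8 is chosen; with any other code page (1252, for example) the result is CP_ACP.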
--
Gregorio Litenstein Goldzweig
Médico Cirujano
* Fono: +56 9 96343643
Post by Gregorio Litenstein
Date: June 28, 2018 at 23:07:21
Subject: Re: [Portaudio] Proper fix for Windows Unicode issues, and a
couple more things.
Post by Ross Bencina
Hello Gregorio,
If I understand correctly, your patch disables UTF-8 when UNICODE is not
defined, is that correct?
Ross.
Post by Gregorio Litenstein
I agree that ideally one should use UTF-8 everywhere, and I think that
Microsoft may want to start pushing in that direction eventually (as
suggested by the addition of this setting). My patch provides a better
alternative (from the end user’s point of view) than what portaudio is
currently doing.
I only touch Windows when I absolutely have to, so wouldn't even know
where to begin in order to correct the root issue, but at least this works.
Date: June 28, 2018 at 14:44:31
Subject: Re: [Portaudio] Proper fix for Windows Unicode issues, and a
couple more things.
Post by R0b0t1
Hello,
On Thu, Jun 28, 2018 at 12:51 PM, Gregorio Litenstein
Hey, I’m one of the developers of Performous (cross-platform karaoke game),
we’ve been using portaudio for a while on Unix/Mac/Windows.
I’d initially written a really long post with an issue report but then I
realized a) It was, after pulling most of my hair out, an issue on our side
after all. And b) I sent it to the wrong address so I guess it never actually
made it into the list. Anyway… I’m back with a couple more real issues and
at least one fix.
First, we were facing issues with the display of unicode text in Windows;
essentially the same issue reported here:
https://lists.columbia.edu/pipermail/portaudio/2016-December/000961.html
I took a look at the hostapi implementations and noticed that only some
changed their behavior depending on whether UNICODE was defined or not,
while others always used CP_UTF8.
Initially I thought my issue might have been related to that, so I did some
testing and figured out that in recent versions of Windows, defining
UNICODE (i.e. having everything use CP_UTF8) made the text uniformly
garbled unless I checked the new setting “Use Unicode UTF-8 worldwide” or
something like that. And by contrast, if that setting was off, CP_ACP
properly rendered the text.
It appears that what this setting actually does is set the codepage to
UTF-8 (65001).
With this in mind, I created a patch modifying the behavior so instead of
checking for the definition of UNICODE or _UNICODE, portaudio checks (at
runtime) for the current codepage using GetACP(); If it’s 65001 it uses
CP_UTF8, if not it uses CP_ACP. I tested it on a laptop running Windows 10
Single Language Spanish and the text rendered appropriately both with the
setting turned on and with the setting turned off.
I will attach the diff file here but TBH I have no idea whether it is
possible to attach files to these lists, so you can also get it from
Dropbox below. I’m not opening a ticket/submitting a PR because Assembla is
paid.
https://www.dropbox.com/s/le6zyr1zjv6mank/pa_patch.diff?dl=1
If Portaudio is using CP_UTF8 then it should be changed so that it
does not. UTF-8 support in Windows is horribly broken to the point of
being considered nonexistent. While programs *should* be compiled with
the UNICODE and _UNICODE macros, and *should* use the XxxW (wchar_t)
APIs, they most likely would be best served by using UTF-8 internally
as appropriate.
I recommend anyone following along read http://utf8everywhere.org/. I
realize you, the patch author, may not have the time to remove CP_UTF8
from Portaudio entirely.
My experience has indicated much the same as the above article
suggests, with the caveat that programs being compiled for Windows
only can usually use wchar_t everywhere. If you need to import or
export data generated by the OS, though, you should convert it from
the native codepage to UTF-8, which is why the author recommends the
things they do.
Cheers,
R0b0t1
_______________________________________________
Portaudio mailing list
https://lists.columbia.edu/mailman/listinfo/portaudio
Ross Bencina
2018-06-29 02:06:53 UTC
Permalink
Hi Gregorio,

A while ago we decided that all PortAudio API strings are UTF-8 on all
platforms. The UNICODE and _UNICODE macros should be ignored on Windows.
Please let me know which host APIs still take the UNICODE macro into
account.

Thanks,

Ross.
Dmitry Kostjuchenko
2018-06-30 08:52:27 UTC
Permalink
Hi Gregorio,
Post by Gregorio Litenstein
Now, on to the “couple more things”… it’s actually just one thing. The
GetVersion() function is (still) giving erroneous results. If compiling
portaudio using mingw-w64 (and thus not using WinRT), WASAPI gets the
windows version using

Thank you for noticing this issue. GetVersion() reports the Windows 8 version
on Windows 10. Luckily there is a workaround, RtlGetVersion, which works very
well (tested). I will make the necessary adjustments in the code.

Currently there is no problem if IAudioClient2 is used instead of
IAudioClient3 on Windows 10 in non-WinRT mode, but it is better to have the
correct version just in case features of IAudioClient3 are needed.
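For readers unfamiliar with the workaround, a minimal sketch of calling RtlGetVersion might look like the following. It is an assumption-laden illustration, not the change that went into PortAudio: the helper name is hypothetical, the function is looked up in ntdll.dll at run time, and on non-Windows builds the helper simply reports failure so that the example still compiles.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
/* RtlGetVersion returns an NTSTATUS (a LONG); 0 means success. */
typedef LONG (WINAPI *RtlGetVersionFn)(RTL_OSVERSIONINFOW *);
#endif

/* Hypothetical helper: returns 1 and fills major/minor/build on success,
 * or returns 0 (with all three set to 0) if the version is unavailable. */
int query_os_version(unsigned long *major, unsigned long *minor,
                     unsigned long *build)
{
    assert(major != NULL && minor != NULL && build != NULL);
    *major = *minor = *build = 0;
#ifdef _WIN32
    {
        HMODULE ntdll = GetModuleHandleW(L"ntdll.dll");
        RtlGetVersionFn fn;
        RTL_OSVERSIONINFOW info;

        if (ntdll == NULL) return 0;
        fn = (RtlGetVersionFn)(void (*)(void))
                 GetProcAddress(ntdll, "RtlGetVersion");
        if (fn == NULL) return 0;

        ZeroMemory(&info, sizeof info);
        info.dwOSVersionInfoSize = sizeof info;
        if (fn(&info) != 0) return 0;

        *major = info.dwMajorVersion;
        *minor = info.dwMinorVersion;
        *build = info.dwBuildNumber;
        return 1;
    }
#else
    return 0;   /* not Windows: nothing to query */
#endif
}
```

On Windows 10 this is expected to report major version 10, which is the value the IAudioClient3 check needs.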

Best regards,
Dmitry.
Post by Gregorio Litenstein
Now, on to the “couple more things”… it’s actually just one thing. The
GetVersion() function is (still) giving erroneous results. If compiling
portaudio using mingw-w64 (and thus not using WinRT), WASAPI gets the
windows version using:
dwVersion = fnGetVersion();
// Get the Windows version
dwMajorVersion = (DWORD)(LOBYTE(LOWORD(dwVersion)));
dwMinorVersion = (DWORD)(HIBYTE(LOWORD(dwVersion)));
switch (dwMajorVersion)
{
case 0: case 1: case 2: case 3: case 4: case 5:
    break; // skip lower
case 6:
    switch (dwMinorVersion)
    {
    case 0: version = WINDOWS_VISTA_SERVER2008; break;
    case 1: version = WINDOWS_7_SERVER2008R2; break;
    case 2: version = WINDOWS_8_SERVER2012; break;
    case 3: version = WINDOWS_8_1_SERVER2012R2; break;
    default: version = WINDOWS_FUTURE; break;
    }
    break;
case 10:
    switch (dwMinorVersion)
    {
    case 0: version = WINDOWS_10_SERVER2016; break;
    default: version = WINDOWS_FUTURE; break;
    }
    break;
default:
    version = WINDOWS_FUTURE;
    break;
}
But, from my tests, I noticed that in practice Windows 10 (with latest updates)
returns the same value as Windows 8 (i.e. dwMajorVersion=6,
dwMinorVersion=2) and thus Win10 computers might end up using IAudioClient2.
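The decoding described here can be reproduced without the Windows headers. The sketch below uses a hypothetical enum and function name (PortAudio's own constants are the WINDOWS_* names quoted above) and plain bit operations in place of the LOBYTE/HIBYTE/LOWORD macros, so the misclassification is easy to see.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical enum for this sketch only. */
typedef enum {
    SKETCH_WIN_OLDER,
    SKETCH_WIN_VISTA,
    SKETCH_WIN_7,
    SKETCH_WIN_8,
    SKETCH_WIN_8_1,
    SKETCH_WIN_10,
    SKETCH_WIN_FUTURE
} SketchWinVersion;

/* Classify a packed GetVersion()-style value: the low byte of the low word
 * is the major version, the next byte is the minor version. */
SketchWinVersion sketch_classify(uint32_t packed)
{
    unsigned major = packed & 0xFFu;          /* LOBYTE(LOWORD(v)) */
    unsigned minor = (packed >> 8) & 0xFFu;   /* HIBYTE(LOWORD(v)) */

    if (major < 6) return SKETCH_WIN_OLDER;
    if (major == 6) {
        switch (minor) {
        case 0:  return SKETCH_WIN_VISTA;
        case 1:  return SKETCH_WIN_7;
        case 2:  return SKETCH_WIN_8;
        case 3:  return SKETCH_WIN_8_1;
        default: return SKETCH_WIN_FUTURE;
        }
    }
    if (major == 10 && minor == 0) return SKETCH_WIN_10;
    return SKETCH_WIN_FUTURE;
}
```

A packed value with major 6 and minor 2 (for instance 0x23F00206, i.e. 6.2 build 9200) is classified as Windows 8 regardless of the real OS, which matches the behaviour reported in the post.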
All the best,
Gregorio.
P.S. Are you planning on a new stable release anytime soon?
--
------------------------------------------------------------
Neutron Code Limited
skype: neutroncode
phone: +371 29267835
e-mail: ***@gmail.com
web: http://neutroncode.com
Gregorio Litenstein
2018-06-30 19:22:47 UTC
Permalink
So… I was going to send you the examples from my patch but I noticed that
it’s still not really perfect. The main idea I had in mind was for it to
render text correctly for whatever settings were being used at the moment
but I guess Windows is more of a mess than I’d thought regarding this
issue. I was reading some of the MSDN articles on working with strings and
I swear my head hurts. I’ll submit a better patch when I’ve got time, if I
can come up with something better (i.e. that does things the way they’re
actually meant to be done)
--
Gregorio Litenstein Goldzweig
Médico Cirujano



- Fono: +56 9 96343643
- E-Mail: ***@gmail.com

