[Portaudio] Golang portable sound i/o

Discussion:

Scott Cotton

2018-09-06 12:29:58 UTC

Hi all,

ZikiChombo <http://zikichombo.org> has started a Golang project
<http://github.com/zikichombo.org/sio> similar to port audio. I am writing
to solicit feedback and know-how from the portaudio community. It is early
in our project lifecycle and we would like to learn as much as possible
from port audio and share back anything we do differently (there are some
things we plan to do differently outlined in this github PR
<https://github.com/zikichombo/sio/pull/8>)

(Also, since zc/sio is written in a different language, we're not sure how
the PortAudio authors' request that redistributed changes be sent back
would apply; comments?)

Best,
Scott

sqweek

2018-09-08 15:26:00 UTC

Cool! I can't really speak for the developers, but as someone who uses
portaudio within go (via https://github.com/gordonklaus/portaudio) I figure
I can offer some feedback. Audio processing in go is an interesting one,
because you don't have precise control over thread scheduling. I'm sure
you're aware of audio processing's soft-realtime requirements and the
audible artifacts which result if they are not satisfied.

Go's scheduler works by ensuring that N threads are active at any one time,
where N is customisable via runtime.GOMAXPROCS. Historically it defaulted
to 1, but I think that may have changed in more recent versions? Anyway,
when it comes time for an audio buffer to be processed, the OS/audio driver
triggers an event and some thread within your process wakes up to handle
the buffer. This is all happening at the system level; nothing to do with
go yet. But if you want to implement audio processing in go, then your
callback has to enter the go runtime to invoke the processing code. Maybe
you can see where this is going - because the go runtime ensures that only
N threads are running user code at any one time, if there are already N
active threads when the audio callback fires, then the audio processing
doesn't get to happen until another thread yields or finishes its timeslice.
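
For reference, the value of N in effect can be queried at run time. The following is a minimal sketch, assuming Go 1.5 or later, where the default is the number of logical CPUs rather than 1. The helper name schedulerLimits is illustrative only and not part of any library.

```go
package main

import (
	"fmt"
	"runtime"
)

// schedulerLimits reports the current GOMAXPROCS value (without changing it)
// and the number of logical CPUs visible to the process.
func schedulerLimits() (maxProcs, numCPU int) {
	// Passing 0 queries the setting and leaves it unchanged.
	return runtime.GOMAXPROCS(0), runtime.NumCPU()
}

func main() {
	maxProcs, numCPU := schedulerLimits()
	fmt.Printf("GOMAXPROCS=%d NumCPU=%d\n", maxProcs, numCPU)
}
```

If the two numbers match, the runtime will run Go code on as many threads as there are logical CPUs, so the contention described above only appears once that many goroutines are busy at the same time.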

For my use case I wasn't doing significant processing in my callback
routine so I just switched to portaudio's blocking interface (which still
involves a callback at the C level, but avoids invoking go's scheduler). It
occurs to me I should probably file a bug report, but it was also a fair
while ago I was experimenting with this stuff (I think go 1.6).

To be clear it's not like I was getting tons of audio glitches when using a
go audio callback, but I also wasn't satisfied that its worst case
performance provided a robust solution in terms of underruns.

-sqweek

Scott Cotton

2018-09-08 18:39:52 UTC

Thanks for your thoughts.

Not having heard from the authors, I'll try to keep my follow up as
port-audio specific as possible.

Regarding the scheduling, with or without Go's portaudio bindings, yes, Go's
scheduling introduces uncertainty about when a buffer can be scheduled.
However, this is also often true at the OS level (there are N CPUs and M OS
threads/procs to schedule, which is at best very hard to control). Go
doesn't have an option for a real-time priority OS thread. I don't think
any Go code will be able to compete on reliability/robustness with actual
real-time support. But real-time thread support is also, to my
understanding, often not used (except by JACK, and I would suspect Go-JACK
bindings violate the real-time principle behind JACK). Since there are 2
schedulers (OS and Go runtime) to deal with in Go, I would guess that:

1. Without real-time support, and with or without Go, one needs to test for
   reliability under system load.
2. In some contexts, the Go scheduler may help, and in some it may hurt.
3. If Go communicates directly with the host in a well tested and
   engineered way, then it may help to not use the port audio bindings.
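
Point 1 can be approached with a very simple probe. The sketch below is only an assumption about how one might start: it sleeps for a fixed period, standing in for one audio buffer, and records how late each wake-up is. The names measureLateness and worst are made up for this example, and a real test would run it while the machine is under load.

```go
package main

import (
	"fmt"
	"time"
)

// measureLateness sleeps for period n times and records how much later than
// requested each wake-up happened.
func measureLateness(period time.Duration, n int) []time.Duration {
	late := make([]time.Duration, 0, n)
	for i := 0; i < n; i++ {
		start := time.Now()
		time.Sleep(period)
		late = append(late, time.Since(start)-period)
	}
	return late
}

// worst returns the largest value in d, or 0 if d is empty.
func worst(d []time.Duration) time.Duration {
	var w time.Duration
	for _, v := range d {
		if v > w {
			w = v
		}
	}
	return w
}

func main() {
	late := measureLateness(5*time.Millisecond, 200)
	fmt.Printf("wake-ups: %d, worst lateness: %v\n", len(late), worst(late))
}
```

The worst-case figure matters more than the average here: a single wake-up that is later than the buffer duration is enough to cause an underrun.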

The Go garbage collector is low latency for large heaps, probably low
enough for sound apps without tight latency needs. For sound apps without
large heaps or which are smart about memory, GC in Go seems to me to not be
a huge issue for relatively low latency sound apps. Again, this holds
independently of whether Go uses the portaudio bindings or connects
directly to the host.
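
The pause times in question can be read from the runtime itself. This is a minimal sketch using runtime.ReadMemStats; the function name gcPauseSummary is an assumption for illustration, and it forces one collection only so that there is something to report.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// gcPauseSummary forces one collection so there is something to report, then
// returns the number of completed GC cycles, the cumulative stop-the-world
// pause time, and the most recent pause.
func gcPauseSummary() (cycles uint32, total, last time.Duration) {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	// PauseNs is a circular buffer; the most recent pause is at (NumGC+255)%256.
	last = time.Duration(m.PauseNs[(m.NumGC+255)%256])
	return m.NumGC, time.Duration(m.PauseTotalNs), last
}

func main() {
	cycles, total, last := gcPauseSummary()
	fmt.Printf("GC cycles: %d, total pause: %v, last pause: %v\n", cycles, total, last)
}
```

Comparing the last pause against the duration of one audio buffer gives a rough idea of whether a collection could cause an underrun for a given buffer size.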

Also, it appears to me that portaudio has put in tons of work and had tons
of testing related to which configurations for interfacing with the host it
uses and how it maps these configurations to its public interface. My intuition is
that a lot of robustness is more related to this level than Go scheduling
via bindings or not, and this is where we hope to learn most from
portaudio. The zikichombo sio project is still a long way from port audio
in this respect.

But there is still the question of the request on the part of the port
audio project to give back any modifications and that this list be used to
discuss portaudio specific topics to help port-audio advance. I want to be
sure to respect that. I don't want to push the topic of using go bindings
for portaudio or not outside the scope and purpose of the portaudio list,
and I'm happy to share anything we learn along the way that might help
port-audio.

Best,
Scott

_______________________________________________
Portaudio mailing list
https://lists.columbia.edu/mailman/listinfo/portaudio

--
Scott Cotton
President, IRI France SAS
http://www.iri-labs.com

Phil Burk

2018-09-08 22:15:13 UTC

Hello,

The "project" link below was broken. I think I found sio at:
https://github.com/zikichombo/sio

Best of luck with your project. I'm sure some folks would be happy to help
or offer advice.

Given the complication of callbacks and real-time threading, I find that
blocking read/write calls are better than callbacks for high level
languages like Java (or Go).
The latency is generally higher. But that may not matter depending on your
anticipated use cases.

Regarding Android, if you want a native API then I suggest that you try
Oboe. It wraps the old OpenSL ES and the new AAudio API. AAudio was
inspired by PortAudio but does not share any code.
https://github.com/google/oboe

Post by Scott Cotton
(Also, since zc/sio is in a different language, we're not sure how
portaudio authors request w.r.t. redistributed changes being sent back
would work; comments?)

I cannot speak for all PortAudio authors, but I think one could honor the
spirit of the request by reporting back lessons that you learn by creating
sio. Also if you notice any bugs while studying PortAudio, please report
them.

Thank you,
Phil Burk

Scott Cotton

2018-09-09 06:00:22 UTC

Hello,

Thanks for the pointers and well wishes.

I looked at Oboe. While it is nice, it doesn't let us depend only on the
host, which is one of our goals. But it is at least a good example of how
to link with AAudio and OpenSL ES, and the code is clean.

I wanted to ask about your observations for blocking mode. Let me give a
bit of context. While Go is a high level language, it is also in many ways
a low level language: static compilation, direct access to sys calls,
linking against assembly, similar to C/C++ inline assembly, even compiles
for some embedded systems these days, and talks directly to hardware (see
periph.io for example). That said, the blocking APIs do seem like a more
natural fit for Go.

I am curious as to how deep the problem of the source of the higher latency
is for blocking APIs, and if there is any data showing the difference on
concrete cases.

For how deep the problem is, I mean, at some layer, even in callback based
APIs, something is blocking to enforce timing related to PCM data
throughput and sample rate. I guess that is usually via IRQ to the OS from
the sound hardware. But it's not like AudioUnits or AAudio or anything
else is in a spin lock waiting for the next event. There is blocking, even
without a blocking API. Given this observation, and the fact that
non-blocking event based APIs have been so common in audio for so long,
does anyone have an observation about how much of the latency performance
limitations of blocking API are based on it being built on top of an event
API rather than directly exporting the blocking that is inherent at a lower
level?

Best,
Scott

Phil Burk

2018-09-09 16:36:45 UTC

Hello Scott,

Post by Scott Cotton
I am curious as to how deep the problem of the source of the higher
latency is for blocking APIs, and if there is any data showing the
difference on concrete cases.

If the underlying API uses a blocking read/write then the higher level API
can also use a blocking read/write or a callback without adding additional
latency.

But if the underlying API can only use a callback then a higher level
blocking read/write API requires the addition of a FIFO. For example, the
higher level code could write() into the FIFO and the callback could read
it. The FIFO will add latency. An example of this would be the Oboe API on
top of OpenSL ES. OpenSL ES only supports callbacks so Oboe has to add a
FIFO in order to support blocking writes.
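
The FIFO arrangement described above can be sketched in Go as follows. This is an illustration under stated assumptions, not how Oboe or PortAudio implement it: the FIFO type and its methods are invented for the example, and a buffered channel is used for brevity where a real implementation would more likely use a lock-free ring buffer.

```go
package main

import "fmt"

// FIFO is a bounded queue of samples between a blocking writer and a
// callback-style reader.
type FIFO struct {
	samples chan float32
}

// NewFIFO returns a FIFO that holds up to capacity samples.
func NewFIFO(capacity int) *FIFO {
	return &FIFO{samples: make(chan float32, capacity)}
}

// Write blocks until every sample in p has been queued.
func (f *FIFO) Write(p []float32) {
	for _, s := range p {
		f.samples <- s
	}
}

// Callback stands in for the host's audio callback. It must not block, so it
// fills out with whatever is queued and pads the rest with silence. It
// returns the number of padded (underrun) samples.
func (f *FIFO) Callback(out []float32) (underruns int) {
	for i := range out {
		select {
		case s := <-f.samples:
			out[i] = s
		default:
			out[i] = 0
			underruns++
		}
	}
	return underruns
}

func main() {
	f := NewFIFO(4)
	f.Write([]float32{0.1, 0.2, 0.3})
	out := make([]float32, 5)
	fmt.Println(f.Callback(out), out)
}
```

Every sample waits in the queue before the callback picks it up, which is where the added latency comes from: a larger capacity tolerates more jitter in the writer but delays the audio by up to that many samples.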

Another consideration is thread priority. If an app uses blocking writes
then the app creates the thread that does the writes. But if a callback is
used then the OS can create a thread that runs at a higher priority. For
example with AAudio, the callback thread is created by the OS and uses a
different scheduler, SCHED_FIFO, that has better real-time performance.
This reduces scheduling jitter, which reduces glitches, which allows apps
to use smaller buffers, which reduces latency.
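
The last step of that chain is simple arithmetic: the latency contributed by one buffer is its length in frames divided by the sample rate. A small sketch, with the function name bufferLatencyMs and the buffer sizes chosen only for illustration:

```go
package main

import "fmt"

// bufferLatencyMs returns how long it takes to play out the given number of
// frames at sampleRate, in milliseconds.
func bufferLatencyMs(frames, sampleRate int) float64 {
	return 1000 * float64(frames) / float64(sampleRate)
}

func main() {
	for _, frames := range []int{64, 256, 1024} {
		fmt.Printf("%4d frames at 48000 Hz: %.2f ms\n", frames, bufferLatencyMs(frames, 48000))
	}
}
```

Less scheduling jitter means the smaller sizes become usable without glitches, and the latency drops in proportion.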

Phil Burk

Scott Cotton

2018-09-09 18:13:58 UTC

Hi Phil,

Very clear and helpful explanation, thanks much.

Scott

Neil C Smith

2018-09-09 08:32:12 UTC

Post by Phil Burk
Given the complication of callbacks and real-time threading, I find that
blocking read/write calls are better than callbacks for high level
languages like Java (or Go).

Scott Cotton

2018-09-09 18:30:47 UTC

Hi Neil,

Turns out I was being overly paranoid about the Go scheduler w.r.t.
real-time threads. There is some cost getting into Go from C, I guess as
there is in Java, but it will just use the supplied thread if it is a
foreign thread, and that thread will execute in Go independently, only for
the work placed on it via C. So callbacks should also just work at the
given priority, so long as they don't invoke or communicate with other
threads, as in any other language.

Best,
Scott

Post by Neil C Smith

Post by Phil Burk
Given the complication of callbacks and real-time threading, I find that
blocking read/write calls are better than callbacks for high level
languages like Java (or Go).

To be fair, callbacks and real-time threading on the JVM generally just
work because it uses the OS scheduler directly. I don't know a lot about
Go, but I'm somewhat surprised there's not a better way of handling
"foreign" thread attachment.
Best wishes,
Neil

Neil C Smith

2018-09-09 19:29:45 UTC

Post by Scott Cotton
Turns out I was being overly paranoid about the Go scheduler w.r.t.
real-time threads. There is some cost getting into Go from C, I guess as
there is in Java, but it will just use the supplied thread if it is a
foreign thread, and that thread will execute in Go independently, only for
the work placed on it via C. So callbacks should also just work at the
given priority, so long as they don't invoke or communicate with other
threads, as in any other language.

Well, that sounds a lot more useful! :-) Quite a few things I can think of
would require that. I'd personally advocate callbacks over blocking with
higher level languages whenever possible, for the latency reasons mentioned
and the benefits of having the thread set up externally (e.g. real-time
scheduled). I do occasionally use PortAudio callbacks into the JVM
indirectly with JACK on Windows, and it seems to work well, although I'm
mostly a Linux man. Incidentally, I'm not sure what you'll find with Go,
but the bulk of the thread attachment overhead in Java is in the first
callback only.

Best wishes,

Neil