From: Stephen S. <rad...@gm...> - 2016-05-17 01:42:34
|
Hi, thanks for bumping it; it seems I missed the original email somehow.

Yes, generally I try to stay ANSI C90 compliant in liblo, so it's an error; I'll try to fix it. (I should test more often with the restricted standard.) Just to clarify, declaring variables closer to where they are used was only made legal in C99. Here's a good Stack Overflow topic that explains it:

http://stackoverflow.com/questions/8474100/where-you-can-and-cannot-declare-new-variables-in-c

MSVC is somewhat notorious for having an out-of-date C implementation; they sadly place no priority on supporting C11, for instance. Note that you can compile liblo with MinGW and still use it from an MSVC application, if ever the need should arise. (Again, this hasn't been tested for a while, but it should work.)

Steve

On Mon, May 16, 2016 at 4:33 AM, John Emmas <jo...@ti...> wrote:
> Hi guys, I'm just bumping this in case it got missed upstream...
>
> I've fixed the problem locally but it should probably get fixed in master. Best regards,
>
> John
>
> On 02/05/2016 11:32, John Emmas wrote:
>> I just came across a compiler issue when building with MSVC. It's in the function alloc_server_thread() (in 'src/server_thread.c'):-
>>
>> static lo_server_thread alloc_server_thread(lo_server s)
>> {
>>     if (!s)
>>         return NULL;
>>     lo_server_thread st = (lo_server_thread)    // <--- PROBLEM IS AT THIS LINE !!!
>>         malloc(sizeof(struct _lo_server_thread));
>>
>>     // rest of function. . .
>> }
>>
>> The problem arises because 'server_thread.c' is a 'C' file (as opposed to C++). 'C' does not allow variables to get declared half way down a function (they need to get declared at the top of the function). Many compilers ignore that nowadays but MSVC is still quite strict about stuff like that. So making this small change enables it to compile correctly:-
>>
>> static lo_server_thread alloc_server_thread(lo_server s)
>> {
>>     lo_server_thread st;    // <--- MOVE THE DECLARATION TO HERE
>>
>>     if (!s)
>>         return NULL;
>>     st = (lo_server_thread) malloc(sizeof(struct _lo_server_thread));
>>
>>     // rest of function. . .
>> }
>>
>> Just thought I'd pass this upstream. Best regards,
>>
>> John
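A note for anyone wanting to catch this class of error before it reaches MSVC: assuming GCC or Clang is available, compiling the C sources against the restricted standard, for example with "gcc -std=c90 -pedantic-errors -c src/server_thread.c", will reject declarations that appear after statements; both compilers also accept -Wdeclaration-after-statement to flag exactly this pattern. The specific flags are a suggestion, not something from the thread.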
From: John E. <jo...@ti...> - 2016-05-16 08:34:06
|
Hi guys, I'm just bumping this in case it got missed upstream...

I've fixed the problem locally but it should probably get fixed in master. Best regards,

John

On 02/05/2016 11:32, John Emmas wrote:
> I just came across a compiler issue when building with MSVC. It's in the function alloc_server_thread() (in 'src/server_thread.c'):-
>
> static lo_server_thread alloc_server_thread(lo_server s)
> {
>     if (!s)
>         return NULL;
>     lo_server_thread st = (lo_server_thread)    // <--- PROBLEM IS AT THIS LINE !!!
>         malloc(sizeof(struct _lo_server_thread));
>
>     // rest of function. . .
> }
>
> The problem arises because 'server_thread.c' is a 'C' file (as opposed to C++). 'C' does not allow variables to get declared half way down a function (they need to get declared at the top of the function). Many compilers ignore that nowadays but MSVC is still quite strict about stuff like that. So making this small change enables it to compile correctly:-
>
> static lo_server_thread alloc_server_thread(lo_server s)
> {
>     lo_server_thread st;    // <--- MOVE THE DECLARATION TO HERE
>
>     if (!s)
>         return NULL;
>     st = (lo_server_thread) malloc(sizeof(struct _lo_server_thread));
>
>     // rest of function. . .
> }
>
> Just thought I'd pass this upstream. Best regards,
>
> John
From: Erik R. <er...@om...> - 2016-05-15 12:36:32
|
> I don't know your application, but I'm almost sure you could/should be > breaking up those huge messages. OSC was never meant for streaming > large messages. You can schedule multiple subsequent messages to be > triggered at the same time by just giving them the same time tag > slightly in the future. liblo will queue them and dispatch them > during the same call to lo_server_recv() once that time has been > surpassed. Yeah, I like the general idea of timetagging individual messages, but in practice, I don’t like to involve another scheduler over which I have no detailed control. Well of course I could change things in liblo as I like because it is open source, but it still splits handling of one thing (scheduling) into two different parts of the application. Erik |
From: Stephen S. <rad...@gm...> - 2016-05-13 13:45:44
|
On Fri, May 13, 2016 at 6:11 AM, Erik Ronström <eri...@do...> wrote:
>> ahmm, does anybody still use a count-prefix for framing OSC/TCP?
>> i thought that by now SLIP has been widely accepted (and is the
>> suggested format as per OSC-1.1).
>
> Well, we do – not that it is necessarily the best solution, it was just the easiest to implement on the client side (which is a Lisp application).
>
> But since we use OSC to schedule playback of many MIDI events, the messages can be quite big, a few hundred kilobytes at times, and when using SLIP encoding, those messages have to be recoded inside liblo. You could argue that the performance hit isn't that bad on a modern computer, but in the liblo implementation, the memory used for those operations is allocated from the stack. I'm not really comfortable using that much stack memory, especially given the limitations of alloca (it doesn't always tell you if the stack overflows).

I don't know your application, but I'm almost sure you could/should be breaking up those huge messages. OSC was never meant for streaming large messages. You can schedule multiple subsequent messages to be triggered at the same time by just giving them the same time tag slightly in the future. liblo will queue them and dispatch them during the same call to lo_server_recv() once that time has been surpassed.

> So I'll be happy to stick to count-prefix for the time being!
>
> I was thinking of rewriting the decoding code so that it uses stack memory for small messages (up to some constant, say 32kB), and allocates on the heap otherwise. But I don't have the time right now. I'll post a patch if I come to it later on!

This was indeed done based on the assumption that the needed buffer is quite small (i.e. small messages). However, in principle the receiving buffer doesn't need to be as big as the whole message, so I'll look into restricting it. If it's too difficult, it could perhaps be moved to the heap.

However, I think I decided to use the stack here because SLIP requires an extra buffer, one for receiving and one for decoding, so I thought it more efficient to use the stack than either a) allocating and freeing on each recv(), or b) trying to maintain a second heap pointer per socket. It's admittedly a bit convoluted. The receiving buffer should be restricted in size, with the assumption that recv() will be called more than once; however I see that it's not done _quite_ like that at the moment -- but it's close.

Steve
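For readers unfamiliar with the time-tag scheduling Stephen describes above, a minimal sketch using liblo's public bundle API might look like the following. The target address, the "/note/on" path and the one-second offset are illustrative assumptions, not details from the thread:

    #include <lo/lo.h>

    /* Put several messages into one bundle with a shared time tag slightly
     * in the future; the receiving lo_server queues them and dispatches
     * them together, during the same lo_server_recv() call, once that
     * time has passed. */
    void send_scheduled_pair(lo_address target)
    {
        lo_timetag when;
        lo_bundle b;
        lo_message m1, m2;

        lo_timetag_now(&when);
        when.sec += 1;                              /* roughly one second ahead */

        b = lo_bundle_new(when);

        m1 = lo_message_new();
        lo_message_add_int32(m1, 60);
        lo_bundle_add_message(b, "/note/on", m1);   /* hypothetical path */

        m2 = lo_message_new();
        lo_message_add_int32(m2, 64);
        lo_bundle_add_message(b, "/note/on", m2);

        lo_send_bundle(target, b);
        lo_bundle_free_recursive(b);                /* frees the bundle and its messages */
    }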
From: Stephen S. <rad...@gm...> - 2016-05-13 13:02:45
|
On Fri, May 13, 2016 at 3:22 AM, IOhannes m zmölnig <zmo...@ie...> wrote: > On 05/11/2016 07:46 PM, Stephen Sinclair wrote: >> unfortunately the semantics of the poor excuse for a framing protocol >> used by OSC/TCP (count prefix) > > ahmm, does anybody still use a count-prefix for framing OSC/TCP? > i thought that by now SLIP has been widely accepted (and is the > suggested format as per OSC-1.1). Yep, but one thing that's been on my list for a long time but I have never got around to is checking liblo thoroughly against other OSC implementations. What libraries use SLIP or count-prefix by default? As far as I know there aren't many OSC libraries that even implement TCP. It's a problem, really, that OSC-over-TCP doesn't have a method of determining the packetizing protocol. I tried to make sure liblo can at least "guess" at which one it is receiving, but I have no idea whether it's robust -- I think not, probably. > i would not spend a single sleepless night on how to ease the use of > count-prefixed framing. I suppose I have to agree. On the other hand I'm interested in making doubly sure our SLIP implementation is good. The problem is that backwards compatibility doesn't allow us to switch to SLIP as the default -- more and more reasons to start thinking about liblo1. Maybe I'll start a list. Steve |
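For reference, the SLIP framing discussed here amounts to escaping two byte values and terminating each packet with an END byte (0xC0). A minimal encoder sketch, written from the SLIP/OSC 1.1 description rather than from liblo's sources:

    #include <stddef.h>
    #include <stdint.h>

    #define SLIP_END     0xC0   /* frame delimiter */
    #define SLIP_ESC     0xDB   /* escape introducer */
    #define SLIP_ESC_END 0xDC   /* escaped form of 0xC0 */
    #define SLIP_ESC_ESC 0xDD   /* escaped form of 0xDB */

    /* Encode one OSC packet for a stream transport.  'out' must have room
     * for at least 2*len + 1 bytes; returns the number of bytes written. */
    size_t slip_encode(const uint8_t *in, size_t len, uint8_t *out)
    {
        size_t i, o = 0;
        for (i = 0; i < len; i++) {
            if (in[i] == SLIP_END) {
                out[o++] = SLIP_ESC;
                out[o++] = SLIP_ESC_END;
            } else if (in[i] == SLIP_ESC) {
                out[o++] = SLIP_ESC;
                out[o++] = SLIP_ESC_ESC;
            } else {
                out[o++] = in[i];
            }
        }
        out[o++] = SLIP_END;    /* terminate the frame */
        return o;
    }

Because an unescaped END byte can never appear inside a frame, a receiver that loses bytes can always resynchronize at the next END, which is exactly the recovery property the count-prefix framing lacks.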
From: Erik R. <eri...@do...> - 2016-05-13 09:11:42
|
> ahmm, does anybody still use a count-prefix for framing OSC/TCP? > i thought that by now SLIP has been widely accepted (and is the > suggested format as per OSC-1.1). Well, we do – not that it is necessarily the best solution, it was just the easiest to implement on the client side (which is a Lisp application). But since we use OSC to schedule playback of many MIDI events, the messages can be quite big, a few hundred kilobytes at times, and when using SLIP encoding, those messages have to be recoded inside liblo. You could argue that the performance hit isn’t that bad on a modern computer, but in the liblo implementation, the memory used for those operations is allocated from the stack. I’m not really comfortable using that much stack memory, especially given the limitations of alloca (it doesn’t always tell you if the stack overflows). So I’ll be happy to stick to count-prefix for the time being! I was thinking of rewriting the decoding code so that it uses stack memory for small messages (up to some constant, say 32kB), and allocates on the heap otherwise. But I don’t have the time right now. I’ll post a patch if I come to it later on! Erik |
From: IOhannes m z. <zmo...@ie...> - 2016-05-13 06:22:46
|
On 05/11/2016 07:46 PM, Stephen Sinclair wrote: > unfortunately the semantics of the poor excuse for a framing protocol > used by OSC/TCP (count prefix) ahmm, does anybody still use a count-prefix for framing OSC/TCP? i thought that by now SLIP has been widely accepted (and is the suggested format as per OSC-1.1). i would not spend a single sleepless night on how to ease the use of count-prefixed framing. gfdsamr IOhannes |
From: Erik R. <er...@om...> - 2016-05-11 21:35:56
|
> The only legal and reliable way I can think of to indicate a reset, if
> the message can't be completed, is to close the socket. So liblo
> checks for an error return from send() and closes the socket in that
> case.

Yeah, and that seems to work well.

> What you discovered is that an inundated receiving socket can produce
> send() errors. (I think.) However, in a sense this is probably the
> right thing to do, as it is similar to a DDoS attack, so dropping the
> connection is probably fine.

I agree that dropping the connection is fine! But a partial send must also be considered an error, and should therefore also close the socket – which it doesn't, at present. A check in send_data() is needed:

    do {
        ret = send(sock, data, len, MSG_NOSIGNAL);
        if (ret >= 0 && ret != len) {
            // Close socket
        }
        if (a->protocol == LO_TCP)
            ai = ai->ai_next;
        else
            ai = 0;
    } while (ret == -1 && ai != NULL);

(Actually, the first send(), where the message size is sent, should probably also be checked, even though a partial send of 4 bytes is unlikely…)

> You should organize your system to not overload the server.

Definitely! Actually, I've already changed my application so that the reading thread just decodes the messages and passes them in a queue to another thread, which then dispatches them. This way, a lengthy operation doesn't stall the read process, which minimizes the risk of a full send buffer on the server side. So far, it seems to work well!

Erik
From: Stephen S. <rad...@gm...> - 2016-05-11 21:00:54
|
On Wed, May 11, 2016 at 5:49 PM, Erik Ronström <eri...@do...> wrote:
>>> Assuming that send(), write() and the like complete the requested
>>> number of bytes is a pretty serious bug.
>>
>> For send(), if there is an error on sending, the socket is closed. In no case does it assume a send worked when it didn't.
>
> Well, it depends on how you look at it. A partial send could certainly be regarded as a failure in this context. Indeed, the consequences of a partial send is probably worse than of a completely failed send, because in the latter case, the client code could just retry sending the message.

The receiver has no way to know how to abort the message and wait for the next one unless the number of bytes specified has been sent. The options in case of a partial send are:

(1) memorize the unsent message and try to complete it next time -- this would require changes to the liblo API;
(2) pad the message with zeros or something (likely a bad solution, since the receiver has no way to know that it received a bad message, not even a parity check, so the zeros could be interpreted as valid message data);
(3) close the stream.

The framing protocol doesn't allow for anything else, as far as I can think of. Solutions (1) and (2) get worse if you consider that the message length itself might have been interrupted, so the client might mistake how many bytes the receiver is waiting for.

The only legal and reliable way I can think of to indicate a reset, if the message can't be completed, is to close the socket. So liblo checks for an error return from send() and closes the socket in that case.

What you discovered is that an inundated receiving socket can produce send() errors. (I think.) However, in a sense this is probably the right thing to do, as it is similar to a DDoS attack, so dropping the connection is probably fine. You should organize your system to not overload the server.

The SLIP protocol does allow recovery, since the packet can be ended before all bytes were received and a new one started.

Steve
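For context, the count-prefix framing under discussion is just a 4-byte big-endian length followed by the packet bytes. A sender-side sketch (assuming a hypothetical send_all() helper that loops until every byte is written) shows why a partial send leaves the receiver with no way to resynchronize:

    #include <arpa/inet.h>   /* htonl */
    #include <stdint.h>

    /* Count-prefix framing: if either send is cut short and the connection
     * is kept open anyway, the receiver's byte count no longer lines up
     * with the stream and there is no in-band marker to resynchronize on,
     * which is why option (3), closing the stream, is the only safe reset. */
    int send_counted(int sock, const char *packet, uint32_t len)
    {
        uint32_t prefix = htonl(len);
        if (send_all(sock, (const char *) &prefix, sizeof(prefix)) < 0)
            return -1;
        if (send_all(sock, packet, len) < 0)
            return -1;
        return 0;
    }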
From: Erik R. <eri...@do...> - 2016-05-11 20:50:10
|
>> Assuming that send(), write() and the like complete the requested >> number of bytes is a pretty serious bug. > > For send(), if there is an error on sending, the socket is closed. In no case does it assume a send > worked when it didn’t. Well, it depends on how you look at it. A partial send could certainly be regarded as a failure in this context. Indeed, the consequences of a partial send is probably worse than of a completely failed send, because in the latter case, the client code could just retry sending the message. > I am more and more convinced that closing the socket is simply the right thing to do. Agreed! Erik |
From: Stephen S. <rad...@gm...> - 2016-05-11 17:47:12
|
On Tue, May 10, 2016 at 3:05 AM, Dan Muresan <da...@gm...> wrote: >> send_data calls send, and it checks the return value for errors (-1). >> However, it does NOT check that the number of bytes written to the send >> buffer matches the size of the data to send. Therefore, if the send buffer > > Assuming that send(), write() and the like complete the requested > number of bytes is a pretty serious bug. For write(), the value is checked. For send(), if there is an error on sending, the socket is closed. In no case does it assume a send worked when it didn't. Here we are investigating the possibility of recovering more gracefully from a failed send(), so this is now to be considered.. Unfortunately the semantics of the poor excuse for a framing protocol used by OSC/TCP (count prefix) doesn't allow to detect and recover from partially sent messages. liblo could memorize the send data (potentially expensive copies made!) and try to finish the message before sending the next one, but due to the byte-level protocol I don't see how it could recover more gracefully. I am more and more convinced that closing the socket is simply the right thing to do. Steve |
From: Erik R. <eri...@do...> - 2016-05-10 09:08:15
|
> Assuming that send(), write() and the like complete the requested number of bytes is a pretty serious bug. Yes. However, not very easily fixed in the current implementation, unfortunately. But if this is only supposed to happen when the write buffer is full, a fail-fast strategy may be good enough for now: closing the socket prevents further data from being written and thereby corrupting the stream on the receiving side. Then the calling code gets an error return value, and can choose to just call send_from again with the same message, which would implicitly create a new socket connection. With a new socket connection, the receiving side will get a new socket_context, in effect discarding any previously sent partial messages. I think… (Still, this strategy is obviously sub-optimal if this situation – send succeeding but writing less than the requested number of bytes – happens often) Erik |
From: Dan M. <da...@gm...> - 2016-05-10 06:05:58
|
> send_data calls send, and it checks the return value for errors (-1). > However, it does NOT check that the number of bytes written to the send > buffer matches the size of the data to send. Therefore, if the send buffer Assuming that send(), write() and the like complete the requested number of bytes is a pretty serious bug. |
From: Erik R. <eri...@do...> - 2016-05-10 05:54:54
|
> Ok, I did some searching and it seems a common thing to consider > EWOULDBLOCK as the same as EAGAIN. I'd be fine with an #ifdef to be > more explicit about it. however, when you "the source of this bug," > does that imply that you've tested this change? Does it work? > > I have my doubts since I have been testing on Linux! You are right: I was referring to this Windows-specific bug I discovered later on – not the original one. However, this EAGAIN/EWOULDBLOCK bug in combination with the send_data problems I wrote about the other day, MAY be enough to explain the behavior I’ve seen on Windows. And with the one fixed, the risk of hitting the other is greatly reduced when sending ordinary-sized messages (I hope). Erik |
From: Stephen S. <rad...@gm...> - 2016-05-09 20:17:42
|
On Mon, May 9, 2016 at 3:11 PM, Erik Ronström <eri...@do...> wrote:
> Sorry for the spamming, I just felt the need to share my progress… please let me know if it becomes annoying!

No worries, I am glad you are working on it. I haven't had time yet this week. Don't worry if I'm slow to answer -- gotta keep up with the paid portion of my workday now and then ;P

You make a good point in a previous email about testing with a non-liblo sender. Good to decouple things, I'll have to try that.

> The source of this bug is in server.c, in lo_server_recv_raw_stream_socket:
>
>     if (bytes_recv <= 0)
>     {
>         if (errno == EAGAIN)    // <--- HERE!!
>             return 0;
>
>         // Error, or socket was closed.
>         // Either way, we remove it from the server.
>         closesocket(s->sockets[isock].fd);
>         lo_server_del_socket(s, isock, s->sockets[isock].fd);
>         return 0;
>     }
>
> On Windows, that should be
>
>     if (WSAGetLastError() == WSAEWOULDBLOCK)
>
> or, as there is already the geterror() abstraction for errno/WSAGetLastError, that should probably be used. Also, EAGAIN could be #defined to the value of WSAEWOULDBLOCK under Windows.

Ok, I did some searching and it seems a common thing to consider EWOULDBLOCK as the same as EAGAIN. I'd be fine with an #ifdef to be more explicit about it. However, when you say "the source of this bug," does that imply that you've tested this change? Does it work? I have my doubts, since I have been testing on Linux!

> This may or may not account for all of the bugs I got when testing, but it's definitely a major bug that should be fixed!

I think there may be 2 or 3 things here.. it seems there are problems in the SLIP code too, so I'll have to give it some attention.

Steve

> 9 maj 2016 kl. 10:46 skrev Erik Ronström <eri...@do...>:
>
> One more strange phenomenon:
>
> On Windows, I tried to send relatively small messages (< 1024 bytes) to the liblo server, no problem. When sending bigger messages, the server closed the socket. Well, so I tried serializing the bigger message and sending it in 1024 byte chunks. The server still closes the socket – and this is NOT a timeout or buffer error: the same thing happens if I put one second of sleep between each call!
>
> Now, the interesting thing is that if I chunk the serialized message into 1023 bytes, all is well!
>
>> There may be some bug related to being at the message buffer size limit, although the same thing occurs if I set it to unlimited.
>
> It seems like you were half correct: the bug seems to be related to being at the current buffer size, but not at the buffer size *limit*.
>
> Erik
>
> 9 maj 2016 kl. 00:13 skrev Erik Ronström <eri...@do...>:
>
>> Diving into this, it seems that the message buffer is getting increased in size until it hits the maximum. The reason for this is not yet clear to me, but I suspect something is getting desynchronized in the stream due to send failure, and it's reading the wrong bytes as the message length.
>
> I've been investigating this further, and I can confirm that this is the case - at least in my example setup.
>
> send_data calls send, and it checks the return value for errors (-1). However, it does NOT check that the number of bytes written to the send buffer matches the size of the data to send. Therefore, if the send buffer gets full (which obviously happens when sending "too fast"), only part of the buffer gets sent. send_data then returns happily, discarding the part of the message that was never sent. When send_data is called the next time, it will either fail with EAGAIN (if the send buffer is still full), or send the beginning of a new message, thereby corrupting the stream on the receiving side.
>
> I cannot see an easy fix for this: both the message serialization and sending happens inside send_data, so even if send_data was to return with an error code in this case, the amount of data left to write would be "forgotten". So the responsibility of calling send again with the remaining data would therefore best be placed on send_data, but that would essentially make send_data blocking.
>
> HOWEVER, I'm actually not sure that this was my original problem, since at that point I was not using liblo to send the messages, I only used it as a server. So there may be another bug lurking around, related or unrelated.
>
> Erik
From: Erik R. <eri...@do...> - 2016-05-09 18:12:08
|
Sorry for the spamming, I just felt the need to share my progress… please let me know if it becomes annoying!

The source of this bug is in server.c, in lo_server_recv_raw_stream_socket:

    if (bytes_recv <= 0)
    {
        if (errno == EAGAIN)    // <--- HERE!!
            return 0;

        // Error, or socket was closed.
        // Either way, we remove it from the server.
        closesocket(s->sockets[isock].fd);
        lo_server_del_socket(s, isock, s->sockets[isock].fd);
        return 0;
    }

On Windows, that should be

    if (WSAGetLastError() == WSAEWOULDBLOCK)

or, as there is already the geterror() abstraction for errno/WSAGetLastError, that should probably be used. Also, EAGAIN could be #defined to the value of WSAEWOULDBLOCK under Windows.

This may or may not account for all of the bugs I got when testing, but it's definitely a major bug that should be fixed!

Erik

> 9 maj 2016 kl. 10:46 skrev Erik Ronström <eri...@do...>:
>
> One more strange phenomenon:
>
> On Windows, I tried to send relatively small messages (< 1024 bytes) to the liblo server, no problem. When sending bigger messages, the server closed the socket. Well, so I tried serializing the bigger message and sending it in 1024 byte chunks. The server still closes the socket – and this is NOT a timeout or buffer error: the same thing happens if I put one second of sleep between each call!
>
> Now, the interesting thing is that if I chunk the serialized message into 1023 bytes, all is well!
>
>> There may be some bug related to being at the message buffer size limit, although the same thing occurs if I set it to unlimited.
>
> It seems like you were half correct: the bug seems to be related to being at the current buffer size, but not at the buffer size *limit*.
>
> Erik
>
>> 9 maj 2016 kl. 00:13 skrev Erik Ronström <eri...@do...>:
>>
>>> Diving into this, it seems that the message buffer is getting increased in size until it hits the maximum. The reason for this is not yet clear to me, but I suspect something is getting desynchronized in the stream due to send failure, and it's reading the wrong bytes as the message length.
>>
>> I've been investigating this further, and I can confirm that this is the case - at least in my example setup.
>>
>> send_data calls send, and it checks the return value for errors (-1). However, it does NOT check that the number of bytes written to the send buffer matches the size of the data to send. Therefore, if the send buffer gets full (which obviously happens when sending "too fast"), only part of the buffer gets sent. send_data then returns happily, discarding the part of the message that was never sent. When send_data is called the next time, it will either fail with EAGAIN (if the send buffer is still full), or send the beginning of a new message, thereby corrupting the stream on the receiving side.
>>
>> I cannot see an easy fix for this: both the message serialization and sending happens inside send_data, so even if send_data was to return with an error code in this case, the amount of data left to write would be "forgotten". So the responsibility of calling send again with the remaining data would therefore best be placed on send_data, but that would essentially make send_data blocking.
>>
>> HOWEVER, I'm actually not sure that this was my original problem, since at that point I was not using liblo to send the messages, I only used it as a server. So there may be another bug lurking around, related or unrelated.
>>
>> Erik
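A sketch of the shim Erik suggests, reusing the geterror() wrapper he mentions (the macro name is invented for illustration, and this exact change has not been tested here):

    #ifdef _WIN32
    #define ERR_WOULDBLOCK(e) ((e) == WSAEWOULDBLOCK)
    #else
    #define ERR_WOULDBLOCK(e) ((e) == EAGAIN || (e) == EWOULDBLOCK)
    #endif

        if (bytes_recv <= 0)
        {
            if (ERR_WOULDBLOCK(geterror()))
                return 0;    /* no data available yet: not an error */

            // Error, or socket was closed.
            // Either way, we remove it from the server.
            closesocket(s->sockets[isock].fd);
            lo_server_del_socket(s, isock, s->sockets[isock].fd);
            return 0;
        }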
From: Erik R. <eri...@do...> - 2016-05-09 08:46:20
|
One more strange phenomenon: On Windows, I tried to send relatively small messages (< 1024 bytes) to the liblo server, no problem. When sending bigger messages, the server closed the socket. Well, so I tried serializing the bigger message and sending it in 1024 byte chunks. The server still closes the socket – and this is NOT a timeout or buffer error: the same thing happens if I put one second of sleep between each call! Now, the interesting thing is that if I chunk the serialized message into 1023 bytes, all is well! > There may be some bug related to being at the message buffer size > limit, although the same thing occurs if I set it to unlimited. It seems like you were half correct: the bug seems to be related to being at the current buffer size, but not at the buffer size *limit*. Erik > 9 maj 2016 kl. 00:13 skrev Erik Ronström <eri...@do...>: > >> Diving into this, it seems that the message buffer is getting >> increased in size until it hits the maximum. The reason for this is >> not yet clear to me, but I suspect something is getting desynchronized >> in the stream due to send failure, and it's reading the wrong bytes as >> the message length. > > I’ve been investigating this further, and I can confirm that this is the case - at least in my example setup. > > send_data calls send, and it checks the return value for errors (-1). However, it does NOT check that the number of bytes written to the send buffer matches the size of the data to send. Therefore, if the send buffer gets full (which obviously happens when sending ”too fast”), only part of the buffer gets sent. send_data then returns happily, discarding the part of the message that was never sent. When send_data is called the next time, it will either fail with EAGAIN (if the send buffer is still full), or send the beginning of a new message, thereby corrupting the stream on the receiving side. > > I cannot see an easy fix for this: both the message serialization and sending happens inside send_data, so even if send_data was to return with an error code in this case, the amount of data left to write would be ”forgotten”. So the responsibility of calling send again with the remaining data would therefore best be placed on send_data, but that would essentially make send_data blocking. > > HOWEVER, I’m actually not sure that this was my original problem, since at that point I was not using liblo to send the messages, I only used it as a server. So there may be another bug luring around, related or unrelated. > > Erik > > > ------------------------------------------------------------------------------ > Find and fix application performance issues faster with Applications Manager > Applications Manager provides deep performance insights into multiple tiers of > your business applications. It resolves application problems quickly and > reduces your MTTR. Get your free trial! > https://ad.doubleclick.net/ddm/clk/302982198;130105516;z > _______________________________________________ > liblo-devel mailing list > lib...@li... > https://lists.sourceforge.net/lists/listinfo/liblo-devel |
From: Erik R. <eri...@do...> - 2016-05-08 22:13:42
|
> Diving into this, it seems that the message buffer is getting
> increased in size until it hits the maximum. The reason for this is
> not yet clear to me, but I suspect something is getting desynchronized
> in the stream due to send failure, and it's reading the wrong bytes as
> the message length.

I've been investigating this further, and I can confirm that this is the case - at least in my example setup.

send_data calls send, and it checks the return value for errors (-1). However, it does NOT check that the number of bytes written to the send buffer matches the size of the data to send. Therefore, if the send buffer gets full (which obviously happens when sending "too fast"), only part of the buffer gets sent. send_data then returns happily, discarding the part of the message that was never sent. When send_data is called the next time, it will either fail with EAGAIN (if the send buffer is still full), or send the beginning of a new message, thereby corrupting the stream on the receiving side.

I cannot see an easy fix for this: both the message serialization and the sending happen inside send_data, so even if send_data were to return with an error code in this case, the amount of data left to write would be "forgotten". The responsibility of calling send again with the remaining data would therefore best be placed on send_data, but that would essentially make send_data blocking.

HOWEVER, I'm actually not sure that this was my original problem, since at that point I was not using liblo to send the messages, I only used it as a server. So there may be another bug lurking around, related or unrelated.

Erik
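The usual way around the partial-send behaviour Erik describes is a small helper that keeps calling send() until the whole buffer has been handed to the kernel. This is only a sketch of that idea for a POSIX-style socket; it is not how liblo's send_data() currently works, and a non-blocking socket still needs a policy for EAGAIN:

    #include <errno.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    /* Loop until all 'len' bytes have been written, retrying after short
     * writes.  Returns 0 on success, -1 on error (the caller can then
     * decide to close the socket, as liblo does today). */
    static int send_all(int sock, const char *data, size_t len)
    {
        size_t sent = 0;
        ssize_t ret;

        while (sent < len) {
            ret = send(sock, data + sent, len - sent, MSG_NOSIGNAL);
            if (ret < 0) {
                if (errno == EINTR)
                    continue;    /* interrupted by a signal: just retry */
                return -1;       /* EAGAIN on a non-blocking socket, or a real error */
            }
            sent += (size_t) ret;
        }
        return 0;
    }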
From: Stephen S. <rad...@gm...> - 2016-05-05 16:38:13
|
On Thu, May 5, 2016 at 1:26 PM, Erik Ronström <eri...@do...> wrote: > Hi Steve, > > Thanks for the digging and the support – I was starting to go crazy over > this! :) > > About the patch: > > it should return an error to allow you to try > again after reading the socket. A very simple patch does this > > > Who is ”you” referring to here? Does the patch make liblo automatically > trying to resend, or does it return an error to the client application which > is then supposed to call the send function again? I mean that it causes lo_send() (et al.) to return an error, without closing the socket. > Unfortunately this only seems to alleviate the problem, not fix it. > Indeed, progress now continues for a few hundred more messages, but at > some point the server stops receiving messages and is just waiting. > > > I’m curios, is it really waiting, or is it busy-looping? I’ve had problems > before on Mac when receiving really large messages (a few 100kB, I removed > the message size limit for incoming TCP), and the server got into a > never-ending loop (see thread on this list from April 8th 2016). Also note > that there is a bug in lo_server_recv_raw_stream_socket that can hang the > server (see same thread); however I fixed that bug in my fork so that cannot > be causing the original issue. It's blocked on poll() in lo_server_wait(). However, I'm not sure why. Unfortunately I'm out of time to keep investigating today, but I'll try to get back to it. There may be some bug related to being at the message buffer size limit, although the same thing occurs if I set it to unlimited. Steve |
From: Erik R. <eri...@do...> - 2016-05-05 16:26:21
|
Hi Steve, Thanks for the digging and the support – I was starting to go crazy over this! :) About the patch: > it should return an error to allow you to try > again after reading the socket. A very simple patch does this Who is ”you” referring to here? Does the patch make liblo automatically trying to resend, or does it return an error to the client application which is then supposed to call the send function again? > Unfortunately this only seems to alleviate the problem, not fix it. > Indeed, progress now continues for a few hundred more messages, but at > some point the server stops receiving messages and is just waiting. I’m curios, is it really waiting, or is it busy-looping? I’ve had problems before on Mac when receiving really large messages (a few 100kB, I removed the message size limit for incoming TCP), and the server got into a never-ending loop (see thread on this list from April 8th 2016). Also note that there is a bug in lo_server_recv_raw_stream_socket that can hang the server (see same thread); however I fixed that bug in my fork so that cannot be causing the original issue. Erik |
From: Stephen S. <rad...@gm...> - 2016-05-05 15:51:07
|
Hi Erik,

On Wed, May 4, 2016 at 8:20 PM, Erik Ronström <eri...@do...> wrote:
> Hi,
>
> I've happily been using liblo on Mac for some time, for an audio server to which you connect with TCP. Now when testing our application on Windows, I've had some strange problems where the client application suddenly gets disconnected from the audio server (using liblo), with the error code 10053 (aka WSAECONNABORTED, "Software caused connection abort"). The problem occurs intermittently, and not very often, which has made it very hard to debug. But now I think I am closer to an explanation.
>
> DISCLAIMER: Speculation ahead!
>
> The problems seem to arise when sending and receiving concurrently. You might think that read/write concurrency is happening all the time when using a liblo server, so if there are problems with that, you would run into them very frequently. But actually, the most common usage of an OSC server is to receive messages, and reply to them. This means that the server first reads an entire message, then processes it, and then responds and writes back to the socket, and at that time, there is no reading going on. And since socket writes are so fast, the risk of a clash with the next incoming message is minimal.
>
> ... except ...
>
> Receiving and sending very large messages takes more time. And on Windows (at least on my machine), sending big blobs is slow enough to cause problems.
>
> Take the liblo example file example_tcp_echo_server.c. It works fine! Change the echo message to a 30 kB blob (32 kB is the hardcoded maximum message size). Still works fine. Notice however, that each of the processes only sends the echo *as a response* to a received message, and therefore, there are no concurrent reads and writes inside any of the processes.
>
> Instead, start only one example_tcp_echo_server, connect another client to it, and try the following:
> - Send one big message - OK
> - Send ten big messages with a few milliseconds of sleep between each one - OK
> - Send ten small messages in a tight loop - OK
> - Send ten big messages in a tight loop - FAIL!
>
> Suddenly, the connection is broken! My interpretation is that a new message is received while the echo is being sent, and this causes some kind of problem inside liblo.
>
> Now, this may not even be a bug in liblo, but just the way the underlying Windows sockets work - I know far too little about network programming to tell. Sounds unlikely to me though, but if that really is the case, then liblo should probably handle it gracefully anyway, so some kind of action is needed!
>
> Unless I am completely wrong about the whole issue, and have missed something obvious!
>
> Any input greatly appreciated! I'll be happy to provide code examples if anyone is interested in testing or debugging this!

It seems that you are generally correct here. I was able to reproduce the problem by modifying example_tcp_echo_server:

* remove the sleep
* double the call to lo_send_message_from()
* increase the message count threshold before it quits

Now twice as many messages are sent as are received. When the lo_server gets overloaded with large incoming messages, it seems that send() stops working. I can report that the same problem did not occur for small messages. In particular for TCP, the failure happens when the 4 bytes for the message length are not successfully sent, and liblo then closes the connection. However, more correctly, it should return an error to allow you to try again after reading the socket.

A very simple patch does this:

--------------------
diff --git a/src/send.c b/src/send.c
index 4613601..6a27e8a 100644
--- a/src/send.c
+++ b/src/send.c
@@ -531,7 +531,7 @@ static int send_data(lo_address a, lo_server from, char *data,
     }

     if (ret == -1) {
-        if (a->protocol == LO_TCP) {
+        if (a->protocol == LO_TCP && geterror()!=EAGAIN) {
             if (from)
                 lo_server_del_socket(from, -1, a->socket);
             closesocket(a->socket);
--------------------

Unfortunately this only seems to alleviate the problem, not fix it. Indeed, progress now continues for a few hundred more messages, but at some point the server stops receiving messages and is just waiting.

Diving into this, it seems that the message buffer is getting increased in size until it hits the maximum. The reason for this is not yet clear to me, but I suspect something is getting desynchronized in the stream due to send failure, and it's reading the wrong bytes as the message length.

In principle SLIP mode should be more robust to this, but that seems to have its own bugs..

Steve
From: Erik R. <eri...@do...> - 2016-05-04 23:20:49
|
Hi,

I've happily been using liblo on Mac for some time, for an audio server to which you connect with TCP. Now when testing our application on Windows, I've had some strange problems where the client application suddenly gets disconnected from the audio server (using liblo), with the error code 10053 (aka WSAECONNABORTED, "Software caused connection abort"). The problem occurs intermittently, and not very often, which has made it very hard to debug. But now I think I am closer to an explanation.

DISCLAIMER: Speculation ahead!

The problems seem to arise when sending and receiving concurrently. You might think that read/write concurrency is happening all the time when using a liblo server, so if there were problems with that, you would run into them very frequently. But actually, the most common usage of an OSC server is to receive messages and reply to them. This means that the server first reads an entire message, then processes it, and then responds and writes back to the socket, and at that time, there is no reading going on. And since socket writes are so fast, the risk of a clash with the next incoming message is minimal.

... except ...

Receiving and sending very large messages takes more time. And on Windows (at least on my machine), sending big blobs is slow enough to cause problems.

Take the liblo example file example_tcp_echo_server.c. It works fine! Change the echo message to a 30 kB blob (32 kB is the hardcoded maximum message size). Still works fine. Notice, however, that each of the processes only sends the echo *as a response* to a received message, and therefore there are no concurrent reads and writes inside any of the processes.

Instead, start only one example_tcp_echo_server, connect another client to it, and try the following:

- Send one big message - OK
- Send ten big messages with a few milliseconds of sleep between each one - OK
- Send ten small messages in a tight loop - OK
- Send ten big messages in a tight loop - FAIL!

Suddenly, the connection is broken! My interpretation is that a new message is received while the echo is being sent, and this causes some kind of problem inside liblo.

Now, this may not even be a bug in liblo, but just the way the underlying Windows sockets work - I know far too little about network programming to tell. It sounds unlikely to me, though, but if that really is the case, then liblo should probably handle it gracefully anyway, so some kind of action is needed!

Unless I am completely wrong about the whole issue and have missed something obvious!

Any input greatly appreciated! I'll be happy to provide code examples if anyone is interested in testing or debugging this!

Erik
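A rough reproduction sketch of the failing case, as a standalone test client; the port number, path, blob size and loop count are illustrative, not taken from Erik's code:

    #include <stdio.h>
    #include <string.h>
    #include <lo/lo.h>

    /* Send ten ~30 kB blobs over TCP in a tight loop, the pattern
     * reported to break the connection. */
    int main(void)
    {
        static char big[30000];
        lo_address a;
        lo_blob blob;
        int i;

        memset(big, 'x', sizeof(big));
        a = lo_address_new_with_proto(LO_TCP, "localhost", "7770");
        blob = lo_blob_new(sizeof(big), big);

        for (i = 0; i < 10; i++) {
            if (lo_send(a, "/echo", "b", blob) < 0)
                fprintf(stderr, "send %d failed: %s\n", i, lo_address_errstr(a));
        }

        lo_blob_free(blob);
        lo_address_free(a);
        return 0;
    }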
From: John E. <jo...@ti...> - 2016-05-02 10:46:23
|
I just came across a compiler issue when building with MSVC. It's in the function alloc_server_thread() (in 'src/server_thread.c'):-

static lo_server_thread alloc_server_thread(lo_server s)
{
    if (!s)
        return NULL;
    lo_server_thread st = (lo_server_thread)    // <--- PROBLEM IS AT THIS LINE !!!
        malloc(sizeof(struct _lo_server_thread));

    // rest of function. . .
}

The problem arises because 'server_thread.c' is a 'C' file (as opposed to C++). 'C' does not allow variables to get declared half way down a function (they need to get declared at the top of the function). Many compilers ignore that nowadays but MSVC is still quite strict about stuff like that. So making this small change enables it to compile correctly:-

static lo_server_thread alloc_server_thread(lo_server s)
{
    lo_server_thread st;    // <--- MOVE THE DECLARATION TO HERE

    if (!s)
        return NULL;
    st = (lo_server_thread) malloc(sizeof(struct _lo_server_thread));

    // rest of function. . .
}

Just thought I'd pass this upstream. Best regards,

John