[WriteLog] "double numbers" and out-of-sequence problems (long)

Fri, 26 Jan 2001 02:24:18 -0500

> (My shields are now up. Fire away)
>
> Wayne, W5XD

OK, since you made an explicit invitation...

First, let me say that I am very grateful to you for all of the work you
have done on WriteLog, and especially for the changes you have made in
response to my requests. I also believe that it is entirely your prerogative
what to work on in WriteLog. For that reason, I generally stay out of
reflector discussions about what you should or should not do.

With that said, in this particular instance I strongly disagree with your
position. There is a complex technical argument I can make, but I'll just
cut to the chase and put it this way: 1) WriteLog users are complaining
frequently and vocally about the serial number "problem", and 2) I have
never seen similar complaints about, nor experienced similar problems with,
other contest programs that support SO2R and networked PCs. My conclusion is
that you ought to fix this problem.

I don't think it helps to ask whether anyone has had QSOs disqualified
because of duplicate or out-of-sequence serial numbers. Changes made to
WriteLog over the past year have dramatically increased the serial number
anomalies, and we don't yet know how all of the contest sponsors are going
to react. ARRL says it's OK for SS, but what about other sponsors? I landed
somewhere in the top ten in claimed scores for the 2000 CQ WPX CW contest --
I'm going to be really unhappy if I lose that spot because I had hundreds of
duplicate and out-of-sequence serial numbers in my log! Further, many S&P
ops listen to serial numbers as they are being handed out and assume that
theirs is the next one in the sequence. If a blast of static wipes out the
last digit, many ops would just log the next consecutive serial number
instead of asking for a fill. Granted, this is not great operating practice,
but WriteLog's serial number problem increases the probability that such
QSOs will be disqualified.

As a professional software designer and engineer for over 25 years, I agree
completely with NW0L that for all practical purposes the problem can be
fixed. I would go further and say that the primary cause of the problem is
WriteLog's serial number assignment algorithm, which computes the next
serial number as the number of QSOs in the log, plus one. Simply put, this
is a bug.

Below, I have included a rather lengthy text I sent you after ARRL SS CW on
how to solve the problem for SO2R (minus the example C code). I apologize
for sending such a large message to the reflector, but I figure there might
be some interest.

Wayne,

OK, I've given it some thought, and here's what I think can be accomplished
for multiple radios on one PC (in order of importance):

1. Logged serial number always matches what is sent
2. No duplicate serial numbers
3. No missing serial numbers
4. Serial numbers not wildly different from number of QSOs in log
5. Out-of-sequence numbers minimized for SSB, "eliminated" for CW and RTTY

Before I describe my scheme, a word about networked PCs is in order. My
proposal should work for multiple radios on one PC and, in theory, for
multiple PCs in a network. However, as you noted in one of your posts to the
reflector, implementation for networked PCs requires more effort, and
problems created by network outages are extremely difficult or impossible to
solve. The limited amount of thought I have given to this issue leads me to
believe that the only reasonable solution for networked PCs is to assign
serial numbers by band or by radio.

********************************************

Premise: Out-of-sequence serial numbers result from the asynchronous nature
of logging QSOs from multiple radios, and simply cannot be avoided while
maintaining the all-important rule that the logged serial number *must*
match the transmitted serial number. The best that can be done is to
minimize the occurrence of out-of-sequence serial numbers.

However, duplicate and missing serial numbers can be eliminated. Both result
from calculating the next serial number as the number of QSOs in the log
plus one. This computation guarantees that there will be duplicate and
missing numbers, and once they are in the log their presence guarantees that
more will come. Although a system to reserve serial numbers was introduced,
it's not complete and still suffers from the fundamental problem of how the
next serial number is computed. The entire issue can be resolved by
selecting the next serial number from a queue of available serial numbers,
and allowing an unused serial number to be returned to the queue.
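
To make the failure concrete, here is a deliberately simplified model of
"next serial number = QSOs in log + 1". It ignores the existing reservation
code, and the names are hypothetical, not taken from WriteLog:

```c
/* Simplified model: the next serial number is derived from the log size. */
static int qso_count = 0;               /* QSOs currently in the log */

/* Number a radio would send right now. */
int next_serial_by_count(void) { return qso_count + 1; }

/* Logging a QSO only bumps the count. */
void log_qso(void) { qso_count++; }
```

If both radios ask for a number before either QSO is logged, both send 1.
Once both QSOs are logged the count is 2, so the next number handed out is 3:
serial number 1 is duplicated and 2 is never sent.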

The queue contains a sorted list of serial numbers. The number of entries in
the queue is the same as the number of configured radios. At first I thought
it would be appropriate to do this with a linked list, but that's too much
work. A pair of arrays will do just as well (one for the serial number, one
for the state.) Assigning a serial number consists of finding one in the
list that is not in use, grabbing it, and flagging that it is in use.
Returning a serial number consists of finding the matching entry and marking
that it is not in use. Logging a QSO requires the entry with the matching
serial number to be deleted. This is accomplished simply by moving all
subsequent entries up one position (i.e., squeezing the deleted entry out of
the queue while keeping the queue sorted), then inserting the next serial
number at the end of the queue (this is always the number that previously
occupied this position, plus one.)
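
The three operations described above can be sketched in C roughly as follows.
This is not the code from the original attachment; the function names and the
two-radio NUM_RADIOS setting are assumptions made for illustration:

```c
#define NUM_RADIOS 2                 /* assumption: two configured radios */

static int serial[NUM_RADIOS];       /* serial numbers, sorted ascending */
static int in_use[NUM_RADIOS];       /* 1 = assigned to an entry window */

/* Fill the queue with first, first+1, ... and mark every entry free. */
void queue_init(int first)
{
    for (int i = 0; i < NUM_RADIOS; i++) {
        serial[i] = first + i;
        in_use[i] = 0;
    }
}

/* Assign: grab the lowest number not in use; returns -1 if none is free. */
int queue_assign(void)
{
    for (int i = 0; i < NUM_RADIOS; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            return serial[i];
        }
    }
    return -1;
}

/* Return: mark the matching entry as not in use. */
void queue_release(int n)
{
    for (int i = 0; i < NUM_RADIOS; i++)
        if (serial[i] == n)
            in_use[i] = 0;
}

/* Log: squeeze the matching entry out, then append (old last entry + 1). */
void queue_log(int n)
{
    for (int i = 0; i < NUM_RADIOS; i++) {
        if (serial[i] != n)
            continue;
        int next = serial[NUM_RADIOS - 1] + 1;
        for (int j = i; j < NUM_RADIOS - 1; j++) {
            serial[j] = serial[j + 1];
            in_use[j] = in_use[j + 1];
        }
        serial[NUM_RADIOS - 1] = next;
        in_use[NUM_RADIOS - 1] = 0;
        return;
    }
}
```

Because the queue stays sorted and queue_assign() scans from the front, a
returned number is the first one handed out again, which keeps gaps from
forming.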

The major implementation problem with this approach is that the contents of
the queue must survive shutdowns, restarts and crashes, and therefore must
be stored somewhere. The simplest and most bullet-proof approach is to store
it invisibly in each QSO record. That way, it can be easily recovered from
the last record whenever the log is opened. Since the maximum number of
radios on one PC is four, this is not an excessive amount of data to store
in each QSO record. Another alternative would be to store the queue in each
transaction journal record, but not in each log record. When the transaction
journal is flushed, a single copy of the queue taken from the last record
could be written to the log file (I'm guessing you have some sort of data
block in there before or after the log records.)
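
As a rough sketch of the first option, the queue could ride along in each
QSO record and be recovered from the last one when the log is opened. The
record layout below is hypothetical and is not WriteLog's actual file format.
Clearing the in-use flags on restore is also an assumption: after a restart
no entry window is holding a number.

```c
#include <string.h>

#define MAX_RADIOS 4   /* the maximum number of radios on one PC */

struct serial_queue {
    int serial[MAX_RADIOS];            /* sorted, ascending */
    unsigned char in_use[MAX_RADIOS];  /* 1 = held by an entry window */
};

struct qso_record {                    /* hypothetical layout */
    char call[16];
    int sent_serial;
    struct serial_queue queue;         /* snapshot taken when the QSO was logged */
};

/* Rebuild the queue when a log is opened.
 * Returns 1 if it was recovered from the last record, 0 if the log was
 * empty and a fresh queue starting at 1 was created instead. */
int queue_restore(struct serial_queue *q,
                  const struct qso_record *records, int n_records)
{
    int restored = n_records > 0;

    if (restored)
        memcpy(q, &records[n_records - 1].queue, sizeof *q);

    for (int i = 0; i < MAX_RADIOS; i++) {
        if (!restored)
            q->serial[i] = i + 1;
        q->in_use[i] = 0;              /* assumption: nothing is held after a restart */
    }
    return restored;
}
```

With four radios this adds a few dozen bytes per record at most, which is
the "not an excessive amount of data" point made above.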

Note that this scheme eliminates the need for the existing "reservation"
code and its limitations.

The attached text file contains some simple code that illustrates the
algorithms. Please pardon my very rusty C -- it's been years since I coded
anything important. Also, the code could be a lot tighter, but I wanted the
algorithms to stand out. Check me for off-by-one bugs, too.

The remaining question is when to select a serial number from the list and
when to put it back. Since we have eliminated duplicates and gaps, the only
considerations are to 1) ensure that the logged serial number matches what
was sent on the air, and 2) minimize out-of-sequence serial numbers as much
as possible.

The most conservative approach, which you tried in 10.19, is to assign the
serial number immediately upon entry of the callsign. The serial number
stays assigned to the entry window until the QSO is either logged or wiped.
Unfortunately, merely typing a callsign on the second radio to check for a
dupe then grabs a serial number, which causes a high frequency of
out-of-sequence serial numbers.

10.21 improved the situation by narrowing the circumstances under which
serial numbers are assigned. For the most part, the changes reduce the
frequency of out-of-sequence serial numbers. In particular, assigning the
serial number upon entry only if the window has transmit focus allows dupe
checking on the second radio without getting a serial number. That
eliminates a large number of out-of-sequence situations.

However, 10.21 also assigns the serial number when any function key is used
and the target window is not blank. I assume that this falls out of the code
that assigns the serial number when the transmit focus is moved to a
non-blank window. Assigning the serial number when the transmit focus moves
to a non-blank window is generally a good idea, but doing it for every
function key greatly increases the frequency of out-of-sequence serial
numbers. That's because it is quite common to be calling CQ on one radio,
while repeatedly dropping in one's call with F4 on the other radio. After
the first F4, a serial number is assigned. Many QSOs can then take place on
the first radio before a QSO is completed on the second radio. When the QSO
is finally completed on the second radio, the number is out-of-sequence.

(Bear with me now): This leads to the inevitable conclusion that the correct
time to assign the serial number is when it is actually transmitted. That
is, forget all the other 10.21 rules and assign the serial number *only*
when the % macro is executed on a non-blank window. This reduces the
out-of-sequence problem to the absolute minimum because there is a high
probability that when % is executed, the QSO is going to be logged. The
only time an out-of-sequence condition can occur is when the QSO is wiped
after the serial number has been sent (because the serial number will be
reused.) Actually, it would not be unreasonable to throw the serial number
away in this particular case. Of course, this reintroduces missing serial
numbers, but that may be preferable to seeing a serial number that is way
out of sequence. I would contend that this circumstance is infrequent enough
(getting to the point of sending the exchange but not being able to complete
the QSO), that the missing serial number might not even be noticed. It may
be less noticeable than an out-of-sequence serial number.

Now the problem: This rule works beautifully for CW and RTTY, where the %
macro is always used to send the serial number. It won't work for SSB mode
because we can't be sure that the op will use phonetic function keys. In
this case, the 10.21 algorithm is about the only reasonable thing that can
be done.

My recommendation is as follows:

1. Implement the serial number queue to eliminate duplicate and missing
serial numbers.
2. For CW and RTTY, assign the serial number only when the % macro is
executed or the QSO is logged.
3. For SSB, assign the serial number according to the 10.21 rules.
4. Add an ini parameter that lets the user explicitly select the rules (so
CW and RTTY ops can do it a la 10.21, or SSB ops using phonetics can select
the % method.)
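
One way to picture items 2 through 4 is a single decision function. This is
only a sketch: the enum names and the idea of an "auto" ini value are
assumptions, and the 10.21 branch reflects the behavior as summarized in this
message, not WriteLog's source.

```c
enum mode  { MODE_CW, MODE_RTTY, MODE_SSB };
enum rule  { RULE_AUTO, RULE_ON_SEND, RULE_10_21 };  /* hypothetical ini values */
enum event {
    EV_CALL_ENTERED_TX_FOCUS,  /* callsign typed in the window with transmit focus */
    EV_TX_FOCUS_MOVED,         /* transmit focus moved to this window */
    EV_FUNCTION_KEY,           /* any function key other than the % message */
    EV_SERIAL_MACRO,           /* a message containing the % macro is sent */
    EV_QSO_LOGGED
};

/* Returns 1 if a serial number should be assigned to the window now. */
int should_assign(enum mode m, enum rule r, enum event e, int window_blank)
{
    if (r == RULE_AUTO)        /* default: % method for CW/RTTY, 10.21 for SSB */
        r = (m == MODE_SSB) ? RULE_10_21 : RULE_ON_SEND;

    if (e == EV_QSO_LOGGED)
        return 1;              /* a logged QSO always needs a number */
    if (window_blank)
        return 0;              /* never assign to a blank window */
    if (r == RULE_ON_SEND)
        return e == EV_SERIAL_MACRO;
    return 1;                  /* 10.21: any of these events on a non-blank window */
}
```

Under the % method, repeated F4 presses on the second radio leave the queue
untouched; only sending the exchange or logging the QSO takes a number.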

Under the % method, I slightly lean towards throwing away unused serial
numbers instead of returning them to the queue. Again, this is because the
frequency of assigned but unused serial numbers is so low under this scheme
that a missing serial number will be very rare. There's something elegant
about throwing away a number that you sent unsuccessfully. However, if the
number is released for any other reason (e.g., wipe or 60-second timeout
before sending), then it should be returned to the queue.
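
The release rule in the paragraph above boils down to one test: was the
number already sent on the air? A small sketch, with hypothetical names:

```c
enum release_action { ACT_RETURN_TO_QUEUE, ACT_DISCARD };

/* Decide what to do with an assigned serial number whose QSO was not logged.
 * serial_was_sent is nonzero if the % macro already transmitted it. */
enum release_action on_release(int serial_was_sent)
{
    /* A number that went out on the air is thrown away (leaving a gap);
     * one that was never sent goes back to the queue for reuse. */
    return serial_was_sent ? ACT_DISCARD : ACT_RETURN_TO_QUEUE;
}
```

In queue terms, discarding would consume the entry the same way logging does
(squeeze it out and append the next number), just without writing a QSO
record.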

I said earlier that serial numbers will not be wildly different from the
number of QSOs in the log. I believe that this can only happen if an
assigned serial number sits around for a long time in one of the windows
while a lot of QSOs are logged in the other window. I don't think this is
likely to happen under the % method, and I believe the existing 60-second
rule prevents it from happening anyway.

Finally, I think the circumstances under which 10.21 unlocks a serial number
(and returns it to the queue in my scheme) are all OK and should be
preserved.

I realize the problem of storing the queue is not trivial, but the ability
to eliminate duplicate and missing serial numbers is worth it. As for the
assignment rules, my feeling is that just because we can't optimize serial
numbers for SSB, we shouldn't lose the opportunity to optimize them for CW
and RTTY.

Let me know what you think.

73, Dick WC1M
