[TowerTalk] Analyzer and AIM comments to Jim Lux

Jim Lux jimlux at earthlink.net
Fri Jun 6 11:50:37 EDT 2008


Interesting, thanks for the effort.
Comments interspersed...

-----Original Message-----
>From: Jay Terleski <jayt at arraysolutions.com>
>
>Hi Jim,  
>
>I ran your comments by Bob Clunn, the designer of the AIM and Power Aim
>Analyzers and he has responded below.
>
>Jay
>
>Bob Clunn's comments:
>
>
>It's interesting to see a detailed discussion of the issues concerning
>calibration and measurement.
>
>Statements like "works great" or "doesn't work" are subjective and depend on
>what the user is trying to accomplish and what kind of performance he expects. On
>eham.com, reviews can be found ranging from 1 to 5 for the same product; the
>perception of quality depends on the circumstances.

I fully agree about subjective evaluations of "usefulness" and the tendency of eham reviews to cluster at the ends of the scale. However, here we are talking about quantitative measures of accuracy, not a subjective evaluation of "works for me" or "doesn't".

>
>"It's one thing to calibrate out a few degrees of phase or a few tenths of a
>dB shift to compensate for the non-ideal directivity in a probe or the
>loss/mismatch in a connector or test port cable.  It's entirely another to
>calibrate out the response of a reactive device with S11 and S21 varying by
>orders of magnitude over the passband."
>
> 
>
>The AIM uses a procedure that compensates for non-ideal parameters over a
>dynamic range of about 60+ dB.  This is illustrated in the manual in the
>section about Custom Calibration starting on page 29. Several graphs show
>the effectiveness of the cal procedure in canceling the effect of a complex
>transmission network.

Indeed, and I have no quibble with the calibration technique used, nor with the fact that it can potentially cancel over a 60 dB range.  My concern is with the effect on the measurement uncertainty of the resulting "calibrated measurement" (i.e. the number you get AFTER subtracting/dividing out the network), because, typically, this is not discussed.

That is, there are no "error bars" on the measurement display, and, because of the calibration process, the size of the error bars is NOT the same across the display; in fact, the bars could be larger than the height of the graph.

Most people, when presented with a spec like "the analyzer is accurate to 5%", assume that the uncertainty of the displayed calibrated measurements is about that value everywhere.  There's also often an unwritten assumption that the spec is "5% of full scale", which, of course, is 20% of a quarter-scale measurement.

The problem is aggravated when making measurements with things like spectrum analyzers that display the results on a dB scale, rather than a linear one (such as is typical for a voltmeter), and it is even worse if the log function is being done in software from a basically linear measurement. A system using a 12-bit A/D to make a linear measurement is only accurate to 1 part in 4096 at full scale (admittedly MUCH better than an analog meter). For a full-scale signal, the uncertainty is small (0.025%, or 0.002 dB). However, that 1 LSB uncertainty is about -72 dBFS.  If you start looking at signals, say, 40 dB down from saturation/full scale (that is, at 0.01 of full-scale voltage), the uncertainty is now 100 times bigger (0.2 dB).
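
To put some numbers on that, here's a quick back-of-the-envelope sketch in Python (it assumes an ideal 12-bit converter and treats 1 LSB as the whole uncertainty; a real converter has additional error sources):

    import math

    FULL_SCALE = 4096          # ideal 12-bit A/D: counts at full scale
    LSB = 1.0 / FULL_SCALE     # quantization step, as a fraction of full scale

    for db_below_fs in (0, 20, 40, 60):
        v = 10 ** (-db_below_fs / 20.0)        # signal level, fraction of FS voltage
        rel_err = LSB / v                      # 1 LSB relative to the signal
        err_db = 20 * math.log10(1 + rel_err)  # same uncertainty expressed in dB
        print("%3d dB below FS: +/-%6.3f%%  (~%5.3f dB)"
              % (db_below_fs, 100 * rel_err, err_db))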



>
>"Basically, if you divide a not very accurate measurement by another not
>very accurate measurement, the result isn't very accurate."
>
>Agreed.  Another serious pitfall is to subtract two large numbers, their
>difference can have a large percentage variance. 
>
Exactly, so when you make that measurement that is going to compensate for the 60 dB loss (as you noted above), the uncertainty in the 60 dB is huge.  If the uncertainty is 5% of full scale, then at 60 dB down the relative uncertainty is 1000 times as large; that is, the uncertainty is 50 times the measurement itself!

When you subtract out the 60 dB loss at a given frequency, maybe you really should have been subtracting 53 dB or 67 dB, or somewhere in between.  The uncertainty of the calibrated measurement (displayed as 0 dB, instead of the uncalibrated -60 dB) could be +/- 5 dB, as opposed to +/- 0.5 dB for a measurement of the same signal without the loss and corresponding compensation.
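
A minimal sketch of that arithmetic, assuming (purely for illustration) that the uncertainty is a fixed 5% of full scale and that relative uncertainties combine root-sum-square when you divide one noisy reading by another:

    import math

    SIGMA_FS = 0.05  # assumed: uncertainty is a fixed 5% of full scale

    def rel_uncertainty(db_below_fs):
        """Uncertainty relative to a reading this far below full scale."""
        v = 10 ** (-db_below_fs / 20.0)
        return SIGMA_FS / v

    direct = rel_uncertainty(0)   # 0.05 -> the familiar 5%
    cal = rel_uncertainty(60)     # 50.0 -> uncertainty is 50x the reading

    # Dividing a raw reading taken through the network by an equally noisy
    # calibration constant (assumed uncorrelated):
    combined = math.sqrt(cal**2 + cal**2)
    print(direct, cal, combined)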


(Presumably it's not that bad: the uncertainty of the actual measurement is more likely some small fixed amount (noise and quantization error of the A/D, for instance) plus some value proportional to the measurand.)
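
Something like this hypothetical error model, say (the numbers are made up, just to show the shape of it):

    import math

    def sigma(x, floor=1e-4, prop=0.002):
        """Hypothetical error model: a fixed noise/quantization floor plus
        a term proportional to the measurand, combined root-sum-square."""
        return math.sqrt(floor**2 + (prop * x)**2)

    print(sigma(1.0))    # near full scale, the proportional term dominates
    print(sigma(0.001))  # far below full scale, the fixed floor dominates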




>"This kind of uncertainty analysis is what separates typical experimental
>gear and bench experiments from "lab-grade" instruments like a VNA from
>Agilent.  If I make a measurement on an unknown device with a professional
>VNA, not only do I get an estimate of what the unknown device's
>characteristics are, but I also get an estimate of the uncertainty of that
>estimate."
>
> 
>
>Knowledge of the uncertainty is nice to have. Some people, like Larry Benko,
>W0QE, have compared the
>AIM to HP equipment (HP8753B, 3 GHz two-port vector network analyzer, with
>HP85046A S-parameter test set) and found it to be comparable over its
>frequency range.

Has W0QE actually done an uncertainty analysis, or just a comparison of identical test articles?
Were the test conditions designed to evaluate the calibration process?

Certainly, point comparisons can provide a clue to the overall measurement uncertainty, but, without knowing the circumstances, you have no way to know if maybe you just got lucky.  The situation is similar to testing SWR meters.  It's easy to get good accuracy near 1:1, much tougher at 10:1, so unless you actually test with carefully designed cases, it doesn't tell you much.
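
The SWR case is easy to quantify, since SWR = (1+|Gamma|)/(1-|Gamma|) and the denominator heads toward zero: a fixed uncertainty in the measured reflection coefficient blows up as the SWR rises.  A quick sketch (the +/-0.01 figure is an assumption, for illustration):

    def swr(gamma):
        return (1 + gamma) / (1 - gamma)

    d_gamma = 0.01  # assumed fixed uncertainty in |reflection coefficient|

    for gamma in (0.05, 0.50, 0.818):  # SWR of roughly 1.1, 3.0, 10.0
        s = swr(gamma)
        ds = swr(gamma + d_gamma) - s  # first-order sensitivity
        print("SWR %5.2f: about +/-%4.2f for +/-%.2f in |gamma|"
              % (s, ds, d_gamma))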

And then, anytime you are comparing a measurement from one instrument to another, you have some combination of uncertainties to deal with. Often this is dealt with by the "three cornered hat" technique (measure A against B, B against C, A against C), especially if all the uncertainties are of similar magnitude.
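
For reference, the basic three-cornered hat algebra is just three equations in three unknowns, under the assumption that the instruments' errors are uncorrelated:

    def three_cornered_hat(var_ab, var_bc, var_ac):
        """Recover individual instrument variances from the variances of
        the pairwise comparisons, using var(A-B) = var(A) + var(B), etc."""
        var_a = (var_ab + var_ac - var_bc) / 2
        var_b = (var_ab + var_bc - var_ac) / 2
        var_c = (var_ac + var_bc - var_ab) / 2
        return var_a, var_b, var_c

    # Example with made-up pairwise variances:
    print(three_cornered_hat(0.05, 0.08, 0.09))  # -> (0.03, 0.02, 0.06)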



>I don't have equipment like this myself, but I will be glad
>to hear from anyone who does a comparison study. 

Actually, a comparison study is the wrong approach.  This is more a mathematical analysis problem, and you can confirm the analysis with some well-chosen attenuator pads (which don't, themselves, have to be calibrated, just stable).  That is, you make some measurements of some impedance test articles without the pads.  Then, you put the pad on, run the cal, and make measurements of the same test articles (through the pad). The uncertainty in the measurements of the test articles should increase (i.e. they should get noisier) in a predictable fashion.
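
Here's a minimal Monte Carlo sketch of what I'd expect from that experiment, assuming fixed additive noise at the detector (all numbers hypothetical):

    import random, statistics

    def sigma_calibrated(true_gamma, pad_db, noise=0.001, n=100):
        """Scatter of n "calibrated" readings of a reflection coefficient
        seen through a pad (the reflected signal passes the pad twice)."""
        a = 10 ** (-pad_db / 10.0)  # round-trip voltage loss through the pad
        readings = []
        for _ in range(n):
            raw = true_gamma * a + random.gauss(0, noise)  # at the detector
            readings.append(raw / a)                       # compensation applied
        return statistics.stdev(readings)

    for pad in (0, 3, 10, 20):
        print("%2d dB pad: sigma ~ %.4f" % (pad, sigma_calibrated(0.5, pad)))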

There's quite a bit of literature out there on calibration and measurement uncertainty if one wants to get to the gnat's eyelash.  However, I think that for something like the AIM4170, one could probably do the analysis on the basis of the block diagram and some information about things like the noise floor into the A/D, etc.  You're probably not looking to characterize the 5th digit of the reflection coefficient, after all.


>The AIM compares quite favorably to other antenna analyzers in its price
>range, according to data obtained by the ARRL lab.
>These comparisons can be seen on the w5big.com website.

Sure, but the ARRL lab didn't go out and do any sort of testing to quantify the measurement uncertainty.  They checked the meter against some standards and said, yep, it reads about the same as we expected.  They didn't run multiple calibrations with different networks and verify that the calibration process "cancels out" the network correctly.

Let's be clear.. I'm not knocking the AIM4170's performance. It does a great job for the price, but I think that the performance of these devices in general is somewhat oversold, particularly when using functions like "fixture deembedding" (which is what calibrating out the filter basically is).


>
>"So, can you point us to an analysis of the accuracy of the calibration
>method? How much (in numbers) does the filter degrade the accuracy of the
>measurement (or, more correctly.. what's the uncertainty of the calibrated
>measurement after applying the calibration)."
>
>
>The effectiveness of the calibration procedure in several situations is
>illustrated in the manual. For a particular situation, you can check the
>calibration with known loads to see if it's sufficient for your
>requirements. Depending on the degree of compensation required, the accuracy
>when measuring through a complex network should be within 2 or 3 percent.
>Ideally it would depend only on the accuracy of the loads used for
>calibration, but there is some uncertainty in the raw data, so the
>compensation is not perfect.


OK, so you don't have the "analysis of accuracy".  What you have is some empirical evidence that it seems to work in a few cases, and an unsubstantiated assertion that it "should be within 2-3 percent" (which, frankly, is tighter than the advertised accuracy of 5%; but I recognize that the actual performance of the device is probably a lot better than 5%).

As you point out, there is some uncertainty in the raw data.  The real question, then, is what is that uncertainty?

This whole thing of the theoretical performance and uncertainty of the new crop of VNAs and 1 port analyzers is actually pretty interesting, and I've been meaning to do the analysis and write it up, but just haven't got around to it.

One thing is certain: they are sufficiently different from the kinds of measurements hams typically do that "intuitive" approaches to the uncertainty analysis are full of traps.  It's sort of like the false precision one gets with the cheap 4 1/2 digit multimeter with a specified accuracy of 5%. The display may have four and a half digits, but only the first two are meaningful.  I think most folks understand this by now, but with the new types of instruments, it's a bit tougher to have an intuitive understanding of the effects of various things on the measurement accuracy.

HP realized this, and provided an application that ran on the data extracted from your 8510 VNA and actually calculated the measurement uncertainty for your measurements.  The newer PNAs do it internally, and the data sheet for the VNA has little graphs that tell you what the measurement uncertainty is for various stimulus levels and S21 or S11 values (for instance, if you're measuring cable or attenuator loss, the measurement uncertainty gradually rises as the loss goes up, so there's a graph of uncertainty vs Mag(S21)).

What they don't do (yet.. or maybe they do, and I haven't found the option button) is put error bars on the display. I'd love to have the phase(S21) display have error bars, so I don't have to explain that the 1 degree ripple in the graph is less than a tenth of the measurement uncertainty.


>
>
>"In fact, the Ap Note describing the 160m antenna measurements is quite
>suspicious, from a calibration standpoint, because you made use of
>"smoothing"..Those fluctuations with respect to frequency in the calibrated
>response before smoothing are often indicative of the calibration constants
>being derived from a noisy measurement of the standard.  (the whole dividing
>a small noisy number by another small noisy number problem...)."
>
>Calibration constants for any instrument are derived from measurements that
>are noisy.
>
>During the calibration procedure, the AIM averages 16 readings at each
>frequency point. This is a perfectly valid mathematical technique.  Averaging
>is done in the hardware. Smoothing
>uses software to do a running average to further reduce the effect of
>measurement noise.  

It's a valid mathematical technique, if the statistical behavior of the thing you're averaging is understood well enough. The key is whether the errors are uncorrelated. If the thing being averaged out is, say, white gaussian noise, then you're in great shape.  If the thing being averaged out is, say, a spurious signal introducing a sinusoidal error, then maybe not.  And whether the averaging is done before or after a nonlinear operation (e.g. the linear-to-dB conversion) makes a difference.

This one is EASY to test.  Make a series of measurements of the same test article with different averaging sizes: say, 100 measurements without averaging, then 100 measurements averaging 10 samples, then 100 measurements averaging 100 samples, etc.  If you look at the variance of the collections of 100 measurements, it should scale inversely with the number of samples being averaged.  That is, the variance of the 100 measurements without averaging should be 10 times the variance of the 100 measurements which average 10 samples.  There are more sophisticated tests that require less data collection, but basically what they do is confirm your assumption that the errors have a consistent distribution and that they are uncorrelated.
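
A sketch of that test in Python, with simulated white gaussian noise standing in for the instrument (swap in real readings to run it for real):

    import random, statistics

    def collect(n_avg, n_meas=100, noise=0.01):
        """n_meas readings, each the mean of n_avg noisy samples of one value."""
        return [statistics.mean(random.gauss(1.0, noise) for _ in range(n_avg))
                for _ in range(n_meas)]

    for n_avg in (1, 10, 100):
        print("averaging %3d samples: variance = %.2e"
              % (n_avg, statistics.variance(collect(n_avg))))
    # For uncorrelated noise the variance drops ~10x per line; a correlated
    # error (a spur, say) would not behave this way.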

Is the running average over frequency points, or at a single frequency?  If it's over multiple points, then how do you know that you aren't "averaging out" actual differences in the signal you're measuring?  If you have a device with reasonably high Q, particularly if there are some transmission lines involved, the bounces up and down could be "real", and by smoothing them out you'd be throwing away information about the device, not noise.
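
To illustrate the risk, here's a toy high-Q resonance sampled at 10 kHz steps, with a 5-point running average applied over frequency (all values made up):

    import statistics

    f0, q = 1.83e6, 200  # hypothetical 160m resonance with a Q of 200
    freqs = [1.80e6 + 10e3 * i for i in range(7)]
    resp = [1 / (1 + (2 * q * (f - f0) / f0) ** 2) for f in freqs]

    # 5-point running average over frequency:
    smooth = [statistics.mean(resp[max(0, i - 2):i + 3])
              for i in range(len(resp))]
    print(max(resp), max(smooth))  # the "real" peak of 1.0 smooths down to ~0.3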



>
> 
>
>Comment added by Jay: The 160m antenna in the application note performed
>exactly as the VSWR curve on the graph indicates as verified by using a
>transceiver, and our NIST-traceable VSWR/power meter.
>
>The intent of the application note was not to show 0.1% accuracy, but to
>demonstrate the usefulness of the hardware and software tools which we think
>are way beyond what any $500 instrument provides. I hope you agree with this
>statement. The average amateur can solve some typical amateur problems, like
>tuning up a 160m vertical, without resorting to high-dollar equipment.

Sure.. These small microprocessor-based things are a godsend.  When I got my first MFJ-259, I thought "life is much better now" when doing mundane things like tuning a multiband dipole at Field Day.

The issue I have is when people throw around terms like "lab grade", or start to believe that because it works just like an Agilent PNA and has the same number of digits on the computer display, it must be as accurate as the PNA.  And hams aren't unique here.  More than one junior engineer at work has come in asking why the device differs from the datasheet by 1 dB, based on power levels measured with a spectrum analyzer at 20 dB above the noise floor, where the absolute accuracy is probably 2 or 3 dB, at best.  (Or, worse yet, the poor engineer puts that data up on the screen in a review and gets beaten up by the review board...)

Enjoy all,

Jim, W6RMK



