The case for taking an odd number of calibration frames [Deep Sky] Processing techniques · Y.

lunohodov 1.43
·  3 likes
Hi all,

Some experienced members in my local astrophotography club advocate taking an odd number of calibration frames, e.g. 31 flats instead of 30. They also recommend integrating master calibration frames using Median instead of Average as the pixel combination method.

Why would an odd number of calibration frames be preferred and in what circumstances?

As a PixInsight user, all the learning resources I've encountered use average combination. Moreover, PixInsight's documentation states median combination will lead to a 20% loss of signal.

Are the above combination methods universal, i.e. would the same apply in Siril? In what circumstances would median combination make sense for production work, and why?

Thank you in advance for shedding light on this!
HegAstro 11.99
·  2 likes
What is the reason they provide for their advice? Much of imaging has a solid mathematical basis behind it, so if someone makes a statement like “an odd number of frames is better” or “median calibration is better”, they should be able to justify it via math or a properly controlled experiment. Otherwise it is a myth. Also, define “experienced”. Median combination of light frames does result in better rejection of outliers such as satellite trails, and I have used it when average combination didn’t properly remove them. But there are other methods to accomplish this.
PathIntegral 5.01
·  2 likes
Even if there were an even-odd effect, I would think an even number of frames would be better as it cancels any potential systematic error between even-numbered frames and odd-numbered ones.

What's more, 31 is kind of the worst choice: since it is prime, no systematic error that repeats every few frames can be canceled. :-)
dkamen 6.89
·  5 likes
The median of an odd number of measurements is an actual measurement, whereas the median of an even number of things is the average of the two middle ones and may or may not correspond to an actual measurement. E.g.:

median(3,3,6,6) = 4.5 

but there is nothing telling you the measured quantity really can take the value 4.5 (or 4 or 5 which is what 4.5 will be truncated/rounded to when doing integer arithmetic). For all we know, the quantity might grow in steps of 3 and the median represents an impossible value.

So from a purist point of view, the median of an even number of things is problematic: there is uncertainty as to whether any original sub, even an ideal one, could actually have the median value. There is no such uncertainty with the median of an odd number of things.
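
To make the even/odd difference concrete, here is a minimal sketch using Python's standard `statistics` module. The even sample is the one above; the odd sample adds one hypothetical extra reading of 6 purely for illustration.

```python
from statistics import median, median_low, median_high

even = [3, 3, 6, 6]      # the example above
odd = [3, 3, 6, 6, 6]    # hypothetical: one extra reading added

# Odd count: the median is the middle element of the sorted list,
# so it is always one of the original measurements.
print(median(odd))             # 6
print(median(odd) in odd)      # True

# Even count: the median is the average of the two middle elements,
# which here is a value that was never measured.
print(median(even))            # 4.5
print(median(even) in even)    # False

# If an actual measurement is required, the low or high median can be used.
print(median_low(even), median_high(even))   # 3 6
```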

Of course this purist intuition is wrong given the kind of data one works with in astrophotography, especially with calibration subs. Quantization error is at least 2x more significant but nobody cares about it (and rightly so). 


To answer the other question, the median is perfectly fine for aggressive outlier rejection when you have reason to believe such outliers exist. An example is taking sky flats at dusk where the occasional star(s) might creep in. It takes a very large number of subs and the most aggressive sigma clipping settings (more aggressive than some tools will allow you to enter) to kill the stars. The median does the job even with ~15 subs which is also a typical data set size for flats.
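
As a rough illustration of the sky flat case, the sketch below simulates a single pixel across 15 subs where a star lands on it in one sub. All numbers (flat level, noise, star brightness) are made-up assumptions, not measurements from any real camera.

```python
import random
from statistics import mean, median

random.seed(1)

SKY_LEVEL = 20000.0    # hypothetical flat level in ADU
NOISE = 140.0          # hypothetical per-sub noise in ADU
STAR_EXCESS = 15000.0  # hypothetical extra signal where a star lands

# One pixel followed through 15 sky flats.
pixel_stack = [random.gauss(SKY_LEVEL, NOISE) for _ in range(15)]

# A star drifts onto this pixel in a single sub.
pixel_stack[7] += STAR_EXCESS

mean_value = mean(pixel_stack)
median_value = median(pixel_stack)

# The mean is pulled up by roughly STAR_EXCESS / 15, while the median
# stays close to the true flat level.
print(f"mean   error: {mean_value - SKY_LEVEL:8.1f} ADU")
print(f"median error: {median_value - SKY_LEVEL:8.1f} ADU")
```

A well-tuned rejection algorithm followed by averaging would also remove the star; the point here is only that the plain median needs no tuning at this stack size.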
lunohodov 1.43
Arun H:
What is the reason they provide for their advice? Much of imaging has a solid mathematical basis behind it, so if someone makes a statement like “an odd number of frames is better” or “median calibration is better”, they should be able to justify it via math or a properly controlled experiment. Otherwise it is a myth. Also, define “experienced”. Median combination of light frames does result in better rejection of outliers such as satellite trails, and I have used it when average combination didn’t properly remove them. But there are other methods to accomplish this.

The reason they provide is that the recommendation came from an expert photometry instructor. This doesn't really provide an answer. My search on the Internet did not yield anything useful either. For example, the current AAVSO guide to photometry does not mention anything related. I understand that astrophotography and photometry have very different goals. Hence, the means to achieve them are also different, e.g. using a more fitting outlier rejection algorithm.

By "experienced" I mean someone who has been in the hobby longer than I have. Yes, having experience does not make one immune to mistakes. I should not have used adjectives.
Yuxuan:
Even if there were an even-odd effect, I would think an even number of frames would be better as it cancels any potential systematic error between even-numbered frames and odd-numbered ones.

What's more, 31 is kind of the worst choice: since it is prime, no systematic error that repeats every few frames can be canceled. :-)

That's an interesting perspective. I haven't thought about this! The number 31 is arbitrary, though.
The median of an odd number of measurements is an actual measurement, whereas the median of an even number of things is the average of the two middle ones and may or may not correspond to an actual measurement. E.g.:

median(3,3,6,6) = 4.5 

but there is nothing telling you the measured quantity really can take the value 4.5 (or 4 or 5 which is what 4.5 will be truncated/rounded to when doing integer arithmetic). For all we know, the quantity might grow in steps of 3 and the median represents an impossible value.

So from a purist point of view, the median of an even number of things is problematic: there is uncertainty as to whether any original sub, even an ideal one, could actually have the median value. There is no such uncertainty with the median of an odd number of things.

Of course this purist intuition is wrong given the kind of data one works with in astrophotography, especially with calibration subs. Quantization error is at least 2x more significant but nobody cares about it (and rightly so). 


To answer the other question, the median is perfectly fine for aggressive outlier rejection when you have reason to believe such outliers exist. An example is taking sky flats at dusk where the occasional star(s) might creep in. It takes a very large number of subs and the most aggressive sigma clipping settings (more aggressive than some tools will allow you to enter) to kill the stars. The median does the job even with ~15 subs which is also a typical data set size for flats.

I was also thinking in that direction, but not as comprehensively as you. Stacking sky flats was the only application for Median I came up with.

Putting astrophotography and photometry in the same bucket (see my answer to @Arun above) seemed unreasonable. Faint structures such as dust and nebulosity usually dwell in noisy areas, while stars, even faint ones, have higher SNR. I can imagine that rejecting outliers such as cosmic rays is important in photometry, hence the use of the Median algorithm. On the other hand, it's calibration frames we're talking about, so it doesn't make much sense.
Geoff 2.81
·  4 likes
Take as many frames as you can (odd or even number), combine with average, and ignore the advice of the so-called experienced members.
HegAstro 11.99
·  2 likes
What Geoff said. I don’t think they know what they are talking about. You can come up with whatever convoluted explanation justifies some claim, but IMO claims like “you should take an odd number of calibration frames” are little short of black magic. There are many nice threads here that use sound mathematics to justify why taking more light frames is always a good idea and why taking more than 16-20 darks is almost always a waste of time. There is even a thread where I showed how you can use basic differential calculus to estimate the overall error due to flats. None of that makes any distinction between odd and even numbers.
rveregin 6.76
·  1 like
First, the median of the even-sized set (3,4,4,6) is 4, a real data point, so there is a problem with the whole concept, though it is true that the median of an odd-sized set will always fall on the middle point.

Second, let's look at the mean. When you take the mean of a data set, it is very unlikely that it will fall on a data point. This is not an issue: the whole idea of a mean is to best estimate the true mean by using all the data. Whether it falls on a data point is irrelevant. If it does, that doesn't make it more valid, and if it doesn't, that doesn't make it less valid. The mean is the mean.

The median is not that different from the mean. It has some advantages if the distribution is not symmetrical or if there are outliers, since the median is much less sensitive to these. But for a nice symmetric random distribution without outliers, the mean and median are essentially the same. Like the mean, the median uses all the data: the central point is selected by the whole data set, so it in no way depends on whether it hits an actual data point. The central data point, if it is actually a raw data point, is only special because it happens to be at the centre. It is the set of data points that is critical here; the whole set selects where the centre is, and it doesn't matter whether there is a real data point there.

As for using median for flats: if you do the flats correctly, the signal is very strong and the S/N is very high, much higher than in your images. So if using median, or alternatives like an outlier rejection algorithm, loses a bit of S/N, it doesn't matter. The key is that for flats it is a very good idea to use median or some sort of rejection. The reason is that if you have a hot pixel in your flat, or a dead one for that matter, it will show up in your image as a black spot or a white spot, since the flat correction will be way off. You do not want this. So getting the cleanest flats by using median to get rid of bad pixels is important; losing a bit of S/N is not.

For lights you typically don't want to use median, since their S/N is low and critical. One can carefully use rejection algorithms for lights; I use them all the time. You need to use them gently, but they clean up the image nicely, removing satellites, etc. I think this is what PI is alluding to: that there are better methods, like rejection algorithms, for lights.

Just make sure you take enough flats. I use at least 30, but typically 100, as they are fast to take. Also dither your lights; this helps reduce artifacts from your flats and darks.
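
To put a rough number on "loses a bit of S/N", here is a small Monte Carlo sketch using only the standard library. It assumes pure Gaussian noise with no outliers, which is an idealisation; the frame count and trial count are arbitrary choices. For Gaussian noise and large stacks, theory gives a median noise about sqrt(pi/2) ≈ 1.25 times the mean's noise, i.e. an S/N of about 0.8 relative to averaging, which is in line with the roughly 20% figure mentioned at the top of the thread.

```python
import random
from statistics import mean, median, pstdev

random.seed(42)

N_FRAMES = 31    # frames per stack (arbitrary choice)
N_TRIALS = 4000  # number of simulated stacks

means, medians = [], []
for _ in range(N_TRIALS):
    # One pixel across N_FRAMES frames: true value 0, unit Gaussian noise.
    stack = [random.gauss(0.0, 1.0) for _ in range(N_FRAMES)]
    means.append(mean(stack))
    medians.append(median(stack))

# Scatter of each estimator around the true value is the residual noise.
noise_mean = pstdev(means)
noise_median = pstdev(medians)

# Same signal, so the S/N ratio is the inverse ratio of the noise.
snr_ratio = noise_mean / noise_median
print(f"noise after mean:   {noise_mean:.4f}")
print(f"noise after median: {noise_median:.4f}")
print(f"median S/N relative to mean: {snr_ratio:.2f}")
```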

There is only one place I can think of where it is nice to have an odd number of items for a median. If you have 5 apples for example, one of them will be the median in whatever attribute you measure. If you have 4 apples, there may not be a median apple. If one cares...

Rick
lunohodov 1.43
·  1 like
@Rick Veregin Thank you for taking the time to explain! I will stick with average and appropriate pixel rejection.

I had a funny feeling about the recommendation in question. On one hand, it feels like I spent too much time on something that, even if true, would not make a difference. On the other hand, I am glad I went down this rabbit hole, as I learned a lot on the way.

Thank you all for helping me find my bearings. Happy New Year and don't stop being awesome!
HegAstro 11.99
·  2 likes
median(3,3,6,6) = 4.5 

but there is nothing telling you the measured quantity really can take the value 4.5 (or 4 or 5 which is what 4.5 will be truncated/rounded to when doing integer arithmetic). For all we know, the quantity might grow in steps of 3 and the median represents an impossible value.

So from a purist point of view, the median of an even number of things is problematic: there is uncertainty as to whether any original sub, even an ideal one, could actually have the median value. There is no such uncertainty with the median of an odd number of things.

Of course this purist intuition is wrong given the kind of data one works with in astrophotography, especially with calibration subs. Quantization error is at least 2x more significant but nobody cares about it (and rightly so).

I think it is really important to stress the second point here, which is that whether or not the median represents a “possible” number is largely irrelevant for our purposes. A fundamental assumption we make is that our data are continuous rather than discrete, that is, that they can take any value within a range. If our dataset were truly discrete, then very different mathematics would have to be used. We would not, for example, easily be able to use distributions like the Normal, which require continuous data. The way we calibrate would also have to be different. For continuous data, there is no mathematics I am aware of that says there is some advantage to the median corresponding to a specific observation. In any event, remember that the median from a limited number of observations is only an estimate and has error associated with it. The fact that the median of an odd number of observations corresponds to a specific observation does not by itself make it any closer to the true median of the distribution. Taking more measurements reduces this error, which is far more important than taking an odd or even number of measurements.
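
A quick way to check this numerically is the sketch below, which estimates how much the median of a stack scatters around the true value for 30, 31 and 100 frames. It assumes idealised Gaussian noise, and `median_scatter` is a hypothetical helper written for this post, not part of any stacking tool.

```python
import random
from statistics import median, pstdev

def median_scatter(n_frames, n_trials=4000, seed=0):
    """Standard deviation of the stack median over many simulated stacks.

    Each stack is one pixel across n_frames frames with true value 0
    and unit Gaussian noise (an idealised assumption).
    """
    rng = random.Random(seed)
    estimates = [
        median([rng.gauss(0.0, 1.0) for _ in range(n_frames)])
        for _ in range(n_trials)
    ]
    return pstdev(estimates)

scatter_30 = median_scatter(30)
scatter_31 = median_scatter(31)
scatter_100 = median_scatter(100)

# 30 vs 31 frames: the error of the median is nearly identical.
# 100 frames: the error is clearly smaller.
for n, s in ((30, scatter_30), (31, scatter_31), (100, scatter_100)):
    print(f"{n:3d} frames: median error ~ {s:.3f}")
```

Under these assumptions, going from 30 to 31 frames changes the error only marginally, while going to 100 frames roughly halves it.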