> *"RAW inputs improve prior methods, but our system outperforms them."* I under...

GrantMoyer · 2024-04-26T15:47:30

My guess: raw inputs preserve the linearity of radiance at each pixel. In other words, the raw value is a linear function f of radiance, so f(Total Radiance) = f(Base Radiance + Reflected Radiance) = f(Base Radiance) + f(Reflected Radiance). Conversion from raw to another format may apply a non-linear map to total radiance to compress the range to 8 bits while preserving contrast in most of the image (particularly in parts of the image washed out by a bright reflection).

So with raw images, the value you need to find is f(Reflected Radiance), which is probably why having a reference photo in the reflected direction helps. On the other hand, for other formats the reflection component of the image isn't a simple linear transform of what's being reflected, so even with a reference image, the reflection component would be hard to determine.
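
To make the linearity argument concrete, here is a rough sketch for a single pixel. It is only an illustration: the radiance values are made up, and `gamma_encode` is a hypothetical stand-in for whatever a real raw-to-8-bit pipeline does (a plain gamma curve plus quantization, with none of the local processing a phone applies).

```python
def gamma_encode(linear, gamma=2.2):
    """Hypothetical raw-to-8-bit conversion: clip to [0, 1], apply a gamma curve, quantize."""
    clipped = min(max(linear, 0.0), 1.0)
    return round(255 * clipped ** (1.0 / gamma))

# Made-up linear radiance values for one pixel.
base = 0.20       # scene behind the glass
reflected = 0.15  # reflection off the glass
total = base + reflected

# Linear (raw-like) space: subtracting the reflection recovers the base.
recovered_linear = total - reflected

# Gamma-encoded space: subtracting encoded values does not give the encoded base.
encoded_base = gamma_encode(base)
recovered_encoded = gamma_encode(total) - gamma_encode(reflected)

print("linear: ", recovered_linear, "vs", base)
print("encoded:", recovered_encoded, "vs", encoded_base)
```

In the linear case the two numbers agree up to floating-point error. In the encoded case they differ by a wide margin, which is the sense in which the reflection stops being something you can simply subtract.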

Derbasti · 2024-04-27T04:27:24

Maybe in this case it's because these are phone pictures, which are quite heavily processed (sharpening, denoising, tone mapping, local white balance, local contrast). The raw image may contain a bit less of that processing.