Publications
Erez Yosef's publications in reverse chronological order.
2025
- [IEEE OJSP] Tell Me What You See: Text-Guided Real-World Image Denoising. Erez Yosef and Raja Giryes. IEEE Open Journal of Signal Processing, 2025.
Image reconstruction from noisy sensor measurements is challenging, and many methods have been proposed for it. Yet most approaches focus on learning robust natural image priors while modeling the scene’s noise statistics, and in extremely low-light conditions they often remain insufficient. Additional information is needed, such as multiple captures or, as we propose here, a text-based description of the scene as an additional prior, something the photographer can easily provide. Inspired by the remarkable success of text-guided diffusion models in image generation, we show that adding image caption information significantly improves image denoising and reconstruction for both synthetic and real-world images. All code and data will be made publicly available upon publication.
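As a rough, hypothetical illustration of the idea (not the paper's architecture), the sketch below conditions a small denoising network on a caption embedding via FiLM-style modulation; all module and variable names are assumptions.

```python
# Hypothetical sketch: a denoiser that fuses a caption embedding with image features.
# Shapes and module names are illustrative, not the architecture from the paper.
import torch
import torch.nn as nn

class TextGuidedDenoiser(nn.Module):
    def __init__(self, channels=64, text_dim=512):
        super().__init__()
        self.encode = nn.Conv2d(3, channels, 3, padding=1)
        # Project the caption embedding to per-channel scale and shift (FiLM-style conditioning).
        self.film = nn.Linear(text_dim, 2 * channels)
        self.decode = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, noisy_image, caption_embedding):
        feats = torch.relu(self.encode(noisy_image))
        scale, shift = self.film(caption_embedding).chunk(2, dim=-1)
        feats = feats * (1 + scale[..., None, None]) + shift[..., None, None]
        # Predict a residual so the network only has to model the noise.
        return noisy_image - self.decode(feats)

# Usage: the caption embedding would come from a frozen text encoder (e.g. CLIP); random here.
model = TextGuidedDenoiser()
noisy = torch.randn(1, 3, 64, 64)
caption = torch.randn(1, 512)
clean_estimate = model(noisy, caption)
```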
```bibtex
@article{yosef2025tell,
  title   = {Tell Me What You See: Text-Guided Real-World Image Denoising},
  author  = {Yosef, Erez and Giryes, Raja},
  journal = {IEEE Open Journal of Signal Processing},
  year    = {2025},
  paper   = {https://ieeexplore.ieee.org/document/11078899},
}
```

- [ACS Photonics] Inverse Design of Diffractive Metasurfaces Using Diffusion Models. Liav Hen, Erez Yosef, Dan Raviv, Raja Giryes, and Jacob Scheuer. ACS Photonics, 2025.
Metasurfaces are ultrathin optical elements composed of engineered subwavelength structures that enable precise control of light. Their inverse design (determining a geometry that yields a desired optical response) is challenging due to the complex, nonlinear relationship between structure and optical properties. This often requires expert tuning, is prone to local minima, and involves significant computational overhead. In this work, we address these challenges by integrating the generative capabilities of diffusion models into computational design workflows. Using an RCWA simulator, we generate training data consisting of metasurface geometries and their corresponding far-field scattering patterns. We then train a conditional diffusion model to predict meta-atom geometry and height from a target spatial power distribution at a specified wavelength, sampled from a continuous supported band. Once trained, the model can generate metasurfaces with low error, either directly using RCWA-guided posterior sampling or by serving as an initializer for traditional optimization methods. We demonstrate our approach on the design of a spatially uniform intensity splitter and a polarization beam splitter, both produced with low error in under 30 min. To support further research in data-driven metasurface design, we publicly release our code and data sets.
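The "RCWA-guided posterior sampling" mentioned in the abstract can be pictured roughly as in the sketch below; the denoiser, the differentiable `simulate` stand-in for the RCWA solver, and the noise schedule are all assumptions for illustration, not the released code.

```python
# Hypothetical sketch of simulator-guided conditional diffusion sampling for inverse design.
# `denoiser` predicts noise given the current sample, timestep, and target far-field pattern;
# `simulate` stands in for a differentiable RCWA solver. Both are assumptions.
import torch
import torch.nn.functional as F

def guided_sample(denoiser, simulate, target_pattern, steps=1000, guidance_scale=0.1):
    x = torch.randn(1, 1, 64, 64)                  # noisy meta-atom geometry map
    betas = torch.linspace(1e-4, 0.02, steps)
    alpha_bars = torch.cumprod(1.0 - betas, dim=0)

    for t in reversed(range(steps)):
        eps = denoiser(x, t, target_pattern)       # predicted noise, conditioned on the target
        # Estimate the clean geometry and take a deterministic (DDIM-style) step back.
        x0_hat = (x - torch.sqrt(1 - alpha_bars[t]) * eps) / torch.sqrt(alpha_bars[t])
        if t == 0:
            return x0_hat
        x = torch.sqrt(alpha_bars[t - 1]) * x0_hat + torch.sqrt(1 - alpha_bars[t - 1]) * eps
        # Posterior guidance: nudge the sample toward geometries whose simulated
        # far field matches the target, using gradients through the simulator.
        x = x.detach().requires_grad_(True)
        loss = F.mse_loss(simulate(x), target_pattern)
        x = (x - guidance_scale * torch.autograd.grad(loss, x)[0]).detach()
    return x
```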
```bibtex
@article{doi:10.1021/acsphotonics.5c01384,
  title   = {Inverse Design of Diffractive Metasurfaces Using Diffusion Models},
  author  = {Hen, Liav and Yosef, Erez and Raviv, Dan and Giryes, Raja and Scheuer, Jacob},
  journal = {ACS Photonics},
  year    = {2025},
  doi     = {10.1021/acsphotonics.5c01384},
  paper   = {https://doi.org/10.1021/acsphotonics.5c01384},
}
```
2024
- [CVPR 2024] Mind The Edge: Refining Depth Edges in Sparsely-Supervised Monocular Depth Estimation. Lior Talker, Aviad Cohen, Erez Yosef, Alexandra Dana, and Michael Dinerstein. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024.
Monocular Depth Estimation (MDE) is a fundamental problem in computer vision with numerous applications. Recently, LIDAR-supervised methods have achieved remarkable per-pixel depth accuracy in outdoor scenes. However, significant errors are typically found in the proximity of depth discontinuities, i.e., depth edges, which often hinder the performance of depth-dependent applications that are sensitive to such inaccuracies, e.g., novel view synthesis and augmented reality. Since direct supervision for the location of depth edges is typically unavailable in sparse LIDAR-based scenes, encouraging the MDE model to produce correct depth edges is not straightforward. To the best of our knowledge, this paper is the first attempt to address the depth edges issue for LIDAR-supervised scenes. In this work, we propose to learn to detect the location of depth edges from densely-supervised synthetic data, and use it to generate supervision for the depth edges in the MDE training. Despite the ’domain gap’ between synthetic and real data, we show that depth edges that are estimated directly are significantly more accurate than the ones that emerge indirectly from the MDE training. To quantitatively evaluate our approach, and due to the lack of depth edges ground truth in LIDAR-based scenes, we manually annotated subsets of the KITTI and the DDAD datasets with depth edges ground truth. We demonstrate significant gains in the accuracy of the depth edges with comparable per-pixel depth accuracy on several challenging datasets.
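One plausible reading of how the edge detector's output could enter training (an assumption for illustration, not the paper's code) is a sparse LiDAR depth loss plus an edge term supervised by the frozen, synthetic-trained edge detector:

```python
# Hypothetical sketch of combining sparse LiDAR depth supervision with a depth-edge term.
# `edge_detector` is assumed to be trained on densely-supervised synthetic data and frozen.
import torch
import torch.nn.functional as F

def spatial_gradient_magnitude(depth):
    # Simple finite-difference gradient magnitude, normalized to [0, 1] per image.
    dx = depth[..., :, 1:] - depth[..., :, :-1]
    dy = depth[..., 1:, :] - depth[..., :-1, :]
    grad = F.pad(dx.abs(), (0, 1)) + F.pad(dy.abs(), (0, 0, 0, 1))
    return grad / (grad.amax(dim=(-2, -1), keepdim=True) + 1e-6)

def mde_loss(pred_depth, lidar_depth, lidar_mask, image, edge_detector, edge_weight=0.5):
    # Per-pixel depth loss only where sparse LiDAR ground truth exists.
    depth_loss = F.l1_loss(pred_depth[lidar_mask], lidar_depth[lidar_mask])

    # Edge supervision: the frozen detector provides dense pseudo ground truth
    # for depth-edge locations, which sparse LiDAR cannot supervise directly.
    with torch.no_grad():
        target_edges = edge_detector(image)          # assumed to output (B, 1, H, W) in [0, 1]
    pred_edges = spatial_gradient_magnitude(pred_depth)
    edge_loss = F.binary_cross_entropy(pred_edges.clamp(0, 1), target_edges)

    return depth_loss + edge_weight * edge_loss
```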
```bibtex
@inproceedings{talker2022mind,
  title     = {Mind The Edge: Refining Depth Edges in Sparsely-Supervised Monocular Depth Estimation},
  author    = {Talker, Lior and Cohen, Aviad and Yosef, Erez and Dana, Alexandra and Dinerstein, Michael},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
}
```

- [arXiv] DifuzCam: Replacing Camera Lens with a Mask and a Diffusion Model. Erez Yosef and Raja Giryes. arXiv preprint arXiv:2408.07541, 2024.
The flat lensless camera design reduces camera size and weight significantly. In this design, the camera lens is replaced by another optical element that interferes with the incoming light, and the image is recovered from the raw sensor measurements using a reconstruction algorithm. Yet, the quality of the reconstructed images is not satisfactory. To mitigate this, we propose utilizing a pre-trained diffusion model with a control network and a learned separable transformation for reconstruction. This allows us to build a prototype flat camera with high-quality imaging, presenting state-of-the-art results in terms of both fidelity and perceptual quality. We also demonstrate its ability to leverage textual descriptions of the captured scene to further enhance reconstruction. Our reconstruction method, which leverages the strong capabilities of a pre-trained diffusion model, can be used in other imaging systems for improved reconstruction results.
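As a hypothetical sketch (not the released implementation), the learned separable transformation mentioned above could take the form of two learned matrices that mix the sensor rows and columns independently before the result is fed to the control network:

```python
# Hypothetical sketch of a learned separable transform on raw lensless measurements.
# Two learned matrices act on rows and columns separately (Y = L X R^T), which is far
# cheaper than a dense transform of the flattened sensor image.
import torch
import torch.nn as nn

class SeparableTransform(nn.Module):
    def __init__(self, sensor_h, sensor_w, out_h, out_w):
        super().__init__()
        self.left = nn.Parameter(torch.randn(out_h, sensor_h) * 0.01)
        self.right = nn.Parameter(torch.randn(out_w, sensor_w) * 0.01)

    def forward(self, raw):                                    # raw: (B, C, sensor_h, sensor_w)
        x = torch.einsum('oh,bchw->bcow', self.left, raw)      # mix rows
        x = torch.einsum('pw,bcow->bcop', self.right, x)       # mix columns
        return x                                               # (B, C, out_h, out_w)

# Usage: map a 512x512 raw capture to a 256x256 conditioning image for the control network.
transform = SeparableTransform(512, 512, 256, 256)
raw = torch.randn(1, 3, 512, 512)
conditioning = transform(raw)
```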
```bibtex
@article{yosef2025difuzcam,
  title   = {DifuzCam: Replacing Camera Lens with a Mask and a Diffusion Model},
  author  = {Yosef, Erez and Giryes, Raja},
  journal = {arXiv preprint arXiv:2408.07541},
  year    = {2024},
}
```
2023
- [Scientific Reports] Video reconstruction from a single motion blurred image using learned dynamic phase coding. Erez Yosef, Shay Elmalem, and Raja Giryes. Scientific Reports, 2023.
Video reconstruction from a single motion-blurred image is a challenging problem, which can enhance the capabilities of existing cameras. Recently, several works have addressed this task using conventional imaging and deep learning. Yet, such purely digital methods are inherently limited due to direction ambiguity and noise sensitivity. Some works attempt to address these limitations with non-conventional image sensors; however, such sensors are extremely rare and expensive. To circumvent these limitations by simpler means, we propose a hybrid optical-digital method for video reconstruction that requires only simple modifications to existing optical systems. We use learned dynamic phase-coding in the lens aperture during image acquisition to encode motion trajectories, which serve as prior information for the video reconstruction process. The proposed computational camera generates a sharp frame burst of the scene at various frame rates from a single coded motion-blurred image, using an image-to-video convolutional neural network. We present advantages and improved performance compared to existing methods, in both simulations and a real-world camera prototype. We extend our optical coding to video frame interpolation and present robust and improved results for noisy videos.
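Roughly, the image-to-video idea could be sketched as below; the network layout, the coded capture, and the queried timestamps are placeholders for illustration, not the prototype's code.

```python
# Hypothetical sketch: reconstruct a frame burst from one coded motion-blurred image.
# The CNN receives the coded capture plus a requested timestamp within the exposure,
# so frames can be queried at arbitrary times (and hence at various frame rates).
import torch
import torch.nn as nn

class CodedBlurToVideo(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 1, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, coded_image, t):
        # Broadcast the timestamp (0 = exposure start, 1 = exposure end) as an extra channel.
        b, _, h, w = coded_image.shape
        t_map = t.view(b, 1, 1, 1).expand(b, 1, h, w)
        return self.net(torch.cat([coded_image, t_map], dim=1))

# Usage: query 7 evenly spaced frames from a single coded capture.
model = CodedBlurToVideo()
coded = torch.randn(1, 3, 128, 128)
frames = [model(coded, ts.view(1)) for ts in torch.linspace(0, 1, 7)]
```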
```bibtex
@article{yosef2023video,
  title     = {Video reconstruction from a single motion blurred image using learned dynamic phase coding},
  author    = {Yosef, Erez and Elmalem, Shay and Giryes, Raja},
  journal   = {Scientific Reports},
  volume    = {13},
  number    = {1},
  pages     = {13625},
  year      = {2023},
  publisher = {Nature Publishing Group UK London},
  paper     = {https://www.nature.com/articles/s41598-023-40297-0},
}
```

- [Journal of Optics] Deep learning in optics-a tutorial. Barak Hadad, Sahar Froim, Erez Yosef, Raja Giryes, and Alon Bahabad. Journal of Optics, 2023.
In recent years, machine learning and deep neural network applications have experienced a remarkable surge in the field of physics, with optics being no exception. This tutorial aims to offer a fundamental introduction to the utilization of deep learning in optics, catering specifically to newcomers. Within this tutorial, we cover essential concepts, survey the field, and provide guidelines for the creation and deployment of artificial neural network architectures tailored to optical problems.
```bibtex
@article{hadad2023deep,
  title   = {Deep learning in optics-a tutorial},
  author  = {Hadad, Barak and Froim, Sahar and Yosef, Erez and Giryes, Raja and Bahabad, Alon},
  journal = {Journal of Optics},
  year    = {2023},
  paper   = {https://iopscience.iop.org/article/10.1088/2040-8986/ad08dc},
  doi     = {10.1088/2040-8986/ad08dc},
}
```
2022
- [OSA] Video From Coded Motion Blur Using Dynamic Phase Coding. Erez Yosef, Shay Elmalem, and Raja Giryes. In Imaging and Applied Optics Congress, 2022.
We present a method for video reconstruction of the scene dynamics from a single image using coded motion blur. Our method addresses the limitations of this ill-posed task by utilizing a learned optical coding approach.
```bibtex
@inproceedings{Yosef:22,
  title     = {Video From Coded Motion Blur Using Dynamic Phase Coding},
  author    = {Yosef, Erez and Elmalem, Shay and Giryes, Raja},
  booktitle = {Imaging and Applied Optics Congress},
  keywords  = {Computational imaging; Image processing; Image reconstruction; Image sensors; Neural networks; Temporal resolution},
  pages     = {ITh3D.6},
  publisher = {Optica Publishing Group},
  year      = {2022},
  doi       = {10.1364/ISA.2022.ITh3D.6},
}
```