Get trending papers in your email inbox once a day!
Get trending papers in your email inbox!
SubscribeHiFi-HARP: A High-Fidelity 7th-Order Ambisonic Room Impulse Response Dataset
We introduce HiFi-HARP, a large-scale dataset of 7th-order Higher-Order Ambisonic Room Impulse Responses (HOA-RIRs) consisting of more than 100,000 RIRs generated via a hybrid acoustic simulation in realistic indoor scenes. HiFi-HARP combines geometrically complex, furnished room models from the 3D-FRONT repository with a hybrid simulation pipeline: low-frequency wave-based simulation (finite-difference time-domain) up to 900 Hz is used, while high frequencies above 900 Hz are simulated using a ray-tracing approach. The combined raw RIRs are encoded into the spherical-harmonic domain (AmbiX ACN) for direct auralization. Our dataset extends prior work by providing 7th-order Ambisonic RIRs that combine wave-theoretic accuracy with realistic room content. We detail the generation pipeline (scene and material selection, array design, hybrid simulation, ambisonic encoding) and provide dataset statistics (room volumes, RT60 distributions, absorption properties). A comparison table highlights the novelty of HiFi-HARP relative to existing RIR collections. Finally, we outline potential benchmarks such as FOA-to-HOA upsampling, source localization, and dereverberation. We discuss machine learning use cases (spatial audio rendering, acoustic parameter estimation) and limitations (e.g., simulation approximations, static scenes). Overall, HiFi-HARP offers a rich resource for developing spatial audio and acoustics algorithms in complex environments.
Finite Difference Neural Networks: Fast Prediction of Partial Differential Equations
Discovering the underlying behavior of complex systems is an important topic in many science and engineering disciplines. In this paper, we propose a novel neural network framework, finite difference neural networks (FDNet), to learn partial differential equations from data. Specifically, our proposed finite difference inspired network is designed to learn the underlying governing partial differential equations from trajectory data, and to iteratively estimate the future dynamical behavior using only a few trainable parameters. We illustrate the performance (predictive power) of our framework on the heat equation, with and without noise and/or forcing, and compare our results to the Forward Euler method. Moreover, we show the advantages of using a Hessian-Free Trust Region method to train the network.
FD-Net with Auxiliary Time Steps: Fast Prediction of PDEs using Hessian-Free Trust-Region Methods
Discovering the underlying physical behavior of complex systems is a crucial, but less well-understood topic in many engineering disciplines. This study proposes a finite-difference inspired convolutional neural network framework to learn hidden partial differential equations from given data and iteratively estimate future dynamical behavior. The methodology designs the filter sizes such that they mimic the finite difference between the neighboring points. By learning the governing equation, the network predicts the future evolution of the solution by using only a few trainable parameters. In this paper, we provide numerical results to compare the efficiency of the second-order Trust-Region Conjugate Gradient (TRCG) method with the first-order ADAM optimizer.
Partial Differential Equations is All You Need for Generating Neural Architectures -- A Theory for Physical Artificial Intelligence Systems
In this work, we generalize the reaction-diffusion equation in statistical physics, Schr\"odinger equation in quantum mechanics, Helmholtz equation in paraxial optics into the neural partial differential equations (NPDE), which can be considered as the fundamental equations in the field of artificial intelligence research. We take finite difference method to discretize NPDE for finding numerical solution, and the basic building blocks of deep neural network architecture, including multi-layer perceptron, convolutional neural network and recurrent neural networks, are generated. The learning strategies, such as Adaptive moment estimation, L-BFGS, pseudoinverse learning algorithms and partial differential equation constrained optimization, are also presented. We believe it is of significance that presented clear physical image of interpretable deep neural networks, which makes it be possible for applying to analog computing device design, and pave the road to physical artificial intelligence.
Transform Once: Efficient Operator Learning in Frequency Domain
Spectral analysis provides one of the most effective paradigms for information-preserving dimensionality reduction, as simple descriptions of naturally occurring signals are often obtained via few terms of periodic basis functions. In this work, we study deep neural networks designed to harness the structure in frequency domain for efficient learning of long-range correlations in space or time: frequency-domain models (FDMs). Existing FDMs are based on complex-valued transforms i.e. Fourier Transforms (FT), and layers that perform computation on the spectrum and input data separately. This design introduces considerable computational overhead: for each layer, a forward and inverse FT. Instead, this work introduces a blueprint for frequency domain learning through a single transform: transform once (T1). To enable efficient, direct learning in the frequency domain we derive a variance-preserving weight initialization scheme and investigate methods for frequency selection in reduced-order FDMs. Our results noticeably streamline the design process of FDMs, pruning redundant transforms, and leading to speedups of 3x to 10x that increase with data resolution and model size. We perform extensive experiments on learning the solution operator of spatio-temporal dynamics, including incompressible Navier-Stokes, turbulent flows around airfoils and high-resolution video of smoke. T1 models improve on the test performance of FDMs while requiring significantly less computation (5 hours instead of 32 for our large-scale experiment), with over 20% reduction in average predictive error across tasks.
Generating arbitrary polarization states by manipulating the thicknesses of a pair of uniaxial birefringent plates
We report an optical method of generating arbitrary polarization states by manipulating the thicknesses of a pair of uniaxial birefringent plates, the optical axes of which are set at a crossing angle of {\pi}/4. The method has the remarkable feature of being able to generate a distribution of arbitrary polarization states in a group of highly discrete spectra without spatially separating the individual spectral components. The target polarization-state distribution is obtained as an optimal solution through an exploration. Within a realistic exploration range, a sufficient number of near-optimal solutions are found. This property is also reproduced well by a concise model based on a distribution of exploration points on a Poincar\'e sphere, showing that the number of near-optimal solutions behaves according to a power law with respect to the number of spectral components of concern. As a typical example of an application, by applying this method to a set of phase-locked highly discrete spectra, we numerically demonstrate the continuous generation of a vector-like optical electric field waveform, the helicity of which is alternated within a single optical cycle in the time domain.
Sound propagation in realistic interactive 3D scenes with parameterized sources using deep neural operators
We address the challenge of sound propagation simulations in 3D virtual rooms with moving sources, which have applications in virtual/augmented reality, game audio, and spatial computing. Solutions to the wave equation can describe wave phenomena such as diffraction and interference. However, simulating them using conventional numerical discretization methods with hundreds of source and receiver positions is intractable, making stimulating a sound field with moving sources impractical. To overcome this limitation, we propose using deep operator networks to approximate linear wave-equation operators. This enables the rapid prediction of sound propagation in realistic 3D acoustic scenes with moving sources, achieving millisecond-scale computations. By learning a compact surrogate model, we avoid the offline calculation and storage of impulse responses for all relevant source/listener pairs. Our experiments, including various complex scene geometries, show good agreement with reference solutions, with root mean squared errors ranging from 0.02 Pa to 0.10 Pa. Notably, our method signifies a paradigm shift as no prior machine learning approach has achieved precise predictions of complete wave fields within realistic domains. We anticipate that our findings will drive further exploration of deep neural operator methods, advancing research in immersive user experiences within virtual environments.
Refreshing idea on Fourier analysis
The "theoretical limit of time-frequency resolution in Fourier analysis" is thought to originate in certain mathematical and/or physical limitations. This, however, is not true. The actual origin arises from the numerical (technical) method deployed to reduce computation time. In addition, there is a gap between the theoretical equation for Fourier analysis and its numerical implementation. Knowing the facts brings us practical benefits. In this case, these related to boundary conditions, and complex integrals. For example, replacing a Fourier integral with a complex integral brings a hybrid method for the Laplace and Fourier transforms, and reveals another perspective on time-frequency analysis. We present such a perspective here with a simple demonstrative analysis.
nnAudio: An on-the-fly GPU Audio to Spectrogram Conversion Toolbox Using 1D Convolution Neural Networks
Converting time domain waveforms to frequency domain spectrograms is typically considered to be a prepossessing step done before model training. This approach, however, has several drawbacks. First, it takes a lot of hard disk space to store different frequency domain representations. This is especially true during the model development and tuning process, when exploring various types of spectrograms for optimal performance. Second, if another dataset is used, one must process all the audio clips again before the network can be retrained. In this paper, we integrate the time domain to frequency domain conversion as part of the model structure, and propose a neural network based toolbox, nnAudio, which leverages 1D convolutional neural networks to perform time domain to frequency domain conversion during feed-forward. It allows on-the-fly spectrogram generation without the need to store any spectrograms on the disk. This approach also allows back-propagation on the waveforms-to-spectrograms transformation layer, which implies that this transformation process can be made trainable, and hence further optimized by gradient descent. nnAudio reduces the waveforms-to-spectrograms conversion time for 1,770 waveforms (from the MAPS dataset) from 10.64 seconds with librosa to only 0.001 seconds for Short-Time Fourier Transform (STFT), 18.3 seconds to 0.015 seconds for Mel spectrogram, 103.4 seconds to 0.258 for constant-Q transform (CQT), when using GPU on our DGX work station with CPU: Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz Tesla v100 32Gb GPUs. (Only 1 GPU is being used for all the experiments.) We also further optimize the existing CQT algorithm, so that the CQT spectrogram can be obtained without aliasing in a much faster computation time (from 0.258 seconds to only 0.001 seconds).
Spectral-Refiner: Fine-Tuning of Accurate Spatiotemporal Neural Operator for Turbulent Flows
Recent advancements in operator-type neural networks have shown promising results in approximating the solutions of spatiotemporal Partial Differential Equations (PDEs). However, these neural networks often entail considerable training expenses, and may not always achieve the desired accuracy required in many scientific and engineering disciplines. In this paper, we propose a new Spatiotemporal Fourier Neural Operator (SFNO) that learns maps between Bochner spaces, and a new learning framework to address these issues. This new paradigm leverages wisdom from traditional numerical PDE theory and techniques to refine the pipeline of commonly adopted end-to-end neural operator training and evaluations. Specifically, in the learning problems for the turbulent flow modeling by the Navier-Stokes Equations (NSE), the proposed architecture initiates the training with a few epochs for SFNO, concluding with the freezing of most model parameters. Then, the last linear spectral convolution layer is fine-tuned without the frequency truncation. The optimization uses a negative Sobolev norm for the first time as the loss in operator learning, defined through a reliable functional-type a posteriori error estimator whose evaluation is almost exact thanks to the Parseval identity. This design allows the neural operators to effectively tackle low-frequency errors while the relief of the de-aliasing filter addresses high-frequency errors. Numerical experiments on commonly used benchmarks for the 2D NSE demonstrate significant improvements in both computational efficiency and accuracy, compared to end-to-end evaluation and traditional numerical PDE solvers.
Accelerating Automatic Differentiation of Direct Form Digital Filters
We introduce a general formulation for automatic differentiation through direct form filters, yielding a closed-form backpropagation that includes initial condition gradients. The result is a single expression that can represent both the filter and its gradients computation while supporting parallelism. C++/CUDA implementations in PyTorch achieve at least 1000x speedup over naive Python implementations and consistently run fastest on the GPU. For the low-order filters commonly used in practice, exact time-domain filtering with analytical gradients outperforms the frequency-domain method in terms of speed. The source code is available at https://github.com/yoyolicoris/philtorch.
NeRF2: Neural Radio-Frequency Radiance Fields
Although Maxwell discovered the physical laws of electromagnetic waves 160 years ago, how to precisely model the propagation of an RF signal in an electrically large and complex environment remains a long-standing problem. The difficulty is in the complex interactions between the RF signal and the obstacles (e.g., reflection, diffraction, etc.). Inspired by the great success of using a neural network to describe the optical field in computer vision, we propose a neural radio-frequency radiance field, NeRF^2, which represents a continuous volumetric scene function that makes sense of an RF signal's propagation. Particularly, after training with a few signal measurements, NeRF^2 can tell how/what signal is received at any position when it knows the position of a transmitter. As a physical-layer neural network, NeRF^2 can take advantage of the learned statistic model plus the physical model of ray tracing to generate a synthetic dataset that meets the training demands of application-layer artificial neural networks (ANNs). Thus, we can boost the performance of ANNs by the proposed turbo-learning, which mixes the true and synthetic datasets to intensify the training. Our experiment results show that turbo-learning can enhance performance with an approximate 50% increase. We also demonstrate the power of NeRF^2 in the field of indoor localization and 5G MIMO.
Mesh-robust stability and convergence of variable-step deferred correction methods based on the BDF2 formula
We provide a new theoretical framework for the variable-step deferred correction (DC) methods based on the well-known BDF2 formula. By using the discrete orthogonal convolution kernels, some high-order BDF2-DC methods are proven to be stable on arbitrary time grids according to the recent definition of stability (SINUM, 60: 2253-2272). It significantly relaxes the existing step-ratio restrictions for the BDF2-DC methods (BIT, 62: 1789-1822). The associated sharp error estimates are established by taking the numerical effects of the starting approximations into account, and they suggest that the BDF2-DC methods have no aftereffect, that is, the lower-order starting scheme for the BDF2 scheme will not cause a loss in the accuracy of the high-order BDF2-DC methods. Extensive tests on the graded and random time meshes are presented to support the new theory.
Disentangled Representation Learning for RF Fingerprint Extraction under Unknown Channel Statistics
Deep learning (DL) applied to a device's radio-frequency fingerprint~(RFF) has attracted significant attention in physical-layer authentication due to its extraordinary classification performance. Conventional DL-RFF techniques are trained by adopting maximum likelihood estimation~(MLE). Although their discriminability has recently been extended to unknown devices in open-set scenarios, they still tend to overfit the channel statistics embedded in the training dataset. This restricts their practical applications as it is challenging to collect sufficient training data capturing the characteristics of all possible wireless channel environments. To address this challenge, we propose a DL framework of disentangled representation~(DR) learning that first learns to factor the signals into a device-relevant component and a device-irrelevant component via adversarial learning. Then, it shuffles these two parts within a dataset for implicit data augmentation, which imposes a strong regularization on RFF extractor learning to avoid the possible overfitting of device-irrelevant channel statistics, without collecting additional data from unknown channels. Experiments validate that the proposed approach, referred to as DR-based RFF, outperforms conventional methods in terms of generalizability to unknown devices even under unknown complicated propagation environments, e.g., dispersive multipath fading channels, even though all the training data are collected in a simple environment with dominated direct line-of-sight~(LoS) propagation paths.
Coupled BEM-FEM for the convected Helmholtz equation with non-uniform flow in a bounded domain
We consider the convected Helmholtz equation modeling linear acoustic propagation at a fixed frequency in a subsonic flow around a scattering object. The flow is supposed to be uniform in the exterior domain far from the object, and potential in the interior domain close to the object. Our key idea is the reformulation of the original problem using the Prandtl--Glauert transformation on the whole flow domain, yielding (i) the classical Helmholtz equation in the exterior domain and (ii) an anisotropic diffusive PDE with skew-symmetric first-order perturbation in the interior domain such that its transmission condition at the coupling boundary naturally fits the Neumann condition from the classical Helmholtz equation. Then, efficient off-the-shelf tools can be used to perform the BEM-FEM coupling, leading to two novel variational formulations for the convected Helmholtz equation. The first formulation involves one surface unknown and can be affected by resonant frequencies, while the second formulation avoids resonant frequencies and involves two surface unknowns. Numerical simulations are presented to compare the two formulations.
Weak localization in radiative transfer of acoustic waves in a randomly-fluctuating slab
This paper concerns the derivation of radiative transfer equations for acoustic waves propagating in a randomly fluctuating slab (between two parallel planes) in the weak-scattering regime, and the study of boundary effects through an asymptotic analysis of the Wigner transform of the wave solution. These radiative transfer equations allow to model the transport of wave energy density, taking into account the scattering by random heterogeneities. The approach builds on the method of images, where the slab is extended to a full-space, with a periodic map of mechanical properties and a series of sources located along a periodic pattern. Two types of boundary effects, both on the (small) scale of the wavelength, are observed: one at the boundaries of the slab, and one inside the domain. The former impact the entire energy density (coherent as well as incoherent) and is also observed in half-spaces. The latter, more specific to slabs, corresponds to the constructive interference of waves that have reflected at least twice on the boundaries of the slab and only impacts the coherent part of the energy density.
In-Sensor Radio Frequency Computing for Energy-Efficient Intelligent Radar
Radio Frequency Neural Networks (RFNNs) have demonstrated advantages in realizing intelligent applications across various domains. However, as the model size of deep neural networks rapidly increases, implementing large-scale RFNN in practice requires an extensive number of RF interferometers and consumes a substantial amount of energy. To address this challenge, we propose to utilize low-rank decomposition to transform a large-scale RFNN into a compact RFNN while almost preserving its accuracy. Specifically, we develop a Tensor-Train RFNN (TT-RFNN) where each layer comprises a sequence of low-rank third-order tensors, leading to a notable reduction in parameter count, thereby optimizing RF interferometer utilization in comparison to the original large-scale RFNN. Additionally, considering the inherent physical errors when mapping TT-RFNN to RF device parameters in real-world deployment, from a general perspective, we construct the Robust TT-RFNN (RTT-RFNN) by incorporating a robustness solver on TT-RFNN to enhance its robustness. To adapt the RTT-RFNN to varying requirements of reshaping operations, we further provide a reconfigurable reshaping solution employing RF switch matrices. Empirical evaluations conducted on MNIST and CIFAR-10 datasets show the effectiveness of our proposed method.
Understanding Gradient Regularization in Deep Learning: Efficient Finite-Difference Computation and Implicit Bias
Gradient regularization (GR) is a method that penalizes the gradient norm of the training loss during training. While some studies have reported that GR can improve generalization performance, little attention has been paid to it from the algorithmic perspective, that is, the algorithms of GR that efficiently improve the performance. In this study, we first reveal that a specific finite-difference computation, composed of both gradient ascent and descent steps, reduces the computational cost of GR. Next, we show that the finite-difference computation also works better in the sense of generalization performance. We theoretically analyze a solvable model, a diagonal linear network, and clarify that GR has a desirable implicit bias to so-called rich regime and finite-difference computation strengthens this bias. Furthermore, finite-difference GR is closely related to some other algorithms based on iterative ascent and descent steps for exploring flat minima. In particular, we reveal that the flooding method can perform finite-difference GR in an implicit way. Thus, this work broadens our understanding of GR for both practice and theory.
Training Deep Surrogate Models with Large Scale Online Learning
The spatiotemporal resolution of Partial Differential Equations (PDEs) plays important roles in the mathematical description of the world's physical phenomena. In general, scientists and engineers solve PDEs numerically by the use of computationally demanding solvers. Recently, deep learning algorithms have emerged as a viable alternative for obtaining fast solutions for PDEs. Models are usually trained on synthetic data generated by solvers, stored on disk and read back for training. This paper advocates that relying on a traditional static dataset to train these models does not allow the full benefit of the solver to be used as a data generator. It proposes an open source online training framework for deep surrogate models. The framework implements several levels of parallelism focused on simultaneously generating numerical simulations and training deep neural networks. This approach suppresses the I/O and storage bottleneck associated with disk-loaded datasets, and opens the way to training on significantly larger datasets. Experiments compare the offline and online training of four surrogate models, including state-of-the-art architectures. Results indicate that exposing deep surrogate models to more dataset diversity, up to hundreds of GB, can increase model generalization capabilities. Fully connected neural networks, Fourier Neural Operator (FNO), and Message Passing PDE Solver prediction accuracy is improved by 68%, 16% and 7%, respectively.
Grid-free Harmonic Retrieval and Model Order Selection using Deep Convolutional Neural Networks
Harmonic retrieval techniques are the foundation of radio channel sounding, estimation and modeling. This paper introduces a Deep Learning approach for two-dimensional spectral estimation from frequency and time samples of a radio channel transfer function. Our work can estimate two-dimensional parameters from a signal containing an unknown number of paths. In contrast to existing deep learning-based methods, the signal parameters are not estimated via classification but instead in a quasi-grid-free manner. This alleviates the bias, spectral leakage, and ghost targets that grid-based approaches inherently produce. The proposed architecture also reliably estimates the number of spectral components in the measurement. Hence, the architecture jointly solves the model order selection problem and the parameter estimation task. Additionally, we propose a multi-channel windowing of the data during preprocessing, increasing the resulting estimator's robustness. We verify the performance compared to existing harmonic retrieval methods and also show how it can be integrated into an existing maximum likelihood estimator for efficient initialization of a gradient-based iteration.
Neural Operator: Is data all you need to model the world? An insight into the impact of Physics Informed Machine Learning
Numerical approximations of partial differential equations (PDEs) are routinely employed to formulate the solution of physics, engineering and mathematical problems involving functions of several variables, such as the propagation of heat or sound, fluid flow, elasticity, electrostatics, electrodynamics, and more. While this has led to solving many complex phenomena, there are some limitations. Conventional approaches such as Finite Element Methods (FEMs) and Finite Differential Methods (FDMs) require considerable time and are computationally expensive. In contrast, data driven machine learning-based methods such as neural networks provide a faster, fairly accurate alternative, and have certain advantages such as discretization invariance and resolution invariance. This article aims to provide a comprehensive insight into how data-driven approaches can complement conventional techniques to solve engineering and physics problems, while also noting some of the major pitfalls of machine learning-based approaches. Furthermore, we highlight, a novel and fast machine learning-based approach (~1000x) to learning the solution operator of a PDE operator learning. We will note how these new computational approaches can bring immense advantages in tackling many problems in fundamental and applied physics.
Experimental demonstration of superdirective spherical dielectric antenna
An experimental demonstration of directivities exceeding the fundamental Kildal limit, a phenomenon called superdirectivity, is provided for spherical high-index dielectric antennas with an electric dipole excitation. A directivity factor of about 10 with a total efficiency of more than 80\% for an antenna having a size of a third of the wavelength was measured. High directivities are shown to be associated with constructive interference of particular electric and magnetic modes of an open spherical resonator. Both analytic solution for a point dipole and a full-wave rigorous simulation for a realistic dipole antenna were employed for optimization and analysis, yielding an excellent agreement between experimentally measured and numerically predicted directivities. The use of high-index low-loss ceramics can significantly reduce the physical size of such antennas while maintaining their overall high radiation efficiency. Such antennas can be attractive for various high-frequency applications, such as antennas for the Internet of things, smart city systems, 5G network systems, and others. The demonstrated concept can be scaled in frequency.
Multi-Grid Tensorized Fourier Neural Operator for High-Resolution PDEs
Memory complexity and data scarcity have so far prohibited learning solution operators of partial differential equations (PDEs) at high resolutions. We address these limitations by introducing a new data efficient and highly parallelizable operator learning approach with reduced memory requirement and better generalization, called multi-grid tensorized neural operator (MG-TFNO). MG-TFNO scales to large resolutions by leveraging local and global structures of full-scale, real-world phenomena, through a decomposition of both the input domain and the operator's parameter space. Our contributions are threefold: i) we enable parallelization over input samples with a novel multi-grid-based domain decomposition, ii) we represent the parameters of the model in a high-order latent subspace of the Fourier domain, through a global tensor factorization, resulting in an extreme reduction in the number of parameters and improved generalization, and iii) we propose architectural improvements to the backbone FNO. Our approach can be used in any operator learning setting. We demonstrate superior performance on the turbulent Navier-Stokes equations where we achieve less than half the error with over 150x compression. The tensorization combined with the domain decomposition, yields over 150x reduction in the number of parameters and 7x reduction in the domain size without losses in accuracy, while slightly enabling parallelism.
Trapped acoustic waves and raindrops: high-order accurate integral equation method for localized excitation of a periodic staircase
We present a high-order boundary integral equation (BIE) method for the frequency-domain acoustic scattering of a point source by a singly-periodic, infinite, corrugated boundary. We apply it to the accurate numerical study of acoustic radiation in the neighborhood of a sound-hard two-dimensional staircase modeled after the El Castillo pyramid. Such staircases support trapped waves which travel along the surface and decay exponentially away from it. We use the array scanning method (Floquet--Bloch transform) to recover the scattered field as an integral over the family of quasiperiodic solutions parameterized by their on-surface wavenumber. Each such BIE solution requires the quasiperiodic Green's function, which we evaluate using an efficient integral representation of lattice sum coefficients. We avoid the singularities and branch cuts present in the array scanning integral by complex contour deformation. For each frequency, this enables a solution accurate to around 10 digits in a couple of seconds. We propose a residue method to extract the limiting powers carried by trapped modes far from the source. Finally, by computing the trapped mode dispersion relation, we use a simple ray model to explain an observed acoustic "raindrop" effect (chirp-like time-domain response).
Can photonic heterostructures provably outperform single-material geometries?
Recent advances in photonic optimization have enabled calculation of performance bounds for a wide range of electromagnetic objectives, albeit restricted to single-material systems. Motivated by growing theoretical interest and fabrication advances, we present a framework to bound the performance of photonic heterostructures and apply it to investigate maximum absorption characteristics of multilayer films and compact, free-form multi-material scatterers. Limits predict trends seen in topology-optimized geometries -- often coming within factors of two of specific designs -- and may be exploited in conjunction with inverse designs to predict when heterostructures are expected to outperform their optimal single-material counterparts.
Frequency-Domain Refinement with Multiscale Diffusion for Super Resolution
The performance of single image super-resolution depends heavily on how to generate and complement high-frequency details to low-resolution images. Recently, diffusion-based models exhibit great potential in generating high-quality images for super-resolution tasks. However, existing models encounter difficulties in directly predicting high-frequency information of wide bandwidth by solely utilizing the high-resolution ground truth as the target for all sampling timesteps. To tackle this problem and achieve higher-quality super-resolution, we propose a novel Frequency Domain-guided multiscale Diffusion model (FDDiff), which decomposes the high-frequency information complementing process into finer-grained steps. In particular, a wavelet packet-based frequency complement chain is developed to provide multiscale intermediate targets with increasing bandwidth for reverse diffusion process. Then FDDiff guides reverse diffusion process to progressively complement the missing high-frequency details over timesteps. Moreover, we design a multiscale frequency refinement network to predict the required high-frequency components at multiple scales within one unified network. Comprehensive evaluations on popular benchmarks are conducted, and demonstrate that FDDiff outperforms prior generative methods with higher-fidelity super-resolution results.
Sharp electromagnetically induced absorption via balanced interferometric excitation in a microwave resonator
A cylindrical TM_{0,1,0} mode microwave cavity resonator was excited using a balanced interferometric configuration that allowed manipulation of the electric field and potential within the resonator by adjusting the phase and amplitude of the interferometer arms driving the resonator. With precise tuning of the phase and amplitude, 25 dB suppression of the electric field at the resonance frequency was achieved while simultaneously resonantly enhancing the time-varying electric-scalar potential. Under these conditions, the system demonstrated electromagnetically induced absorption in the cavity response due to the annulment of the electric field at the resonance frequency. This phenomena can be regarded as a form of extreme dispersion, and led to a sharp increase in the cavity phase versus frequency response by an order of magnitude when compared to the cavity Q-factor. This work presents an experimental setup that will allow the electric-scalar Aharonov-Bohm effect to be tested under conditions involving a time-varying electric-scalar potential, without the presence of an electric field or magnetic vector potential, an experiment that has not yet been realised.
Standardized Benchmark Dataset for Localized Exposure to a Realistic Source at 10-90 GHz
The lack of freely available standardized datasets represents an aggravating factor during the development and testing the performance of novel computational techniques in exposure assessment and dosimetry research. This hinders progress as researchers are required to generate numerical data (field, power and temperature distribution) anew using simulation software for each exposure scenario. Other than being time consuming, this approach is highly susceptible to errors that occur during the configuration of the electromagnetic model. To address this issue, in this paper, the limited available data on the incident power density and resultant maximum temperature rise on the skin surface considering various steady-state exposure scenarios at 10-90 GHz have been statistically modeled. The synthetic data have been sampled from the fitted statistical multivariate distribution with respect to predetermined dosimetric constraints. We thus present a comprehensive and open-source dataset compiled of the high-fidelity numerical data considering various exposures to a realistic source. Furthermore, different surrogate models for predicting maximum temperature rise on the skin surface were fitted based on the synthetic dataset. All surrogate models were tested on the originally available data where satisfactory predictive performance has been demonstrated. A simple technique of combining quadratic polynomial and tensor-product spline surrogates, each operating on its own cluster of data, has achieved the lowest mean absolute error of 0.058 {\deg}C. Therefore, overall experimental results indicate the validity of the proposed synthetic dataset.
ConDiff: A Challenging Dataset for Neural Solvers of Partial Differential Equations
We present ConDiff, a novel dataset for scientific machine learning. ConDiff focuses on the parametric diffusion equation with space dependent coefficients, a fundamental problem in many applications of partial differential equations (PDEs). The main novelty of the proposed dataset is that we consider discontinuous coefficients with high contrast. These coefficient functions are sampled from a selected set of distributions. This class of problems is not only of great academic interest, but is also the basis for describing various environmental and industrial problems. In this way, ConDiff shortens the gap with real-world problems while remaining fully synthetic and easy to use. ConDiff consists of a diverse set of diffusion equations with coefficients covering a wide range of contrast levels and heterogeneity with a measurable complexity metric for clearer comparison between different coefficient functions. We baseline ConDiff on standard deep learning models in the field of scientific machine learning. By providing a large number of problem instances, each with its own coefficient function and right-hand side, we hope to encourage the development of novel physics-based deep learning approaches, such as neural operators, ultimately driving progress towards more accurate and efficient solutions of complex PDE problems.
Group Equivariant Fourier Neural Operators for Partial Differential Equations
We consider solving partial differential equations (PDEs) with Fourier neural operators (FNOs), which operate in the frequency domain. Since the laws of physics do not depend on the coordinate system used to describe them, it is desirable to encode such symmetries in the neural operator architecture for better performance and easier learning. While encoding symmetries in the physical domain using group theory has been studied extensively, how to capture symmetries in the frequency domain is under-explored. In this work, we extend group convolutions to the frequency domain and design Fourier layers that are equivariant to rotations, translations, and reflections by leveraging the equivariance property of the Fourier transform. The resulting G-FNO architecture generalizes well across input resolutions and performs well in settings with varying levels of symmetry. Our code is publicly available as part of the AIRS library (https://github.com/divelab/AIRS).
A priori compression of convolutional neural networks for wave simulators
Convolutional neural networks are now seeing widespread use in a variety of fields, including image classification, facial and object recognition, medical imaging analysis, and many more. In addition, there are applications such as physics-informed simulators in which accurate forecasts in real time with a minimal lag are required. The present neural network designs include millions of parameters, which makes it difficult to install such complex models on devices that have limited memory. Compression techniques might be able to resolve these issues by decreasing the size of CNN models that are created by reducing the number of parameters that contribute to the complexity of the models. We propose a compressed tensor format of convolutional layer, a priori, before the training of the neural network. 3-way kernels or 2-way kernels in convolutional layers are replaced by one-way fiters. The overfitting phenomena will be reduced also. The time needed to make predictions or time required for training using the original Convolutional Neural Networks model would be cut significantly if there were fewer parameters to deal with. In this paper we present a method of a priori compressing convolutional neural networks for finite element (FE) predictions of physical data. Afterwards we validate our a priori compressed models on physical data from a FE model solving a 2D wave equation. We show that the proposed convolutinal compression technique achieves equivalent performance as classical convolutional layers with fewer trainable parameters and lower memory footprint.
Multiphysics Bench: Benchmarking and Investigating Scientific Machine Learning for Multiphysics PDEs
Solving partial differential equations (PDEs) with machine learning has recently attracted great attention, as PDEs are fundamental tools for modeling real-world systems that range from fundamental physical science to advanced engineering disciplines. Most real-world physical systems across various disciplines are actually involved in multiple coupled physical fields rather than a single field. However, previous machine learning studies mainly focused on solving single-field problems, but overlooked the importance and characteristics of multiphysics problems in real world. Multiphysics PDEs typically entail multiple strongly coupled variables, thereby introducing additional complexity and challenges, such as inter-field coupling. Both benchmarking and solving multiphysics problems with machine learning remain largely unexamined. To identify and address the emerging challenges in multiphysics problems, we mainly made three contributions in this work. First, we collect the first general multiphysics dataset, the Multiphysics Bench, that focuses on multiphysics PDE solving with machine learning. Multiphysics Bench is also the most comprehensive PDE dataset to date, featuring the broadest range of coupling types, the greatest diversity of PDE formulations, and the largest dataset scale. Second, we conduct the first systematic investigation on multiple representative learning-based PDE solvers, such as PINNs, FNO, DeepONet, and DiffusionPDE solvers, on multiphysics problems. Unfortunately, naively applying these existing solvers usually show very poor performance for solving multiphysics. Third, through extensive experiments and discussions, we report multiple insights and a bag of useful tricks for solving multiphysics with machine learning, motivating future directions in the study and simulation of complex, coupled physical systems.
FreNBRDF: A Frequency-Rectified Neural Material Representation
Accurate material modeling is crucial for achieving photorealistic rendering, bridging the gap between computer-generated imagery and real-world photographs. While traditional approaches rely on tabulated BRDF data, recent work has shifted towards implicit neural representations, which offer compact and flexible frameworks for a range of tasks. However, their behavior in the frequency domain remains poorly understood. To address this, we introduce FreNBRDF, a frequency-rectified neural material representation. By leveraging spherical harmonics, we integrate frequency-domain considerations into neural BRDF modeling. We propose a novel frequency-rectified loss, derived from a frequency analysis of neural materials, and incorporate it into a generalizable and adaptive reconstruction and editing pipeline. This framework enhances fidelity, adaptability, and efficiency. Extensive experiments demonstrate that \ours improves the accuracy and robustness of material appearance reconstruction and editing compared to state-of-the-art baselines, enabling more structured and interpretable downstream tasks and applications.
Flying with Photons: Rendering Novel Views of Propagating Light
We present an imaging and neural rendering technique that seeks to synthesize videos of light propagating through a scene from novel, moving camera viewpoints. Our approach relies on a new ultrafast imaging setup to capture a first-of-its kind, multi-viewpoint video dataset with picosecond-level temporal resolution. Combined with this dataset, we introduce an efficient neural volume rendering framework based on the transient field. This field is defined as a mapping from a 3D point and 2D direction to a high-dimensional, discrete-time signal that represents time-varying radiance at ultrafast timescales. Rendering with transient fields naturally accounts for effects due to the finite speed of light, including viewpoint-dependent appearance changes caused by light propagation delays to the camera. We render a range of complex effects, including scattering, specular reflection, refraction, and diffraction. Additionally, we demonstrate removing viewpoint-dependent propagation delays using a time warping procedure, rendering of relativistic effects, and video synthesis of direct and global components of light transport.
Neural Spectral Methods: Self-supervised learning in the spectral domain
We present Neural Spectral Methods, a technique to solve parametric Partial Differential Equations (PDEs), grounded in classical spectral methods. Our method uses orthogonal bases to learn PDE solutions as mappings between spectral coefficients. In contrast to current machine learning approaches which enforce PDE constraints by minimizing the numerical quadrature of the residuals in the spatiotemporal domain, we leverage Parseval's identity and introduce a new training strategy through a spectral loss. Our spectral loss enables more efficient differentiation through the neural network, and substantially reduces training complexity. At inference time, the computational cost of our method remains constant, regardless of the spatiotemporal resolution of the domain. Our experimental results demonstrate that our method significantly outperforms previous machine learning approaches in terms of speed and accuracy by one to two orders of magnitude on multiple different problems. When compared to numerical solvers of the same accuracy, our method demonstrates a 10times increase in performance speed.
Toward a Better Understanding of Fourier Neural Operators: Analysis and Improvement from a Spectral Perspective
In solving partial differential equations (PDEs), Fourier Neural Operators (FNOs) have exhibited notable effectiveness compared to Convolutional Neural Networks (CNNs). This paper presents clear empirical evidence through spectral analysis to elucidate the superiority of FNO over CNNs: FNO is significantly more capable of learning low-frequencies. This empirical evidence also unveils FNO's distinct low-frequency bias, which limits FNO's effectiveness in learning high-frequency information from PDE data. To tackle this challenge, we introduce SpecBoost, an ensemble learning framework that employs multiple FNOs to better capture high-frequency information. Specifically, a secondary FNO is utilized to learn the overlooked high-frequency information from the prediction residual of the initial FNO. Experiments demonstrate that SpecBoost noticeably enhances FNO's prediction accuracy on diverse PDE applications, achieving an up to 71% improvement.
Doppler Invariant Demodulation for Shallow Water Acoustic Communications Using Deep Belief Networks
Shallow water environments create a challenging channel for communications. In this paper, we focus on the challenges posed by the frequency-selective signal distortion called the Doppler effect. We explore the design and performance of machine learning (ML) based demodulation methods --- (1) Deep Belief Network-feed forward Neural Network (DBN-NN) and (2) Deep Belief Network-Convolutional Neural Network (DBN-CNN) in the physical layer of Shallow Water Acoustic Communication (SWAC). The proposed method comprises of a ML based feature extraction method and classification technique. First, the feature extraction converts the received signals to feature images. Next, the classification model correlates the images to a corresponding binary representative. An analysis of the ML based proposed demodulation shows that despite the presence of instantaneous frequencies, the performance of the algorithm shows an invariance with a small 2dB error margin in terms of bit error rate (BER).
PREF: Phasorial Embedding Fields for Compact Neural Representations
We present an efficient frequency-based neural representation termed PREF: a shallow MLP augmented with a phasor volume that covers significant border spectra than previous Fourier feature mapping or Positional Encoding. At the core is our compact 3D phasor volume where frequencies distribute uniformly along a 2D plane and dilate along a 1D axis. To this end, we develop a tailored and efficient Fourier transform that combines both Fast Fourier transform and local interpolation to accelerate na\"ive Fourier mapping. We also introduce a Parsvel regularizer that stables frequency-based learning. In these ways, Our PREF reduces the costly MLP in the frequency-based representation, thereby significantly closing the efficiency gap between it and other hybrid representations, and improving its interpretability. Comprehensive experiments demonstrate that our PREF is able to capture high-frequency details while remaining compact and robust, including 2D image generalization, 3D signed distance function regression and 5D neural radiance field reconstruction.
Uncertainty quantification for stationary and time-dependent PDEs subject to Gevrey regular random domain deformations
We study uncertainty quantification for partial differential equations subject to domain uncertainty. We parameterize the random domain using the model recently considered by Chernov and Le (2024) as well as Harbrecht, Schmidlin, and Schwab (2024) in which the input random field is assumed to belong to a Gevrey smoothness class. This approach has the advantage of being substantially more general than models which assume a particular parametric representation of the input random field such as a Karhunen-Loeve series expansion. We consider both the Poisson equation as well as the heat equation and design randomly shifted lattice quasi-Monte Carlo (QMC) cubature rules for the computation of the expected solution under domain uncertainty. We show that these QMC rules exhibit dimension-independent, essentially linear cubature convergence rates in this framework. In addition, we complete the error analysis by taking into account the approximation errors incurred by dimension truncation of the random input field and finite element discretization. Numerical experiments are presented to confirm the theoretical rates.
A Neural PDE Solver with Temporal Stencil Modeling
Numerical simulation of non-linear partial differential equations plays a crucial role in modeling physical science and engineering phenomena, such as weather, climate, and aerodynamics. Recent Machine Learning (ML) models trained on low-resolution spatio-temporal signals have shown new promises in capturing important dynamics in high-resolution signals, under the condition that the models can effectively recover the missing details. However, this study shows that significant information is often lost in the low-resolution down-sampled features. To address such issues, we propose a new approach, namely Temporal Stencil Modeling (TSM), which combines the strengths of advanced time-series sequence modeling (with the HiPPO features) and state-of-the-art neural PDE solvers (with learnable stencil modeling). TSM aims to recover the lost information from the PDE trajectories and can be regarded as a temporal generalization of classic finite volume methods such as WENO. Our experimental results show that TSM achieves the new state-of-the-art simulation accuracy for 2-D incompressible Navier-Stokes turbulent flows: it significantly outperforms the previously reported best results by 19.9% in terms of the highly-correlated duration time and reduces the inference latency into 80%. We also show a strong generalization ability of the proposed method to various out-of-distribution turbulent flow settings. Our code is available at "https://github.com/Edward-Sun/TSM-PDE".
FACL-Attack: Frequency-Aware Contrastive Learning for Transferable Adversarial Attacks
Deep neural networks are known to be vulnerable to security risks due to the inherent transferable nature of adversarial examples. Despite the success of recent generative model-based attacks demonstrating strong transferability, it still remains a challenge to design an efficient attack strategy in a real-world strict black-box setting, where both the target domain and model architectures are unknown. In this paper, we seek to explore a feature contrastive approach in the frequency domain to generate adversarial examples that are robust in both cross-domain and cross-model settings. With that goal in mind, we propose two modules that are only employed during the training phase: a Frequency-Aware Domain Randomization (FADR) module to randomize domain-variant low- and high-range frequency components and a Frequency-Augmented Contrastive Learning (FACL) module to effectively separate domain-invariant mid-frequency features of clean and perturbed image. We demonstrate strong transferability of our generated adversarial perturbations through extensive cross-domain and cross-model experiments, while keeping the inference time complexity.
Modeling Temperature, Frequency, and Strain Effects on the Linear Electro-Optic Coefficients of Ferroelectric Oxides
An electro-optic modulator offers the function of modulating the propagation of light in a material with electric field and enables seamless connection between electronics-based computing and photonics-based communication. The search for materials with large electro-optic coefficients and low optical loss is critical to increase the efficiency and minimize the size of electro-optic devices. We present a semi-empirical method to compute the electro-optic coefficients of ferroelectric materials by combining first-principles density-functional theory calculations with Landau-Devonshire phenomenological modeling. We apply the method to study the electro-optic constants, also called Pockels coefficients, of three paradigmatic ferroelectric oxides: BaTiO3, LiNbO3, and LiTaO3. We present their temperature-, frequency- and strain-dependent electro-optic tensors calculated using our method. The predicted electro-optic constants agree with the experimental results, where available, and provide benchmarks for experimental verification.
CFDBench: A Large-Scale Benchmark for Machine Learning Methods in Fluid Dynamics
In recent years, applying deep learning to solve physics problems has attracted much attention. Data-driven deep learning methods produce fast numerical operators that can learn approximate solutions to the whole system of partial differential equations (i.e., surrogate modeling). Although these neural networks may have lower accuracy than traditional numerical methods, they, once trained, are orders of magnitude faster at inference. Hence, one crucial feature is that these operators can generalize to unseen PDE parameters without expensive re-training.In this paper, we construct CFDBench, a benchmark tailored for evaluating the generalization ability of neural operators after training in computational fluid dynamics (CFD) problems. It features four classic CFD problems: lid-driven cavity flow, laminar boundary layer flow in circular tubes, dam flows through the steps, and periodic Karman vortex street. The data contains a total of 302K frames of velocity and pressure fields, involving 739 cases with different operating condition parameters, generated with numerical methods. We evaluate the effectiveness of popular neural operators including feed-forward networks, DeepONet, FNO, U-Net, etc. on CFDBnech by predicting flows with non-periodic boundary conditions, fluid properties, and flow domain shapes that are not seen during training. Appropriate modifications were made to apply popular deep neural networks to CFDBench and enable the accommodation of more changing inputs. Empirical results on CFDBench show many baseline models have errors as high as 300% in some problems, and severe error accumulation when performing autoregressive inference. CFDBench facilitates a more comprehensive comparison between different neural operators for CFD compared to existing benchmarks.
Comparing Time and Frequency Domain for Audio Event Recognition Using Deep Learning
Recognizing acoustic events is an intricate problem for a machine and an emerging field of research. Deep neural networks achieve convincing results and are currently the state-of-the-art approach for many tasks. One advantage is their implicit feature learning, opposite to an explicit feature extraction of the input signal. In this work, we analyzed whether more discriminative features can be learned from either the time-domain or the frequency-domain representation of the audio signal. For this purpose, we trained multiple deep networks with different architectures on the Freiburg-106 and ESC-10 datasets. Our results show that feature learning from the frequency domain is superior to the time domain. Moreover, additionally using convolution and pooling layers, to explore local structures of the audio signal, significantly improves the recognition performance and achieves state-of-the-art results.
Investigation of intrinsic properties of high-quality fiber Fabry--Perot resonators
Fiber Fabry--Perot (FFP) resonators of a few centimeters are optimized as a function of the reflectivity of the mirrors and the dimensions of the intra-cavity waveguide. Loaded quality factor in excess of 10^9, with an optimum of 4___x___10^9, together with an intrinsic quality factor larger than 10^10 and intrinsic finesse in the range of 10^5 have been measured. An application to the stabilization of laser frequency fluctuations is presented.
DiffPace: Diffusion-based Plug-and-play Augmented Channel Estimation in mmWave and Terahertz Ultra-Massive MIMO Systems
Millimeter-wave (mmWave) and Terahertz (THz)-band communications hold great promise in meeting the growing data-rate demands of next-generation wireless networks, offering abundant bandwidth. To mitigate the severe path loss inherent to these high frequencies and reduce hardware costs, ultra-massive multiple-input multiple-output (UM-MIMO) systems with hybrid beamforming architectures can deliver substantial beamforming gains and enhanced spectral efficiency. However, accurate channel estimation (CE) in mmWave and THz UM-MIMO systems is challenging due to high channel dimensionality and compressed observations from a limited number of RF chains, while the hybrid near- and far-field radiation patterns, arising from large array apertures and high carrier frequencies, further complicate CE. Conventional compressive sensing based frameworks rely on predefined sparsifying matrices, which cannot faithfully capture the hybrid near-field and far-field channel structures, leading to degraded estimation performance. This paper introduces DiffPace, a diffusion-based plug-and-play method for channel estimation. DiffPace uses a diffusion model (DM) to capture the channel distribution based on the hybrid spherical and planar-wave (HPSM) model. By applying the plug-and-play approach, it leverages the DM as prior knowledge, improving CE accuracy. Moreover, DM performs inference by solving an ordinary differential equation, minimizing the number of required inference steps compared with stochastic sampling method. Experimental results show that DiffPace achieves competitive CE performance, attaining -15 dB normalized mean square error (NMSE) at a signal-to-noise ratio (SNR) of 10 dB, with 90\% fewer inference steps compared to state-of-the-art schemes, simultaneously providing high estimation precision and enhanced computational efficiency.
Complex-valued neural networks for machine learning on non-stationary physical data
Deep learning has become an area of interest in most scientific areas, including physical sciences. Modern networks apply real-valued transformations on the data. Particularly, convolutions in convolutional neural networks discard phase information entirely. Many deterministic signals, such as seismic data or electrical signals, contain significant information in the phase of the signal. We explore complex-valued deep convolutional networks to leverage non-linear feature maps. Seismic data commonly has a lowcut filter applied, to attenuate noise from ocean waves and similar long wavelength contributions. Discarding the phase information leads to low-frequency aliasing analogous to the Nyquist-Shannon theorem for high frequencies. In non-stationary data, the phase content can stabilize training and improve the generalizability of neural networks. While it has been shown that phase content can be restored in deep neural networks, we show how including phase information in feature maps improves both training and inference from deterministic physical data. Furthermore, we show that the reduction of parameters in a complex network outperforms larger real-valued networks.
Learnable Adaptive Time-Frequency Representation via Differentiable Short-Time Fourier Transform
The short-time Fourier transform (STFT) is widely used for analyzing non-stationary signals. However, its performance is highly sensitive to its parameters, and manual or heuristic tuning often yields suboptimal results. To overcome this limitation, we propose a unified differentiable formulation of the STFT that enables gradient-based optimization of its parameters. This approach addresses the limitations of traditional STFT parameter tuning methods, which often rely on computationally intensive discrete searches. It enables fine-tuning of the time-frequency representation (TFR) based on any desired criterion. Moreover, our approach integrates seamlessly with neural networks, allowing joint optimization of the STFT parameters and network weights. The efficacy of the proposed differentiable STFT in enhancing TFRs and improving performance in downstream tasks is demonstrated through experiments on both simulated and real-world data.
A domain splitting strategy for solving PDEs
In this work we develop a novel domain splitting strategy for the solution of partial differential equations. Focusing on a uniform discretization of the d-dimensional advection-diffusion equation, our proposal is a two-level algorithm that merges the solutions obtained from the discretization of the equation over highly anisotropic submeshes to compute an initial approximation of the fine solution. The algorithm then iteratively refines the initial guess by leveraging the structure of the residual. Performing costly calculations on anisotropic submeshes enable us to reduce the dimensionality of the problem by one, and the merging process, which involves the computation of solutions over disjoint domains, allows for parallel implementation.
Analytic Approximation of Free-Space Path Loss for Implanted Antennas
Implantable wireless bioelectronic devices enable communication and/or power transfer through RF wireless connections with external nodes. These devices encounter notable design challenges due to the lossy nature of the host body, which significantly diminishes the radiation efficiency of the implanted antenna and tightens the wireless link budget. Prior research has yielded closed-form approximate expressions for estimating losses occurring within the lossy host body, known as the in-body path loss. To assess the total path loss between the implanted transmitter and external receiver, this paper focuses on the free-space path loss of the implanted antenna, from the body-air interface to the external node. This is not trivial, as in addition to the inherent radial spreading of spherical electromagnetic waves common to all antennas, implanted antennas confront additional losses arising from electromagnetic scattering at the interface between the host body and air. Employing analytical modeling, we propose closed-form approximate expressions for estimating this free-space path loss. The approximation is formulated as a function of the free-space distance, the curvature radius of the body-air interface, the depth of the implanted antenna, and the permittivity of the lossy medium. This proposed method undergoes thorough validation through numerical calculations, simulations, and measurements for different implanted antenna scenarios. This study contributes to a comprehensive understanding of the path loss in implanted antennas and provides a reliable analytical framework for their efficient design and performance evaluation.
A Neural Operator based on Dynamic Mode Decomposition
The scientific computation methods development in conjunction with artificial intelligence technologies remains a hot research topic. Finding a balance between lightweight and accurate computations is a solid foundation for this direction. The study presents a neural operator based on the dynamic mode decomposition algorithm (DMD), mapping functional spaces, which combines DMD and deep learning (DL) for spatiotemporal processes efficient modeling. Solving PDEs for various initial and boundary conditions requires significant computational resources. The method suggested automatically extracts key modes and system dynamics using them to construct predictions, reducing computational costs compared to traditional numerical methods. The approach has demonstrated its efficiency through comparative analysis of performance with closest analogues DeepONet and FNO in the heat equation, Laplaces equation, and Burgers equation solutions approximation, where it achieves high reconstruction accuracy.
PROSE-FD: A Multimodal PDE Foundation Model for Learning Multiple Operators for Forecasting Fluid Dynamics
We propose PROSE-FD, a zero-shot multimodal PDE foundational model for simultaneous prediction of heterogeneous two-dimensional physical systems related to distinct fluid dynamics settings. These systems include shallow water equations and the Navier-Stokes equations with incompressible and compressible flow, regular and complex geometries, and different buoyancy settings. This work presents a new transformer-based multi-operator learning approach that fuses symbolic information to perform operator-based data prediction, i.e. non-autoregressive. By incorporating multiple modalities in the inputs, the PDE foundation model builds in a pathway for including mathematical descriptions of the physical behavior. We pre-train our foundation model on 6 parametric families of equations collected from 13 datasets, including over 60K trajectories. Our model outperforms popular operator learning, computer vision, and multi-physics models, in benchmark forward prediction tasks. We test our architecture choices with ablation studies.
RadioDiff-3D: A 3Dtimes3D Radio Map Dataset and Generative Diffusion Based Benchmark for 6G Environment-Aware Communication
Radio maps (RMs) serve as a critical foundation for enabling environment-aware wireless communication, as they provide the spatial distribution of wireless channel characteristics. Despite recent progress in RM construction using data-driven approaches, most existing methods focus solely on pathloss prediction in a fixed 2D plane, neglecting key parameters such as direction of arrival (DoA), time of arrival (ToA), and vertical spatial variations. Such a limitation is primarily due to the reliance on static learning paradigms, which hinder generalization beyond the training data distribution. To address these challenges, we propose UrbanRadio3D, a large-scale, high-resolution 3D RM dataset constructed via ray tracing in realistic urban environments. UrbanRadio3D is over 37times3 larger than previous datasets across a 3D space with 3 metrics as pathloss, DoA, and ToA, forming a novel 3Dtimes33D dataset with 7times3 more height layers than prior state-of-the-art (SOTA) dataset. To benchmark 3D RM construction, a UNet with 3D convolutional operators is proposed. Moreover, we further introduce RadioDiff-3D, a diffusion-model-based generative framework utilizing the 3D convolutional architecture. RadioDiff-3D supports both radiation-aware scenarios with known transmitter locations and radiation-unaware settings based on sparse spatial observations. Extensive evaluations on UrbanRadio3D validate that RadioDiff-3D achieves superior performance in constructing rich, high-dimensional radio maps under diverse environmental dynamics. This work provides a foundational dataset and benchmark for future research in 3D environment-aware communication. The dataset is available at https://github.com/UNIC-Lab/UrbanRadio3D.
Solitons near avoided mode crossing in χ^{(2)} nanowaveguides
We present a model for chi^{(2)} waveguides accounting for three modes, two of which make an avoided crossing at the second harmonic wavelength. We introduce two linearly coupled pure modes and adjust the coupling to replicate the waveguide dispersion near the avoided crossing. Analysis of the nonlinear system reveals continuous wave (CW) solutions across much of the parameter-space and prevalence of its modulational instability. We also predict the existence of the avoided-crossing solitons, and study peculiarities of their dynamics and spectral properties, which include formation of a pedestal in the pulse tails and associated pronounced spectral peaks. Mapping these solitons onto the linear dispersion diagrams, we make connections between their existence and CW existence and stability. We also simulate the two-color soliton generation from a single frequency pump pulse to back up its formation and stability properties.
1d-qt-ideal-solver: 1D Idealized Quantum Tunneling Solver with Absorbing Boundaries
We present 1d-qt-ideal-solver, an open-source Python library for simulating one-dimensional quantum tunneling dynamics under idealized coherent conditions. The solver implements the split-operator method with second-order Trotter-Suzuki factorization, utilizing FFT-based spectral differentiation for the kinetic operator and complex absorbing potentials to eliminate boundary reflections. Numba just-in-time compilation achieves performance comparable to compiled languages while maintaining code accessibility. We validate the implementation through two canonical test cases: rectangular barriers modeling field emission through oxide layers and Gaussian barriers approximating scanning tunneling microscopy interactions. Both simulations achieve exceptional numerical fidelity with machine-precision energy conservation over femtosecond-scale propagation. Comparative analysis employing information-theoretic measures and nonparametric hypothesis tests reveals that rectangular barriers exhibit moderately higher transmission coefficients than Gaussian barriers in the over-barrier regime, though Jensen-Shannon divergence analysis indicates modest practical differences between geometries. Phase space analysis confirms complete decoherence when averaged over spatial-temporal domains. The library name reflects its scope: idealized signifies deliberate exclusion of dissipation, environmental coupling, and many-body interactions, limiting applicability to qualitative insights and pedagogical purposes rather than quantitative experimental predictions. Distributed under the MIT License, the library provides a deployable tool for teaching quantum mechanics and preliminary exploration of tunneling dynamics.
A Deep Conjugate Direction Method for Iteratively Solving Linear Systems
We present a novel deep learning approach to approximate the solution of large, sparse, symmetric, positive-definite linear systems of equations. These systems arise from many problems in applied science, e.g., in numerical methods for partial differential equations. Algorithms for approximating the solution to these systems are often the bottleneck in problems that require their solution, particularly for modern applications that require many millions of unknowns. Indeed, numerical linear algebra techniques have been investigated for many decades to alleviate this computational burden. Recently, data-driven techniques have also shown promise for these problems. Motivated by the conjugate gradients algorithm that iteratively selects search directions for minimizing the matrix norm of the approximation error, we design an approach that utilizes a deep neural network to accelerate convergence via data-driven improvement of the search directions. Our method leverages a carefully chosen convolutional network to approximate the action of the inverse of the linear operator up to an arbitrary constant. We train the network using unsupervised learning with a loss function equal to the L^2 difference between an input and the system matrix times the network evaluation, where the unspecified constant in the approximate inverse is accounted for. We demonstrate the efficacy of our approach on spatially discretized Poisson equations with millions of degrees of freedom arising in computational fluid dynamics applications. Unlike state-of-the-art learning approaches, our algorithm is capable of reducing the linear system residual to a given tolerance in a small number of iterations, independent of the problem size. Moreover, our method generalizes effectively to various systems beyond those encountered during training.
simple-idealized-1d-nlse: Pseudo-Spectral Solver for the 1D Nonlinear Schrödinger Equation
We present an open-source Python implementation of an idealized high-order pseudo-spectral solver for the one-dimensional nonlinear Schr\"odinger equation (NLSE). The solver combines Fourier spectral spatial discretization with an adaptive eighth-order Dormand-Prince time integration scheme to achieve machine-precision conservation of mass and near-perfect preservation of momentum and energy for smooth solutions. The implementation accurately reproduces fundamental NLSE phenomena including soliton collisions with analytically predicted phase shifts, Akhmediev breather dynamics, and the development of modulation instability from noisy initial conditions. Four canonical test cases validate the numerical scheme: single soliton propagation, two-soliton elastic collision, breather evolution, and noise-seeded modulation instability. The solver employs a 2/3 dealiasing rule with exponential filtering to prevent aliasing errors from the cubic nonlinearity. Statistical analysis using Shannon, R\'enyi, and Tsallis entropies quantifies the spatio-temporal complexity of solutions, while phase space representations reveal the underlying coherence structure. The implementation prioritizes code transparency and educational accessibility over computational performance, providing a valuable pedagogical tool for exploring nonlinear wave dynamics. Complete source code, documentation, and example configurations are freely available, enabling reproducible computational experiments across diverse physical contexts where the NLSE governs wave evolution, including nonlinear optics, Bose-Einstein condensates, and ocean surface waves.
Using a Metasurface to Enhance the Radiation Efficiency of Subterahertz Antennas Printed on Thick Substrates
This study investigates the possibility of increasing the radiation efficiency of printed antennas and arrays by suppressing their inherent surface waves using a metasurface made of quad-split rings (QSR). A symmetrical resonant microstrip dipole and a four-element series-fed dipole array printed on an infinite grounded dielectric layer (layer thickness: 0.2 mm; relative permittivity: 9.4; tan delta: 0.0005) were simulated with FEKO 2022 software. Conducted at 100-116 GHz, the numerical results revealed extremely low radiation efficiencies of approximately 31% and 40% for the studied dipole and dipole array, respectively, which resulted from the presence of surface waves in the dielectric. However, placing only one QSR near each dipole arm triggered an increase in radiation efficiency by 2.5 times (up to 75%). The use of a metasurface in the form of two small QSR arrays triggered a pronounced improvement in radiation efficiency, reaching 93.6% and 96.5% for the studied dipole and dipole array, respectively. Analysis of the electric field distribution images showed that this enhancement resulted from surface wave suppression.
Simulation of integrated nonlinear quantum optics: from nonlinear interferometer to temporal walk-off compensator
Nonlinear quantum photonics serves as a cornerstone in photonic quantum technologies, such as universal quantum computing and quantum communications. The emergence of integrated photonics platform not only offers the advantage of large-scale manufacturing but also provides a variety of engineering methods. Given the complexity of integrated photonics engineering, a comprehensive simulation framework is essential to fully harness the potential of the platform. In this context, we introduce a nonlinear quantum photonics simulation framework which can accurately model a variety of features such as adiabatic waveguide, material anisotropy, linear optics components, photon losses, and detectors. Furthermore, utilizing the framework, we have developed a device scheme, chip-scale temporal walk-off compensation, that is useful for various quantum information processing tasks. Applying the simulation framework, we show that the proposed device scheme can enhance the squeezing parameter of photon-pair sources and the conversion efficiency of quantum frequency converters without relying on higher pump power.
Analyzing black-hole ringdowns II: data conditioning
Time series data from observations of black hole ringdown gravitational waves are often analyzed in the time domain by using damped sinusoid models with acyclic boundary conditions. Data conditioning operations, including downsampling, filtering, and the choice of data segment duration, reduce the computational cost of such analyses and can improve numerical stability. Here we analyze simulated damped sinsuoid signals to illustrate how data conditioning operations, if not carefully applied, can undesirably alter the analysis' posterior distributions. We discuss how currently implemented downsampling and filtering methods, if applied too aggressively, can introduce systematic errors and skew tests of general relativity. These issues arise because current downsampling and filtering methods do not operate identically on the data and model. Alternative downsampling and filtering methods which identically operate on the data and model may be achievable, but we argue that the current operations can still be implemented safely. We also show that our preferred anti-alias filtering technique, which has an instantaneous frequency-domain response at its roll-off frequency, preserves the structure of posterior distributions better than other commonly used filters with transient frequency-domain responses. Lastly, we highlight that exceptionally long data segments may need to be analyzed in cases where thin lines in the noise power spectral density overlap with central signal frequencies. Our findings may be broadly applicable to any analysis of truncated time domain data with acyclic boundary conditions.
On the generation of periodic discrete structures with identical two-point correlation
Strategies for the generation of periodic discrete structures with identical two-point correlation are developed. Starting from a pair of root structures, which are not related by translation, phase inversion or axis reflections, child structures of arbitrary resolution (i.e., pixel or voxel numbers) and number of phases (i.e., material phases/species) can be generated by means of trivial embedding based phase extension, application of kernels and/or phase coalescence, such that the generated structures inherit the two-point-correlation equivalence. Proofs of the inheritance property are provided by means of the Discrete Fourier Transform theory. A Python 3 implementation of the results is offered by the authors through the Github repository https://github.com/DataAnalyticsEngineering/EQ2PC in order to make the provided results reproducible and useful for all interested readers. Examples for the generation of structures are demonstrated, together with applications in the homogenization theory of periodic media.
Uncertainty quantification in a mechanical submodel driven by a Wasserstein-GAN
The analysis of parametric and non-parametric uncertainties of very large dynamical systems requires the construction of a stochastic model of said system. Linear approaches relying on random matrix theory and principal componant analysis can be used when systems undergo low-frequency vibrations. In the case of fast dynamics and wave propagation, we investigate a random generator of boundary conditions for fast submodels by using machine learning. We show that the use of non-linear techniques in machine learning and data-driven methods is highly relevant. Physics-informed neural networks is a possible choice for a data-driven method to replace linear modal analysis. An architecture that support a random component is necessary for the construction of the stochastic model of the physical system for non-parametric uncertainties, since the goal is to learn the underlying probabilistic distribution of uncertainty in the data. Generative Adversarial Networks (GANs) are suited for such applications, where the Wasserstein-GAN with gradient penalty variant offers improved convergence results for our problem. The objective of our approach is to train a GAN on data from a finite element method code (Fenics) so as to extract stochastic boundary conditions for faster finite element predictions on a submodel. The submodel and the training data have both the same geometrical support. It is a zone of interest for uncertainty quantification and relevant to engineering purposes. In the exploitation phase, the framework can be viewed as a randomized and parametrized simulation generator on the submodel, which can be used as a Monte Carlo estimator.
DC is all you need: describing ReLU from a signal processing standpoint
Non-linear activation functions are crucial in Convolutional Neural Networks. However, until now they have not been well described in the frequency domain. In this work, we study the spectral behavior of ReLU, a popular activation function. We use the ReLU's Taylor expansion to derive its frequency domain behavior. We demonstrate that ReLU introduces higher frequency oscillations in the signal and a constant DC component. Furthermore, we investigate the importance of this DC component, where we demonstrate that it helps the model extract meaningful features related to the input frequency content. We accompany our theoretical derivations with experiments and real-world examples. First, we numerically validate our frequency response model. Then we observe ReLU's spectral behavior on two example models and a real-world one. Finally, we experimentally investigate the role of the DC component introduced by ReLU in the CNN's representations. Our results indicate that the DC helps to converge to a weight configuration that is close to the initial random weights.
SineNet: Learning Temporal Dynamics in Time-Dependent Partial Differential Equations
We consider using deep neural networks to solve time-dependent partial differential equations (PDEs), where multi-scale processing is crucial for modeling complex, time-evolving dynamics. While the U-Net architecture with skip connections is commonly used by prior studies to enable multi-scale processing, our analysis shows that the need for features to evolve across layers results in temporally misaligned features in skip connections, which limits the model's performance. To address this limitation, we propose SineNet, consisting of multiple sequentially connected U-shaped network blocks, referred to as waves. In SineNet, high-resolution features are evolved progressively through multiple stages, thereby reducing the amount of misalignment within each stage. We furthermore analyze the role of skip connections in enabling both parallel and sequential processing of multi-scale information. Our method is rigorously tested on multiple PDE datasets, including the Navier-Stokes equations and shallow water equations, showcasing the advantages of our proposed approach over conventional U-Nets with a comparable parameter budget. We further demonstrate that increasing the number of waves in SineNet while maintaining the same number of parameters leads to a monotonically improved performance. The results highlight the effectiveness of SineNet and the potential of our approach in advancing the state-of-the-art in neural PDE solver design. Our code is available as part of AIRS (https://github.com/divelab/AIRS).
Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning
This research addresses the challenge of developing a universal deepfake detector that can effectively identify unseen deepfake images despite limited training data. Existing frequency-based paradigms have relied on frequency-level artifacts introduced during the up-sampling in GAN pipelines to detect forgeries. However, the rapid advancements in synthesis technology have led to specific artifacts for each generation model. Consequently, these detectors have exhibited a lack of proficiency in learning the frequency domain and tend to overfit to the artifacts present in the training data, leading to suboptimal performance on unseen sources. To address this issue, we introduce a novel frequency-aware approach called FreqNet, centered around frequency domain learning, specifically designed to enhance the generalizability of deepfake detectors. Our method forces the detector to continuously focus on high-frequency information, exploiting high-frequency representation of features across spatial and channel dimensions. Additionally, we incorporate a straightforward frequency domain learning module to learn source-agnostic features. It involves convolutional layers applied to both the phase spectrum and amplitude spectrum between the Fast Fourier Transform (FFT) and Inverse Fast Fourier Transform (iFFT). Extensive experimentation involving 17 GANs demonstrates the effectiveness of our proposed method, showcasing state-of-the-art performance (+9.8\%) while requiring fewer parameters. The code is available at {\cred https://github.com/chuangchuangtan/FreqNet-DeepfakeDetection}.
Adversarial Generation of Time-Frequency Features with application in audio synthesis
Time-frequency (TF) representations provide powerful and intuitive features for the analysis of time series such as audio. But still, generative modeling of audio in the TF domain is a subtle matter. Consequently, neural audio synthesis widely relies on directly modeling the waveform and previous attempts at unconditionally synthesizing audio from neurally generated invertible TF features still struggle to produce audio at satisfying quality. In this article, focusing on the short-time Fourier transform, we discuss the challenges that arise in audio synthesis based on generated invertible TF features and how to overcome them. We demonstrate the potential of deliberate generative TF modeling by training a generative adversarial network (GAN) on short-time Fourier features. We show that by applying our guidelines, our TF-based network was able to outperform a state-of-the-art GAN generating waveforms directly, despite the similar architecture in the two networks.
On feasibility of extrapolation of the complex electromagnetic permittivity function using Kramer-Kronig relations
We study the degree of reliability of extrapolation of complex electromagnetic permittivity functions based on their analyticity properties. Given two analytic functions, representing extrapolants of the same experimental data, we examine how much they can differ at an extrapolation point outside of the experimentally accessible frequency band. We give a sharp upper bound on the worst case extrapolation error, in terms of a solution of an integral equation of Fredholm type. We conjecture and give numerical evidence that this bound exhibits a power law precision deterioration as one moves further away from the frequency band containing measurement data.
Neural Operators with Localized Integral and Differential Kernels
Neural operators learn mappings between function spaces, which is practical for learning solution operators of PDEs and other scientific modeling applications. Among them, the Fourier neural operator (FNO) is a popular architecture that performs global convolutions in the Fourier space. However, such global operations are often prone to over-smoothing and may fail to capture local details. In contrast, convolutional neural networks (CNN) can capture local features but are limited to training and inference at a single resolution. In this work, we present a principled approach to operator learning that can capture local features under two frameworks by learning differential operators and integral operators with locally supported kernels. Specifically, inspired by stencil methods, we prove that we obtain differential operators under an appropriate scaling of the kernel values of CNNs. To obtain local integral operators, we utilize suitable basis representations for the kernels based on discrete-continuous convolutions. Both these approaches preserve the properties of operator learning and, hence, the ability to predict at any resolution. Adding our layers to FNOs significantly improves their performance, reducing the relative L2-error by 34-72% in our experiments, which include a turbulent 2D Navier-Stokes and the spherical shallow water equations.
Implicit Neural Spatial Representations for Time-dependent PDEs
Implicit Neural Spatial Representation (INSR) has emerged as an effective representation of spatially-dependent vector fields. This work explores solving time-dependent PDEs with INSR. Classical PDE solvers introduce both temporal and spatial discretizations. Common spatial discretizations include meshes and meshless point clouds, where each degree-of-freedom corresponds to a location in space. While these explicit spatial correspondences are intuitive to model and understand, these representations are not necessarily optimal for accuracy, memory usage, or adaptivity. Keeping the classical temporal discretization unchanged (e.g., explicit/implicit Euler), we explore INSR as an alternative spatial discretization, where spatial information is implicitly stored in the neural network weights. The network weights then evolve over time via time integration. Our approach does not require any training data generated by existing solvers because our approach is the solver itself. We validate our approach on various PDEs with examples involving large elastic deformations, turbulent fluids, and multi-scale phenomena. While slower to compute than traditional representations, our approach exhibits higher accuracy and lower memory consumption. Whereas classical solvers can dynamically adapt their spatial representation only by resorting to complex remeshing algorithms, our INSR approach is intrinsically adaptive. By tapping into the rich literature of classic time integrators, e.g., operator-splitting schemes, our method enables challenging simulations in contact mechanics and turbulent flows where previous neural-physics approaches struggle. Videos and codes are available on the project page: http://www.cs.columbia.edu/cg/INSR-PDE/
Text2PDE: Latent Diffusion Models for Accessible Physics Simulation
Recent advances in deep learning have inspired numerous works on data-driven solutions to partial differential equation (PDE) problems. These neural PDE solvers can often be much faster than their numerical counterparts; however, each presents its unique limitations and generally balances training cost, numerical accuracy, and ease of applicability to different problem setups. To address these limitations, we introduce several methods to apply latent diffusion models to physics simulation. Firstly, we introduce a mesh autoencoder to compress arbitrarily discretized PDE data, allowing for efficient diffusion training across various physics. Furthermore, we investigate full spatio-temporal solution generation to mitigate autoregressive error accumulation. Lastly, we investigate conditioning on initial physical quantities, as well as conditioning solely on a text prompt to introduce text2PDE generation. We show that language can be a compact, interpretable, and accurate modality for generating physics simulations, paving the way for more usable and accessible PDE solvers. Through experiments on both uniform and structured grids, we show that the proposed approach is competitive with current neural PDE solvers in both accuracy and efficiency, with promising scaling behavior up to sim3 billion parameters. By introducing a scalable, accurate, and usable physics simulator, we hope to bring neural PDE solvers closer to practical use.
Optical Properties of Superconducting K_{0.8}Fe_{1.7}(Se_{0.73}S_{0.27})_2 Single Crystals
The optical properties of the superconducting K_{0.8}Fe_{1.7}(Se_{0.73}S_{0.27})_2 single crystals with a critical temperature T_capprox 26 K have been measured in the {\it ab} plane in a wide frequency range using both infrared Fourier-transform spectroscopy and spectroscopic ellipsometry at temperatures of 4--300 K. The normal-state reflectance of K_{0.8}Fe_{1.7}(Se_{0.73}S_{0.27})_2 is analyzed using a Drude-Lorentz model with one Drude component. The temperature dependences of the plasma frequency, optical conductivity, scattering rate, and dc resistivity of the Drude contribution in the normal state are presented. In the superconducting state, we observe a signature of the superconducting gap opening at 2Δ(5~K) = 11.8~meV. An abrupt decrease in the low-frequency dielectric permittivity varepsilon _1(ω) at T < T_c also evidences the formation of the superconducting condensate. The superconducting plasma frequency ω_{pl,s} = (213pm 5)~cm^{-1} and the magnetic penetration depth λ=(7.5pm 0.2)~μm at T=5~K are determined.
Quasinormal modes in two-photon autocorrelation and the geometric-optics approximation
In this work, we study the black hole light echoes in terms of the two-photon autocorrelation and explore their connection with the quasinormal modes. It is shown that the above time-domain phenomenon can be analyzed by utilizing the well-known frequency-domain relations between the quasinormal modes and characteristic parameters of null geodesics. We found that the time-domain correlator, obtained by the inverse Fourier transform, naturally acquires the echo feature, which can be attributed to a collective effect of the asymptotic poles through a weighted summation of the squared modulus of the relevant Green's functions. Specifically, the contour integral leads to a summation taking over both the overtone index and angular momentum. Moreover, the dominant contributions to the light echoes are from those in the eikonal limit, consistent with the existing findings using the geometric-optics arguments. For the Schwarzschild black holes, we demonstrate the results numerically by considering a transient spherical light source. Also, for the Kerr spacetimes, we point out a potential difference between the resulting light echoes using the geometric-optics approach and those obtained by the black hole perturbation theory. Possible astrophysical implications of the present study are addressed.
Learning the Solution Operator of Boundary Value Problems using Graph Neural Networks
As an alternative to classical numerical solvers for partial differential equations (PDEs) subject to boundary value constraints, there has been a surge of interest in investigating neural networks that can solve such problems efficiently. In this work, we design a general solution operator for two different time-independent PDEs using graph neural networks (GNNs) and spectral graph convolutions. We train the networks on simulated data from a finite elements solver on a variety of shapes and inhomogeneities. In contrast to previous works, we focus on the ability of the trained operator to generalize to previously unseen scenarios. Specifically, we test generalization to meshes with different shapes and superposition of solutions for a different number of inhomogeneities. We find that training on a diverse dataset with lots of variation in the finite element meshes is a key ingredient for achieving good generalization results in all cases. With this, we believe that GNNs can be used to learn solution operators that generalize over a range of properties and produce solutions much faster than a generic solver. Our dataset, which we make publicly available, can be used and extended to verify the robustness of these models under varying conditions.
FDG-Diff: Frequency-Domain-Guided Diffusion Framework for Compressed Hazy Image Restoration
In this study, we reveal that the interaction between haze degradation and JPEG compression introduces complex joint loss effects, which significantly complicate image restoration. Existing dehazing models often neglect compression effects, which limits their effectiveness in practical applications. To address these challenges, we introduce three key contributions. First, we design FDG-Diff, a novel frequency-domain-guided dehazing framework that improves JPEG image restoration by leveraging frequency-domain information. Second, we introduce the High-Frequency Compensation Module (HFCM), which enhances spatial-domain detail restoration by incorporating frequency-domain augmentation techniques into a diffusion-based restoration framework. Lastly, the introduction of the Degradation-Aware Denoising Timestep Predictor (DADTP) module further enhances restoration quality by enabling adaptive region-specific restoration, effectively addressing regional degradation inconsistencies in compressed hazy images. Experimental results across multiple compressed dehazing datasets demonstrate that our method consistently outperforms the latest state-of-the-art approaches. Code be available at https://github.com/SYSUzrc/FDG-Diff.
An Overview of Machine Learning Techniques for Radiowave Propagation Modeling
We give an overview of recent developments in the modeling of radiowave propagation, based on machine learning algorithms. We identify the input and output specification and the architecture of the model as the main challenges associated with machine learning-driven propagation models. Relevant papers are discussed and categorized based on their approach to each of these challenges. Emphasis is given on presenting the prospects and open problems in this promising and rapidly evolving area.
Solving High Frequency and Multi-Scale PDEs with Gaussian Processes
Machine learning based solvers have garnered much attention in physical simulation and scientific computing, with a prominent example, physics-informed neural networks (PINNs). However, PINNs often struggle to solve high-frequency and multi-scale PDEs, which can be due to spectral bias during neural network training. To address this problem, we resort to the Gaussian process (GP) framework. To flexibly capture the dominant frequencies, we model the power spectrum of the PDE solution with a student t mixture or Gaussian mixture. We apply the inverse Fourier transform to obtain the covariance function (by Wiener-Khinchin theorem). The covariance derived from the Gaussian mixture spectrum corresponds to the known spectral mixture kernel. Next, we estimate the mixture weights in the log domain, which we show is equivalent to placing a Jeffreys prior. It automatically induces sparsity, prunes excessive frequencies, and adjusts the remaining toward the ground truth. Third, to enable efficient and scalable computation on massive collocation points, which are critical to capture high frequencies, we place the collocation points on a grid, and multiply our covariance function at each input dimension. We use the GP conditional mean to predict the solution and its derivatives so as to fit the boundary condition and the equation itself. As a result, we can derive a Kronecker product structure in the covariance matrix. We use Kronecker product properties and multilinear algebra to promote computational efficiency and scalability, without low-rank approximations. We show the advantage of our method in systematic experiments. The code is released at https://github.com/xuangu-fang/Gaussian-Process-Slover-for-High-Freq-PDE.
An Iterative Direct Sampling Method for Reconstructing Moving Inhomogeneities in Parabolic Problems
We propose in this work a novel iterative direct sampling method for imaging moving inhomogeneities in parabolic problems using boundary measurements. It can efficiently identify the locations and shapes of moving inhomogeneities when very limited data are available, even with only one pair of lateral Cauchy data, and enjoys remarkable numerical stability for noisy data and over an extended time horizon. The method is formulated in an abstract framework, and is applicable to linear and nonlinear parabolic problems, including linear, nonlinear, and mixed-type inhomogeneities. Numerical experiments across diverse scenarios show its effectiveness and robustness against the data noise.
Differentiable Radio Frequency Ray Tracing for Millimeter-Wave Sensing
Millimeter wave (mmWave) sensing is an emerging technology with applications in 3D object characterization and environment mapping. However, realizing precise 3D reconstruction from sparse mmWave signals remains challenging. Existing methods rely on data-driven learning, constrained by dataset availability and difficulty in generalization. We propose DiffSBR, a differentiable framework for mmWave-based 3D reconstruction. DiffSBR incorporates a differentiable ray tracing engine to simulate radar point clouds from virtual 3D models. A gradient-based optimizer refines the model parameters to minimize the discrepancy between simulated and real point clouds. Experiments using various radar hardware validate DiffSBR's capability for fine-grained 3D reconstruction, even for novel objects unseen by the radar previously. By integrating physics-based simulation with gradient optimization, DiffSBR transcends the limitations of data-driven approaches and pioneers a new paradigm for mmWave sensing.
Beta Sampling is All You Need: Efficient Image Generation Strategy for Diffusion Models using Stepwise Spectral Analysis
Generative diffusion models have emerged as a powerful tool for high-quality image synthesis, yet their iterative nature demands significant computational resources. This paper proposes an efficient time step sampling method based on an image spectral analysis of the diffusion process, aimed at optimizing the denoising process. Instead of the traditional uniform distribution-based time step sampling, we introduce a Beta distribution-like sampling technique that prioritizes critical steps in the early and late stages of the process. Our hypothesis is that certain steps exhibit significant changes in image content, while others contribute minimally. We validated our approach using Fourier transforms to measure frequency response changes at each step, revealing substantial low-frequency changes early on and high-frequency adjustments later. Experiments with ADM and Stable Diffusion demonstrated that our Beta Sampling method consistently outperforms uniform sampling, achieving better FID and IS scores, and offers competitive efficiency relative to state-of-the-art methods like AutoDiffusion. This work provides a practical framework for enhancing diffusion model efficiency by focusing computational resources on the most impactful steps, with potential for further optimization and broader application.
A neural network for forward and inverse nonlinear Fourier transforms for fiber optic communication
We propose a neural network for both forward and inverse continuous nonlinear Fourier transforms, NFT and INFT respectively. We demonstrate the network's capability to perform NFT and INFT for a random mix of NFDM-QAM signals. The network transformations (NFT and INFT) exhibit true characteristics of these transformations; they are significantly different for low and high-power input pulses. The network shows adequate accuracy with an RMSE of 5e-3 for forward and 3e-2 for inverse transforms. We further show that the trained network can be used to perform general nonlinear Fourier transforms on arbitrary pulses beyond the training pulse types.
Gradient-Based Optimization of Core-Shell Particles with Discrete Materials for Directional Scattering
Designing nanophotonic structures traditionally grapples with the complexities of discrete parameters, such as real materials, often resorting to costly global optimization methods. This paper introduces an approach that leverages generative deep learning to map discrete parameter sets into a continuous latent space, enabling direct gradient-based optimization. For scenarios with non-differentiable physics evaluation functions, a neural network is employed as a differentiable surrogate model. The efficacy of this methodology is demonstrated by optimizing the directional scattering properties of core-shell nanoparticles composed of a selection of realistic materials. We derive suggestions for core-shell geometries with strong forward scattering and minimized backscattering. Our findings reveal significant improvements in computational efficiency and performance when compared to global optimization techniques. Beyond nanophotonics design problems, this framework holds promise for broad applications across all types of inverse problems constrained by discrete variables.
MultiLevel Variational MultiScale (ML-VMS) framework for large-scale simulation
In this paper, we propose the MultiLevel Variational MultiScale (ML-VMS) method, a novel approach that seamlessly integrates a multilevel mesh strategy into the Variational Multiscale (VMS) framework. A key feature of the ML-VMS method is the use of the Convolutional Hierarchical Deep Neural Network (C-HiDeNN) as the approximation basis. The framework employs a coarse mesh throughout the domain, with localized fine meshes placed only in subdomains of high interest, such as those surrounding a source. Solutions at different resolutions are robustly coupled through the variational weak form and interface conditions. Compared to existing multilevel methods, ML-VMS (1) can couple an arbitrary number of mesh levels across different scales using variational multiscale framework; (2) allows approximating functions with arbitrary orders with linear finite element mesh due to the C-HiDeNN basis; (3) is supported by a rigorous theoretical error analysis; (4) features several tunable hyperparameters (e.g., order p, patch size s) with a systematic guide for their selection. We first show the theoretical error estimates of ML-VMS. Then through numerical examples, we demonstrate that ML-VMS with the C-HiDeNN takes less computational time than the FEM basis given comparable accuracy. Furthermore, we incorporate a space-time reduced-order model (ROM) based on C-HiDeNN-Tensor Decomposition (TD) into the ML-VMS framework. For a large-scale single-track laser powder bed fusion (LPBF) transient heat transfer problem that is equivalent to a full-order finite element model with 10^{10} spatial degrees of freedom (DoFs), our 3-level ML-VMS C-HiDeNN-TD achieves an approximately 5,000x speedup on a single CPU over a single-level linear FEM-TD ROM.
Radio Frequency Fingerprint Identification for LoRa Using Spectrogram and CNN
Radio frequency fingerprint identification (RFFI) is an emerging device authentication technique that relies on intrinsic hardware characteristics of wireless devices. We designed an RFFI scheme for Long Range (LoRa) systems based on spectrogram and convolutional neural network (CNN). Specifically, we used spectrogram to represent the fine-grained time-frequency characteristics of LoRa signals. In addition, we revealed that the instantaneous carrier frequency offset (CFO) is drifting, which will result in misclassification and significantly compromise the system stability; we demonstrated CFO compensation is an effective mitigation. Finally, we designed a hybrid classifier that can adjust CNN outputs with the estimated CFO. The mean value of CFO remains relatively stable, hence it can be used to rule out CNN predictions whose estimated CFO falls out of the range. We performed experiments in real wireless environments using 20 LoRa devices under test (DUTs) and a Universal Software Radio Peripheral (USRP) N210 receiver. By comparing with the IQ-based and FFT-based RFFI schemes, our spectrogram-based scheme can reach the best classification accuracy, i.e., 97.61% for 20 LoRa DUTs.
Time-Fractional Approach to the Electrochemical Impedance: The Displacement Current
We establish, in general terms, the conditions to be satisfied by a time-fractional approach formulation of the Poisson-Nernst-Planck model in order to guarantee that the total current across the sample be solenoidal, as required by the Maxwell equation. Only in this case the electric impedance of a cell can be determined as the ratio between the applied difference of potential and the current across the cell. We show that in the case of anomalous diffusion, the model predicts for the electric impedance of the cell a constant phase element behaviour in the low frequency region. In the parametric curve of the reactance versus the resistance, the slope coincides with the order of the fractional time derivative.
Boundary Element and Finite Element Coupling for Aeroacoustics Simulations
We consider the scattering of acoustic perturbations in a presence of a flow. We suppose that the space can be split into a zone where the flow is uniform and a zone where the flow is potential. In the first zone, we apply a Prandtl-Glauert transformation to recover the Helmholtz equation. The well-known setting of boundary element method for the Helmholtz equation is available. In the second zone, the flow quantities are space dependent, we have to consider a local resolution, namely the finite element method. Herein, we carry out the coupling of these two methods and present various applications and validation test cases. The source term is given through the decomposition of an incident acoustic field on a section of the computational domain's boundary.
Multi-frequency antenna for quasi-isotropic radiator and 6G massive IoT
An isotropic antenna radiates and receives electromagnetic wave uniformly in magnitude in 3D space. A multi-frequency quasi-isotropic antenna can serve as a practically feasible solution to emulate an ideal multi-frequency isotropic radiator. It is also an essential technology for mobile smart devices for massive IoT in the upcoming 6G. However, ever since the quasi-isotropic antenna was proposed and achieved more than half a century ago, at most two discrete narrow frequency bands can be achieved, because of the significantly increased structural complexity from multi-frequency isotropic radiation. This limitation impedes numerous related electromagnetic experiments and the advances in wireless communication. Here, for the first time, a design method for multi-band (>2) quasi-isotropic antennas is proposed. An exemplified quasi-isotropic antenna with the desired four frequency bands is also presented for demonstration. The measured results validate excellent performance on both electromagnetics and wireless communications for this antenna.
Optimal Control of Coefficients in Parabolic Free Boundary Problems Modeling Laser Ablation
Inverse Stefan problem arising in modeling of laser ablation of biomedical tissues is analyzed, where information on the coefficients, heat flux on the fixed boundary, and density of heat sources are missing and must be found along with the temperature and free boundary. Optimal control framework is employed, where the missing data and the free boundary are components of the control vector, and optimality criteria are based on the final moment measurement of the temperature and position of the free boundary. Discretization by finite differences is pursued, and convergence of the discrete optimal control problems to the original problem is proven.
Deep learning for prediction of complex geology ahead of drilling
During a geosteering operation the well path is intentionally adjusted in response to the new data acquired while drilling. To achieve consistent high-quality decisions, especially when drilling in complex environments, decision support systems can help cope with high volumes of data and interpretation complexities. They can assimilate the real-time measurements into a probabilistic earth model and use the updated model for decision recommendations. Recently, machine learning (ML) techniques have enabled a wide range of methods that redistribute computational cost from on-line to off-line calculations. In this paper, we introduce two ML techniques into the geosteering decision support framework. Firstly, a complex earth model representation is generated using a Generative Adversarial Network (GAN). Secondly, a commercial extra-deep electromagnetic simulator is represented using a Forward Deep Neural Network (FDNN). The numerical experiments demonstrate that the combination of the GAN and the FDNN in an ensemble randomized maximum likelihood data assimilation scheme provides real-time estimates of complex geological uncertainty. This yields reduction of geological uncertainty ahead of the drill-bit from the measurements gathered behind and around the well bore.
Symmetric Basis Convolutions for Learning Lagrangian Fluid Mechanics
Learning physical simulations has been an essential and central aspect of many recent research efforts in machine learning, particularly for Navier-Stokes-based fluid mechanics. Classic numerical solvers have traditionally been computationally expensive and challenging to use in inverse problems, whereas Neural solvers aim to address both concerns through machine learning. We propose a general formulation for continuous convolutions using separable basis functions as a superset of existing methods and evaluate a large set of basis functions in the context of (a) a compressible 1D SPH simulation, (b) a weakly compressible 2D SPH simulation, and (c) an incompressible 2D SPH Simulation. We demonstrate that even and odd symmetries included in the basis functions are key aspects of stability and accuracy. Our broad evaluation shows that Fourier-based continuous convolutions outperform all other architectures regarding accuracy and generalization. Finally, using these Fourier-based networks, we show that prior inductive biases, such as window functions, are no longer necessary. An implementation of our approach, as well as complete datasets and solver implementations, is available at https://github.com/tum-pbs/SFBC.
PDE-Refiner: Achieving Accurate Long Rollouts with Neural PDE Solvers
Time-dependent partial differential equations (PDEs) are ubiquitous in science and engineering. Recently, mostly due to the high computational cost of traditional solution techniques, deep neural network based surrogates have gained increased interest. The practical utility of such neural PDE solvers relies on their ability to provide accurate, stable predictions over long time horizons, which is a notoriously hard problem. In this work, we present a large-scale analysis of common temporal rollout strategies, identifying the neglect of non-dominant spatial frequency information, often associated with high frequencies in PDE solutions, as the primary pitfall limiting stable, accurate rollout performance. Based on these insights, we draw inspiration from recent advances in diffusion models to introduce PDE-Refiner; a novel model class that enables more accurate modeling of all frequency components via a multistep refinement process. We validate PDE-Refiner on challenging benchmarks of complex fluid dynamics, demonstrating stable and accurate rollouts that consistently outperform state-of-the-art models, including neural, numerical, and hybrid neural-numerical architectures. We further demonstrate that PDE-Refiner greatly enhances data efficiency, since the denoising objective implicitly induces a novel form of spectral data augmentation. Finally, PDE-Refiner's connection to diffusion models enables an accurate and efficient assessment of the model's predictive uncertainty, allowing us to estimate when the surrogate becomes inaccurate.
Improving Rectified Flow with Boundary Conditions
Rectified Flow offers a simple and effective approach to high-quality generative modeling by learning a velocity field. However, we identify a limitation in directly modeling the velocity with an unconstrained neural network: the learned velocity often fails to satisfy certain boundary conditions, leading to inaccurate velocity field estimations that deviate from the desired ODE. This issue is particularly critical during stochastic sampling at inference, as the score function's errors are amplified near the boundary. To mitigate this, we propose a Boundary-enforced Rectified Flow Model (Boundary RF Model), in which we enforce boundary conditions with a minimal code modification. Boundary RF Model improves performance over vanilla RF model, demonstrating 8.01% improvement in FID score on ImageNet using ODE sampling and 8.98% improvement using SDE sampling.
Frequency-domain multiplexing of SNSPDs with tunable superconducting resonators
This work culminates in a demonstration of an alternative Frequency Domain Multiplexing (FDM) scheme for Superconducting Nanowire Single-Photon Detectors (SNSPDs) using the Kinetic inductance Parametric UP-converter (KPUP) made out of NbTiN. There are multiple multiplexing architectures for SNSPDs that are already in use, but FDM could prove superior in applications where the operational bias currents are very low, especially for mid- and far-infrared SNSPDs. Previous FDM schemes integrated the SNSPD within the resonator, while in this work we use an external resonator, which gives more flexibility to optimize the SNSPD architecture. The KPUP is a DC-biased superconducting resonator in which a nanowire is used as its inductive element to enable sensitivity to current perturbations. When coupled to an SNSPD, the KPUP can be used to read out current pulses on the few μA scale. The KPUP is made out of NbTiN, which has high non-linear kinetic inductance for increased sensitivity at higher current bias and high operating temperature. Meanwhile, the SNSPD is made from WSi, which is a popular material for broadband SNSPDs. To read out the KPUP and SNSPD array, a software-defined radio platform and a graphics processing unit are used. Frequency Domain Multiplexed SNSPDs have applications in astronomy, remote sensing, exoplanet science, dark matter detection, and quantum sensing.
Numerical modeling of SNSPD absorption utilizing optical conductivity with quantum corrections
Superconducting nanowire single-photon detectors are widely used in various fields of physics and technology, due to their high efficiency and timing precision. Although, in principle, their detection mechanism offers broadband operation, their wavelength range has to be optimized by the optical cavity parameters for a specific task. We present a study of the optical absorption of a superconducting nanowire single photon detector (SNSPD) with an optical cavity. The optical properties of the niobium nitride films, measured by spectroscopic ellipsometry, were modelled using the Drude-Lorentz model with quantum corrections. The numerical simulations of the optical response of the detectors show that the wavelength range of the detector is not solely determined by its geometry, but the optical conductivity of the disordered thin metallic films contributes considerably. This contribution can be conveniently expressed by the ratio of imaginary and real parts of the optical conductivity. This knowledge can be utilized in detector design.
Backpropagation-free Training of Deep Physical Neural Networks
Recent years have witnessed the outstanding success of deep learning in various fields such as vision and natural language processing. This success is largely indebted to the massive size of deep learning models that is expected to increase unceasingly. This growth of the deep learning models is accompanied by issues related to their considerable energy consumption, both during the training and inference phases, as well as their scalability. Although a number of work based on unconventional physical systems have been proposed which addresses the issue of energy efficiency in the inference phase, efficient training of deep learning models has remained unaddressed. So far, training of digital deep learning models mainly relies on backpropagation, which is not suitable for physical implementation as it requires perfect knowledge of the computation performed in the so-called forward pass of the neural network. Here, we tackle this issue by proposing a simple deep neural network architecture augmented by a biologically plausible learning algorithm, referred to as "model-free forward-forward training". The proposed architecture enables training deep physical neural networks consisting of layers of physical nonlinear systems, without requiring detailed knowledge of the nonlinear physical layers' properties. We show that our method outperforms state-of-the-art hardware-aware training methods by improving training speed, decreasing digital computations, and reducing power consumption in physical systems. We demonstrate the adaptability of the proposed method, even in systems exposed to dynamic or unpredictable external perturbations. To showcase the universality of our approach, we train diverse wave-based physical neural networks that vary in the underlying wave phenomenon and the type of non-linearity they use, to perform vowel and image classification tasks experimentally.
NeuRBF: A Neural Fields Representation with Adaptive Radial Basis Functions
We present a novel type of neural fields that uses general radial bases for signal representation. State-of-the-art neural fields typically rely on grid-based representations for storing local neural features and N-dimensional linear kernels for interpolating features at continuous query points. The spatial positions of their neural features are fixed on grid nodes and cannot well adapt to target signals. Our method instead builds upon general radial bases with flexible kernel position and shape, which have higher spatial adaptivity and can more closely fit target signals. To further improve the channel-wise capacity of radial basis functions, we propose to compose them with multi-frequency sinusoid functions. This technique extends a radial basis to multiple Fourier radial bases of different frequency bands without requiring extra parameters, facilitating the representation of details. Moreover, by marrying adaptive radial bases with grid-based ones, our hybrid combination inherits both adaptivity and interpolation smoothness. We carefully designed weighting schemes to let radial bases adapt to different types of signals effectively. Our experiments on 2D image and 3D signed distance field representation demonstrate the higher accuracy and compactness of our method than prior arts. When applied to neural radiance field reconstruction, our method achieves state-of-the-art rendering quality, with small model size and comparable training speed.
Efficient View Synthesis with Neural Radiance Distribution Field
Recent work on Neural Radiance Fields (NeRF) has demonstrated significant advances in high-quality view synthesis. A major limitation of NeRF is its low rendering efficiency due to the need for multiple network forwardings to render a single pixel. Existing methods to improve NeRF either reduce the number of required samples or optimize the implementation to accelerate the network forwarding. Despite these efforts, the problem of multiple sampling persists due to the intrinsic representation of radiance fields. In contrast, Neural Light Fields (NeLF) reduce the computation cost of NeRF by querying only one single network forwarding per pixel. To achieve a close visual quality to NeRF, existing NeLF methods require significantly larger network capacities which limits their rendering efficiency in practice. In this work, we propose a new representation called Neural Radiance Distribution Field (NeRDF) that targets efficient view synthesis in real-time. Specifically, we use a small network similar to NeRF while preserving the rendering speed with a single network forwarding per pixel as in NeLF. The key is to model the radiance distribution along each ray with frequency basis and predict frequency weights using the network. Pixel values are then computed via volume rendering on radiance distributions. Experiments show that our proposed method offers a better trade-off among speed, quality, and network size than existing methods: we achieve a ~254x speed-up over NeRF with similar network size, with only a marginal performance decline. Our project page is at yushuang-wu.github.io/NeRDF.
Message Passing Neural PDE Solvers
The numerical solution of partial differential equations (PDEs) is difficult, having led to a century of research so far. Recently, there have been pushes to build neural--numerical hybrid solvers, which piggy-backs the modern trend towards fully end-to-end learned systems. Most works so far can only generalize over a subset of properties to which a generic solver would be faced, including: resolution, topology, geometry, boundary conditions, domain discretization regularity, dimensionality, etc. In this work, we build a solver, satisfying these properties, where all the components are based on neural message passing, replacing all heuristically designed components in the computation graph with backprop-optimized neural function approximators. We show that neural message passing solvers representationally contain some classical methods, such as finite differences, finite volumes, and WENO schemes. In order to encourage stability in training autoregressive models, we put forward a method that is based on the principle of zero-stability, posing stability as a domain adaptation problem. We validate our method on various fluid-like flow problems, demonstrating fast, stable, and accurate performance across different domain topologies, equation parameters, discretizations, etc., in 1D and 2D.
Structure-Preserving Operator Learning
Learning complex dynamics driven by partial differential equations directly from data holds great promise for fast and accurate simulations of complex physical systems. In most cases, this problem can be formulated as an operator learning task, where one aims to learn the operator representing the physics of interest, which entails discretization of the continuous system. However, preserving key continuous properties at the discrete level, such as boundary conditions, and addressing physical systems with complex geometries is challenging for most existing approaches. We introduce a family of operator learning architectures, structure-preserving operator networks (SPONs), that allows to preserve key mathematical and physical properties of the continuous system by leveraging finite element (FE) discretizations of the input-output spaces. SPONs are encode-process-decode architectures that are end-to-end differentiable, where the encoder and decoder follows from the discretizations of the input-output spaces. SPONs can operate on complex geometries, enforce certain boundary conditions exactly, and offer theoretical guarantees. Our framework provides a flexible way of devising structure-preserving architectures tailored to specific applications, and offers an explicit trade-off between performance and efficiency, all thanks to the FE discretization of the input-output spaces. Additionally, we introduce a multigrid-inspired SPON architecture that yields improved performance at higher efficiency. Finally, we release a software to automate the design and training of SPON architectures.
AERO: Audio Super Resolution in the Spectral Domain
We present AERO, a audio super-resolution model that processes speech and music signals in the spectral domain. AERO is based on an encoder-decoder architecture with U-Net like skip connections. We optimize the model using both time and frequency domain loss functions. Specifically, we consider a set of reconstruction losses together with perceptual ones in the form of adversarial and feature discriminator loss functions. To better handle phase information the proposed method operates over the complex-valued spectrogram using two separate channels. Unlike prior work which mainly considers low and high frequency concatenation for audio super-resolution, the proposed method directly predicts the full frequency range. We demonstrate high performance across a wide range of sample rates considering both speech and music. AERO outperforms the evaluated baselines considering Log-Spectral Distance, ViSQOL, and the subjective MUSHRA test. Audio samples and code are available at https://pages.cs.huji.ac.il/adiyoss-lab/aero
LESnets (Large-Eddy Simulation nets): Physics-informed neural operator for large-eddy simulation of turbulence
Acquisition of large datasets for three-dimensional (3D) partial differential equations are usually very expensive. Physics-informed neural operator (PINO) eliminates the high costs associated with generation of training datasets, and shows great potential in a variety of partial differential equations. In this work, we employ physics-informed neural operator, encoding the large-eddy simulation (LES) equations directly into the neural operator for simulating three-dimensional incompressible turbulent flows. We develop the LESnets (Large-Eddy Simulation nets) by adding large-eddy simulation equations to two different data-driven models, including Fourier neural operator (FNO) and implicit Fourier neural operator (IFNO) without using label data. Notably, by leveraging only PDE constraints to learn the spatio-temporal dynamics problem, LESnets retains the computational efficiency of data-driven approaches while obviating the necessity for data. Meanwhile, using large-eddy simulation equations as PDE constraints makes it possible to efficiently predict complex turbulence at coarse grids. We investigate the performance of the LESnets with two standard three-dimensional turbulent flows: decaying homogeneous isotropic turbulence and temporally evolving turbulent mixing layer. In the numerical experiments, the LESnets model shows a similar or even better accuracy as compared to traditional large-eddy simulation and data-driven models of FNO and IFNO. Moreover, the well-trained LESnets is significantly faster than traditional LES, and has a similar efficiency as the data-driven FNO and IFNO models. Thus, physics-informed neural operators have a strong potential for 3D nonlinear engineering applications.
Materiatronics: Modular analysis of arbitrary meta-atoms
Within the paradigm of metamaterials and metasurfaces, electromagnetic properties of composite materials can be engineered by shaping or modulating their constituents, so-called meta-atoms. Synthesis and analysis of complex-shape meta-atoms with general polarization properties is a challenging task. In this paper, we demonstrate that the most general response can be conceptually decomposed into a set of basic, fundamental polarization phenomena, which enables immediate all-direction characterization of electromagnetic properties of arbitrary linear metamaterials and metasurfaces. The proposed platform of modular characterization (called "materiatronics") is tested on several examples of bianisotropic and nonreciprocal meta-atoms. As a demonstration of the potential of the modular analysis, we use it to design a single-layer metasurface of vanishing thickness with unitary circular dichroism. The analysis approach developed in this paper is supported by a ready-to-use computational code and can be further extended to meta-atoms engineered for other types of wave interactions, such as acoustics and mechanics.
ShapeNet: Shape Constraint for Galaxy Image Deconvolution
Deep Learning (DL) has shown remarkable results in solving inverse problems in various domains. In particular, the Tikhonet approach is very powerful to deconvolve optical astronomical images (Sureau et al. 2020). Yet, this approach only uses the ell_2 loss, which does not guarantee the preservation of physical information (e.g. flux and shape) of the object reconstructed in the image. In Nammour et al. (2021), a new loss function was proposed in the framework of sparse deconvolution, which better preserves the shape of galaxies and reduces the pixel error. In this paper, we extend Tikhonet to take into account this shape constraint, and apply our new DL method, called ShapeNet, to optical and radio-interferometry simulated data set. The originality of the paper relies on i) the shape constraint we use in the neural network framework, ii) the application of deep learning to radio-interferometry image deconvolution for the first time, and iii) the generation of a simulated radio data set that we make available for the community. A range of examples illustrates the results.
Refracting Reality: Generating Images with Realistic Transparent Objects
Generative image models can produce convincingly real images, with plausible shapes, textures, layouts and lighting. However, one domain in which they perform notably poorly is in the synthesis of transparent objects, which exhibit refraction, reflection, absorption and scattering. Refraction is a particular challenge, because refracted pixel rays often intersect with surfaces observed in other parts of the image, providing a constraint on the color. It is clear from inspection that generative models have not distilled the laws of optics sufficiently well to accurately render refractive objects. In this work, we consider the problem of generating images with accurate refraction, given a text prompt. We synchronize the pixels within the object's boundary with those outside by warping and merging the pixels using Snell's Law of Refraction, at each step of the generation trajectory. For those surfaces that are not directly observed in the image, but are visible via refraction or reflection, we recover their appearance by synchronizing the image with a second generated image -- a panorama centered at the object -- using the same warping and merging procedure. We demonstrate that our approach generates much more optically-plausible images that respect the physical constraints.
Treble10: A high-quality dataset for far-field speech recognition, dereverberation, and enhancement
Accurate far-field speech datasets are critical for tasks such as automatic speech recognition (ASR), dereverberation, speech enhancement, and source separation. However, current datasets are limited by the trade-off between acoustic realism and scalability. Measured corpora provide faithful physics but are expensive, low-coverage, and rarely include paired clean and reverberant data. In contrast, most simulation-based datasets rely on simplified geometrical acoustics, thus failing to reproduce key physical phenomena like diffraction, scattering, and interference that govern sound propagation in complex environments. We introduce Treble10, a large-scale, physically accurate room-acoustic dataset. Treble10 contains over 3000 broadband room impulse responses (RIRs) simulated in 10 fully furnished real-world rooms, using a hybrid simulation paradigm implemented in the Treble SDK that combines a wave-based and geometrical acoustics solver. The dataset provides six complementary subsets, spanning mono, 8th-order Ambisonics, and 6-channel device RIRs, as well as pre-convolved reverberant speech scenes paired with LibriSpeech utterances. All signals are simulated at 32 kHz, accurately modelling low-frequency wave effects and high-frequency reflections. Treble10 bridges the realism gap between measurement and simulation, enabling reproducible, physically grounded evaluation and large-scale data augmentation for far-field speech tasks. The dataset is openly available via the Hugging Face Hub, and is intended as both a benchmark and a template for next-generation simulation-driven audio research.
Efficient Physics-Based Learned Reconstruction Methods for Real-Time 3D Near-Field MIMO Radar Imaging
Near-field multiple-input multiple-output (MIMO) radar imaging systems have recently gained significant attention. In this paper, we develop novel non-iterative deep learning-based reconstruction methods for real-time near-field MIMO imaging. The goal is to achieve high image quality with low computational cost at compressive settings. The developed approaches have two stages. In the first approach, physics-based initial stage performs adjoint operation to back-project the measurements to the image-space, and deep neural network (DNN)-based second stage converts the 3D backprojected measurements to a magnitude-only reflectivity image. Since scene reflectivities often have random phase, DNN processes directly the magnitude of the adjoint result. As DNN, 3D U-Net is used to jointly exploit range and cross-range correlations. To comparatively evaluate the significance of exploiting physics in a learning-based approach, two additional approaches that replace the physics-based first stage with fully connected layers are also developed as purely learning-based methods. The performance is also analyzed by changing the DNN architecture for the second stage to include complex-valued processing (instead of magnitude-only processing), 2D convolution kernels (instead of 3D), and ResNet architecture (instead of U-Net). Moreover, we develop a synthesizer to generate large-scale dataset for training with 3D extended targets. We illustrate the performance through experimental data and extensive simulations. The results show the effectiveness of the developed physics-based learned reconstruction approach in terms of both run-time and image quality at highly compressive settings. Our source codes and dataset are made available at GitHub.
Wavelet Diffusion Neural Operator
Simulating and controlling physical systems described by partial differential equations (PDEs) are crucial tasks across science and engineering. Recently, diffusion generative models have emerged as a competitive class of methods for these tasks due to their ability to capture long-term dependencies and model high-dimensional states. However, diffusion models typically struggle with handling system states with abrupt changes and generalizing to higher resolutions. In this work, we propose Wavelet Diffusion Neural Operator (WDNO), a novel PDE simulation and control framework that enhances the handling of these complexities. WDNO comprises two key innovations. Firstly, WDNO performs diffusion-based generative modeling in the wavelet domain for the entire trajectory to handle abrupt changes and long-term dependencies effectively. Secondly, to address the issue of poor generalization across different resolutions, which is one of the fundamental tasks in modeling physical systems, we introduce multi-resolution training. We validate WDNO on five physical systems, including 1D advection equation, three challenging physical systems with abrupt changes (1D Burgers' equation, 1D compressible Navier-Stokes equation and 2D incompressible fluid), and a real-world dataset ERA5, which demonstrates superior performance on both simulation and control tasks over state-of-the-art methods, with significant improvements in long-term and detail prediction accuracy. Remarkably, in the challenging context of the 2D high-dimensional and indirect control task aimed at reducing smoke leakage, WDNO reduces the leakage by 33.2% compared to the second-best baseline. The code can be found at https://github.com/AI4Science-WestlakeU/wdno.git.
kh2d-solver: A Python Library for Idealized Two-Dimensional Incompressible Kelvin-Helmholtz Instability
We present an open-source Python library for simulating two-dimensional incompressible Kelvin-Helmholtz instabilities in stratified shear flows. The solver employs a fractional-step projection method with spectral Poisson solution via Fast Sine Transform, achieving second-order spatial accuracy. Implementation leverages NumPy, SciPy, and Numba JIT compilation for efficient computation. Four canonical test cases explore Reynolds numbers 1000--5000 and Richardson numbers 0.1--0.3: classical shear layer, double shear configuration, rotating flow, and forced turbulence. Statistical analysis using Shannon entropy and complexity indices reveals that double shear layers achieve 2.8times higher mixing rates than forced turbulence despite lower Reynolds numbers. The solver runs efficiently on standard desktop hardware, with 384times192 grid simulations completing in approximately 31 minutes. Results demonstrate that mixing efficiency depends on instability generation pathways rather than intensity measures alone, challenging Richardson number-based parameterizations and suggesting refinements for subgrid-scale representation in climate models.
Exact Coset Sampling for Quantum Lattice Algorithms
We give a simple, fully correct, and assumption-light replacement for the contested "domain-extension" in Step 9 of a recent windowed-QFT lattice algorithm with complex-Gaussian windows~chen2024quantum. The published Step~9 suffers from a periodicity/support mismatch. We present a pair-shift difference construction that coherently cancels all unknown offsets, produces an exact uniform CRT-coset state over Z_{P}, and then uses the QFT to enforce the intended modular linear relation. The unitary is reversible, uses poly(log M_2) gates, and preserves the algorithm's asymptotics. Project Page: https://github.com/yifanzhang-pro/quantum-lattice.
Reflected Schrödinger Bridge for Constrained Generative Modeling
Diffusion models have become the go-to method for large-scale generative models in real-world applications. These applications often involve data distributions confined within bounded domains, typically requiring ad-hoc thresholding techniques for boundary enforcement. Reflected diffusion models (Lou23) aim to enhance generalizability by generating the data distribution through a backward process governed by reflected Brownian motion. However, reflected diffusion models may not easily adapt to diverse domains without the derivation of proper diffeomorphic mappings and do not guarantee optimal transport properties. To overcome these limitations, we introduce the Reflected Schrodinger Bridge algorithm: an entropy-regularized optimal transport approach tailored for generating data within diverse bounded domains. We derive elegant reflected forward-backward stochastic differential equations with Neumann and Robin boundary conditions, extend divergence-based likelihood training to bounded domains, and explore natural connections to entropic optimal transport for the study of approximate linear convergence - a valuable insight for practical training. Our algorithm yields robust generative modeling in diverse domains, and its scalability is demonstrated in real-world constrained generative modeling through standard image benchmarks.
Self-Supervised Learning with Lie Symmetries for Partial Differential Equations
Machine learning for differential equations paves the way for computationally efficient alternatives to numerical solvers, with potentially broad impacts in science and engineering. Though current algorithms typically require simulated training data tailored to a given setting, one may instead wish to learn useful information from heterogeneous sources, or from real dynamical systems observations that are messy or incomplete. In this work, we learn general-purpose representations of PDEs from heterogeneous data by implementing joint embedding methods for self-supervised learning (SSL), a framework for unsupervised representation learning that has had notable success in computer vision. Our representation outperforms baseline approaches to invariant tasks, such as regressing the coefficients of a PDE, while also improving the time-stepping performance of neural solvers. We hope that our proposed methodology will prove useful in the eventual development of general-purpose foundation models for PDEs.
Solving High-Dimensional PDEs with Latent Spectral Models
Deep models have achieved impressive progress in solving partial differential equations (PDEs). A burgeoning paradigm is learning neural operators to approximate the input-output mappings of PDEs. While previous deep models have explored the multiscale architectures and various operator designs, they are limited to learning the operators as a whole in the coordinate space. In real physical science problems, PDEs are complex coupled equations with numerical solvers relying on discretization into high-dimensional coordinate space, which cannot be precisely approximated by a single operator nor efficiently learned due to the curse of dimensionality. We present Latent Spectral Models (LSM) toward an efficient and precise solver for high-dimensional PDEs. Going beyond the coordinate space, LSM enables an attention-based hierarchical projection network to reduce the high-dimensional data into a compact latent space in linear time. Inspired by classical spectral methods in numerical analysis, we design a neural spectral block to solve PDEs in the latent space that approximates complex input-output mappings via learning multiple basis operators, enjoying nice theoretical guarantees for convergence and approximation. Experimentally, LSM achieves consistent state-of-the-art and yields a relative gain of 11.5% averaged on seven benchmarks covering both solid and fluid physics. Code is available at https://github.com/thuml/Latent-Spectral-Models.
DDSP: Differentiable Digital Signal Processing
Most generative models of audio directly generate samples in one of two domains: time or frequency. While sufficient to express any signal, these representations are inefficient, as they do not utilize existing knowledge of how sound is generated and perceived. A third approach (vocoders/synthesizers) successfully incorporates strong domain knowledge of signal processing and perception, but has been less actively researched due to limited expressivity and difficulty integrating with modern auto-differentiation-based machine learning methods. In this paper, we introduce the Differentiable Digital Signal Processing (DDSP) library, which enables direct integration of classic signal processing elements with deep learning methods. Focusing on audio synthesis, we achieve high-fidelity generation without the need for large autoregressive models or adversarial losses, demonstrating that DDSP enables utilizing strong inductive biases without losing the expressive power of neural networks. Further, we show that combining interpretable modules permits manipulation of each separate model component, with applications such as independent control of pitch and loudness, realistic extrapolation to pitches not seen during training, blind dereverberation of room acoustics, transfer of extracted room acoustics to new environments, and transformation of timbre between disparate sources. In short, DDSP enables an interpretable and modular approach to generative modeling, without sacrificing the benefits of deep learning. The library is publicly available at https://github.com/magenta/ddsp and we welcome further contributions from the community and domain experts.
OL-Transformer: A Fast and Universal Surrogate Simulator for Optical Multilayer Thin Film Structures
Deep learning-based methods have recently been established as fast and accurate surrogate simulators for optical multilayer thin film structures. However, existing methods only work for limited types of structures with different material arrangements, preventing their applications towards diverse and universal structures. Here, we propose the Opto-Layer (OL) Transformer to act as a universal surrogate simulator for enormous types of structures. Combined with the technique of structure serialization, our model can predict accurate reflection and transmission spectra for up to 10^{25} different multilayer structures, while still achieving a six-fold degradation in simulation time compared to physical solvers. Further investigation reveals that the general learning ability comes from the fact that our model first learns the physical embeddings and then uses the self-attention mechanism to capture the hidden relationship of light-matter interaction between each layer.
RADIANCE: Radio-Frequency Adversarial Deep-learning Inference for Automated Network Coverage Estimation
Radio-frequency coverage maps (RF maps) are extensively utilized in wireless networks for capacity planning, placement of access points and base stations, localization, and coverage estimation. Conducting site surveys to obtain RF maps is labor-intensive and sometimes not feasible. In this paper, we propose radio-frequency adversarial deep-learning inference for automated network coverage estimation (RADIANCE), a generative adversarial network (GAN) based approach for synthesizing RF maps in indoor scenarios. RADIANCE utilizes a semantic map, a high-level representation of the indoor environment to encode spatial relationships and attributes of objects within the environment and guide the RF map generation process. We introduce a new gradient-based loss function that computes the magnitude and direction of change in received signal strength (RSS) values from a point within the environment. RADIANCE incorporates this loss function along with the antenna pattern to capture signal propagation within a given indoor configuration and generate new patterns under new configuration, antenna (beam) pattern, and center frequency. Extensive simulations are conducted to compare RADIANCE with ray-tracing simulations of RF maps. Our results show that RADIANCE achieves a mean average error (MAE) of 0.09, root-mean-squared error (RMSE) of 0.29, peak signal-to-noise ratio (PSNR) of 10.78, and multi-scale structural similarity index (MS-SSIM) of 0.80.
Low Rank Optimization for Efficient Deep Learning: Making A Balance between Compact Architecture and Fast Training
Deep neural networks have achieved great success in many data processing applications. However, the high computational complexity and storage cost makes deep learning hard to be used on resource-constrained devices, and it is not environmental-friendly with much power cost. In this paper, we focus on low-rank optimization for efficient deep learning techniques. In the space domain, deep neural networks are compressed by low rank approximation of the network parameters, which directly reduces the storage requirement with a smaller number of network parameters. In the time domain, the network parameters can be trained in a few subspaces, which enables efficient training for fast convergence. The model compression in the spatial domain is summarized into three categories as pre-train, pre-set, and compression-aware methods, respectively. With a series of integrable techniques discussed, such as sparse pruning, quantization, and entropy coding, we can ensemble them in an integration framework with lower computational complexity and storage. Besides of summary of recent technical advances, we have two findings for motivating future works: one is that the effective rank outperforms other sparse measures for network compression. The other is a spatial and temporal balance for tensorized neural networks.
Transformer Meets Boundary Value Inverse Problems
A Transformer-based deep direct sampling method is proposed for electrical impedance tomography, a well-known severely ill-posed nonlinear boundary value inverse problem. A real-time reconstruction is achieved by evaluating the learned inverse operator between carefully designed data and the reconstructed images. An effort is made to give a specific example to a fundamental question: whether and how one can benefit from the theoretical structure of a mathematical problem to develop task-oriented and structure-conforming deep neural networks? Specifically, inspired by direct sampling methods for inverse problems, the 1D boundary data in different frequencies are preprocessed by a partial differential equation-based feature map to yield 2D harmonic extensions as different input channels. Then, by introducing learnable non-local kernels, the direct sampling is recast to a modified attention mechanism. The new method achieves superior accuracy over its predecessors and contemporary operator learners and shows robustness to noises in benchmarks. This research shall strengthen the insights that, despite being invented for natural language processing tasks, the attention mechanism offers great flexibility to be modified in conformity with the a priori mathematical knowledge, which ultimately leads to the design of more physics-compatible neural architectures.
NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates
Conventionally, audio super-resolution models fixed the initial and the target sampling rates, which necessitate the model to be trained for each pair of sampling rates. We introduce NU-Wave 2, a diffusion model for neural audio upsampling that enables the generation of 48 kHz audio signals from inputs of various sampling rates with a single model. Based on the architecture of NU-Wave, NU-Wave 2 uses short-time Fourier convolution (STFC) to generate harmonics to resolve the main failure modes of NU-Wave, and incorporates bandwidth spectral feature transform (BSFT) to condition the bandwidths of inputs in the frequency domain. We experimentally demonstrate that NU-Wave 2 produces high-resolution audio regardless of the sampling rate of input while requiring fewer parameters than other models. The official code and the audio samples are available at https://mindslab-ai.github.io/nuwave2.
Outdoor-to-Indoor 28 GHz Wireless Measurements in Manhattan: Path Loss, Environmental Effects, and 90% Coverage
Outdoor-to-indoor (OtI) signal propagation further challenges the already tight link budgets at millimeter-wave (mmWave). To gain insight into OtI mmWave scenarios at 28 GHz, we conducted an extensive measurement campaign consisting of over 2,200 link measurements. In total, 43 OtI scenarios were measured in West Harlem, New York City, covering seven highly diverse buildings. The measured OtI path gain can vary by up to 40 dB for a given link distance, and the empirical path gain model for all data shows an average of 30 dB excess loss over free space at distances beyond 50 m, with an RMS fitting error of 11.7 dB. The type of glass is found to be the single dominant feature for OtI loss, with 20 dB observed difference between empirical path gain models for scenarios with low-loss and high-loss glass. The presence of scaffolding, tree foliage, or elevated subway tracks, as well as difference in floor height are each found to have an impact between 5-10 dB. We show that for urban buildings with high-loss glass, OtI coverage can support 500 Mbps for 90% of indoor user equipment (UEs) with a base station (BS) antenna placed up to 49 m away. For buildings with low-loss glass, such as our case study covering multiple classrooms of a public school, data rates over 2.5/1.2 Gbps are possible from a BS 68/175 m away from the school building, when a line-of-sight path is available. We expect these results to be useful for the deployment of mmWave networks in dense urban environments as well as the development of relevant scheduling and beam management algorithms.
Accelerating High-Fidelity Waveform Generation via Adversarial Flow Matching Optimization
This paper introduces PeriodWave-Turbo, a high-fidelity and high-efficient waveform generation model via adversarial flow matching optimization. Recently, conditional flow matching (CFM) generative models have been successfully adopted for waveform generation tasks, leveraging a single vector field estimation objective for training. Although these models can generate high-fidelity waveform signals, they require significantly more ODE steps compared to GAN-based models, which only need a single generation step. Additionally, the generated samples often lack high-frequency information due to noisy vector field estimation, which fails to ensure high-frequency reproduction. To address this limitation, we enhance pre-trained CFM-based generative models by incorporating a fixed-step generator modification. We utilized reconstruction losses and adversarial feedback to accelerate high-fidelity waveform generation. Through adversarial flow matching optimization, it only requires 1,000 steps of fine-tuning to achieve state-of-the-art performance across various objective metrics. Moreover, we significantly reduce inference speed from 16 steps to 2 or 4 steps. Additionally, by scaling up the backbone of PeriodWave from 29M to 70M parameters for improved generalization, PeriodWave-Turbo achieves unprecedented performance, with a perceptual evaluation of speech quality (PESQ) score of 4.454 on the LibriTTS dataset. Audio samples, source code and checkpoints will be available at https://github.com/sh-lee-prml/PeriodWave.
