Results in ArXiv: 2,132,461

(searched for: container_group_id:18405)
Olivia Beckwith, Martin Raum, Olav Richter
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Given an odd prime $\ell$ and a finite set of odd primes $S_+$, we prove the existence of an imaginary quadratic field whose class number is indivisible by $\ell$ and which splits at every prime in $S_+$. Notably, we do not require that $p \not\equiv -1 \pmod{\ell}$ for any of the split primes $p$ that we impose. Our theorem is in the spirit of a result by Wiles, but we introduce a new method. It relies on a significant improvement of our earlier work on the classification of non-holomorphic Ramanujan-type congruences for Hurwitz class numbers.
Abhilash Potluri, Fangyuan Xu, Eunsol Choi
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Long-form question answering systems provide rich information by presenting paragraph-level answers, often containing optional background or auxiliary information. While such comprehensive answers are helpful, not all information is required to answer the question (e.g. users with domain knowledge do not need an explanation of background). Can we provide a concise version of the answer by summarizing it, while still addressing the question? We conduct a user study on summarized answers generated from state-of-the-art models and our newly proposed extract-and-decontextualize approach. We find a large proportion of long-form answers (over 90%) in the ELI5 domain can be adequately summarized by at least one system, while complex and implicit answers are challenging to compress. We observe that decontextualization improves the quality of the extractive summary, exemplifying its potential in the summarization task. To promote future work, we provide an extractive summarization dataset covering 1K long-form answers and our user study annotations. Together, we present the first study on summarizing long-form answers, taking a step forward for QA agents that can provide answers at multiple granularities.
Da-Wei Zhou, Yuanhan Zhang, Jingyi Ning, Han-Jia Ye, De-Chuan Zhan, Ziwei Liu
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Class-Incremental Learning (CIL) or continual learning is a desired capability in the real world, which requires a learning system to adapt to new tasks without forgetting former ones. While traditional CIL methods focus on visual information to grasp core features, recent advances in Vision-Language Models (VLM) have shown promising capabilities in learning generalizable representations with the aid of textual information. However, when continually trained with new classes, VLMs often suffer from catastrophic forgetting of former knowledge. Applying VLMs to CIL poses two major challenges: 1) how to adapt the model without forgetting; and 2) how to make full use of the multi-modal information. To this end, we propose PROjectiOn Fusion (PROOF) that enables VLMs to learn without forgetting. To handle the first challenge, we propose training task-specific projections based on the frozen image/text encoders. When facing new tasks, new projections are expanded and former projections are fixed, alleviating the forgetting of old concepts. For the second challenge, we propose the fusion module to better utilize the cross-modality information. By jointly adjusting visual and textual features, the model can capture semantic information with stronger representation ability. Extensive experiments on nine benchmark datasets validate that PROOF achieves state-of-the-art performance.
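As a concrete illustration of the expandable-projection idea, here is a minimal PyTorch sketch; the class name, the linear projections, and the fusion-by-summation are our simplifications, not PROOF's exact design:

```python
import torch
import torch.nn as nn

class ExpandableProjections(nn.Module):
    """Task-specific projections on top of a frozen encoder (illustrative)."""

    def __init__(self, feat_dim: int):
        super().__init__()
        self.feat_dim = feat_dim
        self.projections = nn.ModuleList()

    def add_task(self):
        # Freeze all earlier projections, then append a fresh trainable one,
        # so old concepts are preserved while a new task is learned.
        for proj in self.projections:
            for p in proj.parameters():
                p.requires_grad_(False)
        self.projections.append(nn.Linear(self.feat_dim, self.feat_dim))

    def forward(self, frozen_feats: torch.Tensor) -> torch.Tensor:
        # Fuse all task projections (plain summation here; PROOF's fusion
        # module is more elaborate and mixes visual and textual features).
        return sum(proj(frozen_feats) for proj in self.projections)

cil = ExpandableProjections(feat_dim=512)
cil.add_task()
print(cil(torch.randn(4, 512)).shape)  # torch.Size([4, 512])
```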
Rongjie Huang, Chunlei Zhang, Yongqi Wang, Dongchao Yang, Luping Liu, Zhenhui Ye, Ziyue Jiang, Chao Weng, Zhou Zhao, Dong Yu
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Various applications of voice synthesis have been developed independently, despite the fact that they all generate "voice" as output. In addition, the majority of voice synthesis models currently rely on annotated audio data, but it is crucial to scale them to self-supervised datasets in order to effectively capture the wide range of acoustic variations present in human voice, including speaker identity, emotion, and prosody. In this work, we propose Make-A-Voice, a unified framework for synthesizing and manipulating voice signals from discrete representations. Make-A-Voice leverages a "coarse-to-fine" approach to model the human voice, which involves three stages: 1) semantic stage: model high-level transformation between linguistic content and self-supervised semantic tokens, 2) acoustic stage: introduce varying control signals as acoustic conditions for semantic-to-acoustic modeling, and 3) generation stage: synthesize high-fidelity waveforms from acoustic tokens. Make-A-Voice offers notable benefits as a unified voice synthesis framework: 1) Data scalability: the major backbone (i.e., acoustic and generation stage) does not require any annotations, and thus the training data could be scaled up. 2) Controllability and conditioning flexibility: we investigate different conditioning mechanisms and effectively handle three voice synthesis applications, including text-to-speech (TTS), voice conversion (VC), and singing voice synthesis (SVS) by re-synthesizing the discrete voice representations with prompt guidance. Experimental results demonstrate that Make-A-Voice exhibits superior audio quality and style similarity compared with competitive baseline models. Audio samples are available at https://Make-A-Voice.github.io
Arash Ahmadian, Saurabh Dash, Hongyu Chen, Bharat Venkitesh, Stephen Gou, Phil Blunsom, Ahmet Üstün, Sara Hooker
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Emergent properties have been widely adopted as a term to describe behavior not present in smaller models but observed in larger models. Recent work suggests that the trade-off incurred by quantization is also an emergent property, with sharp drops in performance in models over 6B parameters. In this work, we ask "are quantization cliffs in performance solely a factor of scale?" Against a backdrop of increased research focus on why certain emergent properties surface at scale, this work provides a useful counter-example. We posit that it is possible to optimize for a quantization-friendly training recipe that suppresses large activation-magnitude outliers. Here, we find that outlier dimensions are not an inherent product of scale, but rather sensitive to the optimization conditions present during pre-training. This both opens up directions for more efficient quantization, and poses the question of whether other emergent properties are inherent or can be altered and conditioned by optimization and architecture design choices. We successfully quantize models ranging in size from 410M to 52B with minimal degradation in performance.
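The outlier sensitivity at issue here is easy to reproduce with a toy absmax quantizer; in this NumPy sketch (ours, not the paper's setup), a single large-magnitude activation inflates the scale and hence the quantization error of every other value:

```python
import numpy as np

def int8_absmax_quantize(x: np.ndarray) -> np.ndarray:
    """Quantize to int8 with absmax scaling, then dequantize."""
    scale = np.abs(x).max() / 127.0
    return np.clip(np.round(x / scale), -127, 127) * scale

rng = np.random.default_rng(0)
acts = rng.normal(size=10_000)

with_outlier = acts.copy()
with_outlier[0] = 60.0  # one large-magnitude outlier dimension

for name, a in [("no outlier", acts), ("with outlier", with_outlier)]:
    mse = np.mean((a - int8_absmax_quantize(a)) ** 2)
    print(f"{name}: quantization MSE = {mse:.2e}")
```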
Jesús Torrado, Nils Schöneberg, Jonas El Gammal
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Bayesian inference remains one of the most important toolkits for any scientist, but increasingly expensive likelihood functions are required for ever-more complex experiments, raising the cost of generating a Monte Carlo sample of the posterior. Recent attention has been directed towards the use of emulators of the posterior based on Gaussian Process (GP) regression combined with active sampling to achieve comparable precision with far fewer costly likelihood evaluations. Key to this approach is the batched acquisition of proposals, so that the true posterior can be evaluated in parallel. This is usually achieved via sequential maximization of the highly multimodal acquisition function. Unfortunately, this approach parallelizes poorly and is prone to getting stuck in local maxima. Our approach addresses this issue by generating nearly-optimal batches of candidates using an almost-embarrassingly parallel Nested Sampler on the mean prediction of the GP. The resulting nearly-sorted Monte Carlo sample is used to generate a batch of candidates ranked according to their sequentially conditioned acquisition function values at little cost. The final sample can also be used for inferring marginal quantities. Our proposed implementation (NORA) demonstrates comparable accuracy to sequential conditioned acquisition optimization and efficient parallelization in various synthetic and cosmological inference problems.
Joanna W. Lis, Aruku Senoo, William F. McGrew, Felix Rönchen, Alec Jenkins, Adam M. Kaufman
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
We implement mid-circuit operations in a 48-site array of neutral atoms, enabled by new methods for control of the $\textit{omg}$ (optical-metastable-ground state qubit) architecture present in ${}^{171}$Yb. We demonstrate laser-based control of ground, metastable and optical qubits with average single-qubit fidelities of $F_{g} = 99.968(3)$, $F_{m} = 99.12(4)$ and $F_{o} = 99.804(8)$. With state-sensitive shelving between the ground and metastable states, we realize a non-destructive state-detection for $^{171}$Yb, and reinitialize in the ground state with either global control or local feed-forward operations. We use local addressing of the optical clock transition to perform mid-circuit operations, including measurement, spin reset, and motional reset in the form of ground-state cooling. In characterizing mid-circuit measurement on ground-state qubits, we observe raw errors of $1.8(6)\%$ on ancilla qubits and $4.5(1.0)\%$ on data qubits, with the former (latter) uncorrected for $1.0(2)\%$ ($2.0(2)\%$) preparation and measurement error; we observe similar performance for mid-circuit reset operations. The reported realization of the $\textit{omg}$ architecture and mid-circuit operations opens the door to many tasks in quantum information science, including quantum error-correction, entanglement generation, and metrology.
Hengyuan Ma, Yang Qi, Li Zhang, Wenlian Lu, Jianfeng Feng
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Building robust, interpretable, and secure artificial intelligence systems requires some degree of quantifying and representing uncertainty from a probabilistic perspective, as this allows one to mimic human cognitive abilities. However, probabilistic computation presents significant challenges due to its inherent complexity. In this paper, we develop an efficient and interpretable probabilistic computation framework by truncating the probabilistic representation at its first two moments, i.e., the mean and covariance. We instantiate the framework by training a deterministic surrogate of a stochastic network that learns the complex probabilistic representation via combinations of simple activations, encapsulating the non-linear coupling of the mean and covariance. We show that when the mean is supervised to optimize the task objective, the unsupervised covariance that spontaneously emerges from its non-linear coupling with the mean faithfully captures the uncertainty associated with model predictions. Our research highlights the inherent computability and simplicity of probabilistic computation, enabling its wider application in large-scale settings.
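For the affine parts of a network, the first two moments propagate in closed form; the following sketch (our illustration, not the paper's surrogate network) shows the exact rule for one linear layer, the non-linear activations being the part such a framework must approximate:

```python
import numpy as np

def propagate_affine(mean, cov, W, b):
    """Exact moment propagation through y = W x + b."""
    new_mean = W @ mean + b
    new_cov = W @ cov @ W.T
    return new_mean, new_cov

rng = np.random.default_rng(1)
W = rng.normal(size=(3, 2))
b = np.zeros(3)
mean = np.array([0.5, -1.0])
cov = np.array([[0.2, 0.05], [0.05, 0.1]])

m, C = propagate_affine(mean, cov, W, b)
print("output mean:", m)
print("output covariance:\n", C)
```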
Umang Gupta, Aram Galstyan, Greg Ver Steeg
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Efficient finetuning of pretrained language transformers is becoming increasingly prevalent for solving natural language processing tasks. While effective, it can still require a large number of tunable parameters. This can be a drawback for low-resource applications and training with differential-privacy constraints, where excessive noise may be introduced during finetuning. To this end, we propose a novel language transformer finetuning strategy that introduces task-specific parameters in multiple transformer layers. These parameters are derived from fixed random projections of a single trainable vector, enabling finetuning with significantly fewer parameters while maintaining performance. We achieve within 5% of full finetuning performance on GLUE tasks with as few as 4,100 parameters per task, outperforming other parameter-efficient finetuning approaches that use a similar number of per-task parameters. Moreover, the random projections can be precomputed at inference, avoiding additional computational latency. All of this makes our method particularly appealing for low-resource applications. Finally, our method achieves the best or comparable utility compared to several recent finetuning methods when training with the same privacy constraints, underscoring its effectiveness and potential real-world impact.
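A hedged sketch of the parameter-sharing scheme, with hypothetical dimensions: each layer's task-specific parameters are a fixed random projection of one shared trainable vector, so only that vector is tuned, and the projection products can be precomputed once training ends:

```python
import torch
import torch.nn as nn

class RandomProjectionParams(nn.Module):
    """Per-layer parameters derived from one trainable vector (illustrative)."""

    def __init__(self, z_dim: int, layer_param_dims: list[int]):
        super().__init__()
        self.z = nn.Parameter(torch.zeros(z_dim))  # the only trainable tensor
        # Fixed, non-trainable random projections, one per transformer layer.
        self.projections = [torch.randn(d, z_dim) / z_dim**0.5
                            for d in layer_param_dims]

    def layer_params(self, layer_idx: int) -> torch.Tensor:
        # At inference these products can be precomputed, adding no latency.
        return self.projections[layer_idx] @ self.z

params = RandomProjectionParams(z_dim=1024, layer_param_dims=[768, 768, 768])
print(params.layer_params(0).shape)                 # torch.Size([768])
print(sum(p.numel() for p in params.parameters()))  # 1024 trainable parameters
```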
Jacopo Fumagalli
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
We show that one-loop corrections to the large-scale power spectrum from small-scale modes in non-slow-roll dynamics are always negligible, namely they are volume suppressed by the ratio of the short to long distance scales. One-loop contributions proportional to the long wavelength tree-level power spectrum, and not sharing this suppression, appear only when considering a subset of vertices, but they cancel exactly when all relevant interactions are taken into account. We prove the previous statement in two different ways, i.e. by using two equivalent forms of the interaction Hamiltonian. Contributions from boundary terms to equal time correlators are included when necessary.
Maico Hendrikus Wilhelmus Engelaar, Sofie Haesaert, Mircea Lazar
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
In this work, we introduce a stochastic model predictive control scheme for dynamic chance constraints. We consider linear discrete-time systems affected by unbounded additive stochastic disturbances and subject to chance constraints that are defined by time-varying probabilities with a common, fixed lower bound. By utilizing probabilistic reachable tubes with dynamic cross-sections, we reformulate the stochastic optimization problem into a deterministic tube-based MPC problem with time-varying tightened constraints. We show that the resulting deterministic MPC formulation with dynamic tightened constraints is recursively feasible and that the closed-loop stochastic system satisfies the corresponding dynamic chance constraints. In addition, we introduce a novel implementation using zonotopes to describe the tightening analytically. Finally, we end with an example illustrating the benefits of the developed approach to stochastic MPC with dynamic chance constraints.
Cory Arnold, Gabriela Pena Carmona, David A. Quiroz, Chung X. Thai, Brenda A. A. B. Ametepe, I-Hung Khoo, Melinda G. Simon, Perla Ayala, Siavash Ahrar
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Significant progress has been made to increase access to droplet microfluidics for labs with limited microfluidics expertise or fabrication equipment. In particular, using off-the-shelf systems has been a valuable approach. However, the ability to modify a channel design and, thus, the functional characteristics of the system is of great value. In this work, we describe the development of co-flow microfluidics and their fabrication methods for generating uniform millimeter-sized (0.5 - 2 mm) hydrogel droplets. Two complementary approaches based on desktop CO2 laser cutting were developed to prototype and build durable co-flow droplet microfluidics. After demonstrating the co-flow systems, water-in-oil experiments and dimensionless number analysis were used to examine the operational characteristics of the system. Specifically, the Capillary number analysis indicated that millimeter-sized droplet generators operated in the desirable geometry-controlled regime despite their length scales being larger than traditional microfluidics systems. Next, the tunable generation of Matrigel droplets was demonstrated. By adjusting the relative flow rates, the droplet size could be tuned. Finally, we demonstrated fibroblast encapsulation and cell viability for up to 7 days as a proof-of-concept experiment. The systems presented are simple and effective tools to generate robust hydrogel droplets and increase the accessibility of this technology to teaching labs or research settings with limited resources or access to microfluidics.
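For context, the Capillary number used in the analysis above compares viscous to interfacial stresses, Ca = μv/σ; a quick computation with illustrative fluid values (not the paper's measured parameters):

```python
def capillary_number(viscosity_pa_s: float, velocity_m_s: float,
                     surface_tension_n_m: float) -> float:
    """Ca = mu * v / sigma (viscous vs. interfacial forces)."""
    return viscosity_pa_s * velocity_m_s / surface_tension_n_m

# Illustrative values for an oil continuous phase:
Ca = capillary_number(viscosity_pa_s=0.05, velocity_m_s=0.01,
                      surface_tension_n_m=0.03)
print(f"Ca = {Ca:.3f}")  # Ca << 1 is typical of the geometry-controlled regime
```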
AbdAlGhaffar K. Amer, F. Robicheaux
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Simulations and theory are presented for the power spectral density functions (PSDs) of particles in time-dependent and anharmonic potentials, including the effects of a thermal environment leading to damping and fluctuating forces. We investigate three one-dimensional perturbations to the harmonic oscillator, of which two are time-dependent changes in the natural frequency of the oscillator, while the other is a time-independent extension of the quadratic potential to include a quartic term. We investigate the effect of these perturbations on two PSDs of the motion that are used in experiments on trapped nano-oscillators. We also derive and numerically test the PSDs for the motion of a spherical nanoparticle in a Paul trap. We find that the simple harmonic Langevin oscillator's PSDs are good approximations for the $x$- and $y$-coordinates' PSDs for small values of the parameter $q$ of the Mathieu equation, but the difference can be more than a factor of two as $q$ increases. We also show numerically that the presence of a permanent electric dipole on the nanosphere can significantly affect the PSDs in the $x$- and $y$-coordinates.
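For reference, the simple harmonic Langevin oscillator PSD that serves as the baseline here has the standard Lorentzian-like form below (the overall factor depends on the one-/two-sided convention used):

```latex
% PSD of x(t) for m\ddot{x} + m\gamma\dot{x} + m\omega_0^2 x = F(t),
% with white-noise force <F(t)F(t')> = 2 m \gamma k_B T \,\delta(t - t'):
S_x(\omega) = \frac{2\gamma k_B T / m}{(\omega_0^2 - \omega^2)^2 + \gamma^2 \omega^2}
```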
Anastasia Koloskova, Nikita Doikov, Sebastian U. Stich, Martin Jaggi
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Stochastic Gradient Descent (SGD) algorithms are widely used in optimizing neural networks, with Random Reshuffling (RR) and Single Shuffle (SS) being popular choices for cycling through random or single permutations of the training data. However, the convergence properties of these algorithms in the non-convex case are not fully understood. Existing results suggest that, in realistic training scenarios where the number of epochs is smaller than the training set size, RR may perform worse than SGD. In this paper, we analyze a general SGD algorithm that allows for arbitrary data orderings and show improved convergence rates for non-convex functions. Specifically, our analysis reveals that SGD with random and single shuffling is always faster or at least as good as classical SGD with replacement, regardless of the number of iterations. Overall, our study highlights the benefits of using SGD with random/single shuffling and provides new insights into its convergence properties for non-convex optimization.
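The three data orderings compared above differ only in how example indices are drawn each epoch; a minimal illustrative sketch:

```python
import random

def orderings(n_examples: int, n_epochs: int, scheme: str):
    """Yield example indices under each sampling scheme."""
    perm = list(range(n_examples))
    random.shuffle(perm)                      # fixed permutation for SS
    for _ in range(n_epochs):
        if scheme == "with_replacement":      # classical SGD
            yield from (random.randrange(n_examples) for _ in range(n_examples))
        elif scheme == "random_reshuffling":  # RR: new permutation per epoch
            epoch_perm = list(range(n_examples))
            random.shuffle(epoch_perm)
            yield from epoch_perm
        elif scheme == "single_shuffle":      # SS: same permutation reused
            yield from perm

print(list(orderings(5, 2, "random_reshuffling")))
```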
Yuta Watanabe
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
The generalized wreath product of association schemes was introduced by R.~A.~Bailey in European Journal of Combinatorics 27 (2006) 428--435. It is known as a generalization of both wreath and direct products of association schemes. In this paper, we discuss the Terwilliger algebra of the generalized wreath product of commutative association schemes. We describe its structure and its central primitive idempotents in terms of the parameters of each factor and their central primitive idempotents.
Sina Hooshangi, Mohammad Hossein Namjoo, Mahdiyar Noorbala
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
The tail of the distribution of primordial fluctuations (corresponding to the likelihood of realization of large fluctuations) is of interest, from both theoretical and observational perspectives. In particular, it is relevant for the accurate evaluation of the primordial black hole (PBH) abundance. In this paper, we first analyze the non-perturbative $\delta N$ formalism as a method to non-perturbatively estimate the probability distribution function (PDF) of primordial fluctuations, discuss its underlying assumptions and deal with several subtleties that may arise as a result of considering large fluctuations. Next, we employ the method to study several non-attractor single-field inflationary models as the simplest examples that may lead to the abundant production of PBHs. We conclude that the Gaussian extrapolation from linear perturbation theory may fail drastically to predict the likelihood of large fluctuations. Specifically, we show that a truncation of the tail, a power-law tail, a double-exponential tail, and a doubly peaked distribution can all be realized for the curvature perturbation in the single-field non-attractor models of inflation. We thus show that there is a diverse zoo of possible tails from inflation so that a model-dependent, non-perturbative study of the distribution of the primordial fluctuations seems inevitable concerning PBH abundance.
Giannis Daras, Kulin Shah, Yuval Dagan, Aravind Gollakota, Alexandros G. Dimakis, Adam Klivans
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
We present the first diffusion-based framework that can learn an unknown distribution using only highly-corrupted samples. This problem arises in scientific applications where access to uncorrupted samples is impossible or expensive to acquire. Another benefit of our approach is the ability to train generative models that are less likely to memorize individual training samples since they never observe clean training data. Our main idea is to introduce additional measurement distortion during the diffusion process and require the model to predict the original corrupted image from the further corrupted image. We prove that our method leads to models that learn the conditional expectation of the full uncorrupted image given this additional measurement corruption. This holds for any corruption process that satisfies some technical conditions (and in particular includes inpainting and compressed sensing). We train models on standard benchmarks (CelebA, CIFAR-10 and AFHQ) and show that we can learn the distribution even when all the training samples have $90\%$ of their pixels missing. We also show that we can finetune foundation models on small corrupted datasets (e.g. MRI scans with block corruptions) and learn the clean distribution without memorizing the training set.
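A hedged sketch of the further-corruption training idea for the inpainting case; the masking rate, loss weighting, and model signature below are our assumptions, not the authors' exact recipe:

```python
import torch

def further_corruption_loss(model, x_corrupted, mask, extra_drop=0.1):
    """x_corrupted: corrupted images; mask: 1.0 where a pixel was observed."""
    # Corrupt further by hiding an extra fraction of the observed pixels.
    extra = (torch.rand_like(mask) > extra_drop).float()
    x_further = x_corrupted * mask * extra
    pred = model(x_further, mask * extra)
    # Supervise only on pixels the original corruption kept but the further
    # corruption removed: there the target value is actually known.
    region = mask * (1.0 - extra)
    return ((pred - x_corrupted) ** 2 * region).sum() / region.sum().clamp(min=1.0)
```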
Sebastian P. Bayerl, Dominik Wagner, Ilja Baumann, Florian Hönig, Tobias Bocklet, Elmar Nöth, Korbinian Riedhammer
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Most stuttering detection and classification research has viewed stuttering as a multi-class classification problem or a binary detection task for each dysfluency type; however, this does not match the nature of stuttering, in which one dysfluency seldom comes alone but rather co-occurs with others. This paper explores multi-language and cross-corpus end-to-end stuttering detection as a multi-label problem using a modified wav2vec 2.0 system with an attention-based classification head and multi-task learning. We evaluate the method using combinations of three datasets containing English and German stuttered speech, one containing speech modified by fluency shaping. The experimental results and an error analysis show that multi-label stuttering detection systems trained on cross-corpus and multi-language data achieve competitive results, but performance on samples with multiple labels stays below overall detection results.
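Casting detection as a multi-label problem amounts to independent sigmoid outputs trained with binary cross-entropy, one per dysfluency type; a minimal sketch on pooled wav2vec 2.0-style features (the dimensions and mean-pooling are our assumptions, and the paper's attention-based head is more elaborate):

```python
import torch
import torch.nn as nn

class MultiLabelHead(nn.Module):
    def __init__(self, feat_dim: int = 768, n_labels: int = 5):
        super().__init__()
        self.classifier = nn.Linear(feat_dim, n_labels)

    def forward(self, frame_feats: torch.Tensor) -> torch.Tensor:
        pooled = frame_feats.mean(dim=1)  # (batch, time, dim) -> (batch, dim)
        return self.classifier(pooled)    # one logit per dysfluency type

head = MultiLabelHead()
logits = head(torch.randn(4, 100, 768))
targets = torch.randint(0, 2, (4, 5)).float()  # co-occurring labels allowed
loss = nn.BCEWithLogitsLoss()(logits, targets)
print(loss.item())
```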
Pedro Sandoval-Segura, Vasu Singla, Jonas Geiping, Micah Goldblum, Tom Goldstein
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
In an era of widespread web scraping, unlearnable dataset methods have the potential to protect data privacy by preventing deep neural networks from generalizing. But in addition to a number of practical limitations that make their use unlikely, we make a number of findings that call into question their ability to safeguard data. First, it is widely believed that neural networks trained on unlearnable datasets only learn shortcuts, simpler rules that are not useful for generalization. In contrast, we find that networks actually can learn useful features that can be reweighed for high test performance, suggesting that image privacy is not preserved. Unlearnable datasets are also believed to induce learning shortcuts through linear separability of added perturbations. We provide a counterexample, demonstrating that linear separability of perturbations is not a necessary condition. To emphasize why linearly separable perturbations should not be relied upon, we propose an orthogonal projection attack which allows learning from unlearnable datasets published in ICML 2021 and ICLR 2023. Our proposed attack is significantly less complex than recently proposed techniques.
P. K. Meena, S. Jangid, R. K. Kushwaha, R. P. Singh
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
High upper-critical-field superconducting alloys are required for superconducting device applications. In this study, we extensively characterized the structure and superconducting properties of the alloys Ta$_{x}$Hf$_{1-x}$ (x = 0.2, 0.4, 0.5, 0.6 and 0.8). The substitution of Hf (T$_{C}$ = 0.12 K, type-I superconductor) with Ta (T$_{C}$ = 4.4 K, type-I superconductor) shows an anomalous enhancement of T$_{C}$ with variation of composition. Interestingly, all compositions exhibited strongly coupled bulk type-II superconductivity with a high upper critical field. In particular, for the compositions x = 0.2 and 0.4, the upper critical field (H$_{C2}$) approached the Pauli limiting field.
Xiangze Zeng
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
We prove the boundedness of $n$-complements for surface pairs in a generalized case without restrictions on multiplicities or the Fano type assumption.
Panagiotis Aivasiliotis, Aris Pagourtzis
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Domination problems in general can capture situations in which some entities have an effect on other entities (and sometimes on themselves). The usual goal is to select a minimum number of entities that can influence a target group of entities or to influence a maximum number of target entities with a certain number of available influencers. In this work, we focus on the distinction between \textit{internal} and \textit{external} domination in the respective maximization problem. In particular, a dominator can dominate its entire neighborhood in a graph, internally dominating itself, while those of its neighbors which are not dominators themselves are externally dominated. We study the problem of maximizing the external domination that a given number of dominators can yield and we present a 0.5307-approximation algorithm for this problem. Moreover, our methods provide a framework for approximating a number of problems that can be cast in terms of external domination. In particular, we observe that an interesting interpretation of the maximum coverage problem can capture a new problem in elections, in which we want to maximize the number of \textit{externally represented} voters. We study this problem in two different settings, namely Non-Secrecy and Rational-Candidate, and provide approximability analysis for two alternative approaches; our analysis reveals, among other contributions, that an earlier resource allocation algorithm is, in fact, a 0.462-approximation algorithm for maximum external domination in directed graphs.
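For intuition, a plain greedy baseline for this maximization problem (not the paper's 0.5307-approximation algorithm) repeatedly places a dominator where it externally dominates the most new vertices:

```python
def greedy_external_domination(adj: dict, k: int) -> set:
    """adj: vertex -> set of neighbors; k: number of dominators to place."""
    dominators, externally_dominated = set(), set()
    for _ in range(k):
        best, gain = None, -1
        for v in adj:
            if v in dominators:
                continue
            # Neighbors that would become externally dominated by v.
            new = (adj[v] - dominators - externally_dominated) - {v}
            if len(new) > gain:
                best, gain = v, len(new)
        dominators.add(best)
        externally_dominated |= adj[best] - dominators
        externally_dominated -= dominators  # dominators count as internal
    return externally_dominated

graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(greedy_external_domination(graph, k=1))  # {0, 1, 3}, by choosing vertex 2
```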
A. Salch
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
It is well-known that the Steenrod algebra $A$ is self-injective as a graded ring. We make the observation that simply changing the grading on $A$ can make it cease to be self-injective. We see also that $A$ is not self-injective as an ungraded ring. These observations follow from the failure of certain coproducts of injective $A$-modules to be injective. Hence it is natural to ask: which coproducts of graded-injective modules, over a general graded ring, remain graded-injective? We give a complete solution to that question by proving a graded generalization of Carl Faith's characterization of $\Sigma$-injective modules. Specializing again to the Steenrod algebra, we use our graded generalization of Faith's theorem to prove that the covariant embedding of graded $A_*$-comodules into graded $A$-modules preserves injectivity of bounded-above objects, but does not preserve injectivity in general.
Guande He, Jianfei Chen, Jun Zhu
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Large pre-trained language models (PLMs) have demonstrated strong performance on natural language understanding (NLU) tasks through fine-tuning. However, fine-tuned models still suffer from overconfident predictions, especially in out-of-domain settings. In this paper, we tackle the problem of calibrating fine-tuned language models. We demonstrate that the PLMs are well-calibrated on the masked language modeling task with robust predictive confidence under domain shift, yet the fine-tuned models fail to retain such property due to catastrophic forgetting, which impacts the calibration on the downstream classification task. In light of these observations, we evaluate the calibration of several methods that preserve pre-trained features and show that preserving pre-trained features can improve the calibration of fine-tuned language models. Among these methods, our proposed method that encourages the fine-tuned model to learn generative representations with auxiliary language modeling objective achieves competitive accuracy and the lowest expected calibration error compared to several strong baselines under both in-domain and out-of-domain settings on three downstream NLU tasks.
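Since the comparison above rests on expected calibration error, here is the standard equal-width-bin ECE computation (a common definition, not necessarily the paper's exact evaluation code):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins: int = 10) -> float:
    """ECE: bin-weighted gap between accuracy and mean confidence per bin."""
    confidences, correct = np.asarray(confidences), np.asarray(correct)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

conf = [0.9, 0.8, 0.95, 0.6, 0.99]
hits = [1, 1, 0, 1, 1]
print(f"ECE = {expected_calibration_error(conf, hits):.3f}")
```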
Jihao Liu, V. V. Shokurov
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
We prove that the first gap of $\mathbb R$-complementary thresholds of surfaces is $\frac{1}{13}$. More precisely, the largest $\mathbb R$-complementary threshold for surfaces that is strictly less than $1$ is $\frac{12}{13}$. This result has many applications in explicit birational geometry of surfaces and threefolds and allows us to find several other optimal bounds on surfaces. We show that the first gap of global log canonical threshold for surfaces is $\frac{1}{13}$, answering a question of V. Alexeev and W. Liu. We show that the minimal volume of log surfaces with reduced boundary and ample log canonical divisor is $\frac{1}{462}$, answering a question of J. Koll\'ar. We show that the smallest minimal log discrepancy (mld) of exceptional surfaces is $\frac{1}{13}$. As a special case, we show that the smallest mld of klt Calabi-Yau surfaces is $\frac{1}{13}$, reproving a recent result of L. Esser, B. Totaro, and C. Wang. After a more detailed classification, we classify all exceptional del Pezzo surfaces that are not $\frac{1}{11}$-lt, and show that the smallest mld of exceptional del Pezzo surfaces is $\frac{3}{35}$. We also get better upper bounds of $n$-complements and Tian's $\alpha$-invariants for surfaces. Finally, as an analogue of our main theorem in high dimensions, we propose a question associating the gaps of $\mathbb R$-complementary thresholds with the gaps of mld's and study some special cases of this question.
Lucas E. A. Porto, Rafael Rabelo, Marcelo Terra Cunha, Adán Cabello
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
A necessary condition for the probabilities of a set of events to exhibit Bell nonlocality or Kochen-Specker contextuality is that the graph of exclusivity of the events contains induced odd cycles with five or more vertices, called odd holes, or their complements, called odd antiholes. From this perspective, events whose graphs of exclusivity are odd holes or antiholes are the building blocks of contextuality. For any odd hole or antihole, any assignment of probabilities allowed by quantum mechanics can be achieved in specific contextuality scenarios. However, here we prove that, for any odd hole, the probabilities that attain the quantum maxima cannot be achieved in Bell scenarios. We also prove it for the simplest odd antiholes. This leads us to the conjecture that the quantum maxima for any of the building blocks cannot be achieved in Bell scenarios. This result sheds light on why the problem of whether a probability assignment is quantum is decidable, while whether a probability assignment within a given Bell scenario is quantum is, in general, undecidable. This also helps to understand why identifying principles for quantum correlations is simpler when we start by identifying principles for quantum sets of probabilities defined with no reference to specific scenarios.
Nathanial Lowry
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
In this paper, we classify the possible group structures on the set of $R$-valued points of an abelian variety, where $R$ is any real closed field. We make use of a family of abelian varieties that, in effect, allows one to quantify over all abelian varieties of a fixed dimension and degree of polarization in a first-order fashion.
Thu Nguyen-Phuoc, Gabriel Schwartz, Yuting Ye, Stephen Lombardi, Lei Xiao
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
This paper presents a method that can quickly adapt dynamic 3D avatars to arbitrary text descriptions of novel styles. Among existing approaches for avatar stylization, direct optimization methods can produce excellent results for arbitrary styles but they are unpleasantly slow. Furthermore, they require redoing the optimization process from scratch for every new input. Fast approximation methods using feed-forward networks trained on a large dataset of style images can generate results for new inputs quickly, but tend not to generalize well to novel styles and fall short in quality. We therefore investigate a new approach, AlteredAvatar, that combines those two approaches using the meta-learning framework. In the inner loop, the model learns to optimize to match a single target style well; while in the outer loop, the model learns to stylize efficiently across many styles. After training, AlteredAvatar learns an initialization that can quickly adapt within a small number of update steps to a novel style, which can be given using texts, a reference image, or a combination of both. We show that AlteredAvatar can achieve a good balance between speed, flexibility and quality, while maintaining consistency across a wide range of novel views and facial expressions.
Yunzhe Zhou, Chengchun Shi, Lexin Li, Qiwei Yao
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
The Markov property is widely imposed in analysis of time series data. Correspondingly, testing the Markov property, and relatedly, inferring the order of a Markov model, are of paramount importance. In this article, we propose a nonparametric test for the Markov property in high-dimensional time series via deep conditional generative learning. We also apply the test sequentially to determine the order of the Markov model. We show that the test controls the type-I error asymptotically, and has power approaching one. Our proposal makes novel contributions in several ways. We utilize and extend state-of-the-art deep generative learning to estimate the conditional density functions, and establish a sharp upper bound on the approximation error of the estimators. We derive a doubly robust test statistic, which employs a nonparametric estimation but achieves a parametric convergence rate. We further adopt sample splitting and cross-fitting to minimize the conditions required to ensure the consistency of the test. We demonstrate the efficacy of the test through both simulations and three data applications.
Xitong Zhang, Avrajit Ghosh, Guangliang Liu, Rongrong Wang
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
It is widely recognized that the generalization ability of neural networks can be greatly enhanced through carefully designing the training procedure. The current state-of-the-art training approach involves utilizing stochastic gradient descent (SGD) or Adam optimization algorithms along with a combination of additional regularization techniques such as weight decay, dropout, or noise injection. Optimal generalization can only be achieved by tuning a multitude of hyperparameters through grid search, which can be time-consuming and necessitates additional validation datasets. To address this issue, we introduce a practical PAC-Bayes training framework that is nearly tuning-free and requires no additional regularization while achieving comparable testing performance to that of SGD/Adam after a complete grid search and with extra regularizations. Our proposed algorithm demonstrates the remarkable potential of PAC training to achieve state-of-the-art performance on deep neural networks with enhanced robustness and interpretability.
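For orientation, a classical PAC-Bayes bound of the kind such training frameworks optimize is McAllester's (the paper's exact bound and training objective may differ):

```latex
% With prior P, posterior Q, n training samples, confidence 1 - \delta:
\mathbb{E}_{h \sim Q}\, L(h) \;\le\; \mathbb{E}_{h \sim Q}\, \widehat{L}(h)
  + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
```

Minimizing the right-hand side trades empirical loss against the KL divergence to the prior, which is what removes the need for separately tuned explicit regularizers.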
Annie Sauer, Andrew Cooper, Robert B. Gramacy
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
We provide a survey of non-stationary surrogate models which utilize Gaussian processes (GPs) or variations thereof, including non-stationary kernel adaptations, partition and local GPs, and spatial warpings through deep Gaussian processes. We also overview publicly available software implementations and conclude with a bake-off involving an 8-dimensional satellite drag computer experiment. Code for this example is provided in a public git repository.
Emma Dauterman, Danny Lin, Henry Corrigan-Gibbs, David Mazières
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Credential compromise is hard to detect and hard to mitigate. To address this problem, we present larch, an accountable authentication framework with strong security and privacy properties. Larch protects user privacy while ensuring that the larch log server correctly records every authentication. Specifically, an attacker who compromises a user's device cannot authenticate without creating evidence in the log, and the log cannot learn which web service (relying party) the user is authenticating to. To enable fast adoption, larch is backwards-compatible with relying parties that support FIDO2, TOTP, and password-based login. Furthermore, larch does not degrade the security and privacy a user already expects: the log server cannot authenticate on behalf of a user, and larch does not allow relying parties to link a user across accounts. We implement larch for FIDO2, TOTP, and password-based login. Given a client with four cores and a log server with eight cores, an authentication with larch takes 150ms for FIDO2, 91ms for TOTP, and 74ms for passwords (excluding preprocessing, which takes 1.23s for TOTP).
Ulyana Piterbarg, Lerrel Pinto, Rob Fergus
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Neural policy learning methods have achieved remarkable results in various control problems, ranging from Atari games to simulated locomotion. However, these methods struggle in long-horizon tasks, especially in open-ended environments with multi-modal observations, such as the popular dungeon-crawler game, NetHack. Intriguingly, the NeurIPS 2021 NetHack Challenge revealed that symbolic agents outperformed neural approaches by over four times in median game score. In this paper, we delve into the reasons behind this performance gap and present an extensive study on neural policy learning for NetHack. To conduct this study, we analyze the winning symbolic agent, extending its codebase to track internal strategy selection in order to generate one of the largest available demonstration datasets. Utilizing this dataset, we examine (i) the advantages of an action hierarchy; (ii) enhancements in neural architecture; and (iii) the integration of reinforcement learning with imitation learning. Our investigations produce a state-of-the-art neural agent that surpasses previous fully neural policies by 127% in offline settings and 25% in online settings on median game score. However, we also demonstrate that mere scaling is insufficient to bridge the performance gap with the best symbolic models or even the top human players.
Guillaume Saës
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
The theory of orthonormal wavelet bases is a useful tool in multifractal analysis, as it provides a characterization of the different exponents of pointwise regularities (H{\"o}lder, p-exponent, lacunarity, oscillation, etc.). However, for some homogeneous self-similar processes, such as sums of random pulses (sums of regular, well-localized functions whose dilations and translations are random), it is easier to estimate the spectrum using continuous wavelet transforms. In this article, we present a new characterization of p-exponents by continuous wavelet transforms and we provide an application to the regularity analysis of sums of random pulses.
Peter Maták
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
We study the role of perturbative unitarity in the resonant annihilation of two dark matter particles into the standard model bath. Systematically including all kinematically allowed holomorphic cuts of the corresponding forward-scattering diagram, cancellation of the singularities occurs, resulting in a fixed-order correction to the narrow-width approximation for the annihilation cross section. Unlike the standard approach based on including the finite width of the mediator, no double-counting of intermediate states occurs.
Stein K. F. Stoter, Tom B. van Sluijs, Tristan H. B. Demont, E. Harald van Brummelen, Clemens V. Verhoosel
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Binary-fluid flows can be modeled using the Navier-Stokes-Cahn-Hilliard equations, which represent the boundary between the fluid constituents by a diffuse interface. The diffuse-interface model allows for complex geometries and topological changes of the binary-fluid interface. In this work, we propose an immersed isogeometric analysis framework to solve the Navier-Stokes-Cahn-Hilliard equations on domains with geometrically complex external binary-fluid boundaries. The use of optimal-regularity B-splines results in a computationally efficient higher-order method. The key features of the proposed framework are a generalized Navier-slip boundary condition for the tangential velocity components, Nitsche's method for the convective impermeability boundary condition, and skeleton- and ghost-penalties to guarantee stability. A binary-fluid Taylor-Couette flow is considered for benchmarking. Porous medium simulations demonstrate the ability of the immersed isogeometric analysis framework to model complex binary-fluid flow phenomena such as break-up and coalescence in complex geometries.
Pêdra D. S. Andrade, Julio C. Correa-Hoyos
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
In this article, we study a class of fully nonlinear double-divergence systems with free boundaries associated with a minimization problem. The variational structure of the Hessian-dependent functional plays a fundamental role in proving the existence of the minimizers and then the existence of the solutions for the system. In addition, we establish gains of integrability for the double-divergence equation. Consequently, we improve the regularity for the fully nonlinear equation in Sobolev and H\"older spaces.
Antonio Marino, Claudio Pacchierotti, Paolo Robuffo Giordano
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
In this paper, we aim to find the conditions for input-to-state stability (ISS) and incremental input-to-state stability ($\delta$ISS) of Gated Graph Neural Networks (GGNNs). We show that this recurrent version of Graph Neural Networks (GNNs) can be expressed as a dynamical distributed system and, as a consequence, can be analysed using model-based techniques to assess its stability and robustness properties. Then, the stability criteria found can be exploited as constraints during the training process to enforce the internal stability of the neural network. Two distributed control examples, flocking and multi-robot motion control, show that using these conditions increases the performance and robustness of the gated GNNs.
Bailin Wang, Zi Wang, Xuezhi Wang, Yuan Cao, Rif A. Saurous, Yoon Kim
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Large language models (LLMs) can learn to perform a wide range of natural language tasks from just a handful of in-context examples. However, for generating strings from highly structured languages (e.g., semantic parsing to complex domain-specific languages), it is challenging for the LLM to generalize from just a few exemplars. We explore $\textbf{grammar prompting}$ as a simple approach for enabling LLMs to use external knowledge and domain-specific constraints, expressed through grammars written in Backus--Naur Form (BNF), during in-context learning. Grammar prompting augments each demonstration example with a specialized grammar that is minimally sufficient for generating the particular output example, where the specialized grammar is a subset of the full DSL grammar. For inference, the LLM first predicts a BNF grammar given a test input, and then generates the output according to the rules of the grammar. Experiments demonstrate that grammar prompting can enable LLMs to perform competitively on a diverse set of DSL generation tasks, including semantic parsing (SMCalFlow, Overnight, GeoQuery), PDDL planning, and even molecule generation (SMILES).
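To make the format concrete, here is a toy demonstration-plus-inference prompt with a minimally sufficient BNF subset; the grammar, task, and wording are ours, not drawn from the paper:

```python
# A specialized grammar: just the BNF rules needed for this one example.
specialized_grammar = """
<query>  ::= "CreateEvent(" <title> ", " <time> ")"
<title>  ::= STRING
<time>   ::= "tomorrow" | "today"
"""

demonstration = f"""Input: schedule a standup for tomorrow
Grammar:{specialized_grammar}
Output: CreateEvent("standup", tomorrow)
"""

# At inference the LLM first predicts a grammar for the test input,
# then generates an output constrained to follow that grammar.
prompt = demonstration + "Input: set up a retro for today\nGrammar:"
print(prompt)
```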
Julian Sienkiewicz, Anna Chmiel
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
In this paper, we examine the interplay between the lobby size $q$ in the $q$-neighbor Ising model of opinion formation [Phys. Rev. E 92, 052105] and the level of overlap $v$ of two fully connected graphs. The results suggest that for each lobby size $q \ge 3$ there exists a specific level of overlap $v^*$ which destroys initially polarized clusters of opinions. By performing Monte Carlo simulations, backed by an analytical approach, we show that the dependence of $v^*$ on the lobby size $q$ is far from trivial in the zero-temperature limit $T \rightarrow 0$, showing a clear maximum that additionally depends on the parity of $q$. On the other hand, temperature is a destructive factor; its increase leads to an earlier collapse of polarized clusters but additionally brings a substantial decrease in the level of polarization.
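A hedged single-update sketch of q-neighbor dynamics on one fully connected graph; the Metropolis-style flip rule over q randomly sampled neighbors is our simplification, and the model's exact rate function may differ:

```python
import math
import random

def q_neighbor_update(spins: list, q: int, T: float) -> None:
    """One asynchronous update: an agent consults a random lobby of q neighbors."""
    i = random.randrange(len(spins))
    lobby = random.sample([j for j in range(len(spins)) if j != i], q)
    field = sum(spins[j] for j in lobby)
    dE = 2 * spins[i] * field  # energy change if spin i flips (J = 1)
    if dE <= 0 or (T > 0 and random.random() < math.exp(-dE / T)):
        spins[i] *= -1

spins = [1] * 50 + [-1] * 50  # polarized initial condition
for _ in range(10_000):
    q_neighbor_update(spins, q=4, T=0.5)
print("magnetization:", sum(spins) / len(spins))
```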
Adrien Le Franc, Victor Magron, Jean-Bernard Lasserre, Manuel Ruiz, Patrick Panciatici
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
AC-OPF (Alternating Current Optimal Power Flow) aims at minimizing the operating costs of a power grid under physical constraints on voltages and power injections. Its mathematical formulation results in a nonconvex polynomial optimization problem which is hard to solve in general, but that can be tackled by a sequence of SDP (Semidefinite Programming) relaxations corresponding to the steps of the moment-SOS (Sums-Of-Squares) hierarchy. Unfortunately, the size of these SDPs grows drastically in the hierarchy, so that even second-order relaxations exploiting the correlative sparsity pattern of AC-OPF are hardly numerically tractable for large instances -- with thousands of power buses. Our contribution lies in a new sparsity framework, termed minimal sparsity, inspired by the specific structure of power flow equations. Despite its heuristic nature, numerical examples show that minimal sparsity allows the computation of highly accurate second-order moment-SOS relaxations of AC-OPF, while requiring far less computing time and memory resources than the standard correlative sparsity pattern. Thus, we manage to compute second-order relaxations on test cases with about 6000 power buses, which we believe to be unprecedented.
Baptiste Anselme Martin, Thomas Ayral, François Jamet, Marko J. Rančić, Pascal Simon
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Matrix Product States (MPS) have been proven to be a powerful tool to study quantum many-body systems but are restricted to moderately entangled states as the number of parameters scales exponentially with the entanglement entropy. While MPS can efficiently find ground states of 1D systems, their capacities are limited when simulating their dynamics, where the entanglement can increase ballistically with time. On the other hand, quantum devices appear as a natural platform to encode correlated many-body states, suited to perform time evolution. However, accessing the regime of modeling long-time dynamics is hampered by quantum noise. In this study we use the best of both worlds: the short-time dynamics is efficiently performed by MPSs, compiled into short-depth quantum circuits followed by Trotter circuits run on a quantum computer. We quantify the capacities of this hybrid classical-quantum scheme in terms of fidelities and entanglement production taking into account a realistic noise model. We show that using classical knowledge in the form of MPSs provides a way to better use limited quantum resources and lowers the noise requirements to reach a practical quantum advantage. Combined with powerful noise-mitigation methods, our approach allows us to simulate an 8-qubit system on an actual quantum device over a longer time scale than low bond dimension MPSs and purely quantum Trotter evolution.
Vaibhav Kumar, Hana Koorehdavoudi, Masud Moshtaghi, Amita Misra, Ankit Chadha, Emilio Ferrara
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
We propose CHRT (Control Hidden Representation Transformation) - a controlled language generation framework that steers large language models to generate text pertaining to certain attributes (such as toxicity). CHRT gains attribute control by modifying the hidden representation of the base model through learned transformations. We employ a contrastive-learning framework to learn these transformations that can be combined to gain multi-attribute control. The effectiveness of CHRT is experimentally shown by comparing it with seven baselines over three attributes. CHRT outperforms all the baselines in the task of detoxification, positive sentiment steering, and text simplification while minimizing the loss in linguistic qualities. Further, our approach has the lowest inference latency of only 0.01 seconds more than the base model, making it the most suitable for high-performance production environments. We open-source our code and release two novel datasets to further propel controlled language generation research.
Rui Ye, Mingkai Xu, Jianyu Wang, Chenxin Xu, Siheng Chen, Yanfeng Wang
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
This work considers the category distribution heterogeneity in federated learning. This issue is due to biased labeling preferences at multiple clients and is a typical setting of data heterogeneity. To alleviate this issue, most previous works consider either regularizing local models or fine-tuning the global model, while they ignore the adjustment of aggregation weights and simply assign weights based on the dataset size. However, based on our empirical observations and theoretical analysis, we find that the dataset size is not optimal and the discrepancy between local and global category distributions could be a beneficial and complementary indicator for determining aggregation weights. We thus propose a novel aggregation method, Federated Learning with Discrepancy-aware Collaboration (FedDisco), whose aggregation weights not only involve both the dataset size and the discrepancy value, but also contribute to a tighter theoretical upper bound of the optimization error. FedDisco also promotes privacy-preservation, communication and computation efficiency, as well as modularity. Extensive experiments show that our FedDisco outperforms several state-of-the-art methods and can be easily incorporated with many existing methods to further enhance the performance. Our code will be available at https://github.com/MediaBrain-SJTU/FedDisco.
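A sketch of discrepancy-aware aggregation; the particular form below (ReLU of normalized size minus scaled discrepancy, then renormalization) follows our reading of the idea, with placeholder constants:

```python
import numpy as np

def disco_weights(sizes, discrepancies, a=0.5, b=0.1):
    """Combine dataset size and category-distribution discrepancy."""
    sizes = np.asarray(sizes, dtype=float) / np.sum(sizes)
    d = np.asarray(discrepancies, dtype=float)
    raw = np.maximum(sizes - a * d + b, 0.0)  # smaller discrepancy -> more weight
    return raw / raw.sum()

# Client 1 has more data but a more skewed label distribution:
print(disco_weights(sizes=[600, 400], discrepancies=[0.4, 0.1]))
```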
Yufei Tian, Anjali Narayan-Chen, Shereen Oraby, Alessandra Cervone, Gunnar Sigurdsson, Chenyang Tao, Wenbo Zhao, Tagyoung Chung, Jing Huang, Nanyun Peng
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
Automatic melody-to-lyric generation is a task in which song lyrics are generated to go with a given melody. It is of significant practical interest and more challenging than unconstrained lyric generation as the music imposes additional constraints onto the lyrics. The training data is limited as most songs are copyrighted, resulting in models that underfit the complicated cross-modal relationship between melody and lyrics. In this work, we propose a method for generating high-quality lyrics without training on any aligned melody-lyric data. Specifically, we design a hierarchical lyric generation framework that first generates a song outline and second the complete lyrics. The framework enables disentanglement of training (based purely on text) from inference (melody-guided text generation) to circumvent the shortage of parallel data. We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints as guidance during inference. The two-step hierarchical design also enables content control via the lyric outline, a much-desired feature for democratizing collaborative song creation. Experimental results show that our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines, for example SongMASS, a SOTA model trained on a parallel dataset, with a 24% relative overall quality improvement based on human ratings.
Ilya Kolpakov, Thijs Smulders
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
A radiative tunneling recombination mechanism is observed in an InP nanowire solar cell at low temperatures. A link between the observed radiative tunneling and field-emission-dominated electrical transport is established through the characteristic tunneling energy. Plasmon-phonon interaction is found to play an important role in solar cell performance.
Suraj, Shankar Kumar Selvaraja
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
We report a $\approx$400\% enhancement in the PZT Pockels coefficient in DFT simulations of lattice strain due to phonon mode softening. The simulations reveal a relation between the rumpling and the divergence of the Pockels coefficient that occurs at -8\% and 25\% strain in the PZT film. The simulations were verified experimentally with RF-sputtered PZT film on a Pt/SiO$_2$/Si layer. The strain developed in the PZT varied from -0.04\% for film annealed at 530\degree C to -0.21\% at a 600\degree C annealing temperature. The strain was insensitive to RF power, with a value of -0.13\% for power varying between 70-130 W. The Pockels coefficient enhancement was confirmed experimentally with a Si Mach-Zehnder interferometer loaded with PZT and probed with a co-planar electrode. An enhancement of $\approx$300\% in the Pockels coefficient was observed, from 2-8 pm/V, with strain increasing from -0.04\% to -0.21\%. To the best of our knowledge, this is the first study and demonstration of strain engineering of the Pockels coefficient of PZT using DFT simulation, film deposition, and photonic device fabrication.
Irina Wang, Cole Becker, Bart Van Parys, Bartolomeo Stellato
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
We propose a data-driven technique to automatically learn the uncertainty sets in robust optimization. Our method reshapes the uncertainty sets by minimizing the expected performance across a family of problems while guaranteeing constraint satisfaction. We learn the uncertainty sets using a novel stochastic augmented Lagrangian method that relies on differentiating the solutions of the robust optimization problems with respect to the parameters of the uncertainty set. We show sublinear convergence to stationary points under mild assumptions, and finite-sample probabilistic guarantees of constraint satisfaction using empirical process theory. Our approach is very flexible and can learn a wide variety of uncertainty sets while preserving tractability. Numerical experiments show that our method outperforms traditional approaches in robust and distributionally robust optimization in terms of out of sample performance and constraint satisfaction guarantees. We implemented our method in the open-source package LROPT.
Ousmane Ly
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
We propose an analytical formulation for the Scanning Gate Microscopy (SGM) response to local tips of arbitrary strength in two-dimensional nanostructures. The real-space-resolved conductance is expressed in terms of the unperturbed quantities underlying the scattering problem. Providing a non-dynamical approach to obtaining the SGM maps, the proposed expression enables a significant reduction in the computational cost of SGM response calculations. This feature is particularly advantageous for deep-learning-based approaches, which have recently been proposed for accessing local properties and disorder landscapes from conductance measurements. This opens up new possibilities for the SGM technique and holds exciting prospects for quantum transport. Further, the formula's versatility extends beyond this specific application, offering a straightforward and computationally efficient method for obtaining the SGM response in a more general context.
Catalin Mitelut, Ben Smith, Peter Vamplew
Published: 30 May 2023
by ArXiv
Journal: ArXiv
Abstract:
The rapid advancement of artificial intelligence (AI) systems suggests that artificial general intelligence (AGI) systems may soon arrive. Many researchers are concerned that AIs and AGIs will harm humans via intentional misuse (AI-misuse) or through accidents (AI-accidents). With respect to AI-accidents, there is an increasing effort focused on developing algorithms and paradigms that ensure AI systems are aligned to what humans intend, e.g. AI systems that yield actions or recommendations that humans might judge as consistent with their intentions and goals. Here we argue that alignment to human intent is insufficient for safe AI systems and that preservation of long-term agency of humans may be a more robust standard, and one that needs to be separated explicitly and a priori during optimization. We argue that AI systems can reshape human intention and discuss the lack of biological and psychological mechanisms that protect humans from loss of agency. We provide the first formal definition of agency-preserving AI-human interactions which focuses on forward-looking agency evaluations and argue that AI systems - not humans - must be increasingly tasked with making these evaluations. We show how agency loss can occur in simple environments containing embedded agents that use temporal-difference learning to make action recommendations. Finally, we propose a new area of research called "agency foundations" and pose four initial topics designed to improve our understanding of agency in AI-human interactions: benevolent game theory, algorithmic foundations of human rights, mechanistic interpretability of agency representation in neural networks and reinforcement learning from internal states.