Acoustical Science and Technology
Online ISSN : 1347-5177
Print ISSN : 1346-3969
ISSN-L : 0369-4232
Advance online publication
Displaying 1-27 of 27 articles from this issue
  • Takayuki Hidaka, Noriko Nishihara, Kazunori Suzuki, Takehiko Nakagawa, ...
    Article ID: e25.55
    Published: 2025
    Advance online publication: August 09, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    This paper accompanies an earlier one, “Favorable clarity, sound strength, and spaciousness as well as overall acoustical quality of concert halls measured in a 3D synthesized sound field,” Acoust. Sci. & Tech. 46, accepted (2025). Anechoic music was reproduced by a virtual orchestra in concert halls and recorded at audience seats. Four music excerpts were chosen. Using fourth-order Ambisonics playback in the laboratory, 21 music experts judged the loudness (perceived sound strength) of the presented sound. The results confirmed that loudness judgments exhibited constancy. The early-to-late reverberation energy ratio was found to be a plausible physical acoustic quantity serving as a cue for constancy.

    Download PDF (621K)
  • Koji Aizawa
    Article ID: e25.49
    Published: 2025
    Advance online publication: August 02, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    Pulsed pressure waves propagating through a narrow air gap surrounded by spherical walls of different diameters were computationally and experimentally investigated to generate an impact force for noncontact and nondestructive testing. It was confirmed that with the proposed model, an impulse-like wave with peak positive pressure exceeding 10 kPa can be obtained with a gap width of 0.9 mm at a laser energy of 35 mJ.

    Download PDF (758K)
  • Masahiro Izumi, Akiko Sugahara, Yasuhiro Hiraguri
    Article ID: e25.15
    Published: 2025
    Advance online publication: July 30, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    Environmental noise poses significant challenges, necessitating effective sound insulation. Sonic crystals (SCs) have the potential to become a building material that provides both sound insulation and ventilation, but they suffer from a narrow sonic bandgap (SBG) and anisotropic behavior. In this study, the sound insulation properties of two-dimensional triangular-lattice hierarchical sonic crystals (HSCs), in which a hierarchical structure is applied to SCs, were examined by the finite element method. Results indicate that multiple lattice constants facilitate broader SBGs, with second-order SBGs exhibiting isotropy. This suggests that HSCs can mitigate the incident angle dependence.

    Download PDF (1303K)
  • Tomoo Kamakura, Shinichi Sakai, Hideo Hayashi, Yoshinobu Yasuno, Hidey ...
    Article ID: e25.21
    Published: 2025
    Advance online publication: July 30, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    A cellular polypropylene film metallized with gold electrodes is glued directly to the back plate of a commercially available condenser microphone cartridge whose metallic diaphragm has been removed, and the cartridge is combined with an existing pre-amplifier and power module to form a microphone set. The absence of diaphragm vibration resonance makes it possible to widen the frequency range of pressure sensitivity compared with commonly used condenser microphones. It has been experimentally verified that the prototype microphone responds well to sound waves over the extremely wide frequency range from 200 Hz to 400 kHz and withstands intense sound pressures of up to several tens of kilopascals.

    Download PDF (635K)
  • Al Jamii Zahra, Yuji Wada, Kentaro Nakamura
    Article ID: e25.32
    Published: 2025
    Advance online publication: July 30, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    This study investigates strategies to enhance the loading and dispensing capabilities of a droplet in ultrasonic levitation systems through acoustic field optimization. Using a 28.58 kHz transducer, two approaches were evaluated: (1) horizontal standing waves with a 5° angled reflector at first- to third-order resonances, and (2) inclined standing waves at different angles (first-order resonance) under a fixed surface vibration velocity of 0.8 m/s. Results show that the horizontal configurations with the angled reflector required up to 112.5% higher surface vibration velocity to reach comparable levitation performance to that of parallel reflectors, revealing inefficiencies in reflector-angle adjustments. In contrast, tilting the standing wave angle to 45° significantly enhanced stability, enabling reliable levitation of an averaged 1.0 µL droplet with reduced energy input. The inclined-wave method outperformed reflector-angle modifications, achieving precise droplet insertion and dispensing while minimizing acoustic energy consumption.

    Download PDF (1243K)
  • Ryoya Mizuno, Keigo Kano, Akira Emoto, Daisuke Koyama
    Article ID: e25.52
    Published: 2025
    Advance online publication: July 23, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    Liquid crystal (LC) varifocal lenses are compact and respond at high speed, making them well suited to next-generation optical devices. The focal length can be modulated by reorienting LC molecules through acoustic radiation force. In this study, the influence of the geometrical structure of ultrasonic LC lenses on the optical performance was examined. Two LC lenses with distinct glass substrate thicknesses were fabricated, and their optical characteristics were evaluated. The electro-mechanical parameters were found to be altered by the thickness of the glass substrate, which in turn improved the power consumption required for focus tuning.

    Download PDF (748K)
  • Rintaro Fujii, Takeshi Okuzono
    Article ID: e25.02
    Published: 2025
    Advance online publication: July 19, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    In this paper, we propose a metaporous absorber with air-filled resonators that enhances low-frequency sound absorption for thin porous materials with low flow resistivity and a simple theoretical model under the assumption of plane wave propagation inside the material. The metaporous absorber features a periodic array structure made up of unit cells, within which microslit resonators are strategically placed in the porous material. Three distinct unit cells, each exhibiting unique sound absorption characteristics, are proposed. Combining two of these unit cells makes it possible to offset the shortcomings of each other’s sound absorption capabilities, resulting in a broader range of high-sound-absorption effects. Firstly, a transfer matrix modeling of the metaporous absorbers is proposed, numerically verified by the finite element method and experimentally validated by impedance tube measurement. Using the constructed transfer matrix model and genetic algorithm optimization, we designed two highly efficient near-perfect sound absorbers at frequencies from 700 Hz to 1.7 kHz and experimentally demonstrated their sound absorption characteristics. The present absorber is particularly effective for enhancing the performance of thin porous materials with lower flow resistivity.

    Download PDF (2015K)
  • Shigeaki Amano, Kimiko Yamakawa, Mariko Kondo
    Article ID: e24.126
    Published: 2025
    Advance online publication: July 11, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    Recent studies have demonstrated that combinations of logarithmic durations can classify duration-sensitive phonemes, such as Japanese singleton and geminate consonants, at various speaking rates. The acoustic features of the Japanese fricative /s/ and affricate /ts/ are related to duration; therefore, a combination of logarithmic durations can likely classify these consonants. To examine this possibility, discriminant models using linear and logarithmic durations, with and without a speaking-rate-related variable, were compared in terms of their performance in classifying /s/ and /ts/ at word-initial position at various speaking rates. The results indicate that the discriminant model using logarithmic duration with a speaking-rate-related variable can classify the consonants better than the other models, indicating the importance of logarithmic duration. The results are considered in the framework of logarithmic information processing in the brain.

    Download PDF (442K)
  • Hikaru Miura, Takashi Kasashima
    Article ID: e25.45
    Published: 2025
    Advance online publication: July 11, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    In general, the resonance frequencies of ultrasonic emitters that use bolt-clamped Langevin transducers differ slightly. However, when using these emitters in an arrayed device, the resonance frequencies of each emitter must be matched. In this paper, the arbitrary reduction of the resonance frequency by adding a small amount of mass after fabrication is examined. The resonance frequency was lowered by adding more mass, demonstrating the utility of this method.

    Download PDF (681K)
  • Renta Kushiro, Akira Omoto
    Article ID: e25.28
    Published: 2025
    Advance online publication: July 09, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    A method for adjusting room acoustics through active control using conventional loudspeakers as secondary sources is proposed to suppress inhomogeneity caused by standing waves and room modes. Unlike conventional active control, the control target is the specific acoustic impedance, which is the ratio of sound pressure to particle velocity. The aim is to approach a sound field with only direct waves and no reflections. First, to perform the control, the condition for the maximum absorption coefficient of the virtual boundary surface was determined from the perspective of impedance matching, and an error function was set. Next, a particle velocity measurement method was introduced to obtain the values of specific acoustic impedance, and the weighting of sound pressure and particle velocity was modified. Furthermore, using these tools, experiments were conducted using both a real sound field and simulations to verify the effect of impedance control. Finally, impedance control was reinterpreted from the perspective of microphone directivity, clarifying the control mechanism. The results confirm the method’s effectiveness in low-frequency sound field adjustment, where passive absorbers are insufficient.

    Download PDF (1536K)
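The impedance-matching condition mentioned in the abstract above can be illustrated with the textbook normal-incidence relation between specific acoustic impedance and absorption coefficient. This is a minimal sketch under a plane-wave assumption, not the error function used in the paper; the air constants and the name `absorption_coefficient` are assumptions chosen for illustration.

```python
# Normal-incidence absorption coefficient from specific acoustic impedance.
# Textbook plane-wave relation used only as an illustration; the constants
# and names below are assumptions, not values taken from the paper.

RHO_AIR = 1.21   # air density in kg/m^3 (assumed, roughly 20 degrees C)
C_AIR = 343.0    # speed of sound in m/s (assumed)


def absorption_coefficient(z: complex, z0: float = RHO_AIR * C_AIR) -> float:
    """Return alpha = 1 - |R|^2, with reflection factor R = (z - z0) / (z + z0)."""
    reflection = (z - z0) / (z + z0)
    return 1.0 - abs(reflection) ** 2
```

In this simplified picture the absorption coefficient peaks at 1 when the specific acoustic impedance equals the characteristic impedance of air, and it falls off symmetrically when the impedance is scaled up or down by the same factor.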
  • Junichi Mori, Funa Kodomari, Takenobu Tsuchiya, Makoto Morinaga, Ippei ...
    Article ID: e24.127
    Published: 2025
    Advance online publication: July 05, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    At civil airports and military facilities, various aircraft types are operated. For the mitigation and management of aircraft noise, several surveys and studies have been conducted around airports. Generally, to build a valuable database from these aircraft noise observation data, it is necessary to identify the aircraft types. To achieve this, we are developing an AI model that identifies aircraft types by applying machine learning techniques and using measured acoustic aircraft noise data. In this study, to improve the generalization performance of the model by expanding the dataset for training used in machine learning, we applied Swarm Learning technology, which enables machine learning to be executed under distributed conditions without centralizing measured noise data, and verified its accuracy. In the study, the accuracy of convolutional neural networks using general procedures with all datasets was compared with the results analyzed using Swarm Learning, where the dataset was divided into several groups. As a result, although Swarm Learning showed a slight decrease in accuracy compared with convolutional neural networks, its accuracy remained very high at 94%, demonstrating that it is a sufficiently effective method considering the effort required to centralize measured data in one location.

    Download PDF (928K)
  • Yimeng Wang, Manabu Aoyagi
    Article ID: e25.34
    Published: 2025
    Advance online publication: July 02, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    The effect of the cylinder with a cavity around the target location on underwater acoustic streaming at 28.2 kHz was investigated. Finite element analysis was performed to optimize the dimensions of the transparent acrylic cylinder by the resonance frequency analysis of the whole structure to increase sound pressure in the cavity. Simulation methods ignoring cavitation bubbles and considering bubbles were used to obtain distributions of sound pressure and acoustic streaming at the initial period and stable state separately. For comparison, particle image velocimetry experiments were conducted using the adjusted and original cylinders. The results showed that when the gap was smaller than 25 mm, the cavity had an obvious enhancement effect on streaming velocity, increasing it to twice the maximum value.

    Download PDF (2380K)
  • Jun Takahashi, Itsuki Shishime, Natsuki Toda, Hironori Takemoto
    Article ID: e25.22
    Published: 2025
    Advance online publication: June 28, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    This study investigated whether individuals with musical experience could accurately perceive changes in the singing voice and underlying vocal tract movements of a singer who completed one year of vocal training, using only the singing voice. Vocal tract modifications were analyzed using real-time magnetic resonance imaging and evaluated by professional singers, instrumentalists, and students. Professional singers demonstrated more nuanced evaluations of vocal tract shape. Instrumentalists, while capable of assessing voice quality, showed less differentiation across vocal tract features. Students used narrower rating ranges and struggled to assess both aspects. These findings indicate that musical background influences evaluative tendencies regarding voice quality and vocal tract configurations.

    Download PDF (1915K)
  • Ryo Teraoka, Yuki Tanaka, Wataru Teramoto
    Article ID: e25.08
    Published: 2025
    Advance online publication: June 26, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    Auditory spatial attention is crucial for extracting relevant sounds from background noise in noisy environments. Despite its significance in daily life, the effect of auditory spatial attention on the depth direction remains poorly understood. The present study aimed to investigate how auditory selective attention influences the detection of target sounds in the depth direction using sensitivity (d′), false alarm rates, and reaction time (RT) for the target sound. In each trial, either a target or distractor sound was presented from one of the five distances (32, 64, 96, 128, and 160 cm). The listeners were directed to respond as soon as they heard the target sound, while ignoring distractor sounds. The results indicated that directing attention to a specific distance significantly increased the sensitivity (d′) at that distance compared to other distances. Furthermore, the false alarm rate was the lowest at the attended position and progressively increased as sound positions deviated from the focus of attention. However, no significant effect of attention on the RT was observed. These findings suggest that auditory selective attention is not limited to the horizontal direction but can also operate along the depth direction in reverberant environments, expanding our understanding of auditory spatial attention.

    Download PDF (698K)
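The sensitivity index d′ used in the abstract above is conventionally computed from hit and false-alarm rates in signal detection theory. The following is a minimal sketch of that standard definition; the function name `d_prime` is hypothetical, and how the authors handled rates of exactly 0 or 1 is not stated in the abstract.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1; rates of exactly 0 or 1
    need a correction before calling this function (not shown here).
    """
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal distribution
    return z(hit_rate) - z(false_alarm_rate)
```

Equal hit and false-alarm rates give d′ = 0 (no discrimination), and a larger gap between the two rates gives a larger d′.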
  • Kentaro Seki, Nobutaka Ito, Kazuki Yamauchi, Yuki Okamoto, Kouei Yamao ...
    Article ID: e25.27
    Published: 2025
    Advance online publication: June 24, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    This paper proposes a new language-queried target speech extraction (TSE) task called para-linguistic and non-linguistic text prompts-based TSE (PNTP-TSE), which uses text prompts that describe para-linguistic and non-linguistic information. This framework addresses the limitations of conventional TSE methods, such as privacy concerns in voiceprint-based systems and dependency on dedicated microphone arrays or video cameras. To support this framework, we construct and provide a new dataset, PromptTSE, which is specifically designed to facilitate various types of language-queried TSE, including PNTP-TSE. We develop a baseline method for PNTP-TSE and conduct experimental evaluations. The experimental results show that PNTP-TSE overcomes the performance degradation issue of voiceprint-based systems caused by the gap in speaking style between enrollment speech and target speech.

    Download PDF (961K)
  • Toru Nakashika, Kohei Yatabe
    Article ID: e24.95
    Published: 2025
    Advance online publication: June 20, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    To bypass phase estimation, complex-valued generative models have been developed to directly handle spectra of audio signals. The complex-valued restricted Boltzmann machine (CRBM) is one such promising model proposed recently. However, like the other models, CRBM cannot account for the logarithmic nature of auditory perception, which is important for realizing a better model for audio applications. This is because CRBM handles complex values in rectangular coordinates (i.e., real and imaginary parts), which hinders applying the logarithmic transform to the magnitude. To overcome this drawback of CRBM, we propose the gamma-von-Mises (GVM) RBM, which models complex-valued spectra in polar coordinates (i.e., magnitude and phase). GVM RBM handles magnitude by the gamma distribution using the logarithmic function and phase by the von Mises distribution. Our objective and subjective experiments showed that GVM RBM outperformed the other models, including CRBM and the complex-valued variational autoencoder (CVAE).

    Download PDF (666K)
  • Tong Zhou, Kazuya Yasueda, Akitoshi Kataoka
    Article ID: e25.12
    Published: 2025
    Advance online publication: June 17, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    This study introduces two efficient methods for selecting Tikhonov regularization parameters in acoustical inverse problems. The first approach employs a binary search (BS) algorithm to identify the regularization parameter that satisfies a predefined power constraint. Compared to traditional iterative searches over N candidate values, BS reduces the number of iterations from N to about log₂ N. The second method, Adaptive Normalized Tikhonov (ANT), combines the conventional L-curve and Normalized Tikhonov techniques. By fitting the ratio of the inverse system matrix’s largest eigenvalue to an exponential decay function during preprocessing at a few sample frequencies, ANT determines the regularization parameter with a single calculation for other frequencies. Both methods were experimentally validated in a multi-zone sound field reproduction scenario using a database of measured reverberant room impulse responses. Results demonstrated that BS achieves a balance between reproduction accuracy and robustness while significantly improving efficiency. The ANT method provided the most stable system without iterative calculations. These improvements indicate that both approaches offer compelling solutions for real-time applications.

    Download PDF (1358K)
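The binary-search idea in the abstract above can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes the power constraint is a bound on the squared norm of the Tikhonov solution, which decreases as the regularization parameter grows, so a sorted candidate list can be bisected. The names `tikhonov_solution` and `smallest_feasible_lambda` are hypothetical.

```python
import numpy as np


def tikhonov_solution(A, b, lam):
    """Solve min ||A x - b||^2 + lam * ||x||^2 via the normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.conj().T @ A + lam * np.eye(n), A.conj().T @ b)


def smallest_feasible_lambda(A, b, candidates, max_power):
    """Bisect ascending `candidates` for the smallest lam whose solution
    satisfies ||x(lam)||^2 <= max_power (assumed form of the constraint).

    Returns (lam, number_of_solves); lam is None if no candidate is feasible.
    """
    lo, hi = 0, len(candidates) - 1
    best, solves = None, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        x = tikhonov_solution(A, b, candidates[mid])
        solves += 1
        if np.vdot(x, x).real <= max_power:
            best = candidates[mid]
            hi = mid - 1  # feasible: try a smaller regularization parameter
        else:
            lo = mid + 1  # too much power: more regularization is needed
    return best, solves
```

With N candidates the loop performs at most about log₂ N + 1 linear solves, compared with up to N for a linear scan over the same list.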
  • Takayuki Hidaka, Noriko Nishihara, Kazunori Suzuki, Takehiko Nakagawa
    Article ID: e24.107
    Published: 2025
    Advance online publication: June 14, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    This paper is a companion to “Reexamination of the favorable reverberation time of concert halls measured in a 3D synthesized sound field,” A.S.T., 45, 204–215 [2024]. Anechoic music sources were reproduced by a virtual orchestra set on concert hall stages and were recorded at audience seats. Four music excerpts were chosen. By an Ambisonics playback in the laboratory, a series of psychological experiments were conducted. Twenty-one music experts judged the clarity, sound strength, spaciousness, and overall acoustical quality of the presented sound. Adding the results from the previous paper, a regression analysis on the relationships between contributing subjective attributes and objective parameters found that EDT_M and C_80,3 contributed to clarity, G_M (or G_L) and RT_M to sound strength, and BQI and G_L to spaciousness. Here, subscripts “L,” “M,” and “3” denote octave band averages at 125 and 250 Hz, 500 and 1000 Hz, and at 500, 1000, and 2000 Hz, respectively, and “E” designates early sound, i.e., less than 80 ms. Favorable ranges of physical parameters for each subjective attribute were determined. Reverberance, spaciousness, and clarity were identified as significant subjective attributes contributing to overall acoustic quality, with the corresponding physical metrics being RT_M, G_L, and BQI.

    Download PDF (831K)
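The clarity parameter C80 named in the abstract above is an early-to-late energy ratio with the boundary at 80 ms. A minimal broadband sketch is given below; it assumes the impulse response starts at the direct-sound arrival and omits the octave-band filtering and band averaging that the abstract's subscripts imply. The function name `clarity_c80` is hypothetical.

```python
import numpy as np


def clarity_c80(impulse_response, sample_rate, split_ms=80.0):
    """Early-to-late energy ratio in dB, split `split_ms` after the start.

    Assumes the impulse response begins at the direct-sound arrival and
    contains energy on both sides of the split point.
    """
    h2 = np.asarray(impulse_response, dtype=float) ** 2
    split = int(round(sample_rate * split_ms / 1000.0))
    early = h2[:split].sum()
    late = h2[split:].sum()
    return 10.0 * np.log10(early / late)
```

Positive values mean that more energy arrives in the first 80 ms than afterwards; longer reverberation pushes the value down.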
  • Hiroki Iida, Kohei Yatabe
    Article ID: e25.07
    Published: 2025
    Advance online publication: June 14, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    This study explores the design and implementation of an IIR all-pass filter that simulates cochlear delay characteristics. This paper contains three topics: filter design, implementation, and musical evaluation. First, we designed an IIR all-pass filter to simulate cochlear delay characteristics by optimizing its zeros and poles to achieve the desired group delay. Additionally, the filter was implemented as a VST plug-in for real-time applications and is publicly available. Next, subjective evaluations were conducted to assess the musical impact of this filter. We applied the filter to snare drum, bass drum, bass guitar, and electric guitar to explore its musical applicability. Participants compared the filtered and original sounds. Percussion instruments received mixed feedback, with the filter sometimes described as “artificial.” In contrast, string instruments like bass guitar and electric guitar were rated as “impressive” and “attractive,” suggesting greater relevance for these sounds. Finally, we investigated the impact of the filter on guitar performance. Performance deviations from a metronome were measured under 10 different conditions by varying the number of filters and delay times. The results indicated that excessive delay introduced by the filter could disrupt synchronization during performances.

    Download PDF (1737K)
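To show what an all-pass filter does, the sketch below evaluates the simplest case, a first-order section H(z) = (a + z⁻¹)/(1 + a z⁻¹): its magnitude is 1 at every frequency while its group delay varies with frequency. This is only a building-block illustration; the filter in the paper is obtained by optimizing poles and zeros against a cochlear delay target, which is not reproduced here. The coefficient `a` and both function names are assumptions.

```python
import numpy as np


def allpass_response(a, omega):
    """Frequency response of H(z) = (a + z^-1) / (1 + a z^-1).

    `a` is a real coefficient with |a| < 1; `omega` is in rad/sample.
    """
    z_inv = np.exp(-1j * omega)
    return (a + z_inv) / (1.0 + a * z_inv)


def group_delay(a, omega):
    """Numerical group delay in samples: -d(phase)/d(omega).

    `omega` must be an increasing grid fine enough for phase unwrapping.
    """
    phase = np.unwrap(np.angle(allpass_response(a, omega)))
    return -np.gradient(phase, omega)
```

For a = -0.5 the delay is largest near zero frequency and decreases toward the Nyquist frequency, which is the kind of frequency-dependent delay an all-pass design can shape without changing the magnitude spectrum.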
  • Toshiki Hanyu
    Article ID: e24.98
    Published: 2025
    Advance online publication: June 13, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    Room acoustics is mainly based on the reverberation theories of Sabine and Eyring. In Sabine’s theory, however, the reverberation time does not reach zero even if the condition of absolute absorption is fulfilled. Eyring revised reverberation theory to resolve this contradiction. However, Eyring’s theory has an inconsistency between the formulations of the steady-state and decay processes. Therefore, the author revised Sabine’s theory, taking a different approach from that of Eyring. This revised theory was constructed by introducing the concept of “reverberation of a direct sound.” In this study, a new mathematical model of reverberation using reflection orders is proposed. This is a reconstruction of the author’s revised theory. The new model includes the temporal energy distribution in each reflection order and uses the concept of “reverberation of a direct sound” for the entire reverberation process. It shows that the concept is also essential for the reflected sounds. In addition, the reverberation decay agrees with the revised theory previously proposed by the author. Overall, the new model showed good agreement with the simulation results.

    Download PDF (4244K)
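The contradiction mentioned in the abstract above can be seen directly from the classical formulas. The sketch below implements the standard Sabine and Eyring reverberation-time expressions (metric units, constant 0.161 s/m, which assumes a sound speed of about 343 m/s); it does not implement the author's revised theory or the new reflection-order model. The function names and the example room are assumptions.

```python
import math


def sabine_rt60(volume_m3, surface_m2, alpha):
    """Sabine: T = 0.161 V / (S * alpha), with 0 < alpha <= 1."""
    return 0.161 * volume_m3 / (surface_m2 * alpha)


def eyring_rt60(volume_m3, surface_m2, alpha):
    """Eyring: T = 0.161 V / (-S * ln(1 - alpha)); T tends to 0 as alpha -> 1."""
    if alpha >= 1.0:
        return 0.0  # limiting value for total absorption
    return 0.161 * volume_m3 / (-surface_m2 * math.log1p(-alpha))


# Hypothetical room: 1000 m^3 volume and 700 m^2 surface, fully absorbing.
print(sabine_rt60(1000.0, 700.0, 1.0), eyring_rt60(1000.0, 700.0, 1.0))
```

With an average absorption coefficient of 1, the Sabine formula still returns a positive reverberation time, whereas the Eyring formula returns zero; for small absorption coefficients the two nearly coincide.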
  • Mari Ueda, Kohei Naito, Hiroshi Tanaka, Takahiro Miura
    Article ID: e25.13
    Published: 2025
    Advance online publication: June 07, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    In this study, we measured the acoustic characteristics of nonwood baseball bats modified according to the Revised Japanese Product Standards (hereafter, Safe Goods (SG) Standards) enforced in 2024. New standard bats showed peak frequencies approximately 500 Hz higher than previous models. During Spring Koshien 2024, players reported differences in bat sound and ball travel distance, with the onomatopoeic description changing from “kakkin” to “kyu-in” following the revision, according to various media. The results of acoustic measurements conducted in compliance with the SG Standards confirm the observations of the players, indicating a tonal shift in the bats after the SG Standards were revised.

    Download PDF (464K)
  • Hansjörg Mixdorff, Takayuki Arai
    Article ID: e25.39
    Published: 2025
    Advance online publication: May 29, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    In this study, we compare the prosody of Japanese with that of Māori and New Zealand English, the contact language, since impressionistic analysis of Māori and Japanese indicates prosodic similarities despite many other differences. This may be because proto-Japanese, like Māori, stems from the Pacific region. However, Māori has changed substantially under the influence of English. As an indirect way of comparing the prosody, we devised a perception experiment using delexicalized speech with Japanese listeners. Most listeners were able to differentiate between Japanese and NZ English but did not place Māori closer to Japanese than to English.

    Download PDF (824K)
  • Takahiro Iwami, Naohisa Inoue, Akira Omoto
    Article ID: e25.10
    Published: 2025
    Advance online publication: May 23, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    We construct an orthonormal basis for interior problems of the Helmholtz equation, based on the properties of a reproducing kernel Hilbert space defined by the spectral characteristics of interior sound fields. The constructed basis coincides with what is commonly known as spherical basis functions. Furthermore, leveraging the structure of this space, we derive the addition theorem in a compact form. This facilitates the conversion between reproducing kernel representations and spherical harmonic expansions and provides insights into estimating spherical harmonic coefficients from sampled measurements.

    Download PDF (230K)
  • Kakeru Yazawa, Takayuki Konishi, Mariko Kondo
    Article ID: e25.06
    Published: 2025
    Advance online publication: May 13, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    This paper presents the current design of the J-AESOP corpus, a learner speech corpus featuring Japanese speakers’ English. It has been developed as part of the Asian English Speech cOrpus Project (AESOP), an international and multi-institutional project to construct a collection of Asian English speech databases. While the recording procedures and speech materials are standardized in the AESOP project, the J-AESOP corpus incorporates additional features not found in other AESOP corpora, such as data from native English speakers, Japanese reading materials (Japanese version of “The North Wind and the Sun”), manual correction of automatic forced alignment, and perceptual ratings of accentedness/nativelikeness and comprehensibility. These unique features allow an in-depth investigation of Japanese-English bilingual speech, as exemplified by our exploratory investigation of the production of voiceless coronal fricatives in Japanese (i.e., [s, ɕ]) and English (i.e., /s, ʃ, θ/) reported in this paper. The paper also discusses directions for further development of the corpus, including improvements in data availability.

    Download PDF (1992K)
  • Yuki Kimura, Takeshi Okuzono
    Article ID: e24.110
    Published: 2025
    Advance online publication: May 03, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    In this paper, we propose an easily designable low-frequency acoustic metasurface (AMS) absorber composed of multiple imperfect microslit resonators designed to achieve near-perfect sound absorption within a one-third-octave-band. Some specific designs of one-third-octave-band near-perfect absorbers at 125, 250, and 500 Hz are presented. We have developed a robust and efficient user-friendly absorber design method combining the transfer matrix method and a unique geometry design rule of component resonators. To develop this design method, we conducted extensive numerical and experiment-based examinations by thermoviscous acoustic simulation and impedance tube measurements, particularly addressing the number of component resonators and their peak sound absorption coefficient. The numerical and experimental results demonstrated the importance of creating a coupled resonator with the appropriate number of imperfect component resonators, each with a lower sound absorptivity peak. These features are crucially important for achieving thin sound absorbers without compromising the desired sound absorption properties. Numerical sound absorptivity evaluation revealed that using more component resonators to create a coupled resonator enables individual component resonators to operate as resonators with lower sound absorptivity peaks. This simple operation achieves robust sound absorption characteristics with less degradation.

    Download PDF (3030K)
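For orientation on the target bandwidth in the abstract above, the edges of a one-third-octave band can be computed from its center frequency. This is a minimal sketch using the base-2 definition and the nominal centers 125, 250, and 500 Hz; measurement standards define exact mid-band frequencies slightly differently, and the abstract does not say which convention the authors used. The function name is hypothetical.

```python
def third_octave_edges(center_hz):
    """Lower and upper edge of a one-third-octave band (base-2 definition)."""
    half_width = 2.0 ** (1.0 / 6.0)  # one sixth of an octave on each side
    return center_hz / half_width, center_hz * half_width


for fc in (125.0, 250.0, 500.0):
    low, high = third_octave_edges(fc)
    print(f"{fc:.0f} Hz band: {low:.1f} Hz to {high:.1f} Hz")
```

Each band spans a frequency ratio of 2^(1/3), so an absorber that covers a full band at 125 Hz needs near-perfect absorption over a range of roughly 29 Hz.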
  • Hisako Orimoto, Akira Ikuta
    Article ID: e25.11
    Published: 2025
    Advance online publication: April 17, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    In general, a speech signal can be measured by a microphone, such as a throat microphone. However, the speech signal measured by a microphone often contains surrounding noise. In contrast, although a throat microphone is robust against surrounding noise, the speech signal it measures includes body-conducted internal noise. In this study, we propose a method for improving the sound quality of the speech signal measured by a throat microphone so that speech recognition performs well. The relationship between the original speech signal and the speech measured by the throat microphone is not clear. Therefore, we model the relationship as a multiplicative and additive combination of the original speech signal and noise components with unknown parameters. An algorithm is proposed to simultaneously estimate the original speech signal and the unknown parameters using Bayes’ theorem based on the speech signal measured by the throat microphone. Finally, a speech recognition experiment is conducted to confirm the effectiveness of the proposed algorithm.

    Download PDF (667K)
  • Fumiki Yohena, Kohei Yatabe
    Article ID: e24.119
    Published: 2025
    Advance online publication: April 15, 2025
    JOURNAL OPEN ACCESS ADVANCE PUBLICATION

    Single-channel blind dereverberation aims to remove reverberation from a single-channel reverberant signal without using any prior knowledge. In acoustics, weighted prediction error (WPE), a method mainly used for multi-channel signals, is often applied to this task. However, it is difficult to achieve effective dereverberation for a single-channel signal. In this paper, for better single-channel dereverberation, we propose to simultaneously estimate the source signal and the room impulse response (RIR) instead of only predicting reverberation. By modeling convolution using matrix lifting in the time-frequency domain, we formulate the dereverberation problem as a non-convex optimization problem of recovering a sparse rank-1 matrix. In sparse regularization, we introduce reweighting, which improves sparse matrix recovery. The alternating direction method of multipliers (ADMM) with acceleration is applied to approximately solve the optimization problem, resulting in closed-form updates. In our experiments, we confirmed that the proposed method outperforms existing methods in several reverberant conditions and is capable of removing both early reflection and late reverberation. MATLAB code of the proposed method is available online (https://doi.org/10.24433/CO.3541617.v1).

    Download PDF (1010K)